My Favourite Diff

(essays.jwatzman.org)

205 points | by jwatzman 1481 days ago

18 comments

  • kazinator 1481 days ago
    Recent GCC can diagnose the situation where a switch case falls through without a /* fallthrough */ comment.[1]

    C++17 has a comment-like language feature for demarcating desired fallthrough.

    Fallthrough in case processing is often useful; it's just not such a great default.

    In the TXR Lisp tree-case[2] construct, any case which returns the : object falls through to the next case.

    This is used all over the place in the library.

    --

    [2] https://gcc.gnu.org/onlinedocs/gcc-9.3.0/gcc/Warning-Options...

    [1] https://www.nongnu.org/txr/txr-manpage.html#N-03D834A5

    • adrianmonk 1480 days ago
      > not such a great default

      This is a tangent, but it reminds me of a personal policy of mine: if I'm thinking about changing a default, I ask myself if there should be a default at all.

      The consequences of having the wrong default were serious enough that it even came to my attention. So by changing the default, am I potentially exchanging one set of problems for another?

      Sometimes it would be better overall if the user is forced to explicitly state which way they want it. Not always, of course, but I like to ask the question.

      • adrianmsmith 1480 days ago
        Right! I had a bug in a SQL statement to fetch tasks “order by priority” and then execute them in that order. A stupid mistake, but I must have looked at that code tens of times and I just couldn’t spot it. The bug was that that statement found then executed the lowest priority items first.

        It’s not clear to me that an “order by” should have a default direction. I don’t know why one direction is more likely to be used than the other. Without a default, I would have been forced to think about the direction and wouldn’t have had that bug.

        • Apanatshka 1480 days ago
          A similar example is in a "filtering" operation. I mostly have this in functional programming where you pass a predicate to filter on. But what's the perspective of the filter? Are you filtering poison from your water, or are you sifting the water looking for gold nuggets. I prefer to use functions remove and retain (respectively) to be more explicit.
          • kazinator 1480 days ago
            GNU Make has this naming problem too $(filter pat,list ...) and $(filter-out pat,list ...).
        • Someone 1480 days ago
          That’s an ambiguity in English. “Foo is priority #1, bar priority #2, so foo has the higher priority” is perfectly valid English.

          The only way to ‘solve’ that would be by banning priorities higher than 1, for example by numbering them 1, 0, -1,… or by using floats in the range (0,1] for priorities.

          • ken 1480 days ago
            I would say this is a problem with type systems. Programming languages (including SQL) mostly just have one type for “integers”, or possibly a few types for different numbers of bytes of storage.

            “Priority” and “array index” and “number of cantaloupes” and “age on 1-Jan-2020” may all be represented by integers but they’re all very different types.

            Hungarian tried to solve this, sort of, but it’s at the wrong level. It’s not going to know which ones of these sort in which direction.

        • brigandish 1480 days ago
          Isn't everything usually ABC and 123 by default?

          I suppose countdowns aren't but there's a clue in the name…

          • adrianmsmith 1480 days ago
            Right, priority was 1..5 where 5 was a higher priority. So “order by priority” delivered priority=1 items first, and processed them in that order, which was the opposite of what I wanted.

            It was absolutely my mistake, but “order by priority” reads so well in English, so my eye just couldn’t see the mistake.

            I mean I’d been programming for decades at that point, I knew the syntax, I know our data model, I stared at the code and I still couldn’t see it. (Eventually I did see it of course..)

            Since then I’ve been of the belief it would be better if SQL forced you to put “asc” or “desc” when using “order by”. Then you’d be forced to think about it.

            So I was agreeing with the previous poster, who said that defaults aren’t always a good idea.

            • gonzus 1480 days ago
              This is also an example of how difficult naming can be. This is arguably a "ranking" (1=low, 5=high), not a "priority" (1=high, 5=low).
              • adrianmsmith 1480 days ago
                That’s a very good point.

                However people say things are “high priority” which, at least to my mind, implies higher numbers, so I don’t think ”priority” is necessarily the wrong term here.

                • andrey_utkin 1480 days ago
                  right, and you execute tasks in descending priority order, not just the priority order.
            • aasasd 1480 days ago
              > “order by priority” reads so well in English

              I had my suspicions that programming reads very different to native speakers, and I guess here's a case when it's better to have English as a second language: I heard a ding-a-ling in my head immediately upon seeing the absence of ‘desc’.

      • renox 1478 days ago
        It's a good question but in this case, I wouldn't be surprised if 99% of the usage don't fallthrough.
    • johnny-lee 1480 days ago
      My perl script which scans C/C++ source code for typos would emit a warning (case 19) for a case statement that fell through to the next case statement. The script handled the 'fall through'-comment situation as well.

      You can read the 1999 Perl Conference paper for the perl script at https://sites.google.com/site/typopl2/afastandeasywaytofindb...

  • letientai299 1480 days ago
    _Often my biggest contribution isn’t being the only one who is able to fix something, but being the only one who realises something is fixable._

    I feel relatable. At all the placed I've worked, people realize the problem, sometime find the solution, but most of the time, they won't fix it, unless it directly affect their code/project/product. As a result, technical debt keeps bubble up and we have more and more issues.

    Right now, the only thing I can do is just to keep fixing issues as I encounter them. But that's for the code, I don't have any solution for the "people" problem.

    • xorcist 1480 days ago
      I call this the person with taste. One of the first questions I try to answer in new setting is to identity the person "with taste" around. They are usually a good entry point if you want to get something done.
    • nicbou 1480 days ago
      Many problems lack a Designated Person Who Cares. Everyone has something to do, a task to accomplish. There's never someone whose job is to improve the codebase and the tooling, even when it can benefit the team.
    • auxym 1480 days ago
      I have not worked in software for long (a year, a while ago) but my experience with agile at that time was that even if you realize that you can and should fix something, you still have do political representation to the scrum master, team lead and product manager so that an issue (sorry, user story) is created and gets prioritized (over features and other bugs) in a future sprint.

      Going rogue and spending an afternoon fixing stuff without an approved US (to charge time on) was not seen well.

      • julius_set 1480 days ago
        Unfortunately this is often the case at larger companies, unless you find a really cool manager / PM to work under who just trusts you — note this can be achieved over time.

        I’m talking from personal experience of going rogue :)

  • xelxebar 1480 days ago
    This reminded me of Raymond Chen's blurb article on his blog, The Old New Thing:

    https://devblogs.microsoft.com/oldnewthing/20110118-00/?p=11...

    For whatever reason that heavily influenced me, and now I almost always submit patches together with my bug reports. Sometimes its a whole lot more work, but I feel like this practice pays dividends in the amount you learn as well as the comraderie it fosters with the upstream devs.

  • MrStonedOne 1480 days ago
    One annoying thing about the war on case fallthrough at the language level is the lack of imagination.

    `continue` is a thing, compliments switch's existing `break` statement usage and is overall a much better solution for explicitly specifying fallthrough then returning special objects or c#'s abuse of goto.

    • Spivak 1480 days ago
      I think it's a pretty syntax idea but it means continue works opposite your intuition inside switch statements compared to everywhere else.

      I mean you could kinda argue that switch could conceptually mean something like

          for matcher, block in cases:
              if matcher.matches(subject) or continued:
                  eval block
      
      but then continue still feels magic since it bypasses the test.
  • JadeNB 1481 days ago
    Since there's not much context here: the article, which I enjoyed, ends with "Be the change you want to see in the world", and shows how the author exemplified this by coding a fix to a persistent source of PHP bugs at $WORK.
  • aasasd 1481 days ago
    As a technical aside (I know it's not the main point, but still): with the ubiquity of Git workflows employing pre-merge checking on the platform of choice (e.g. Github or a CI tool), this is rather easily done via a rule in an off-the-shelf linter. No `break` in a `case`? Can't merge it.

    As a language-design aside, `switch` in general is stinky stuff. Not just with the fall-through: it also violates the regular C-style syntax for no particular reason, having a mini-syntax of its own instead.

    But the most perverse thing I've seen done with `switch` is using it as `if`:

        switch (true) {
            case ($a == $b): ...
            case ($c == $d): ...
            case (itsFullMoonToday()): ...
            default: ...
        }
    • Someone 1481 days ago
      If that’s the most perverse you’ve seen, I guess you haven’t seen Duff’s device, which mixes it with a do…while loop (code copied from http://www.catb.org/~esr/jargon/html/D/Duffs-device.html):

         register n = (count + 7) / 8;      /* count > 0 assumed */
      
         switch (count % 8)
         {
         case 0:        do {  *to = *from++;
         case 7:              *to = *from++;
         case 6:              *to = *from++;
         case 5:              *to = *from++;
         case 4:              *to = *from++;
         case 3:              *to = *from++;
         case 2:              *to = *from++;
         case 1:              *to = *from++;
                            } while (--n > 0);
         }
      • tzs 1480 days ago
        I have no idea if this still works in modern C, but here's something kind of similar I've seen used in C circa 1986:

          switch (foo)
          {
            case 1:
              ...code just for case 1...
              if (0)
              {
            case 2:
                ...code just for case 2...
              }
              ...code for both case 1 and case 2...
              break;
            case 3:
              ...
        
          }
        
        used when two cases have a common tail, and you have sufficient space constraints that you do not want to duplicate the common tail code, and you have sufficient time constraints (and maybe space constraints on the stack) that you don't want to put the common tail code in a subroutine and call it from both cases.
        • fisherjeff 1480 days ago
          Oh my god. I’ve wondered about a way to do this but... this is just devious.
        • X-Istence 1480 days ago
          This is goto with slightly more code ;-)
      • aasasd 1480 days ago
        That's not perversity, that's suicide by art.
    • jwatzman 1481 days ago
      Post author here.

      One of the nice advantages we had in using our static analysis tool for this, with real code intelligence, instead of a simpler linter, was that we could do a much more clever check. The actual rule was "every case must be terminal", and we had a rich analysis for "terminal", so stuff like this would be considered legal:

          case blah:
            if (cond) {
              throw blah;
            }
            if (cond2) {
              some_noreturn_function();
            } else {
              return 42;
            }
      
      etc etc. My example is a bit contrived, and the wisdom of doing such complicated things in switch/case is questionable, but it was useful when dealing with an existing codebase.

      Another advantage was that our tool was designed for extremely fast analysis over the entire codebase, so errors were given to programmers as they were writing code, immediately. (200ms response time to any change. At that level you can run it on every keystroke, which we did.) Traditional linters can in principle do this, but in practise are often not written with this sort of performance requirements in mind.

      (And yes, PHP allowing arbitrary expressions in case labels was... probably not a good idea.)

      • muglug 1480 days ago
        FYI, phpcs now has something similar (since roughly 2016), but I imagine your implementation predates that by a significant margin (and yours obviously supports noreturn too).

        > PHP allowing arbitrary expressions in case labels was... probably not a good idea.

        Yeah, possibly the worst offender is the (somewhat common) "switch (true)" pattern: https://psalm.dev/r/4c168a9c5d

        Sadly I don't have the latitude you all do to govern how code gets written by users of my own static analysis tool – fall-through has to work there, too: https://psalm.dev/r/1d4ee59bd6

    • edflsafoiewq 1481 days ago
      A switch is a kind of computed goto. It's a fantastically useful construct when you need it. You don't usually need it. I'm not sure what mini-syntax you're referring to, the syntax is roughly what I would have guessed it would look in C if I hadn't seen it and you just told me its semantics.

      Most people treat switch as a peculiar form of match (or cond or whatever), probably because this would usually be much more useful to have than a computed goto, and in many derivative languages they've put in weird additions that seem to be an attempt to drag it halfway to that, things that don't really make any sense in C, like allowing dynamic expressions in the cases (such as your example), or forcing all cases to be at the top level.

      (I feel like I write a lot of "in defense of switch" comments on HN...)

      • Supermancho 1480 days ago
        > Most people treat switch as a peculiar form of match (or cond or whatever)

        Completely made up switch illustrates how I use it in every language if I can...Go has a succinct version of this.

            switch(True) {
                case ($x == 1):
                    $x++;
                    break;
                case ($x > 3):
                    $x--;
                    break;
                default:
                    break;
            }
        • edflsafoiewq 1480 days ago
          Why? It's longer, more complex, and more highly indented than a normal if-elif-else.
    • benhoyt 1480 days ago
      This is actually a feature in Go, but you leave off the "true" condition entirely: https://tour.golang.org/flowcontrol/11

      I quite like it, as it cuts down on the amount of brace and if/else if noise.

    • kazinator 1481 days ago
      That could be the work of a Lisp coder rebelling against having cond taken away, and having to fend with if/else/elif.
      • aasasd 1480 days ago
        Oh, no—that was the guy who loved MS and .NET despite working in a Linux-based shop, drove a Chinese-Russian SUV when not riding a roadbike in tight pants, supported Putin and swore about every ten words in both speech and chats, that's all while looking hella dorky. One thing I can't picture him doing is pining for Lisp.

        Now, grinding everyone's gears with his abuse of `switch`—that was right up his alley, what with already grinding them with his political commentary outside of code.

    • RealityVoid 1480 days ago
      What language is that? It's not C, because it wouldn't compile.
      • JadeNB 1480 days ago
        Looks kind of like Perl to me, except that Perl uses `given … when …`.
      • naniwaduni 1480 days ago
        The third example is valid C.
  • userbinator 1480 days ago
    On the contrary, I've heard of a story where a team decided to apply some new tool to their codebase which warned about a missing break, inserted the break so the warning disappeared, "fixed" the failing tests that now resulted, and then upon the next release was subsequently screamed at by numerous customers for the unwanted behaviour change.

    but in a very high-level language like PHP, there’s really no reason for switch to fall through at all

    I disagree. Switch fallthrough is very useful because it can reduce code duplication, especially when the logic has a ladder-like structure. Trying to impose increasingly draconian and such arbitrary rules is only going to lead to a self-fulfilling-prophecy where all the intelligent developers will get fed up and leave, and what's left are those which will continue to create tons of bugs some other way instead.

    • jwatzman 1480 days ago
      Author here.

      Yeah, you can't just mechanically insert the breaks :) I carefully went through each instance where my rule tripped, looked at the surrounding code, and figured out the right fix. Trying to automate that would indeed have lead to disaster -- and the lack of ability to automate this is why no one had done this in the past, it sounded like too much work. (It wasn't that much work.)

      And I agree switch fallthrough can be useful! We just required it be annotated from now on, to convey the intent, as opposed to doing it by accident.

    • nicbou 1480 days ago
      I don't think I would quit a company over a more stringent linter.

      To me, this is the same attitude that leads to safety hazards in other industries. "I don't need safety measures because -I- am not an idiot".

      Even the smartest developers make dumb mistakes. If you can eliminate some of those early with minimal friction, it's worth it. The earlier you catch a mistake, the cheaper it is. Linter rules are less of a hassle than bugs in production.

      • earthboundkid 1480 days ago
        The kind of person who complains about the linter is the exact reason your company needs a linter.
    • Cyberdog 1480 days ago
      It reduces code duplication at the cost of increasing complexity and error-prone-ness, and there are ways to reduce the duplication. For example, given the following snippets:

          switch ($foo) {
            case 123: 
              // do 1
            case 23:
              // do 2
            default:
              // do 3
          }
          
          if ($foo == 123) {
            do1();
            do2();
            do3();
          }
          else if ($foo == 23) {
            do2();
            do3();
          }
          else {
            do3();
          }
      
      Yes, the second one is more verbose and duplicates code, but I think it's also far more clear in terms of intention.
      • dalemyers 1480 days ago
        You can even take it a step further:

            if ($foo == 123) {
              do1();
            }
            
            if ($foo == 123 || $foo == 23) {
              do2();
            }
            
            do3();
  • evmar 1480 days ago
    I did a similar thing for Chrome back in the day, to have the compiler check an `override` annotation:

    http://neugierig.org/software/chromium/notes/2011/01/clang.h...

  • vortico 1480 days ago
    Just a reminder that in most compiled and JIT optimized languages with switches, this

      switch (c) {
        case 4: return 30;
        case 5: return 70;
        default: return 0;
      }
    
    compiles to the same instructions and therefore same performance as

      if (c == 4) {
        return 30;
      }
      else if (c == 5) {
        return 70;
      }
      else {
        return 0;
      }
    
    As a side note, it's easier to see accidental fallthroughs if you write switch statements like this.

      switch (c) {
        case 4: {
          return 30;
        } break;
        case 5: {
          return 70;
        } break;
        default: {
          return 0;
        } break;
    
    But in my opinion, the `if` statement is more clear in both cases, and only one level of indentation instead of 2.
    • comex 1480 days ago
      Huh. To my eyes, the switch version is "obviously" much more clear! It's significantly more compact, with less visual noise you have to mentally ignore. And having the possible inputs and outputs lined up vertically (especially with nothing else between them) makes it easy to understand the logic. Plus, using a switch indicates at a glance that the variable being tested is the same in all cases, whereas with an if-else-if chain there could be any weird expression hiding in one of the conditions.

      Eye of the beholder, indeed.

      • vortico 1480 days ago
        Perhaps a better comparison to the switch statement is

          if (c == 4) return 30;
          if (c == 5) return 70;
          return 0;
        • comex 1480 days ago
          Yep, that's a big improvement in my eyes, making it comparable from a readability perspective. In my opinion, that leaves more minor pros and cons of each approach, depending on the language. For example, in C and C++, as kazinator mentioned elsewhere in the thread, GCC and Clang now have an option to require all fallthroughs to be explicitly annotated, which effectively removes one of the main downsides of switch. And switch is better at detecting duplicate cases, as well as checking for exhaustivity if you're matching against enum variants. On the other hand, if you change the example so that the cases don't return (perhaps they assign to a variable instead), with switch you would need to add a `break;` after each one, increasing visual noise. If statements don't have that problem. Or if you change the example so that one of the values `c` is compared to is not a constant, some languages (C, C++, Java but not JavaScript, PHP) don't allow that with switch statements...
    • bla3 1480 days ago
      That's only true if there are <= 3 "clusters" of switch case values, else compilers do smarter things. See https://youtu.be/gMqSinyL8uk
    • enriquto 1480 days ago
      Are you sure? I'd guess that if the cases are an interval of consecutive integers it may do a "jmp [c]" sometime.
      • KMag 1480 days ago
        You're correct. Many compilers will emit jump tables if the domain is dense enough. Also, even if compiling to a bunch of conditional jumps, they'll usually emit them to require O(log N) conditional jumps (equivalent to a binary tree of nested ifs) instead of the O(N) linear arrangement suggested by the GP.
        • vortico 1480 days ago
          Clang compiles both switch statements and `if` statements to the same tree search code. E.g. https://godbolt.org/z/7M_3PU My point is that the optimizations you're describing for switch statements are also applied to a bunch of `if` blocks, at least in LLVM-based languages with an LLVM version from the last several years.
          • KMag 1480 days ago
            Sure, I didn't mean to imply that compilers never (or commonly didn't) detect that long if-chains could be optimized to jump tables, lookup tables, or if-trees. I was just answering the more narrow question the GGP was asking: if switch statements were always lowered to if-chains and never compiled to jump tables.
            • vortico 1480 days ago
              Ah I see, gotcha.
      • vortico 1480 days ago
        Looks the same to me. https://godbolt.org/z/CfhD5k Switches were invented mostly because they were an easy hint to the compiler to reduce to a tree search, but because compilers are smart today, they're obsolete unless you just prefer their syntax.

        Even if the cases are consecutive, most compiler optimizers are smart enough to determine that the ASTs are equivalent.

        • enriquto 1480 days ago
          > they're obsolete unless you just prefer their syntax.

          This is a good point.

          Sometimes I prefer the (ugly) syntax of switches than that of conditionals. The reason is that a chain of conditionals is non-symmetrical (there's no condition for the first "case"), while a switch is formally invariant to permutation of the cases. I have even seen shit like this:

              if (false) ;
              else if (c == 1) { first case ; }
              else if (c == 2) { second case; }
              ...
          
          just to preserve the symmetry along the cases.
          • vortico 1480 days ago
            Are you talking about the `else` being non-symmetric? You don't need it at all. I do this.

              if (c == 1) { first case ; }
              if (c == 2) { second case; }
            • enriquto 1480 days ago
              ...and hereby I give back my programming license and go back to civil life.
            • pintxo 1479 days ago
              This might result in another fallthrough scenario though, wouldn't it?
              • vortico 1479 days ago
                No, because `c` can't be 1 and 2 at the same time.
                • pintxo 1478 days ago
                  In this specific example. But you cannot rule it out in general.
  • joatmon-snoo 1480 days ago
    Google applies ErrorProne in the default javac toolchain to mitigate this for Java builds: https://errorprone.info/bugpattern/FallThrough
  • dlbucci 1480 days ago
    Recently, my coworker basically asked if he should set a TTL on a password for a job we were running. We said yeah, but our code would need changes to accommodate it before it expired and we didn't have time to make them.

    I joked that we'd make a JIRA ticket that we'd repeatedly deprioritize until our code mysteriously broke in 6 months, but then my coworker actually made an Outlook reminder for when we had a month left!

    I thought that was really smart, so I guess I learned a lesson about actually solving problems instead of just pointing them out like a smart ass.

  • CGamesPlay 1480 days ago
    What did you do in cases where the fall through was desired behavior? Did the resulting code (replacing a fall through with a nested if or something?) look better or worse as a result?

    The typical way to handle this at companies that don't have a static-type-checker department is to require a comment that says /* falls through */ at the end of the switch. Why was such a simple solution inappropriate here?

    • mplewis 1480 days ago
      Because code reviews are more fallible than automatic linters, and OP had the luxury of working somewhere that the noisy people didn't fight for an anti-feature.
      • CGamesPlay 1480 days ago
        I wasn't talking about requiring it in code review though. The OP didn't make an automatic linter that issued a warning when you did this implicitly, he removed the ability to do it entirely. With a linter, you can add a comment, like /* falls through */ to disable the lint warning. With the OP's change you can... use a series of if-else blocks? The alternative isn't listed, which is why I asked.

        Sorry about your last company, though. Sounds like it was annoying to work there!

        • jwatzman 1480 days ago
          OP here.

          Explicitly-annotated fallthrough was still allowed; I codified a particular annotation that the analysis tool understood and allowed. I removed implicit fallthrough, but explicit fallthrough can be quite useful sometimes.

    • GhostVII 1480 days ago
      Sounds like that's what they did:

      The cleanup diffs I’d written that afternoon mostly just annotated clearly-intentional fallthroughs

  • l0b0 1479 days ago
    Some of my favourite diffs were related to linters:

    - Linting files different from origin/master in a pre-commit hook.

    - Linting everything in CI.

    - Making the linter rules stricter (mypy does a great job of having several orthogonal strictness flags, so you can pick the set appropriate for your familiarity with the tool).

    - Implementing project-specific lint checks.

  • ChrisMarshallNY 1481 days ago
    Cool story.

    I use Swift. It does not have default switch fallthrough. After a couple of versions, I learned about the fallthrough statement.

    I remember being frustrated by it, at first, but, like so many Swift oddities, it rapidly became second nature; thus, proving out that one of the goals of Swift is to train developers to write code properly.

    • tcbasche 1480 days ago
      Go also has the same behaviour, with the fallthrough keyword. I find myself using a switch so rarely anyway...
    • saagarjha 1481 days ago
      I thought the first release of Swift had fallthrough?
      • ChrisMarshallNY 1480 days ago
        Maybe it did, but I didn’t learn about it for quite some time. I just learned to work around it.

        Swift is a deep river. I still learn new stuff every day, and I write Swift, seven days a week.

  • hkai 1480 days ago
    If the author discovered tslint/eslint he'd be surprised that this problem was long solved in the JS world :)
    • muglug 1480 days ago
      I believe this happened before tslint or eslint existed.
  • unnouinceput 1480 days ago
    So many cases about this and that, pro and cons of switch fall-through, while Pascal was King from beginning.
  • T3RMINATED 1480 days ago
    WinMerge is the #1 differ out there so stop spreading lies.