Show HN: Learn C and its lower levels interactively, in the browser

(item?id=19126544)

356 points | by vasyop 13 days ago

20 comments

  • philonoist 13 days ago

    You should put this in "Show HN: "

    This project is amazing!

    My hunch is that academia is already sniffing around this and mulling over adopting it into their courses.

  • burfog 12 days ago

    Nice, but some of the missing features really hurt:

    I couldn't do a simple #define. This is pretty fundamental to typical C programming. One would also expect stringification, token pasting, and variadic macros.

    I added a goto, and it won't compile. Yes, this is a valid and useful part of C, and must be learned by any C programmer. If you doubt this, count occurrences of goto in the Linux kernel.

    I also couldn't do modern array initializers. I'll guess the same is true of modern struct initializers. These are important for producing maintainable code.

    There seems to be no support for the restrict keyword. This keyword makes a dramatic performance difference.
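
    For reference, a minimal sketch of the features listed above, in standard C99. All names here (BUF_SIZE, STR, CAT, LOG, copy_twice, sparse_sum) are hypothetical and only for illustration; this was not run against the tool itself.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout; nothing here comes from the project. */
#define BUF_SIZE 16                 /* simple object-like macro */
#define STR(x) #x                   /* stringification: STR(a+b) -> "a+b" */
#define CAT(a, b) a##b              /* token pasting: CAT(tab, le) -> table */
#define LOG(fmt, ...) snprintf(log_buf, sizeof log_buf, fmt, __VA_ARGS__) /* variadic macro */

static char log_buf[64];

struct point { int x, y, z; };

/* goto used for single-exit cleanup: returns 0 on success, -1 on failure. */
int copy_twice(const char *src, char *out, size_t outsz)
{
    int rc = -1;
    char *a = NULL, *b = NULL;
    size_t n = strlen(src) + 1;

    a = malloc(n);
    if (!a) goto done;
    b = malloc(n);
    if (!b) goto done;

    memcpy(a, src, n);
    memcpy(b, src, n);
    if (2 * (n - 1) + 1 > outsz) goto done;   /* result would not fit */
    snprintf(out, outsz, "%s%s", a, b);
    rc = 0;
done:
    free(b);                                  /* free(NULL) is a no-op */
    free(a);
    return rc;
}

/* Designated initializers for an array and a struct; unnamed elements are zero. */
int sparse_sum(void)
{
    int CAT(tab, le)[8] = { [2] = 5, [7] = 9 };   /* declares `table` */
    struct point p = { .y = 3, .x = 1 };          /* p.z is zero-initialized */
    return table[0] + table[2] + table[7] + p.x + p.y + p.z;
}
```

    Each of these is a small thing on its own, but together they cover the preprocessor, cleanup-style goto, and C99 initializers mentioned above.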

    • jamestimmins 12 days ago

      It seems like this comment is missing the point of this project, which is explicitly billed as an experiment to test the viability of an approach. It also sounds like you may have more extensive C experience than the target audience.

      But more importantly, every MVP is going to lack certain features that some folks deem essential. But when we focus on missing features, or worse, miss the point of a project ("you built X but what I really want is Y"), we discourage them or others from trying/sharing future experiments.

      • burfog 12 days ago

        I sort of am the target audience: I'm currently teaching C.

        The lack of important standard features means I can't just say "this is C".

        For teaching, normal data type sizes are important. Being able to have static data is important. Beginners need to see how these things work.

        I do applaud the effort. I think it is a great idea.

        • 12 days ago
          [deleted]
          • kazinator 11 days ago

            Beginners also need to know that a specific translation to machine or virtual machine instructions isn't the definition of the program's behavior.

            If this was a good idea, Kernighan and Ritchie would have filled the pages of that book with PDP-11 instruction sequences accompanying bits of C code.

        • kazinator 11 days ago

          > This [restrict] keyword makes a dramatic performance difference.

          Hyperbole; it's applicable in specific circumstances.

          There are ways to code without restrict to get the same performance.

          Sample code:

            struct node {
              struct node *next, *prev;
            };
          
            /* Original: pred and succ must not overlap, but this is not expressed. */
          
            void insert_after_A(struct node *pred, struct node *succ)
            {
              succ->prev = pred;
              succ->next = pred->next;
              pred->next->prev = succ;
              pred->next = succ;
            }
          
            /* Optimize with restrict: pred and succ do not overlap. */
          
            void insert_after_B(struct node *restrict pred, struct node *restrict succ)
            {
              succ->prev = pred;
              succ->next = pred->next;
              pred->next->prev = succ;
              pred->next = succ;
            }
          
            /* Optimize by hand in "load-store style". */
          
            void insert_after_C(struct node *pred, struct node *succ)
            {
              struct node *aft = pred->next;
              succ->prev = pred;
              succ->next = aft;
              aft->prev = succ;
              pred->next = succ;
            }
          
          Compiling with GCC, the first example generates 8 memory operations; both of the other two get it down to 7.

            insert_after_A:
                    movl  4(%esp), %eax    ;; %eax = pred
                    movl  8(%esp), %edx    ;; %edx = succ
                    movl  (%eax), %ecx     ;; %ecx = pred->next
                    movl  %eax, 4(%edx)    ;; {%edx:succ}->prev = pred
                    movl  %ecx, (%edx)     ;; {%edx:succ}->next = {%ecx:pred->next}
                    movl  (%eax), %ecx     ;; %ecx = pred->next !!! deja vu: this load was done a few instructions back!
                    movl  %edx, (%eax)     ;; {%eax:pred}->next = {%edx:succ}
                    movl  %edx, 4(%ecx)    ;; {%ecx:pred->next}->prev = {%edx:succ}
                    ret
          
            insert_after_B:
                    movl  4(%esp), %edx
                    movl  8(%esp), %eax
                    movl  (%edx), %ecx
                    movl  %edx, 4(%eax)
                    movl  %eax, (%edx)
                    movl  %ecx, (%eax)
                    movl  %eax, 4(%ecx)
                    ret
          
            insert_after_C:
                    movl  4(%esp), %edx
                    movl  8(%esp), %eax
                    movl  (%edx), %ecx
                    movl  %edx, 4(%eax)
                    movl  %ecx, (%eax)
                    movl  %eax, 4(%ecx)
                    movl  %eax, (%edx)
                    ret
          
          Both the use of restrict in B and the technique in C have cut down the wasteful memory access. The compiler emits that access in A because it must suspect that the earlier stores through succ changed pred->next, since the two nodes might overlap.

          Function C works by caching the pred->next value in a local variable and referring to that.

          None of the assignments through the structure type can possibly affect the value of aft; the structures cannot overlap with the local variable. (This is an implicit non-overlap restriction similar to what restrict expresses for the two arguments.)

          Once we establish aft, all of the pointers involved in the function are local variables; so none of the local->memb = val assignments raise any suspicion that the value of local has been overwritten, requiring it to be reloaded from memory. We coded five accesses and got five (plus the two to load the arguments from the stack, making seven).

          Function B has the disadvantage that the behavior becomes undefined if pred and succ are pointers to the same node. Function C has no such problem. Even though the code is just as good as for B, the behavior is defined for overlapping pred and succ.
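
          To make the last point concrete, here is a small sketch that exercises the load-store version on a circular list, including the degenerate case where pred and succ are the same node. The helpers list_init and list_length are hypothetical additions for the demonstration, and insert_after is just function C under a shorter name.

```c
#include <assert.h>
#include <stddef.h>

struct node {
  struct node *next, *prev;
};

/* Hypothetical helper: make n a one-element circular list. */
void list_init(struct node *n)
{
  n->next = n->prev = n;
}

/* Load-store style, as in function C: pred->next is read once into a local. */
void insert_after(struct node *pred, struct node *succ)
{
  struct node *aft = pred->next;
  succ->prev = pred;
  succ->next = aft;
  aft->prev = succ;
  pred->next = succ;
}

/* Hypothetical helper: count the nodes after head, stopping when we wrap around. */
size_t list_length(const struct node *head)
{
  size_t n = 0;
  for (const struct node *p = head->next; p != head; p = p->next)
    n++;
  return n;
}
```

          With a circular list, inserting a self-linked node after itself simply leaves it self-linked, so the call is harmless without restrict; with the restrict-qualified version the same call would have undefined behavior.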

          restrict is C trying to keep up with Fortran. What are situations when we can't use this type of load-store coding to reduce memory operations? Why, array processing!

          Well, of course we can take the same approach in array processing; but the problem is that array processing is automatically unrolled by the compiler. Unrolling is hampered in some situations when we don't know whether the arrays overlap. If we simply introduce local variables into the loop, it still won't be unrolled. What we really have to do is manual unrolling. Manual unrolling is guesswork; whether unrolling helps or hurts depends on which specific member of which processor family we are compiling for (how big is its instruction cache and such).

          E.g.

             void vector_add(double *sum, double *a, double *b, int n)
             {
                for (int i = 0; i < n; i++)
                  sum[i] = a[i] + b[i];
             }
          
          If we add restrict here, then none of the arrays overlap and this kind of optimization is valid. (Let's ignore the nuances of n not being divisible by 4):

              for (int i = 0; i < n; i += 4) {
                sum[i]   = a[i]   + b[i];
                sum[i+1] = a[i+1] + b[i+1];
                sum[i+2] = a[i+2] + b[i+2];
                sum[i+3] = a[i+3] + b[i+3];
              }
          
          If we code this ourselves, it's a lot of work, and it could even slow down the code if the unrolling turns out to be bad for our target CPU.
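
          For completeness, a sketch of what the hand-unrolled version looks like once the leftover elements are handled. The name vector_add_unrolled and the factor of 4 are arbitrary choices for illustration; whether this beats what the compiler does on its own depends on the target.

```c
#include <assert.h>
#include <stddef.h>

/* restrict promises the caller that sum, a and b do not overlap. */
void vector_add_unrolled(double *restrict sum, const double *restrict a,
                         const double *restrict b, int n)
{
  int i = 0;

  /* Main body: four elements per iteration while at least four remain. */
  for (; n - i >= 4; i += 4) {
    sum[i]   = a[i]   + b[i];
    sum[i+1] = a[i+1] + b[i+1];
    sum[i+2] = a[i+2] + b[i+2];
    sum[i+3] = a[i+3] + b[i+3];
  }

  /* Tail: the 0 to 3 elements left over when n is not a multiple of 4. */
  for (; i < n; i++)
    sum[i] = a[i] + b[i];
}
```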
        • jventura 13 days ago

          I think it is quite interesting, although I think you do have too much information in the tutorial (too much clicking to do). I think it would be better if you could provide a much simpler example that would allow people to learn how to use the tool by themselves. For instance, maybe start with what a push means, etc.

          The thing is that you have two levels of learning here: the compilation step and how the machine works. It could be better to start with how the machine works and eventually how to write programs for it.

          Keep up with the good work!

          • anon42428428313 13 days ago

            I know basic C but never dug into what happens below the surface. As someone who doesn't know these things I find the tutorial really well-made, with a clean and concise exposition.

            • smnplk 13 days ago

              I am also tinkering a bit (pun intended :)) with low level stuff, currently reading Low-Level Programming by Igor Zhirkov.

            • intenex 13 days ago

              This is basically the coolest thing ever. Huge, insane thanks for putting this together, man. You are a true hero.

              • equalunique 13 days ago

                Awesome work!!!

                Reminds me some of Compiler Explorer: https://godbolt.org/

                • Bucephalus355 13 days ago

                  Have waited forever for something like this. So. Many. Things. Boil down to C.

                  Thank you thank you thank you

                  • ucha 13 days ago

                    Just a heads up: it works on Chrome but not on Safari

                    • azakai 13 days ago

                      Looks like it's a blazor app (mono compiled to wasm using emscripten), so it should run in all modern browsers - what error do you get on Safari?

                      • macintux 12 days ago

                        Mobile Safari gives me a blank page.

                        • snazz 12 days ago

                          Same here. Had to check that I had JavaScript turned on.

                    • q3k 13 days ago

                      This could be very useful to visually teach exploitation basics (buffer overflows, stack smashing, ROP).

                      • bogomipz 13 days ago

                        This is really fantastic. Looking forward to more tutorials. Thanks for sharing.

                        • mattsfrey 13 days ago

                          This is super cool and would be great for CS classes, well done!!

                          • Panoramix 13 days ago

                            This was great! Looking forward to them for loops.

                            • ayepif 13 days ago

                              This is a very well-thought-out pedagogical tool. Can't wait for part 2!

                              • tyingq 13 days ago

                                Very cool. Maybe add malloc() and friends to missing functionality.

                                • vasyop 13 days ago

                                  there is new

                                  • unwind 12 days ago

                                    Not in C, there isn't.

                                    • 12 days ago
                                      [deleted]
                                • DyslexicAtheist 13 days ago

                                  wow, absolutely amazing!

                                  • thegabez 13 days ago

                                    Great tool! Thanks for making and sharing.

                                    • Siira 13 days ago

                                      These links are blank pages for me.

                                      • cwhsu 13 days ago

                                        You might have to wait a few seconds. It's blank initially, but it shows the code snippet and compiler in 3 seconds on my laptop

                                        • saagarjha 13 days ago

                                          I could not get this to work in Safari, but they do show up in Chrome.

                                          • sgt 13 days ago

                                            Same, it's not working in Safari.

                                      • GoToRO 12 days ago

                                        What is 1 in CALL 1?

                                        • Jahak 13 days ago

                                          Not bad

                                          • denniskane 13 days ago

                                            I have a similar kind of functionality in Linux on the Web, but dealing with parsing wasm (web assembly) files. There is a desktop environment, but I think it is best to use the shell environment at https://linuxontheweb.org/shell.os.

                                            First, do this:

                                            $ cp /site/root/system/wasms/games/binjgb.wasm .

                                            The system is built around the concept of command libraries. You are going to need to import the wasm library like this:

                                            $ import wasm

                                            This loads a fairly powerful command called parsewasm. To get an informational printout of the toplevel sections in that file, run the command:

                                            $ parsewasm --toplevel binjgb.wasm

                                            There are lots of other subcommands in parsewasm, but the coolest part is that you can list out the names of exported functions and then dump out the code bodies of whatever functions you want to inspect.