Noweb – A Simple, Extensible Tool for Literate Programming

(cs.tufts.edu)

140 points | by Tomte 1153 days ago

17 comments

  • gnuvince 1153 days ago
    Hijacking this topic to talk about something I've been thinking about lately: literate diffs.

    I find that the order of diffs given by git is not optimized for helping a reviewer understand the change. Sometimes the files are not ordered in the most logical way; sometimes unrelated changes (e.g., a text editor removing blanks at the end of lines) create noise; etc.

    I've been thinking that it would be interesting to have a tool where the author can take the diff of their commit(s), order them in a way that is conducive to understanding and explain each part of the diff. That'd be similar to having the author do a code walkthrough, but at the pace of the reader rather than the author.

    • shakna 1153 days ago
      If you're making use of something like git-send-email you can already do this easily.

      The patch format explicitly allows it to ignore "junk" information at certain points, so you can edit in comments all over the place. The format also lets you break up a diff, rearranging it semantically, and it'll get rebuilt later.

      Edit, to expand on the above:

      > patch tries to skip any leading garbage, apply the diff, and then skip any trailing garbage. Thus you could feed an article or message containing a diff listing to patch, and it should work..... After removing indenting or encapsulation, lines beginning with # are ignored, as they are considered to be comments.

      > With context diffs, and to a lesser extent with normal diffs, patch can detect when the line numbers mentioned in the patch are incorrect, and attempts to find the correct place to apply each hunk of the patch. As a first guess, it takes the line number mentioned for the hunk, plus or minus any offset used in applying the previous hunk. If that is not the correct place, patch scans both forwards and backwards for a set of lines matching the context given in the hunk.
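
      For a concrete sketch, a hand-annotated patch can look like the following; the leading prose and the # line are exactly the kind of "garbage" and comments that patch skips, and the hunk itself is an ordinary unified diff:

          This change adds a greeting before returning. Explain whatever
          you like up here; patch treats leading prose as skippable.

          # Lines starting with '#' outside the hunks are comments.
          --- a/hello.py
          +++ b/hello.py
          @@ -1,2 +1,3 @@
           def main():
          +    print("hello")
               return 0

      It should still apply with a plain "patch -p1 < annotated.patch".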

    • jonahbenton 1153 days ago
      Take a look at the term "Semantic Source Diff", eg

      https://martinfowler.com/bliki/SemanticDiff.html

      Tools in this space date back to the 1990s. There has been a recent upsurge of interest; a number of capable tools for different languages are currently available.

    • geofft 1153 days ago
      I would love it if my VCS tool could also keep track of things like "This diff is the result of running sed -i s/this/that/g *.py". I usually split out such mechanical changes into a separate commit anyway, but it would be clearer for reviewers to see that (most review tools show you the overall diff of the entire branch you want to merge by default, making you click further to see patch-by-patch changes), and it would also be easier for me if the VCS could re-run the sed command when I rebased.

      (An obvious next step is Coccinelle-style semantic patches, but let's start with sed!)

    • memco 1153 days ago
      What you’re describing is already possible with Git: interactive rebase and committing by chunks/lines allow you to organize your changes coherently. The trick is finding ways to get into the habit of doing it that way and staying consistent with the whole team.

      Edit: I’m not saying that this is a solved problem. I think the parent’s point is valid. I am just saying that there are some tools that make this possible, and I agree that there is a definite need for improvements in this area.

    • mikepurvis 1153 days ago
      Love it. Currently there's a gap where the diff is generated by your review platform, but it would be amazing if there was a way to submit your annotated/ordered diff and the platform would use it as the review starting point, provided it passed validation in terms of actually being a representative and equivalent diff.
    • jedimastert 1153 days ago
      I believe most literate programming tools are language-agnostic, so you could probably do that with this tool!
  • mmcdermott 1153 days ago
    Literate Programming is one of those ideas I keep coming back to. There is an idea there that touches on something I find to be true about software development, namely that the communication of an idea to other humans is the most critical piece. There is a similar idea in Naur's paper "Programming as Theory Building."

    That said, I've never loved the LaTeX-centric nature of most tools. I don't like heavier markup systems while I am writing prose, which is why I wrote SpiralWeb (https://github.com/michaeljmcd/spiralweb) as a Pandoc/Markdown centric tool.

    • fspeech 1153 days ago
      I am less convinced now after an initial period of enthusiasm. I can get around things locally fairly well with or without assistance. What takes effort to comprehend a code base is its overarching organization and internal interfaces that often only exist in the heads of the creators.

      But if one is into literate programming it is definitely a must to check out the Leo Editor http://leoeditor.com

    • loa_in_ 1153 days ago
      I found Mr. Ross' funnelweb utility to have the best syntax. Unique and easy to read.

      http://ross.net/funnelweb/tutorial/index.html

      Unfortunately the only known implementation was last updated over two decades ago, and is written in pretty hard to understand C.

      I asked for permission and started a repository here: https://github.com/loa-in-/fw-utf8

      I currently have it there unmodified, except for a disabled check for the ASCII range (that modification is included in the initial commit, sorry, my bad). Otherwise the code is the same.

      • nerdponx 1153 days ago
        It's unfortunate that Funnelweb itself wasn't written in a literate style!
        • rixed 1153 days ago
          For the record, I wrote portia [0], based on funnelweb and accepting a (mostly) compatible syntax, in a literate style. Its source/doc can be browsed at [1].

          I still use it from time to time, especially for small, well-defined projects, because I find it useful to have to argue with myself when designing software. It's not so much about producing nice documentation or a proper exposition of some idea as it is about having to formulate all the reasoning, the alternatives, and the choices.

          [0]: https://github.com/rixed/portia [1]: http://rixed.github.io/portia/

          • loa_in_ 1153 days ago
            Your project works just well enough (and up to spec!) that I might just restart my (100% compatible) funnelweb resurrection in literate format. Thank you!
    • sidpatil 1153 days ago
      Obligatory shilling of Org-babel, for those using Emacs and Org-mode: https://orgmode.org/worg/org-contrib/babel/
      • kmstout 1153 days ago
        I've been using Org for a little over a year, and it's actually quite nice. To support a blog post last year (https://reindeereffect.github.io/2020/05/05/index.html) I did my own quick and dirty rendition of chunk labels, chunk chain navigation, and clickable references.

        Lately I've built a faster, mostly drop-in replacement for org-babel-tangle (that doesn't unnecessarily clobber files that haven't changed); and I'm finishing up a more complete chunk formatter for HTML export, along with usable chunk index generation. Once that's done, I'll quit nerd sniping myself on literate programming systems for awhile and finish up a missive on programming a Turing machine to solve the Towers of Hanoi.

        • stevekemp 1153 days ago
          I keep meaning to experiment with babel/tangle in Emacs.

          I set up a simple literate configuration of my init file via markdown, which worked out really well, but doing it "properly" in org-mode would be a nice evolution.

          With markdown I just search for code-blocks, write them all sequentially to a temporary buffer and evaluate once done. So it is very simplistic, but also being able to write and group things is useful:

          https://github.com/skx/dotfiles/blob/master/.emacs.d/init.md
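
          (For anyone curious how little that takes outside Emacs, here is a rough Python sketch of the same extract-and-concatenate idea; it is not the Emacs Lisp the init file above actually uses.)

              # Naive markdown "tangle": collect every fenced code block, in
              # order, and write them out as one plain source file. No chunk
              # names, no reordering -- just sequential concatenation.
              import sys

              def tangle(markdown_path, out_path):
                  in_block = False
                  with open(markdown_path) as src, open(out_path, "w") as out:
                      for line in src:
                          if line.lstrip().startswith("```"):
                              in_block = not in_block  # toggle at each fence
                          elif in_block:
                              out.write(line)

              if __name__ == "__main__":
                  tangle(sys.argv[1], sys.argv[2])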

    • chriswarbo 1153 days ago
      I use a couple of Pandoc scripts to run code during the render of my Web site: http://chriswarbo.net/projects/activecode

      I originally tried Emacs org-mode babel, but it didn't really fit the 'batch pipeline' flow I wanted.

  • svat 1153 days ago
    One of the problems with literate programming is that everyone who wants to write literate programs seems to want to write their own literate programming system. This was the conclusion of a short-lived (running to 5 issues) column of literate programs in the Communications of the ACM (1987–1990), by Christopher J. Van Wyk:

    > Unfortunately, no one has yet volunteered to write a program using another’s system for literate programming. A fair conclusion from my mail would be that one must write one’s own system before one can write a literate program, and that makes me wonder how widespread literate programming is or will ever become. This column will continue only if I hear from people who use literate-programming systems that they have not designed themselves.

    And it did not continue. Since then though, it appears that Noweb (and more recently, org-babel, and somewhere in between the Leo editor) is among the literate-programming systems that have been the most successful at getting others to use them!

    Separately, something amusing:

    When Donald Knuth came up with "literate programming" (partly because it had been suggested to him, by Tony Hoare IIRC, that he ought to publish as a book the source of the TeX program he was rewriting, so he was led to solve the problem of exposition) and the idea of programs as literature, he made a joke (or maybe he was half-serious, hard to say):

    > Perhaps we will even one day find Pulitzer prizes awarded to computer programs. (http://literateprogramming.com/knuthweb.pdf)

    That does not seem likely, but reality is stranger than one can imagine: a literate computer program won an Oscar! In 2014, an Academy Award (Scientific and Technical) was given to the authors of the book Physically Based Rendering (http://www.pbr-book.org/), itself a literate program. So we have this video of the award presentation, where actors Kristen Bell and Michael B. Jordan read out the citation and one of the awardees (Matt Pharr) thanks Knuth for inventing literate programming: https://www.youtube.com/watch?v=7d9juPsv1QU

    • taeric 1153 days ago
      I have actually made it a point to try and pick up all the literate programs in book form that I can. This one is particularly dense, and big, so I have not made too much progress. :(

      I have made better progress with the MP3 book, which I have enjoyed. Same for the Stanford GraphBase.

      It is frustrating, as I am not a fan of C, all told. And I have not found any Lisp literate programs. If you know of any, I'd be very interested.

  • taeric 1153 days ago
    I picked up https://smile.amazon.com/gp/product/1541259335 recently. It is a somewhat short program for MP3 written in a literate way. It is a very good argument for the practice.

    At the same time, it is a decent argument against the practice. Most programs are not linear in the "why"; instead there are many, many competing priorities for why something was done the way it was. More so if you consider codebases with more than a few contributors, and especially so if they are all conceptually contributing.

    Which makes sense if you think of most creative books. You will have many contributors, but the narrative is usually split between a very small number of authors. Most contributions are in supporting art, editing, or general feedback. To move programming to a similar space would require working with contributions in a similar way. (That last point is clearly an assertion.)

  • HerrMonnezza 1153 days ago
    Literate programming seems to be becoming popular in the R community due to knitr and R Markdown. This seems to have sparked a few similar tools with possibly broader scope and adoption. In my bookmarks I find:

    - knot [1]: tangles source code from a text file formatted using plain markdown syntax, can use any markdown converter for weaving into a printable document

    - snarl [2]: extends markdown code blocks with syntax used for tangling; its "weave" step just removes the additional syntax and outputs plain markdown

    - pylit [3] [4]: a bidirectional converter: code to formatted text and back. Uses reST for formatting, and preserves line numbers which is useful when debugging. Not an LP tool strictly, as it doesn't define/rearrange code blocks so you have to write your script in the order the compiler wants it, not in the order that would make the best exposition.

    Both knot and snarl seem to preserve relative indentation of chunks, so they would be useful for Python too.

    [1]: https://github.com/mqsoh/knot [2]: https://blog.oddbit.com/post/2020-01-15-snarl-a-tool-for-lit... [3]: https://github.com/gmilde/PyLit [4]: https://github.com/slott56/PyLit-3

  • fjfaase 1153 days ago
    I am working on a tool that can take a collection of MarkDown files with fragments of C code and combine these into a single C file, where all fragments are placed in an order such that they can be compiled. Because defines can change the meaning of code depending on where you place them, there are some restrictions on the input files. An example of the type of input file I have in mind is given at https://github.com/FransFaase/RawParser/blob/master/docs/gra... . The tool I am developing, which is far from finished, can be found at https://github.com/FransFaase/IParse in the MarkDownC.cpp file.
    • qbasic_forever 1153 days ago
      Check out lit.sh, a super simple shell script that does what you're after: https://github.com/vijithassar/lit It's basically a one-way markdown-to-commented-source converter. One really slick thing is that, since it just comments out the markdown, you lose no information about line numbers and can interpret errors from your compiler, etc. with ease (something that is almost impossible with more complex literate programming systems like noweb).
      • fjfaase 1153 days ago
        I would like to do a little more than this, as this script would still require you to enter the code fragments in an order in which they can be compiled. I also would like to use ellipses ('...') to allow extending the definition of types and initialization functions. One of the ideas of LP is that you can present your code fragments in a non-linear fashion, explaining the code from the inside out and/or delaying the implementation details till the end. The problem of keeping the lines in sync can also be addressed by adding #line statements to the generated code.
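
        (To illustrate just the #line idea with a minimal sketch, not the actual MarkDownC code: a tangler that remembers where each fragment came from can prefix it with a #line directive, so compiler errors and the debugger point back at the MarkDown source.)

            # Sketch: write collected C fragments to one output file, prefixing
            # each with a #line directive that names the MarkDown file and line
            # the fragment was taken from.
            def emit(fragments, out_path):
                # fragments: list of (source_file, first_line_number, code_text)
                with open(out_path, "w") as out:
                    for source_file, first_line, code in fragments:
                        out.write('#line %d "%s"\n' % (first_line, source_file))
                        out.write(code)

            emit([("grammar.md", 42, "int parse(const char *s);\n")], "combined.c")
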
    • nerdponx 1153 days ago
      This sounds a lot like the R Markdown format. https://rmarkdown.rstudio.com/articles_intro.html
      • fjfaase 1153 days ago
        I understand that the R Markdown format is not the same as the GitHub MarkDown format. (Correct me if I am mistaken.) I would like to use GitHub's format as the basis, also because it is compatible with the github.io documentation websites.
        • nerdponx 1153 days ago
          I believe it's mostly compatible with Github Markdown and CommonMark. I'm not sure if the code blocks will render properly on Github.
  • jsyedidia 1153 days ago
    Literate (https://zyedidia.github.io/literate/ and https://github.com/zyedidia/Literate) is a great option for anybody interested in Literate Programming.
  • iluvblender 1153 days ago
    I have been using http://leoeditor.com/ for a couple of years and it is an essential tool in my workflow. Great to see other literate programming tools out there.
  • dhosek 1153 days ago
    I used to be a serious advocate of LP. I spent a lot of time in Pascal WEB and later CWEB. I really appreciated how the latter emitted #line directives in its output so that compiler error messages (all too common in those pre/proto-IDE days) and debugger traces would refer to the original CWEB source code. I also tended to have a single CWEB file generate both the .c and .h files for a program, which unfortunately meant that I would end up recompiling code that depended on the .h even if there was no change to the .h file's contents.

    Something similar could be useful now, but a lot of the tooling around LP was geared towards printing out source code to refer to rather than working with it on-screen (which made sense in those days of 80x24 text-only displays).

    The Pascal WEB change file mechanism was pure brilliance. It was by far superior to the standard C practice of using preprocessor directives to manage compilation for different targets.
  • zimpenfish 1153 days ago
    Not only is the link to my cross references in TeX broken, I've no idea where that code currently is either. Although given the lack of complaining emails about it, I doubt many people are using noweb these days, much less with plain TeX.
  • daly 1153 days ago
    I've been writing literate programs for years.

    Here is a video showing a literate form of Clojure:

    https://www.youtube.com/watch?v=mDlzE9yy1mk

    The literate program creates a new PDF and a working version of Clojure, including running a test suite. If you change the literate code and type 'make' it re-makes the PDF with the new changes and rebuilds/retests Clojure.

    And here is the source:

    https://github.com/robleyhall/clojure-small-pieces

  • HerrMonnezza 1153 days ago
    I used Noweb a few years ago [^1] for a small Python program; there were two major issues with it, from my PoV:

    - I had to take care of writing each Python code chunk with the amount of indentation appropriate for where it had to end up, since Noweb does (did?) not respect relative indentation of chunks when tangling.

    - Debugging the resulting script was more painful than plain Python sources, as all the debugging info (line numbers, etc.) referred to the tangled code and not to the actual noweb source file I was editing.

    [^1]: Looking at the website, it doesn't seem to have changed much since then.

  • kragen 1153 days ago
    (BTW, Norman's server seems to be suffering under the load; https://web.archive.org/web/20210223015500/https://www.cs.tu... has your (Way)Back if you're having problems accessing it.)

    I've been interested in literate programming for a long time; for my self-bootstrapping PEG parser https://github.com/kragen/peg-bootstrap/blob/master/peg.md I wrote my own noweb-like system called HandAxeWeb in Lua (5.x) https://github.com/kragen/peg-bootstrap/blob/master/handaxew.... It accepts input in Markdown, HTML, ReStructuredText, etc., and it's only a couple hundred lines of Lua.

    For HandAxeWeb (named following the convention of StoneKnifeForth—the intent was to make it simple enough that even fairly early stages of a bootstrap could be literate programs), I wanted to be able to include multiple versions of a program in the same document, because I think it's often helpful to see the temporal development of a program from a simpler version to a more complex version. The simpler version is easier to understand and helps you focus on the most fundamental aspects of the program. https://www.youtube.com/watch?v=KoWqdEACyLI is a 5'30" screencast (not by me) of explaining the development of a Pong program in this fashion. I think it's usually easier to understand a program in this fashion than by trying to understand the final complete version bit by bit, the way something like "TeX: The Program" forces you to do.

    Still, generally speaking, soi-disant literate programming tools—including my own—fail to take advantage of the most compelling aspect of the computer as a communication medium: its ability to simulate. When I dive into a new code base, it's never entirely by reading it—whether top-down, bottom-up, or in any other order. The cross-reference links added by things like CWEB (or, you know, ctags) are helpful, of course, but invariably I want to see the output of the program, which CWEB doesn't support at all! (Although Knuth's TeX: The Program does manage to include TeX output despite being written in WEB, that's in a sense sort of a coincidence; this is not a feature WEB or CWEB can provide for any programs other than TeX and METAFONT.)

    Books like Algorithms, by Knuth's student Sedgewick, are full of graphical representations of the outputs of the algorithms being discussed, and this is enormously helpful—perhaps even more so than the source code; see https://algs4.cs.princeton.edu/22mergesort/ for some examples from the current version of the book, which is lamentably in Java. It's better still, though, when you can edit the code and see the results—when diving into a new code base, I tend to execute modified versions of the code a lot, whether in the debugger or with extra logging or what. Paper books can't do this, but that's no excuse for not doing it when we're writing for readers who have computers.

    Philip Guo's Python Tutor http://pythontutor.com/ provides dynamic visualization of the memory contents of, in theory, arbitrary code (in the supported languages, including of course Python, but also C, C++, Java, JS, and Ruby). There are things you can display with animation that you can't display in a static page, but Algorithms gets quite far with static printed images, and I think static visualization is better when you can make it work, for reasons explored at length in Bret Victor's http://worrydream.com/MagicInk/.

    Python Tutor doesn't scale to large programs, but Dorothea Lütkehaus and Andreas Zeller's DDD can control GDB to create such visualizations for anything you can run under GDB (or JDB, Ladebug, pydb, or perl -d). Unfortunately there's no way to share the output of either DDD or Python Tutor, except maybe a screencast, and despite having been around since 01995, DDD has never been popular, I suspect because its Motif UI is clumsy to use. https://edoras.sdsu.edu/doc/ddd/article20.html shows what it looked like in 02002 and https://youtu.be/cKQ1qdo79As?t=106 shows what it looked like in 02015.

    Of course, spreadsheets are by far the most popular programming environment, and they have always displayed the program's output when you open it—even to the exclusion of the code, mostly. I've experimented with this sort of thing in the past, with things like http://canonical.org/~kragen/sw/bwt an interactive visualization of the Burrows–Wheeler transform, and so it's been heartening to see modern software development moving in this direction.

    The simplest version of this is things like Python's doctest, where you manually paste textual snippets of output in the code itself, and a testing tool automatically verifies that they're still up-to-date; Darius Bacon's Halp https://github.com/darius/halp is a more advanced version of this, where the example output updates automatically, so you can make changes to the program and see how they affect the results.
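
    (For anyone who hasn't seen doctest, here is a minimal example, not from Halp: the pasted session in the docstring is the snippet of output, and running python -m doctest on the file, or the testmod() call below, re-checks it.)

        def rle(s):
            """Run-length encode a string.

            >>> rle("aaabcc")
            [('a', 3), ('b', 1), ('c', 2)]
            """
            out = []
            for ch in s:
                if out and out[-1][0] == ch:
                    out[-1] = (ch, out[-1][1] + 1)
                else:
                    out.append((ch, 1))
            return out

        if __name__ == "__main__":
            import doctest
            doctest.testmod()  # fails loudly if the pasted output goes stale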

    The most polished versions of this approach seem to have adopted the name "explorable explanations", and many of the best examples are Amit Patel's, which he has at different times termed "interactive illustrations" https://simblob.blogspot.com/2007/07/interactive-illustratio..., "active essays" (I think? Maybe I'm misremembering and that term was current in Squeak around 02003: http://wiki.squeak.org/squeak/1125), and "interactive visual explanations". I wrote a previous comment about this on here in 02019: https://news.ycombinator.com/item?id=20954056.

    However, Amit's explorables, like many other versions of the genre, de-emphasize the underlying code to the point where they both don't display the actual code and don't let you edit it. They're intended to visualize an algorithm, not a codebase.

    Mike Bostock, d3.js's original author, has created https://bl.ocks.org/ for sharing explorable explanations made with d3, and is doing a startup called ObservableHQ which makes things like this a lot easier to build: https://beta.observablehq.com/d/e639659056145e88 but at the expense of a certain amount of polish and presentational freedom. Also, unfortunately, ObservableHQ programs seem to be tied to the company's website—you can download their output, but very much unlike TeX, the programs will only be runnable until the company goes out of business. So if you aspire to make a lasting contribution to human intellectual heritage, like TeX, GCC, or d3.js itself, ObservableHQ is not for you.

    R Markdown (by JJ Allaire—yes, the Cold Fusion dude—and Yihui Xie, among others) is one of the more interesting developments here; as with noweb or HandAxeWeb, you edit something very close to the "woven" version of the source code (in a dialect of Markdown); but, in a separate file alongside, RStudio maintains the results of executing the code, which are included in the "woven" output, and may be textual or graphical. Moreover, as with Halp or ObservableHQ, these results are displayed in a notebook-style interface as you're editing the code. https://bookdown.org/yihui/rmarkdown/notebook.html has a variety of examples, and Xie is rightly focused on reproducibility, which is very challenging to achieve with the existing tooling. https://bookdown.org/ lists a number of books that have been written with R Markdown, and https://github.com/rstudio/rmarkdown explains the overall project.

    Of course the much more common notebook-style interface, and the one that popularized the interaction style, is Jupyter (influenced by SageMath), which mixes input and output indiscriminately in the same file and peremptorily makes backward-incompatible changes in file formats; the result is a lot of friction with version-control systems. Nevertheless, it supports inline LaTeX, it's easy to use and compatible with a huge variety of existing software, and it can include publication-quality visualizations, so there's a lot of code out there in Jupyter notebooks now, far more than in any system that purports to be a "literate programming" system. Notable examples include Peter Norvig's œuvre (there's a list at https://github.com/norvig/pytudes#pytudes-index-of-jupyter-i...). I find this a very comfortable and powerful medium for this form of literate programming; recent examples include https://nbviewer.jupyter.org/url/canonical.org/~kragen/sw/de..., https://nbviewer.jupyter.org/url/canonical.org/~kragen/sw/de..., and https://nbviewer.jupyter.org/url/canonical.org/~kragen/sw/de..., which are maybe sort of embarrassingly bad but I think demonstrate the potential of the medium for vernacular expression of vulgar software, as well as lofty Norvig-type things.

    Konrad Hinsen has written about the reproducibility and lock-in problems introduced by Jupyter, for example in https://khinsen.wordpress.com/2015/09/03/beyond-jupyter-what..., and has been using Tudor Gîrba's Glamorous Toolkit https://gtoolkit.com/ to explore what comes next. He's been hitting the reproducibility problem pretty hard in http://www.activepapers.org/ but the primary intent there is, as with the explorable-explanations stuff, code as a means to producing research ("How should we package and publish the outcomes of computer-aided research"), rather than maintainability and understandability of code itself. I think this is a promising direction for literate programming as such, too.

    • akkartik 1153 days ago
      Here are a few more projects that may or may not seem like Literate Programming, but are motivated squarely by its ethos: to order code for exposition, independent of what the compiler wants.

      * https://github.com/snaptoken, the engine behind https://viewsourcecode.org/snaptoken/kilo. The key new feature here seems to be that fragments are always shown in context that can be dynamically expanded by the reader.

      * https://github.com/jbyuki/ntangle.vim -- a literate system that tangles your code behind the scenes every time you :wq in Vim or Neovim.

      * My system of layers deemphasizes typesetting and is designed to work within a programmer's editor (though IDEs will find it confusing): http://akkartik.name/post/wart-layers. I don't have a single repo for it, mostly[1] because it's tiny enough to get bundled with each of my projects. Perhaps the most developed place to check out is the layered organization for a text editor I built in a statement-oriented language with built-in support for layers: https://github.com/akkartik/mu1/tree/master/edit#readme. It's also in my most recent project, though it's only used in a tiny bootstrapping shim before I wormhole solipsistically into my own universe: https://github.com/akkartik/mu/blob/main/tools/tangle.readme.... Maybe one day I'll have layers in this universe.

      [1] And also because I think example repos are under-explored compared to constant attempts at reusable components: http://akkartik.name/post/four-repos

      • kragen 1153 days ago
        These are great, thank you!
    • kragen 1153 days ago
      About spreadsheets, I missed the editing window on this, but I wanted to point out that in addition to the plotting capabilities spreadsheets have included since at least Lotus 1-2-3 1.0A in 01983 https://www.pcjs.org/software/pcx86/app/lotus/123/1a/ you can use conditional formatting and the like to get useful algorithmic visualizations; as an example, consider http://canonical.org/~kragen/sw/dev3/minskyplot.gnumeric, which also uses a slider to allow you to alter algorithm parameters dynamically in an ObservableHQ-like way. Even with Lotus 1-2-3 on a 4.7-MHz IBM PC 5150, you could get a much quicker feedback loop for that kind of thing than you can get from reading a printed program, but it was considerably harder to share with other people.

      If you want to get that kind of historical end-user programming perspective, you can load the disk image at http://canonical.org/~kragen/sw/dev3/lotus-123-1a-plotsin.im... into the PCjs emulator running Lotus 1-2-3 linked above (mount it as drive B:), /FR Retrieve PLOTSIN.WKS, and type /GV to view the graph, and you can also load the .wks file from http://canonical.org/~kragen/sw/dev3/plotsin.wks into modern Gnumeric or LibreOffice Calc—but they won't display the graph. (I was also able to mount a directory containing the files from that disk image on drive B: in Dosbox and load the spreadsheet into 1-2-3—but Dosbox's CGA emulation seems to screw up on actually displaying the graph, and I think PCjs is also emulating the speed of the machine, which is an important aspect of the user experience.)

      Of course spreadsheets are a pretty limited programming environment, and like modern explorable explanations, they're focused on presenting the results of the computation, or enabling you to apply it to new inputs, rather than focused on explaining the inner workings of the computation itself. But they do expose the inner workings, even if only by necessity, and for problems they can solve at all, they're often a much more convenient way to understand some algorithm than a static pile of source code.

    • someguydave 1153 days ago
      Thanks for your comment. Are you aware of a literate programming tool that primarily uses tags in source files to link with documentation in txt files? I guess I am thinking one could tangle the source with docs to produce the documentation, while the source is passed to the compiler unchanged.
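
      Something like the following sketch is what I have in mind (the markers are made up for illustration, not any existing tool's syntax; the point is that the compiler only ever sees the unmodified source):

          # Toy "weave": copy doc.txt to doc.out, replacing each line of the
          # form "@include TAG" with the region of module.py bracketed by the
          # (hypothetical) markers "# begin-doc TAG" ... "# end-doc TAG".
          import re

          def tagged_regions(source_path):
              regions, current, buf = {}, None, []
              for line in open(source_path):
                  begin = re.match(r"\s*# begin-doc (\w+)", line)
                  if begin:
                      current, buf = begin.group(1), []
                  elif re.match(r"\s*# end-doc", line) and current is not None:
                      regions[current] = "".join(buf)
                      current = None
                  elif current is not None:
                      buf.append(line)
              return regions

          def weave(doc_path, source_path, out_path):
              regions = tagged_regions(source_path)
              with open(out_path, "w") as out:
                  for line in open(doc_path):
                      use = re.match(r"\s*@include (\w+)\s*$", line)
                      out.write(regions[use.group(1)] if use else line)

          weave("doc.txt", "module.py", "doc.out")
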
      • akkartik 1153 days ago
        https://github.com/nickpascucci/verso works like this. There's a syntax for creating tags in source files, and exposition for tags lives in a separate file.
        • someguydave 1153 days ago
          yeah exactly, thank you.

          I found that the Leo editor does this too, but I believe you must use the GUI to tangle/weave; I would prefer a CLI for automation.

      • kragen 1153 days ago
        That's a really interesting idea! The closest things I've seen along those lines are Javadoc and its numerous bastard progeny (most notably Doxygen), which omit the "documentation in txt files" part entirely, and "shadow blocks" in Forth systems, where if I understand correctly you'd put the textual documentation a fixed number of blocks away from the source code on disk. So, if that offset were 50, code block 42 would correspond to shadow block 92, and there was a short command in the editor to jump back and forth between displaying the code and the comments (screens were too small at the time to display both at once). But I never used these systems.
      • kmstout 1153 days ago
        There's a variation called "elucidative programming" [1] wherein the source is marked with "anchors" that the documentation can reference. Since source code lives in traditional source files, all the regular development infrastructure continues to work. When the source/documentation bundle is processed, the output is a two-pane coordinated view of code and discussion.

        [1] http://people.cs.aau.dk/~normark/elucidative-programming/

        • someguydave 1152 days ago
          Yeah, I like it, but the view of the code is a little lame if you wanted a printout. I think I would prefer comment anchors in the code that would weave into code quotes in a document.
  • Tijdreiziger 1153 days ago
    The postcard collection [1] is definitely the part of the site I enjoyed most!

    [1] https://www.cs.tufts.edu/~nr/noweb/gallery/

  • smusamashah 1153 days ago
    Is there a simple intro or ELI5 to get into or learn literate programming?

    My current understanding is that if I write a paragraph-sized comment to explain each and every part of my code and its intent, it will be called literate programming.

    • fn-mote 1153 days ago
      It depends a lot on what you imagine writing with literate programming, and, like all documentation, who you imagine the reader to be.

      When I (used to) write literate programs, the document I produced would be some kind of top-down view of the functionality. I would begin by explaining the kind of problem to be solved and include motivating examples. Then I would explain the structure of the solution and start writing each piece. At the end (perhaps in an appendix) I would have the parts where the pieces are assembled into the structure required by the compiler.

      One of the essential points of literate programming is that it lets you structure your explanation in a way that makes sense, while the literate programming tool outputs "chunks" restructured in a way that makes sense for the compiler.
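
      In noweb's notation, that looks something like this toy example (chunk definitions start with <<name>>= and end at a line containing just @; notangle stitches them together in compiler order no matter where they appear in the prose):

          Here is the overall shape of the program; the interesting part
          comes later.

          <<wordcount.py>>=
          <<imports>>
          <<count the words>>
          @

          The core really is a one-liner, so discuss it first:

          <<count the words>>=
          print(len(sys.stdin.read().split()))
          @

          Only at the end bother with the boring part:

          <<imports>>=
          import sys
          @

      Running notangle -Rwordcount.py wc.nw > wordcount.py emits the chunks in the order Python needs, while noweave typesets them in the order written above.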

      Perhaps your idea of paragraph-sized comments seems silly because you're not imagining something that would be complex enough to comment that way? Imagine a physics simulation. A numerical linear algebra library. Perhaps a game where there are some complex interactions between certain entities that need to be spelled out so that the next person knows what the heck is going on.

      Of course there is a level of organization where people write separate design docs for everything, and some level of management has signed off on this or that... I don't think literate programming is for that level of coordination. I think it's for a smaller team, a more personal level of organization and exposition.

      BTW, I am pretty sure Norman Ramsey himself has said that with many modern programming languages, literate programming is no longer essential. The order of presentation of functions, for example, is not constrained in Java. In the olden days (ummm, yes, I know C and its derivatives are alive and well today...), you would need to generate header files and source code, so the signature in the `*.h` file had to match the function in the `.c` file. Better to keep them adjacent in the documentation, at least. But that isn't really the way things look today, at least in my world.

      • taeric 1153 days ago
        While I agree that things are less necessary in a modern language, I disagree that Java is "unconstrained." This is especially false when one considers that the "entry point" to your code is more dependent on how the framework you are using calls it.

        It gets laughable when you have codebases that have "gone all in (functional|object oriented|any other style)" where they seem to mistake the style for the goal, which should be to solve a problem. (I say this as someone that is pretty sure I have made those mistakes.)

    • mcguire 1153 days ago
      There's Knuth's TeX Book and METAFONT Book, and his Stanford GraphBase book. There's also his collection of papers, Literate Programming (https://www.amazon.com/gp/product/0937073806/ref=dbs_a_def_r...).

      Knuth has an LP web page (https://www-cs-faculty.stanford.edu/~knuth/lp.html), but it looks like the examples are out of date.

      Probably more useful is http://www.literateprogramming.com/; the CWEB Tool page has some examples and the PDF Articles page has ... articles.

      Here's an intro from Knuth: http://www.literateprogramming.com/knuthweb.pdf

      And then there's Physically Based Rendering at http://www.pbr-book.org/.

  • kmstout 1153 days ago
    Required reading for anyone who wants to extend Noweb to do new things is "The noweb Hacker's Guide":

    http://www.literateprogramming.com/noweb_hacker.pdf

    This came in handy when I wanted syntax highlighting in a woven document.

    • froh 1153 days ago
      noweb made two brilliant design decisions and one not-so-far-sighted one.

      The first plus was, as you point out, the extensibility of noweb: the pipeline architecture, which transforms the literate input into a plain-text token stream (documentation and code alike), then applies token-stream transformations, where you can insert your own (indexing, syntax highlighting, macro expansion if you wished), and then reassembles the transformed token stream into output documents.
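
      (To make that concrete, here is a toy filter, assuming the keyword stream documented in the Hacker's Guide linked in the parent comment, where every line starts with @defn, @use, @text and so on. It just reports chunk definitions on stderr and passes the stream through untouched; you would splice it in with something like noweave -filter ./listchunks file.nw.)

          #!/usr/bin/env python3
          # Pass-through noweb pipeline filter: list chunk definitions on
          # stderr, forward every token line unchanged on stdout.
          import sys

          for line in sys.stdin:
              if line.startswith("@defn "):
                  print("defines chunk:", line[len("@defn "):].rstrip(), file=sys.stderr)
              sys.stdout.write(line)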

      The other brilliant idea was to go for a minimalistic literate syntax and be language-agnostic, for both the markup and the programming language.

      This design decision was a focus on the absolute bare minimum, the gist of literate programming, and it was still open to all kinds of magic via user plug-ins.

      This decision also made noweb trivial to learn.

      However. How noweb then chose to move to Icon as its scripting and extension language escapes me.

      In my book, that was the design decision that killed it. And the rewrite to noweb3, Lua-based, remained in eternal 'beta'.

      And LP as a whole has always struggled with IDE/editor support.

      Literate programming as a discipline could be resurrected with the advent of the Language Server Protocol. That might make literate programming accessible to contemporary IDEs again.

      • thyrsus 1153 days ago
        I enjoyed the ideas and clarity of Icon, but I've never had a colleague who would invest the time in learning it. The bus-factor risk is larger when the language barrier is high. The small community is reflected in Icon's sparse ecosystem.
  • nopemcnope 1153 days ago
    I actually know the prof and have used the product. It’s a disaster.