M4 (computer language)

(en.wikipedia.org)

70 points | by laktak 2072 days ago

9 comments

  • kazinator 2071 days ago
    Here is a GNU Makefile snippet I produced many years ago. It lets you have files preprocessed with the shell, C preprocessor (GNU cpp) or M4:

      %: %.ct
      	cpp -P $(shell env | sed -e "s/.*/-D_ENV_'&'/") $< > $@
    
      %: %.st
      	( echo "cat <<!" ; cat $< ; echo "!" ) | __FILE__=$< bash > $@
    
      %: %.mt 
      	rm -f $@.err
      	m4 -D__FILE__=$< $(shell env | sed -e "s/\([^=]*\)=\(.*\)/-D'_ENV_\\1=\\2'/") $< > $@ 2> $@.err
      	if [ -s $@.err ] ; then cat $@.err ; exit 1 ; fi
      	rm -f $@.err
    
    .ct files are "C preprocessor Templates". Environment variables are available as preprocessor symbols prefixed with _ENV_.

    .st files are "Shell here doc Templates". The content of the file is treated as the body of a here doc. The name of the file is available as __FILE__; other environment variables are present naturally under their ordinary names.

    .mt files are "M4 Templates". The __FILE__ macro is available with the name of the file. The environment is available as _ENV_* namespaced macros, like in the .ct case.

    I used this in a from-scratch embedded distro for preprocessing files in /etc and such.
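
The .st rule can also be tried outside of make. Below is a minimal sketch in plain shell, assuming bash is installed; the function name `expand_st`, the demo file `motd.st` and the `GREETING` variable are made up for illustration. `__FILE__` is set on the `bash` side of the pipe, because a variable exported inside a subshell on the left side of a pipe is not visible to the process on the right.

```shell
#!/usr/bin/env bash
# Sketch of the "shell here-doc template" idea.
# expand_st is a hypothetical helper name, not part of the original Makefile.
expand_st() {
    template=$1
    # Wrap the file in "cat <<!" ... "!" and let a child bash expand it.
    { echo 'cat <<!'; cat "$template"; echo '!'; } | __FILE__="$template" bash
}

demo_dir=$(mktemp -d)
# Single quotes keep the $... text literal inside the template file.
printf '%s\n' 'greeting=$GREETING' 'source=$__FILE__' 'sum=$((2 + 3))' > "$demo_dir/motd.st"

export GREETING=hello
result=$(expand_st "$demo_dir/motd.st")
printf '%s\n' "$result"

rm -rf "$demo_dir"
```

Because the file becomes the body of an unquoted here-doc, `$VAR`, `$(command)` and `$((arithmetic))` are all expanded, so templates should only come from trusted sources. A template line consisting solely of `!` would end the here-doc early.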

  • fusiongyro 2071 days ago
    M4 is an interesting macro language, but developed (IMO) a somewhat undeserved reputation for being awful because it is how you build autoconf inputs as well as sendmail configuration files. Guilt by association.

    M4 has some "interesting" quirks but I have used it to build a static site. I briefly was maintaining my blog using make and M4. I no longer remember what the precise reason was that I stopped doing that, probably just that I wanted a through-the-web editor for content. I could dig up the "code" if there were interest in that.

    • gizmo686 2071 days ago
      My experience with M4 is using it to write SELinux policy. The reference policy (which serves as a baseline to build custom policy on top of) is essentially a DSL implemented with M4. It's actually a good language, and looks a lot like how we would have designed it if we were actually going to write it properly [0]. However, debugging it when something goes wrong (e.g., what should be a syntax error) is atrocious. I'm sure it is great if you stick to using it for macros. However, at this point, I view it in the same light as Microsoft Excel: good at what it does, but too powerful for its own good. If you're not careful, it will grow until you wish you had been using a proper programming language from the beginning.

      [0] Except for the backtick as opening quote. Objectively, I think this is a good design decision for any language, but it takes a lot of getting used to.

      • lolikoisuru 2071 days ago
        >[0] Except for the backtick as opening quote. Objectively, I think this is a good design decision for any language, but it takes a lot of getting used to.

        The `quoting' in m4 is pretty nice since it makes the quotes easy to nest. Sure, braces would also be an option, but I don't see any fundamental difference between the two, just a visual one.

    • chris_wot 2071 days ago
      You should definitely put that in GitHub or the like :-)
    • wott 2071 days ago
      > M4 is an interesting macro language, but developed (IMO) a somewhat undeserved reputation for being awful because it is how you build autoconf inputs as well as sendmail configuration files. Guilt by association.

      Hmm... well, I remember my first Linux experience (1995): I preferred messing with sendmail.cf directly to struggling with the M4 scripts that generated it. I never got those damn M4 scripts to do what I wanted. I guess M4 and I are incompatible.

    • keithpeter 2071 days ago
      https://datagrok.github.io/makebakery/

      I'm quite interested in using a directory tree of markdown files to generate a web site so that the directory names become index files showing lists of titles contained within the markdown files &c

      Looks like make (possibly with m4 macros) could provide that

    • lolikoisuru 2071 days ago
      > I briefly was maintaining my blog using make and M4.

      I also wrote a static blog generator with M4 and GNU Make recently. It was the second time I've used M4 and it was pretty great. The first was a templating (probably misusing the term) system for my dotfiles to make them work across the many distros and machines I have.

    • tinus_hn 2070 days ago
      M4 used to be the easy way to manage Sendmail configuration.
  • softbuilder 2071 days ago
    About 5 years ago we had a nice talk on M4 from Bart Massey of Portland State. He gives a nice survey of the language and a demonstration: https://www.youtube.com/watch?v=ULZxHSPWn98
  • hzhou321 2071 days ago
    M4 is a general-purpose preprocessor. Its preprocessing works at the word level, and it deliberately makes macros blend in with the look of the target language. I especially have trouble with the latter. When editing, I would like to have a "macro" mode and a "language" mode, and being confused about which mode I am in, whether writing or reading, is not good. I also find word-level macros too bottom-up. The most needed preprocessing is top-down management: `include` is an obvious example, and `if-else` too. But a scoped block-level facility is even more useful.

    As an alternative, I have developed and used MyDef: http://hz2.org/blog/mydef_general.html

  • 1wd 2071 days ago
    M4 is of historical interest, but nowadays it seems that just using a templating system would be much better in almost all cases, no? Modern systems like Jinja2 or T4 are IMO actually more powerful, easier to use, maintain and debug, and just plain nicer. Am I missing something?
    • enriquto 2071 days ago
      What is the difference between a macro language and a templating system? I do some m4 from time to time and it is actually very easy to use and really nice.
      • alxlaz 2071 days ago
        A macro processing language is pretty much a language purpose-built for describing how to copy text from here to there while performing certain changes in the process.

        A templating system is a set of language-specific constructs (a library, a module, whatever they happen to call it in that language) that essentially deals only with the latter (i.e. describing how to perform the changes). It's generally up to you to deal with the "copy text from here to there" part, although most templating engines give you a specific interface that you have to adhere to.

        I suppose the difference is better illustrated by PyExpander ( http://pyexpander.sourceforge.net/ ) which is a macro processing language based on python vs. Jinja (http://jinja.pocoo.org/) which is a templating engine for Python.

        I don't think any general statement about which one is "better" can be meaningful. I suppose that, if you have a full project already written in one language, with all work performed by a single program, it's easier to get what you need via a templating engine. If your project is already a collection of tools, whose outputs you need to tie together, it's often less effort to bring in a macro language than to write your processing logic from scratch in a non-macro language just to leverage a templating engine. Assuming, of course, that you have someone who knows the macro language in your team ;-). If all your team knows is Jinja2, you're gonna get Jinja2.

        FWIW, I also do a little M4 from time to time (and a long time ago I also worked with GPP) and find both of them fairly easy to use.

      • 1wd 2071 days ago
        I'm not sure. I get the impression with such old-school macro languages (like M4) you have to use a lot of quoting tricks, evaluation order hacks, and use low level primitives like divert and dnl to build up even the simplest of useful things. Many scripts seem to start by inventing a looping construct.

        While modern templating systems (like Jinja2 or T4) use a proper programming language (like Python or C#) so using even high-level constructs and complex data models is trivial.

        • yayana 2071 days ago
          I think this last paragraph is a bit misleading:

          > use a proper programming language

          m4 is a small proper language that can be used directly. jinja2 is a DSL written in python that doesn't actually have access to much of python.

          m4 is harder to work with in the domain (quoting, escaping, and data modeling) since it is a general purpose language, but it is easier to do general purpose processing tasks with m4 if they violate the stereotypes of what should be done in the domain.

      • jolmg 2071 days ago
        I think the main difference is that a templating language is explicit in what causes an interpolation, while a macro language is implicit. For example, compare this use of the m4 macro language:

            $ m4 << EOF
            > define(name, Jane)dnl
            > hello name
            > EOF
            hello Jane
        
        with the equivalent using the erb templating language:

            $ erb -T- << EOF
            > <% name = "Jane" -%>
            > hello <%= name %>
            > EOF
            hello Jane
        
        In the m4 example, Jane, hello, or name might even be another macro, and you'd need to know that to know what the result will be:

            $ m4 << EOF
            > define(name, nombre)dnl
            > define(hello, hola)dnl
            > define(Jane, Joe)dnl
            > define(name, Jane)dnl
            > hello nombre
            > EOF
            hola Joe
        
        Imagine you source those first 3 lines from elsewhere, a file serving as a library. Things can get pretty confusing if conventions aren't established and followed. You'd need to be explicit about what you don't want interpolated to get the same understandability as a templating language:

            $ m4 << EOF
            > define(\`name', \`nombre')dnl
            > define(\`hello', \`hola')dnl
            > define(\`Jane', \`Joe')dnl
            > define(\`name', \`\`Jane'')dnl
            > \`hello 'name
            > EOF
            hello Jane
        
        By quoting like this, you can look at any single line and know what's going on. name is not quoted in the last line so it "must" (mandated only by convention) be interpolated. "name" and "Jane" are quoted in the second-to-last line, so they're not interpolated, and they can only mean that "name" will be substituted by "Jane". This convention offers the same benefits of a templating language, only it's more burdensome and error prone.
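
One thing to check when applying this convention is that m4 rescans the text a macro expands to. A single level of quotes in a definition protects the body while `define` collects its arguments, but not when the macro is later expanded, so a body that names another macro needs a second level of quotes to stay literal. Here is a small sketch of that, assuming an `m4` binary on PATH (the demo is skipped if there is none). The file names are made up, and the quoted 'EOF' delimiter is used only to avoid backslash-escaping the backticks.

```shell
#!/usr/bin/env bash
# Compare a singly quoted and a doubly quoted macro body.
# File names below are hypothetical; the macro names follow the thread's example.
workdir=$(mktemp -d)

cat > "$workdir/single.m4" <<'EOF'
define(`Jane', `Joe')dnl
define(`name', `Jane')dnl
`hello 'name
EOF

cat > "$workdir/double.m4" <<'EOF'
define(`Jane', `Joe')dnl
define(`name', ``Jane'')dnl
`hello 'name
EOF

single=""
double=""
if command -v m4 >/dev/null 2>&1; then
    # name -> Jane, and the rescan then expands Jane -> Joe.
    single=$(m4 "$workdir/single.m4")
    # name -> `Jane', and the rescan only strips the remaining quotes.
    double=$(m4 "$workdir/double.m4")
    printf 'single-quoted body: %s\n' "$single"
    printf 'double-quoted body: %s\n' "$double"
else
    echo "m4 not found, skipping demo" >&2
fi

rm -rf "$workdir"
```

With a standard m4 the first run should print `hello Joe` and the second `hello Jane`.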

        Now, as to what is better? I think templating languages are better for modifications of documents based on variables, like making the rows of an HTML table correspond with a listing of data. The only good use-case I can think of for macro languages is extensions of languages. Like making mini-compilers (or do they call them transpilers nowadays?) by writing m4 scripts. That's the only time I think it'd be better to have implicit interpolation, when you have more interpolations than not, and the document source language is generally understood to be something far different than the target language.

        This means that, since CPP (a macro language) is mostly used to interpolate data or code where both the source and the target are understood to be C, I think it would have been better designed as a templating language. That way, you wouldn't have people joking about doing things like:

            #define TRUE FALSE
        
        which would have no ill effect in a templating language.

        On second thought, however, C being what it is (a low-level language with inflexible syntax and semantics), I can see that the intention of CPP was indeed probably to write extensions to the language, which would make it suitable to be a macro language.

    • kevin_thibedeau 2071 days ago
      I used m4 to add high level macros to an assembler [1]. You can't do that with templating without jarring syntax changes. m4's willingness to accept simple strings as macro invocations is its strongest asset.

      [1] https://kevinpt.github.io/opbasm/rst/m4.html

  • ubercow 2071 days ago
    What's the best introduction out there for someone who's never touched M4 before?
  • ameixaseca 2071 days ago
    I suggest to anyone considering M4 seriously to have a look at its substitution behaviour and syntax rules.

    Not only does the default newline handling clutter your rules as macros get more complex, but the recursive resubstitution of already parsed tokens, and the quoting needed to avoid it, can cause surprises, especially as the number of macros grows.

    Maybe for config files or configuring source files m4 is great, but for anything more complex that needs to be easy to read and understand, I'd avoid m4.
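
The newline behaviour is easy to see in isolation. Below is a minimal sketch, assuming an `m4` binary on PATH (skipped otherwise); the macro and file names are invented for the example. A `define` expands to nothing, but the newline after it is ordinary text and is copied to the output unless `dnl` discards it.

```shell
#!/usr/bin/env bash
# Count output lines with and without dnl after a definition.
# greeting, no_dnl.m4 and with_dnl.m4 are hypothetical names.
workdir=$(mktemp -d)

cat > "$workdir/no_dnl.m4" <<'EOF'
define(`greeting', `hi')
greeting
EOF

cat > "$workdir/with_dnl.m4" <<'EOF'
define(`greeting', `hi')dnl
greeting
EOF

no_dnl_lines=""
with_dnl_lines=""
if command -v m4 >/dev/null 2>&1; then
    # tr strips the padding some wc implementations add.
    no_dnl_lines=$(m4 "$workdir/no_dnl.m4" | wc -l | tr -d ' ')
    with_dnl_lines=$(m4 "$workdir/with_dnl.m4" | wc -l | tr -d ' ')
    echo "without dnl: $no_dnl_lines output lines"
    echo "with dnl:    $with_dnl_lines output lines"
else
    echo "m4 not found, skipping demo" >&2
fi

rm -rf "$workdir"
```

The version without `dnl` emits an extra blank line for the definition, which is why macro libraries tend to end nearly every line with `dnl`.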

  • lukeh 2071 days ago
    Around 1995 I built some configuration management software based on M4, because I thought it was cool! Actually, it didn't do much in the way of management, but it did allow the ISP I was working for at the time to provision a customer gateway in a single step. Saved me a lot of error-prone and rather boring hand-editing of configuration files... (and probably saved the company some money, who knows)
  • totalperspectiv 2071 days ago
    m4 is a fantastic utility. I only recently discovered it but was using it + make to create alternate builds of docker images.