Some Obscure C Features

(multun.net)

476 points | by Morhaus 1700 days ago

29 comments

  • conistonwater 1700 days ago
    In numerical analysis, "Hexadecimal float with an exponent" is not an obscure feature, it's a really nice one! If you want to communicate to somebody an exact number that your program has output, you need to either tell them the decimal number to enough digits + the number of digits (i.e., "Float32(0.1)", which is distinct from "Float64(0.1)"), or you can tell them the same number in full precision in binary, in which case the floating-point standard guarantees that that number is exactly correct and does not depend on how you interpret it. It's really nice for testing numerical code, especially with automated reproducible tests. Completely unambiguous, and I wish more languages had that (I saw the feature in Julia first).
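
    For example, a quick sketch of what this looks like in C (to the best of my knowledge, the two literals below are the float and double nearest to 0.1):

        #include <stdio.h>

        int main(void) {
            /* Hex-float literals name exact binary values; no decimal rounding
               is involved when the compiler parses them. */
            float  f = 0x1.99999ap-4f;        /* the float closest to 0.1  */
            double d = 0x1.999999999999ap-4;  /* the double closest to 0.1 */
            printf("%a\n%a\n", (double)f, d); /* %a prints them back in hex */
            return 0;
        }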
    • jacobolus 1700 days ago
      I wish Javascript, etc. had hexadecimal floats. It’s annoying to worry about whether different implementations might parse your numbers differently, worrying about whether you need to write down 15 or 17 decimal digits, ...

      Often the numbers (e.g. coefficients of some degree 10 polynomial approximation of a special function) are not intended to be human-readable anyway. Such values were typically computed in binary in the first place, and the only place they ever need to be decimal is when written into the code, where they will be immediately reconverted to binary numbers for use.

      • microcolonel 1700 days ago
        I mean, it's not too difficult to add them. You can parse, shove your result into a data view, Uint32Array, or whatever, then turn that into a Float64Array.
        • jacobolus 1700 days ago
          Fair enough. You can also just base64-encode (or whatever) a big block of binary data. But having built-in support for hex floats would be nicer.
    • nestorD 1700 days ago
      Note that you can uniquely identify all floats by printing them with printf(“%1.8e”):

      https://randomascii.wordpress.com/2013/02/07/float-precision...

      • pascal_cuoq 1700 days ago
        As the parent already said, the nice thing about hexadecimal floating-point is that the standard MANDATES that you get exactly the float/double you have uniquely described. The assertion “you can uniquely identify all floats by printing them with printf("%1.8e")” on the other hand relies on the compiler using round-to-nearest for conversions between binary and decimal, which the standard does not mandate and which, even when the compiler documents the choice, is sophisticated enough that the compiler/printf/strtod may get it wrong for difficult inputs:

        https://www.exploringbinary.com/tag/bug/

    • piadodjanho 1700 days ago
      Also, printf can be used to display floating point values in hexadecimal notation.

      The printf specifiers:

        %e Scientific notation           3.9265e+2
        %a Hexadecimal floating point -0xc.90fep-2
  • elteto 1700 days ago
    The preprocessor trick of passing function macros as parameters is not that obscure. I have seen it used and I've used it myself. It is very useful when you have a list of static "things" that you need to operate on.

    Say I have a static list of names and I would like to declare some struct type for each name. I also would like to create variables of these structs at some point, and I would always do so for the entire block of names. You could do something like this:

        #define apply(fn) \
            fn(name1) \
            fn(name2) \
            fn(name3) \
            ...
            fn(nameN)
    
        #define make_struct(name) struct name##_t { ... };
        #define make_variable_of(name) struct name##_t name;
        
        ...
    
        apply(make_struct);  // This defines all the structs.
     
        void some_function(...) {
            apply(make_variable_of);  // And this defines one variable of each type.
        }
    
    Yes, it is not pretty (it is the C preprocessor after all), but it can be very useful and clean.
    • multun 1700 days ago
      It can be both useful and obscure ;-) I realize it's not black magic, it's just uncommon.

      I used this trick to declare and generate code for an entire parser, it's lovely.

      I recently switched to the other style of C-macro though:

        #define X(A, B, C) ...
        #include <math/apply_operators.h>
        #undef X
      
      This style is less powerful but doesn't require the annoying \ at the end of lines.
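
      As a rough sketch of that style (the file name and the entries below are made up for illustration), the included file is nothing but a list of X(...) invocations:

        /* operators.def -- no include guard, just X() invocations */
        X(negate,    20, "!")
        X(different, 70, "!=")
        X(mod,       30, "%")

        /* user code: define X, include the list, undefine X */
        struct operator { int priority; const char *value; };

        #define X(NAME, PRIO, STR) struct operator operator_##NAME = { PRIO, STR };
        #include "operators.def"
        #undef X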
    • kzrdude 1700 days ago
      I would call that the X macro pattern, but the wiki article doesn't agree that it should pass the `fn` as the argument. Not sure if that's important.. https://en.wikipedia.org/wiki/X_Macro
      • elteto 1700 days ago
        Maybe at some point macros could not be passed as arguments? I honestly don’t know. Passing it as a parameter avoids all that define/undefine business.
    • nsb1 1700 days ago
      Here's one of my favorites for dealing with logging repetitive enum names and the like:

        enum foo {
            FOO_THING_ONE,
            FOO_THING_TWO,
            FOO_THING_THREE,
            ...
            FOO_THING_SEVEN_HUNDRED
        };
      
        // Using concatenate '##' and stringify '#' operators
      
        #define FANCYCASE(X) case FOO_THING_##X: str=#X; break
      
        const char *foo_to_str(enum foo myFoo)
        {
            const char *str = "UNKNOWN";  /* fallback if no case matches */
            switch(myFoo)
            {
                FANCYCASE(ONE);
                FANCYCASE(TWO);
                FANCYCASE(THREE);
                ...
                FANCYCASE(SEVEN_HUNDRED);
            }
            return str;
        }
      • erichurkman 1700 days ago
        You lose your ability to grep, though.
    • jcranmer 1700 days ago
      I'd say none of the preprocessor stuff on this list is obscure--I've seen all of them in projects before.
  • kazinator 1700 days ago
    Most of these are due to the cruft added in C99 and later.

    Compile-time trees are possible without compound literals.

    More than twenty years ago, I made a hyper-linked help screen system for a GUI app whose content was all statically declared C structures with pointers to each other.

    At file scope, you can make circular structures, thanks to tentative definitions, which can forward-declare the existence of a name, whose initializer can be given later.

    Here is a compile-time circular list

       struct foo { struct foo *prev, *next; };
    
       struct foo n1, n2; /* C90 "tentative definition" */
    
       struct foo circ_head = { &n2, &n1 };
    
       struct foo n1 = { &circ_head, &n2 };
    
       struct foo n2 = { &n1, &circ_head };
    
    You can't do this purely declaratively in a block scope, because the tentative definition mechanism is lacking.

    About macros used for include headers, those can be evil. A few years ago I had this:

       #include ALLOCA_H  /* config system decides header name */
    
    Broke on Musl. Why? ALLOCA_H expanded to <alloca.h>. But unlike a hard-coded #include <alloca.h>, this <alloca.h> is just a token sequence that is itself scanned for more macro replacements: it consists of the tokens {<}{alloca}{.}{h}{>}. The <stdlib.h> on Musl defines an alloca macro (an object-like one, not function-like such as #define alloca __builtin_alloca), and that got replaced inside <alloca.h>, resulting in a garbage header name.
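
    Schematically, the failure mode looked something like this (a simplified sketch, not the actual build files):

       /* chosen by the configure system: */
       #define ALLOCA_H <alloca.h>

       #include <stdlib.h>   /* suppose this header does: #define alloca __builtin_alloca */

       /* The header name below is assembled from the tokens < alloca . h >, which
          are macro-expanded first, so the directive effectively becomes
          #include <__builtin_alloca.h> and fails. */
       #include ALLOCA_H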
    • rwmj 1700 days ago
      Compound literals are fun. You can use them to implement labelled and optional parameters to functions too:

          struct args { int a; char *b; };
          #define fn(...) (fn_((struct args){__VA_ARGS__}))
      
          void fn_ (struct args args) {
              printf ("a = %d, b = %s\n", args.a, args.b);
          }
      
          fn (0, "test");
          fn (1); // called with b == NULL
          fn (.b = "hello", .a = 2);
      
      (As written this has a subtle catch that fn() passes undefined values, but you can get around that by adding an extra hidden struct field which is always zero).
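
      One way to spell that hidden-field workaround (my own sketch of the idea, not the parent's exact code):

          struct args { int dummy_; int a; char *b; };
          #define fn(...) (fn_((struct args){ .dummy_ = 0, __VA_ARGS__ }))

          /* fn() now expands to fn_((struct args){ .dummy_ = 0, }), which is valid
             and leaves a and b zero-initialized, instead of producing an empty
             (pre-C23 invalid) initializer list. */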
    • quietbritishjim 1700 days ago
      I haven't heard of "tentative definitions" before. Couldn't you just replace it with a regular declaration i.e.

          extern foo n1, n2;
      
      Is there any benefit of tentative definitions over this?
      • kazinator 1700 days ago
        Yes; the "extern" is potentially confusing when the definition is located in the same file below. They usually inform the reader of the code to look in some other file for the definition, and usually appear in header files.
      • Twisol 1700 days ago
        I'm not certain, but the `extern` variant probably doesn't reserve space at compile-time; it just says "the linker will know where to find these". So resolving those symbols might need to wait until link-time. The tentative definitions probably do reserve space (and hence an immediately-known address), and the later true definitions just supply the initial value to put in that space.
    • multun 1700 days ago
      Oops I'm wrong, I meant compile time anonymous trees. It's very obviously possible by defining other variables.
    • piadodjanho 1700 days ago
      Also, variable assignment from a compound literal with a designated initializer list:

         a = 0;
       x = (type_t) { .y = a++, .x = a++ };
      
      Now, try to figure out the order of the fields' assignments.
      • kazinator 1700 days ago
        The order of evaluation among initializers is unspecified, which makes it UB.

        We don't have to use mixed-up designated initializers to run aground. Just simply:

          { int a = 0;
            int x[2] = { a++, a++ }; }
        
        This was also new in C99 (not to mention the struct literal syntax); in C90, run-time values couldn't be used for initializing aggregate members, so the issue wouldn't arise.

        Code like:

           { int a = 0;
             struct foo f = { a, a }; }
        
        was supported in the GNU C dialect of C90 before C99 and of course is also a long-time C++ feature.

        https://gcc.gnu.org/onlinedocs/gcc/Initializers.html#Initial...

        The doc doesn't say anything about order. I can't remember what C++ has to say about this.

  • compi 1700 days ago
    I was reading the source code for a NES assembler written in pre-C99 C, and there was an odd C feature used in it that I haven't really seen anywhere else.

    It was before C had built-in booleans and the author had defined their own, but true was:

        void * true_ptr  = &true_ptr;
    
    true_ptr is a pointer to itself. So however many times you dereference it:

        printf("%p\n", true_ptr);
        printf("%p\n", &true_ptr);
        printf("%p\n", *((void**)true_ptr));
        printf("%p\n", *((void**)*((void**)true_ptr)));
    
    You get the same pointer:

        0x5555fefe715b
        0x5555fefe715b
        0x5555fefe715b
        0x5555fefe715b
    
    I still think it's neat that, even with ASLR, you have an address at compile time that you know won't collide with the address space of malloc results, or with the address space of your stack.

    Also you can declare the pointer as const and the value it points to as const and, if your kernel faults on writing to readonly memory pages, you get a buggier version of a NULL pointer that only segfaults on write.

    Also, it takes a second to figure out why the position of the const matters even though the pointer's value is its own address, and why only one of these segfaults on write:

        const void * const_pointer = &const_pointer;
        void * const const_value   = &const_value;
    • Sean1708 1700 days ago

        const void * const_pointer = &const_pointer;
        void * const const_value   = &const_value;
      
      I've never really understood why people put const before the type, to me the following is far more obvious:

        void const* const_pointer = &const_pointer;
        void* const const_value = &const_value;
    • davelee 1700 days ago
      I've used this feature to create constants that are unique IDs.
      • shultays 1700 days ago
        Wouldn't a counter macro work as well?
        • davelee 1700 days ago
          I think that would be per file/translation unit. These values are unique across a linked binary.
    • cryptonector 1700 days ago

        cdecl> explain const void * const_pointer
        declare const_pointer as pointer to const void
        cdecl> explain void * const const_value
        declare const_value as const pointer to void
        cdecl>
      
      The first must be the one that segfaults on write, IFF the compiler chooses to place it in the .text (as it should).
      • wizzairflyer 1700 days ago
        My (admittedly naive) understanding of the ordeal leads me to believe that the first would not segfault but the second will, since declaring it as a const pointer creates additional memory constraints.

        Testing it on my machine with the following code seems to validate this hypothesis.

          //file: test.c
          #include <stdio.h>
          const void * const_pointer = &const_pointer;
          void * const const_value   = &const_value;
          
          int main()
          {
              printf("%p\n", const_pointer);
              *(int*)const_pointer = 0;
              printf("%p\n", const_pointer);
          
              printf("---------------------------\n");
          
              printf("%p\n", const_value);
              *(int*)const_value = 0;
              printf("%p\n", const_value);
              return 0;
          }
          
          Result:
          
          $ gcc test.c
          test.c:4:30: warning: initialization discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
          void * const const_value   = &const_value;
                                      ^
        
          $ ./a.out
          0x55b29ebfc010
          0x55b200000000
          ---------------------------
          0x55b29ebfbdb8
        
          Command terminated
        
        
        As to why the first one doesn't also result in a segfault, I don't know.
        • nybble41 1699 days ago
          In the first version, const_pointer is a non-const variable (so located in .data) holding a pointer to potentially constant data—you can't modify the data through that pointer without a typecast, but the actual location in memory may be mutable. That's why you don't get a segfault when you cast away the const and modify the data—the destination (const_pointer) is not const even though the const-qualified pointer would allow it to be.

          In the second case the const_value variable itself is const-qualified and thus located in .rodata, but the pointer itself is not const-qualified so nothing prevents you from attempting to modify the data through that pointer. This is why you get a compiler warning about discarding the 'const' qualifier in the initialization. Since const_value is in .rodata, writing to it through the pointer causes a segfault.

          As Sean1708 pointed out, it's more obvious what is going on if you place the 'const' qualifier immediately before the thing it's modifying, which is either the pointer operator or the variable name, never the type itself:

              void const *const_pointer = &const_pointer;
              void *const const_value   = &const_value;
          
          What would something like "const int" even mean on its own, anyway? There is no such thing as a mutable integer. It's the memory location holding the integer which may be either mutable or immutable.
        • Sean1708 1700 days ago
          My assembly knowledge is limited, but it looks like both const_pointer and const_value get put into .rodata (read-only data). In both cases you're trying to change what is at the memory location that the pointer points to, in the first case it's the pointer that's in .rodata so you can change what it points to, but in the second case it's the value that's in .rodata so you can't change it.

          Edit: Actually I don't think the pointer is put anywhere, rather its value is stored in .data (non-read-only data) so it can be mutated without issue. Again though, my assembly isn't amazing.

    • paulrpotts 1699 days ago
      Some early compilers such as THINK C for the Mac weren't very strict about typing when taking the address of an object. So for example you could do:

        unsigned int x;

        unsigned int *x_addr = &x;

        unsigned int x_addr_addr = &(&x);

      (or arbitrarily many levels of "address-of") and you'd just get the same address.

  • vq 1700 days ago
    No mention of trigraphs? They are one of my favourite obscure C language features that I've never used.

    Excerpt from GCC man page:

      Trigraph:       ??(  ??)  ??<  ??>  ??=  ??/  ??'  ??!  ??-
      Replacement:      [    ]    {    }    #    \    ^    |    ~
    
    Missing backslash on your keyboard? No problem, just type ??/ instead.
    • MawKKe 1700 days ago
      More than once have I written something like this

        if(condition)
          printf("WTF??! value: %d", value);
      
      ...only to have the compiler nag me about it. That's pretty much the only situation I've come across trigraphs :)
    • cpeterso 1700 days ago
      Related to trigraphs are the alternative logical operator keywords like `and` and `or`. I'm surprised people don't use them more often because they're nicer to read than && and ||. In C you must #include <iso646.h>, but in C++ I think they're standard keywords.

      C++ code example on Godbolt: https://godbolt.org/z/ED6tXK

      https://en.cppreference.com/w/cpp/language/operator_alternat...
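
      A minimal example of the C spelling (in C these are macros from <iso646.h> rather than true keywords):

        #include <iso646.h>   /* defines and, or, not, ... as macros */
        #include <stdio.h>

        int main(void) {
            int a = 1, b = 0;
            if (a and not b)          /* same as: a && !b */
                puts("alternative spellings work");
            return 0;
        }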

      • paulrpotts 1699 days ago
        Hmmm... I would not prefer using "and" and "or" because that syntactic sugar obscures whether the bitwise or logical operations are intended. You get really used to reading && and || as "and" and "or" in your head after the first two decades of C programming : )
      • jandrese 1698 days ago
        I can safely say that I have never #included iso646.h in my life.
    • johannes1234321 1700 days ago
      After long debates (since IBM needs them in EBCDIC machines, like z/Series) they were dropped from C++ (I believe C++17, but might already be C++14) so if you use them in a header this might cause incompatibilities.
    • goatinaboat 1700 days ago
      IIRC these are a legacy of BCPL
      • pdw 1700 days ago
        The story I heard was that the trigraphs were added during standardization because ISO 646, the international version of ASCII, did not require the characters [ \ ] { | }.

        Fun fact: ISO 646 is also the reason that IRC allows these characters in nicknames. IRC was created in Finland, and the Finnish national standard had placed the letters Ä Ö Å ä ö å at those code points.

        Edit: That doesn't explain the trigraphs for # ^ ~. I'm guessing some EBCDIC variants lacked those. Or some other computer vendor on the committee still supported some other legacy character set.

        • jandrese 1698 days ago
          I had to use them once on a truly ancient amber screen serial terminal that lacked { and } on the keyboard. That was back in the 90s and the terminal was completely obsolete at the time but I needed to write a tiny hack program to solve an immediate problem. I remember only knowing about them in the first place from an odd compiler warning I'd seen on a different program.
      • CarlRJ 1700 days ago
        They weren't in K&R. I recall a story about a standardization meeting, on the way to ANSI C, where representatives from some European(?) country that didn't have some of the necessary punctuation on their country-specific keyboards, essentially snuck the trigraphs into the spec when the other representatives weren't looking.
        • p_l 1700 days ago
          Trigraphs are also necessary on IBM Z if you don't want to (or can't) switch from most standard EBCDIC code pages to the special C code page
  • piadodjanho 1700 days ago
    A more obscure feature is the uncommon usage of comma operator. We often use the comma operator in variable declaration and in for loops. But it can also be used in any expression.

    For instance, the next line has a valid C construct:

       return a, b, c;
    
    This is particularly useful for setting a variable when returning after an error.

       if (ret = io(x))
           return errno = 10, -1;
    
    The possibilities are endless. Another example:

        if (x = 1, y = 2, x < 3)
         ...
    
    But the comma operator really shines when used in conjunction with macros.
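
    For instance, one common macro shape (my own illustration, not from the article): log something and yield a value in a single expression, so it fits inside a return statement:

        #include <stdio.h>

        /* Evaluates to (code) after printing a message; usable anywhere an
           expression is expected. */
        #define FAIL(code) (fprintf(stderr, "failing with %d\n", (code)), (code))

        int do_thing(int x)
        {
            if (x < 0)
                return FAIL(-1);
            return 0;
        }

        int main(void)
        {
            return do_thing(-5) == -1 ? 0 : 1;
        }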
    • ndesaulniers 1700 days ago
      The comma operator leads to bad, bad bugs (especially when mixed with misleading indentation), and if its use were punishable I wouldn't even be sad.
    • sfoley 1700 days ago
      The comma character as a token in variable declarations is not the same thing as the comma operator.
    • Asooka 1700 days ago
      Also, the comma operator forces a left to right evaluation order.

      Surprisingly, you can override it in C++. I haven't seen anyone do it, but you can. If you find a good, productive override for the comma operator, please post about it.

      • piadodjanho 1700 days ago
        So that's why it returns the value of the last expression!
    • jakear 1700 days ago
      The comma operator isn’t very obscure at all. Even JS has it.
      • kbenson 1700 days ago
        It's not the comma operator that is being noted as obscure, but particular usage of it, such as complex return statements that set a value and return another.
        • jakear 1700 days ago
          Right, that’s also the case for comma in JS.
          • kbenson 1700 days ago
            So, are you making the case that the examples presented are common in JS as well? Because the whole point of the comment was the uncommon usage of the comma operator, as stated in the first sentence.

            And to be clear, even if it is common in JS, that still doesn't reduce the usefulness of the original comment, because we aren't talking about JS, we're talking about C, and the commonness of the features noted in the C ecosystem. There are plenty of obscure, oft-ignored features in one language that are common in another. For example, taking someone's interesting C macro that allows some level of equivalence to functional map and apply and saying "that isn't very obscure, even Lisp has that" is missing the point.

          • piadodjanho 1700 days ago
            I reckon that most languages with a C-inspired grammar include the comma operator.
  • nothis 1700 days ago
    All craziness but then:

    >a[b] is literally equivalent to *(a + b).

    Is this obscure? I thought that's pretty much the first thing you learn about arrays in C? It's pointers, all the way down.

    • atq2119 1700 days ago
      Not really. The way it's usually introduced is that you get the same "reference" both ways. The fact that it's literally equivalent, and especially that there's no pointer type requirement on the left-hand-side, with the consequence of allowing ridiculous code like 2[array], is pretty obscure. Even more so because the equivalence doesn't work that way in C++ -- in general, features of C that aren't available in C++ tend to be not as widely known.
      • eMSF 1700 days ago
        >Even more so because the equivalence doesn't work that way in C++

        What do you mean? As far as I remember, C++ is very similar in this respect when it comes to array and pointer types.

        • atq2119 1700 days ago
          Only for the most primitive types, though. For anything else, operator[] is used in C++ instead, without an operator+ fallback. So for example, if your array is a std::array instead of a C-style array, saying 1[array] will not work.
        • asveikau 1700 days ago
          Perhaps they mean that operator overloading can break that equivalence.

          Which is true, but I would argue an implementation that makes those operators behave differently is likely ill advised.

    • arcticbull 1700 days ago
      Agreed, and I never made the connection that array[index] could be written as index[array] haha.
      • anticensor 1700 days ago
        Only if index and array have types of the same size, because pointer arithmetic works in terms of object sizes, not individual bytes.
        • harry8 1700 days ago

              #include <stdio.h>

              typedef struct thing {
                  char a;
                  long long int nothing;
              } thing_t;
          
              #define P(x) printf("%s %c\n", #x, x)
              int main(int argc, char **argv)
              {
                  thing_t array[1024];
                  array[8].a = 'H';
                  P(array[8].a);
                  P((8[array]).a);
                  P((*(8 + array)).a);
              }
        • erik_seaberg 1700 days ago

            foo[3]
            *(foo + 3)
            *(3 + foo)
            3[foo]
          
          They all do the same thing.
          • arcticbull 1700 days ago
            Interesting! I assume because the literal is being coerced to be the size of the variable? After all:

                &foo[3] == (void *)((uintptr_t)foo + (uintptr_t)(3 * sizeof(*foo)))
            • erik_seaberg 1700 days ago
              Yeah, that's how pointer arithmetic works. Adding an int and a pointer assumes an array and gives you a pointer to the nth element in that direction (which had better exist, for your sake). char pointers can be used for byte offsets if a byte is a char on your platform (and it's been quite a while since that wasn't true).

              You can also subtract two pointers into the same array and get the distance (in elements, not bytes): &a[b - a] == b.

    • bingerman 1700 days ago
      As the author continues, it becomes mildly weird only when you realize that you can write b[a] and it just works (tm). I've seen students saying the compiler somehow checks that the "a" is arrayish so the swapped version doesn't make sense.
    • saagarjha 1700 days ago
      It's syntactic sugar that's more poorly abstracted than most people realize.
  • jimmoores 1700 days ago
    These examples actually were more obscure than the usual list.
    • dkersten 1700 days ago
      Yeah, I did know 3 of them[1] but the rest were completely new to me! Not that I’m an expert or anything, but I tend to enjoy reading about obscure C features. That sizeof can have side effects is... a bit crazy although the multiple compatible function declarations are horrifying.

      [1] Array designators, the preprocessor being a functional language, and a[b] being syntactic sugar.

    • pickdenis 1700 days ago
      The multiple compatible function signature declaration got me. I pray that I'll never have to understand and work with code that looks like that.
    • Sharlin 1700 days ago
      Amusingly enough, that’s because most of them are C99 features.
  • needs 1700 days ago
    A little off-topic but here is a cool piece of code about a special case in C:

        void (*foo)() = 0;
        void (*bar)() = (void *)0;
        void (*baz)() = (void *)(void *)0; // Error
    
    Can you guess why compilers reject the last line?
    • eMSF 1700 days ago
      Only the first two expressions on the right are null pointer constants (integral constant expression with a value of 0, optionally cast as a void *), that can be used to initialize all pointer variables, including function pointers. The last one is merely a null pointer (to void), that can't be implicitly converted to a pointer to a function.

      C++ has stricter rules for null pointer constants, and thus only the first version is valid C++.

    • icedchai 1700 days ago
      Most modern compilers don't even warn about the last line unless you're in pedantic mode.
  • bsder 1700 days ago
    Although calling the preprocessor "functional" is being too pleasant. The C preprocessor was always a text substitution system, so macro as parameter is not that "obscure". Of course, I may have missed something subtle in the example.

    It's also not clear how to use that preprocessor example.

    • multun 1700 days ago
      It is never obscure when you know it. And sure, it was always meant that way. It's just rarely used this way, which is why it's in this list.

      Here's what the example yields once preprocessed:

        struct operator{
          int priority;
          const char *value;
        };
      
        struct operator operator_negate = {
            .priority = 20,
            .value = "!",
        };
      
        struct operator operator_different = {
            .priority = 70,
            .value = "!=",
        };
      
        struct operator operator_mod = {
            .priority = 30,
            .value = "%",
        };
    • augusto-moura 1700 days ago
      • juped 1699 days ago
        >The C ADT is implemented simply as String (or char *, for you type theorists, using a notation from Kleene)

        Still makes me laugh

    • elteto 1700 days ago
      It can be very useful! I added a small comment with an example of how I used it once.
  • jcranmer 1700 days ago
    sizeof actually doesn't evaluate its expression for side effects most of the time; only if the operand is a variable-length array is it evaluated.
    • haspok 1700 days ago
      That is not guaranteed either:

      "If the size expression of a VLA has side effects, they are guaranteed to be produced except when it is a part of a sizeof expression whose result doesn't depend on it." (from: https://en.cppreference.com/w/c/language/array)

      In the example this is not a problem, because int[printf()] means you must call printf() to get the return value and determine the size of the array.
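
      A minimal sketch of that case (my own example, not the article's):

        #include <stdio.h>

        int main(void)
        {
            /* The operand has variably modified type, so the size expression
               is evaluated and the printf side effect happens. */
            size_t s = sizeof(int [printf("evaluated\n")]);
            printf("%zu\n", s);  /* printf returned 10, so s == 10 * sizeof(int) */
            return 0;
        }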

    • _kst_ 1700 days ago
      And it's not even entirely clear what that means.

      The standard says:

      "If the type of the operand is a variable length array type, the operand is evaluated; otherwise, the operand is not evaluated and the result is an integer constant."

      But what does it mean to evaluate the operand?

      If the operand is the name of an object:

          int vla[n];
          sizeof n;
      
      what does it mean to evaluate `n`? Logically, evaluating it should access the values of its elements (since there's no array-to-pointer conversion in this context), but that's obviously not what was intended.

      And what about this:

         sizeof (int[n])
      
      What does it mean to "evaluate" a type name?

      It's not much of a problem in practice, but it's difficult to come up with a consistent interpretation of the wording in the standard.

      • jcranmer 1700 days ago
        > If the operand is the name of an object [...] what does it mean to evaluate `n`?

        When you say "n", syntactically, you have a primary expression that is an identifier. So you follow the rules for evaluating an identifier, which will produce the value. C doesn't describe it very well, but the value of the expression is the value of the object. In terms of how it is implemented in actual compilers, this would mean issuing a load of the memory location, which is dead unless `n` is a volatile variable.

        • _kst_ 1700 days ago
          Sorry, in the first example I meant to write (adding a declaration and initialization for n):

              int n = 42;
              int vla[n];
              sizeof vla;
          
          not `sizeof n`. (It doesn't look like I can edit a comment.)

          Logically, evaluating the expression `vla` would mean reading the contents of the array object, which means reading the value of each of its elements. But there's clearly no need to do that to determine its size -- and if you actually did that, you'd have undefined behavior since the elements are uninitialized. (There are very few cases where the value of an array object is evaluated, since in most cases an array expression is implicitly converted to a pointer expression.)

          In fact the declaration `int vla[n];` will cause the compiler to create an anonymous object, associated with the array type, initialized to `n` or `n * sizeof (int)`. Evaluating `sizeof vla` only requires reading that anonymous object, not reading the array object. The problem is that the standard doesn't express this clearly or correctly.

          • enriquto 1700 days ago
            Your example is unnecessarily tame. The value of n need not be known at compile time, e.g. it may be input by the user at runtime.
    • saagarjha 1700 days ago
      Another reason why VLAs are considered anathema in many projects.
  • shift_reset 1700 days ago
    The article has examples of array parameters like

        int b[const 42][24][*]
    
    but you can get more fun with variably modified array parameters instead of just constants like 42. For example,

        double sum_a_weird_shaped_matrix(int n, double array[n][3*n]) {
            double total = 0;
            for (int x = 0; x < n; ++x) {
                for (int y = 0; y < 3 * n; ++y) {
                    total += array[x][y];
                }
            }
            return total;
        }
    
    has a variable and a more complicated expression in those positions.

    But those variably modified parameters can have arbitrary expressions in them, like

        int last(size_t len, int array[restrict static (printf("getting the last element of an array of %zu ints\n", len), len--)]) {
            return array[len];
        }
    
    C++ denies us this particular joy which could have made function overload resolution even more fun.
    • stevenhuang 1700 days ago
      Another neat note, IIRC, is that array parameter sizes don't actually do anything; they're just there for documentation purposes, and the parameter is treated as a raw pointer.

      So if you do

          void func(int x[10]);
      
      You're free to call it like

          int k[5];
          func(k);
      
      And you won't get any warnings. Unsettling!
      • shift_reset 1700 days ago
        That's what the static keyword means in those array declarators.

            void func(int x[static 10]);
        
        must be called with an argument that is a pointer to the start of a big enough array of int. I can't get recent GCC or Clang to warn on violations of this, though.
        • spc476 1700 days ago
          There are cases where the compiler can't enforce it:

              void foo(int *p)
              {
                func(p);
              }
          
          How can the compiler know if `p` points to space for 10 integers?
      • ninkendo 1700 days ago
        The wording in the C FAQ is that arrays “decay” into pointers when you pass them to functions. Which they explain as the reason why you can’t know the size of a passed array (at least in standard C.)

        The C FAQ is pretty old though, I’ve always wondered how much of that advice changed in C99/C11... from cursory googling things don’t seem to have changed much.

        • cryptonector 1700 days ago
          It's funny that K&R chose to have arrays "decay" to pointers, but to allow structs to be passed by value. Thus you can actually pass arrays by value if and only if you wrap them in a struct:

            #include <stdio.h>
            #include <string.h>

            struct foo { char a[5]; };

            void f(struct foo x) { x.a[4] = '\0'; printf("%s", x.a); }

            int main(void) {
              struct foo x;

              memcpy(x.a, "too big", sizeof(x.a));

              f(x);

              printf("%s", x.a); /* read past end of x, crash */

              return 0;
            }
          
          I'm thankful they didn't make structs decay into pointers!
          • jandrese 1698 days ago
            I figured that having arrays decay into pointers is one of those features that got grandfathered in because changing it would have broken the dozen or so C programs that existed at the time. It's a real shame too, because a working sizeof() for strings could have avoided a LOT of C exploits over the years.
    • multun 1700 days ago
      I took the code sample in the article from a snippet intended to be as dirty as possible, and removed most of the madness from it. But yeah you're right :D I just wanted to avoid adding one more item to that bullet list.
  • stefan_ 1700 days ago
    A nice feature, though I guess more of a linker feature: if you declare a function as __weak__, you can check it at runtime for == NULL to determine whether the application was built with the function defined.
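
    A rough sketch of the pattern (GCC/Clang attribute spelling; this is an extension, not standard C):

        #include <stdio.h>

        /* Weak declaration: the symbol resolves to NULL if no object file
           in the link provides a definition. */
        void optional_feature(void) __attribute__((weak));

        int main(void)
        {
            if (optional_feature)
                optional_feature();
            else
                puts("built without optional_feature");
            return 0;
        }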
    • multun 1700 days ago
      I would say most interesting linker features are non standard, so outside of the scope of this article :/
      • cryptonector 1700 days ago
        True, however, it is precisely the stupid linker tricks that make C such an interesting and powerful language nowadays. Weak symbols. Interposition (LD_PRELOAD). dlopen() and friends. Filters. Direct binding / versioned symbols. ELF semantics in general (which make the use of one flat symbol namespace safer).
    • _kst_ 1700 days ago
      That's non-standard.
  • Keyframe 1700 days ago
    One of the more obscure, yet often employed, "features" is to use something else as a C preprocessor.
  • tasty_freeze 1700 days ago
    This was c++ and not C, but it is a preprocessor pitfall.

    I needed to compare an older and a newer version of some file from the RCS, so I saved temporary copies named "new" and "old". diff told me what I needed to know, but I failed to delete those temp files.

    Hours later I typed "make" to build my program and got all sorts of errors deeply nested in some library function. Did someone misconfigure the server I was on? OK, maybe it is an incremental build problem? etc. It took too long to figure out the problem.

    It turns out that during compilation, as one of the library .h files was being scanned, it contained #include <new>, which picked up the junk file in my working directory instead of the standard <new> header.

  • etaioinshrdlu 1700 days ago
    I found the most interesting one to be compile time trees.

    Does anyone have any good use cases for it?

    • kccqzy 1700 days ago
      I don't like to think of it as compile-time trees. It basically only allows you to construct complicated structures but doesn't allow you to examine them at compile time (that would require constexpr functions in C++; can't be done in C). It's honestly not a very impressive feature.
    • megous 1700 days ago
      Look at clk driver code for sunxi in mainline kernel. That uses it a lot.

      https://elixir.bootlin.com/linux/latest/source/drivers/clk/s...

    • lasagnaphil 1700 days ago
      Maybe for making a declarative GUI system in C? I’ve never seen this before, so I’m not sure it could be done.
    • saagarjha 1700 days ago
      A hardcoded trie for string matching, perhaps?
    • multun 1700 days ago
      I used it to describe neural network architectures.
  • higherkinded 1700 days ago
    Effectful sizeof, what a delightful feature!

    Though the compile-time magic with structs and functional macros are so tempting that I feel like it's high time to do some C.

  • deckar01 1700 days ago
    > VLA typedef ... I have no clue how this could ever be useful.

    I used this feature recently. I had several arrays of the same size and type, and the size was determined at runtime. The VLA typedef let me avoid duplicate type signatures which I find more readable.

        int N = atoi(argv[1]);
        typedef int grid[N][N];
        grid board;
        grid best;
        grid cache;
  • JeromeLon 1700 days ago
    No mention of the downto (-->) operator?

      int x = 10;
      while (x --> 0) {
        printf("%d ", x);
      }
    • inquist 1700 days ago
      That's just the decrement operator and inequality operator!
      • cvs268 1698 days ago
        not inequality, rather the "greater-than" operator.

        "!=" would be the inequality operator... :-)

  • haolez 1700 days ago
    > a[b] is literally equivalent to *(a + b). You can thus write some absolute madness such as 41[yourarray + 1].

    Wow. This has to be the best C obscurity that I've ever seen.

  • inlined 1700 days ago
    The WTF part seems useful. E.g.

        type quaternion float[4];

        quaternion SLERP = {1.0, 0, 0, 0};

    • kccqzy 1700 days ago
      Did you mean

          typedef float quaternion[4];
      
      ?

      That's not a VLA.

  • megiddo 1700 days ago
    Hell, C11 has ghetto generics.
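
    Presumably that refers to C11's _Generic selection; a tiny sketch:

        #include <stdio.h>

        /* Picks a string based on the type of the argument. */
        #define type_name(x) _Generic((x), \
            int:     "int",                \
            double:  "double",             \
            default: "something else")

        int main(void)
        {
            printf("%s\n", type_name(42));    /* prints "int"    */
            printf("%s\n", type_name(3.14));  /* prints "double" */
            return 0;
        }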
  • radamadah 1700 days ago
    Are we calling these features, now?
    • aspaceman 1700 days ago
      Most of the examples listed (hex floats especially) are very much features.
  • EGreg 1700 days ago
    Here is an obscure C feature:

      #include <stdio.h>

      int main()
      {
         int a = 8;
         {
           int a = 4;
           /* a is only scoped to this block */
         }
         printf("%d", a); /* prints 8 */
      }
    
    It is also why C++ is not a strict superset of C
    • eMSF 1700 days ago
      That's hardly obscure: it's Javascript that's the odd one out of the languages with C-like syntax, with its wacky function-level scoping instead of variable shadowing when using 'var', and falling back to global scope when not declaring a variable properly. Also, C and C++ behave identically with your given example.

      A perhaps obscure feature is that you can "unshadow" a global variable like this:

        #include <stdio.h>
        int global = 0;
        int main()
        {
          int global = 1;
          {
            extern int global;
            printf("%d\n", global); // prints 0
          }
        }
    • dkersten 1700 days ago
      > It is also why C++ is not a strict superset of C

      Can you explain? That code in C++ also scopes ‘a’ to the block.

      EDIT: I see you’ve edited the code, but I think it’s still true in C++. I’ve often done that for RAII and unless I’m mistaken it works just as well when shadowing variables like you’re doing as when not.

      • scott_s 1700 days ago
        Agreed. And I have also used it in C++ for RAII purposes. In C++, braces introduce a scope, and objects local to that scope will be destructed upon exit.
    • codesushi42 1700 days ago
      ... is this a joke? It's called lexical scoping, and it's supported by more languages than not.
    • scott_s 1700 days ago
    • wahern 1700 days ago
      Off the top of my head, I can't think of any language that doesn't permit this. See https://en.wikipedia.org/wiki/Variable_shadowing
      • saagarjha 1700 days ago
        Your link mentions CoffeeScript as not allowing shadowing.
    • jcranmer 1700 days ago
      If you want to tell if your compiler is C or C++, run this program:

         #include <stdio.h>
         int main() {
           printf("%d\n", sizeof('a'));
         }
      
      It's the second thing C++ mentions in its list of incompatibility with C (the first is "new keywords").

      A more obscure difference is this program:

         int i;
         int i;
         int main() {
           return i;
         }
      
      It's legal C but not legal C++.
      • _kst_ 1700 days ago
        sizeof ('a') doesn't reliably tell you whether you're compiling C or C++. It yields the same result in an implementation where sizeof (int) == 1 (which requires CHAR_BIT >= 16). (The difference is that character constants are of type int in C, and of type char in C++.)

        So if sizeof ('a') == 1, then either you're compiling as C++ or you're compiling as C under a rather odd implementation.

        Both POSIX and Windows require CHAR_BIT==8. The only systems I know of with CHAR_BIT>8 are for digital signal processors (DSPs).

        If you want to tell whether you're compiling C or C++:

            #ifdef __cplusplus
            ...
            #else
            ...
            #endif
      • ksherlock 1700 days ago

            int main() {
              return 4//**/2
              ;
            }
        • eMSF 1700 days ago
          Line comments have been part of the C language for a long, long time (added in C99); so much so that, especially when discussing the subject on the internet, more and more often they predate some of the younger participants.
          • saagarjha 1700 days ago
            Personally, I didn't know that C lacked line comments until I ran into a project that compiled with -std=c89 -pedantic.
            • _kst_ 1700 days ago
              C doesn't lack line comments. The obsolete 1989/1990/1995 version(s) of C did lack line comments.
      • unwind 1700 days ago
        You need to use %zu to print values of type size_t, otherwise you get undefined behavior. So in C, don't run that program. :)
    • truncate 1700 days ago
      I don't think that's obscure at all. That's standard lexical scoping. Do it all the time in functional languages.
    • IanS5 1700 days ago
      Blocks are a common feature