A systems programmer will specialize in subdomains much like any other programmer, but there are some characteristic skills and knowledge that are common across most of them. Systems programmer coding style is driven by robustness, correctness, and performance to a much greater degree than higher up the stack. Most systems programming jobs these days are C/C++ on Linux and will be for the foreseeable future.
- Know how your hardware actually works, especially CPU, storage, and network. This provides the "first principles" whence all of the below are derived and will allow you to reason about and predict system behavior without writing a single line of code.
- Understand the design of the operating system well enough to reimplement and bypass key parts of it as needed. The requirements that would cause you to build, for example, a custom storage engine also mean you aren't mmap()-ing files, instead doing direct I/O with io_submit() on a file or raw block device into a cache and I/O scheduler you designed and wrote. Study the internals of existing systems code for examples of how things like this are done; it is esoteric but not difficult to learn.
- Locality-driven software design. Locality maximization, both spatial and temporal, is the source of most performance in modern software. In systems programming, this means you are always aware of what is currently in your hardware caches and efficiently using that resource. As a corollary, compactness is consistently a key objective of your data structures to a much greater extent than people think about it higher up the stack. One way you can identify code written by a systems programmer is the use of data structures packed into bitfields.
- Understand the difference between interrupt-driven and schedule-driven computing models, the appropriate use cases for both, and how to safely design code that necessarily mixes both e.g. multithreading and coroutines. This is central to I/O handling: network is interrupt-driven and disk is schedule-driven. Being explicit about where the boundaries are between these modes in your software designs greatly simplifies reasoning about concurrency. Most common concurrency primitives make assumptions about which model you are using.
- All systems are distributed systems. Part of the systems programmer's job is creating the illusion that this is not the case for the higher levels of the stack, but a systems programmer unavoidably lives in this world even within a single server. Knowing "latencies every programmer should know" is just a starting point; it is also helpful to understand how hardware topology interacts with routing/messaging patterns to change latency -- tail latencies are more important than median latencies.
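The bitfield habit mentioned above looks like this in practice. A minimal sketch of a hypothetical packet header; the field names and widths are invented for illustration:

```cpp
#include <cstdint>

// Hypothetical wire header packed into a single 32-bit word.
// Field names and bit widths are made up for illustration.
struct Header {
    uint32_t version : 3;   // protocol version, 0-7
    uint32_t type    : 5;   // message type, 0-31
    uint32_t flags   : 8;
    uint32_t length  : 16;  // payload bytes, up to 65535
};

static_assert(sizeof(Header) == 4, "whole header fits in one 32-bit word");
```

Packed this way, sixteen headers fit in a single 64-byte cache line, versus four if each field were stored as a separate int.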
The above is relatively abstract and conceptual but generalizes to all systems programming. Domains of systems programming have deep and unique knowledge that is specific to the specialty e.g. network protocol stacks, database engines, high performance graphics, etc. Because the physics of hardware never changes except in the details, the abstractions are relatively thin, and the tool chains are necessarily conservative, systems programming skills age very well. The emergence of modern C++ (e.g. C++17) has made systems programming quite enjoyable.
Also, the best way to learn idiomatic systems programming is to read many examples of high-quality systems code. You will see many excellent techniques in the code that you've never seen before that are not documented in book/paper form. I've been doing systems work for two decades and I still run across interesting new idioms and techniques.
> Know how your hardware actually works, especially CPU, storage, and network.
I learned about MIPS CPUs in college, 20 years ago, with Patterson and Hennessy, and built one from logic gates. It was basically a high-powered 6502 with 30 extra accumulators, but split across the now-classic pipeline stages.
I understand that modern CPUs are much different than this. I understand that there are deeper pipelines and branch predictors and superscalar execution and register renaming and speculative execution (well, maybe a bit less than last year) and microcode. Also, I imagine they're not built by people laying out individual gates by hand. But since I can only interact with any of these things indirectly, I have no basis for really understanding them.
How does anyone outside Intel learn about microcode?
For the average systems programmer, 90% of the really useful CPU information can be found in two places that change infrequently:
- Agner Fog's instruction tables for x86, which has the latency, pipeline, and ALU concurrency information for a wide range of instructions on various microarchitectures.
- Brief microarchitecture overviews (such as this one for Skylake), that have block diagrams of how all the functional units, memory, and I/O for a CPU are connected. These only change every few years and the changes are marginal so it is easy to keep up.
Knowing the bandwidth (and number) of the connections between functional units and the latency/concurrency of various operations allows you to develop a pretty clear mental model of throughput limitations for a given bit of code. People that have been doing this for a long time can look at a chunk of C code and accurately estimate its real-world throughput without running it.
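As a toy illustration of this kind of reasoning (the functions below are mine, not from any cited source): a summation loop with a single accumulator is bound by floating-point add latency, because every iteration depends on the previous result; splitting the work across independent accumulators lets the CPU keep several adds in flight and approach the ALU's throughput limit instead.

```cpp
#include <cstddef>

// One accumulator: each add must wait for the previous result,
// so throughput is limited by FP add latency (the dependency chain).
double sum_chained(const double* a, size_t n) {
    double s = 0.0;
    for (size_t i = 0; i < n; ++i) s += a[i];
    return s;
}

// Four independent accumulators: the adds have no mutual dependencies,
// so a superscalar core can overlap them (throughput-bound instead).
double sum_unrolled(const double* a, size_t n) {
    double s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += a[i]; s1 += a[i + 1]; s2 += a[i + 2]; s3 += a[i + 3];
    }
    double s = s0 + s1 + s2 + s3;
    for (; i < n; ++i) s += a[i];   // handle the tail
    return s;
}
```

With latency and port counts from Agner Fog's tables you can predict the speedup of the second version before ever running it.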
My belief is that most programmers today are stuck with a model of the hardware architecture that they learned in university, which is why most people are unfamiliar with the bottlenecks in current architectures and lack "mechanical sympathy".
> Most systems programming jobs these days are C/C++ on Linux and will be for the foreseeable future.
I don't disagree, but at least anecdotally a lot of the shops I've been involved with/worked with are really excited about Rust and Go. A previous employer that used C++ exclusively has even shipped a few Golang based tools, and is planning to introduce it into the main product soon. No new projects are being started in C++ either.
Definitely recommend learning C, but having Rust (or Golang) exposure will likely be helpful in the near future.
I do systems programming professionally, there are many reasons that Go is unsuitable for systems programming but all of them come down to having very little control over the resultant assembly that your program generates and very little possibility for abstracting away low-level details. AFAIK modern C++ is excellent for both of these but I only use Rust, assembly and the tiniest amount of C. Rust is absolutely excellent for writing code that looks high-level but compiles to high-quality assembly, if you know what you’re doing.
Yes this is a problem for some things, but at least when I was working in systems you'd be surprised how frequently GC doesn't matter, especially with all the huge GC improvements in recent versions (and future versions) and the trend toward breaking apart monolithic code bases.
The more general problem is that the Go implementation depends on its own runtime library and is not particularly suited for linking into other executables or running in unusual contexts, e.g. without threads or without an OS.
Maybe it is just me, but I've seen this issue multiple times where a programmer used a regular expression thinking they were clever, only to have it backfire later when there was some corner case not covered by their regex (which likely would have obviously been caught if they had just taken the time to write out each case as an if statement). You're usually just gaining complexity in exchange for fewer lines of code, and I'm not sure that is always the best tradeoff. Something to keep in mind when deciding whether or not to use regular expressions.
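A concrete sketch of the kind of corner case meant here (using C++'s std::regex; the sloppy pattern is deliberate): an unanchored pattern happily accepts trailing garbage that an explicit character loop would reject.

```cpp
#include <cctype>
#include <regex>
#include <string>

// Sloppy: "contains digits somewhere" is not the same as "is a number".
bool looks_numeric_regex(const std::string& s) {
    static const std::regex digits("[0-9]+");   // not anchored!
    return std::regex_search(s, digits);
}

// Explicit version: at least one character, and every one must be a digit.
bool is_numeric_explicit(const std::string& s) {
    if (s.empty()) return false;
    for (unsigned char c : s)
        if (!std::isdigit(c)) return false;
    return true;
}
```

Here looks_numeric_regex("12abc") is true while is_numeric_explicit("12abc") is false; the regex version needs anchoring ("^[0-9]+$") or std::regex_match to behave, and that is exactly the kind of detail that slips through review.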
With that being said, your list of things to know are all things I have to know for my job (these are things almost all programmers should know, in fact) and I would not consider myself a low-level or systems programmer, just an application programmer.
Ragel takes in the definition of a regular language as a set of regular expressions and generates C code for a finite state machine that parses the language. You can visualise the state machine in Graphviz to manually verify all paths, making it much easier to spot hidden corner cases, while being a lot quicker to code than a big pile of if statements.
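Hand-rolled, the kind of switch-based state machine such a tool generates looks roughly like this (a tiny recognizer for optionally signed decimal integers; this is a sketch of the idea, not actual Ragel output):

```cpp
// States for recognizing the language ('+'|'-')? [0-9]+
enum State { START, SIGN, DIGITS, REJECT };

bool match_int(const char* s) {
    State st = START;
    for (; *s; ++s) {
        char c = *s;
        switch (st) {
        case START:
            if (c == '+' || c == '-')      st = SIGN;
            else if (c >= '0' && c <= '9') st = DIGITS;
            else                           st = REJECT;
            break;
        case SIGN:
        case DIGITS:
            st = (c >= '0' && c <= '9') ? DIGITS : REJECT;
            break;
        case REJECT:
            break;
        }
        if (st == REJECT) return false;
    }
    return st == DIGITS;   // DIGITS is the only accepting state
}
```

Every transition is explicit, so "what happens on a lone '-'?" is answered by reading the SIGN case rather than by squinting at a pattern.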
Been using ragel for a while now, it’s awesome. Ragel is a bit like the vim of parsing: the learning curve can be pretty steep, but once you get it, you’ll be the parser magician. Parsing in general just becomes pretty easy with ragel.
Regardless of implementation (regex vs. conditionals), there should be sufficient unittesting to make sure that all the corner cases are tested. For embedded systems, the complexity tradeoff is relevant though, since a regex library will (probably) take up more code space than some conditionals.
In principle, I agree. In practice, it's easier to miss an edge or corner case in a regular expression than it is in a series of conditionals. That's just another consequence of the complexity tradeoff.
The only thing I would add to this list is security: understanding how low-level code can be exploited and how to code defensively to ensure you don't cause bad things to happen. Also the basics of cryptography. It isn't enough to just use a library that implements encryption; you must know what it does and why. You'd be surprised how much software uses encryption that is fundamentally broken, making it useless to even use.
Regular expressions aren't the issue; it's the libraries. Regular expressions are, mathematically, the quickest and safest way to parse text, and matching them is linear in complexity. But most libraries aren't...
Are you sure about your epoll statement concerning complexity? 1 thread per client connection that blocks is as simple as it gets to reason about in my experience, and if you don't care about the stack space the thread occupies (mostly virtual anyways) your contemporary Linux kernel handles lots of threads very well.
Sure, there is no complexity to speak of when your threads do not need to cooperate or share any data.
But then someone has a bright idea that e.g. a global cache to share some state between all threads would be a good optimization, or that threads need to be able to store some global settings, etc., and thread-related complexity starts to creep into your application.
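A minimal sketch of how that creep starts (names invented): the "innocent" global cache below re-couples threads that were independent, since every connection thread now serializes on one lock, and the simple one-thread-per-connection reasoning no longer holds.

```cpp
#include <mutex>
#include <optional>
#include <string>
#include <unordered_map>

// The shared cache that quietly couples previously independent threads.
class SharedCache {
    std::mutex mu_;
    std::unordered_map<std::string, std::string> map_;
public:
    void put(const std::string& k, const std::string& v) {
        std::lock_guard<std::mutex> lock(mu_);   // every thread contends here
        map_[k] = v;
    }
    std::optional<std::string> get(const std::string& k) {
        std::lock_guard<std::mutex> lock(mu_);
        auto it = map_.find(k);
        if (it == map_.end()) return std::nullopt;
        return it->second;
    }
};
```

The code is trivially correct, but now lock contention, hold times, and "who may call this from an interrupt-ish context" become questions every caller has to answer.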
I'm an embedded software engineer. Here's my $0.02 on how to get started:
1. Get a platform you can tinker around with, and recover quickly in case of disaster. Raspberry pi is a good example.
2. Learn how to find and read detailed register-level manuals of the underlying chip you're working on.
3. Learn what a programmer's memory-model is.
4. Learn what peripherals are, and what they do. You may have to dig a bit into basic electronics.
5. Learn what user-space vs kernel-space is.
6. Learn what device trees are, and how to prevent kernel from taking over a peripheral.
7. Based on above information, write a user-space driver for a simple peripheral (e.g. GPIO or UART) in C (don't try to learn assembly, yet). You will not be able to use the peripheral interrupts in user-space.
8. Learn how to verify hardware functionality, and see your driver in action. (Warning: high dopamine levels are reported at this point).
9. Repeat steps 7 and 8 for all peripherals, and see how far you can go.
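To make step 7 concrete: on the BCM283x SoCs used by older Raspberry Pis, each GPFSEL register holds the 3-bit function-select fields for ten pins, so finding a pin's register and bit position is pure arithmetic. A sketch (the layout below is from the BCM2835 datasheet; check your own chip's manual):

```cpp
#include <cstdint>

// BCM2835 GPIO function select: GPFSELn registers, ten pins per 32-bit
// register, 3 bits per pin. 0b000 = input, 0b001 = output (per datasheet).
struct FselSlot { uint32_t reg_index; uint32_t shift; };

FselSlot fsel_slot(uint32_t pin) {
    return { pin / 10, (pin % 10) * 3 };
}

// Compute the new register value that sets `pin` to function `fn`
// without disturbing the other nine pins sharing the register.
uint32_t fsel_apply(uint32_t old_reg, uint32_t pin, uint32_t fn) {
    FselSlot s = fsel_slot(pin);
    uint32_t mask = 0x7u << s.shift;
    return (old_reg & ~mask) | ((fn & 0x7u) << s.shift);
}
```

A real user-space driver would mmap() the GPIO block via /dev/mem (or /dev/gpiomem) and write this value into GPFSEL[reg_index]; the read-modify-write shown here is the part people get wrong first.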
> 1. Get a platform you can tinker around with, and recover quickly in case of disaster. Raspberry pi is a good example.
I'd recommend something cheaper and simpler like the STM32F4 discovery boards. You won't get linux and will have to program via JTAG but the documentation for the STM32F4 is not as overwhelming as Broadcom's.
I like this suggestion. Their F3 disco boards are also nice.
For a related twist, the OP might check out Nordic Semi's M4-with-radio SoC offering, the nRF52840DK. It has an onboard, breakout-able Segger JTAG programmer and is easy to get going with Segger's freebie take on a GCC IDE.
Yes and no. Some of the general part features will be familiar (in the sense that an ARM Cortex M[X] is the same as any other Cortex M[X]).
In reality every vendor's peripherals, and more importantly HAL and associated drivers and libs, will be different. But if you've done it once learning another vendor's way of life is a bit like learning a new programming language; it's just syntax.
Broadly, most peripherals (i.e. communication, ADC, watchdog, etc.) will be similar, but there are definitely places where the differences will be larger. I'd expect those to be largely focused on clock trees, interrupt architectures, power modes, etc.
So short answer I agree with parent, get a cheap board and learn some stuff. If you like it, go from there. Do NOT choose a Cortex A though unless you want to really dig into embedded Linux. Systems at that level are way, way more complex and if the goal is to learn "everything about this processor" that will be especially difficult.
I would say absolutely yes. I started with the STM32 parts and have then used ARM Cortex devices from TI, Freescale, and Atmel. They all have different peripherals and of course different peripheral libraries from the vendor but overall they are conceptually similar. It can certainly be a pain in the neck to switch a project from one device to another (lots of gotchas with pin layout, peripheral minutiae etc etc) but starting a new project with a different device should not be too hard!
Except for the Kinetis DMA, I never understood that one!!
An ARM is an ARM within certain limits, so your knowledge transfers. Sadly, every chip has a different way of programming the peripherals. However, every chip has roughly the same core peripherals (I2C, SPI, UART, etc.), and the electrical specifications and how you use them for protocols don't change.
Maybe. Programming a modern DMA-based device running under a full-blown OS (with userspace, kernel space, an IOMMU giving you bus IOVA addressing through a VM, multiple command buffers, SR-IOV, etc.), while not wildly different, isn't the same as doing I/O reads and writes into a fixed MMIO-mapped device region. Doing some basic SPI programming is a start, but one has to understand that there are a lot more layers that get added as the system grows from an AVR-class device to a modern server.
Emphasis on 8. If you are writing code that directly controls hardware, then your software is the abstraction.
Hardware is physical and can be quirky. Signals take real clock time to settle down and your CPU is probably much faster than the transition time. Hardware can and does have bugs just like software. A bad ground pin can make your hardware do “impossible” things.
You need to have much more awareness of the context your code runs in the further down the stack you go.
This is an odd ordering, mostly because things like device trees are pretty ARM/Linux centric. In which case, if you're doing ARM/Linux, just find a module you can unload or fail to build into your kernel. Real embedded programming is more like: grab a Cortex-M0, AVR, etc. and start bit banging.
OTOH, there is a danger that I've seen frequently: embedded systems guys don't seem to be able to make the conceptual leap to fully paged, SMP programming. Having sort of come up this way myself (via microcomputers without MMUs and just a single processor), it doesn't make any sense to me, but it seems common.
People who have been exposed to a more common multi-threaded and virtualized environment seem to be able to pick up those pieces easier.
Oddly there is a Raspberry Pi 3 sitting on my desk today. I find myself a little reluctant to set up the BTD loop for it (BTD = build test debug), because at the end of the day I'm going to have a (physically small) Linux box. Which is fine, because I plan on installing a network proxy on it that only needs wifi. But how is the Pi a gateway to embedded programming? It has pinouts you can attach an oscilloscope to?
If by low level you mean embedded systems programming, you will definitely need to be proficient in C.
The other knowledge depends on the product that you will make. For example, I work in the automotive industry, where you need to be interested in cars and how the different parts work. Knowledge of electrical engineering and control theory is also valuable. There is a good book about it if you want to learn how cars work: the Bosch Automotive Handbook.
If you go that path, bear in mind that you will not be involved in creating clever algorithms. When safety is involved, you need to write very simple and straightforward code. The complexity lies in how all the parts and ECUs interact with each other. On the upside, you do not need to constantly learn new languages and libraries, and you accumulate expert knowledge which is interesting to companies.
What if -and I'm sorry to hijack- by low-level OP means a position as a C/C++ developer (asking for ~a friend~ me)? I've always been insanely attracted to the C variants and messed around with them to minor extents, but what might someone _need_ to know to be competitive if they're trying to make a move to that area of SE?
Memory management is the major difference between them and most of the higher level stuff. You need a good grasp of who owns what in a program so you can free and close things when they aren't needed (and not before). Less so in C++ these days of course, but definitely in C.
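In C the usual discipline is an explicit create/destroy pair with a documented owner. A minimal sketch (written C-style; in modern C++ a destructor or std::unique_ptr would do the freeing for you):

```cpp
#include <cstdlib>

// Ownership convention: whoever calls buffer_create() owns the result
// and must call buffer_destroy() on it exactly once, and not before
// all users of it are done.
struct Buffer {
    char*  data;
    size_t len;
};

Buffer* buffer_create(size_t len) {
    Buffer* b = (Buffer*)std::malloc(sizeof(Buffer));
    if (!b) return nullptr;                       // always check allocation
    b->data = (char*)std::calloc(len, 1);
    if (!b->data) { std::free(b); return nullptr; }
    b->len = len;
    return b;
}

void buffer_destroy(Buffer* b) {
    if (!b) return;          // destroy(NULL) is a no-op, mirroring free()
    std::free(b->data);
    std::free(b);
}
```

The hard part isn't the pattern itself but keeping the "who owns what" story straight once pointers get passed between modules.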
It's worth picking up some basic gdb skills. Use of ddd can help with this. On windows you can use VS for most of this of course. Picking up the basics of valgrind will also help you.
Get comfortable with the preprocessor. Get comfortable with Makefiles. Get comfortable pulling in library headers and binaries as needed.
I was (mostly) a C programmer for over a decade, but that's about all I can think of right now!
And someone below has just triggered me - FFS use stdint.h!
Pointers, stacks (one in every 23.7 bugs is a stack smashing bug), bit bashing and endianness, types and coercion at the byte level (see also: pointers, bit bashing), C strings, the stupid rules about when a variable's value is actually written to memory that need to die in a fire, memory allocation/clearing/copying/ownership/freeing, ALWAYS CHECK RETURN CODES, what the heck an lvalue is.
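The stdint.h plea and the endianness point go together: serialize with fixed-width types and explicit shifts instead of pointer-casting structs onto byte buffers, and host byte order stops mattering. A sketch:

```cpp
#include <cstdint>

// Write a 32-bit value in little-endian order, regardless of host endianness.
void put_u32le(uint8_t* out, uint32_t v) {
    out[0] = (uint8_t)( v        & 0xFF);
    out[1] = (uint8_t)((v >> 8)  & 0xFF);
    out[2] = (uint8_t)((v >> 16) & 0xFF);
    out[3] = (uint8_t)( v >> 24);
}

// Read it back; works identically on big- and little-endian hosts.
uint32_t get_u32le(const uint8_t* in) {
    return  (uint32_t)in[0]
         | ((uint32_t)in[1] << 8)
         | ((uint32_t)in[2] << 16)
         | ((uint32_t)in[3] << 24);
}
```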
But it is in German, and I believe that a native speaker will not fully understand it if they have not worked in the field in Germany.
I'm French and studied in the UK. I'm sometimes lost when my colleagues use French technical terms; I have to ask about the concept behind them to identify the English term I learnt during my studies.
> However, as an English speaking engineer, I found many of the discussions rather clumsily written. I'm guessing that it was translated from the German by someone who doesn't thoroughly understand the subject matter.
I highly recommend Computer Systems: A Programmer's Perspective for learning C. In particular chapter 3 of that book is what made it all click for me. Understanding how C is translated down into assembly is incredibly useful for understanding pointers and reasoning about your code.
A quick word of warning: be sure to treat C and C++ as two completely separate languages that just happen to have similar syntax. Yes, you can use C++ as "C with classes" (I and many others sure have at times), but you're doing yourself a disservice most of the time if you do.
C enum values are convertible from int; C++ enum values aren't. This is one of the biggest differences in fairly idiomatic C code and has been the case for a very long time (i.e. not dependent on newer C features not being in C++).
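Concretely (this compiles as C++; the commented-out line would be legal in C):

```cpp
// In C, `Flags f = 3;` compiles silently. In C++ the int -> enum direction
// requires an explicit cast, while enum -> int remains implicit.
enum Flags { FLAG_A = 1, FLAG_B = 2 };

int combined_flags() {
    // Flags f = FLAG_A | FLAG_B;   // error in C++: operator| yields int
    Flags f = static_cast<Flags>(FLAG_A | FLAG_B);  // cast required in C++
    int as_int = f;                 // this direction is implicit in both
    return as_int;                  // 3
}
```

This is why plenty of idiomatic C that ORs enum flags together won't build untouched as C++.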
Can I be pedantic? If I'm wrong someone will correct me and I'll learn something.
That used to be true, but becomes less true with each new version of the respective language specs. Sometimes the differences are obvious because legal identifiers in C can be keywords in C++, sometimes the difference is subtle, like with the rules regarding type punning.
But the major point is that there are almost always safer or more ergonomic ways to do things using C++ features that are not present in C.
Programs written in this mindset might be the number one reason why C has a bad reputation. Of course, if you just emulate what other languages automate for you, you should better write in these languages.
But in reality, why C is still the best programming language for large projects (IMO) is exactly that the programmer is allowed to choose a suitable structure, such that the program can fulfill the technical requirements. Other languages force the project into a structure that somehow never fits after a couple thousand LOC.
What good programs are written in C that don't have well-structured memory management the way C++ does it with RAII?
"But in reality, why C is still the best programming language for large projects (IMO) is exactly that the programmer is allowed to choose a suitable structure, such that the program can fulfill the technical requirements. Other languages force the project into a structure that somehow never fits after a couple thousand LOC."
Yeah, this doesn't make any sense. The reason is, C++ doesn't impose anything on your program structure that C doesn't, while C, with the limitations it has, imposes a tax on all sorts of ways of structuring your program.
For example, you can't practically write a program using a futures library (such as the Seastar framework) in C. And every program you write in sensibly written C can be translated to C++. The exception might be really really small-scale embedded stuff that doesn't allocate memory.
> What good programs are written in C that don't have well-structured memory management the way C++ does it with RAII?
MISRA C standards, popular in embedded projects especially automotive, ban the use of memory management altogether.
The whole point of RAII is that the compiler manages it for you as far as it can. This is impossible in C because you have to do it manually. You might end up writing malloc() at the top and free() at the bottom of functions but that's the opposite of RAII.
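For comparison, the RAII version of the malloc-at-the-top/free-at-the-bottom pattern. A sketch using a global counter as a stand-in resource so the cleanup is observable:

```cpp
// RAII: the destructor runs automatically on every path out of the scope,
// including early returns and exceptions; nothing to remember at the bottom.
int g_open_handles = 0;   // stand-in for a real resource count

struct Handle {
    Handle()  { ++g_open_handles; }   // acquire in the constructor
    ~Handle() { --g_open_handles; }   // release in the destructor
    Handle(const Handle&) = delete;   // no accidental double-ownership
    Handle& operator=(const Handle&) = delete;
};

bool use_resource(bool fail_early) {
    Handle h;                      // acquired here
    if (fail_early) return false;  // released here too -- no leak
    return true;
}                                  // ...and released here on the normal path
```

The compiler, not the programmer, guarantees the release runs on every exit path; that is the part C cannot express.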
Note that they ban the use of dynamic allocation at run-time except via the stack, but that you are still allowed to allocate from the heap as long as that heap allocation is static for the life of the system. This avoids a whole host of problems related to heap exhaustion that result from allocation timing causing heap fragmentation.
It also eliminates a whole lot of uncertainty in the timing.
If you're running in an automotive environment, you're probably real time; that is, you have to finish your processing before, for example, the next cylinder comes into firing position. You have to hit that, for every cylinder of every rotation of the engine, for any rotation rate that the engine is capable of reaching. You can't be late even once.
Now in the processing you have a malloc call. How long will the call take? Depends on the state of the heap. And what is that state? Depends on the exact sequence of other calls to the heap since boot time. That's really hard to analyze.
Yes, you can get a memory allocator that has a bounded-worst-case response time, but you also need one that absolutely guaranteed always returns you a valid block. And the same on calls to free: there must be a guaranteed hard upper bound on how long it takes, and it must always leave the heap in a state where future allocations are guaranteed to work and guaranteed to have bounded time.
And, after all of that, you still have a bunch of embedded engineers scratching their heads, and asking "explain to me again how allocating memory at all is making my life easier?"
So embedded systems that care about meeting their timings often allocate the buffers they need at startup, and never after startup. Instead, they just re-use their buffers.
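The allocate-everything-at-startup pattern looks roughly like this (a fixed pool of buffers sized at build time; names and sizes are invented, and no malloc ever runs after init):

```cpp
#include <cstddef>

// Fixed pool: all storage is static, handed out via a free list.
// pool_alloc()/pool_free() are O(1) with no heap and no fragmentation,
// so their timing is trivially bounded.
constexpr size_t POOL_SIZE = 8;
constexpr size_t BUF_BYTES = 256;

struct Buf {
    unsigned char bytes[BUF_BYTES];
    Buf* next;   // free-list link, valid only while the buffer is free
};

static Buf  pool[POOL_SIZE];
static Buf* free_list = nullptr;

void pool_init() {               // called once at startup
    free_list = nullptr;
    for (size_t i = 0; i < POOL_SIZE; ++i) {
        pool[i].next = free_list;
        free_list = &pool[i];
    }
}

Buf* pool_alloc() {
    Buf* b = free_list;
    if (b) free_list = b->next;
    return b;                    // nullptr means the pool is exhausted
}

void pool_free(Buf* b) {
    b->next = free_list;
    free_list = b;
}
```

Exhaustion becomes a design-time question ("is POOL_SIZE big enough for the worst case?") instead of a run-time surprise.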
RAII is a disaster. Piecemeal allocation and wild jumping across the project to do all these little steps (to the point where the programmer cannot predict anymore what will happen) is not the way to go.
Then all the implications like exceptions and needing to implement copy constructors, move constructors, etc. in each little structure.
As to what C project doesn't just emulate RAII: Take any large C project and you will likely find diverse memory management strategies other than the object-oriented, scope-based one. Also, other interfacing strategies than the "each little thing carries their own vtable" approach. The linux kernel is one obvious example, of course.
But I also want to reference my own current project since it's probably written in a slightly unusual style (almost no pointers except a few global arrays. Very relational approach). https://github.com/jstimpfle/language. Show me a compiler written in RAII style C++ that can compile millions of lines of code per second and we can meet for a beer.
> The reason is, C++ doesn't impose anything on your program structure that C doesn't
Of course you can write C in C++ (minus designated initializers and maybe a few other little things). What point does this prove, though?
It wasn't really implemented for performance, and maybe the language is more complicated -- no doubt it's a lot slower. On the other hand, I can look at any function and see what its inputs and outputs are.
My compiler isn't optimized for performance, either! I didn't do much other than expanding a linear symbol search into a few more lines doing binary symbol search. And I've got string interning (hashing).
I've mostly optimized for clean "mathematical" data structures - basically a bunch of global arrays. This approach is grounded on the realization that arrays are just materialized functions, and in fact they are often the better, clearer, and more maintainable functions. If you can represent the domain as consecutive integer values, of course. So I've designed my datastructures around that. It's great for modularity as well, since you can use multiple parallel arrays to associate diverse types of data.
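The parallel-array style described above, sketched (names invented, not the linked project's actual code): a "symbol" is just a dense integer ID, and each attribute lives in its own array indexed by that ID, so a pass that only needs names touches only the name array.

```cpp
#include <string>
#include <vector>

// Relational / struct-of-arrays style: attributes are parallel arrays
// indexed by a dense integer symbol ID.
struct SymbolTable {
    std::vector<std::string> name;   // name[id]
    std::vector<int>         type;   // type[id]
    std::vector<int>         scope;  // scope[id]

    int add(std::string n, int t, int s) {
        int id = (int)name.size();   // IDs are handed out consecutively
        name.push_back(std::move(n));
        type.push_back(t);
        scope.push_back(s);
        return id;
    }
};
```

Adding a new attribute later means adding one more parallel array, without touching any existing struct layout or the code that iterates the other arrays.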
But anyway, your language looks impressive I must say.
Rust makes intuitive sense to anybody that knows C++ and Haskell (deeply enough to have seen the ST type constructor). There are some natural, healthy memory management decisions you might make in C++ that you can't do in Rust, but that's life. The obvious example would be if you want one struct member to point into another. I like Rust traits or Go style interfaces over classes. cppreference is far far better than any Rust documentation, which aside from Rust by Example, is an unnavigable mess. Oh, and it would be nice if C++ had discriminated unions.
I don't know enough about D to really comment on it (I read the black book on it 9 years ago, and the examples crashed the compiler), but... it has better compile times, right? There's a module system? I'd have to look at the D-with-no-GC story, figure out how hard it is to interop with C++ API's. I think I'd have better feelings about D (or C++) if it didn't have class-based OOP.
This seems like the wrong direction; C++ style projects are either more heavily indirected or make heavier use of compile-time reasoning with the type system. While you can pretend that a structure full of function pointers is a vtable (and the Linux kernel does a lot of this), it's not really the same thing.
Treating C as a sort of "portable assembler" is a lot better, although it runs into UB problems (see DJB on this subject).
The view of C programming that I'm describing is mostly compatible with the concept of treating C as a portable assembler.
I think there is a world of wild and crazy C++ (like, boost::spirit-grade, or std::allocator-using) that you're imagining, that is not what I am thinking of. If you took C, and added vector<T> and hash_table<K, V>, added constructors and destructors so you don't have to call cleanup functions, you'd get a language which most sensible non-embedded C programs would map to, which then maps upward to C++.
Maybe some templated functions like std::min<T> and std::max<T> and add_checking_overflow<T> would be nice to have too.
It depends on whether you think ELASTICARRAY_DECL is within the scope of portable assembler. (It gives you type safety!) (And I don't know what advanced assembly languages can offer in terms of that -- maybe they do too.)
Do C first. Achieve a small functioning project of your own, like the calculator in the back of Kernighan and Pike. This will give you a good understanding of pointer-oriented programming and the C tooling and debug strategies.
Then pick whether you want to start from the "front" or "back" of C++; i.e. learning the language in chronological order or in reverse. C++17 idiomatic style is very different from the 99 style that most existing C++ is written in.
I would suggest picking a codebase to work on and learning its style and subset of C++ first.
For the language, K&R is good, and Expert C Programming: Deep C Secrets by Peter van der Linden is a great second book.
But all the fun stuff happens when you start talking with your OS, so get a book about that too. If you are planning to develop on Linux, Michael Kerrisk's The Linux Programming Interface is excellent. Much of it will be familiar to someone used to shells and the terminal, but there will be plenty of new ideas too, and even the stuff you know will get a much deeper perspective.
The Linux Programming Interface is an excellent book. Beyond that, if you're looking to go deeper into the libc in Linux I would recommend taking a look at the man pages. They're very comprehensive, especially the pages in sections 7 and 8 which explain the operating system and administration tasks.
C is definitely doable though. Loads of good recommendations if you search for them. Everyone will recommend K&R. This book also goes through a lot of the lower-level things, debugging, etc: https://nostarch.com/hacking2.htm.
>before you rewrite your C book, perhaps you should take the time to actually dig into it and learn C first (inside and out.)
which is uninformed, as Zed wrote Mongrel and Mongrel2 in C. Saying he doesn't know C is ludicrous. He might have a different approach to C, but then argue this view instead of claiming your way is the only way. The author of that blog post is saying the book is bad because it is not the way he writes C. Not because it is objectively bad.
Also, replies to that post like "Just for info K&R stands for "Kernighan" and "Ritchie", please, don't compare the bible of C with "just another book on C". It is a blasphemy." are hilarious. People are just parroting "read K&R!!" off of each other. The term Stockholm syndrome is overused but it is very appropriate for people who think C is actually good.
Please, please, please don't suggest people use the k&r book to learn C. It is one bad practice after another and many of the suggestions in that book have given C much of its reputation for buffer overflows, stack smashing, etc.
It's important to realize that algorithms which scale very well on high performance and distributed systems are frequently not the best algorithms to use in an embedded system. For example scanning an array for each lookup instead of using a hash is usually good enough when your array size is relatively small(10 items to several-thousand). And it can mean the difference between spending a week or two coding a hash map versus solving the really important problem of building whatever product you need to build.
Embedded programming is all about forgetting all the really complicated algorithms you might have learned in school, because they usually don't matter, and when they do it's more important to be able to gather performance data than it is to do something fancy.
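For small N, the "dumb" version is genuinely competitive: a linear scan over a packed array is cache- and branch-predictor-friendly and takes minutes to write. A sketch of such a tiny fixed-capacity map:

```cpp
#include <cstddef>
#include <cstdint>

// Tiny fixed-capacity key/value map: a linear scan instead of a hash.
// For tens to a few thousand entries this is often fast enough, and the
// whole table occupies a handful of cache lines.
struct SmallMap {
    uint16_t keys[32];
    uint16_t vals[32];
    size_t   count = 0;

    bool put(uint16_t k, uint16_t v) {
        if (count == 32) return false;   // capacity is a design-time choice
        keys[count] = k;
        vals[count] = v;
        ++count;
        return true;
    }

    // Returns -1 if absent; the scan is one predictable loop.
    int get(uint16_t k) const {
        for (size_t i = 0; i < count; ++i)
            if (keys[i] == k) return vals[i];
        return -1;
    }
};
```

No hashing bugs, no resize logic, no allocator, and it is trivial to reason about its worst-case timing.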
Traditional concurrency is still a very real concern because DMA engines, specialized coprocessors like the TI N2HET, and offload engines like modern audio codecs all have their own internal firmware that your system must interact with. Even getting the system to boot and to transition to low power states requires you to understand clock trees, clock gating, power supply states, and how to interact with any PMICs. Getting a real-time clock working is a similar story.
Raspberry Pis are good for getting your feet wet with a Linux system that uses a specialized bootloader, but if you want to do anything truly embedded you're going to have to go deeper than that and work with a system like the PIC32 or even the PRUs on the BeagleBone Black.
If you're not using a JTAG at least occasionally then you're probably not close enough to the hardware to be considered "embedded."
There are good general rules in a few comments here, but I think it's fair to say that in 2018 specialization has reached the point where "systems/low-level" is too large a field to answer well in one reply. Writing game engines is one thing, networking code another, and embedded systems, device drivers, and so forth yet others. There are certain common aspects, but the details of what you need to know change significantly.
On that, unless you ABSOLUTELY love games and game engine design... the field is relatively flooded and comparative pay may not be great. Not to discourage, but making a decent living isn't a bad thing.
I think for low level systems it's important to be knowledgeable about the design of OSes. For example, how memory works, caching and paging, how programs are executed, how threads and processes are managed, how communication with peripherals is typically done... A good way of learning this could be implementing a CPU emulator of some kind in Rust, which will probably give you a good idea of what areas you need to explore further. There have been a few floating around on HN lately.
I only found Drepper's paper 4-5 years ago, but it would have been a godsend back when I learned how modern computers work. It took a lot of inefficient effort to scrounge together the basics, having a single source that covers most of the essentials is immensely valuable as a starting point. Agner Fog's manuals are another piece of essential reading and reference that I recommend everyone go through at least once.
And yes, the title isn't hyperbole. You can skim the parts describing specific tooling, but the fact that the description of the memory hierarchy and how it works hasn't been internalized by every programmer is a travesty.
I've read through part of it and I find it quite useful when I try to reason about how something ought to perform. I try to do this before measuring and checking if I'm right. Then I try to see where I went wrong (I'm seldom right) and again having an understanding of how the computer works really helps here.
I, honestly, don't know if it will help me in terms of my career because at my current job, it's not something I can put to use. However, I want to learn this because it is fun (for me) to understand things close-to-the-metal.
* don't listen to anybody telling you how it's done
The state of the art is BS and we need a revolution. There will be so many people who tell you that you can only do it in C and that if you don't, you can't be taken seriously. The result is that everybody is sticking to C and nobody invests time in bringing new ideas to the field. The chip vendors stubbornly ship you really, really bad C SDKs and have no interest in doing better, for reasons beyond my comprehension.
As you mentioned Rust - there is a Rust working group for embedded targets and they do cool stuff, but it's hard and a lot of work. Also, hardware is basically a huge block of global mutable state, so that is a problem to wrap one's head around.
But eventually we have to get rid of "C is the only serious option" which is an argument made by people who know how to write "safe" C and perpetuated by people who can't, but act like it.
[Before you react - I know this is an "extreme" statement and it's not 100% accurate and there is much more nuance to it - it is exaggerated for comic effect ;)]
regarding crappy SDKs from vendors - it is indeed strange they do not put more effort into this. I mean, software is a huge cost in embedded as in other fields. My only theory is that it is usually the hardware team that decides what chip to use!? Would be interesting to know if the respective tools for CAD/EE/hardware design are any better? (Hard to compare though)
What is your definition of systems programmer? If you mean closer to the hardware, I'd say it's worth the switch. Jobs are moving higher in the stack, and the lower layers are increasingly stable and abstracted away. Most companies don't need low-layer developers; off-the-shelf software is good enough for them. Exceptions being embedded system teams or customized hardware (network, storage, server) teams. Most of what they do relies on age-old techniques.
Only reason to go lower in stack is if you really ~like~ love hardware and exploring nitty gritty of how things work. In that case, pick any OS book and get ready to go down the rabbit hole :)
These are valid points for embedded systems without latency/power requirements, but I want to add that many embedded systems require rolling custom assembly and writing everything else in C to avoid the overhead of other languages and meet battery life and/or performance requirements.
Maybe look into mainframes such as the i series. You won't experience churn there. And there will be an increasing need for younger developers in this area in the near future as many of the current developers will be retiring.
Also, you won't be required to work on pet side projects, push them to Github, blog about them and post Show HN comments to build up street cred for future job interviews.
But I have no idea how one breaks into this field. Maybe that's worth an Ask HN or a Stack Exchange post.
There's still quite a few local govts that I've worked with that use AS/400s / iSeries type stuff for all financials. One just bought a brand new one a year or so ago because it was cheaper to buy a new one than to replace it with any other system.
I learned enough to know when to call IBM for support, but with that experience it wouldn't be too difficult to find another job managing one.
I remember 12+ years ago, my old boss bitching about working on an AS/400, then he'd hire his brother (who worked on them for years) to come out to do very simple stuff for quite a high premium.
I just wish they weren't so proprietary, but at the same time, I am glad they are. It's been a love hate relationship.
Now I work with an AIX system. Unix under the hood, but IBM still has a stranglehold on it. Both tanks though.
I saw one at Turkey Hill yesterday, with the text banner. The banner was pretty sweet. You'll find them all over the place. I'm pretty sure a lot of places are still using these systems like department stores and whatnot.
A bit on the more unusual side, one thing my son and I found very instructive on "lower level" concepts is the fact that within PICO-8, which is already a very user friendly high-level environment, you can use peek(), poke(), memset(), memcpy() and similar functions on any memory address that PICO-8 uses. Literally all the capabilities that PICO-8 gives you, it exposes memory for, and it explains the conventions it uses to read/write that memory, so that you could do it yourself. We were able to make a simplistic "paint" program that draws a pre-drawn cursor sprite and changes pixels whenever you drag the mouse, all using memset(), peek() and poke(), avoiding line(), spr() and Lua tables entirely. It does everything by reading and writing memory. It was a fun experience in learning certain useful low-level memory concepts.
PICO-8 limits the textual size of your input Lua file, so I'd imagine that it might be possible to squeeze more content into your game with a small interpreter and packed binary format represented as a printable string.
You don't have to write everything yourself, but just saying: be critical and think ahead when choosing which libraries to use and how you use them. One could even consider cherry-picking individual functions from other libraries/projects (where the license allows it) to keep your codebase as small as possible, for various reasons including security.
Learn to get an intuition for how your code performs.
When is it appropriate to allocate memory from the heap? If you're in a rendering or audio processing routine in a realtime context, avoid it at all costs.
Think about which parts of the code could profit from optimization. Does learning an assembler for a specific platform pay off or can the compiler do a sufficiently good job with -O3? Use profilers to identify performance bottlenecks.
Think about portability. Does the code have to compile with ancient C89 compatible compilers [Mine has to, and I would be excited to see a Rust to C transpiler]? Can you choose your compilers by yourself? Are they provided by your customers?
There are many different types of jobs. I will assume you mean low level development on an otherwise normal operating system like Linux.
I would start by learning how Linux userspace is constructed. Dig up old Linux From Scratch docs, build your minimal system, experiment with it. Try to understand how the different pieces are put together: how ELF binaries and static/dynamic libraries work, etc. Look at your running processes (there shouldn't be many if you do LFS) and try to explain everything you see (what each process does, how it does its job, etc.)
Learn how the kernel communicates with userspace. What are devices? What are syscalls? What syscalls are available? How do filesystems work? How do devices work? Etc.
Learn what the job of the Linux kernel is: how memory management works, what virtual memory is, and what the differences are when programming in user vs. kernel space.
The best way to learn system programming is definitely not getting up to your ears in a single open source project. You need a variety of knowledge because you want to understand in breadth how the system works. Once you get a job, or you figure out something interests you more, that will be a good time to specialize (say you want to specialize in security or containerization, etc.)
Low-level software often moves quite slowly and backwards compatibility is important. Embedded systems have to be maintained for decades. So knowing how to handle legacy code, proper dependency management, release management, and requirements management is quite important. Similarly, the cost of bugs is usually quite high, because updates are hard. So writing testable code, testing, and other QA activities are more important.
I would never ship a product today without remote signed upgrade ability.
Any device that plugs into a Windows PC can use Windows Update, for example, to upgrade firmware. Any device that sits at a remote location probably ought to have a GSM modem in it (adding only $1 to the cost of the device) for tracking uptime and firmware updates. Any device which offers an API to other devices should have 'provide firmware update' as part of that API.
Obviously security is a concern with updates, but by signing update files, and making sure downgrades aren't allowed, the security benefits of being able to patch vulnerabilities outweigh the disadvantages of the device manufacturer being able to produce evil update files.
Those who say that your lightbulb/toaster/USB hub don't need automated software upgrades are naive. Software on them will typically consist of many libraries totaling perhaps hundreds of thousands of lines of code. Security vulnerabilities will be found in that code, and even if the security of this particular device isn't of concern, it can be used as a jumping off point to attack other networked devices or for data exfiltration.
Lots of providers are happy to provide worldwide service for free, as long as you pay $10 per gigabyte. I generally budget 1 check-in per day, of about 250 bytes, so a coin cell can easily power it for a few years with a total service cost for 2 years of just a few cents.
I would say: learn a bit about the security implications involved. You don't have certain protections given to you like in other languages.
You have to watch what compiler flags you use, like someone turning off stack cookies, not using clang's sanitizers. Check out https://clang.llvm.org/docs/AddressSanitizer.html it would have prevented the Heartbleed vulnerability if it existed at the time.
You need proper bounds checking everywhere! You should fuzz your code with something like AFL, and if you don't have the time to set up test cases for it, just send your program random junk and see if you can get it to segfault.
Multithreading is hard, and detecting multithreaded bugs is even harder. Random monkey testing can sometimes help find these, but they are very hard indeed. Monkey testing is literally: if a monkey were smashing your keyboard while your program was open, what would happen?
Know the most vulnerable function calls by taking a look at banned.h from Microsoft's SDL. It looks like it's no longer on the official website, but the author put it up here: https://github.com/x509cert/banned/blob/master/banned.h. Sometimes you can't avoid using these functions, but know why they can be considered bad.
A proper Makefile is your team's best friend. The same can be said with one build machine for your whole team. It is easier now than ever to build and distribute it between your team with Docker. This is just a personal opinion, but I think downloading all the exact library versions you need once and putting them in a Docker image will save your team some pain in the future.
> You have to watch what compiler flags you use, like someone turning off stack cookies, not using clang's sanitizers. Check out https://clang.llvm.org/docs/AddressSanitizer.html it would have prevented the Heartbleed vulnerability if it existed at the time.
If that was true, then so would Valgrind have, and Valgrind was in wide use at the time.
It was however more complicated than that. If my memory serves me right - OpenSSL had its own memory management.
One important thing for an embedded engineer (just like any software engineer!) to know is how to select an appropriate solution for a given problem. This includes both hardware and software, and do it yourself vs off the shelf.
I think it can be summed up with some questions to ask at different points in a project:
- Should I use a microcontroller or a processor? If a microcontroller should I use a simple 8 bit or more featured 32 bit?
- Do I need an operating system like Linux, RTOS like FreeRTOS, or bare metal?
- Are there existing code modules out there to help kick start the project? Like SD card libraries, ethernet middleware, etc
- Should I design a custom PCB or look at development kits, or off the shelf electronics?
- Where should I design in flexibility in the project? What requirements can be solidified to simplify the design?
Knowing about what is out there helps pick appropriate solutions to problems, which will save the most time in the future.
What do you want? Do you want a system to play with for your own learning? Or do you want to build a shipping product?
If for your own learning, do you want a full-powered environment? Or do you want a simple system that you can learn all of, even if it's more of a toy?
If it's a shipping product, do you care more about ease of development, or about total parts cost? (The difference is often quantity that you expect to ship - 10 cents in part costs matters if you expect to make 100 million of them.)
- become good at either C or C++ or both. Lots of fun jobs/projects involve codebases that happen to be written in those languages.
- become fearless and systematic with assembly. You don’t have to be great at it. You just have to have had enough experience looking at it and writing it that you can hack it if you have to.
- learn to read code quickly and accurately. The only way to do that is a lot of practice. It’ll be hard at first but as you practice your reading speed will go up by >10x and eventually it’ll feel like second nature.
- become great at working with large code bases. What makes “systems code” so interesting is really just how big “systems” are. This kinda goes along with the bit about reading code - reading is how you survive in large code.
In embedded and systems-level programming, something I find indispensable is knowing what values fit in what types (for example, an 8-bit type can represent a maximum of 256 values; a 16-bit type can represent 65536 values).
This comes up incredibly frequently, and having it be second nature will benefit you.
A lot of the time, this gives you a starting place for how your inputs and outputs should look and what your method signatures should be. For example, if your data can fit in a uint8_t, why waste bytes on a uint32_t?
This isn't the be all and end all, especially since modern architectures use larger word sizes/RAM is plentiful/bandwidth is cheap, but it can be a helpful way to frame the conversation when you're architecting software (especially network protocols).
Also the ability to understand and check the performance impact of these choices. For example, at an old job on an embedded motor controller, the previous engineer had learned 'floats are slow, never use them on a microcontroller' and went and removed them from the code wherever he found them. However, in the motor control update loop this meant using longs (or possibly even long longs, I don't remember) in a couple of calculations. A quick check with a pin toggle showed this was significantly slower than just using floats.
I'm an ME who went into embedded systems, so I'm not sure what qualifies as low level for a web developer and if checking timing using an oscilloscope or saleae is feasible for the type of work OP wants to do. But even just looking at the assembly would have made things obvious in my example.
EDIT- This made me realize that a rudimentary understanding of the assembly language for your architecture will be very valuable. Doesn't have to be enough to actually write any code in it, but it's great to be able to take a look at what was generated when things are acting weird.
Fixed-point arithmetic is going to be significantly more efficient than floating point in many cases, except where you need to use relatively uncommon operations/functions (which are hardware-accelerated in the FPU), or in cases where the same value might span multiple orders of magnitude (i.e. a direct win for the floating point representation). The inner loop of a microcontroller is not generally one of these cases, though exceptions might be out there.
I was surprised it was faster, from what I remember to avoid floating point operations several large values had to be multiplied together then divided which required using a long long to avoid overflow. The division operation was quite slow. Thinking back I'm not 100% sure it was in the inner control loop but at the very least it was required to generate the new motor position.
Anyway a bit toggle tells all, and if you're making speed optimizations you should be checking that things actually got faster.
RAM is plentiful even on many microcontrollers now, but it becomes much less plentiful when you start storing thousands of data points and need to start bit packing boolean flags and reducing the integer size to squeeze as much as you can in the data structure! Or you need to stream as much of that data over a serial link as possible as quickly as possible!
So I agree, knowing how data is stored is critical.
A topic not covered thus far: what happens when the computer is powered up? Dig into writing your own boot ROM/FLASH. Nothing like having to research the various CPU registers that require initialization during boot to give you an intimate understanding of what the core of the system is.
A project I worked on that was also an interesting challenge was a RAMless monitor ROM. Write a program that will boot up the monitor (not the OS) and provide an interactive console that uses no RAM at all. Commands are to poke or dump RAM, read/write IO ports, probe the bus for devices etc. You can’t use the CALL instruction because you have no stack. Test your code by pulling the RAM modules.
System isn't necessarily low-level. System code is and can be quite high level, the distinction to just calling it "application" or "application level code" is that it caters mostly to system (no direct user interface) or applications (such as system libraries) instead of end users.
Then you can have "low level" as in close to the metal, for example drivers that talk to hardware. That said, even these are (and should be) written mostly at a fairly high level (especially in user space), and it's only a relatively small part that has to mess with HW specifics.
And then of course there's some embedded IoT stuff that has also different layers of software although for some people working anywhere in the embedded stack would qualify as "low level".
Generally speaking you'll want to lean on a few core technologies and APIs such as POSIX (on Linux), have a decent understanding of the von Neumann architecture, and a solid grasp of C/C++. Having a decent idea of how different subsystems such as the network or TCP/IP stack work can be useful depending on the domain.
> System isn't necessarily low-level. System code is and can be quite high level, the distinction to just calling it "application" or "application level code" is that it caters mostly to system (no direct user interface) or applications (such as system libraries) instead of end users.
That's exactly my point as well. Well said - ha at least better than I would be able to explain myself.
I'd start with everything by Hennessy and Patterson. That should cover most of how CPUs and memory systems work.
I grew up with Tanenbaum for networking, though most people went toward Comer and Stevens. Maybe there's something even more current.
I honestly don't know anything as good for storage, which is funny since it's my own specialty. I can try to cobble together a reading list if you'd like.
Something on operating systems. Vahalia for an overview, Bach for System V, gang of four for BSD. Yes, study something besides Linux ... but do study Linux as well. Not sure which books are good for that.
Something on compilers. Is the dragon book still the go-to reference here?
Even if you don't work in those specific areas, that should give you the grounding to study more. I'd also throw in something on databases, but don't know a specific book/author. Distributed systems much like storage: definitely learn, can't think of a good single source, can create a list if you want (I think I did one for a colleague not too long ago).
I would be keen to see the list on both storage and distributed systems.
For compilers the dragon book has fallen out of favor (especially as a first compiler book/source). Modern Compiler Implementation in ML (don't touch the Java or C versions) and Engineering a Compiler tend to be the go-to books now.
My two cents - keep the books/papers/resources on the shelf and go out and build an embedded project. Sure, you could read about interrupts, registers maps, chip datasheets, etc. but the rate of learning will be much greater if you learn by building.
For example, why not build a home automation system?
Start with a Raspberry Pi and begin hooking up peripherals: light switches, timers, sensors, etc. You'll stumble across buses like I2C/SPI, and you'll learn about networking. You'll figure out what registers are and learn what interrupts are and what they mean.
You'll get lazy rewriting communication code and stumble across messaging frameworks like MQTT to communicate with devices on your network. You'll run out of money using Raspberry Pis for each new device you build, and you'll find cheaper ways of doing things, like designing devices using MSP430s or ESP8266s.
You'll make mistakes, and you will learn. Best of luck on this new adventure!
Since your goal is to have a career switch I would recommend you the following approach:
- Essential books and papers are good as a reference to learn and improve but focus first on hands-on stuff.
- Get involved in an open source project that fits the 'system level' category, learn how it works and try to contribute to it (again, put your hands-on)
- Start with development in C language, a good tool to build would be a TCP client and server (kind of 'echo' program). The goal of this will be to learn: how to compile a C program, basics of memory management (malloc, free, realloc), networking (create a socket, bind, listen, connect), transfer data (write, read) and so on.
- If possible, learn using a native Linux environment.
All of the suggestions above will pay off if you are persistent and keep learning by building; of course, getting involved in an open source project is a must if you want to have a successful career.
Linux Device Drivers 3rd Edition is a bit dated but is still a wonderful introduction to systems programming in C. It's for Linux, but it presents information that is useful when working with any system.
Low-level systems programming is hard and can be a bit boring at times. I agree that debugging is very different from debugging on higher level systems. Most of the time an oscilloscope or an LED is the best debugging tool. Also, in well written low level code you tend to have more code making sure that everything is okay than you have actually doing stuff, which can be tedious. It takes a certain kind of personality to enjoy it.
Write a kernel extension. The Mac OS documentation is very comprehensive and helpful. It's a different type of programming to most people's day-to-day stuff. And it's very rewarding to see it load up into the OS (or crash the whole system as the case may be).
Last I checked, most embedded interviews (presumably barring Google etc.) can still be passed if you work through most of K&R (which is quite easy and accessible), know a bit about OS's, and memorize some trivia about locks. Worth reading about some of the deep C nuances as well if your interviewer frequents ##C on freenode.
Linked lists, bitcounting problems (which is covered in the first few chapters of K&R), etc. are all still popular.
The biggest issue you might have is the difference in salary and available positions you're going to see going from FANG Backend Dev at 5 bazillion a year or whatever it is now to 80-120K.
That is, while they are grinding their professional life away on kernel development to barely make ends meet, others with Geocities-level HTML skills are making 10 times more money with Clickfunnels selling crap nobody needs.
My current go-to for this is SQLite. It's basically made for this purpose. If that doesn't serve, I like the idea of Apache Avro, but some of its C++ bindings are a little lacking in my opinion.
This is a fantastic first choice, particularly as it sets you up for using a more "real" database for sharing data/scaling in the future.
OTOH, you have to know when not to use it and step up (down?) to something that is text-editor hackable (XML!?) or has barn-burner I/O abilities (yeah, actually just dumping raw buffers with regularized binary data to disk). Or, for that matter, something used to exchange data with other apps and services (JSON, and the long list of other domain-specific formats, although for at-rest exchange I have to point at XML again).
Watch Mike Acton’s CppCon 2014 talk on Data Oriented Design. His advice may not be at all applicable to whatever field you end up in (and it’s a deliberately contentious talk) but IMO it’s essential for a systems level programmer to have enough understanding of the related concepts to be able to argue about them convincingly.
You should learn C. Ideally, you could learn C++ as well, but I think it's best to learn C first. And then treat C++ as a completely separate language. Rust is a good thing too, but it's a little too soon for it to be broadly useful.
Then you should learn Unix. From an understanding point of view, I think it's probably better to learn something like FreeBSD, NetBSD, Xv6 ... Linux is very pragmatic, and very general, and so it doesn't have the purity that smaller, more focused or curated systems have. Once you have a handle on Unix, look at other OSes: Plan9, Minix, FreeRTOS, L4, etc.
Then networking: I suggest starting with IP, TCP, ARP; then step down to the physical layer: Ethernet, Token Ring, 802.11, hubs, and switches; then static routing, RIP, OSPF, and BGP; maybe look at mesh routing. Then some application layer stuff: DNS, NTP, SSH, NFS, LDAP, HTTP, etc. Reading the RFCs is really valuable, and they're remarkably accessible up until say 2500 or 3000 or so.
Security: symmetric and asymmetric crypto, Kerberos, SSL, SSH, OAuth, etc. Read up on pen testing, social engineering, defensive programming, fuzzing, etc.
Databases: both relational and otherwise. SQL. Wrap your head around how a filesystem and a database are the same and how they're different.
Messaging: some subset of protocol buffers, Cap'n Proto, Avro, Thrift, XDR; brokers vs. p2p; pub-sub vs. directed. There are hundreds of systems you can look at here: pick a few that look different.
Learn about complexity analysis, distributed consensus, locking, concurrency and threads.
So far as tools go, you need to understand a debugger (how to use it, and how it works), packet capture and analysis (Wireshark is good), and profiling and performance analysis.
That's probably a decent coverage for the software side. The exact focus will differ depending on embedded/real-time vs enterprise, etc.
From the hardware side, I think it's worth starting with an 80's or earlier, 8 or 16-bit system, and learning how it works to the digital logic level. What a simple microprocessor actually does: fetching, decoding, execution units, etc. A Z80 or 6502 or similar system is a pretty simple circuit, and it's worth really grokking how it works.
From there, you can move forward to more complex CPUs, newer memory architectures, newer buses, etc. But it's much harder to dive straight into a modern x86 or ARM CPU and try to understand how it works.
It's at this point that reading Drepper's memory article, and the "everything you should know about latency" article(s), etc., really starts to be useful, because you've got a solid grounding in what's underneath them.
You don't need to do this all at once, or before you start working more on backend or systems level code: I'd guess it took me close to 10 years to feel like I had a decent grasp on most of it.
The other posts are good, but I will give you the advice someone gave me (that I ignored) when I was considering the same thing: Consider doing it as a hobby. In general web stuff pays better. This seems crazy to me, but that's where the demand is.
I feel there is BS there too. You need to get board support packages. Sometimes the memory map is wrong. If you can get the board up, it might be a timing issue and you have to solder onto a trace, etc.
The number of web jobs is higher. The number of web programmers is higher, too. I don't think you can tell from the number of jobs how the pay situation is going to play out. (Look at the number of fast food jobs. It's huge. That doesn't result in high pay, though...)
I really can't tell whether an entry-level systems programmer gets paid better than an entry-level web programmer. But it seems to me that in web programming, you hit a wall at about ten years, where more experience quits translating into more pay. In embedded, you can find at least some jobs where 30 years experience gets you more pay than 20 years experience.
As with all these things, "sucks" is relative, and there are low paying jobs for anywhere in the stack. There's certainly fewer employers that need full time low-level software developers. So perhaps the contract/freelance approach is the way to go. It's certainly possible to achieve ~$1.5k/day consulting rates for low-level software development if you're an expert in a niche.
For high-throughput compute and cache correctness, here are the primers I can give. None of these are required. Strong language, CS, and interpersonal skills, team interoperability, and commitment to quality (via unit and integration testing, and skill working with linters) can get you a lot further.
The resources I'm linking are supplementary to the above, and you'll likely encounter them in the wild. But they'll help you build a base of knowledge, and give you terminology to search for, and work with.
- If you plan on working with Linux, this is an excellent reference: http://man7.org/linux/man-pages/dir_section_2.html Remember there is no magic in Linux; everything eventually has to go through a system call. So if you learn the system calls, you can learn how things work :)
The word "system" is used to describe many different collections. For example, the molecules that act on glucose to create ATP are considered a "system," as is something like an RF chip that has Tx/Rx on the same piece of silicon. Can someone help provide a definition for what a system means in the Hacker News context? And what "low-level" means as well?
There are post-mortem crashes where the only thing I can look at is the disassembly. It's not whether they can write it, but whether they know how to read it. It's even more useful in some cyber security subsets.
"Systems Programming" at the minimum, requires knowledge of Language internals, Runtimes, Compiler Toolchains, OS internals, Drivers, Computer Architecture/Organization, CPU/Multiprocessors and Networking. There are other domains like Security, Parallel/Distributed Processing etc which are cross-functional or orthogonal to the above and can be focused on as needed.
But be warned: earning potential-wise you may lose out (unless you get lucky) to the latest whiz-bang Web/Mobile technology/framework brouhaha. That is the nature of the market. However, I consider that the satisfaction of learning and "knowing" how things work more than makes up for the small loss in earning potential. This technology is also more fundamental and stable and thus will not go away anytime soon.
I have found the following papers/books (somewhat different from the most commonly cited) useful in my study:
Languages: Fluency in C is a must. It is THE "portable assembly" language and is available on everything from 8-bit MCUs to multicore servers. You can also link C modules to everything under the sun, thus allowing you to extend almost any other language. C++ is also needed, but use it judiciously as a "better C" rather than for heavy-duty OOP/Generic/Template-metaprogramming madness.
- Computer Systems: A Programmer's Perspective 3rd ed. based on x86-64. You might also want to get the 1st ed. which is based on 32-bit x86. These books cover the HW/SW interface and thus includes almost all the topics under "Systems Programming".
- The C Companion by Allen Holub.
- Inside the C++ object model by Stan Lippman.
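A quick sketch of the "extend almost any other language" point above (function name and build line are my own invention): a plain C function with default extern linkage, compiled into a shared library, is callable from nearly any language's FFI (Python ctypes, Ruby FFI, LuaJIT, Node, ...).

```c
/* clamp.c -- hypothetical example module.
 * Build as a shared library:  cc -shared -fPIC -o libclamp.so clamp.c
 * Load from, e.g., Python:    ctypes.CDLL("./libclamp.so").clamp(7, 0, 5)
 * Clamps v into the inclusive range [lo, hi]. */
int clamp(int v, int lo, int hi) {
    if (v < lo) return lo;
    if (v > hi) return hi;
    return v;
}
```

Because the C ABI is the lingua franca on every mainstream platform, this one compiled object can back bindings in a dozen languages without modification.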
Tools/Toolchain: Knowledge of the GCC toolchain is a must.
- The Definitive guide to GCC by Hagen
- ELF: From The Programmer's Perspective; paper by H.J. Lu
- C++ Under the hood; MSDN article by Jan Gray
- The Practice of Programming by Kernighan and Pike.
- UNIX Systems for Modern Architectures: Symmetric Multiprocessing and Caching for Kernel Programmers by Curt Schimmel.
- Linux Kernel Development 3rd ed. by Robert Love.
- Essential Linux Device Drivers by Venkateswaran.
- Embedded Linux Primer 2nd ed. by Hallinan
- The Unix Programming Environment by Kernighan and Pike
- Advanced Unix Programming 2nd ed. by Rochkind
- Modern Processor Design: Fundamentals of Superscalar Processors by Shen and Lipasti
- Computer System Design: System-on-Chip by Flynn and Luk
- Foundations of Multithreaded, Parallel, and Distributed Programming by Andrews
- The Art of Multiprocessor Programming by Herlihy and Shavit.
Bare-metal Embedded: Where the "rubber meets the road"
- Embedded Systems Architecture: Explore architectural concepts, pragmatic design patterns, and best practices to produce robust systems by Lacamera
- Patterns for Time-Triggered Embedded Systems by Pont.