Learn Rust with entirely too many linked lists (2019)

(rust-unofficial.github.io)

388 points | by goranmoomin 1497 days ago

17 comments

  • pornel 1496 days ago
    Learning Rust by writing a linked list is like learning Python by writing a CPython extension. Sure, you'll learn a lot, and it may even be useful, but it's not the typical experience of using the language.

    I've seen people completely confused that such a simple "CS 101" thing is such a mess in Rust. But nobody actually writes linked lists in Rust:

    • The borrow checker wants to have clear single ownership, and doesn't support reference cycles. Lists happen to have mixed ownership. I'm not sure if it's even possible to prove at compile time that a list will never have a cycle, but it seems at least in the Sufficiently Smart Compiler territory.

    • Safe linked lists are in the std lib if you really need them, but std also has plenty of other containers that are more efficient on modern hardware.

    Compile-time safety checks can't accept every valid program (halting problem). Instead of going after that enormously complex problem, the borrow checker's rules are kept pretty simple. That's a feature, not a defect. Simplifying the problem to single ownership with shared-immutable vs exclusive-mutable references makes borrow checking much easier to reason about. If something doesn't fit these rules, you either use a different approach, or use `unsafe` and then wrap it in a higher-level safe abstraction that fits the model.

    • peteretep 1496 days ago
      > But nobody actually writes linked lists in Rust

      It feels entirely like you didn't even start reading this. He spends a whole long first page bitching about Linked Lists and why nobody uses them in Rust.

      • cormacrelf 1496 days ago
        I come back to this from time to time as a reference for doing the uglier things in Rust. Writing Iterator<Item=&T> with raw pointers, recursive drops, etc. It's better than reading the std source code, because it's got a guide right next to the code. I think I'm using it as intended. The title might be suboptimal/confusing for what it's trying to do. It's less a resource for newcomers than an illustration of someone bumping their head against the wall to do hard things.
    • slaymaker1907 1496 days ago
      Also, the way destructors work makes most custom linked lists prone to stack overflows unless they use unsafe.
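
      One common mitigation is an explicit iterative Drop. A rough sketch of my own (assuming a plain Box-based singly linked list, not code from the guide): detach each node before it drops, so the default recursive destructor never walks a long chain.

        struct Node<T> { elem: T, next: Option<Box<Node<T>>> }
        struct List<T> { head: Option<Box<Node<T>>> }

        impl<T> Drop for List<T> {
            fn drop(&mut self) {
                // Detach nodes one at a time so each node is dropped with
                // `next == None`, turning the default recursion into a loop.
                let mut cur = self.head.take();
                while let Some(mut node) = cur {
                    cur = node.next.take();
                }
            }
        }
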
      • umanwizard 1496 days ago
        For what it's worth, this is also true of C++, where the most "natural" way to write a linked list is `struct Node { unique_ptr<Node> next; }`
    • Igelau 1496 days ago
      I think you missed the point. The guide is fairly upfront about it being a bad idea and a very atypical approach to learning Rust.
  • Animats 1496 days ago
    Doubly-linked lists are hard with Rust's ownership system. I've argued for Rust having backpointers as a built-in type the borrow checker understands. You have to maintain the invariant that if A points to B, B points back to A. A is an owning pointer, and B is a non-owning pointer locked in a relationship with A. Easy to check if the language lets you say that's what you're doing. Rust lacks that.

    If you have safe backpointers, most tree-type data structures with backpointers can be constructed. A nice feature to have.

    • dpbriggs 1496 days ago
      You can somewhat emulate this with Weak pointers:

      https://doc.rust-lang.org/std/rc/struct.Weak.html

      They don't count against ownership, but do bring some extra headaches of their own (reference-counting overhead, etc).
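
      A minimal sketch of the parent/back-pointer pattern (illustrative only): the owning direction uses Rc, the back edge uses Weak, and RefCell allows the links to be set up after construction.

        use std::cell::RefCell;
        use std::rc::{Rc, Weak};

        struct Node {
            value: i32,
            parent: RefCell<Weak<Node>>,      // back edge: non-owning
            children: RefCell<Vec<Rc<Node>>>, // forward edges: owning
        }

        fn main() {
            let parent = Rc::new(Node {
                value: 1,
                parent: RefCell::new(Weak::new()),
                children: RefCell::new(Vec::new()),
            });
            let child = Rc::new(Node {
                value: 2,
                parent: RefCell::new(Rc::downgrade(&parent)),
                children: RefCell::new(Vec::new()),
            });
            parent.children.borrow_mut().push(Rc::clone(&child));
            // upgrade() returns Option<Rc<Node>>: None once the parent is gone,
            // so there is no dangling pointer and no ownership cycle.
            assert_eq!(child.parent.borrow().upgrade().unwrap().value, 1);
        }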

    • josteink 1495 days ago
      > I've argued for Rust having backpointers as a built-in type the borrow checker understands. You have to maintain the invariant that if A points to B, B points back to A. A is an owning pointer, and B is a non-owning pointer locked in a relationship with A.

      Or you could simply give it a new type/semantic: owner.

      You could even use a familiar unix-shorthand for it: ~. Thus Node<T> will have a Parent: ~Node<T>, which you can pass by ~self.

      And graphs would suddenly be nice to work with.

      > Rust lacks that.

      I can’t be the only one who finds that ironic?

      And I say that as someone who likes rust.

      • steveklabnik 1495 days ago
        It is not particularly ironic; there's no actual design here. It's easy to say that a feature should exist (and your parent has said this, about this, repeatedly) but it's much harder to make sure that it's something that has the correct semantics (and your parent has never actually produced this, even though they've been asked, repeatedly).

        (See the sibling comment by dan-robinson for just some of the issues here)

        • josteink 1495 days ago
          The absolute simplest semantic I can come up with here (too simple?) would be for a struct containing a non-null owner-pointer to be bound by its owner’s ownership/lifetime-scope.

          For most commonly used graph-types this should lead to pretty simple one-directional ownership chains which should be doable (although probably not trivial) for a compiler to enforce.

    • magicalhippo 1496 days ago
      As a non-Ruster, any particular reason they didn't include this? A lot of the code I've done over the years involves graphs, which have a lot of circular pointers and/or backpointers. I find it kinda weird they'd make such a common thing a PITA in a new language.
      • dodobirdlord 1496 days ago
        Rust's ownership rules aren't there to make the user's life hard; they are the least restrictive system that could be devised that still allows the compiler to uphold Rust's safety guarantees. Not too long ago it was common knowledge that a programming language either has a garbage collector or has manual memory management, or possibly both. Safe Rust has neither, but the price is that certain kinds of programming that are fundamentally difficult to make safe become awkward.

        The Rust question to ask here is: Do you really need pointers, specifically? Or would some other reference mechanism work? You could put all of your graph nodes into a linear data structure like an array, and have them point to each other by holding a list of indices instead of a list of pointers. Or you could give all of your nodes unique keys and keep them in a table, and have them hold references to each other by key. The compiler will not try to prove the correctness of your graph algorithm, and in the event of programmer error that leads to dangling references your program will have to handle the scenario of following an index or a key and not finding a value, so bugs will not introduce memory unsafety.
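
        A bare-bones sketch of the index approach (made-up names, not from any particular crate):

          struct NodeData { label: String }

          #[derive(Default)]
          struct Graph {
              nodes: Vec<NodeData>,
              edges: Vec<Vec<usize>>, // edges[i] holds the indices of node i's neighbours
          }

          impl Graph {
              fn add_node(&mut self, label: &str) -> usize {
                  self.nodes.push(NodeData { label: label.to_string() });
                  self.edges.push(Vec::new());
                  self.nodes.len() - 1
              }
              fn add_edge(&mut self, from: usize, to: usize) {
                  self.edges[from].push(to);
              }
              // A stale index shows up as a failed lookup, not as memory unsafety.
              fn node(&self, i: usize) -> Option<&NodeData> {
                  self.nodes.get(i)
              }
          }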

        There's also ongoing work on memory arenas in the nightly compiler, and I believe in some libraries. Putting a graph into an arena is a good way to appease the compiler, because the entire arena is freed at the same time, ensuring that dangling pointers never exist between graph nodes.

        • dmitrygr 1496 days ago
          > You could put all of your graph nodes into a linear data structure like an array, and have them point to each other by holding a list of indices instead of a list of pointers

          Rust practitioners keep proposing this solution. It is like you have never heard of caches or do not understand that modern CPUs have multiple special prefetchers for pointer chasing and no prefetchers for "rust fanatics"

          This solution will be SIGNIFICANTLY slower on any modern CPU. By a wide margin! Since you'll keep having to wait for main memory as the cache prefetchers have no idea about this insane scheme and will not prefetch the data for you. They will for actual pointers.

          • adito 1496 days ago
            > This solution will be SIGNIFICANTLY slower on any modern CPU. By a wide margin!

            I'm not an expert and would like to know more. Can you point me to any benchmark demonstrating those claims? Would love to see the actual numbers.

            • fredsanford 1495 days ago
              Search for "array of structures" (AoS) vs "structure of arrays" (SoA) benchmarks to see comparisons similar to what you're asking for.
          • aw1621107 1496 days ago
            > It is like you have never heard of caches

            But isn’t that part of the point of putting nodes into an array? So the nodes are guaranteed to sit next to each other in memory, and so are quite likely to be in cache?

            > do not understand that modern CPUs have multiple special prefetchers for pointer chasing and no prefetchers for "rust fanatics"

            Doesn’t array indexing reduce down to pointer chasing anyways? Or are you saying that the prefetchers can’t see through “nested” array accesses (e.g. nodes[nodes[i].out[0]].data vs node.out[0].data)?

            • dmitrygr 1496 days ago
              Yes that is what I am saying. See Intel manuals for patterns they can detect. Mostly it is strided accesses and use of pointers at fixed offsets of a structure which the previous pointer pointed to.
              • aw1621107 1496 days ago
                So I guess it comes down to whether the graph would fit into cache and the particulars of the access patterns? If the graph fits into cache as an array the prefetchers would be more or less unnecessary since everything is already there. Beyond that, I probably don’t have enough experience or knowledge to make a good guess.
                • dmitrygr 1496 days ago
                  Yes. If the entire thing fits yes, but then it is small enough that you might as well manage it in Basic or JavaScript. No real programming required
          • dodobirdlord 1495 days ago
            I don’t do much graph programming, and the few pieces of graph programming I’ve had to work with that needed to be high-performance fit in L1/L2. But for most non-pathological graphs (treelike, few backlinks from nodes deep in the graph to nodes shallow in the graph, nodes are small) you could arrange a statistical locality property such that child nodes are usually located at nearby greater indices. Most operations on the graph would sweep left->right in small strides likely to stay within the same cache line or land on a page that has been prefetched by the linear memory access pattern prefetcher.

            Certainly this is more awkward to write than pointer linking, but approximately the same performance should be achievable for sparse or treelike graphs. If you need to work with dense graphs larger than the size of the cache performance will suffer a lot, but this meets the criteria to justify using unsafe, at which point the implementation would look a lot like a C implementation.

      • eximius 1496 days ago
        Rust has a concept of ownership and borrowing. (I think) everything is fine when you just have immutable references everywhere. But as soon as you start taking mutable references (of which only one can exist at a time) or ownership (stronger than mutable references), then cycles break things.

        You can get around it with Interior Mutability; it's just slightly more verbose.
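
        For example, a doubly linked node can be sketched with Rc<RefCell<...>> plus a Weak back-pointer (an illustration only, and noticeably noisier than the raw-pointer version):

          use std::cell::RefCell;
          use std::rc::{Rc, Weak};

          type Link = Rc<RefCell<Node>>;

          struct Node {
              elem: i32,
              next: Option<Link>,                // owning direction
              prev: Option<Weak<RefCell<Node>>>, // non-owning back edge
          }

          // Insert a new element right after `node`, fixing up both directions.
          fn push_after(node: &Link, elem: i32) {
              let new = Rc::new(RefCell::new(Node {
                  elem,
                  next: node.borrow_mut().next.take(),
                  prev: Some(Rc::downgrade(node)),
              }));
              if let Some(next) = new.borrow().next.clone() {
                  next.borrow_mut().prev = Some(Rc::downgrade(&new));
              }
              node.borrow_mut().next = Some(new);
          }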

        Also, it isn't really that they made an easy thing hard. Properly maintaining the invariants with cyclical references is hard and unsafe. Rust exposes that difficulty in a very literal way

        • wruza 1496 days ago
          Complete blind guess here, but maybe a safety-focused language could “borrow” from accounting systems. The basic idea is that you don't need to validate every step as long as the outcome of a specific set of operations is atomically valid. Some SQL databases have it, like checks on foreign keys and column check-exprs on commit, not on insert (that's how you make back-refs by {insert; insert}, and not {insert; insert null; update}, which is similar to the dl-list and graph problem). Or you check that sums on both sides match, but not before “committing” ops into a book.

          Half of the code in question uses a sort of atomic swap (mem::replace), but these operations are too small not to leak complexity into “userland”. If they just replaced that with

            transaction {
              ...shuffle values...
            }
          
          and checked it at the closing “}”, it would be much easier to reason about while programming, instead of building microbridges everywhere. If the code is not long and/or threaded, an analyzer could calculate the “balance” in a reasonable time just by looking at careless source code.

          >Properly maintaining the invariants with cyclical references is hard and unsafe. Rust exposes that difficulty in a very literal way

          But a solution to this problem is easy to articulate: adjust the corresponding nodes if this one goes away. The language could help with that instead of just exposing the difficulty, e.g. by tagging such types as heavily linked and demanding/deriving a (suboptimal) algorithm that would ensure correctness, or at least by defining what “corresponding” and “adjust” mean.

          ps. I’m not familiar with rust, nor with discussions on it, maybe that was already discussed and refused or proven unreasonable at early design stages.

          • eximius 1495 days ago
            That's kind of how unsafe Rust works, except that the compiler doesn't check that you maintained the invariants. Your unsafe code is expected to conform such that safe code interacting with it is safe.

            > But a solution to this problem is articulated easily: adjust corresponding nodes if this one goes away. It could help with that instead of just exposing,

            Articulating things easily doesn't mean they are easy. Now every value needs a backreference to every value that holds it and, during its destructor (which isn't guaranteed to run), it has to make those held references invalid?

            I know just enough to know how difficult the problem is, in the general case. The folks thinking about these problems for their day jobs have looked into a lot of simple solutions, and they fall apart in cases that are too common or too valuable to not support.

            • wruza 1495 days ago
              Unsafe is not any kind of transaction, it is a deal under the bridge, quite the opposite. The Actix author fell victim to that recently, iirc. Anyway, I’m not arguing, it is clear how microcontrol works. My concern is why it is made so hard for a developer to reason about referential balance, when a machine exists to do it.

              >Now every value needs a backreference to every value that holds it

              If it didn’t, wouldn’t that be out of scope of graph/dl-list discussion?

              It is interesting that solutions fall apart, because it seems like a warehouse-level problem to me. Any code path is just +1 -1 countable references with some matching-branching in the end. Are these design discussions archived somewhere, like a mailing list / technical rationale talks?

              ps. I did not mean anything like “rust magically freeing graphs”. Only that “transaction {shuffle} check” thing. Like in physics, N spin in, N spin out, what’s where is not important, since nothing lost.

      • moth-fuzz 1496 days ago
        Historically, the classic graph structures are adjacency lists and adjacency matrices. Those are quite easy to represent in Rust if you use indices instead of pointers. Node/Pointer type graph structures aren't particularly efficient anyway, even in C or C++.
      • Animats 1496 days ago
        A backpointer is an enforced invariant. That tends to produce proposals that it should be handled by adding a general system for invariants and design-by-contract features. Those are then rejected as too complicated and something for the far future.

        So, the usual answer is "use unsafe code." What could possibly go wrong?

      • habitue 1496 days ago
        It might actually be possible now with Pin. Could anyone actually knowledgeable about this comment?
        • renox 1496 days ago
          I don't know much about Rust (and nearly nothing about Pin) but if this was true, don't you think someone would already have written a library providing doubly linked list without using unsafe?
      • lrem 1496 days ago
        In your graph, the nodes are not owning each other, are they?
        • magicalhippo 1496 days ago
          No, but they do point to each other. For example, each node can have an array of all its edges, and each edge has pointers to the two nodes it connects. Via this you can walk around cycles, for example. And typically these can change when nodes are inserted or removed.
    • dan-robertson 1496 days ago
      I would guess backpointers are tricky because they are types that need to care about which record field of some other type they are stored in. I would guess that it would be a lot of work to fit something like that into the language and make sure it’s correct. And it would surely become harder if backpointers were allowed between objects of different types.

      One way one can implement a circular doubly linked list is to confine all pointer manipulation to precisely two operations:

      1. Creation of a node x such that x.next = x = x.prev

      2. Given two double-links A <-> B (ie A.next = B, B.prev = A) and C <-> D, swizzle the pointers so A <-> D and C <-> B. Note that if the links are equal then the operation is trivial. Note also that these links must be in the direction of the circular list (ie if you have a circular list A<->B<->C, you can’t “reverse” the A<->B link to give this operation a B<->A link)
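
      (Setting the ownership question aside for a moment and using indices instead of pointers, the two operations themselves are tiny; a hypothetical sketch:)

        struct Node { next: usize, prev: usize }

        // Operation 1: a fresh single-element cycle x <-> x.
        fn singleton(nodes: &mut Vec<Node>) -> usize {
            let x = nodes.len();
            nodes.push(Node { next: x, prev: x });
            x
        }

        // Operation 2: given links a -> b and c -> d (in list direction),
        // swizzle them into a -> d and c -> b.
        fn swizzle(nodes: &mut [Node], a: usize, b: usize, c: usize, d: usize) {
            nodes[a].next = d; nodes[d].prev = a;
            nodes[c].next = b; nodes[b].prev = c;
        }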

      One way to think of this is that every finite permutation (ie disjoint product of cycles ie set of disjoint circular linked lists, but also as permutations act on themselves as a way of rearranging the links to transform sets of disjoint doubly linked lists to other sets of doubly linked lists) is a product of transpositions (which correspond precisely to operation 2; and operation 1 corresponds to the identity element which may be thought of as a product of every cycle of size 1 rather than a product of 0 cycles).

      Suppose you construct the type of the “double pointer”. You use some magic or unsafe code to make sure that the creation operation follows these rules (I don’t really know any rust but I think this rule could be enforced with some careful helper functions and linear types. But rust doesn’t have linear types so maybe there isn’t a type system way to enforce that creation is done correctly or even that it is validated at runtime).

      But now what happens to the ownership with the magic swizzle operation? If the links are from different lists then it is splicing them together and the nodes should (I guess) now have the same owner/lifetime. If the links come from the same list then the operation splices them into two separate lists, so I guess the ownership should be split. The problem is that you don’t really have a way to do that because there isn’t really a way to know or specify at compile time whether two nodes are definitely in the same list or definitely not in the same list. (I guess you could enforce that every element has a pointer to some “owning object” for the list it’s in but now all your operations that were constant time are linear time).

      Adding or removing elements is a special case of these operations.

      It isn’t sufficient to always treat the result of the swizzle operation as if the lists have become the same list because if they have separated then you wouldn’t know to delete half of the list.

      Maybe the reply is just that somehow the typesystem should allow backpointers but somehow not this swizzling operation. But I don’t see how that helps with the ownership problems of doubly linked lists.

      Maybe the answer is that rust should get (or already has) some way to always know whether two things “definitely or definitely don’t alias when you include the transitive closure of all their pointers; but actually only some of their pointers,” but that sounds hard to define and even harder to have the type checker able to resolve.

  • kybernetikos 1496 days ago
    I really value this guide.

    Writing data structures in a language is one of the standard ways that I convince myself that I understand it. Rust really messes hard with that mindset.

    This guide talked me through all the things I'm not used to worrying about in garbage collected languages, showed me the error messages that I'd been banging my head against, explained what they meant, and did so with humor.

    • Igelau 1496 days ago
      This Gonzo approach of running headlong into a bad idea is refreshing.
  • jasondclinton 1496 days ago
    I went through this a few years ago for fun. It's a great guided tour of the compiler errors when working with complex memory safety needs. The most important thing to know about these is that you would almost certainly never do these in a real project: lists are in `std` but even then the vast majority of cases should use `Vec`.
    • grok22 1496 days ago
      I am learning Rust due to its memory safety features and comments like this rub me the wrong way for some reason -- they give me an uneasy feeling that I will run into unexpected limitations in the language because these corners are not exercised enough. Because people who know the language are saying that the kind of data structures that I deal with day in and day out in my day-to-day work are not the "preferred" thing to do.

      I do system software/networking software and I deal with a lot of linked lists and trees, and these comments make me feel that Rust may not be a good language for my use-case in spite of me wanting the memory safety features.

      • btilly 1496 days ago
        Don't worry. The language has lots of support for the style you wish to work in.

        As for the specific data structures, I recommend setting aside your knowledge of their utility and considering carefully why they are being criticized. Then run some experiments to convince yourself of what is true. If you come to the conclusion that the criticisms make sense, then you can proceed with the knowledge that the people in question have thought about the issue and come to good decisions. This should inspire hope that they have thought about other issues as well and may have valuable insights.

        For the record, I concluded many years ago that some variant on arrays/vectors/whatever outperforms linked lists in most situations by a lot. (There are times when arrays/vectors/whatever can't work. But they are less common than most imagine.) And changes in hardware have made that more true over time. Trees, on the other hand, have a flexibility that lends itself to many problems for which it is hard to find other structures. But even so, if performance is critical, it is worth jumping through a lot of hoops to make sure that all of the pointers that make up a tree are physically close together in memory to improve the efficiency of your caches.

      • jdsully 1496 days ago
        The comment applies word for word to C++ as well. Linked lists are just bad data structures for modern CPUs.

        I don’t use rust so I can’t speak to the quality of the std lib, but I don’t think it’s fair to use the parent comment as evidence either way.

        • blub 1496 days ago
          No, it doesn't apply to C++ because the corners of the language required to write lists and trees are exercised enough and don't suffer from unexpected limitations.
          • jdsully 1495 days ago
            The comment originating this discussion provides no evidence in any direction on the quality of the rust standard library. Merely that you wouldn't use these in practice, nor would you in C++.
      • pas 1496 days ago
        Do you currently use C? C++ perhaps?

        I mean basically no other language that currently comes to mind fiddles so much with the basics.

        In Rust, just as in Java, people write highly optimized safe and fast data structures and others build upon that.

        In C/C++ it seems every project reinvents the wheel to a large degree.

        Or am I mistaken?

        • bobbiechen 1496 days ago
          >In Rust, just as in Java, people write highly optimized safe and fast data structures and others build upon that.

          I'm currently taking a class on API design with Josh Bloch (Java Collections, Effective Java, etc.), who pointed out that this wasn't the case until the late 90s or so - he pulled out an example of a KWIC system [1] described in a paper [2] from 1971:

          Parnas 1971: This is a small system [that] could be produced by a good programmer within a week or two.

          And then Josh proceeded to show his implementation of the same system, which he had written in <2 hours just by making use of the built-in collections, regular expressions, sorting, string formatting, etc.

          It's really cool that these standard data structure implementations exist and are so accessible, making people orders of magnitude faster than before.

          [1] https://en.wikipedia.org/wiki/Key_Word_in_Context [2] https://prl.ccs.neu.edu/img/p-tr-1971.pdf

        • grok22 1496 days ago
          I use C. And you are right, in my field there is a tendency to a large degree to re-invent the wheel of data-structures :-). Sometimes justified, but most times not. But mostly because C doesn't usually have well-known, industry-standard libraries for common data-structures. Or even if they do, it's kind of trivial to implement basic data-structures (not saying they will be bug-free!) instead of relying on some random distribution of a library from the Internet.

          BTW, most times, the problem is not performance, but tight control of memory usage.

          • johnisgood 1496 days ago
            Yeah, and there is a reason for why there are not many "industry-standard libraries for common data-structures" out there in C. I think the reason (or one of the reasons) is that there are zillions of ways to implement them, and "one size does not fit all". I often have to ditch the standard library's implementation of this and that in other languages, too, because they are not fine-tuned enough for my use case.

            That said, there are lots and lots of C libraries installed by default on Linux distributions (via the distribution's own package manager!) that are reused by other projects.

          • throwaway17_17 1496 days ago
            I think your point about control of memory usage is spot on. Most conversations tend toward cycle count and execution speed and fail to consider memory usage at all (I know I’m certainly guilty of it).
        • rascul 1496 days ago
          With rust and cargo it's often trivial to use a third party lib from a centralized location that has a rich collection of submitted libs to choose from. It's also fairly easy with python/pip and node/npm. There's nothing quite like it on the same scale and as widely used with c/c++, which means it's more difficult to pull in third party libs. It's often easier to just implement the stuff yourself. Or at least that's my experience. I'm not familiar with java.
          • johnisgood 1496 days ago
            > There's nothing quite like it on the same scale and as widely used with c/c++

            I use my Linux distribution's package manager. It works fine for C, and it is fairly easy, too. You only have to learn to use one package manager; your system's package manager. Plus I am totally fine with "reinventing the wheel" if that wheel is just 2 lines of code. :)

            Heck, I would even go out on a limb here and claim that it is on the same scale and as widely used with C as it is with Rust or Python, if not more. A typical Linux distribution contains quite a lot of C libraries alone upon which other projects written in C (or other languages, for that matter) depend. If the program you install via your system's package manager depends on a C library, it will get installed (obviously), and it is often reused by many other programs. Most C developers I know have the tendency to make their program depend on as fewer dependencies as possible. "Zero dependencies" is usually a "feature" or a selling point, and a good one at that, IMO. I prefer this over having a package manager for all programming languages separately, depending on over 300 dependencies of which 80% is just 2-10 lines of code and so forth. Additionally, take for example this: if I want to cargo build two projects that depend on the same crate, it fetches and builds it twice, or at least it did a year ago. I found it to be odd. It is a waste of space and time. There are ways to solve this.

            • rascul 1495 days ago
              The distro package manager isn't integrated into the prominent c/c++ build systems such as autotools and cmake. There's a good bit more friction involved in making sure the relevant libs are installed and put into the place the build system can figure out since those build systems aren't also integrated into the third party package discovery and installation. I'm not saying cargo makes this perfect, I'm saying it's a lot easier to add a line to Cargo.toml vs forcing the users or other developers to ensure this is done via external mechanisms.

              'cargo build' vs './configure; apt install something-missing; ./configure; apt install something-missing2; ./configure; make'

        • int_19h 1496 days ago
          It is definitely not the case for C++. It has a decent standard library - when it comes to collections, richer than many other languages, in fact - and then there's Boost for more exotic needs.
      • mkesper 1496 days ago
        There's just no need to implement trivial data structures yourself if they've already been implemented. Also, linked lists might be sub-optimal; see other comments.
      • dodobirdlord 1496 days ago
        Linking my comment from elsewhere.

        https://news.ycombinator.com/item?id=22393723

    • Dylan16807 1496 days ago
      Linked lists are inherently niche on modern hardware.
      • Animats 1496 days ago
        Yes. Lists were fine when memory was random-access. Today, if you're linking all over memory, cache misses dominate performance. So contiguous data structures are preferred.
        • chapium 1496 days ago
          This isn't a linked list issue at all, but a problem with constructing the nodes. If you construct list nodes in a contiguous location, cache misses are a non-issue.
          • mikepurvis 1496 days ago
            But that's fundamentally a function of the usage pattern. If the usage is "construct a bunch of nodes" and then nothing, then you should be using a vector-type structure anyway.

            The promise of a linked list is being able to iterate and make constant-time insertions/deletions wherever you want. But the more such mutations occur, the more fragmented your memory will end up, and the more you'll get hit by the cache issues—especially if your list is large, exactly where the purported benefits of a linked list are strongest.

            • Jasper_ 1496 days ago
              > If the usage is "construct a bunch of nodes" and then nothing, then you should be using a vector-type structure anyway.

              Not if the nodes are variable-sized.

              • mikepurvis 1496 days ago
                Again it depends a lot on the usage pattern. If you're going to be dereferencing a pointer off to some allocated-elsewhere object on every little operation or comparison then sure. But if there are sorts or searches or other operations that can take place against a fixed-size "index" object, then it may still make sense to have a vector-like or pre-allocated pool of those objects and only pay the cache hit cost when you do major operations on a particular instance.

                Which is really just the cache doing its job. The linked-list anti-pattern is when you're constantly paging in big speculative chunks of memory only to access a single pointer and then move on.

          • estebank 1496 days ago
            That's no longer a linked list, it's a vector.
            • mikepurvis 1496 days ago
              I think the argument is that if the nodes are fixed size then you can preallocate a big block of them slab-style, and for the pointers you just use indexes into that array rather than "real" pointers.

              Potentially there's a small memory savings as well if you can use 2 or 4 bytes for the index instead of a full 8 byte pointer. But you're taking on a lot of complexity and tuning by going this route so you'd really have to test and make sure the gains were worth it.
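
              Roughly (a made-up minimal version of the slab-plus-indices idea):

                // Nodes live in one contiguous slab; "pointers" are u32 indices.
                const NIL: u32 = u32::MAX;

                struct Node<T> { elem: T, next: u32 }

                struct SlabList<T> {
                    slab: Vec<Node<T>>, // one allocation, nodes stay adjacent
                    head: u32,          // NIL when empty
                    free: Vec<u32>,     // recycled slots from removed nodes
                }

                impl<T> SlabList<T> {
                    fn new() -> Self {
                        SlabList { slab: Vec::new(), head: NIL, free: Vec::new() }
                    }
                    fn push_front(&mut self, elem: T) {
                        let node = Node { elem, next: self.head };
                        let idx = match self.free.pop() {
                            Some(i) => { self.slab[i as usize] = node; i }
                            None => { self.slab.push(node); (self.slab.len() - 1) as u32 }
                        };
                        self.head = idx;
                    }
                }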

              • viraptor 1496 days ago
                A bigger saving is in the allocation metadata. Every time you allocate something on the heap, you also need to save information about the size, maybe some flags and padding to align on a page. And if the entries are small, you also get better cache locality.

                Putting things in an array works around all of that (unless you need to handle tombstones)

                • Dylan16807 1496 days ago
                  A reasonable allocator should already be grouping allocations of similar size together. In a whole lot of cases this means your metadata is no more than a single byte per object, and the padding is no more than you'd have inside an array.
                  • viraptor 1496 days ago
                    True, it really depends on the allocator.
      • senderista 1496 days ago
        Yes, along with binary trees, but there are cache-friendly alternatives like unrolled linked lists and B-trees.
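
        The unrolled variant is basically this shape (a rough sketch, my own names): each pointer chase covers a whole chunk of contiguous elements instead of one.

          // Each node stores a small contiguous batch of elements, so the
          // cost of following `next` is amortised over up to CHUNK elements.
          const CHUNK: usize = 32;

          struct UnrolledNode<T> {
              elems: Vec<T>, // kept at most CHUNK long, contiguous in memory
              next: Option<Box<UnrolledNode<T>>>,
          }

          impl<T> UnrolledNode<T> {
              fn push(&mut self, v: T) {
                  if self.elems.len() < CHUNK {
                      self.elems.push(v);
                  } else {
                      // This chunk is full: delegate to (or create) the next one.
                      self.next
                          .get_or_insert_with(|| Box::new(UnrolledNode { elems: Vec::new(), next: None }))
                          .push(v);
                  }
              }
          }
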
      • varjag 1496 days ago
        Seriously? Are trees also niche?
        • Tuna-Fish 1496 days ago
          Yes. For the vast majority of uses, you should probably use a hash table instead.

          There are still niche uses for both linked lists and trees, but the sad fact is that the relative time it takes to chase a random pointer compared to doing literally anything else keeps getting worse, and is already at the point where frequently linearly copying kilobytes of arrays is almost always faster than using a linked list.

          • btilly 1496 days ago
            ...time it takes to chase a random pointer compared to doing literally anything else...

            When I have had to do serious stuff with linked lists, trees, and so on, I found that it was much, much faster to allocate an array of nodes and carve the linked list out of that, keeping the nodes close together. There was a lot of complexity in doing so, but colocating data to match my access pattern was a big win.

          • dntbnmpls 1496 days ago
            I wouldn't say niche. B-trees and B+ trees with doubly linked lists connecting the leaf nodes are used in every major database server for indexes. Hash tables are generally used by optimizers to handle unindexed data in queries. Doesn't make hash tables niche.

            > and is already at the point where frequently linearly copying kilobytes of arrays is almost always faster than using a linked list.

            That's only half the story. Sure reading arrays in sequential order is fast. What about inserting or deleting items within an array? What are the costs to having to constantly resize arrays? It is very expensive.

            At the end of the day, it's about finding the right data structure for the data and your needs.

          • varjag 1496 days ago
            If your data/algorithm calls for a certain dynamic structure, let's say a red-black tree, replacing it with an array is pointless.

            Last time I looked, kobjects in Linux kernel were still dynamically linked in a bunch of ways, and tree-like data structures are widely used.

            • staticassertion 1496 days ago
              I think it's probably fair to call kernel development niche? And the fact that Linux uses linked lists a lot is not necessarily a strong case for linked lists, but just a matter of very specific constraints - like perhaps not wanting to optimize too much for underlying hardware?
              • varjag 1496 days ago
                I used Linux kernel as an example of easily accessible, extremely well known, performance tuned open source project. Of course dynamic data structures are used in countless other applications, like DBMS, web servers and web browsers. Really in nearly any non-trivial codebase.
            • adrianN 1496 days ago
              Sure, but algorithms that explicitly need a linked structure for performance are very few compared to algorithms that have the same or better asymptotics with a hash table and a vector, or for which the input sizes are such that the better asymptotics don't make up for the large constants that pointer chasing causes.
              • varjag 1496 days ago
                If the algorithm calls for a balancing tree, you'd end up re-implementing it across a bunch of (linked) hash tables or over an array treated like a heap. Explicitly, or if one does not know what they are doing, implicitly. In either case you would have no performance advantage.
                • btilly 1496 days ago
                  If you know what you are doing, you may well realize a performance advantage.

                  For example LevelDB (which is conceptually based on Google BigTable's design) looks a lot like a balancing tree in its access patterns, but is a lot faster than a variety of alternatives. And it is faster in part because sequential data is stored sequentially in order instead of using a data structure that results in random access patterns.

                  And no. It doesn't use linked lists.

          • int_19h 1496 days ago
            A hash table is vulnerable to collision attacks, if keys are derived from any kind of untrusted external input. Trees might be slow, but they're consistently slow.
            • deckard1 1496 days ago
              non sequitur. No one mentioned security, nor do any elementary data structures concern themselves with such higher level things. If you're allowing unlimited untrusted data to control internal data structures, you're going to have a bad time regardless of which data structure you're using.
              • eximius 1496 days ago
                And Rust's default hashing implementation is hardened against these attacks.
              • dlubarov 1496 days ago
                > nor do any elementary data structures concern themselves with such higher level things

                Not explicitly, but they concern themselves with worst-case performance, and tolerable worst-case performance implies security against a certain class of attacks (like HashDoS).

              • int_19h 1496 days ago
                Security is a thing that is implicitly pervasive. If it doesn't apply, that should be explicit.

                And yes, elementary data structures concern themselves with such things all the time - because, regardless of how they "must" be used, in practice they do get used on untrusted inputs all the time, simply because it's the easiest thing. Have you noticed how many languages and standard libraries have switched to random seeds for their hash tables lately?

                And no, it's not true that you're going to "have a bad time regardless of the data structure". You're not going to have a bad time with an RB tree, regardless of where your inputs for keys come from - because it is a data structure that doesn't have a quirk of extremely bad perf on pathological inputs.

            • viraptor 1496 days ago
              This has been mitigated in almost every language already by starting the hash calculation with a random per-process seed. It's a theoretical issue of hash tables, but we've got known practical solutions.
        • Dylan16807 1496 days ago
          What kind of tree? A binary tree isn't great for many uses, and doesn't excel at much. But a nice fat B+ tree removes the vast majority of pointer-chasing latency.
          • kybernetikos 1496 days ago
            I've got quite keen on B+ trees. It seems like the world where we had fast main memory and slow drive storage is not very different to the world where we have fast cache memory and slow main memory.
        • yazaddaruvala 1496 days ago
          Trees and Lists are types of data structures. Both are very useful in all contexts.

          A linked list is a special kind of list. It typically implies a non-sequential data layout.

          With all of the modern layers of abstraction, non-sequential memory access typically means poor cache friendliness, i.e. poor performance.

          Hopefully that helps explain why linked lists are considered niche, i.e. specific to embedded programming or to the rare cases where benchmarks provide hard data in favor of a linked list.

          • varjag 1496 days ago
            A tree is a special case of a linked list.

            A linked list is definitely sequential (it's a list!), and it can even be sequentially allocated in memory, depending on the allocator implementation.

            • dpbriggs 1496 days ago
              That's not necessarily true. You can implement binary trees as an array, so are trees a special case of arrays?

              There can be sequential and non-sequential implementations of ADTs.

        • saagarjha 1496 days ago
          B-trees aren’t too bad.
    • monadic2 1496 days ago
      > The most important thing to know about these is that you would almost certainly never do these in a real project: lists are in `std` but even then the vast majority of cases should use `Vec`.

      Not relevant to rust, no?

      • steveklabnik 1496 days ago
        That is true of Rust. The linked list is bad though, even irrespective of this argument about the utility of lists.
  • SAI_Peregrinus 1496 days ago
    Lots of these comments seem confused about the purpose of this. It's not about using linked lists (that's easy, they're in std::collections), it's about implementing them. So the common non-kernel/embedded use cases are well supported, just use std::collections::LinkedList! But if you're in a no-std context then you might need this.
    • Igelau 1496 days ago
      A lot of them seem to have not clicked the link and just took the opportunity to demonstrate how much Rust jargon they know.
  • _bxg1 1496 days ago
    When I was first getting started with Rust this tutorial was eye-opening. It gave me a clear view into the somewhat impenetrable world of working with Boxes (pointers to the heap) directly, in the context of the borrow-checker.

    It also convinced me that you usually just want to use the standard library data structures if you can :P

  • zackify 1496 days ago
    Can’t wait to follow this. I’ve been going through the official book these past couple weeks. As someone who’s never used systems languages (mainly a js / node person) I was able to write a redis-like database quickly on top of TCP. I’ve also picked up async / await and how it works in rust.

    There’s so much to learn but the compiler is surprisingly helpful. I loved Typescript and it is like TS on steroids. When I get stuff compiling I have so much confidence.

  • matthewaveryusa 1496 days ago
    >You're doing some awesome lock-free concurrent thing.

    I can confirm lists are pretty rarely used. The four times I've used a linked list in 10 years of professional programming were:

    -quick&dirty hashmap in C

    -threadsafe queue

    -keeping track of a set of objects that can't be copied ( threads)

    -LRU cache

    In C++ at least, the nice thing about a list is you can push/pop at the front/back without a reallocation or invalidating iterators.

    • big_chungus 1496 days ago
      I've used linked lists a little more because I mostly do C rather than C++. Sometimes I end up using it as a quick-and-dirty vector when I don't want to bother writing something or using a library, though I typically use some kind of block/arena allocator vs. just malloc.
    • chongli 1496 days ago
      Linked lists are used a lot in game programming where the worst-case behaviour of std::vector and similar structures is undesirable. Linked lists may be slow for a variety of reasons but they're simple and predictable which is very nice when you're trying not to miss the 16.67ms per-frame window.
      • estebank 1496 days ago
        I thought it was the other way around: games using arrays and non-arbitrarily-growable vectors, precisely to be able to iterate over everything quickly in a deterministic fashion. Especially because a lot of what games have to deal with is "visit every node".
        • Jasper_ 1496 days ago
          An array of pointers you call a virtual method on is indistinguishable in "pointer-chase" time (with current caches, at least), to an embedded linked list. Embedded linked lists have better insertion/deletion costs, and are easier to manage, so they're used quite frequently in games.
          • jfkebwjsbx 1496 days ago
            That model (bag of polymorphic objects) is avoided nowadays in everything from games to simulation software.
        • chongli 1496 days ago
          They use arrays for static vertex, texture, etc. data to send to the GPU. They don't use arrays for things that are being created and destroyed all the time, such as units in a strategy game, they use lists for those things.
          • estebank 1496 days ago
            But do they use linked lists for them? I would have assumed you would use an arena for something like that, precisely because you already need to iterate over all of them and can represent any graph as keys into a generational index, so that reading stale keys is trivially handled.

            Again, I'm not a game dev and this is just my understanding after reading up on these topics after watching https://youtu.be/aKLntZcp27M.

            • chongli 1496 days ago
              They certainly used linked lists all over the place in StarCraft, WarCraft, and Diablo [1]. I don't know that many game developers would use complicated data structures like the one you've described here. Linked lists are fantastic for insertion/removal in arbitrary places.

              Game development is not really about showing off with cutting-edge CS research, it's about getting things done. Maybe your arenas with generational indices would be better, but they could take a long time to figure out and lead to a big mess that doesn't go anywhere.

              [1] https://www.codeofhonor.com/blog/tough-times-on-the-road-to-...

              • ssalazar 1496 days ago
                Arenas aren't "cutting edge CS research," they're used widely in latency-sensitive applications like gaming. This article [1] describes how to implement one for a game engine. Basic versions are easy to understand and provide good results.

                Starcraft/etc. are 20+ year old games and software development patterns as well as gaming hardware have changed in fundamental ways since then. Conventional wisdom nowadays is that cache misses on every list iteration are a lot worse than shuffling a few contiguous bytes around or marking things inactive when removing items.

                [1] https://www.gamasutra.com/blogs/MichaelKissner/20151104/2582...

              • jfkebwjsbx 1496 days ago
                You are wrong on several fronts:

                + Linked lists are a thing of the past.

                + Game engines use complex data structures and algorithms. Some are cutting-edge research implemented from papers, especially in graphics.

                + A generational index is not complex.

                Yes, game development (as opposed to game engine development or video/audio rendering) is a mess. That does not mean the actual technical fields involved are simple.

                • renox 1496 days ago
                  While I don't disagree on your first two points, I disagree with your third: there's nothing complex about a generational index.
              • Narishma 1496 days ago
                Those are pretty old games written when caches were small and the CPU-memory speed gap wasn't as large as it is today.
          • jfkebwjsbx 1496 days ago
            No, they don't. You are talking about the state of affairs 20 years ago.
      • SAI_Peregrinus 1496 days ago
        Used, yes, but not implemented. std::collections::LinkedList is a doubly-linked list suitable for games. It's basically just the no-std crowd that might need to implement their own.
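
        Using it really is the easy part, e.g.:

          use std::collections::LinkedList;

          fn main() {
              let mut list: LinkedList<u32> = LinkedList::new();
              list.push_back(1);
              list.push_back(2);
              list.push_front(0);
              // O(1) pushes/pops at both ends, no custom node code, no unsafe.
              assert_eq!(list.pop_front(), Some(0));
              let rest: Vec<u32> = list.iter().copied().collect();
              assert_eq!(rest, vec![1, 2]);
          }
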
  • ubercow13 1496 days ago
    This is great as an intro to Rust; I preferred it as a starter tutorial to the main Rust book or any other tutorial I tried.
    • Waterluvian 1496 days ago
      I'm loving both. The book is comprehensive and teaches you even the most basic concepts, so it's great for a broader skill range.

      But I love this one too because it digs into some CS archaeology and that helps me dig into the theory and history. I doubt I'll ever have to implement a linked list but knowing how they work and are implemented and their advantages and drawbacks is great.

      Tangentially, there's a wonderful game called Human Resource Machine which teaches linked lists and other assembly-like programming without you even realising it.

      • ubercow13 1496 days ago
        The book is very comprehensive but it didn't click for me as a beginner, I found much of the exposition left me with unanswered questions while it went on to cover more ground. I'm not sure if those questions were answered later but I got lost very quickly. Maybe it's aimed at people with more C++ background?

        This linked list tutorial answered every question I thought of almost exactly as I thought of them, which made it a joy to read. I also liked Rust By Example [1] over the book. Based on my experience I'd recommend this tutorial and then implementing something using Rust By Example as reference. But everyone learns differently!

        [1] https://doc.rust-lang.org/rust-by-example/

  • wcrichton 1496 days ago
    Another great space of exercises in this vein is in-place operations on binary trees.

    Specifically, rebalancing [1] proved particularly tricky for my students. I think it's a good litmus test for whether you understand ownership, borrowing, and algebraic data types.

    [1] http://cs242.stanford.edu/f19/assignments/assign6/#22-bst-in...
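
    (For flavour, a minimal owned-BST sketch in that spirit; names and code are my own illustration, not the assignment's:)

      // A binary search tree as an owned algebraic data type.
      enum Tree {
          Leaf,
          Node(Box<Tree>, i32, Box<Tree>),
      }

      impl Tree {
          // In-place insertion through mutable borrows of the child slots;
          // rebalancing is the part that forces you to move subtrees around.
          fn insert(&mut self, v: i32) {
              match self {
                  Tree::Leaf => {
                      *self = Tree::Node(Box::new(Tree::Leaf), v, Box::new(Tree::Leaf));
                  }
                  Tree::Node(left, k, right) => {
                      if v < *k {
                          left.insert(v);
                      } else if v > *k {
                          right.insert(v);
                      }
                  }
              }
          }
      }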

  • geraldbauer 1496 days ago
    A little more light-hearted, but in the same style: learn Ruby with Entirely Too Many Fizz Buzzes. See https://yukimotopress.github.io/fizzbuzz
  • naasking 1496 days ago
    I think the doubly linked list is to Rust's ownership semantics as the double slit experiment is to Quantum Mechanics: each exposes a seemingly counterintuitive behaviour, and truly understanding why requires a depth of understanding indicative of true mastery.
  • caconym_ 1496 days ago
    A while ago now, this was the tutorial that got me to the point where I could implement linked structures in Rust. Highly recommended.
  • winrid 1496 days ago
    I'll try this at some point. My first couple forays into Rust made me miss Java and C.
    • rascul 1496 days ago
      If you're anything like me, rust was a bit difficult at first due to ownership, lifetimes, traits, and the borrow checker. At some point though, things finally just clicked in my head and now I have the understanding I needed and my difficulties went away. After that point, I began writing safer code in every language because I was using the same ideas that rustc brute forced into my brain. It took probably a few weeks or so before I got it. Now I rarely have (those) issues, and when I do it's probably because I'm trying to do something weird or I was drinking and missed a place where I obviously should have used .clone() or maybe a reference. I'm just a hobby coder though, perhaps at a beginner-intermediate level with a background of writing unsafe python and c. Your mileage may vary.
      • winrid 1496 days ago
        I guess it just doesn't solve problems for me yet. The problems Rust solves I don't run into. Even with big Java apps with lots of concurrency and crap I hardly shoot myself in the foot. Oh well.
        • rascul 1496 days ago
          Seems to me that if rust doesn't solve problems you have any better than the other languages at your disposal, it might not be the best choice to solve those problems for you. That is very reasonable to me.
          • winrid 1496 days ago
            Right! Also I suppose I'm feeling very disagreeable today. :)
    • winrid 1496 days ago
      Downvotes by the Rust fanatics again! :)

      Edit - it's just funny because it always happens. Say anything against Rust - get downvoted. It kind of reminds me of the really hardcore Linux community.

  • 0xdead 1495 days ago
    I stopped reading when the author started trashing linked lists. If you don’t like linked lists, just say so instead of stating a lie as a fact.
  • ajross 1496 days ago
    > Mumble mumble kernel embedded something something intrusive.

    And this is why kernel/embedded/something/something developers don't take Rust as seriously as you want them to.

    You can't simultaneously declare your language the best choice for system software development and treat the long-evolved patterns of those paradigms as a joke.

    There are very good reasons for intrusive data structures, not least of which being their ability to operate in contexts where no heap is available. If you don't understand them or don't want to talk about them or want to limit your discussion to situations with different requirements, then say so.

    • roblabla 1496 days ago
      The actual context of your quote:

      > Just so we're totally 100% clear: I hate linked lists. With a passion. Linked lists are terrible data structures. Now of course there's several great use cases for a linked list:
      >
      > - You're writing a kernel/embedded thing and want to use an intrusive list.

      So I’ve got no clue what you’re railing about. The project specifically acknowledges that there is a need in kerneldev for those data structures.

      I’m a kernel dev using Rust for my kernel. I use both intrusive linked lists and growable vectors in it. The thing is, the sentiment expressed in the article really resonates with me: in most cases, growable vectors are a better choice, performance-wise.

      • throwaway17_17 1496 days ago
        The quote that the GP is talking about is included below, copy/pasted from the project page, and the Mumble mumble line is the heading for the paragraph:

        ‘’’Mumble mumble kernel embedded something something intrusive.

        It's niche. You're talking about a situation where you're not even using your language's runtime. Is that not a red flag that you're doing something strange?

        It's also wildly unsafe.’’’

        Also, prior to this, the author claims the following, where the first line is also a section heading:

        ‘’’ I can't afford amortization

        You've already entered a pretty niche space’’’

        These fiat rulings based on one author’s generalization of what is ‘niche’ are what I assume GP was commenting on. These are the kinds of dismissals that some developers take issue with, as GP states.

        • saagarjha 1496 days ago
          Kernel development is niche. Most of the time you don’t need linked lists.
          • varjag 1496 days ago
            When all you have is Rust everything starts to look like adjustable array.
            • fluffy87 1496 days ago
              Rust has singly linked, doubly linked (in its minimal std library) and intrusive linked lists - all with APIs that guarantee a lack of memory errors and data races at compile time. I don’t know any good kernel developer who wouldn’t like those guarantees, and I work in the Linux kernel professionally full time.
            • saagarjha 1496 days ago
              What’s an adjustable array? A vector?
              • varjag 1496 days ago
                A vector is a one-dimensional array. Adjustable arrays can have an arbitrary number of dimensions, although I am not sure if that's a thing in Rust.
                • dpbriggs 1496 days ago
                  I've spent years programming rust and I'm not sure what you mean. Do you mean arrays of tuples, or nested arrays?
                  • varjag 1496 days ago
                    I mean a multi-dimensional array.
                    • dpbriggs 1496 days ago
                      Thanks! Why does everything look like a multi-dimensional array for you in rust?
                      • varjag 1496 days ago
                        An array (as an ADT) can be arbitrary-dimensional, with the one-dimensional case often called a vector (and the two-dimensional case called a matrix). An array can also be adjustable, in both the one- and multi-dimensional variants.

                        So a vector can be either adjustable or non-adjustable, and some languages do have both versions. Some languages have the adjustable one-dimensional array/vector as the only dynamic aggregate/ordered datatype.

                        And to answer the post I replied to originally, a vector in Rust is a one-dimensional adjustable array. To answer your question above, no, that's not what I meant; sorry for not being clear enough from the beginning.

                        • dpbriggs 1496 days ago
                          Thanks for clarifying.

                          I think we're just using different definitions. The only real definition of an array I've encountered in work and school is that it's a one-dimensional collection of elements, usually of static size. A vector, depending on context, is usually the same thing as an array except that you can conveniently change its size dynamically. A list can be whatever you need it to be given the context; it just needs to be sequential.

                          Of course you can represent higher-dimensional structures by linearizing indices (x + row_size * y, etc.).
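
                          As a rough sketch (the Grid type here is made up purely for illustration), that's just a flat Vec addressed with the linearized index:

                              // A 2-D "array" stored in a flat, growable Vec, addressed by
                              // linearizing the index as x + row_size * y.
                              struct Grid {
                                  row_size: usize,
                                  data: Vec<f64>,
                              }

                              impl Grid {
                                  fn new(rows: usize, cols: usize) -> Self {
                                      Grid { row_size: cols, data: vec![0.0; rows * cols] }
                                  }

                                  fn get(&self, x: usize, y: usize) -> f64 {
                                      self.data[x + self.row_size * y]
                                  }

                                  fn set(&mut self, x: usize, y: usize, v: f64) {
                                      self.data[x + self.row_size * y] = v;
                                  }
                              }

                              fn main() {
                                  let mut g = Grid::new(2, 3);
                                  g.set(1, 1, 4.2);
                                  println!("{}", g.get(1, 1)); // 4.2
                              }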

                          I think people are getting confused as most don't consider arrays to be arbitrarily dimensional without some scheme.

                          Completely off topic but you've reminded me of this great article: https://hypirion.com/musings/understanding-persistent-vector...

                • staticassertion 1496 days ago
                  That just sounds like a graph.
                  • varjag 1496 days ago
                    How in the heavens can an array sound like a graph?
        • imtringued 1496 days ago
          If you can't tell when to break the rules and when not to, then maybe you are not a kernel developer? It's almost like a king only following his own laws instead of being the one making them. Why are you a king again?
      • ajross 1496 days ago
        (Had to edit in a quote of what I was replying to because you fixed the original):

        > The actual context of your fake quote:

        Good grief, that was a verbatim quote! Here's a link to the exact text I quoted:

        https://rust-unofficial.github.io/too-many-lists/#mumble-mum...

    • jszymborski 1496 days ago
      I think it must be stressed that this is an unofficial guide and not "the voice of the project", etc...
    • steveklabnik 1496 days ago
      It explicitly says that there are good use cases, just that they’re very rare.
      • ajross 1496 days ago
        It does not. Maybe somewhere else it does. That section, verbatim, says:

        It's niche. You're talking about a situation where you're not even using your language's runtime. Is that not a red flag that you're doing something strange?

        It's also wildly unsafe.

        But sure. Build your awesome zero-allocation lists on the stack.

        And this is (1) wildly misstating the requirements and (2) deeply offensive to those of us who work in those regimes.

        Obviously, yes, there's room for a reasoned argument about this stuff and whether alternative paradigms can be deployed in heapless contexts. This isn't it.

        • catatattat 1496 days ago
          > (2) deeply offensive to those of us who work in those regimes

          You know how sometimes someone who is A Little Too Online gets offended because they think you said a Bad Thing, but it was actually just a typo or a straight misreading, and you try to explain that they’re reacting to something you didn’t even say let alone believe, but because they have already decided you are the Bad Person they interpret that as you “doubling down” on the bad opinion that they attributed to you, so they get even angrier and even more convinced that you sincerely believe the Bad Thing?

          I mean no judgment. We all have such sensitivities. But maybe now you see how easily you can wind up on the wrong side of a public debate by searching for offense where there is in fact none.

        • MaulingMonkey 1496 days ago
          > And this is (1) wildly misstating the requirements and (2) deeply offensive to those of us who work in those regimes.

          Care to expand how? While not a kernel dev, I've had my share of use cases where I've done exactly this kind of thing in gamedev, and I simply cannot bring myself to disagree with what's been said, or see what's offensive.

          It's strange; debugging when the pointers get corrupted by other code exhibiting UB is painful; it's a potential multithreading hazard; and flat contiguous arrays are frequently more appropriate - but it's sometimes useful. It's not arguing that an alternative paradigm can - or even should - be deployed in a heapless context. It's explicitly admitting that intrusive linked lists are an appropriate paradigm.

        • steveklabnik 1496 days ago
          > It's a fine data structure with several great use cases, but those use cases are exceptional, not common.
    • jasondclinton 1496 days ago
      https://github.com/Amanieu/intrusive-rs implements what you're looking for with no heap. There's no need to flame or misrepresent what the book says (no one said embedded was a joke). The Rust embedded community is strong.
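
      Roughly, going by that crate's documented adapter macro (the Task type and field names here are made up for illustration), the link lives inside each element and the list can hold plain references, so nothing needs a heap:

          use intrusive_collections::{intrusive_adapter, LinkedList, LinkedListLink};

          // The link is embedded in the element itself (that's what makes it
          // "intrusive"), and the adapter stores plain references, so the
          // list never allocates.
          struct Task {
              link: LinkedListLink,
              id: u32,
          }

          intrusive_adapter!(TaskAdapter<'a> = &'a Task: Task { link: LinkedListLink });

          fn main() {
              let a = Task { link: LinkedListLink::new(), id: 1 };
              let b = Task { link: LinkedListLink::new(), id: 2 };

              let mut list = LinkedList::new(TaskAdapter::new());
              list.push_back(&a);
              list.push_back(&b);

              for task in list.iter() {
                  println!("task {}", task.id);
              }
          }
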
      • ajross 1496 days ago
        You're like the sixth person to argue I'm somehow "misrepresenting" what is being said in this book? I'm quoting it directly. The "flame" is in the original, and I'm responding to it.

        Take it out. It's bad.

        • jasondclinton 1496 days ago
          Assuming that you're not trolling, I'll explain how you misinterpreted what you read: the section that you quoted is a sub-heading under:

          > An Obligatory Public Service Announcement

          which is an argument that:

          > Linked lists are as niche and vague of a data structure as a trie.

          The author then goes on to state that many people have contacted him to argue that linked lists are not niche and he puts each of those arguments under a heading:

          > Mumble mumble kernel embedded something something intrusive.

          is one such heading. Inside of it, he continues the argument that linked lists for embedded no-heap scenarios are niche and—to continue his argument—we therefore shouldn't be teaching undergrads linked lists just like we don't teach them tries.

          • fhars 1496 days ago
            I think what ajross is arguing is that it is the author of the tutorial who is trolling here. Everything you and everyone else say against ajross’s argument is true on the purely factual level; it is just that the tone of the PSA is needlessly aggressive and condescending.
    • robbyt 1496 days ago
      To explain what I think this comment means: when working on embedded systems, you can interact with hardware devices by writing directly to specially mapped memory areas.

      E.g., if you want to write text to a small screen, the kernel driver gives you a memory region that you write bytes to, and they're shown on the screen immediately, without any further work from the CPU.

      • redrobein 1496 days ago
        > To explain what I think this comment means: when working on embedded systems, you can interact with hardware devices by writing directly to specially mapped memory areas.

        And why can't you do this in Rust?

        • jasondclinton 1496 days ago
          You can. Rust allows full memory addressing.
        • lawn 1496 days ago
          There's nothing to prevent you from doing this, although you may have to wrap it in an unsafe block.
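
          Roughly like this (a hypothetical sketch: the address and names are made up, and it only makes sense on a bare-metal target where that region is actually mapped):

              // Write a byte into a memory-mapped framebuffer. Volatile so the
              // compiler doesn't elide or reorder the store; `unsafe` because
              // Rust can't verify the address is valid to write to.
              fn put_byte(offset: usize, value: u8) {
                  let framebuffer = 0x4000_0000usize as *mut u8; // made-up address
                  unsafe {
                      core::ptr::write_volatile(framebuffer.add(offset), value);
                  }
              }
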
      • monocasa 1496 days ago
        They're saying more that intrusive data structures are really nice in situations that restrict heap allocation. The memory overhead outside of the structures linked together is O(1), because all of the per-object metadata is stored in the objects themselves. That means consumers generally don't have to be hardcoded for a certain number of objects the way you would for an array/vector that doesn't have access to a heap.
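
        In code, the idea is roughly this (a bare-bones sketch with raw pointers, not the book's code or any particular crate): the only per-object cost is the embedded next field, and nothing gets allocated.

            // Each object carries its own link, so a queue of N objects needs
            // no extra allocation and no fixed-capacity backing array.
            struct Request {
                id: u32,
                next: *mut Request, // the intrusive link, stored in the object
            }

            struct Queue {
                head: *mut Request,
            }

            impl Queue {
                const fn new() -> Self {
                    Queue { head: core::ptr::null_mut() }
                }

                // Caller must ensure `req` stays alive while it is queued.
                unsafe fn push(&mut self, req: *mut Request) {
                    (*req).next = self.head;
                    self.head = req;
                }
            }

            fn main() {
                // Stack-allocated objects; the queue threads through them in place.
                let mut a = Request { id: 1, next: core::ptr::null_mut() };
                let mut b = Request { id: 2, next: core::ptr::null_mut() };

                let mut q = Queue::new();
                unsafe {
                    q.push(&mut a);
                    q.push(&mut b);

                    let mut cur = q.head;
                    while !cur.is_null() {
                        println!("request {}", (*cur).id);
                        cur = (*cur).next;
                    }
                }
            }
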
    • fortran77 1496 days ago
      My feelings, too. I do both embedded programming (in C/C++ and CUDA) and Erlang programming for a living. Erlang, too, doesn't let you (easily) write your own linked list structures, etc.

      But for embedded programming with tight memory or performance constraints, these data structures are essential, so we use C++ or even C. They're well understood and the implementations have simple, elegant solutions.

      For "safety" when we don't need absolute control, we'll choose a GC language like C# or F#. No need for the complication of Rust.

      • saagarjha 1496 days ago
        Rust gives you just as much control as C when you need it.
        • throwaway17_17 1496 days ago
          I may be mistaken, but if using unsafe does not allow for the ‘borrow-checker’ to be turned off and allow for code which doesn’t abide by the checker’s requirements, then it clearly does not give “as much control as C”. Again, I might have missed some of the subtleties of circumventing Rust’s static checking, but I don’t think I can purposely create some non-deterministic racy-appearing abstractions, which would be trivial to do using global variables in C.
          • nicoburns 1496 days ago
            You can't turn off the borrow checker for references, but Rust also provides raw pointers, which are not subject to borrow checking. These are exactly like pointers in C and can be cast to and from references; the cast is a no-op at runtime since they share the same memory representation.

            https://doc.rust-lang.org/1.30.0/book/first-edition/raw-poin...

            You can create and manipulate raw pointers in safe code, but dereferencing them requires an `unsafe` block.
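
            A minimal sketch of that split:

                fn main() {
                    let mut num = 5;

                    // Creating and passing around a raw pointer needs no `unsafe`.
                    let p = &mut num as *mut i32;

                    // Dereferencing it does: the compiler no longer checks
                    // aliasing or lifetimes through `p`.
                    unsafe {
                        *p += 1;
                    }
                    println!("num is now {}", num);
                }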

            • zozbot234 1496 days ago
              AIUI, there are some hardware architectures where even creating a wild pointer might be undefined behavior, regardless of whether that pointer is subsequently dereferenced, and C inherits these requirements. This means that it might be desirable to restrict creation and manipulation of raw pointers to unsafe code in Rust as well, if this can be done without introducing undue incompatibilities.

              (Rust editions would naturally allow for this: Rust 2021 would warn on creating/manipulating raw pointers in Safe Rust, and stop warning for "unnecessary" use of unsafe deriving from these operations; Rust 2024 would make these a hard error outside `unsafe`.)

              • aw1621107 1496 days ago
                > AIUI, there are some hardware architectures where even creating a wild pointer might be undefined behavior, regardless of whether that pointer is subsequently dereferenced

                Would you be able to point me to some references for such hardware? I’m not sure how that would work (at least based on my admittedly limited experience). Wouldn’t a pointer look like any other integer right up until it’s used as a memory operand? Or would said architecture have some way to distinguish pointers from “regular” integers in registers?

                • Hello71 1496 days ago
                  Allegedly, some platforms have pointer trap representations, where a certain pointer can be created, but may not be used in any operations of that type. No modern systems have such trap representations for pointer types, but the C standard inherits their legacy, and, more importantly, C compilers use it as justification for certain types of optimizations. Since it's not a hardware limitation, Rust can perfectly well take the opposite path and say that the compiler may not use it for those optimizations.

                  https://stackoverflow.com/questions/6725809/trap-representat...

                  • zozbot234 1496 days ago
                    > No modern systems have such trap representations for pointer types

                    This may be incidentally true, but "address sanitizer"-like features are becoming more common on modern hardware, and while these do not currently trap on creation/manipulation of a 'wild' pointer (since, strictly speaking, a trap only happens on dereferencing), there's no solid reason to expect this to remain the case in the future.

                    • int_19h 1496 days ago
                      I don't see how you could trap creation or manipulation, since those pointers are stored in registers and/or memory, and both are fundamentally untyped. How would the hardware even know that something is a pointer, on any architecture that is popular today?
                      • saagarjha 1496 days ago
                        Because you use typed instructions to access them. For example, on ARM with pointer authentication you’ll sign pointers and unsign them right before using them. If you forge a pointer it’ll cause a crash when it’s used because its signature will be incorrect.
                        • int_19h 1496 days ago
                          But that would still happen at the point of dereference, no? Or does it allow to tag even operations like moves and adds?
                  • nicoburns 1496 days ago
                    > C compilers use it as justification for certain types of optimizations

                    I believe that the Rust compiler is free to make its own choices about what is considered valid, and which optimisations it wants to enable. It doesn't need to follow C's lead here.

                    • Hello71 1493 days ago
                      isn't that what I said
              • Dylan16807 1496 days ago
                Maybe. Or maybe you'd just change it so 'wild pointers' created in safe code are stored as ints on that architecture.
          • zozbot234 1496 days ago
            > if using unsafe does not allow for the ‘borrow-checker’ to be turned off and allow for code which doesn’t abide by the checker’s requirements

            It allows the latter. 'Code that doesn't abide by the checker's requirements' uses separate facilities that are only allowed in unsafe code. This means that `unsafe` doesn't have to turn off anything, and further pinpoints the parts of the code where caution is needed in order to maintain the invariants that Safe Rust is based on.

          • steveklabnik 1496 days ago
            Unsafe doesn’t turn off the borrow checker, but it does give you access to pointers that aren’t checked.
          • fortran77 1496 days ago
            It's sad that we have to create throwaway accounts if we want to be able to simply say we prefer or use other languages over Rust without getting downvoted so much that we'll end up shadowbanned.
          • saagarjha 1496 days ago
            > if using unsafe does not allow for the ‘borrow-checker’ to be turned off and allow for code which doesn’t abide by the checker’s requirements

            It does, that’s why it’s there.

      • terminaljunkid 1496 days ago
        In before someone from the Rust evangelism strike force chimes in saying Rust is actually very simple and the borrow checker takes 2 days to get familiar with.
  • xlap 1496 days ago
    Basically anything published about Rust is turning me away from the language.

    It could be unfair though: Technical writing (and that includes humorous opinionated pieces) has declined dramatically in the last 10 years.

    Or perhaps writing as a whole has declined.

    • Dylan16807 1496 days ago
      It wouldn't hurt to be specific.