Making SoA Tollerable

(hacksoflife.blogspot.com)

35 points | by ingve 1147 days ago

8 comments

  • amluto 1146 days ago
    > So far all we've done is replaced every STL container with vector. This is something that's easy to do for new code, so I would say it should be a style decision - default to vector and don't pick up sets/maps/lists/whatever unless you have a really, really, really good reason.

    There is frequently a really really good reason: when I write set<Foo>, I want a set, not a vector. This frequently has little to do with the underlying data structure or asymptotic complexity or performance at all — it’s a semantic issue. I want a container of distinct elements in which order doesn’t matter. I want to be able to ask whether an element is in the set, I want to be able to insert an element if it doesn’t exist, and I want a guarantee that the same element isn’t there more than once. Similarly, if I type map, I want a map, thank you very much.

    This can be reconciled with performance: don’t use red/black trees. An actual high quality set or map will likely be backed by arrays or things much like arrays. B-trees can have very large nodes, aka arrays. Judy arrays are a whole pile of different underlying data structures under the hood. Some hash tables are basically arrays with 70% or so occupancy.

    It’s also worth noting that using asymptotically horrible algorithms that are fast on your test case can be a security problem. And it can also cause hilarious slowdowns like the recently reported GTA Online issue. So be careful throwing extra O(n) factors around.

  • snicker7 1146 days ago
    "I view this as a fundamental design limitation of C++, one that might someday be fixed with generative meta-programming"

    As an example of this could be done in a homoiconic language, refer to Julia's StructArrays.jl [0] package. This package can automagically and transparently convert from AoS to SoA without touching 'business logic'.

    Under the hood, this uses the '@generated' macro -- which is essential in a lot of zero-cost abstraction packages, e.g. Blobs.jl [1].

    [0]: https://juliahub.com/docs/StructArrays/jRMFC/0.5.0/

    [1]: https://scattered-thoughts.net/writing/zero-copy-deserializa...

  • AaronFriel 1146 days ago
    Macros or compile-time rewriting of structures makes this possible in languages that support them, Rust has several which avoid baking in assumptions from the library as part of the standard and the crates are a proving ground of the concept. I doubt any of them will ever become part of std, since structure of arrays is somewhat esoteric, but it's a lot like (read: is) columnar data storage, so I could see some common traits being standardized.

    Soak - https://docs.rs/soak/0.2.0/soak/

    soa-derive - https://docs.rs/soa_derive/0.4.0/soa_derive/

    legion and specs from amethyst, see: https://csherratt.github.io/blog/posts/specs-and-legion/

    and many more! https://arewegameyet.rs/ecosystem/ecs/

  • vvanders 1146 days ago
    Lot of truth in this article.

    We used to use the floating point hack for radix sort[1] in alpha sorting transparent quads on a back to front renderer. Depending on the cache line size we'd either have 3 or 4 pass sort radix with a puny little arm chip.

    Guess what, that thing screamed because it was linear memory reads all the way through. We could throw thousands of quads at the thing and it wouldn't even flinch.

    [1] http://stereopsis.com/radix.html

  • 7800 1146 days ago
    Tolerable(sp.), not tollerable.
  • nenadst 1146 days ago
    am i the only one expecting the article to be about Service oriented Architecture ? :-)
  • tiddles 1146 days ago
    I happened to watch a talk by Chandler yesterday that might be the one the author was referring to?

    https://youtube.com/watch?v=2EWejmkKlxs

  • fxtentacle 1146 days ago
    "I haven't pursued this in our code because of the annoyance of having to write and maintain the offset-fetch macros by hand"

    OFFSETOF in stddef.h solves that for you.