How to Read 100s of Millions of Records per Second from a Single Disk

(clemenswinter.com)

130 points | by cwinter 2071 days ago

4 comments

  • gopalv 2071 days ago
    Hard disks aren't slouches when it comes to sequential throughput, but building around them has serious consequences for future-proofing.

    Building things for spinning disks is a cul-de-sac with no exit onto the faster characteristics of SSDs.

    A significant part of Hadoop is built around the idea that "random = bad, sequential = good" and this is still arguably true when it comes to writes.

    However, building around sequential reads gives you the wrong idea when the hardware changes underneath you.

    I see a tiny sign of that in this post, where it talks about smarter data partitioning.

    For example, when dealing with something like a join, a spilling sequential-write join like the Grace hash join works much better than a memory-mapped hash table on spinning disks; however, on an SSD with better random IOPS, the Grace join gets crushed.
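
    As a rough illustration, here is a toy Python sketch of a Grace hash join (the record format, file names, and partition count are all made up): both inputs are spilled into hash partitions with purely sequential writes, and then each partition pair is small enough to join with an in-memory hash table.

      import pickle

      NUM_PARTITIONS = 16

      def partition(records, key, prefix):
          """Phase 1: spill an input into hash partitions (sequential writes only)."""
          files = [open(f"{prefix}_{i}.bin", "wb") for i in range(NUM_PARTITIONS)]
          for rec in records:
              pickle.dump(rec, files[hash(rec[key]) % NUM_PARTITIONS])
          for f in files:
              f.close()

      def read_partition(prefix, i):
          with open(f"{prefix}_{i}.bin", "rb") as f:
              while True:
                  try:
                      yield pickle.load(f)
                  except EOFError:
                      return

      def grace_join(left, right, key):
          partition(left, key, "left")
          partition(right, key, "right")
          # Phase 2: join matching partitions; only one pair must fit in memory.
          for i in range(NUM_PARTITIONS):
              table = {}
              for rec in read_partition("left", i):
                  table.setdefault(rec[key], []).append(rec)
              for rec in read_partition("right", i):
                  for match in table.get(rec[key], []):
                      yield {**match, **rec}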

    Also, if the author is reading this: zlib isn't a slouch either (just like a spinning disk); the trick is to use the right combination of zlib parameters. Compression will be slower with zlib, but its read throughput can exceed that of a memcpy of the same data, depending on your data (look at inflate_fast and inflate_slow).
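
    To make that concrete, here is a minimal sketch through Python's zlib bindings (the file name and parameter choice are placeholders, not the article's settings):

      import zlib

      data = open("column.bin", "rb").read()  # hypothetical column file

      # A low compression level keeps the write path cheap; decompression is a
      # single sequential pass (inflate_fast in zlib's C code handles the common
      # case) and, per the comment above, can exceed memcpy speed on the right data.
      co = zlib.compressobj(level=1)
      compressed = co.compress(data) + co.flush()

      assert zlib.decompress(compressed) == data
      print(f"{len(data)} bytes -> {len(compressed)} bytes")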

    And if you're getting into the business of improving sequential reads, the next big trick for systems is vertical row partitioning of data (think of TPC-H, except now you store L_COMMENT in a separate file).
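
    A minimal Python sketch of that layout (the column names and file names are invented, with TPC-H's lineitem as the example): each column lives in its own file, so a scan only touches the columns it needs and never reads the bulky comments.

      import csv

      COLUMNS = ["l_orderkey", "l_quantity", "l_comment"]

      def split_columns(csv_path):
          """Rewrite a row-oriented CSV as one file per column."""
          outs = {c: open(f"lineitem.{c}", "w") for c in COLUMNS}
          with open(csv_path, newline="") as f:
              for row in csv.DictReader(f):
                  for c in COLUMNS:
                      outs[c].write(row[c] + "\n")
          for f in outs.values():
              f.close()

      def sum_quantity():
          """A query that never has to read the l_comment file at all."""
          with open("lineitem.l_quantity") as f:
              return sum(float(line) for line in f)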

    • sitkack 2071 days ago
      Streaming is better for every technology; of course this might not always be true ... but RAM isn't really random access. That is an abstraction; it is much faster at streaming. The same goes for SSDs.
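
      A crude way to see this from Python with numpy (the numbers are machine-dependent, and the fancy-indexing gather also materializes a copy, so treat it as illustrative only):

        import time
        import numpy as np

        a = np.arange(10_000_000, dtype=np.int64)   # ~80 MB, well past the caches
        idx = np.random.permutation(len(a))

        t0 = time.perf_counter()
        a.sum()                                     # contiguous, prefetch-friendly
        t1 = time.perf_counter()
        a[idx].sum()                                # same bytes, random order
        t2 = time.perf_counter()

        print(f"sequential: {t1 - t0:.3f}s  random: {t2 - t1:.3f}s")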

      > However, building around sequential reads gives you the wrong idea when the hardware changes underneath you.

      If one is going to build a system tuned to the hardware, they can't ignore the hardware. What abstraction should the author be using?

  • canhascodez 2071 days ago
    This seems like a very clever way to accomplish this feat. I think I'm happy that to date I haven't had the sort of problems that might require reading 100M records per second. However, I'd be all ears if someone wanted to contribute to my education by telling me what sorts of interesting things this might enable.
    • 1996 2071 days ago
      You could do without data stored in SQL: read it on demand, in its native format.

      Many people deal with CSVs produced by external sources (e.g. financial data). Having to parse and load them, and then redo that on every change, is tough for non-technical users and still a waste of time for technical users.
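
      A minimal sketch of that idea in Python (the file name and column names are made up): stream the file and parse only what the query needs, with nothing loaded up front.

        import csv

        def total_notional(path):
            """One sequential pass over the raw CSV, no load step."""
            total = 0.0
            with open(path, newline="") as f:
                for row in csv.DictReader(f):
                    if row["currency"] == "USD":
                        total += float(row["price"]) * float(row["quantity"])
            return total

        print(total_notional("trades.csv"))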

    • ddorian43 2071 days ago
      Think about searching logs. You could keep an inverted index, or just brute-force-search through raw log files.
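
      The brute-force version is only a few lines of Python (the paths and search term are placeholders); it's the kind of approach that very fast sequential scans make viable without maintaining an index.

        import glob

        def grep_logs(pattern, path_glob="logs/*.log"):
            """Scan raw log files sequentially instead of consulting an index."""
            for path in glob.glob(path_glob):
                with open(path, errors="replace") as f:
                    for lineno, line in enumerate(f, 1):
                        if pattern in line:
                            yield path, lineno, line.rstrip()

        for path, lineno, line in grep_logs("ERROR"):
            print(f"{path}:{lineno}: {line}")
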
    • BitcoinCash 2071 days ago
      Loading past unspent outputs for Bitcoin Cash, both to validate new blocks and to answer queries from lightweight mobile wallets.
      • tomrod 2071 days ago
        This would be helpful to increase transaction speed.
  • anon12341234 2071 days ago
    Make each record one byte long.