The Great Migration from MongoDB to PostgreSQL

(infisical.com)

345 points | by thunderbong 30 days ago

39 comments

  • bit_flipper 30 days ago
    I've run Postgres at large scale (dozens of machines) at multiple companies. I've also run MongoDB at large scale at multiple companies. I like both generally. I don't really care about data modelling differences - you can build the same applications with approximately the same schema with both if you know what you're doing.

    I don't understand how folks seemingly ignore Postgres' non-existent out of the box HA and horizontal scaling support. For small scale projects that don't care about these things, fair enough! But in my experience every Postgres installation is a snowflake with cobbled together extensions, other third party software, and home-rolled scripts to make up for this gap. These third party pieces of software are often buggy, half-maintained, and under-documented. This is exacerbated by Postgres' major version file format changes making upgrades extremely painful.

    As far as I can tell, there is no interest in making these features work well in Postgres core because all of the contributors' companies make their money selling solutions for HA/sharding. This is an area where MySQL is so significantly better than Postgres (because so many large Internet companies use MySQL) that it surprises me people aren't more unhappy with the state of things. I don't really want to run another Postgres cluster myself again. For a single node thing where I don't care about HA/scaling I do quite like it, though.

    • bognition 30 days ago
      You'll never see true support for horizontal scalability in Postgres because doing so would require a fundamental shift in what Postgres is and the guarantees it provides. Postgres is available and consistent. It cannot truly be partitionable without impacting availability or consistency.

      When an application grows to such a scale that you need a partitionable datastore it's not something you can just turn on. If you've been expecting consistency and availability, there will be parts of your application that will break when those guarantees are changed.

      When you hit the point that you need horizontally scalable databases you must update the application. This is one of the reasons that NewSQL databases like CockroachDB and Vitess are so popular. They expose themselves as a SQL database but make you deal with the availability/consistency problems on day 1, so as your application scales you don't need to change anything.

      Context: I've built applications and managed databases on tens of thousands of machines for a public SaaS company.

    • BiteCode_dev 30 days ago
      Because vertical scaling can take you so far these days that 99% of companies will never, ever reach the scale where they need more. There's just little incentive.

      Especially since:

      - Servers will keep getting better and cheaper with time.

      - Data is not only in Postgres; you probably have Redis, ClickHouse and others, so the load is spread around. In fact you may have different dedicated Postgres instances, like one for GIS tasks.

      - Those hacky extensions are damn amazing. No product in the world is that versatile.

      - Postgres has much better support from legacy frameworks like django/ror/laravel than nosql alternatives. People shit on ORMs, but they enable a huge, well-integrated plugin ecosystem that makes you super productive, and PG happily and transparently handles all of that.

      - If by some miracle you actually reach the point you need this, you'll have plenty of money to pay for commercial HA/sharding, or migrate. So why think about it now?

      • riku_iki 29 days ago
        > vertical scaling can take you so far these days that 99% of companies will never, ever reach the scale where they need more

        it's less about the scale and more about HA and service interruption: your service will be down if the server dies.

        • hahn-kev 28 days ago
          Never heard of docker/k8s?
          • riku_iki 28 days ago
            I don't think these two words will buy you HA automagically. You will need 3 layers of various open source components on top, and I am not sure whether they will improve or reduce HA in the end.
    • amluto 30 days ago
      > This is an area where MySQL is so significantly better than Postgres (because so many large Internet companies use MySQL) that it surprises me people aren't more unhappy with the state of things.

      I’m not sure precisely what you mean by “HA”, but, in my experience, out-of-the-box support for the most basic replication setup in MySQL is pretty bad. Just to rattle off a few examples:

      Adding a replica involves using mysqldump, which is, to put it charitably, not a very good program. And the tools that consume its output are even worse!

      There is nothing that ships with MySQL that can help verify that a replica is in sync with its primary.

      Want to use GTID (which is the recommended mode and is more or less mandatory for a reasonable HA setup)? Prepare for poor docs. Also prepare for the complete inability of anyone’s managed offering to sync to an existing replica set via mysqldump’s output. RDS will reject the output due to a rather fundamental permission issue, and the recommended (documented!) workaround is simply incorrect. It’s not clear that RDS can do it right. At least Azure sort of documents that one can manually read and modify the mysqldump output and then issue a manual API call (involving the directives that you manually removed from the dump) to set the GTID state.

      Want point-in-time recovery? While the replication protocol supports it, there is no first-party tooling. Even just archiving the replication logs is barely supported. Postgres makes it a bit awkward, but at least the mechanisms are supported out of the box.

      But maybe the new-ish cluster support actually works well once it’s set up, as long as you don’t try to add managed RDS-style nodes?

      • evanelias 30 days ago
        > Adding a replica involves using mysqldump

        That's one path, but it is not the only way, and never has been.

        MySQL 8.0.17 (released nearly 5 years ago!) added support for physical (binary) copy using the CLONE plugin. And MySQL Shell added logical dump/reload capabilities in 8.0.21, nearly 4 years ago.

        Third-party solutions for both physical and logical copy have long been available, e.g. xtrabackup and mydumper, respectively.

        And there was always the "shut down the server and copy the files" offline approach in a pinch.

        • amluto 30 days ago
          CLONE is indeed nifty. But why is it a plugin? And why don’t any of the major hosted services support it? (Or do they? The ones I checked don’t document any support.)

          I wouldn’t call xtrabackup or mydumper an out-of-the-box solution.

          • evanelias 29 days ago
            What's wrong with CLONE being a MySQL plugin? I mean a good chunk of this page is people praising Postgres for its plugins.

            As for support in hosted cloud providers, that's a question for the cloud providers, no one else can answer this. But my best guess would be because they want you to use their in-house data management offerings, snapshot functionality, etc instead of porting MySQL's solution into the security restrictions of their managed environment.

            Yes, xtrabackup and mydumper are third-party tools, as I noted. If you needed something out-of-the-box prior to CLONE, the paid MySQL Enterprise Edition has always included a first-party solution (MySQL Enterprise Backup, often abbreviated as MEB). Meanwhile Community Edition users often gravitated to Percona's xtrabackup instead as a similar FOSS equivalent, despite not being a first-party / out-of-the-box tool.

    • mixmastamyk 29 days ago
      Citus is open source and well financed. This comment may have made sense a few years ago, but no longer.
      • evanelias 29 days ago
        By "well financed" you mean "owned by Microsoft"?

        That situation raises a separate set of concerns, especially in the context of Microsoft's main database cash cow being SQL Server, not Postgres/Citus.

        • sgent 29 days ago
          How is that different than owned by Oracle?
          • evanelias 29 days ago
            Yep, exactly. Apologies, my previous comment was semi-sarcastic but in retrospect that was way too vague :)

            On average, HN leans anti-MySQL, with concerns about Oracle ownership frequently cited in these discussions (mixed in with some historic distrust of MySQL problems that were solved long ago). But I rarely see the same sentiment being expressed about Citus, despite some obvious similarities to their ownership situation.

            Personally I don't necessarily think the ownership is a huge problem/risk in either case, but I can understand why others feel differently.

        • mixmastamyk 29 days ago
          I'm as skeptical of MS as anyone. However it is licensed GNU AGPL, so not particularly worried.
    • switch007 30 days ago
      I guess some people really, really dislike Oracle (understandably).

      And MariaDB is lagging behind, less and less compatible with MySQL etc leading to various projects dropping support for it - notably Azure. I wouldn't pick it for a new project.

    • jmuguy 30 days ago
      This depends on what level you consider HA and horizontal scaling to be required. I could make the same argument, based on my personal experience, that postgis ought to be included out of the box. Of course, I'll assume most people don't need it :)
    • klysm 29 days ago
      I feel like I have read this exact comment before verbatim
  • zulban 30 days ago
    If you like this kind of thing, I also migrated from mongodb to postgresql and wrote about it here: https://blog.stuartspence.ca/2023-05-goodbye-mongo.html

    My post is more technical, with examples and graphs, and less business-y.

    • bberrry 29 days ago
      I like your custom prompt. I'll be trying it out for a while!
  • jpgvm 30 days ago
    I have done this migration twice and rethinkdb to PostgreSQL once. At this point I think document DBs are as good as dead for new projects. They will live for a really long time still but are in contraction and rent seeking mode now. Expect MongoDB licensing and hosting to increase in price and languish in terms of feature development from here on out.
    • Jolter 30 days ago
      Maybe you’re just experiencing the trough of disillusionment, from the Gartner curve? I think it’s fair to call document databases a mature technology now, rather than “dead”.
      • Semionilo 30 days ago
        If your normal DB can handle documents like a doc-focused DB, but better, then it might no longer be worth having document DBs as a category rather than a feature.
    • aembleton 28 days ago
      Why did you move from RethinkDB to Postgres? I have very little experience with Rethink; I only played around with it a few years ago on some small projects, but found it really interesting. I'm just curious what issues you experienced with it.
    • Beefin 30 days ago
      mongodb is a fortune 100 company and blows through expectations every quarter. doc dbs are far from dead.
      • unmole 30 days ago
        > mongodb is a fortune 100 company

        Excuse me? MongoDB Inc. posted a mere $1.68 billion in revenue (with a $177 million loss). Coca-Cola posted $45 billion in revenue and ranks at #100. Forget Fortune 100, MongoDB isn't even in the reckoning for Fortune 500.

      • forgotmyinfo 30 days ago
        Huh, I didn't realize database performance correlates so strongly with stock price. Are there Prometheus metrics for this?
      • jpgvm 30 days ago
        I didn't say it wouldn't make a ton of money; there is a huge number of MongoDB instances out there in the wild in the Fortune 500 due to the MEAN stack days (just like RoR before it). I fully expect the company to keep blowing away expectations by increasing how much revenue they extract from that existing customer base. That said, they -still- aren't profitable. They might be profitable next quarter; in fact I expect them to be with the current environment. But to only reach profitability now, after their glory growth days are behind them, seems... well, pretty garbage. If anything it reminds me of GoPro, except instead of cheap Chinese clones it's PostgreSQL and JSONB that is coming for their lunch.
  • zachmu 30 days ago
    PostgreSQL really is eating the database world.

    Although in this case, the authors originally chose an architecture that was poorly suited to their data model. They had relational data and put it in a non-relational store. This was obviously always going to cause problems.

    • SOLAR_FIELDS 30 days ago
      Over the years I’ve learned two things when it comes to picking a database when starting a new project

      1 - most data is inherently relational

      2 - Postgres is pretty good at basically any problem you throw at it, pretty scalably. It will fall over at bonkers amount of scale in some cases but you probably won’t have bonkers amount of scale

      Thus, when picking a data store for a new project, the default is Postgres. You have to specifically convince me that Postgres is not good enough for this specific use case if you want me on board with anything else as a data store.

      • zachmu 30 days ago
        It's a pretty good default stance, yeah.

        We have been trying to convince people to use our new database [1] for several years and it's an uphill battle, because Postgres really is the best choice for most people. They really have to need our unique feature (version control) to even consider it over Postgres, and I don't blame them.

        [1] https://github.com/dolthub/dolt

    • chrisjc 30 days ago
      In my experience, it's always come down to either using a specific database type for the wrong job, or using a database type the wrong way.

      - Treating an OLAP like an OLTP.

      - Using a document store for relational data because managing relations in SQL is tedious and slows development/progress/features.

      - Using OLTP for analytics.

      - and so on...

      And the usual reaction ensues. Conversations begin about migrating to another product while not really paying attention to interaction patterns and the intent/reason for the various workloads.

      Of course such migrations only lead to either...

      - Place-shifting the issues, or

      - Resolving the issues that lead to the migration while creating new ones that didn't exist before. Workloads that were suited to the existing DB are now mismatched to the new one.

      However, Postgres seems to be a pretty safe bet when such migrations are undertaken due to how versatile it is. Personally, I don't care which database is used (within reason of course). Just use the right database type for the job.

  • namaria 30 days ago
    > As part of that stack, we chose MongoDB + Mongoose ORM because the combination presented least overhead and allowed us to ship quality features quickly. As Sir Tony Hoare states, “premature optimization is the root of all evil,” and there was certainly no need for further optimization at the time.

    I find it interesting that one sentence claims they made an optimal choice for feature delivery speed, and the next one that they rationalized it as "non optimal as prescribed by Hoare". Never mind the fact that three sentences down the original quote Hoare said "It is often a mistake to make a priori judgments about what parts of a program are really critical, since the universal experience of programmers who have been using measurement tools has been that their intuitive guesses fail."

    • dangtony98 30 days ago
      The nature of the use-case evolved and, at one point, the MongoDB + Mongoose ORM combination was in fact optimal, judging from various aspects ranging from tooling familiarity to the primary use-case being the managed cloud service. As the situation evolved, it was no longer optimal and was superseded by various stack-wide adjustments.

      Regarding tooling familiarity: in the earlier stages of a company, I'd argue it's important to pick what you can ship with fastest. You don't want to prematurely, for example, set up a complex microservices architecture and K8s deployment for an idea that is in its infancy and might not even get that much traffic to begin with. Instead, deploying a monolithic app to Heroku might be the optimal choice (YC has a more extreme term for this they call "duct tape engineering"). As the company grows and there is clearer product-market fit, it makes sense to further optimize the stack and perform adjustments as needed.

      Regarding the use-case changing: it wasn't entirely obvious in the beginning that a lot of users would self-host the product, so it was designed more with the cloud product / managed service in mind; in hindsight things seem obvious, but not in the moment. As it became clearer that more and more people were going to self-host the product, the team shipped more features to accommodate a simpler self-hosting experience and, as part of that, the PostgreSQL migration.

  • p-e-w 30 days ago
    At some point during the past 10 years or so, the world realized that document databases are actually a bad idea, and relational databases can do the same job better with just a few easy-to-implement QoL improvements (such as JSON operators in SQL).

    Meanwhile MongoDB's creators thought this was a great time to make Mongo non-free software, presumably with the goal of making a quick buck from cloud operators.

    The result is pretty much the landscape we are seeing today, with Postgres reigning supreme, and even SQLite being experimented with in server setups, while Mongo is on life support and MySQL not far behind.

    • vmatsiiako 30 days ago
      I wouldn't say "Mongo is on life support" – it's actually a very successful business growing 30% YoY on a massive scale. Yet, I agree that the license switch has definitely damaged their long-term ecosystem.
      • smt88 30 days ago
        They grew 30% if you look at revenue, but that's not how people determine whether a business is healthy.

        EBITDA is a better metric, and that number was getting dramatically worse every year until 2023.

        But even in 2023, EBITDA was -$202M. I don't see it ever approaching $0.

        So it's still a shit business on top of a shit product that no one ever really needed.

        • Joel_Mckay 30 days ago
          I think anyone who administers mongodb would agree it has a list of very dangerous behaviors for novices.

          Try to upgrade a running instance node between some versions, and one may thrash object keys or worse. Best to dump and re-load your data every major upgrade to recover all set properties... and there still may be slight differences in the query language later. Wait, your data sets are 1.5TiB per node.... that is a lot of downtime...

          The cursor, projection, and re-format queries can be very good at reducing reporting traffic. However, the 3 page long syntax can be unreadable to the uninitiated. The json essentially jams a bunch of equivalent SQL queries in 1 transaction, but you still have to be careful as there is no real guarantee of ordering without auto-indices in clusters.

          MongoDB is not as clean as SQL, but it works for storing constantly changing metadata, i.e. if you find your SQL schema ends up with tuples that have a lot of poorly designed null entries, table edits, and/or simply implements an object-storage abstraction in massive metadata and object-class catch-all tables.

          And yes, the mongodb new licensing model telegraphs an unfavorable policy posture. However, I do understand not wanting fortune 500 companies exploiting your teams time for free.

          Profit is not an indication of quality, but rather fiscal utility. ;-)

          • DrNash17 18 days ago
            what is mongo's new licensing model?
        • dangtony98 30 days ago
          I'm actually very bullish on MongoDB.

          While the article and significant sentiment in threads suggest a push off MongoDB toward PostgreSQL, I do think MongoDB has its own place in the stack and that it won't be obsoleted. I've personally had many pleasant experiences working with it in past projects.

          Regarding business metrics, I may have a slight bias coming from the startup world, but we often value revenue the most, especially in the earlier stages of a company; we can reduce costs and perform various optimizations later, but what stays at the foundation is whether there is strong product-market fit such that more customers keep coming in and coming back to the underlying product. I would say MongoDB's current growth trajectory is in line with that.

          I'm hopeful overall that they can turn profitable (maybe not this year or next but eventually so).

          • smt88 29 days ago
            HN should be full of Mongo's users and advocates, and yet this site loathes it. That by itself is a sign that it's a zombie.

            The reason we don't just look at revenue is that you can "buy" revenue with ads and sales teams. But if that revenue growth costs more than it pays because your customers aren't sticky, then your business doesn't work. You can't apply startup principles to this large, aging public company.

            • mushi 27 days ago
              “This site” loved ReThinkDB. How is it doing?
        • rrr_oh_man 30 days ago
          I was totally expecting someone to come up with some vanity metric here, but a negative EBITDA of several HUNDRED MILLION? Jesus. How did that happen? B2B businesses should be a symphony of ka-chings for them.
        • otabdeveloper4 30 days ago
          MongoDB is great if what you need is a replicated log or queue. It's what Kafka should have been.
          • smt88 29 days ago
            As others have said, you can just use your RDBMS for that now. Mongo is a feature of better products.
            • otabdeveloper4 27 days ago
              Postgres doesn't have a sane and working replication story. (Mysql does, but it's full of legacy warts, of course.)

              Mongodb is foolproof and works out of the box.

              The other alternative here isn't Mongodb, it's Couchdb, but that isn't an RDBMS.

        • basil-rash 30 days ago
          Why do you choose EBITDA? Last I heard it was "utter nonsense", at least according to Warren Buffett.

          https://m.youtube.com/watch?v=tvnKylAyLbQ

          • roenxi 30 days ago
            You'd probably save time if you directly suggested what income metric you want to use. In the video, Buffett is basically saying he cares about income after interest, tax, depreciation and amortisation. In other words, he cares about actual income. That'd be very on-brand for Buffett.

            That measure won't be good for MongoDB Inc. either; they've been millions in the red for years. Losing $100 million in a year would be the best result they've achieved in a while now.

            • _v7gu 30 days ago
              so by Buffett's metric, Mongo is doing even worse
          • airstrike 30 days ago
            Buffett's criticism is that EBITDA is just accounting and he cares about cash flow, specifically also after paying for CAPEX (which is Investing Cash Flow so comes after Operating Cash Flow)

            EBITDA is used in the industry because it is a proxy for operating cash flow. Sometimes you don't have all the available data needed to get to OCF, or you're looking at company guidance (for future EBITDA values) or analyst estimates. It's easier to keep the conversation at the EBITDA level because it requires fewer assumptions. Generally the revenue line is ~easy to estimate because you can conceptualize how to go from the current number of customers to some future number of customers, how many dollars per customers and so on and so forth.

            Then as you work your way to EBIT (Operating Income) you still have to assume some gross margin, R&D expense, etc. These are pretty tangible. It should be pretty easy to get to estimated EBIT from what the company discloses in guidance or what analysts forecast. Since D&A is pretty linear over time, people generally assume it just remains constant as a % of revenue, so now you have EBITDA which is very much like cash flow

            EBITDA is similar to cash flow because it adds back to EBIT the non-cash expense that is D&A. The reason it's good to look at it before interest and taxes is because you're also thinking about how much cash the whole enterprise generates, not how much cash goes to equity holders at the end (which is often called "Free" Cash Flow because it's not tied up with commitments to others)

            Coming back to Buffett, in the industries he tends to pick stocks from, CAPEX is a major thing. Companies need to build factories, buy equipment, etc. So if you just look at future EBITDA without accounting for future CAPEX needs, you're fooling yourself.

            Truth be told, in those industries everyone also looks at "EBITDA minus Capex". Maybe they do so now that he's bemoaned everyone for not doing it in the first place, but IMHO his criticism largely doesn't apply among valuation professionals. Maybe it does for stock traders, but not for valuation purposes like in an M&A context

          • jpgvm 30 days ago
            Buffett's point is that it can be gamed to look good. In this case, even if they are gaming it to look better than it is... well, it's still awful.
          • phonon 30 days ago
            Their "Comprehensive Losses" for the last 3 years are (in thousands)

            (171,150)

            (343,375)

            (309,090)

            So there you go...

            • librasteve 30 days ago
              they lost 300 million in one year?
              • christkv 30 days ago
                Stock compensation cash flow was positive meaning they added cash to their account.
            • jeltz 30 days ago
              Are they literally setting money on fire?
          • blitzar 30 days ago
            People who use EBITDA are either trying to con you or they're conning themselves.
            • aerhardt 30 days ago
              Probably not used in a conning context when it’s 200-300 million in the red.
              • DrNash17 16 days ago
                Hedge fund investor here chiming in for all of you financial wizards. Hint: don't look at EBITDA. Look at (Atlas) revenue, billings, (current) remaining performance obligations, and operating cash flow. On cash flow, some people decide they want to subtract stock comp (i.e. don't add it back in the GAAP Net Income > OCF reconciliation), but that's a whole other philosophical debate that the investing community has not settled on.

                Can someone tell me what their licensing changes are?

    • tgv 30 days ago
      Document databases aren't an inherently bad idea. They have their uses. I'm using mongodb because it fits my use case: the app is very front-end oriented, and the data the user needs on a specific page fits neatly in one document. There are (quite a lot) related documents, but they are only needed in specific parts of the app, and that's when those are retrieved. Pretty simple.

      I could argue that in-server SQLite is a bad idea: if you ever need to share the database or distribute tasks, you're fucked. But for some use cases it just works.

      "The world" hasn't realized shit. It jumps from fad to fad, fueled by hyped promises and lack of experience.

      • eddd-ddde 30 days ago
        What about a Postgres database where you store your documents in a key value table of json objects?

        What mongodb benefits would you be missing?

        • tgv 30 days ago
          What benefit does switching to postgresql bring? When mongodb stops being maintained, I'll consider postgres, but until then, I now only see downsides.
          • eddd-ddde 30 days ago
            Lots of useful features?

            You can join data when needed, proper transactions, schemas where you need schemas.

            For example, you start with a basic (key, json) table. Once you begin to stabilise your schema, add columns generated from your JSON objects; now you get proper joins, indexes, and validations on those columns.
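            That progression can be sketched end to end. This is a minimal, hypothetical example using SQLite (via Python's sqlite3) as a stand-in, since it supports the same json-extract and generated-column ideas (SQLite 3.31+); in Postgres you'd use a JSONB column and `GENERATED ALWAYS AS (body ->> 'email') STORED`. Table and column names are made up for illustration:

```python
import sqlite3

# Stand-in sketch: a (key, json) documents table that grows a typed,
# indexable column generated from the JSON body as the schema stabilises.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE docs (
        id    INTEGER PRIMARY KEY,
        body  TEXT NOT NULL,  -- the raw JSON document
        email TEXT GENERATED ALWAYS AS (json_extract(body, '$.email'))
    )
""")
# A plain index on the generated column makes lookups fast and lets the
# database join and validate on it like any ordinary column.
conn.execute("CREATE INDEX docs_email ON docs(email)")
conn.execute("""INSERT INTO docs(body) VALUES ('{"email": "ann@example.com", "name": "Ann"}')""")
row = conn.execute(
    "SELECT email FROM docs WHERE email = 'ann@example.com'"
).fetchone()
print(row[0])  # ann@example.com
```

            The document stays schemaless in `body`; only the fields you have stabilised get promoted to real columns.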

    • MrBuddyCasino 30 days ago
      I’m not sure that is the case. What I’m seeing is that people use the cloud-native document stores such as DynamoDB instead of Mongo. The only thing zoomers hate more than traditional SQL databases is XML.
      • cwbriscoe 30 days ago
        Doesn't everybody hate XML?
  • arusahni 30 days ago
    This seems to gloss over the actual rewrite. How did you ensure queries were equivalent? How did you configure the product to be able to read from and write to both databases? Did migrating reveal any bugs? Were you largely able to port your queries 1:1?
    • Semionilo 30 days ago
      You can add logging to both drivers, dumping results.

      Then check which queries don't return the same result set.
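      A hedged sketch of that comparison step (the function name and row shapes here are hypothetical; real code would also have to normalize driver-specific types such as ObjectId vs. text ids before diffing):

```python
# Diff two result sets for the "same" logical query run against both
# databases. Rows are dicts; the key function normalizes each row into
# a hashable, order-independent form so set difference can be used.
def diff_results(mongo_rows, pg_rows, key=lambda r: tuple(sorted(r.items()))):
    a = {key(r) for r in mongo_rows}
    b = {key(r) for r in pg_rows}
    return {"only_in_mongo": a - b, "only_in_pg": b - a}

# Example: one row drifted between the two stores.
mismatch = diff_results(
    [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}],
    [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bobby"}],
)
print(mismatch["only_in_mongo"])  # {(('id', 2), ('name', 'Bob'))}
```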

    • Cyberdog 30 days ago
      Hopefully they had comprehensive test coverage. That would have helped greatly in that regard.
  • KingOfCoders 30 days ago
    Some years ago we had so many problems with MongoDB that we also ditched it for Postgres. Never been happier.

    Not the reason, but very annoying: why doesn't MongoDB use SQL (perhaps this has changed?) instead of JavaScript with endless $s? It was always a pain writing ad-hoc queries.

    • cj 30 days ago
      > Why doesn't MongoDB use SQL

      Perhaps not the right answer, but SQL literally stands for Structured Query Language. I suppose you'd need a USQL?

      • KingOfCoders 30 days ago
        Google says:

        'The original full name of SQL was SEQUEL, which stood for "Structured English Query Language". A language which has been structured in English or lets say English-Like Query language which follows the syntax structure of English.'

  • iambateman 30 days ago
    I use MySQL for everything - always have.

    Can someone hit me with a few reasons why you would use Postgres over MySQL? I don’t have any familial affinity to any database, but I’m not sure what the benefits to Postgres are relative to MySQL.

    • phamilton 30 days ago
      Some things that are nice in Postgres and missing in MySQL:

      - Create constraints as NOT VALID and later VALIDATE them. This allows you to create them without expensive locks.

      - explain (analyze, buffers). I miss this so much.

      - Row level security.

      - TOAST simplicity for variable text fields. MySQL has so many caveats around row size and what features are allowed and when. Postgres just simplifies it all.

      - Rich extension ecosystem. Whether it's full text search or vector data, extensions are pretty simple to use (even in managed environments, a wide range of extensions are available).

      Is that (and more) enough for me to migrate a large MySQL to postgres? No. But I would bias towards postgres for new projects.

      • dvnguyen 30 days ago
        How about HA and horizontal scaling? I’ve heard that MySQL excels in that area.
        • phamilton 29 days ago
          I mostly have used AWS Aurora there, which is significantly better than vanilla MySQL or Postgres and both are similar enough.

          In Aurora, Postgres has Aurora Limitless (in preview) which looks pretty fantastic.

          As far as running yourself, Postgres actually has some advantages.

          Supporting both streaming replication and logical replication is nice. Streaming replication makes large DDL have much less impact on replica lag than logical replication. As an example, if building a large index takes 10 minutes then you will see a 10 minute lag with logical replication since it has to run the same index build job on the replica once finished on the primary. Whereas streaming replication will replicate as the index is built.

          Postgres 16 added bidirectional logical replication, which allows very simple multi-writer configurations. Expect more improvements here in the future.

          The gap really has closed pretty dramatically between MySQL and Postgres in the past 5 years or so.

    • wlll 30 days ago
      I do scaling and performance work, mostly with Rails apps, but a significant amount of the work is database level and not language specific. I've used both postgres and MySQL (and a few other databases) going back to 2000.

      The best thing I can hear from a company when I start is "We use Postgres". If they're using postgres then I know there's likely a far smoother path to performance than with MySQL. It has better tooling, better features, better metadata.

      • Pengtuzi 30 days ago
        Two paragraphs about yourself and ending with a very vague sentence that “answers” their question. 10/10
        • wlll 29 days ago
          It's context and weight for what would otherwise be just another opinion.
      • derekperkins 19 days ago
        Postgres does several things better than MySQL, but tooling isn't among them.
    • eknkc 30 days ago
      Last time I used MySQL I had some delete triggers to clean up some stuff.

      Apparently MySQL does not run delete triggers when rows are deleted by a foreign key cascade.

      Every time I used its slightly more advanced features, I ran into such problems. With PostgreSQL I don't need to wonder whether something will work.

    • andybak 30 days ago
      PostGIS was way ahead of the MySQL equivalent last time I checked.

      The plugin ecosystem is pretty astonishing. Foreign Data Wrappers... I'm not hands on so much any more but there were a lot of things back when I was.

    • dpcx 30 days ago
      I'll give you the one that matters to me: in MySQL, you can't do DDL statements (create table, alter table, etc) inside a transaction. MySQL will implicitly commit after each DDL statement.
      • kstrauser 30 days ago
        Aww, yes. It’s so nice being able to wrap giant schema and data migrations in a transaction so the whole process is atomic.
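A sketch of what this buys you in Postgres (table and column names hypothetical); in MySQL the ALTER would commit implicitly, so a later failure would leave the schema change applied:

```sql
BEGIN;
ALTER TABLE users ADD COLUMN email text;
UPDATE users SET email = username || '@example.com';
-- if anything above fails, ROLLBACK undoes the DDL and the data change together
COMMIT;
```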
    • tianzhou 30 days ago
      Architecture-wise, Postgres is more extensible. Another reason is licensing. Both contribute to a viral ecosystem.

      https://www.bytebase.com/blog/postgres-vs-mysql/

      • p4ul 30 days ago
        This is exactly what I would argue. PostgreSQL makes it straightforward to create extremely powerful extensions.

        PostGIS is one such extension, and I would argue that if your use case involves geospatial data, then PostGIS alone is enough of a reason to use PostgreSQL!

    • gkbrk 30 days ago
      Not worrying about Oracle suing you is a good reason to use PostgreSQL or MariaDB.
    • kqr 30 days ago
      Back when I was choosing which database to get more fluent in, one of my concerns about MySQL was that it was described as playing fast and loose with type casts, at least compared to Postgres.
      • derekperkins 19 days ago
        That's thrown around by people who have been around a long time, but hasn't been true by default for a decade, since 5.7 was released.
    • teunispeters 30 days ago
      Stability, reliability, consistency. If you do anything with Unicode you're also much better off in PostgreSQL. Faster indexes, smaller base install (*), and much more complete SQL language support.

      (*) note: when PHP was taking off, MySQL had a smaller install base. This has long since changed - PostgreSQL hasn't grown much over the years, and MySQL has, at least since the last time I worked on both circa 2015-ish.

      • majewsky 27 days ago
        Friendly nitpick, because I had a double-take when reading the second paragraph: Be careful to differentiate between "install base" and "base install". In both cases, you are referring to "installed size of the database in its base configuration". But "install base" commonly means "number of installations". So I was very confused when the second paragraph was implying (using the standard meaning for "install base") that the number of Postgres installations had not grown over the years.
        • teunispeters 19 days ago
          Fair. I was referring to the space a minimal installation takes.
    • anticodon 30 days ago
      PostgreSQL supports more SQL features and data types out of the box. Also, it looks like MySQL development has stalled since its purchase by Oracle. PostgreSQL has exciting new features in every release; I forget when anything significant last happened in the MySQL world. It's been frozen for about a decade now. There are some new releases, but you won't find anything exciting in the change logs.
    • callalex 30 days ago
      I know this isn’t a technical reason, but my main reason is “eww, gross, it smells like Oracle here.” I’ve been around long enough to know that even being in the same zip code as Oracle is a bad idea.
    • megaman821 30 days ago
      I use MySQL mostly, but I would love to have a few features from the Postgres world; namely, the better full-text search, key-value store, queue, and vector search. Most of my projects never reach the scale where these need to be separate data products, so the perfectly fine Postgres versions would suffice.
      • baur 30 days ago
        CrateDB might be a good fit for full text and vector search (it's a SQL database but has dedicated clauses for FT and VS).

        Curious how you use PG for key/value and queue - do you use regular tables or some specific extensions?

        I can imagine kv being a table with a primary key on “key”, and the queue being a table with a generated timestamp, indexed by that column, with peek/add utilising that index.
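The shape described above can be sketched with plain tables and no extensions (all names are made up):

```sql
-- key/value: one table, upsert via ON CONFLICT
CREATE TABLE kv (key text PRIMARY KEY, value jsonb);

INSERT INTO kv (key, value) VALUES ('settings', '{"a": 1}')
  ON CONFLICT (key) DO UPDATE SET value = EXCLUDED.value;

-- queue: FOR UPDATE SKIP LOCKED lets many workers pop jobs safely
CREATE TABLE jobs (
  id bigserial PRIMARY KEY,
  payload jsonb,
  enqueued_at timestamptz NOT NULL DEFAULT now()
);

DELETE FROM jobs
WHERE id = (
  SELECT id FROM jobs
  ORDER BY enqueued_at
  FOR UPDATE SKIP LOCKED
  LIMIT 1
)
RETURNING payload;
```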

  • kcartlidge 30 days ago
    I've used (and introduced) MongoDB in production. Though I much prefer PostgreSQL, SQLite, MySQL, or SQL Server, for some use cases a document database is fine.

    However as I discovered myself, once you realise you need to use Mongoose with it you should usually take that as a prompt to consider going relational.

    Don't get me wrong, Mongoose is a good package. But the things it solves could likely be better fixed by moving away from MongoDB - I'd go so far as to say that in most cases Mongoose should only be added to help an existing project; if you need it from the start you probably should go relational.

    (YMMV and there will be exceptions.)

  • endisneigh 30 days ago
    The takeaway is that this has less to do with technical reasons and more to do with licensing, which is fair.

    The other takeaway is that the huge gains they saw from query optimizations with joins show that their data wasn't properly modeled for use with a key-value store, which probably added to them switching to the right type of store to begin with.

    • ako 30 days ago
      And that’s the core problem with key value stores, your data needs usually grow beyond key value scenarios. In the beginning it might fit, but then you add more pages with different needs, reporting/dashboards with different needs, APIs with different needs, and ETL processes with different needs. Trying to force everything into key value is short term thinking.
      • endisneigh 30 days ago
        What you’re describing isn’t an inherent problem with a key value store. Forcing everything to OLTP SQL is also short term thinking. Pick the right tool for the job.

        It’s funny you mention growth as the limiting factor for key value stores, if anything that’s the one area where they’re objectively superior. Which is why most search databases and caches are key value stores.

        • ako 30 days ago
          Not if you use a rdbms to store your key/values. Then you can do arbitrary queries with joins over multiple tables, add views, stored procedures, etc.
          • endisneigh 30 days ago
            Sure, but that’s not related to what you said prior. Again, right tool for the job.
            • ako 30 days ago
              Sure. I meant grow in number of ways you use your data, not in volume of data.
              • endisneigh 30 days ago
                What you meant was understood and my point stands
  • paulmd 30 days ago
    In this world you either evolve to Elastic/Cassandra/etc or return to Postgres :crab:

    You really gotta search your soul and ask yourself: is your cool idea really a database, or just a plugin for Postgres? And the answer is often the latter.

    • winrid 30 days ago
      The problem is that postgres is not a drop-in replacement. The JSON operations are not atomic, which is terrible. If you wanna use PG that's cool, but I'd suggest just avoiding JSON.
      • radiospiel 30 days ago
        > The JSON operations are not atomic

        This is the first time I'm hearing this. What exactly is not atomic, and is there a resource with more details?

        • adastral 30 days ago
          Not sure if this is what the above comment means by "atomic", but a shortcoming of Postgres' JSON support is that it will have to rewrite an entire JSON object every time a part of it gets updated, no matter how many keys the update really affected. E.g. if I update an integer in a 100MB JSON object, Postgres will write ~100MB (plus WAL, TOAST overhead, etc.), not just a few bytes. I imagine this can be a no-go for certain use cases.
          • bandrami 30 days ago
            It drives me batty to see people store 100MB JSON objects with a predictable internal structure as single records in an RDB rather than destructuring it and storing the fields as a single record. Like, yes, you can design it the worst possible way like that, but why? But I see it all the time.
            • Temporary_31337 30 days ago
              Because schemas. The whole point of nosql is that you can alter your data model without having to reload the whole database
              • blksv 30 days ago
                Actually, that's the whole point of RDBs: that you can alter your data model (in most cases) just by a simple DDL+DML query. And it is with NoSQL that you have to manually download all the affected data from the DB, run the transformation with consistency checks, and upload it back. Or, alternatively, you have to write your business logic so that it can work with/transform on-demand all the different versions of data objects, which to my taste is even more of a nightmarish scenario.
              • bandrami 30 days ago
                Which is great in the early stages of development, but people actually deploy like this
                • aerhardt 30 days ago
                  The benefits of going schemaless in the early stages of development are highly suspect in my experience. The time that one might save in data modeling and migrations comes out from the other end with shittier code that’s harder to reason about.
                  • blksv 30 days ago
                    My perspective is that using NoSQL does not save time in data modeling and migrations. Moreover, one has to pay in increased time for these activities, because (a) in most cases, data has to follow some model in order to be processable anyway, the question is whether we formally document and enforce it at a relational storage, or leave it to external means (which we have to implement) to benefit from some specifically-optimized non-relational storage, (b) NoSQL DBs return data (almost) as stored, one cannot rearrange results as freely as with SQL queries, not even close, thus much more careful design is required (effectively, one has to design not only schema but also the appropriate denormalization of it), (c) migrations are manual and painful, so one had better arrive at the right design at once rather than iterate on it.

                    That is, of course, if one doesn't want to deal with piles of shitty code and even more shitty data.

          • winrid 30 days ago
            It's not an issue with size. It's an issue with race conditions. With Mongo I can update a.b and a.c concurrently from different nodes and both writes will set the right values.

            You can't do that with PG JSONB unless you lock the row for reading...
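To make the trade-off concrete, a sketch of the two Postgres patterns (table and column names hypothetical). A single-statement jsonb_set re-evaluates against the latest row version once the row lock is acquired; it is the read-modify-write done in application code that needs the explicit lock:

```sql
-- single-statement path update: two such UPDATEs on different keys
-- queue on the row lock rather than clobbering each other
UPDATE docs SET data = jsonb_set(data, '{b}', '1') WHERE id = 1;

-- read-modify-write in application code needs an explicit lock,
-- otherwise a concurrent writer's changes can be lost
BEGIN;
SELECT data FROM docs WHERE id = 1 FOR UPDATE;
-- ... modify the document in the application ...
UPDATE docs SET data = '{"b": 1, "c": 2}' WHERE id = 1;
COMMIT;
```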

            • callalex 30 days ago
              Yes but that simplified write complexity means you are pushing a ton of read complexity out to your application.
              • winrid 29 days ago
                What?? That's an insane argument. That's like saying if one client sets column X to 1 and another client concurrently sets Y to 2, one client's writes will be LOST. It shouldn't, and it doesn't. If it did, nobody would use Postgres. This issue only exists with PG's JSON impl.
                • callalex 29 days ago
                  What?? That’s an insane way to describe what I’m talking about. Data/transaction isolation is very complex and extremely specific to every use case, which is why database engines worth anything let you describe to them what your needs are. Hence why when one client writes to Y they specify what they think X should be if relevant and get notified to try again if the assumptions are wrong. An advantage of specifying your data and transaction model up front is that it will surface these subtle issues to you before they destructively lose important information in an unrecoverable manner.

                  https://en.wikipedia.org/wiki/Isolation_(database_systems)

                  • winrid 29 days ago
                    So updating one column on a table is destructive and can lose important information now? :D

                    How do you increment a value in a column while other people write to the database? You don't grab the whole damn row and rewrite it...

                  • winrid 29 days ago
                    Also this statement:

                    > which is why database engines worth anything let you describe to them what your needs are

                    Contradicts your argument.

                    Mongo has atomic updates to update specific fields, or you can do replaceOne() etc to replace the whole document.

                    While PG only gives you "replace" with JSON.

                    So I guess postgres isn't worth anything. :)

          • jeltz 30 days ago
            A JSON object which is 100 MB after compression is a quite huge thing.
      • mulmboy 30 days ago
        How are JSON operations not atomic? Genuinely curious
        • bandrami 30 days ago
          I think the sense is not "atomic by field" or whatever you'd call that. If you're going to ignore the fact that PG is an actual ORDB and just store gigantic blobs of JSON in it, it will write the entire object whenever you update part of it, because the whole point is you're supposed to store it as a multi-field record.
          • SonOfLilit 30 days ago
            Isn't that true for Mongo as well?
            • rb12345 30 days ago
              Yes, but as MongoDB is a document database, storing and updating giant blobs of JSON as a single operation as opposed to breaking the JSON down into individual fields is intended behaviour. This works in Postgres too, of course, but then you lose the relational database advantages on top of the large-single-field issues.

              All this really comes down to is picking the right database type for the problem you're trying to solve.

  • adhamsalama 30 days ago
    I pushed the company I work for to use Postgres instead of Mongo in all new services and refactors. Now I'm leaving it to join a new company, where they use Mongo. Sigh. I'll have to do this all over again.
  • winrid 30 days ago
    I don't doubt there are real benefits for them to switch but BTW this is not true:

    > Difficulty configuring database transactions: With MongoDB, setting up transactions was not trivial because it required running MongoDB in cluster mode with various configuration overhead; this made it extremely difficult, for instance, for customers to run a simple POC of Infisical because it required a production setup of MongoDB.

    You can run single instance in clustered mode, just a single-instance replica set. You get the oplog, transactions, etc. No advanced configuration required.
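A minimal sketch of that setup (paths and the replica-set name are illustrative):

```shell
# start one mongod as a single-member replica set
mongod --replSet rs0 --dbpath /data/db --port 27017 &

# one-time initialisation; afterwards the oplog and transactions work
mongosh --eval 'rs.initiate()'
```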

    • vasco 30 days ago
      And this has been possible for many years.
    • blue_pants 30 days ago
      Are there any disadvantages to using transactions with a single-instance replica set?
      • ammo1662 30 days ago
        We ran a single-instance replica set for three years in a small application for its transaction support.

        The only difference we noticed was the additional replica-set setting when provisioning a new instance. No other disadvantages were found.

  • thih9 30 days ago
    Loosely relevant, “the marketing behind mongodb”

    https://news.ycombinator.com/item?id=15124306

    • CSMastermind 30 days ago
      Man, that thread feels prophetic. Since 2017 whenever this comment was written:

      > 100% of my friends who have used Mongo/similar NoSQL have given up and had a nasty rewrite back to pgSQL...

      I've led three migrations at different companies from Mongo to Postgres, spending months of my life and millions of dollars.

      There really should be a sociology study of why so many people/companies used Mongo even well after the downsides were clear.

  • jacobyoder 30 days ago
    "this hurt in particular because our data was very much relational."

    Umm... why would you not choose a relational engine to start with? This isn't just said with the benefit of hindsight. I worked with Lotus 123 back in the early 90s. I get that there's value in document databases, but even then there were limitations, and the need for structured/related data in an engine that easily allowed ad-hoc queries out of the box was apparent.

    I watched the entire nosql/mongo movement arise and evolve over time, and it rarely made sense to me. Even when mongo made sense for a problem... it only makes sense for that problem, and not as the primary basis for your entire application. Relational/SQL, with nosql/mongo/etc as secondary/ancillary data store has always made the most sense (imo).

    JSON columns in major databases now tend to provide a good balance for many use cases. Justifying something other than that as a starting point is possible, but I've rarely seen cases where it makes sense to avoid a decent SQL engine as the core data store for your business apps.

    • CSMastermind 30 days ago
      > Umm... why would you not choose a relational engine to start with?

      There are typically two answers to this question:

      When they first started things were changing so quickly they didn't want to commit to a schema and felt like doing real data modeling would slow their velocity down so they dumped everything into Mongo and called it a day.

      Or they had someone on the team who insisted that Nosql was the way to go. They'll say something like, "I worked on xyz team at Amazon and we just used DocumentDB for everything and never had a problem. Those horror stories about Mongo are all from people who didn't use it right."

      • jacobyoder 20 days ago
        Interestingly... I've never encountered the first scenario as a justification. I know it must exist, as I've read people describe it.

        Every time I've been in a group where nosql was desired, it was because they saw it at a conference, or watched a youtube video, or read some blog extolling nosql virtues, and... yeah - that second option. They'd either never actually used nosql, or didn't realize what they were getting into, possibly because even if they'd used it, the larger architecture had SQL in it someplace, just outside what they could see.

  • fl0ki 30 days ago
    I inherited a troubled project using MongoDB and learned a lot about what is, and isn't, really a problem with MongoDB [for this kind of project].

    BSON is a terrible format, but I genuinely like that I can use the same schema all the way through the JSON APIs to the database. If they have to deviate in future for some reason, you can deal with it then.

    If you are clever with generics, you can compose type-agnostic DB code with type-safe in-memory data structures, without needing any schema generation or ORM. It's a natural fit for data-oriented programming in my workhorses Go and Rust. Unfortunately, the Rust library is very poorly designed in many ways, making many basic things extremely inefficient, and that's not what you want in Rust. The Go library has really slow reflection-based (de)serialization, but other than that, it's very flexible and robust.

    If you keep your queries simple, it's easy to mock out and combine with techniques like snapshot testing. Since BSON has a lossless representation in the form of extJSON, you can have human-readable snapshots, diff them easily in review, etc.

    Doing bulk writes is easy and efficient, and doing sessions/transactions is easy, but you should basically never combine the two. The server can have its own configured timeout for a session, and if your bulk write exceeds that, it fails and retries, fails again, etc. This is a really serious design flaw that shouldn't have made it past review. If they wanted to enforce a maximum bulk write size, that should be known up-front, not depend on how many seconds it happened to take each time.

    Writing data to a replica set is extremely slow, and every index you add makes it slower. This compounds further with the above.

    I have concluded that MongoDB only makes sense for read-mostly workloads. If you're going to do your own in-memory data structures and validation anyway, writing out small infrequent changes, MongoDB doesn't do much for you but it also doesn't get in your way very much.

  • akie 30 days ago
    Did anyone here ever see a migration _from_ PostgreSQL/MySQL _to_ MongoDB? I've only ever seen startups pick Mongo, then regret it a few months or years down the line and migrate to a relational database. Did anyone ever see the opposite?
    • djbusby 30 days ago
      I watched a team go from a sloppy MySQL that they never tried tuning to Mongo (cause it's web-scale). It was another thing that almost killed the company. Now they are on PostgreSQL.
    • hibikir 30 days ago
      I've seen a couple of successful ones, but there were extremely good reasons: Most of the time we don't need scaling, but when your target Mongo cluster has a couple hundred machines, and your queries really are very simple, NoSQL is not a crazy idea. Same with DynamoDB: Sometimes it really is a very good choice, but you need a use case of the right shape. If you have an extremely good use case for your table to be in 5+ regions, the dataset has billions of rows and you have no hot keys, maybe running your own replication system really is more effort than letting AWS do it all for you.

      It's just far more common for some tech lead in small company to imagine that they are going to be bigger than Netflix, than for someone to start conservatively with a RDBMS, and realize that suddenly they have hundreds of millions of users and some kinds of data where the relational bits don't matter.

    • harry_ord 30 days ago
      I work with the aftermath of going from postgres to couchdb. The migration was done years before I joined. There are still customer records showing postgres ids.

      It didn't seem like it made things better since there were issues with dates and lost data. Working with it now isn't fun either.

    • willio58 30 days ago
      No, but I am currently in a startup that mainly uses DynamoDB (similar to Mongo) with support from Elastic for more search-based queries. We've found DynamoDB to be great and we don't feel hindered by the lack of relational-ness.

      I will say in case anyone doesn't understand how this might be possible, relational querying is possible in document databases, but they essentially require just different ideas to achieve. In Dynamo, you might need to create secondary indices for example.

      Personally I love Postgres and would not mind if we had gone that direction but the more I use DynamoDB the less I feel like Postgres is the "only way"

  • rglover 30 days ago
    > Missing out on relational features: With MongoDB, we lost out on many nice features from the relational world like CASCADE which, when specified, deletes all referenced resources across other tables whenever a target resource is deleted; this hurt in particular because our data was very much relational.

    I'd be very curious what their data model was in relation to this problem. I wonder if denormalization of the data would have solved the problem without a need for a full database hop.
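For reference, the CASCADE behaviour mentioned in the quote is a one-line addition to a foreign key (table names hypothetical):

```sql
CREATE TABLE projects (id bigint PRIMARY KEY);

CREATE TABLE secrets (
  id bigint PRIMARY KEY,
  project_id bigint NOT NULL
    REFERENCES projects (id) ON DELETE CASCADE
);

-- deleting a project now removes its secrets in the same statement
DELETE FROM projects WHERE id = 1;
```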

  • bsaul 30 days ago
    Is it still difficult to create a cluster of pg dbs, either for redundancy or for speed?

    Last time I advocated for using pg vs mongodb, the person replied that mongodb clustering was super easy.

    • baq 30 days ago
      Mongo absolutely can be the right answer if you need to horizontally scale to tens or hundreds of TBs. Anything below that I’d rather have a small cluster of big Postgres instances. There’s value in a SQL RDBMS which you just don’t get anywhere else.
    • Ozzie_osman 30 days ago
      Read replicas are pretty easy on postgres using replication. That said, you need to be careful about replica lag.

      If you want to distribute your writes, that's a little trickier. There are options like Citus and such. But still not natively supported.

      • KingOfCoders 30 days ago
        Second that. A client had PG replica lag, was doing backups from the replica, and found out while restoring a backup that some hours of data were missing.
        • jononor 28 days ago
          Ouch! Are there standard tools to monitor that? Any best practices to avoid it? I am considering exactly that kind of setup...
          • 5Qn8mNbc2FNCiVV 28 days ago
            I don't know about there being a standard tool but you can track the LSNs together with a timestamp on the write node and from there calculate the replica lag by checking what LSN the replica is at
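A sketch of that approach using Postgres' built-in views and functions:

```sql
-- on the primary: per-replica lag in bytes of WAL not yet replayed
SELECT client_addr,
       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes
FROM pg_stat_replication;

-- on a replica: time since the last replayed transaction
SELECT now() - pg_last_xact_replay_timestamp() AS replication_lag;
```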
  • globular-toast 30 days ago
    I started my career doing web development using PHP and MySQL. Then I pivoted to something completely different for about 6-7 years as I got sick of spending all my time making stuff work in IE6. I heard about the whole nosql thing, didn't understand it, and thought it sounded silly. Then I came back to web dev 5 years ago and found postgres was all the hype and nobody talked about nosql any more. Glad I skipped it!
  • pmarreck 30 days ago
    Could have told most people this back when everyone was going to NoSQL.

    "You'll be back". Relational capability is just too useful beyond a certain point

  • XCSme 26 days ago
    I currently use MariaDB for my self-hosted application. Would I gain big advantages switching to PostgreSQL? My main reason for going with MySQL/MariaDB is the availability and ease of installation on most hosting providers.
    • derekperkins 19 days ago
      No. In almost no circumstances will it be worthwhile to switch either direction. While people on HN love to nit-pick on the margins, the core functionality for both is 95% overlapped.
  • jrochkind1 30 days ago
    > we felt that it already delivered significant benefits by virtue of not being MongoDB

    Well then, tell us how you really feel.

  • rullopat 30 days ago
    That's what happens when you choose the next product / framework / methodology because of fashion, or because FAANG is giving you the stuff for free, without thinking about whether it really fits your use case and that 99.9999% of companies will not have the problems Google, Facebook, etc. have.
  • h1fra 30 days ago
    Funnily enough, the one feature I don't recommend in Postgres is CASCADE. Unless you have a very small, controlled set of FKs it's not a good idea; batch deletes are less problematic. I wish Postgres would do something about it, like Elasticsearch's deleteByQuery, which can run in the background.
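The batch-delete alternative can be sketched like this (names hypothetical); repeating a small delete keeps each transaction, and therefore each lock window, short:

```sql
-- run repeatedly until it reports 0 rows deleted,
-- then delete the parent row itself
DELETE FROM secrets
WHERE id IN (
  SELECT id FROM secrets
  WHERE project_id = 42
  LIMIT 1000
);
```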
  • Cyberdog 30 days ago
    What is the point of using a query builder if you’re only going to support a single RDBMS? Why not write straight SQL and avoid the unnecessary abstraction layer? Is the query builder really going to be easier to learn to use (particularly for non-trivial queries) than SQL?
    • callalex 30 days ago
      If your DB schema is expressed as an object/class it means your IDE will have robust autocompletion. It’s just a convenience to avoid having to look up the schema for whatever tables you’re interacting with. It helps cut down on trivial bugs during the writing phase, before you even get to testing.
  • fakedang 30 days ago
    Can somebody please explain to me the use cases in which someone would use Postgres vs MySQL vs something like Cassandra or Convex? I know the latter two have their own types of SQL implementations, so why aren't they used as much, even if they are NoSQL databases?
  • j45 30 days ago
    It always struck me as odd to see the hard work put into NoSQL databases to make them relational.

    Somehow, doing that seemed like less work than learning SQL.

  • agentultra 30 days ago
    Hoo boy. I did this once. Only it was CouchDB. And they didn’t use any schema. Data had accumulated in it for years. Then a customer said they wanted their data exported to PowerBI.

    I used a library that generated a type that could parse any JSON document you threw at it by feeding it examples. So I wrote some tools to scan through the database, generate parse test cases, etc.

    I then wrote a library to version and migrate records.

    With this I wrote some tools to start extracting meaningful bits out of the big-blob type. I could scan the database, download examples that didn’t work, make test cases, fix those, ask the experts in the system, etc.

    Then eventually it started spitting out structured data into Postgres and from there into PowerBI.

    Decent article.

  • CoastalCoder 30 days ago
    Obligatory parody link: https://youtu.be/HdnDXsqiPYo
    • anticristi 30 days ago
      :)))

      I find the super-plain voice of the annoyed character further amplifies the humor.

    • WuxiFingerHold 29 days ago
      Talking to AI 12 years ago :-)
  • Semionilo 30 days ago
    Generally speaking: don't use mongodb.

    It was super shitty when it came out, and using postgres gives you a better and faster solution which you can also use in other projects.

    I'm always impressed when I see mongodb still making money.

    • b9b10eb736 28 days ago
      While I'd agree with the general stance of avoiding MongoDB for any new project, I find the statement that postgres always gives better and faster solution dubious. They don't really solve the same problems. If you happen to really need horizontal scaling, actual HA (not failovers) or documents with many field-level atomic operations, MongoDB might still be a better fit than postgres. For on-premise hosting, not having any HA out-of-the-box can be a major painpoint.
      • Semionilo 28 days ago
        Sharding does exist for postgres.

        What type of HA can Mongo do that Postgres can't?

  • onetimeuse92304 30 days ago
    Ah, the old "I used a wrong product for my problem and now I complain the product is bad because it does not suit my case." defence.

    MongoDB is a document database. It is not supposed to be good at relations. Also, lacking support from cloud providers and lacking experience with MongoDB is not MongoDB's problem; it is your poor decision-making. If you value those things, you should have taken them into account when you were choosing MongoDB in the first place.

    Now, I am not a big lover of MongoDB. Some years ago I was forced to use it and I had very low opinion of it as quite immature product. It is still far behind in maturity compared to something like Oracle database or PostreSQL, but in the meantime I learned to appreciate some of the things MongoDB is good at.

    I also admit that MongoDB's transactions are a total joke. It should be prominently placed in the documentation that you are using them at your own risk. I don't use MongoDB transactions anymore because there are better ways to architect your application with MongoDB without using transactions.

    I like MongoDB for the ease of use when rapidly prototyping things. I like the concept of a no-schema, document database and some of the additional features it provides to deal with the document. I like its reactive java driver which is a breeze to use to construct data pipelines quickly. I like change streams.

    In the end, I think it is good to have a selection of tools that are good at doing different things.

    It is our responsibility to choose the right tool for the job. If you chose poorly, don't try to fault the tool for it.

    • dangtony98 30 days ago
      Author of the article here!

      The article itself does not "complain the product is bad" but some other comments in this thread would certainly suggest so. Instead, the article says that the use-case evolved, states reasons for why MongoDB was no longer suitable (while at some point it was) and why PostgreSQL was chosen instead, and discusses the migration process involved in the transition and the results.

      Lacking cloud-provider support and lacking experience with MongoDB are indeed NOT MongoDB's problem. However, they are a problem for users of the platform who are trying to self-host it with MongoDB as a database dependency, which justifies the PostgreSQL migration, with PostgreSQL being a better candidate for this use-case.

      To be clear, the article is not saying that MongoDB is bad and no fingers are being pointed. I would in fact say that there is an array of use-cases where MongoDB is an excellent choice (e.g. I resonate with the point on rapid prototyping).

      Please don't twist the words.

      • finaard 30 days ago
        To me the article read more as saying that PostgreSQL might have been the better choice from the beginning - but due to lack of experience in the team, MongoDB was "good enough" at that point.
        • tkellogg 30 days ago
          I recall when my team chose MongoDB ~2011, Postgres & friends didn't have JSON columns, so there was a lot of extra data modelling that probably was unnecessary.

          The biggest use case for MongoDB was for huMongous data. Obvs MongoDB was a good fit, because of the name.

          • jerf 30 days ago
            Yes, I think it's reasonable to say the NoSQL community drove the JSON columns in the relational databases. And while the JSON-ness of said columns is not necessarily the relevant bit to me, having the escape hatch of being able to just sort of splat out a "other" column has been something I've used several times, and it has relieved some non-trivial stress and strain in relational designs in my opinion. It's not even that hard to migrate something out of the JSON column if you want to raise it to a full column later, and JSON indices mean you may not even really need to.
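            A minimal sketch of that escape-hatch pattern (using Python's sqlite3 and SQLite's JSON1 json_extract as a stand-in for Postgres JSONB, purely so the snippet is self-contained; the table and field names are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The "other" column: a catch-all JSON blob next to regular columns
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, extra TEXT)")
conn.execute("""INSERT INTO events (extra) VALUES ('{"kind": "signup", "plan": "pro"}')""")

# Query inside the blob, much like Postgres's extra->>'kind'
kind = conn.execute(
    "SELECT json_extract(extra, '$.kind') FROM events"
).fetchone()[0]
print(kind)  # signup

# Later, promote a field that turned out to matter into a real column
conn.execute("ALTER TABLE events ADD COLUMN kind TEXT")
conn.execute("UPDATE events SET kind = json_extract(extra, '$.kind')")
```

            In Postgres the query side would be extra->>'kind', and an expression index on that covers the "may not even need a real column" case.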

            I have a hard time justifying a document database in 2024 when I have 2024 Postgres; I don't know how large the window is where MongoDB will do something Postgres won't, but I know my systems are not in that window. (Assuming it even exists, which I wouldn't be surprised it doesn't, but for the sake of argument I will assume it does.) But a decade ago that would be a much harder decision if I truly had a document-based use case.

            • jimbokun 30 days ago
              Horizontal scaling is still challenging in Postgres. That’s where “NoSQL” systems still have their place. Or one of the SQL systems layered on to a NoSQL architecture, with the trade offs that entails.
          • vbezhenar 30 days ago
            I still haven't had a single use-case for JSON columns yet. Sometimes I feel that JSON columns are made for people who don't want to properly design their domain model. Big mistake.
            • unclebucknasty 30 days ago
              Interesting point, but I wonder if there's a little confirmation bias and/or circular logic there.

              In other words, you don't see a use case for JSON columns because you don't think it's good DB design.

              I know I have used them after decades of RDBMS work, and it feels like I'm "cheating" a bit. But they really do make sense for some use cases that can be harder to achieve with traditionally normalized RDBMS alone. For instance, for configuration options or scenarios where different objects may have a few different attributes, but are largely the same.

              If you've ever used an entity-attribute-value data model or otherwise ended up with table proliferation or a bunch of nullable columns to accommodate the latter use case, you might appreciate the simplification JSON columns offer.

            • onetimeuse92304 30 days ago
              I think it is the same misguided discussion as strongly, statically typed vs weakly typed languages.

              (Mind, I mean the discussion is misguided, not you.)

              The problem is that the relational model is one solution to the question of when you want to pay the price of enforcing a model on your data. A relational database is when you pay all the cost upfront.

              With MongoDB I have other options.

              It does not mean they are necessarily better options. But some of them are better in some situations, for example when rapidly prototyping. Or when you need to preserve data in different formats as the application evolves.

              Personally, if you can, you probably should model your data upfront. But it is good to have other options, sometimes.

            • abound 30 days ago
              The main use case I find myself reaching for JSON/JSONB fields is unstructured map-like metadata, like KV tags on a resource, where the data model/domain has no expectation that any given field is set.
          • wlll 30 days ago
            https://www.postgresql.org/about/featurematrix/#json

            You're right, 9.4/9.5 were when JSONB was introduced and expanded (released 2014/2016).

            That said, it's a big decision to go with two very different technologies (relational vs document store) just for a column type; storing JSON must have been a pretty major product feature?

      • lloydatkinson 30 days ago
        I have noticed in recent times that comments on here have been getting more and more like this. An article I posted a few weeks ago got the same treatment even though I reiterated the same points multiple times, as I anticipated these kinds of comments from people who didn't read it fully.
    • thih9 30 days ago
      To be fair, MongoDB’s PR team did their best to present it as the solution to all problems.

      > In 2012, 10gen’s VP of Corporate Strategy Matt Asay argued “there will remain a relatively small sphere of applications unsuitable for MongoDB … the majority of application software that developers write will be in use cases that are better fits for MongoDB and other NoSQL technology … Those functions that really help a company innovate and grow revenue [will be NoSQL].” He would note that we were living in the post-transactional future.

      https://news.ycombinator.com/item?id=15124306

      • gwbas1c 30 days ago
        They were doing that back in 2010. I went to a few conferences where they promoted MongoDB as an all-purpose database that was the "next thing" compared to SQL.

        Once everyone realized its shortcomings as a general-purpose database, there was a gradual "oh shit this isn't working" migration back to SQL-based databases.

        I think the big problem is that the MongoDB programming model and APIs are very nice; we really need a database that is still relational "under the hood" but has an API that's more like MongoDB.

        • bashinator 30 days ago
          I believe one of the job interview problems at Mongo for software engineers, was to design a relational layer on top of the document store.
      • onetimeuse92304 30 days ago
        To be fair, almost all product PR teams do the same. I just ignore most of what they say in favour of my own opinion that I try to build based on reading actual technical documentation, experimenting with the product, etc.

        Anybody who makes tech decisions based solely on what PR teams say is naive and incompetent at best.

        • zzzeek 30 days ago
          MongoDB came into the community like a ton of bricks, taking over conferences, flooding the zone with mugs and t-shirts, and for a time there you couldn't view the front page of Hacker News without at least two MongoDB posts. The death of SQL, now known to be an inferior relic of the past, was a regular topic of discussion. There's a reason the "Web Scale" meme is so famous: that's what actually happened for a while there. SQL was an inferior, leaky abstraction, and ACID was not "web scale". It became all about the CAP theorem (which I hardly ever see anyone writing about these days).

          Only a year or two later, when teams that went all in on MongoDB at the behest of MongoDB's marketing department started realizing they'd been sold a bill of goods, did the slow and arduous march back to what continued to be the best generalized solution for 95% of data problems, the ACID-compliant relational database, begin. Given that, this blog post seems really behind the times.

        • thih9 30 days ago
          > To be fair, almost all product PR teams do the same.

          No, mongodb’s PR approach and results were largely uncommon at the time. They didn’t advertise directly but instead targeted and amplified dev communities - as detailed in the article linked earlier.

          The result was that people not associated with mongodb talked about mongodb at various dev conferences and in blog posts. People didn’t want to listen to PR teams then too, but they followed their peers.

          • onetimeuse92304 30 days ago
            Again, I will stick to saying that if you use technology you have no experience with, the problems resulting from it are on you.

            Testing out technology on important projects is reckless and is a risk a lot of people take but push away any responsibility for.

            As a professional, before I suggest using a technology on a project, at the very minimum I first spend some time with it to learn the ins and outs. Every technology has cons. Just because you don't know them does not absolve you from responsibility for them.

            The reality is that most applications are boringly similar once you define the problem in simple words. There is very rarely a need to use new technology. It is my experience that it is usually better to use technology that is not necessarily cutting edge but one that you know well and that you can easily hire people who know it well.

            No technology is perfect, but it is much better to know the cons ahead of time and to know you will have people who understand the problems and can find reasonable solutions for them.

            I am personally sick and tired of people who first advocate for new technology to be added to the project, touting its supposed benefits, not knowing the cons, then don't want to take responsibility once we actually find the dead bodies.

            • rokkitmensch 30 days ago
              The funny thing is that they chose Mongo precisely because they were familiar with it...
        • taffer 30 days ago
          As a counterexample, Azure Cosmos and Amazon DynamoDB are marketed as specific solutions for specific problems. RDBMS like SQL Server or Postgres are marketed as general-purpose solutions because they are in fact general-purpose solutions. MongoDB, on the other hand, is a specialized "document database" marketed as a replacement for relational databases.
        • benterix 30 days ago
          > To be fair, almost all product PR teams do the same.

          As it is their "duty" to twist the truth in this way, it is our role as the users of these technologies to present things as they are.

        • swasheck 30 days ago
          mongo was absolutely one of the most egregious (daresay dishonest?) purveyors of such myth. it was quite off-putting for me and my team.
      • liquidgecka 30 days ago
        In 2010-2011 time frame the mongo team sponsored an effort inside of Twitter to replace MySQL with mongo. There was a group of us that worked to migrate the tweet store to Cassandra that were able to talk with leadership and get that initiative killed. Turns out migrating HIGHLY structured data into mongo was never a good idea, especially at that scale.
    • pydry 30 days ago
      The issue isn't that mongodb isn't suitable for some problems. I've used it before and it has not caused issues. It's that the problems it is suitable for are a subset of those that postgres is suitable for.

      I can just as easily rapidly prototype in postgres. If I want to store schemaless JSON in postgres I can easily do that.

      >In the end, I think it is good to have a selection of tools that are good at doing different things.

      In the end I think every tool needs a niche - something it is good at that its competing tools are not good at.

      Mongo doesn't have that.

    • CuriouslyC 30 days ago
      If mongo is just a document database, there's really no reason to use it over elastic. The query story is slightly nicer with mongo, but we're not doing relational algebra here, right? Elastic crushes mongo at literally everything else.
      • LtWorf 30 days ago
        ES had a bug where a syntax error in a query would send it into a weird state in which it would then give wrong answers to all queries until the process got rebooted. Which of course isn't fast.

        I reported the issue, and thankfully I changed job and never had to deal with ES again. After a few years they contacted me to ask if the issue was still there, I said I had no idea.

        I have no idea if the issue is still there.

      • finikytou 30 days ago
        I guess Elastic is a heavier setup for some POCs.
        • CuriouslyC 29 days ago
          That might have been true in the past but it's not hard to PoC with elastic using docker compose.
          • synthc 29 days ago
            Elastic is much more of a pita to maintain and monitor than Mongo
    • taffer 30 days ago
      So what is a good use case for Mongodb? I have never seen an application where a "document database" would have been a good choice. In every project where I have seen Mongodb, it has turned out to be the wrong choice.

      Even for prototyping there are many other good choices since RDBMS like Postgres have implemented JSON support. Mongodb looks like a solution in search of a problem.

      • joneholland 30 days ago
        Applications that are essentially form wizards are a great fit for a document database.

        Think application forms etc.

        • wlll 30 days ago
          Is there a benefit of Mongo for that over just Postgres with JSONB columns? You're still storing JSON in Postgres but you'd get the relational aspects too for things like users having many forms, billing and account relationships etc.
          • unclebucknasty 30 days ago
            No benefits that I can think of at the database layer. Postgres's addition of JSONB columns represents the best of both worlds. Funny to think that all of that noise about nosql replacing rdbms was essentially nullified by the addition of one column type.

            Some people do like the MongoDB API, however.

          • joneholland 30 days ago
            Easier horizontal scaling and organizational inertia would be the main reasons to use mongo over a jsonb column. I wouldn’t introduce it to a psql shop if they are already great at running psql.
    • forgotmyinfo 30 days ago
      Sorry, 100 times out of 100, "rapid prototyping" means "we built this with the wrong database, now we're stuck with it". If it isn't obvious how to store your data up front, then you either aren't planning your software well, or you don't actually know what you're building.
      • wlll 30 days ago
        I was also a bit confused by that part. What is it about Postgres that prevents rapid prototyping? It's not like the schema is set in stone, or even particularly hard to change once you've created it.
        • benterix 30 days ago
          Well, yes and no. If you already have your database running in production with many transactions and live connections 24/7, changes to the schema might not be hard per se but always need careful planning and execution. Additionally, in Postgres a change like adding a column locks the whole table (although I hear this is going to change), so writes are off for a short time. If this becomes unacceptable, you go for a blue/green deployment, which has its own gotchas.

          So, while I agree with your main point - there is nothing in Postgres that prevents rapid prototyping, and I would choose it over Mongo any moment - I understand why some people might prefer the more "dirty" approach.

          • wlll 30 days ago
            True enough, and the solutions for lighter-weight schema changes have evolved and weren't always so good, but early stage startups often don't really have the sort of data-weight issues that make this hard, unless they're starting with large data sets already. Even at the scale of small hundreds of millions of rows per table (like the company I'm in charge of the database for) it's not much of an issue.

            I had to look it up, apparently adding a column with a non-null default was "fixed" in PG 11 (2018), but with a null default it had been fast for a while:

            https://www.depesz.com/2018/04/04/waiting-for-postgresql-11-...

    • JJMcJ 30 days ago
      > MongoDB is a document database.

      Thank you. A one sentence clarification of the issues in Mongo vs RDBMS.

      • benterix 30 days ago
        It's a bit more nuanced than that. PostgreSQL supports JSON/JSONB data types so if you need, you can use it in a way similar to MongoDB, with less structure and the remaining caveats, but with transactions and all other goodies working out of the box and being able to use standard RDBMS features if needed.
    • teaearlgraycold 30 days ago
      > I like MongoDB for the ease of use when rapidly prototyping things.

      I think this is similar to when people say writing tests speeds them up through TDD, while others don’t write any tests at all. What I take away is that you can do it either way as long as you’re bought in and are familiar with the process.

      For me it’s no trouble at all to define my schema in advance. And Postgres provides me with the JSON column escape hatch. Combined with the massive benefit of the data validity guarantees later in the app’s lifecycle, I pick Postgres every time.

    • jeff-davis 30 days ago
      "It is our responsibility to chose the right tool for the job."

      That perspective doesn't work well for database products, in my opinion. There is huge pressure for databases to evolve with your business and applications and to adapt to whatever you throw at them.

      Swapping out a database product is less like changing tools and more like changing a foundation. You can't do it every time a new problem arises.

      That's not to say you can't use a few different products if that makes sense. But that has its complications.

    • 999900000999 30 days ago
      >As part of that stack, we chose MongoDB + Mongoose ORM because the combination presented least overhead and allowed us to ship quality features quickly. As Sir Tony Hoare states, “premature optimization is the root of all evil,” and there was certainly no need for further optimization at the time.

      Looks like the tool did a great job while they were getting started. Mongo is very easy to hack something together with; later on, it looks like they just needed to migrate to something more stable.

      • swasheck 30 days ago
        yeah ... in the early days of mongo it was pitched as a replacement for relational because a) it had faster writes without those pesky joins and constraints, and b) because it's too hard to model data, so chances are it's just unstructured. the onslaught of "relational is dead" and "joins are annoying" was tremendous
        • 999900000999 30 days ago
          Plus there's a good chance they didn't know exactly what they needed to build when starting out.
      • jimbokun 30 days ago
        Postgres and MySQL are easy to hack something together with, too.
        • 999900000999 30 days ago
          That's assuming you're experienced with them; if you know next to nothing about databases, Mongo is a much better choice. In this case it looks like when they got started the team just didn't have all that much experience with databases.
      • forgotmyinfo 30 days ago
        Nonsense. Choosing the correct database isn't "premature optimization", it's the bare minimum of being a competent programmer. Why do we never have time to do it right, but we always have time to do it over?
        • 999900000999 30 days ago
          It's generally faster to get started with Mongo.

          If you're talking about a startup, you might not even get to the let's do it over stage. You might have six months to just hack something out so you can raise more funding, and in that case Mongo is a great choice.

          The needs of your business might also just evolve over time.

    • unclebucknasty 30 days ago
      To be fair, it burst onto the scene back in the "all you need is NoSQL...RDBMS is dead" era.

      So it became one of those software dogma things and I remember having "debates" right here on HN, including around the necessity of transactions.

      So it's kind of ironic to see people now chastising those who fell prey to the dogma. Full circle.

    • throw_m239339 30 days ago
      > MongoDB is a document database.

      So can PostgreSQL be with JSON types and queries.

  • renegade-otter 30 days ago
    The irony of it all is that MongoDB actually matured and you do not have to jump through hoops to have, say, transactions.

    I am ALL for Postgres, I even wrote a post about the importance of being fluent in relational databases: https://renegadeotter.com/2023/11/12/your-database-skills-ar...

    What grinds my gears is the whiplash caused by this frantic stampede from one hype to the next.

    I am just waiting for the AI hype to blow over and for everyone to rediscover Hadoop, or something.

    • thiht 30 days ago
      Postgres is not hype though. There are very few legitimate use cases for not using an RDBMS as a main data store, and Postgres happens to be the most popular one nowadays, for good reasons.
      • jimbokun 30 days ago
        In my opinion the use case is when your data no longer fits on a single, potentially very large, server. Automatic distribution and replication and rebalancing of data is a tricky problem and something like Cassandra handles those very well.

        Or Cockroach DB or similar, if you still need relational capabilities.

      • circusfly 30 days ago
        I've nothing against Postgres, but MySQL is a wonderful option that is likely still in far higher and heavier use when considering open source databases.
        • kstrauser 30 days ago
          In my mind, MySQL no longer exists. It’s a weird Oracle thing that I wouldn’t touch with a lineman’s pole. MariaDB is what happened after MySQL disappeared.
        • LtWorf 30 days ago
          I think it gained popularity over a decade ago, when it was faster because it was unsafe.

          In the end postgres is the better one. With mysql the defaults are bad.

        • wlll 30 days ago
          I wrote this in another comment, but it's relevant here I think:

          " I do scaling and performance work, mostly with Rails apps, but a significant amount of the work is database level and not language specific. I've used both postgres and MySQL (and a few other databases) going back to 2000.

          The best thing I can hear from a company when I start is "We use Postgres". If they're using postgres then I know there's likely a far smoother path to performance than with MySQL. It has better tooling, better features, better metadata. "

          Right now I would not choose MySQL over Postgres at all, ever. I can't think of a single way it is materially better.

      • oldsecondhand 30 days ago
        In my experience MySQL is more popular. I've worked at several companies that used MySQL, none that used Postgres.

        Kids at universities still use XAMPP to learn databases.

        • lutoma 30 days ago
          In my experience MySQL is still commonly used in the PHP world, but everything else is mostly Postgres
      • bradleyjg 30 days ago
        The number one reason being that it’s free. In a purely technical comparison it is not at the top of the list.
        • simonw 30 days ago
          What's top of the list technically?
          • bradleyjg 30 days ago
            Oracle
            • pmarreck 30 days ago
              Oracle is a giant mess, is the sense I got from trying to work with it in the past.

              While I am loath to recommend any Microsoft product, SQL Server is at least coherent, and has actual transactions like Oracle (and not just "checkpoints")

              • paulryanrogers 30 days ago
                Aren't MSSQL and Oracle just doing redo logs, like MySQL?

                What do you consider not actual transactions?

                Postgresql keeps dead tuples in place until vacuumed, which seems optimized for rollbacks. But isn't so bad if you're inserting much more than deleting or updating.

                • pmarreck 29 days ago
                  Nestable transactions = “real transactions” to me, I guess.

                  I almost don’t see the point in non-nestable transactions

                  • paulryanrogers 29 days ago
                    SQL standardizes SAVEPOINT for such nesting. Is that not sufficient? FWIW Postgres supports it.
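                    A minimal sketch of that nesting (shown here with SQLite through Python's sqlite3, since SAVEPOINT semantics come from the SQL standard; Postgres behaves the same way):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.isolation_level = None  # take manual control of transactions
conn.execute("CREATE TABLE t (v TEXT)")

conn.execute("BEGIN")
conn.execute("INSERT INTO t VALUES ('outer')")
conn.execute("SAVEPOINT sp1")                  # the "nested transaction" begins
conn.execute("INSERT INTO t VALUES ('inner')")
conn.execute("ROLLBACK TO SAVEPOINT sp1")      # undo only the inner work
conn.execute("COMMIT")                         # the outer insert survives

rows = [r[0] for r in conn.execute("SELECT v FROM t")]
print(rows)  # ['outer']
```

                    RELEASE SAVEPOINT is the matching "inner commit" when the nested work succeeds.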
            • VWWHFSfQ 30 days ago
              Anybody that has ever used Oracle knows how nice it is. But very few people have that experience because their company isn't spending millions of dollars on the database server.
              • bradleyjg 30 days ago
                Yes, that was my point.

                I get that working with Oracle and their army of lawyers is not cheap or pleasant but the db is still excellent.

      • darby_eight 30 days ago
        > There's very few legitimate use cases for not using a RDBMS as a main data store

        I don't think that's true—all the old cases that were legitimate before are still legitimate. The price of both persistent storage and memory has simply come way down to the point where many computational workloads are viable under vertically scalable databases again. I suspect the pendulum will swing the other way one day yet again (and of course there is a rich ecosystem of ways to query horizontally, if not transactionally, with SQL that will likely temper this swing even further).

        • wlll 30 days ago
          The price of persistent storage and memory was the same for Mongo as it was for Postgres back when the NoSQL movement happened. Mongo wasn't made of magic scaling fairies and still needed resources. As soon as you cranked up the safety of Mongo to a nearly acceptable level, its performance fell through the floor. The only reason people thought it performed amazingly was Mongo's rather deceptive marketing.

          I use the term "nearly acceptable" because for a long time Mongo's data consistency was absolutely crap (https://jepsen.io/analyses)

          Personally I think people used Mongo because it was new, shiny and exciting and it promised performance (though it didn't really deliver for most people).

          • darby_eight 30 days ago
            I am certainly not going to defend mongo ever having been a good choice; I was referring to other "NoSQL" databases (namely, the large number of highly scalable key-value stores).
            • thiht 29 days ago
              But you wouldn’t use a key value data store as your main store. It’s a specialized kind of store. Of course Redis and friends have their place. Just not as a main store.
        • bradleyjg 30 days ago
          In addition to changes in ephemeral and persistent memory, the other big difference between now and nosql’s heyday is improvements in distributed relational transactional databases (newsql).

          We haven’t exactly squared the circle on CAP, but we’ve certainly bent it.

    • sph 30 days ago
      Postgres is no hype; it was already in the plateau of productivity when MongoDB came onto the web-scale scene.

      Hadoop, just like Mongo, had its hyped time in the sun, but RDBMS are far more advanced and versatile than either of them ever was.

      • renegade-otter 30 days ago
        I am not calling Postgres hype - it should never have been NOT hype. It's a reasonable default for most problems.

        Now all I read about is how Postgres is awesome, as if it's this great new thing. I guess that makes sense, as the new generation of engineers is rediscovering stable, reliable, lean technologies after a decade of excesses with "exotic" tech.

        For grey beards, it's all very odd. Like, "where have you all been?"

        • arp242 30 days ago
          I remember Robert C. Martin ("Uncle Bob") going on about how NoSQL would replace literally all SQL and how there is literally not a single use case for relational data and SQL. I wonder if he ever walked that back.

          Now, my opinion of Martin in general is not especially high and he's a bit of a controversial figure, but it wasn't just the kids. And Martin is also all about reliable software, so that makes it even more surprising.

          • belter 30 days ago
            I also remember Michael Stonebraker basically saying something similar but being ignored... Along the lines of "I will see you all MapReduce and NoSQL people in 20 years, after you have all reinvented SQL...."

            Not an actual quote but a factual statement

            • jimbokun 30 days ago
              Because he realized you can parse SQL and generate the equivalent of a map reduce job on the backend to execute it.
          • rjbwork 30 days ago
            He's always been fundamentally a salesman. Hopping on the hype trains lets him sell more.
            • pydry 30 days ago
              He's always been fundamentally a zealot who sees most narratives in black and white terms. It's an extremely unfortunate characteristic of a software engineer.

              It's the same reason he's so pro Trump I think.

              • arp242 30 days ago
                I didn't even know about that; I checked his Twitter, and first thing is a reply to the Baltimore bridge collapse:

                > Doors and wheels fall off airplanes. Hundreds of illegal immigrants assault and overrun border guards in Texas. We can’t even muster support for our allies in the security council. But at least our pronouns are in order.

                https://twitter.com/unclebobmartin/status/177262869295654553...

                You really need to be a special sort of individual to go from a collapsed bridge to ranting about pronouns with no prompting or any prior mention of that. What a complete spanner.

                • forgotmyinfo 30 days ago
                  That happened to esr, too, although he derailed right after 9/11, and it's been downhill ever since.
                • zo1 30 days ago
                  From the other side, people like me, see your odd dismissive behaviour in the same light. This bridge and hundreds of other things going wrong throughout the world atm are all symptoms of some sort of rotten problem, and I honestly don't know why you guys are so adamant about dismissing it. Almost as if you want to see it all come crashing down.

                  Pure evil, as Trump would call it.

                  • arp242 30 days ago
                    Yes, a single accident of the type that's been happening since bridges were invented, pronouns, and unconditional support for an out of control government in Israel are all deeply related...

                    Or he's a hyper-politicized idiot completely consumed and obsessed by some topics and needs to talk about them all the fucking time at the drop of a hat, whether appropriate or not.

                    It has nothing to do with "sides"; it has to do with the forceful injection of completely unrelated matters.

                  • rjbwork 30 days ago
                    >of some sort of rotten problem

                    Agree. Out of control neo-feudal lords taking up all the slack out of every system they can get their tendrils into in the name of efficiency, thus endangering us all in order to extract as much profit as possible out of our entire economic system.

                    The "other side"'s solution tends to be to further deregulate industry and the economy, and to empower and lower the tax burden of these neo-feudal lords, all while pointing at gays, trans-folk, and immigrants as if they are the reason people can't afford rent and bridges are falling down.

                  • renegade-otter 30 days ago
                    Another man opens the morning newspaper, and after scanning the headline, declares: "Honey, this planet is going to hell".
                  • jimbokun 30 days ago
                    Is the rate of these incidents increasing, decreasing, or about the same?
                    • zo1 30 days ago
                      Obviously you can say I'm biased but I think they're increasing.
                  • pydry 29 days ago
                    >From the other side, people like me, see your odd dismissive behaviour in the same light. This bridge and hundreds of other things going wrong throughout the world atm are all symptoms of some sort of rotten problem

                    Just one problem, huh?

                    This illustrates what I meant about black and white thinking.

                    • zo1 29 days ago
                      Not sure what angle you're going with. If you want to say that there is more than one rotten problem at the core of society, go for it. I'd be the first to agree with you.

                      This is not black and white reasoning. Each problem can be on a continuous plane, from "slightly bothersome" to "there is an absolutely horrible evil lurking here".

              • wesselbindt 30 days ago
                The reason he says he voted for Trump is taxes. But that seems like a pretty flimsy reason. Have dems in the past couple decades ever reverted any republican-instated tax cuts for the wealthy/corporations, or even made any moves for tax bumps? In my experience dems are just republicans, except they wave a rainbow flag every now and then.
          • GiorgioG 30 days ago
            Unpopular opinion, Robert C Martin has caused more harm than good. People take his ideas and blindly follow them no matter the cost. I’m looking at you clean architecture.
            • forgotmyinfo 30 days ago
              Well yeah, solving problems and thinking critically and learning are hard and boring. "Uncle Bob" gives you handy aphorisms you can spam on your next pull request and then feel like you've done something.
            • arcbyte 30 days ago
              There's definitely a missing piece he hasn't explained well enough. I've seen some of the best code in the world written as clean architecture, and I've also seen some truly atrocious code in the style of "clean architecture".

              To me the difference was in recognizing that the architecture advice is implicit - it's not actually written down in the code, rather the code exemplifies it - and the business domain is explicitly modeled. When developers name everything "use case" and "interactor" and go crazy with packages and folder structures with architectural names it's a recipe for terribleness.

          • forgotmyinfo 30 days ago
            That's as ridiculous as saying "integer division will be useless in 20 years". Sometimes relational databases really are the optimal solution to certain problems. I really wish CS had better authority figures, although we shouldn't need them in the first place.
          • Shorel 30 days ago
            His words: "I am also self-taught. I never got a degree of any kind. I entered the industry at the tender age of 17 and never looked back. Frankly, school isn’t all it’s cracked up to be — especially in the software field."

            https://twitter.com/unclebobmartin/status/107212575854868889...

            Which means his opinions are more often than not uninformed opinions and should be taken with a grain of salt.

            • fijiaarone 29 days ago
              When you turn 23 in a couple years, you’ll look back at college and realize you weren’t at the apex of all knowledge back then.
            • sph 29 days ago
              I think you overestimate college education a little too much. Having a degree doesn't make your opinions automatically better than those of people who didn't have the chance to get one. You wish it did, but it doesn't.
              • Shorel 23 days ago
                My remark is not about the quality of college education. It's about his mindset.

                You don't need a college education to read research articles, books and journals. But you certainly need an open mind and some discipline to not try to reinvent the wheel all the time. You need to learn about the state of the art in your field.

                Some ideas and algorithms are proven and battle tested, and people should learn about them, before trying to create something new and claim it is much better than the old thing. This is very noticeable in some fields like cryptography.

                His modus operandi certainly seems to be on the other extreme, everything old is bad and the new thing around the corner is going to change everything. Which causes a lot of people who believe him to collectively waste millions of hours of their time.

                About your comments on the degree... it really depends on how you got it. For most people, I would agree with you. A small group of people in academia are insanely smart and are focused on research.

        • tomnipotent 30 days ago
          Postgres as we know it has only existed for about a decade, since the post-9.x era in 2010-2014 when many of its lauded features were added. Replication, heap-only tuples, extensions, foreign data wrappers, JSON, leveraging multiple indexes, parallel index scans, reindex concurrently, JIT compilation, declarative partitioning, stored procedures, and major WAL improvements are all "recent".

          I love Postgres and it's been my go-to default since 2014 and you can pry it from my cold dead hands, but it's more contemporary with Perl 6 than Perl 5 if we're talking grey beards.

          • Sesse__ 30 days ago
            Extensions and stored procedures are not; they've been around for longer.

            I do not actually use any of the others you listed (well, occasionally bitmap index scans, but it's not a killer). The main difference between 8.x and 15.x for me is the small stream of optimizer features and tweaks (e.g. at some point, it started being able to reorder outer joins).

            • tomnipotent 30 days ago
              Postgres 8 supported functions capable of returning tuples (RETURN QUERY), but lacked transaction control and other features that CREATE PROCEDURE added in Postgres 11 (2018).

              Extensions in Postgres 8 (and earlier) were also half-baked compared to 9.1 (2011), which added CREATE EXTENSION and other quality-of-life improvements including standardization.

              The latest complete implementations are what most people have in mind when talking about them.

              • Sesse__ 29 days ago
                Sure, there are improvements, but it's not like Postgres was a radically different beast back then.

                Would I pick Postgres 15 over Postgres 8? Yes, any day of the week. But I would also readily pick Postgres 8 over most of today's databases (including e.g. the latest MySQL and MongoDB versions), for most projects, if that was the choice I had.

                • tomnipotent 29 days ago
                  I would not have risked my business (or budget) on Postgres 8, even with Slony-I. My appreciation for its well-written code and thoughtful architecture simply did not outweigh what it lacked for a reliable operations experience.

                  MySQL foot guns were mostly under control by 2010, and it was operationally efficient to manage a cluster of MySQL servers in diverse configurations with mostly-known boundaries to what could go wrong.

                  The Postgres foundations are the same today, but what you can build on top is very different today than it was pre-9.x.

                  • sph 29 days ago
                    I operated MySQL clusters in the late 2000s. I am still surprised how many more hoops you have to jump through to set up replication in PostgreSQL compared to what you could do - out of the box - with MySQL 15 years ago.
                  • Sesse__ 29 days ago
                    > I would not have risked my business (or budget) on Postgres 8, even with Slony-I.

                    This assumes you need replication, though. You can get pretty far without it, especially given how wide hardware you can get now.

                    > MySQL foot guns were mostly under control by 2010

                    Eh. :-) I think you would have to specify which footguns in particular.

                    • tomnipotent 29 days ago
                      If you're doing tens of millions in revenue and the RDBMS is critical to your operations, I'm a firm believer you need replication. Keep in mind this was the era of spinning rust and a day-old backup could take hours to restore even with a SAN/DAS. A replica was for more than just distributing load, it was for resilience to keep the business healthy and avoid a SPOF.

                      Risks to the business are more important to me than my enthusiasm for a technology or product.

                    • turtles3 29 days ago
                      Replication can be for HA, not just for scale. All depends on your business requirements.

                      Also replication can be good for other operational reasons, such as zero downtime major version upgrades. Again depends on the business need/expectations.

        • sph 30 days ago
          Excited, vocal users make the most noise.

          Somehow the entire MongoDB era passed me by and I never used it once. I used to use MySQL in the 2000s, and switched to PostgreSQL in the 2010s.

          • LtWorf 30 days ago
            They used it at my 1st job, a startup.

            They had an ORM which made everything much slower, and the most common operation they did was to add items into lists, which causes the whole document to be resized and requires it to be moved.

            It could have been done with 1 sql server, but we needed 4 really heavy mongodb servers instead. And of course our code was buggy so the schema was all over the place.

          • benterix 30 days ago
            You haven't missed much TBH. I think the only case where you'd use one in new projects is DynamoDB, if you work with AWS.
            • finikytou 30 days ago
              Dynamo is not mongo?
              • benterix 30 days ago
                No, DocumentDB is Mongo. From what I understand, the parent referred to the NoSQL hype era, not just Mongo, which was probably the most popular one at that time.
          • jimbokun 30 days ago
            My company considered it, but went with Cassandra instead.
        • forgotmyinfo 30 days ago
          This is what happens when you hire 20-somethings who don't know how computers actually work to build your entire business. You either learn about relational databases in school, or through experience. Relational algebra has been around for half a century, at least. If someone doesn't know about its utility, it's a failure of the education system, or a failure of corporate incentives.
        • benterix 30 days ago
          > For grey beards, it's all very odd. Like, "where have you all been?"

          To be frank, PostgreSQL has evolved a lot since the late nineties. There was a time when people preferred MySQL over it, as it seemed to work faster, certain things were easier, and so on.

          • hibikir 30 days ago
            I used postgres in the 90s. The biggest complaint was that postgres's default config needed a change so you could connect to it remotely, as the default, wisely at the time, decided that allowing connections from any IP by default before you changed any security setting whatsoever was a real problem.

            The MySQL performance advantages were large, with the small caveat that it was fast as long as you were using MyISAM, which achieved its speed by throwing away many safety features, like working transactions and foreign key constraints. Unsurprisingly, when eventually it got an engine that didn't throw caution to the wind, the performance advantage went away.

            I spent a lot of time in the early 2000s showing people that many of the issues they were facing with their e-commerce websites were ultimately down to the MySQL engine they had picked. But it's faster! Yes it is, but now your application code has to deal with all the things the engine has offloaded to your application code.

            It's not that Postgres' changes were key to increasing market share, but that as MySQL matured, it had no options but to lose its 'advantages'.

            • kstrauser 30 days ago
              That mirrors my experience. MySQL was faster for use cases where data loss was acceptable. Will anyone die if you lose some web log analysis stats? Probably not. You wouldn’t want to use it for anything where guaranteed consistency was critical, like anything involving money, but it was an alright replacement for things like an in-memory data store.

              I say “was”. I hear it’s vastly better now.

              • gkbrk 30 days ago
                If you want a fast DB for web log analysis, you shouldn't look at MySQL or PostgreSQL anyway. ClickHouse will run circles around both of them for the use case of web log analysis.
                • kstrauser 30 days ago
                  That was an example of the sort of thing that someone might reasonably have used MySQL+MyISAM for in the past. I wouldn’t use that combination at all today.
          • Shorel 30 days ago
            There was a time PostgreSQL did not run under Windows, and that is IMO what gave MySQL the market share edge.

            Without those Windows-less years, PostgreSQL would be the most used RDBMS right now.

          • jimbokun 30 days ago
            MySQL was more popular for a while, but from a technical standpoint Postgres was always ahead.
          • fijiaarone 29 days ago
            PostgreSQL didn’t exist in the 1990s. It was called Ingres. Postgres started as a personal project and was first released in 1999, but was unusable. Around 2004 the project started getting popular and hipsters started screaming that MySQL didn’t have stored procedures and wasn’t fully ACID. Acid became a buzzword for PG fanboys, even though data corruption and memory leaks plagued the system for the first decade. It became stable around 2008, as long as you didn’t mind restarting your database every few days. PostgreSQL didn’t really become a viable option until around 2010.
        • rendall 30 days ago
          ...and, continuing the cycle next season, devs will (re)discover NoSQL and document DBMS. I do wonder if this phenomenon of trendy tech is somehow integral to advancing the state of the industry, the SWE equivalent of early adopters. "Rediscovering the old" is a side effect, as old tech is used and reinterpreted in new ways. Kind of neat, really.
          • Retric 30 days ago
            There’s an entire industry of people selling knowledge of the “next great thing/the one true way," but they need a new thing every few years as people learned whatever the old thing was.
      • nemo44x 30 days ago
        MongoDB is more popular than it ever has been. You can look at their quarterly filings and learn this. The NoSQL space is larger than it ever has been and is the fastest growing segment in the database world.
    • bdzr 30 days ago
      > The irony of it all is that MongoDB actually matured and you do not have to jump through hoops to have, say, transactions.

      Are transactions still limited to a single document in Mongo?

    • debarshri 30 days ago
      Generally with these migrations, the biggest problem in my experience is that all the operational learnings are not transferable.

      You often want to create expertise and achieve operational excellence with the tech that you start out with.

      • renegade-otter 30 days ago
        Back in the day we used to say "you choose your database once". You can swap in/out a lot of things, but your data store is really the core.

        It is shockingly hard to change your storage engine once you are serving customers, so you really want to get that right.

        • cpursley 30 days ago
          Exactly, languages and frameworks come and go - but get the data layer right and you’re set.
  • cpursley 30 days ago
    The one unsolicited piece of tech advice I always give is “Just use Postgres”.

    Postgres Is Enough: https://gist.github.com/cpursley/c8fb81fe8a7e5df038158bdfe0f...

    There’s some Mongo/json alternatives in the list if you really need unstructured data.

    And there’s a huge plug-in ecosystem as well for just about anything you could imagine.

    • kqr 30 days ago
      But isn't there a point before which "just use sqlite" is more appropriate advice?
      • lysecret 30 days ago
        I love sqlite as much as the next guy, but as soon as you are working with multiple write processes (which you quickly will be), you will have to move off it. So, why not just start with Postgres?
      • jimbokun 30 days ago
        SQLite is a replacement for structured data stored in a file. If your data store is a big XML or JSON or CSV, SQLite may be a better solution. For example, I’m using it for ETL tasks and it’s working well for this use case.

        If you need a networked database supporting many users, probably better off with Postgres.
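
        As a sketch of that file-replacement/ETL use (table name and data are purely illustrative), using Python's built-in sqlite3 module:

        ```python
        import csv, io, sqlite3

        # Load CSV-shaped source data into an in-memory SQLite database,
        # then query it with SQL instead of hand-rolled parsing.
        raw = "name,qty\nwidget,3\ngadget,5\nwidget,2\n"
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE items (name TEXT, qty INTEGER)")
        rows = [(r["name"], int(r["qty"])) for r in csv.DictReader(io.StringIO(raw))]
        conn.executemany("INSERT INTO items VALUES (?, ?)", rows)

        # The aggregation step that is painful over a bare CSV file:
        totals = dict(conn.execute(
            "SELECT name, SUM(qty) FROM items GROUP BY name ORDER BY name"))
        print(totals)  # {'gadget': 5, 'widget': 5}
        ```

        Swap `":memory:"` for a file path and you get the "structured data stored in a file" case: durable, queryable, and no database server to operate.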

      • WuxiFingerHold 29 days ago
        For server applications I don't think so (despite the recent hype driven by businesses marketing their cloud services). Managing multiple writes and durability needs the same or even more effort and attention than running PG or MySQL. And on top of that, SQLite is missing many often crucial features. For me, those are the type system and functions.
      • cpursley 30 days ago
        Sure, if you simply need a dumb store just for storing basic types that’s simple to operate. But SQLite is not at feature parity with Postgres.
      • __s 30 days ago
        sqlite is great for what it is

        But I recently ended up using sqlite on a site, & it's clear it makes a lot of trade offs to keep itself minimalist. I wish I had just used postgres

        That minimalism can be appropriate if you want something that works with a single file to track state & no external process

        • tristan957 30 days ago
          Have you tried using something like Turso? Curious how that experience would be for you.

          (I do not work at Turso)

          • __s 30 days ago
            Looking over it, I don't see why you'd use a managed sqlite service. Unless you started with sqlite & now you're stuck with its quirks

            Don't know, I lean pretty minimalist on things, like I'm using `hyper` directly on another website I work on, rather than picking axum or warp

      • madsbuch 30 days ago
        I am curious about this. Why do you think SQLite is simpler than Postgres, and in what setting is that the case?

        Sqlite comes with the headache of managing an attached volume.

        • J_Shelby_J 30 days ago
          Building an mvp or a single container app.

          I’m actually trying to get Postgres to work like this right now.

          SQLite being embedded is simpler to use and more performant in this environment as there is no IPC overhead.

          • madsbuch 30 days ago
            Why even use a database? Just assign a variable. All state is dead the second you destroy the container anyways.
            • jimbokun 30 days ago
              Well then single container plus one attached volume.
              • madsbuch 30 days ago
                Yep, my point was that this roughly has the complexity of setting up eg. Supabase.

                In conclusion: I would always advise Postgres over the approach of setting up a volume with sqlite.

                The second you need to move on, you need to get that pesky file out of that volume. With Postgres, you just connect to the database from any Postgres-compatible client.

  • glimshe 30 days ago
    Friends don't let friends use MongoDB. In my previous job, I got a big promotion in part for getting rid of MongoDB and virtually eliminating our database problems.
    • pydry 30 days ago
      Most of the time I've tried to convince people not to move away from mongo i failed.

      My only success was pushing back on switching from postgres to azure cosmos after Microsoft salespeople convinced some of our managers that we were making a mistake by not using their hot new toy.

      This industry...

  • antfarm 30 days ago
    I remember well how in the early 2000s everybody wanted to get away from their relational systems and NoSQL (Not only SQL) databases were the latest fashion. Looks like we have come full circle.
    • Dudelander 30 days ago
      A fad that set back the industry by a decade.
      • trynumber9 30 days ago
        Not sure on that, it made some of the SQL solutions innovate quite a bit.
        • forgotmyinfo 30 days ago
          What did relational databases need from all the document DBs du jour? Broken transactions? Lack of ACID guarantees? Devil-may-care schemas?
          • rsanek 30 days ago
            yep, the last one -- Postgres' support for json / jsonb is way better now.
    • jpalomaki 30 days ago
      Relational databases have also evolved. With JSON types you can now easily mix relations and documents.
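
      A minimal sketch of that hybrid pattern: fixed columns for the fields you always query, plus one JSON "document" column for the irregular rest. Shown here with Python's built-in sqlite3 (whose bundled JSON1 functions are available in most builds); in Postgres the equivalent query would use a jsonb column with ->>. Table and field names are illustrative.

      ```python
      import json, sqlite3

      # Relational columns plus one JSON "document" column.
      conn = sqlite3.connect(":memory:")
      conn.execute(
          "CREATE TABLE events (id INTEGER PRIMARY KEY, kind TEXT, payload TEXT)")
      conn.executemany(
          "INSERT INTO events (kind, payload) VALUES (?, ?)",
          [("login", json.dumps({"user": "ada", "mfa": True})),
           ("login", json.dumps({"user": "bob"}))])

      # Filter on a field inside the document; json_extract is SQLite's JSON1.
      # (Postgres: SELECT payload->>'user' FROM events WHERE kind = 'login')
      users = [row[0] for row in conn.execute(
          "SELECT json_extract(payload, '$.user') FROM events WHERE kind = ?",
          ("login",))]
      print(users)  # ['ada', 'bob']
      ```

      Note the second row omits the "mfa" field entirely; the schemaless part lives inside the document column while the relational part stays typed and indexable.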
      • Aloisius 30 days ago
        Couldn’t you do that with the XML type? I believe that was added to Postgres before MongoDB existed.
  • EMM_386 30 days ago
    > With MongoDB, we lost out on many nice features from the relational world ...

    > ... this hurt in particular because our data was very much relational.

    That is my best attempt at the summary.

    The write-up is good, but after you've been around a long (long) time ... you know what this is likely going to involve.

    Glad relational SQL is still going strong 50 years later.