Custom SQL functions for data analytics in PostgreSQL


120 points | by carlotasoto 12 days ago


  • djbusby 12 days ago
    Not specifically about these functions but in general PG makes it pretty easy to build custom functions and data-types. I one of those who doesn't want to put too much business logic in my DB but for those cases where it really makes sense to get into your lower level language and do all the magic it afford on your data - it's a killer feature. Can't do custom extensions in RDS last I looked tho
    • jandrewrogers 12 days ago
      This is one of the most under-rated features of PostgreSQL -- its extreme extensibility. I've abused this to great effect over the years. If you are bold, you can make it do some things it really wasn't designed to, relatively safely. And the things it was designed to do, like introducing types and operators, is relatively simple.

      My only complaint is that for a few realistic use cases (e.g. zero-copy aggregates with variable-sized memory) there is no documentation how to do this efficiently and "correctly".

      • tpetry 11 days ago
        Custom functions and data types are great but what most people miss is custom aggregates! Some weeks ago I built tsvector_agg() in ~10 minutes to aggregate the fulltext search vectors of multiple rows into one. That‘s really magical and shows why PostgreSQL is do loved by anyone wantint to use more than a simple SELECT or UPDATE.
        • asguy 12 days ago
          Yah, you can't install AWS-unsupported extensions in RDS, but you can extend the type system and operators. You also have access to stored procedures with plsql/plv8/plperl so it doesn't end up being that bad in the end.
          • cfors 12 days ago
            Came here to comment this, there's a ton of awesome stuff around Postgres that using RDS makes impossible. Kinda stinks, but I don't have the time/expertise to manage a Postgres Instance on EC2.
            • Tostino 12 days ago
              If you have kubernetes experience, running a cluster using a pre-built helm chart (like pgo) takes away a ton of the pain by bundling needed services together and giving you easy upgrade paths, high availability, backups, etc.
              • ahachete 11 days ago
                I agree.

                If the extension availability is the main concern, I'd recommend the open source StackGres [1] operator, which has, possibly, the largest Postgres extension catalog [2] available.

                [1] [2]

                Disclaimer: founder of OnGres, the company behind StackGres

            • jerrysievert 12 days ago
              heck, even just having pl/pgsql means you can do pretty much anything _safe_ in your Postgres database, you don't need plv8 or plperl.
              • Tostino 12 days ago
                It's really not more verbose for anything i've written in it (and i've written a lot of pl/pgsql) than other languages for implementing a ton of business logic that actually makes sense to have right next to the data.
          • BeefWellington 12 days ago
            It is a pet peeve of mine when people offer code samples that require horizontal scrolling even when full-screened on a 1440p monitor (let alone 1080p).

            Either don't use arbitrarily size-restricted code blocks, or manually format your code so it's legible in the window size you've chosen.