I read this as “Postgres functions are 9x faster in Citus” but it seems to actually be “Postgres functions in Citus 9 are 9x faster than in Citus 8” - because Citus 8 has round trips that standard PG doesn’t. Or did I miss something?
As ozgune just said: yes, that's right. The 9X is a comparison of how Postgres stored procedures in Citus 9.4 perform vs. in Citus 8.3 (as measured by the HammerDB benchmark, shown in Figure 5.)
When I read this post, I find the HammerDB TPROC-C benchmark results in Figure 1 to be super interesting, too. When you compare Citus 9.4 performance with distributed tables and distributed functions vs. Citus 9.4 with regular Postgres tables and regular Postgres functions (on a single node) -- then the multiplier is an even more impressive ~19X. :)
In Citus 8, the coordinator node would run stored procedures. Citus 9.x introduced changes to push down stored procedures to worker nodes. So, if your stored procedures mostly touches local shards, Citus 9 significantly cuts down network traffic.
What's kind of unique about the distributed functions feature in Citus that's described in the blog post is that you can scale stored procedures horizontally if procedure calls query co-located shards, but it doesn't restrict you from doing queries and transactions across all shards.
Happy to answer any questions or debate the merits/evils of stored procedures.
I'm interested in a comparison between distributed Postgres, and regular Postgres running on a distributed file system with SkyhookDM. It seems like a deep topic, and results would be very workload-specific.