Amazing how many e-commerce sites there are out there with time to first byte in excess of 3s and DOM loads over 5s, as if they have no idea they're throwing away sales and killing their SERP rankings. And the fixes often don't require money or that much skill, just a little help from something like Varnish in many cases.
Is the only good way for me to use QUIC with NGINX (or any server that can deliver most of what NGINX does) to use GoQuic and quic-reverse-proxy from GitHub? I realize that, presumably because the protocol is quite different and runs over UDP, it may require a lot of from-scratch retooling by the NGINX devs to light it up (otherwise we'd have had it from them long ago, I imagine), but we need to cut down on round trips, baby. Would be nice if Google were to give https://cs.chromium.org/chromium/src/net/tools/quic/ an update with a happy .deb to get it running. Kind of a funny page for Google not to serve over QUIC; guess they like a little irony.
Further, parenthetically, I am all in favor of Google making the web better, both with things like protocol development, PageSpeed, and image and video formats, and with search signals that give sites with HTTPS and speed a ranking boost. Money talks.
Also, a note on QUIC: it seems the spec is a draft and isn't going anywhere right now, but Google has made big progress with the TCP BBR congestion-control algorithm, and there's also TCP Fast Open and TLS 1.3 0-RTT to look forward to.
Moving more delivery (not just static content) to the edge, plus HTTP/2 with Server Push, should help a lot with perf, too.
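For what Push looks like in practice, here's a minimal nginx sketch (nginx 1.13.9+ supports the `http2_push` directive; the cert paths and asset names are just placeholders for illustration):

```nginx
server {
    listen 443 ssl http2;
    ssl_certificate     /etc/nginx/site.pem;
    ssl_certificate_key /etc/nginx/site.key;

    location = /index.html {
        # Push the critical assets alongside the page itself,
        # saving one request round trip per resource.
        http2_push /css/main.css;
        http2_push /js/app.js;
    }
}
```

The caveat is that Push can waste bandwidth if the client already has the assets cached, so it's worth measuring rather than pushing everything.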
There's definitely some truth to the idea that (at least some) e-commerce sites could easily be sped up. Part of the problem is that you have a combination of a legacy stack, organizational inertia, lack of know-how, and outsourcing (so 3rd-party vendors own parts of the stack).
And specifically, wrt caching and Varnish: as soon as you put something in your cart and you're cookie'd, the pages you visit are no longer cacheable.
I think there's no straightforward answer to this. E.g., why would you have two DNS lookups with a cookieless domain? I imagine that means cdn.mydomain.com and mydomain.com?
If you're using HTTP/2, then one domain is better, because you can multiplex your traffic over the same persistent connection. Well, technically, if you use the same IP and SSL cert then you can HTTP/2 Push the content to the client (it doesn't have to be the same domain).
However, if your web server is slow, or if the distance between the client and your web server is significant, then you're likely not going to benefit from the performance gains of using one connection, because of the network propagation delay (vs low-latency POPs from major CDNs).
Lastly, your static assets don't have to be on a cookieless domain; you can just strip the cookies and cache the content - that's what many Varnish instances do ;)
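A minimal sketch of that cookie-stripping in VCL (Varnish 4 syntax; the asset-extension regex and the 1-day TTL are just example choices, not a recommendation):

```vcl
sub vcl_recv {
    # Strip cookies from requests for static assets so they can hit the cache.
    if (req.url ~ "\.(css|js|png|jpe?g|gif|svg|woff2?)(\?.*)?$") {
        unset req.http.Cookie;
    }
}

sub vcl_backend_response {
    # Drop any Set-Cookie the backend attaches to static responses;
    # Varnish refuses to cache objects that set cookies.
    if (bereq.url ~ "\.(css|js|png|jpe?g|gif|svg|woff2?)(\?.*)?$") {
        unset beresp.http.Set-Cookie;
        set beresp.ttl = 1d;
    }
}
```

The dynamic, cookie'd pages (cart, checkout) still bypass the cache; only the assets matched by the regex become cacheable.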
One thing not explicitly mentioned here is what happens to your performance profile under load. You can have a well-designed app with solid, maintainable code while your overall stack still isn't set up to handle simultaneous users. Whether it's Node, Ruby, Python, or PHP, you still have to consider things like your web server, reverse proxy, load balancer, and database configurations.
When I’m building things that are going to have heavy use I run regular tests at each level with the very simple Loads package in Python. I want to know the answer to the following question as each layer gets added in: what’s the maximum number of requests I can handle before a complete page load gets above 50ms? 100ms? 500ms?
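That kind of per-layer check doesn't need much tooling. Here's a rough standard-library sketch of the idea (the local URL and thread counts are assumptions; a real tool like Loads adds ramp-up, percentiles, and distributed workers):

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

def timed_request(url):
    # Fetch the page once and return wall-clock latency in seconds.
    start = time.perf_counter()
    with urllib.request.urlopen(url) as resp:
        resp.read()
    return time.perf_counter() - start

def fraction_under(url, threshold_s, workers=20, total=200):
    # Fire `total` requests across `workers` threads and report what
    # fraction of complete page loads came in under the latency budget.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        latencies = list(pool.map(timed_request, [url] * total))
    return sum(1 for t in latencies if t <= threshold_s) / total

# Example (assumes an app listening locally on port 8000):
#   for ms in (50, 100, 500):
#       print(ms, fraction_under("http://localhost:8000/", ms / 1000))
```

Rerun the same probe after each layer is added (bare app, app + DB, app behind the proxy) and compare the fractions; a layer that knocks a big chunk of requests over the budget is the one to investigate.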
I start with no database. Just pure application code with hard-coded values. That gives me the practical maximum for the code as written. Is it good enough for the expected use?
Add the database and test again. How much does that hurt? Are we still in good shape? Or do we need to optimize?
Add your reverse proxy with what you expect your production settings to be (i.e., serve your static assets through nginx, not through your application framework).
How are we doing now? Oftentimes you'll need to do some tweaking at this stage. I run multiple instances of my Python apps on each box, usually one per core, reverse-proxied to nginx via unix sockets instead of the default HTTP port, so I have to load balance them. In the simplest case you can use an iptables setup to round-robin every new connection, but that's really pretty hacky.
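A sketch of that layout in nginx (the socket paths, static root, and cache lifetime are hypothetical; by default nginx round-robins across the `upstream` servers):

```nginx
upstream app {
    # One app instance per core, each listening on its own unix socket.
    server unix:/run/app/worker0.sock;
    server unix:/run/app/worker1.sock;
    server unix:/run/app/worker2.sock;
    server unix:/run/app/worker3.sock;
}

server {
    listen 80;

    # Static assets served directly by nginx, never by the framework.
    location /static/ {
        root    /srv/app;
        expires 30d;
    }

    location / {
        proxy_pass       http://app;
        proxy_set_header Host $host;
    }
}
```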
In the real world, I use haproxy because the open-source version has sophisticated load-balancing tools and health checks, both of which are limited in the free version of nginx. So get that configured, turn it loose, and test again. You should see better performance than in your last run because you have more resources, but you're probably going to have to do some tweaking here as well.
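A minimal haproxy fragment showing the two features mentioned (the backend ports and the /healthz endpoint are assumptions; a real config also needs a `defaults` section with timeouts):

```haproxy
frontend web
    mode http
    bind :80
    default_backend app

backend app
    mode http
    # Send each new request to the worker with the fewest open connections.
    balance leastconn
    # Active health checks: probe each worker and pull it from rotation
    # when the checks fail.
    option httpchk GET /healthz
    server w0 127.0.0.1:8001 check
    server w1 127.0.0.1:8002 check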
Etc., etc., until you've tested your entire delivery stack. Also, test with TLS enabled before you launch into prod; it adds overhead to each request. Don't get caught flat-footed when your real-world numbers don't live up to your benchmarks and your product manager wants to know why.
Loads is a pretty blunt tool, though. It gives you early insight into the maximum performance you're going to be able to get, and it alerts you when one of your layers has caused a significant decrease in performance, or hasn't improved it as much as expected.
The tools in the article are good for drilling down into where you need to look when something does go awry. But even when you use all of them and do edge testing on your CDN and all that, if you only start doing speed and load tests after your build is done, it’s too late.
You have to speed test and load benchmark early and often, so you know what to expect and you know where to probe if things come up short of expectations (and they will).
It’s a good explanation of what various tools and resources mean, but it’s only part of what you need to be doing to make sure your app actually performs the way you want it to in the real world.
This is a good reminder to me that I want to start hitting the PageSpeed site earlier in the dev process. I haven't traditionally cared all that much because when my junk is loading in 50ms or less, I kind of don't care. But I'm probably leaving some performance on the table by ignoring that until late in the process.
It's also a reminder that I need to clean some things up and create a GitHub repo with all of this laid out, plus some starting configurations for a real deployment scenario for each part of the stack. I'll do a Show HN if I ever get around to it.
I agree with you that the earlier you can catch those performance issues, the better. It's just like with catching bugs early in your development cycle. You could even argue that poor perf is a bug!
The more you know about the performance characteristics of your stack, the better, but given how many 3rd-party services you might be using in production, it might be close to impossible to predict. Either way, you should be constantly monitoring the performance of your applications (backends, DBs, browser timings, RUM, cross-stack traces, etc.).