Ways to reduce the costs of an HTTP(S) API on AWS

(gameanalytics.com)

280 points | by praveenscience 1594 days ago

24 comments

  • georgyo 1593 days ago
    I run a small service, ifconfig.io, that is now getting 200 million hits a day from around the world.

    The response from it is about as small as you could make it; however, at that volume it adds up to about 150 GB a day.

    If I hosted this on AWS, the bandwidth alone without any compute would cost $900 a month. Prohibitively expensive for a service I just made for fun.

    The cost of just sending the HTTP response headers alone is the majority of that cost, too. There is no way to shrink it.

    It is currently hosted on a single $40 Linode instance and easily keeps up with the ~2400 sustained QPS. I think it can take about 50% more traffic before I have to scale it. And Linode includes enough bandwidth with that compute to support the service without extra cost.

    I don't see how anyone pays the bandwidth ransom that GCP and AWS charge.
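
    A quick back-of-envelope in Go from those numbers (a sketch only; it just divides the figures above, and the per-response figure is an average over headers, TLS records, and the tiny body):

    ```go
    package main

    import "fmt"

    func main() {
        const (
            requestsPerDay = 200e6 // ~200 million hits/day, per the numbers above
            egressPerDay   = 150e9 // ~150 GB/day of outbound traffic
            secondsPerDay  = 86400.0
        )

        fmt.Printf("avg bytes per response: %.0f\n", egressPerDay/requestsPerDay) // ~750 bytes
        fmt.Printf("sustained QPS: %.0f\n", requestsPerDay/secondsPerDay)         // ~2300 req/s
    }
    ```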

    • blantonl 1593 days ago
      Well, to be fair, most of us who are paying the "bandwidth ransom" do have to scale, quite significantly I might add, and so the value is the platform as a whole.

      Furthermore, if you are doing something for fun like you are, the bandwidth ransom definitely comes into play for elastic cloud environments, but anyone doing anything significant on AWS/GCP has definitely already negotiated down their bandwidth spend with their AWS/GCP account management team.

      • georgyo 1593 days ago
        At large scales, decisions need to get made. AWS and GCP will not negotiate with you unless you're big enough to make any of that worth their time.

        Netflix is a great example. They run most of their services on AWS. But they also run their own CDN with real hardware in data centers because serving it from Amazon would be a deal breaker.

        There are reasons to use AWS and GCP. But when I start a project, I don't start there. It's too expensive one way or another, and the "free" tier gets blown out extremely quickly.

        A smaller provider will provide what you need, is normally cheaper, and has no lock-in. If you later decide that you really want autoscaling or managed databases, then you can move easily. And if you do switch, you'll at least know what your product even wants to be, and its projected growth.

    • jeremyjh 1593 days ago
      For a lot of services, bandwidth is the smallest part of the hosting cost; often around 10%. It really depends on the kind of workloads and traffic you are getting. Of course, the low percentage is partially because their other services are also all very expensive relative to a VPS or dedicated host, but it's not really a comparable service offering.
  • juliansimioni 1594 days ago
    This is a good list of ways to reduce outgoing bandwidth costs, but as someone who has switched from backend developer to running a small business, I can't help but notice that they don't talk at all about whether any of their cost savings were meaningful to the business.

    Sure, it looks like they saved about $2000/month, but consider that those savings probably won't even pay for more than a quarter of one of their developers.

    Even though their service is free (their parent company gets business value from the aggregate analytics they obtain through their service), it's very possible that there's something they could have done to bring more value to their parent company than the money they saved here.

    Maybe it's unreasonable to expect a company to talk about that in a blog post, but it left me wondering.

    • markonen 1594 days ago
      My read was that they actually saved over $8000 per month:

      - They mention that the initial savings of $1500/mo from omitting unnecessary headers was 12% of their egress cost (so the total before this was $12500)

      - Then they got an additional 8% of savings by increasing the ALB idle connection timeout to 10 minutes (down to $10120)

      - Finally they said they saved $200 per day by switching to a lighter TLS certificate chain ($6000/mo, so down to $4120)

      None of those steps seem to have required any meaningful amount of development work. Let's say this took a developer one week? The return on that effort would be $100k a year, or $2500/hour for the first year alone.
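
      Spelling that arithmetic out (a quick sketch; the 8% is applied to the post-header figure so the intermediate totals match the ones quoted above):

      ```go
      package main

      import "fmt"

      func main() {
          // All figures are per month, rounded as in the parent comment.
          totalBefore := 1500.0 / 0.12            // $1500/mo was 12% of egress cost -> ~$12500
          afterHeaders := totalBefore - 1500.0    // ~$11000 after dropping unnecessary headers
          afterTimeout := afterHeaders * 0.92     // a further 8% from the longer ALB idle timeout -> ~$10120
          afterCertChain := afterTimeout - 200*30 // ~$200/day from the lighter cert chain -> ~$4120

          fmt.Printf("total savings: ~$%.0f/mo\n", totalBefore-afterCertChain) // ~$8380
      }
      ```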

      • blazespin 1593 days ago
        Considering they have enumerated this for others to pick up and execute quickly, they may have just saved the wider industry potentially 100s of thousands per month.

        Give and take is an open source attitude. It doesn’t always have to be about source code, sometimes it can be about cost savings techniques such as this.

      • Bombthecat 1593 days ago
        Maybe it is a good thing that cloud is so expensive here... Internet traffic is bloated enough already :)
    • yelite 1594 days ago
      > consider that those savings probably won't even pay for more than a quarter of a one of their developers

      Although I never run a business, I do believe this kind of optimization is quite meaningful even though they will never be the top priority of a business.

      Those optimizations lower operational cost while being mostly maintenance-free (except the one that switches off AWS Certificate Manager, which may add some effort when renewing), risk-free (unlike refactoring a large legacy system) and requiring little engineering effort (maybe 10 engineering days from investigation to writing the blog post?).

      In addition this blog post itself brings intangible benefit on their branding, website ranking and hiring.

      • markonen 1594 days ago
        I think you're exactly right. It has become an HN trope that every cost optimization story gets a response like this: your infrastructure costs are trumped by the cost of your developers, so why spend the expensive resource (developers) on optimizing the comparatively cheap bit (infrastructure)? I'm tired of the trope because it's such an oversimplification.

        What matters is the return on investment, and as you state, one of the great things about cost optimization is that its returns come largely risk free. By my math the optimizations described here return $100k a year. On a risk-adjusted basis, what task could this developer have performed that would have returned more?

        • adventured 1594 days ago
          In this subthread about small businesses, another critical point is that the $24,000 (and certainly the $100k premise) might also be part of the remaining compensation or profit calculation for the owner of the business. Sure, it pales next to the cost of five engineers, and yet it could easily be anywhere from 1/10 to 1/3 of the annual profit for a small business. If you're the owner, that's a big deal over time. You never know how tight a small business has to operate, but typically it's thinner than not.
        • blantonl 1593 days ago
          you are kind of making the assumption that the developer spent all year working on this cost optimization.

          I'd bet the optimization and subsequent write up in a blog post didn't take more than a week to get done from start to finish.

    • jacob019 1594 days ago
      Let's say it is $2k/mo. I also run a small business. And when things are growing it's easy to think that way. But in the long run every business that faces competition needs to focus on the bottom line. How many developer hours do you think it took to save that $24,000 per year? Not much. And that is just one example. A culture that ignores efficiency is doomed to failure.
      • MuffinFlavored 1594 days ago
        > Let's say it is $2k/mo. I also run a small business.

        Your server bill should be $100/mo - $200/mo max for a "small" business. I've run a multi-tenant SaaS platform on a $200/mo DigitalOcean budget (a server for Postgres, a server for Redis, a server for the Node.js apps) that brought in $30k/mo. If you're spending that much a month on cloud hosting, consider yourself taken in by the marketing of "serverless".

        • jfengel 1594 days ago
          In this case, they receive five billion requests a day, from a client base of over a billion. That's not really a small business any more.

          I don't know what DigitalOcean would charge for servers at that scale.

        • sombremesa 1594 days ago
          I think you missed GP's point. They aren't telling us how much their business makes, they are replying to the posts saying "that wasn't worth it" -> "maybe it was worth it because it was probably ~8k" with "even if it was 2k it was worth it".
    • zamadatix 1594 days ago
      I'm not sure this is really questioning anything more than "I wonder if there is something they could have done better in terms of business operations" to which I can't imagine the answer ever being anything other than "yes", especially in retrospect.
    • lucb1e 1594 days ago
      > those savings probably won't even pay for more than a quarter of a [developer]

      So you're assuming that configuring nginx properly, once, takes 3 months, every year? If it takes the developer (or sysadmin) less long than that, you're already saving money.

    • lowdose 1594 days ago
      Isn't IT all about investing in fixed cost upfront and reaping profits on the variable costs in the future?
    • whatupmd 1593 days ago
      Hey boss, just found a way to save 2 grand a month without any operational impact!

      Johnson, you’re fired! I just saved myself 10 grand a month!

      Narrator: where do I sign up to work for that guy...........

    • kohtatsu 1594 days ago
      I'd much rather that $2,000/month go to my developer than to line Bezos' pocket.
    • ggregoire 1594 days ago
      FYI: $2000/month pays 2+ developers in half of the world.
      • ulfw 1593 days ago
        No it doesn't. Not good ones.
    • bdcravens 1594 days ago
      If it saves $24,000 a year, and your developer cost is $100/hour, 240 hours or less spent a year on this effort is your breakeven. Pretty sure that's a win.
    • tuananh 1593 days ago
      yup. at scale, percentage is what matters.
  • alex_young 1594 days ago
    All great ideas.

    Another suggestion:

    Terminate somewhere else.

    If you fit inside of the CloudFlare T&Cs, you can probably save a much larger amount terminating there and having them peer with you using the same TLS every time, or failing that, try someone like BunnyCDN.

    I've found that while AWS CloudFront is easy to instrument, it's neither very performant (lots of cache misses even when well configured) nor cost-effective (very high per-byte cost).

    • stefan_ 1594 days ago
      This. If your service is collecting aggregated analytics data from users, bytes that those users would never care to send in the first place, you can get vastly vastly better pricing on traffic by going with providers that don't care too much about high-quality peering.
    • StavrosK 1593 days ago
      > terminating there and having them peer with you using the same TLS every time

      Can you elaborate for someone who isn't that familiar with networking? How does this work?

      • Taik 1593 days ago
        This is basically saying, use a 3rd party CDN (e.g. Cloudflare) to handle and terminate client connections, letting the CDN pipeline the actual requests through a handful of persistent connections to your server.
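
        A minimal sketch of that idea in Go, assuming a hypothetical origin.example.com and locally provisioned edge certificates (a real CDN does this for you at many edge locations):

        ```go
        // Terminate client TLS here and funnel requests into a small pool of
        // persistent connections to the origin. Hostnames and cert paths are made up.
        package main

        import (
            "net/http"
            "net/http/httputil"
            "net/url"
            "time"
        )

        func main() {
            origin, _ := url.Parse("https://origin.example.com")
            proxy := httputil.NewSingleHostReverseProxy(origin)

            // Reuse a handful of upstream connections instead of doing one TLS
            // handshake per client request.
            proxy.Transport = &http.Transport{
                MaxIdleConns:        20,
                MaxIdleConnsPerHost: 20,
                IdleConnTimeout:     10 * time.Minute,
                ForceAttemptHTTP2:   true,
            }

            // Clients terminate TLS here; the origin only sees the pooled connections.
            http.ListenAndServeTLS(":443", "edge.crt", "edge.key", proxy)
        }
        ```
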
        • StavrosK 1593 days ago
          Ah, I see, thank you. So this is just to avoid TLS negotiation every time.
  • iconara 1594 days ago
    This was a great read.

    We went through something similar a couple of years ago, when TLS wasn't as pervasive as it is today and at first focused mostly on minimising the response size – we were already using 204 No Content, but just like the OP we had headers we didn't need to send. In the end we deployed a custom compiled nginx that responded with "204 B" instead of "204 No Content" to shave off a few more bytes. It turned out none of the clients we tested with cared about the string part of the status, just that there was a string part.

    When TLS started to become more common we realised the same thing as the OP, that the certificates we had were unnecessarily large and cost us a lot, so we switched to another vendor. When ACM came we were initially excited about the convenience it offered, but after a quick look we decided it would be too expensive to use for that part of our product.

  • chrismeller 1594 days ago
    I was honestly expecting some kind of meh article that said to reduce headers, enable compression and other basic stuff. I was pleasantly surprised that wasn’t the case... and absolutely astounded that the handshake provided that much of a difference, it was the last thing I would have thought of.
  • maxkuzmins 1594 days ago
    At such a high volume of requests it probably makes sense to consider going one abstraction level lower and replacing HTTPS with plain SSL-socket-based communication for further cost reduction.

    Nice deep dive into the S of HTTPS anyway.

    • jrockway 1594 days ago
      I think using HTTPS is fine. But there is probably some value in using GRPC+proto by default instead of REST+json. With client-side streaming, you set up and tear down the connection less frequently, and that means you negotiate TLS and send initial headers less frequently. And the messages themselves are smaller, especially for small messages.

      https://nilsmagnus.github.io/post/proto-json-sizes/

      GRPC streaming is almost as efficient as just using a raw TCP stream, but saves you having to write the protocol glue code. There are already clients and servers that work, and you can just write your protocol definition in the form of a protocol buffer. Worth a look for this use case.

      (Also, the clients know how to do load balancing, so you don't have to pay Amazon to do it for you. Unlike browsers, most languages' gRPC clients are happy to take a list of IP addresses from DNS and only send requests to the healthy endpoints. Browsers, if you're lucky, will retry opening a TCP connection, but will happily keep the same IP address even if it 503s on every request. Chrome, Firefox, and Safari all do different things.)
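
      As a rough illustration, a client-streaming call in Go could look like the sketch below; the analyticspb package and its Event type are hypothetical stand-ins for whatever protoc would generate from your .proto file:

      ```go
      package main

      import (
          "context"
          "log"

          "google.golang.org/grpc"
          "google.golang.org/grpc/credentials"

          analyticspb "example.com/gen/analyticspb" // hypothetical generated code
      )

      func main() {
          creds := credentials.NewClientTLSFromCert(nil, "") // system roots
          conn, err := grpc.Dial("api.example.com:443", grpc.WithTransportCredentials(creds))
          if err != nil {
              log.Fatal(err)
          }
          defer conn.Close()

          client := analyticspb.NewCollectorClient(conn)

          // One connection, one stream: TLS and headers are negotiated once,
          // then each event is just a small length-prefixed protobuf frame.
          stream, err := client.SendEvents(context.Background())
          if err != nil {
              log.Fatal(err)
          }
          for i := 0; i < 100; i++ {
              if err := stream.Send(&analyticspb.Event{Name: "session_start"}); err != nil {
                  log.Fatal(err)
              }
          }
          if _, err := stream.CloseAndRecv(); err != nil {
              log.Fatal(err)
          }
      }
      ```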

      • mr__y 1594 days ago
        >the clients know how to do load balancing

        that is of course true, but they won't be able to omit failed or overloaded nodes, whereas a load balancer might be able to do so. On the other hand, the client might be programmed to just use another IP from the list and resend the request if one node fails to answer, but this would increase the total time required by the client to make a successful connection. I also realise that non-responding nodes might be rare enough for this to be a negligible problem - just playing devil's advocate here.

        • jrockway 1593 days ago
          No, you can do all that stuff with gRPC. You can use active health checks (grpc.health.v1) to add or remove nodes from the pool. (You can configure the algorithm that is used to select a healthy channel for the next request, too.) You can also talk to a central load balancer, which provides your client with a list of endpoints it's allowed to talk to.

          When you control the client, you don't have to resort to L3 hacks to distribute load. You can just tell the client which replicas are healthy. (And both ends can report back to give the central load balancer information on whether or not the supposed healthy endpoints actually are.)

          L3 load balancing actually works somewhat poorly for HTTP/2 and gRPC anyway. They only balance TCP connections, but you really want to balance requests. That is why people have proxies like Envoy in the middle; the client isn't smart enough to be able to do that, but it is. But if you control the client, you can skip all that and do the right thing with very little resources.
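
          As a sketch, enabling round-robin balancing plus client-side health checking in grpc-go looks roughly like this (the DNS name is hypothetical and the servers are assumed to implement grpc.health.v1):

          ```go
          package main

          import (
              "log"

              "google.golang.org/grpc"
              "google.golang.org/grpc/credentials/insecure"
              _ "google.golang.org/grpc/health" // registers the client-side health check
          )

          func main() {
              // round_robin spreads requests across all resolved addresses;
              // healthCheckConfig drops endpoints that report NOT_SERVING.
              const serviceConfig = `{
                "loadBalancingConfig": [{"round_robin": {}}],
                "healthCheckConfig": {"serviceName": ""}
              }`

              conn, err := grpc.Dial(
                  "dns:///backends.internal.example.com:8443", // hypothetical DNS name
                  grpc.WithDefaultServiceConfig(serviceConfig),
                  grpc.WithTransportCredentials(insecure.NewCredentials()), // plaintext for brevity
              )
              if err != nil {
                  log.Fatal(err)
              }
              defer conn.Close()
              // ... use conn with your generated client as usual.
          }
          ```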

          • namibj 1593 days ago
            If your nodes are large enough, you won't need to balance requests directly. You do need it to work on 5~50 requests in parallel, however.
    • bureaucrat 1594 days ago
      Or just encrypt in-house and use HTTP.
      • maxkuzmins 1594 days ago
        These guys receive requests from mobile devices. AFAIK, sending unencrypted HTTP requests is not allowed on some platforms (e.g. iOS).
      • jrockway 1594 days ago
        I think you should think long and hard before you roll your own crypto.
        • gberger 1594 days ago
          You don't need to roll your own crypto in order to encrypt in-house.
  • tumetab1 1594 days ago
    > Also, the certificate contains lengthy URLs for CRL download locations and OCSP responders, 164 bytes in total.

    If you're going down that path, it's probably best to avoid revocation altogether, since it doesn't really work, and go the Let's Encrypt way: certificates with shorter lifespans.

    At that scale, a 15-day cert on rotation is probably fine.
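
    As a sketch of that approach, golang.org/x/crypto/acme/autocert automates issuance and renewal against Let's Encrypt (which issues 90-day certificates), assuming you terminate TLS on servers you control rather than on an ALB; the hostname and cache path are examples:

    ```go
    package main

    import (
        "net/http"

        "golang.org/x/crypto/acme/autocert"
    )

    func main() {
        m := &autocert.Manager{
            Prompt:     autocert.AcceptTOS,
            HostPolicy: autocert.HostWhitelist("api.example.com"),
            Cache:      autocert.DirCache("/var/cache/autocert"), // persists keys/certs across restarts
        }

        srv := &http.Server{
            Addr:      ":443",
            TLSConfig: m.TLSConfig(), // obtains and renews certificates on demand
            Handler: http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
                w.WriteHeader(http.StatusNoContent)
            }),
        }

        // Serve the ACME HTTP-01 challenge and redirect plain HTTP to HTTPS.
        go http.ListenAndServe(":80", m.HTTPHandler(nil))
        srv.ListenAndServeTLS("", "")
    }
    ```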

    • mhenoch 1594 days ago
      That's a good point. Seems like Let's Encrypt certificates contain an OCSP URL but no CRL URL, so they are a bit smaller.
  • SlowRobotAhead 1594 days ago
    > We’re currently using an RSA certificate with a 2048-bit public key. We could try switching to an ECC certificate with a 256-bit key instead

    Having just ruled out RSA on an embedded project for exactly this reason, this was definitely the first thing that came to mind.

    If they're getting down to byte-level differences, then under their additional options they really should have binary-serialized data instead of JSON. Something like CBOR allows near-immediate conversion to and from JSON, but it would mean an update to all of their endpoints; that might not be feasible now but could be worked in for new projects over time.

    • namibj 1593 days ago
      I'm sad about the state of support for ed25519/curve25519 crypto in TLS.

      If you could reasonably deploy a website that doesn't offer anything else for https, you'd instantly fix many session establishment-based CPU DoS attacks. It's multiple times faster than what you usually allow your server to negotiate.

  • bandris 1594 days ago
    Perhaps AWS Certificate Manager certificates are deliberately large so more outgoing traffic can be charged?

    Interesting idea from the post: "it could be a selling point for a Certificate Authority to use URLs that are as short as possible"

    • jrockway 1594 days ago
      I doubt it. AWS's certs are just another three-quarters baked AWS feature. They did the best they could with the resources they had.

      At my last job we had a fun and exciting outage when AWS simply didn't auto-renew our certificate. We were given no warning that anything was broken, and it apparently began the internal renewal process at the exact instant the cert expired (rather than 30 days in advance as is common with ACME-based renewal). Ultimately the root cause was that some DNS record in Route 53 went missing, and that silently prevents certificate renewal.

      We switched TLS termination from the load balancer to Envoy + cert-manager and the results were much better. You also get HTTP/2 out of the deal. We also wrote a thing that fetches every HTTPS host and makes sure the certificate works, and fed the expiration times into Prometheus to actually be alerted when rotation is broken. Both are features Amazon should support out of the box for the $20/month + $$/gigabyte you pay them for a TLS-terminating load balancer. Both are features Amazon says "you'll pay us anyway" to, and they're right.
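
      The expiry checker can be quite small; a sketch of the idea in Go (hostnames are examples, and exporting the values to Prometheus is left out):

      ```go
      package main

      import (
          "crypto/tls"
          "fmt"
          "time"
      )

      func main() {
          hosts := []string{"api.example.com:443", "www.example.com:443"}

          for _, h := range hosts {
              conn, err := tls.Dial("tcp", h, &tls.Config{})
              if err != nil {
                  fmt.Printf("%s: handshake failed: %v\n", h, err)
                  continue
              }
              leaf := conn.ConnectionState().PeerCertificates[0]
              left := time.Until(leaf.NotAfter)
              conn.Close()

              fmt.Printf("%s: certificate expires in %s\n", h, left.Round(time.Hour))
              if left < 21*24*time.Hour {
                  fmt.Printf("%s: WARNING, renewal appears to be broken\n", h)
              }
          }
      }
      ```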

      • ti_ranger 1594 days ago
        > it apparently began the internal renewal process at the exact instant the cert expired (rather than 30 days in advance as is common with ACME-based renewal).

        Was this some time ago?

        The FAQ for ACM (https://aws.amazon.com/certificate-manager/faqs/ ) says:

        > Q: When does ACM renew certificates?
        >
        > ACM begins the renewal process up to 60 days prior to the certificate’s expiration date. The validity period for ACM certificates is currently 13 months. Refer to the ACM User Guide for more information about managed renewal.

        > We switched TLS termination from the load balancer to Envoy + cert-manager and the results were much better. You also get HTTP/2 out of the deal. We also wrote a thing that fetches every https host and makes sure the certificate works, and fed the expiration times in prometheus to actually be alerted when rotation is broken. Both are features Amazon should support out of the box for the $20/month + $$/gigabyte you pay them for a TLS-terminating load balancer.

        You're implying that AWS doesn't support HTTP/2 on any load-balancers they offer, but ALB has supported HTTP/2 since launch ( https://aws.amazon.com/blogs/aws/new-aws-application-load-ba... ) 3 years ago.

        I don't see any current load-balancer priced at $20/month (ALB, NLB and Classic ELB are all ~ $8/month), so I can't guess which one you were using here ...

        • jrockway 1594 days ago
          I have no memory of when this was but it was on the order of 9 months to a year ago.

          "up to 60 days before" includes "five minutes after". What it excludes is the renewal starting 61 days before the cert expires, and, as documented, it sure didn't do that.

          Stuff went wrong and we had no observability. That is the AWS way.

          • sciurus 1594 days ago
            Not to be mean, but you definitely had observability into the expiration date of your certificate. You just weren't monitoring it yet. What you are doing now with Prometheus sounds good.
            • toast0 1594 days ago
              If you need to figure out for yourself what to monitor about the service, including things AWS says it handles, it brings into question the value of the service.
        • tybit 1593 days ago
          Interestingly while ALBs support serving HTTP2 client requests, they only proxy them to back ends using HTTP1.1. This breaks some use cases like gRPC unfortunately.

          https://serverfault.com/questions/836568/terminate-http-2-on...

  • rlastres 1594 days ago
    Funnily enough, Amazon.com uses a DigiCert certificate similar to the one mentioned in the article; they don't seem to use the ones they provide for free on AWS :)
    • raxxorrax 1593 days ago
      You have to terminate TLS at their load balancers, though, as they don't hand out any private keys, of course. Still a great service.

      DigiCert is pretty expensive otherwise... always a shock when I look up prices... There is Let's Encrypt, but I never tested it with anything hosted on AWS.

      Still, the article has great tips. And even if your app is some B2B service with <200 users, it still wouldn't hurt to implement the measures, even if the product owner doesn't care whether the solution costs $20 or $200 a month. Some of these tips are pretty low effort. Saves energy at least.

    • yandie 1594 days ago
      Big surprise. Contrary to popular belief, AWS wasn't/isn't built to support Amazon.com. Some fundamental pieces are designed for Amazon.com scale, but most other services are not (ACM in this case).
  • chrissnell 1594 days ago
    Didn't see it mentioned: SSL tickets. If you were running an NLB and nginx on a pool of instances, you could use an OpenResty-based implementation of SSL tickets to dramatically speed up negotiation for reconnecting clients. You will need a Redis server to store the rotating ticket keys, but that's easy with AWS ElastiCache. You will also need to generate the random keys every so often and store them in Redis, removing the oldest ones as you do. This is a task that I accomplished by writing a small Go service.

    If you serve a latency-critical service, tickets are a must.
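
    A sketch of what such a rotation service could look like in Go (the Redis key name, key size, and rotation interval are assumptions; the OpenResty side would read this list):

    ```go
    package main

    import (
        "context"
        "crypto/rand"
        "log"
        "time"

        "github.com/go-redis/redis/v8"
    )

    func main() {
        ctx := context.Background()
        rdb := redis.NewClient(&redis.Options{Addr: "my-elasticache:6379"}) // hypothetical endpoint

        for {
            key := make([]byte, 48) // classic 48-byte TLS session ticket key
            if _, err := rand.Read(key); err != nil {
                log.Fatal(err)
            }

            // Push the new key and keep only the three most recent, so clients
            // holding slightly older tickets can still resume for a while.
            pipe := rdb.TxPipeline()
            pipe.LPush(ctx, "tls:ticket_keys", key)
            pipe.LTrim(ctx, "tls:ticket_keys", 0, 2)
            if _, err := pipe.Exec(ctx); err != nil {
                log.Printf("rotation failed: %v", err)
            }

            time.Sleep(12 * time.Hour)
        }
    }
    ```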

    • arkadiyt 1594 days ago
      > Didn't see it mentioned: SSL tickets

      They do talk about it; SSL tickets and TLS session resumption refer to the same thing.

  • rlastres 1594 days ago
    I guess this might be especially relevant for traffic patterns similar to the one described in the article; for other use cases, those optimisations most likely will not translate into big savings.
  • devit 1593 days ago
    How about the obvious solution of not having ANY data transfer out?

    Encrypt and sign the data via NaCl or similar, send it via UDP duplicated 5-10 times, with no response at all from the server (it's analytics; it doesn't matter if a few events are lost, and you can even estimate the loss rate).

    As for the REST API, deprecate it and, if still needed, place it on whichever 3 VPS providers have the lowest costs, and use low-TTL DNS round-robin with something that removes hosts that are down from DNS.
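
    A sketch of the fire-and-forget sender, using NaCl secretbox for authenticated encryption (the shared-key handling and the collector address are simplified assumptions):

    ```go
    package main

    import (
        "crypto/rand"
        "log"
        "net"

        "golang.org/x/crypto/nacl/secretbox"
    )

    func main() {
        var key [32]byte // in reality a key baked into / derived for the SDK, not random per run
        if _, err := rand.Read(key[:]); err != nil {
            log.Fatal(err)
        }

        var nonce [24]byte
        if _, err := rand.Read(nonce[:]); err != nil {
            log.Fatal(err)
        }

        event := []byte(`{"name":"session_start"}`)
        // Prepend the nonce so the collector can open the box and de-duplicate.
        packet := secretbox.Seal(nonce[:], event, &nonce, &key)

        conn, err := net.Dial("udp", "collector.example.com:9999") // hypothetical endpoint
        if err != nil {
            log.Fatal(err)
        }
        defer conn.Close()

        // Duplicate the datagram to compensate for UDP loss; no response expected.
        for i := 0; i < 5; i++ {
            conn.Write(packet)
        }
    }
    ```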

  • coleca 1594 days ago
    Fascinating article. I love posts with this type of in-depth investigation into what everyone else would just pass over and not even think about.

    It's not surprising that it's related to the gaming industry. Some of the best AWS re:Invent videos I've seen are in the GAM (gaming) track. Even though I've never worked in that field, the problems they get hit with and are solving often are very relevant to any high-traffic site. Because of the extreme volume and spikiness of gaming workloads, they tend to find a lot of edge cases, gotchas, and what I'll call anti-best practices (situations where the "best practice" turns out to be an anti-pattern for one reason or another, typically cost).

  • ajbeach22 1594 days ago
    I wonder what the cost is compared to terminating SSL at CloudFront? For my web-tier architectures, I use CloudFront to reverse proxy both dynamic content (from the API) and static content (from S3). SSL is terminated only at CloudFront.
    • ball_biscuit 1594 days ago
      I don't think you can use Cloudfront to serve that kind of traffic. Cloudfront costs are described here: https://aws.amazon.com/cloudfront/pricing/

      So for 10k HTTPS requests, the price is $0.01. If you serve 5 billion per day, that is $5,000 a day. With such high traffic I believe you need to handle it with performant webservers (Go, Erlang?) to keep costs reasonable, and probably terminating SSL at the load balancer is the way to go.

      • ajbeach22 1594 days ago
        I am not sure that math is right. Using the AWS cost calculator, it's only about $1,100/mo for 5B HTTPS requests. However, I think if you consider data transfer it's still probably in the range of several thousand a day. Yikes.
        • Dunedan 1593 days ago
          Not sure what calculator you're using, but from the pricing page [1] it's pretty clear that 5B HTTPS requests cost at least (depending on the geographic origin) $5000. And that's per day and without data transfer.

          [1]: https://aws.amazon.com/cloudfront/pricing/

  • synunlimited 1594 days ago
    You could also look into using brotli compression over gzip for some more savings of bytes over the wire.
    • Ayesh 1594 days ago
      Brotli support in API clients is quite low. I run a small API service, and you'd be lucky to see API clients even using gzip.
      • synunlimited 1594 days ago
        True, though given it's relatively easy to support, you could get some savings in the few cases that do use it.

        Also, if they own some or all of the SDKs that are used for hitting their API, they could bake in brotli compression at that level.
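
        A sketch of the server-side negotiation, assuming the github.com/andybalholm/brotli package: use br when the client advertises it, otherwise fall back to gzip:

        ```go
        package main

        import (
            "compress/gzip"
            "net/http"
            "strings"

            "github.com/andybalholm/brotli"
        )

        func handler(w http.ResponseWriter, r *http.Request) {
            body := []byte(`{"status":"ok"}`)

            switch ae := r.Header.Get("Accept-Encoding"); {
            case strings.Contains(ae, "br"):
                w.Header().Set("Content-Encoding", "br")
                bw := brotli.NewWriter(w)
                defer bw.Close()
                bw.Write(body)
            case strings.Contains(ae, "gzip"):
                w.Header().Set("Content-Encoding", "gzip")
                gw := gzip.NewWriter(w)
                defer gw.Close()
                gw.Write(body)
            default:
                w.Write(body)
            }
        }

        func main() {
            http.ListenAndServe(":8080", http.HandlerFunc(handler))
        }
        ```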

    • LeonM 1594 days ago
      I don't think there would be many clients that would support that. The API clients are usually not browsers in their case.
  • meritt 1594 days ago
    This is an awesome article but if your egress costs are so high that you're deciding which HTTP headers to exclude, you should probably be moving to an unmetered bandwidth provider, or at least one that charges a reasonable amount for egress.
    • caymanjim 1594 days ago
      Is there any such thing? I don't know of any cloud service provider that offers unlimited bandwidth. There are very few providers who could handle five billion connections per day in the first place, regardless of bandwidth.
      • meritt 1594 days ago
        5B requests/day is ~60k/second, that's big but nothing insane. There are numerous frameworks/setups that can do far more than that on a single machine [1]

        popular unmetered options: he.net, ovh, hetzner - You generally lose a lot of the "cloud" capability with these options however.

        cloud options: digital ocean egress is $0.01/GB ($0.005/GB if you buy it via droplets), linode is $0.02/GB, vultr is $0.01/GB, etc.

        [1] https://www.techempower.com/benchmarks/#section=data-r18&hw=...

        • all_blue_chucks 1594 days ago
          Unmetered connections are only unmetered until you cost them more than you're paying. Then they throttle you or boot you. Nothing is free.
          • meritt 1594 days ago
            I'm talking about actual unmetered where you pay for a dedicated amount of bandwidth, e.g. 1 Gbps / 10 Gbps / 20 Gbps. 10 Gbps usually goes for about $1k-$2k/mo in the US. This is how colo facilities have operated for decades.

            10 Gbps fully saturated delivers about 3300TB for that $1-2k/mo, versus the $22k/mo you'd pay AWS for the same.

            I'm absolutely not talking about the "unlimited bandwidth" bullshit that discount hosts offer.

            • all_blue_chucks 1594 days ago
              If your project gets featured on CNN and your bandwidth goes up 20x can these colo arrangements automatically scale up your dedicated bandwidth? I ask because having an outage when you get your first big break can cost you WAY more than your bandwidth bill ever would...
        • echelon 1593 days ago
          Is there anything cheaper than this? I'm about to stand up a streaming audio server and I'm worried about egress bandwidth costs.
          • meritt 1593 days ago
            DI.FM uses Cloudflare + the Bandwidth Alliance [1] for their streaming audio network, so I'd model after that. Cloudflare isn't exactly transparent about their egress pricing, but most discussion seems to indicate that once you start hitting about 50TB/mo, they'll strongly encourage you to upgrade to their $200/mo plan. But you can likely push tens of terabytes per month on their free or $20/mo plan.

            [1] https://www.cloudflare.com/case-studies/di-fm-eliminates-egr...

  • tyingq 1594 days ago
    Maybe also consider caching API responses in a cheaper non-AWS CDN where possible. APIs like "zip code to list of cities" where the output is the same for all users, and doesn't change often.
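
    A sketch of what that looks like on the origin side: mark such responses as publicly cacheable so an edge cache can serve repeats (the /cities endpoint, payload, and TTL are made up):

    ```go
    package main

    import (
        "fmt"
        "net/http"
    )

    func citiesHandler(w http.ResponseWriter, r *http.Request) {
        zip := r.URL.Query().Get("zip")

        // Same output for every user and it rarely changes, so let edge caches
        // hold it for a day; only cache misses reach (and get billed by) AWS.
        w.Header().Set("Cache-Control", "public, max-age=86400")
        w.Header().Set("Content-Type", "application/json")
        fmt.Fprintf(w, `{"zip":%q,"cities":["Springfield","Shelbyville"]}`, zip)
    }

    func main() {
        http.ListenAndServe(":8080", http.HandlerFunc(citiesHandler))
    }
    ```
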
  • nimish 1593 days ago
    Switch to ECDSA certs and shave another few hundred bytes :)

    Bandwidth is the killer thing with AWS. It's designed to make you move services inside the boundary.
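
    A sketch of generating a P-256 key and a CSR in Go to request such a certificate (the CA still decides what it signs; the names are examples):

    ```go
    package main

    import (
        "crypto/ecdsa"
        "crypto/elliptic"
        "crypto/rand"
        "crypto/x509"
        "crypto/x509/pkix"
        "encoding/pem"
        "log"
        "os"
    )

    func main() {
        // An ECDSA P-256 key and signature are far smaller than RSA-2048,
        // which is where the per-handshake byte savings come from.
        key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
        if err != nil {
            log.Fatal(err)
        }

        csrDER, err := x509.CreateCertificateRequest(rand.Reader, &x509.CertificateRequest{
            Subject:  pkix.Name{CommonName: "api.example.com"},
            DNSNames: []string{"api.example.com"},
        }, key)
        if err != nil {
            log.Fatal(err)
        }

        // Print the CSR; persist the private key separately in real use.
        pem.Encode(os.Stdout, &pem.Block{Type: "CERTIFICATE REQUEST", Bytes: csrDER})
    }
    ```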

  • pragnesh 1594 days ago
    The "accept-encoding: gzip" header is a request header. Why is it present in the unoptimized response in the first place?
    • mhenoch 1594 days ago
      It was added as part of a bug fix five years ago: the server was looking at the Content-Type request header instead of Content-Encoding to determine whether the incoming payload was compressed. Not sure why the Accept-Encoding response header was added at the same time, but it went undetected since it didn't cause any problems (apart from costing money).
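
      A sketch of the corrected behaviour: inspect the request's Content-Encoding (not Content-Type) to decide whether to gunzip the payload, and send no Accept-Encoding header on the response:

      ```go
      package main

      import (
          "compress/gzip"
          "io"
          "net/http"
      )

      func collect(w http.ResponseWriter, r *http.Request) {
          var body io.Reader = r.Body
          if r.Header.Get("Content-Encoding") == "gzip" {
              gz, err := gzip.NewReader(r.Body)
              if err != nil {
                  http.Error(w, "bad gzip payload", http.StatusBadRequest)
                  return
              }
              defer gz.Close()
              body = gz
          }

          io.Copy(io.Discard, body) // stand-in for actually processing the events
          w.WriteHeader(http.StatusNoContent)
      }

      func main() {
          http.ListenAndServe(":8080", http.HandlerFunc(collect))
      }
      ```
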
  • nartz 1593 days ago
    How about a UDP endpoint?
  • pragnesh 1594 days ago
    Using protobuf or FlatBuffers also reduces payload size.
    • rlastres 1594 days ago
      Protobufs are an option that could work for SDKs, but the API is also a public, documented REST one: https://gameanalytics.com/docs/item/rest-api-doc. Also, for the responses it would be possible to just not include a body, and AWS does not charge for data transfer in, so the size of the request JSON is not relevant to the cost.
  • kaos19870 1594 days ago
    Wow, such a small change to your HTTP headers can save you that much?
    • blantonl 1594 days ago
      If you are running 5 billion daily requests where your outgoing response body is significantly smaller than the aggregate size of the headers, then yes.

      Also, the article clearly articulates that the answer is, yes.

  • bullen 1594 days ago
    "If the clients use HTTP/2, data transfer decreases further, as response headers are compressed."

    But CPU usage is increased for decompression and CPU is the only real bottleneck.

    Just because you don't pay for the compression electricity doesn't mean you get away with it.

    This ties back to my previous comment on the User-Agent subject yesterday: removing all headers except "Host" from all HTTP traffic is the solution.

    HTTPS is a complete waste of energy. Security should not be overarching, it should be precision.

    WebSockets are also bad, since they don't work well with memory latency. Use "Transfer-Encoding: chunked" on a separate pull connection instead.

    • ceejayoz 1594 days ago
      > HTTPS is a complete waste of energy. Security should not be overarching, it should be precision.

      A harmless meme in the US might get you executed in North Korea. Optimizing for energy usage (which is already pretty minor on modern hardware for HTTPS these days) over security is odd.

    • toast0 1594 days ago
      Electricity use on the client for compressed vs. uncompressed isn't as clear-cut as more/less CPU. You also need to consider the reduction in use of the network interface, since the data size will be smaller. Overall latency could improve as well if the compressed form is meaningfully smaller (depending on the TCP congestion window, just one packet smaller can mean a whole round-trip time saved).
      • bullen 1594 days ago
        No, that's not how it works: you cannot upgrade the routers in real time without complexity and additional cost, so the cost for transfer is fixed with more latency. But if you subtract bad protocol design and the latency added by compression/decompression, I'm pretty sure you end up with the same deal, just more complexity that costs even if you don't see the costs.

        Just like wind power actually competes with nuclear because it takes 30 days to wind down a nuclear power plant.

        Also data can be compressed with more efficient hardware on the backbone without you having to deal with it.

        The biggest cost of the internet is idle things and synchronized CPUs, async. never made it unfortunately.

    • toredash 1594 days ago
      HTTP/2 isn't used on the backend AFAIK, so the CPU usage isn't a concern in terms of increased load on e.g. EC2.
      • bullen 1594 days ago
        Still, there is going to be energy lost for very little in return. We need to go in the other direction: fewer machines, fewer IP addresses, less energy, less complexity, less code, etc.

        The only thing we need more of is cores and we can't have that because memory is too slow.

        • toredash 1593 days ago
          I think this post shows that small things do indeed add up. Energy use might rise by using HTTP/2, but that's not the concern of the OP; they want to reduce their cost, not their energy footprint.
          • bullen 1593 days ago
            I think you are going to have a rough time if you separate energy and money like that. The only reason the dollar is worth anything is because of coal, oil, gas and nuclear.

            What do you think happens to the value of the dollar when the physical supply of energy becomes unstable in the coming years?

            Money is energy because debt needs energy to either have been spent in the past or energy being promised to be spent in the future.

            The proportion between these is what makes debt money trustworthy or not. When states around the world privatize houses, that is old energy (so far, since the energy to heat a house is marginal compared to building it), but when the stock market goes up, that is a promise of new energy.

            Globally all money for old energy is saturated (negative interest rates) and now all liquidity is being injected into the stock market that promises that the future will be rich with energy.

            The stock market (and all companies) is a promise to spend energy we don't have.

            The only energy that is added to Earth is sunlight, and the only ways to capture that energy are trees and plants.

            All jobs are now meaningless because of the energy that we are wasting. And people now depend on wasting energy to have a job.

            Do you see why energy is more important than money?

            • toredash 1592 days ago
              I did not say energy is more important than money.