Best Practices for AWS Lambda Container Reuse


66 points | by kiyanwang 61 days ago


  • hn_throwaway_99 60 days ago

    This article was missing a couple of major points:

    1. He doesn't say whether the RDS DB is Postgres or MySQL. MySQL connection times tend to be much shorter than Postgres. With Postgres, it's a very good idea to use a pgbouncer instance to implement connection pooling and have the Lambdas connect to that.

    2. As others point out here, in AWS it is a very bad idea to have your RDS DB outside a VPC. But once the DB is in a VPC and the Lambda has to join it, cold start times become completely untenable for a user-facing synchronous function. Amazon has promised to improve this situation; I'm not sure what the status is.

    3. The author points out some of this in his article, but caching your connection in global scope opens up a whole host of very difficult to track down bugs (e.g. lots of cached connections left open and exhausting your DB connection limit, needing to handle dead connections, etc.)

    In my opinion, and with a lot of experience on a high traffic consumer site, it's a bad idea to cache connections in global scope. Either use MySQL or pgbouncer and connect in the function body.
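
A minimal sketch of the "connect in the function body" approach, assuming Python and using the standard-library sqlite3 module as a stand-in for MySQL or a pgbouncer endpoint. DB_PATH and the handler's event shape are hypothetical, not taken from the article.

```python
import sqlite3
from contextlib import closing

# Hypothetical stand-in; a real function would point at MySQL or pgbouncer.
DB_PATH = ":memory:"


def handler(event, context=None):
    # Open a fresh connection for this invocation and always close it,
    # so no connection is left cached in global scope between invocations.
    with closing(sqlite3.connect(DB_PATH)) as conn:
        row = conn.execute("SELECT ? + 1", (event["n"],)).fetchone()
        return {"result": row[0]}
```

The trade-off is paying the connection cost on every invocation, which is why the comment pairs this approach with a fast-connecting database or a pooler.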

    • Niksko 61 days ago

      We went down the Lambda route at work, and I think ultimately it's a poor fit for the type of service that is being described here.

      Do your persistence elsewhere, probably in an API wrapped around RDS.

      SSM or other secrets needing decryption and fetching at runtime should use the AWS recommended method of storing this data in the global scope (at least that's how it works in Node), but this should feel icky because it is icky.
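
A rough sketch of what caching a decrypted secret in global scope might look like, written in Python rather than Node. fetch_secret is a hypothetical placeholder for the real SSM/KMS call, and FETCH_CALLS exists only to show how often that call runs.

```python
# Module-level state survives between invocations while the container is warm.
_SECRET_CACHE = {}
FETCH_CALLS = []  # illustration only: records every slow fetch


def fetch_secret(name):
    """Hypothetical stand-in for a slow fetch-and-decrypt call to SSM/KMS."""
    FETCH_CALLS.append(name)
    return f"decrypted-{name}"


def get_secret(name):
    # Only pay the fetch cost on the first use in this container.
    if name not in _SECRET_CACHE:
        _SECRET_CACHE[name] = fetch_secret(name)
    return _SECRET_CACHE[name]


def handler(event, context=None):
    secret = get_secret("db-password")
    return {"has_secret": bool(secret)}
```

Two warm invocations in the same container would trigger only one fetch; a cold start begins with an empty cache again.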

      Mixing statefulness into things that are inherently intended to be stateless probably indicates you've chosen the wrong tool for the job.

      • KaiserPro 61 days ago

        what the OP describes is caching.

        Each lambda has some state which, if one is sensible, can be used to cache things. Having a db connection cached is perfectly sensible, assuming there is some retry logic.
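
One possible shape for a cached connection with retry logic, sketched in Python with sqlite3 standing in for the real database driver. The names _cache and get_conn are illustrative, and the single retry shown is only reasonable for statements that are safe to run twice.

```python
import sqlite3

# Cached across invocations for as long as the container is reused.
_cache = {"conn": None}


def get_conn(fresh=False):
    # Reuse the cached connection unless there is none or a new one is forced.
    if _cache["conn"] is None or fresh:
        _cache["conn"] = sqlite3.connect(":memory:")
    return _cache["conn"]


def handler(event, context=None):
    for attempt in (1, 2):
        try:
            # On the second attempt, discard the cached connection and reconnect.
            conn = get_conn(fresh=(attempt == 2))
            return conn.execute("SELECT 1").fetchone()[0]
        except sqlite3.Error:
            if attempt == 2:
                raise
```

Closing the cached connection between two calls simulates a dead socket: the first attempt fails, the retry reconnects, and the invocation still succeeds.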

        Making an API around RDS is just a boatload of tech debt when it's trivial to cache a connection object. Not only that, it will add latency, especially as the go-to API fabric of choice is HTTP. So a 20ms call (worst case) can easily balloon up to 80ms (average).

        It's now possible to have SSM secrets referenced directly in lambda environment variables, so I don't see the need for the complications of doing it yourself.

        Lambda is just cgi-bin, for the 21st century. There is nothing magic to it.

        • ben509 60 days ago

          Caching the db connection may not work depending on your DB driver. A possible scenario is:

          * Cold start, connection is established.

          * Code runs and returns.

          * Lambda stops the process entirely but retains the container.

          * DB detects the socket is closed and kills the connection.

          * Warm start.

          * Code runs and fails because the connection is invalid.

          • redisman 60 days ago

            After they fix VPC cold starts (the ENI issue) I hope we get Lambda lifecycle events. onDestroyed, onFrozen, onThawed, etc. You kind of get onCreated now if you put the code into global scope but that's gross

            • humbleMouse 60 days ago

              The OP commenter said that the cache would have to contain retry logic, which in your scenario would trigger and then initiate a new connection object.

              • ben509 60 days ago

                Yeah, I should have been clearer, it's not that it can't be done, it's just a bit messier than it seems at first blush.

                The issue is that actions taken by stateful clients can't always be retried.

                You can detect these kinds of errors, but it's klugey because these drivers generally assume that sockets dying is relatively uncommon, and it's painful to test this behavior when FaaS doesn't give you direct control over it.

                So you typically wind up either with a proxy, or you decorate any handlers to refresh the connection ahead of time to make sure you have a fresh object.

                • humbleMouse 60 days ago

                  I agree with you that it's messy and an anti-pattern to try to implement something like this in lambda.

          • cle 61 days ago

            Agree, at the moment Lambda doesn't quite make it for directly calling RDS DBs, because of the connection problem and because your DB should be in a VPC, which has horrendous cold start implications for Lambda (it needs to attach an ENI in the invoke path, which takes many seconds).

            Re: the connection issue, it would be great if there were a service or RDS feature that could do the DB connection pooling behind a standard AWS API, so that we don't have to manage connections in Lambda and so that we can query the DB from the public internet with standard AWS auth without having to expose the DB itself.

            • 013a 60 days ago

              This exists, in Aurora Serverless. It is a shame they haven't opened it up to the other RDS databases.

              • fulafel 60 days ago

                Why should the DB be in a VPC? I've seen their architecture guidance recommend that, and it often gets turned into gospel in projects. I wish AWS was culturally more pro-internet. VPCs, bastion hosts, etc. create more complexity and work, the cost of which could be more productively invested in other security.

              • 013a 60 days ago

                Lambda is an incredibly interesting architecture, but it's pretty rare to find a use-case that makes sense given its inherent limitations. There's always going to be a cold start; even if they manage to optimize the runtime cold start to near-zero, functions still have stuff they gotta do once they start.

                Google's recent Cloud Run service feels like a much more generally useful Serverless platform. Even AWS Fargate, despite not having the whole "spin the containers down when it's not serving requests" feature, is what I envision most customers adopting for Serverless in the medium term (especially with their recent huge price reduction, it's actually competitive with EC2 instances now).

                AWS needs to get Fargate support added to EKS. Stat. It was promised 18 months ago at re:Invent and is still listed in their "In Development" board on GitHub. That'll change the game for Kubernetes on AWS, because right now it's kind of rough.

                • redisman 60 days ago

                  A predictable near-zero cold start would be amazing for most applications. Currently, in the most extreme cases, we've seen 20-second cold starts, which just makes it seem like it's not ready for production.

                • ak217 61 days ago

                  AWS is working on the Serverless Aurora Data API which is meant precisely for this purpose (it's an API wrapped around RDS).

                  • paulddraper 60 days ago

                    I don't understand. Don't MySQL and PostgreSQL already have an API?

                    • tdfx 60 days ago

                      Yes, but each database instance has a limited number of connections that can be open. You can use Lambda to handle as many web requests as you want, but if each Lambda invocation is creating a new database connection you've only shifted your bottleneck one layer down the stack.
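
The bottleneck can be shown with simple arithmetic. This is a sketch assuming one connection per concurrent invocation; the function names and the numbers in the usage note are made up for illustration, not measured limits.

```python
def connections_needed(concurrent_invocations, connections_per_invocation=1):
    # Each concurrently running invocation holds its own connection(s).
    return concurrent_invocations * connections_per_invocation


def exceeds_limit(concurrent_invocations, max_connections):
    # True when the database would have to refuse new connections.
    return connections_needed(concurrent_invocations) > max_connections
```

For example, exceeds_limit(500, 100) is True: 500 concurrent invocations against a database that accepts 100 connections leaves 400 invocations unable to connect, however well Lambda itself scales.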

                      • paulddraper 60 days ago

                        I think Azure runs pgbouncer for that.

                        • tdfx 60 days ago

                          Sure, you could run pgbouncer yourself on EC2, but unless it's offered as a managed service you're no longer "serverless" as you've got a pgbouncer server to administer yourself.

                          • paulddraper 59 days ago

                            I agree. That's why Azure runs it for you.

                    • jsperx 60 days ago

                      I was actually at an AWS event and they mentioned that, so I searched and found this article...

                      ...that says it’s REALLY slow.

                    • k__ 61 days ago

                      Yes, VPC and RDS are two points that would keep me from using Lambda in the first place.

                    • SethTro 60 days ago

                      Did anyone else notice the sql injection in their first block of code?
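
The article's code is not reproduced in this thread, so the following is only a generic Python illustration of the kind of problem being pointed out, using sqlite3 and a hypothetical users table. It contrasts building SQL by string interpolation with passing the value as a bound parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")])


def find_unsafe(name):
    # Vulnerable: the input is spliced into the SQL text itself.
    return conn.execute(
        f"SELECT id FROM users WHERE name = '{name}'"
    ).fetchall()


def find_safe(name):
    # Parameterized: the driver passes the value separately from the SQL.
    return conn.execute(
        "SELECT id FROM users WHERE name = ?", (name,)
    ).fetchall()


# Classic injection input that turns the WHERE clause into a tautology.
payload = "x' OR '1'='1"
```

With this payload, find_unsafe matches every row in the table, while find_safe treats the whole string as a literal name and matches nothing.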

                        • Animats 60 days ago

                          They've re-invented FCGI.