Funny, right. I worked for at least two companies that at some point put a lot of money into Oracle. One of them is a leading gaming network with billions in revenue.
Teams struggled to migrate off it. It was a multi-year, multi-million-dollar project with no end in sight. And newcomers would say: oh, it was silly to use all this stuff (why didn't they use Dynamo? :) ). However, 15 years ago it was a perfectly reasonable choice, and Oracle solution architects were all over the company.
I don't see how Amazon's strategy is different. And I don't get how folks who say Oracle lock-in was bad, but Amazon's is fine, can justify that thinking.
I'd put my money on it: in 10 years these will be good examples of how not to do things, for instance when AWS leadership changes. And the internet will say: who could have seen it coming.
Lots of folks from Amazon participate in Hacker News, like Tim Bray, Colm MacCarthaigh, and Jeff Barr. Jeff Barr often comments in threads about announcements he's written. One of Tim's blog posts was recently on the front page. See: timbray, colmmacc, jeffbarr
I doubt there's any kind of voting cabal, but if folks are participating then they're probably voting according to their inclinations. (I don't vote too much myself, either on comments or articles.)
Any time you invent a new technology with a unique interface, then software built using that technology is coupled to it to some extent. It's actually fairly rare for software components to be so completely interchangeable that you can swap out implementations without changing the software that uses it.
At the most basic level, it wouldn't be a tough migration to any other FaaS. Yes, it would be work, but I can't think of any other infra migration that would be less effort.
But also you don't need to think of lambda code as code that can _only_ necessarily run on AWS Lambda.
We organize related lambdas (what would traditionally constitute an 'application') as a Gradle multi-project build, one module per lambda, with a common module for shared code like DAOs. The CI creates and uploads an individual jar per Lambda, but updates them all every release.
We then have an extra module that pulls all of those together behind a web API and can be run as a container, independent of any FaaS. At that point the fact that you're deploying to Lambda is basically irrelevant to your code-base; it looks and feels like any other 'application' and is probably even a little more organized.
Usage of AWS services is a conscious decision, absolutely. However, the product architecture that uses these AWS services is subject to careful review of its design and the integrity of its functional components. Any of these components must be replaceable as issues/bottlenecks are identified. For example, if AppSync proves to have issues as the company scales further, it can be replaced with self-hosted GraphQL clusters. Other components in the architecture can be similarly replaced.
I like the idea of serverless architectures, but I still wouldn't use it for anything that is important.
- Using a serverless architecture almost always implies getting married to your provider. You can run your code in only one place. You have given up all bargaining power. When the relationship ends you have to build your system over again.
- It isn't really serverless; they're just not your servers.
- They are only efficient for the workloads the architecture is designed for. Stray outside the parameters and things start to become expensive, slow or both.
- If you use serverless architectures you have to make damn sure the people who built it stick around, because the only value you are left with if your provider folds or increases prices on you is inside the heads of the people who built the solution.
I have already seen friends get burnt by this. Typically people build a prototype or a technology demo, it gets funding, and the CTO insists that it isn't important to do anything about the serverless bits and to just go with them (there is no pause to make good on "we'll fix it later" once money gets involved). Then they get jerked around by the service provider, which can't give them the support they need. Finally they slam head-first into the costs of actual production traffic which, even though estimating them requires only basic arithmetic, none of them managed to calculate before the huge bills started rolling in.
> Using a serverless architecture almost always implies getting married to your provider
I don't disagree, but I've made a web app with AWS serverless. Frontend on S3, backend Python Flask on serverless, and a MySQL server (haven't tried RDS Serverless yet). Works fine; I had one compiled library that didn't work, but everything standard was fine. No marriage. :)
"- Using a serverless architecture almost always implies getting married to your provider. You can run your code in only one place. You have given up all bargaining power. When the relationship ends you have to build your system over again."
I think you can use a format that is mostly provider-independent. Any movement will change just the integration. Also, this kind of applies for anything large. You get married to the API anyway and movement will always be painful to some degree.
"- It isn't really serverless; they're just not your servers."
You mean they are not on prem? 'Cuz they can be.
You mean that the name is bad? It really isn't as bad as people seem to imply. When done right, you don't worry about the servers.
You mean that you have no control over the execution environment? Well, if Spectre and Meltdown have taught us anything, it's that you really lose control at some point anyway.
I don't understand what you really gain with this setup. I mean, this extreme vendor lock-in situation is so short-term. The absolute wrong strategy if you ask me. I would be curious to see this company five years from now.
Let me ask the question differently: let's say you are exploring a new market opportunity for an exciting product you want to build. Would you rather spend cycles rebuilding what AWS has already done, and thus delay your speed to market, or use AWS managed services for what already exists and build what nobody has built before? Surely this architecture will evolve over time, but it will only evolve as the startup quickly discovers what the market needs.
I used to hate aws for how expensive their bandwidth and storage was, until I started actually using it last year.
I think their new serverless stack is about to put a lot of devops engineers out of a job.
You can set up a CI/CD pipeline in about half an hour with Amplify; at my previous company I remember it taking a good 3 weeks to get CircleCI up and running properly.
And then moving a microservice over to it is basically one command and a few options; you mostly just copy over the config from your old Express backend with a few changes, and you're done.
Another dev I showed the Lighthouse scores of the React stack I deployed on it even said "this should be illegal".
And they're right; it's pretty much automated devops. The whole app now loads in 300ms.
If you have server side rendering in your app the static content will automatically be cached on their CDNs.
And if you want to save a bit of money you can just use Google Firebase for your authentication and DB.
GraphQL is surprisingly a breeze too as a middle layer if you want to leave your Java or .NET backend APIs untouched.
At the end of the day, nodejs is completely insecure by design, your infrastructure will never be as secure as running it on gcp or aws. That's why you go serverless and stop messing with security and front end scalability.
If they solve the cold-start issue of databases on aurora they will completely dominate the market even more than they already have.
>You can set up a CI/CD pipeline in about half an hour with Amplify; at my previous company I remember it taking a good 3 weeks to get CircleCI up and running properly.
>And then moving a microservice over to it is basically one command and a few options; you mostly just copy over the config from your old Express backend with a few changes, and you're done. It's insane.
As an engineer at a decent-sized tech company, this sounds pretty normal, because our infrastructure teams have been providing it (and much more) to service authors via internal APIs/web UIs for much longer than "serverless" has been a buzzword.
You haven’t needed an infrastructure team since PHP shared hosting, and certainly not since Heroku or Elastic Beanstalk, except that people kept wanting greater complexity at lower cost. There is nothing new about “serverless” there.
There is a difference between then and now. The key difference is that the "serverless" term is massively overloaded here; once you dissect it you will see that it's a mix of multiple managed services we are able to take advantage of: Kinesis, DynamoDB Streams, Kinesis Firehose, SNS, Lambda, CloudWatch, and GraphQL/AppSync. Serverless computing has come a long way.
Can you elaborate on Amplify? Is it really that good? It didn't take me terribly long to set up GitLab CI with ECS and later Fargate; both of these feel more appropriate for web-serving apps.
I may see it in a full-JS app, but I still can't find a good fit for a JS-based backend. I've recently been exploring alternatives to Django for API backends and seriously considering a JS-based framework. I have yet to find one that is all three of: good, simple, and in TypeScript. TypeORM looks excellent for the ORM side, but there's still the matter of writing APIs; everything I've looked at (Express, Koa…) is atrociously repetitive compared to Django REST Framework. NestJS is the best I've found, and it's still miles away.
I'm talking about Next.js/Nuxt.js style JS front-ends replacing exactly that plus JS heavy frontends like Angular and SPA react apps which was the last decade's modus operandi.
The way SSR hooks React/Vue into these JS apps, "hydrating" them after loading prerendered component-based views to make them interactive without losing any performance compared to static HTML, is unique and extremely powerful, which most people don't understand until they try it. It really is the future of frontend development.
SSR combined with async-loaded, chunked bundles of components is far more than prerendering with some server-side templating library and full HTTP requests in between. All the power of a full-fledged SPA, but with none of the performance or SEO downsides, plus automatic offline and service-worker caching. It's great for the web's future.
The scale is on a different level, however. Your average Node project will have 10-100x as many dependencies as projects in other languages. Too many to conceivably check. Also, because of how dynamic the language is, I think it is way easier to hide something.
The V8 runtime itself is pretty secure. However every npm package has total access to your filesystem and network i/o.
This is by design, the author of node himself has apologized for it and admitted that nothing can be done now because it'd basically break the internet.
This means any package (e.g. eslint), dependency, or anything with code from just one malicious contributor can grab all your API keys, SSH keys (if you still use those), environment variables, and your users' crypto wallets (this has actually happened a few times now, at scale).
With something like aws-amplify you just go on their site and put your environment variables there, instead of keeping them on your own machine.
Now you don't have to worry about using sketchy Docker images, or your junior devs using their work laptops in a malware-infested gaming cafe while still running their localhost server.
Aws and gcp can afford to have way better internal security and regular pentesting of their containers and infrastructure, so now wrapping those protecting layers around node, express, etc... is their problem.
You just push your code to the production or testing branch and they handle all the provisioning, builds and deployments in 3-5 minutes.
The npm dependency issue is a serious concern, but I'm not convinced that gcp or aws would mitigate the issue. If the problem is unaudited code that could be potentially compromised, gcp and aws will run that compromised code without protest.
It's very easy to incur high costs here. We implemented cost-analysis dashboards that let us monitor costs per event and per device, with visibility into each AWS service we use and charts showing historical data. Fiscal planning is now part of our architecture design and implementation.
We haven't hit any scaling issues yet. GraphQL is nice. It's really about getting data directly from DynamoDB and Aurora to an endpoint that Android/iOS/React-JS can query and subscribe to. The Apache Velocity Template Language that AppSync uses is a pain, though. This post captures it well (unfortunately): https://www.reddit.com/r/graphql/comments/b0zomv/aws_appsync...
AppSync does have limitations we have to contend with. Custom scalar types cannot be defined, hence we are not able to define strictly typed GeoJSON objects. Apache VTL has its own learning curve; once you master it you can implement functionality without invoking Lambda functions, avoiding their cost in high-volume GraphQL call scenarios and getting queried data back faster.
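For readers who haven't met VTL: an AppSync resolver request template is a Velocity document that renders to a JSON instruction for the data source. Even a simple DynamoDB GetItem looks like this (field names are illustrative):

```vtl
{
  "version": "2017-02-28",
  "operation": "GetItem",
  "key": {
    "id": $util.dynamodb.toDynamoDBJson($ctx.args.id)
  }
}
```

Anything more involved (loops, conditionals, error handling) is where the learning curve the parent mentions kicks in.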
You can, IIRC, ping support and ask for a concurrency-limit increase, but what I would probably do first is segregate lambda deployments and API endpoints (or whatever trigger) by region so that total load is distributed (you get 1000 concurrents per region). Obviously at this point you would also profile your code to optimise function executions.
Right, I'm referring to AWS limits. I was running a benchmark yesterday against a logging endpoint I made with a similar architecture to the article's. One function is attached to a public ALB endpoint and does some validation, then writes the event to SQS; this was taking 100-200ms with 128 MB of RAM. A second function was attached to the SQS queue; its job was to pull events and write them out to an external service (Stackdriver, which sinks to BigQuery). This function was taking 800-1200ms at 128 MB of RAM, or 300-500ms at 512 MB (expensive!).
While running some load testing with Artillery I found that I was often getting 429 errors on my front-end endpoint. When pushing 500+ RPS, the 2nd function was taking up over 50% of the concurrent execution limit and new events coming into the front-end would get throttled and in this case thrown out. That also means that any future Lambdas in the same AWS account would exacerbate this problem. Our traffic is spiky and can easily hit 500+ RPS on occasion, so this really wasn't acceptable.
My solution was to refactor the 2nd function into a Fargate task that polls the SQS queue instead. It was easily able to handle any workload I threw at it, and can also run 24/7 for a fraction of the cost of the Lambda. Each invocation of the Lambda was authenticating with the GCP SDK before passing the event along, and the Lambda has to stay running while the two stages of network requests complete.
I'm happy to report I haven't been able to muster a test that breaks anything since I started using Fargate!
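The worker loop for a task like that can be sketched generically; here the queue and the sink are injected so the sketch stays self-contained (in the real task they would wrap the AWS SDK's receiveMessage/deleteMessage and the Stackdriver client):

```javascript
// pollLoop.js -- long-poll a queue and hand batches to a sink.
// `queue` and `sink` are injected, so this sketch runs anywhere;
// in the real Fargate task they'd wrap SQS and Stackdriver.
async function pollLoop({ queue, sink, batches = Infinity }) {
  let processed = 0;
  for (let i = 0; i < batches; i++) {
    // In SQS terms: receiveMessage with WaitTimeSeconds for long polling.
    const messages = await queue.receive();
    if (messages.length === 0) continue;

    // Unlike Lambda, any auth/setup cost is paid once per process,
    // not once per invocation.
    await sink.write(messages);

    // Only delete after a successful write (at-least-once delivery).
    await queue.delete(messages);
    processed += messages.length;
  }
  return processed;
}

module.exports = { pollLoop };
```

Because the process is long-lived, there's no per-event billing and no concurrency-pool pressure; the trade-off is that you're back to managing (a little) capacity yourself.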
> the 2nd function was taking up over 50% of the concurrent execution limit and new events coming into the front-end would get throttled and in this case thrown out.
It sounds like you already found a great solution for your particular case. But it's also worth mentioning that you can apply per-function concurrency limits, which can be another way to prevent a particular function from consuming too much of the overall concurrency. For anyone whose Lambda workload is cheaper than a 24/7 task, that could be a good option.
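For reference, reserved concurrency is a one-line setting (the function name here is illustrative); it both caps the function and carves that capacity out of the shared regional pool:

```shell
# Cap the SQS-consumer function at 100 concurrent executions,
# leaving the rest of the regional limit for other functions.
aws lambda put-function-concurrency \
  --function-name sqs-consumer \
  --reserved-concurrent-executions 100
```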
> Each invocation of the Lambda was authenticating with the GCP SDK before passing the event
I'm curious whether you tried moving the authentication outside of the handler function so it could be reused for multiple events? I've found that can make a huge difference for some use cases.
Good question. At 100x, probably not. At 10x, yes would be better than managing services on our own. By that time, we would have a better prioritized list of which services to self-manage and which ones to leave to AWS. Are you specifically concerned about DynamoDB for some reason?
Ah yes. The engineer would tell you we can move when we want. The manager would tell you it is harder than it looks. Management would tell you it will never happen. :-)
See it as reducing startup risk and deferring the payment to when you become successful and have money/time to throw at problem. Though there are best practices to do it in a clean way so moving is easier.
I'd be curious why you think 100x is where you would lose out on TCO versus self-managed. I feel like staff time commitment should only go up with larger fleets, and you'd really start running into the pricing advantages of zero-rated networking there, etc.