This is an opinionated transcription of Eric Johnson's talk Thinking Asynchronously, presented at the 2020 GOTO Conference (online edition, due to the COVID pandemic). His straightforward presentation guides us through steps that take advantage of asynchronous persistence pipelines to provide a better experience to our users. It is a great opportunity for newcomers to understand what AWS wants to achieve with serverless from now on. I took the opportunity to elaborate on a few services he used in his talk to give more context.
Common Serverless Pattern
The usual serverless application mimics the typical three-tier architecture. The API layer will, as naturally happens, be responsible for security and routing, while the compute layer will do anything else you need to persist your data into the storage layer. From Eric's perspective this comes with a concerning trade-off: if something goes wrong it will probably fail in your code, as it is the most vulnerable building block of your architecture.
Thinking Asynchronously
Eric proposes that we persist the data before we apply any computation to it. This brings a few major benefits over the traditional approach:
- Greater reliability. In case of failure in our codebase, we have our data persisted already.
- Faster response times in our APIs. By moving the extra computation to a second step, the user has already received feedback in the UI by the time it runs.
- We can do more in less apparent time to the client. As the complex computing is now the last step, our persistence pipeline has more room to process data with no apparent impact on the user's experience.
One might argue that you can squeeze bits and bytes out of your code to achieve a similar result. But the pillar of Eric's approach lies in how it increases flexibility by reducing response time on the API side. After all, it's a well-known fact that better response times imply a better user experience.
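The "persist first, compute later" flow can be sketched as a Lambda handler that writes the raw request to storage and returns immediately, deferring any heavy processing to a downstream consumer. This is a minimal sketch; the table name, item attributes, and status values are my own illustrative choices, not from the talk, and the actual `put_item` call is commented out so the snippet stays self-contained:

```python
import json
import time
import uuid

# Hypothetical persistence target; in a real deployment you would create
# the boto3 resource outside the handler:
# import boto3
# table = boto3.resource("dynamodb").Table("requests")

def build_item(payload: dict) -> dict:
    """Wrap the raw request payload into a storage item, untouched.

    Validation and enrichment are deliberately deferred to a downstream
    consumer (e.g. a Lambda listening on the table's stream)."""
    return {
        "pk": str(uuid.uuid4()),
        "receivedAt": int(time.time()),
        "status": "PENDING",
        "payload": payload,
    }

def handler(event, context):
    item = build_item(json.loads(event["body"]))
    # table.put_item(Item=item)  # persist first, compute later
    # Respond immediately: 202 Accepted, processing continues asynchronously.
    return {"statusCode": 202, "body": json.dumps({"id": item["pk"]})}
```

Returning `202 Accepted` instead of `200 OK` is one way to signal to the client that the work is queued rather than finished.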
When Eric talks about serverless, his working definition is simple: something happens, we react, and we do something.
Event Driven Patterns
Event Driven Development is the key to making the suggested approach work. AWS-wise, there's a multitude of events that you can listen to with an AWS Lambda function. Of course, you can also take advantage of AWS messaging services like SNS, SQS and Kinesis to consume asynchronous events in your Docker container or application instance.
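As an example of the consuming side, here is a sketch of a Lambda function triggered by SQS. The `Records`/`body` shape is the standard SQS event source mapping payload; the processing step itself is a placeholder:

```python
import json

def parse_records(event: dict) -> list:
    """Extract message bodies from an SQS-triggered Lambda event.

    SQS event source mappings deliver batches under "Records", each
    record carrying the raw message in its "body" field."""
    return [json.loads(r["body"]) for r in event.get("Records", [])]

def handler(event, context):
    for message in parse_records(event):
        # Placeholder for your own processing logic.
        print("processing", message)
    # Returning normally tells Lambda the whole batch succeeded;
    # raising an exception lets SQS redeliver the messages.
    return {"processed": len(event.get("Records", []))}
```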
Amazon API Gateway
I'd like to draw your attention to this versatile service available in AWS, and the idea behind its conception. API Gateway, first introduced in 2015, communicates directly with 100+ AWS services, allowing you to transform request and response payloads with the Apache Velocity Template Language (VTL). It's commonly used as a serverless REST API, allowing developers to configure HTTP routes at a higher level of abstraction in which you don't have to provision resources to handle the request - matching Eric's personal definition of serverless. Requests received by API Gateway are translated into events, which you can route directly to any compatible AWS service, like DynamoDB or Lambda.
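To make the "no compute layer" idea concrete: a VTL mapping template on an API Gateway route can turn an HTTP POST directly into a DynamoDB `PutItem` call, with no Lambda in between. This is a minimal sketch; the table name and attribute names are illustrative, not from the talk:

```
{
  "TableName": "requests",
  "Item": {
    "pk": { "S": "$context.requestId" },
    "payload": { "S": "$util.escapeJavaScript($input.body)" }
  }
}
```

`$context.requestId` and `$util.escapeJavaScript` are standard API Gateway template variables; the raw request body is stored as-is for later asynchronous processing.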
API Gateway is also a well known pattern. A few years ago, Netflix OSS team introduced their own API Gateway solution. It was designed with a few key philosophies in mind, "each of which is", in their words, "instrumental in the design of our new system":
- Embrace the Differences of the Devices
- Separate Content Gathering from Content Formatting/Delivery
- Redefine the Border Between “Client” and “Server”
- Distribute Innovation
In fact, the problem it solves is widely known among teams handling large fleets of microservices. GraphQL, for instance, approaches these problems from a different perspective, and has been in internal use at Facebook since 2012 as well. Since Netflix's blog post, several other approaches have been designed as alternatives to the custom-brewed API Gateway solution. Krakend is a fairly popular and feature-rich stateless API Gateway - it might be a good tool for those situations where the cost of AWS API Gateway is an issue.
Amazon DynamoDB
DynamoDB is an underrated, fascinating database service. The description on its website doesn't do justice to its capabilities: 1
- Key-value data store - It fits perfectly as a persistence layer for tasks that require intensive write throughput - being especially good for time-series data or document persistence.
- Expirable entries - You can define a Time To Live (TTL) value to arbitrarily expire entries in your tables.
- Global tables - DynamoDB can manage tables accessible (replicated) globally.
- In-memory acceleration with DAX - It acts as a mix of near cache and table space for frequently accessed data. Pricey, but it might be worth it if you take into account the cost of maintaining such a mechanism by yourself.
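The TTL feature from the list above boils down to writing one extra numeric attribute, in epoch seconds, on each item. A small sketch (the attribute name `expiresAt` is my choice; it just has to match the attribute configured in the table's TTL settings):

```python
import time

def with_ttl(item: dict, ttl_seconds: int, now: int = None) -> dict:
    """Attach a DynamoDB TTL attribute (epoch seconds) to an item.

    DynamoDB's TTL process deletes items whose "expiresAt" value is in
    the past; deletion is eventual, not exact-time."""
    now = int(time.time()) if now is None else now
    return {**item, "expiresAt": now + ttl_seconds}
```

Note that DynamoDB deletes expired items some time after the timestamp passes, not at the exact second, so TTL suits cleanup and coarse-grained expiry rather than precise timing.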
Surely, DynamoDB's feature list is more extensive than that. But the ones mentioned above are ingredients for a multitude of scalable recipes for problems you might face in your daily routine. From The Poor Man's Event Sourcing Tool 2 to a Globally Distributed Ordered Queue, it is the kind of Swiss Army knife you want to have in your toolbox when you have a complex situation to tackle. You can even create your own ad-hoc scheduling mechanism, allowing you "to schedule an irregular point of time execution of a lambda execution without abusing CloudWatch crons".
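That ad-hoc scheduling trick combines TTL with DynamoDB Streams: when the TTL process deletes an item, a `REMOVE` record appears on the table's stream, and a Lambda listening there runs the "scheduled" work. A sketch of the consuming side, assuming the stream is configured with `OLD_IMAGE` (or `NEW_AND_OLD_IMAGES`) so the deleted item is available:

```python
def expired_items(stream_event: dict) -> list:
    """Filter a DynamoDB Streams event down to deletions.

    TTL-driven deletions arrive as REMOVE records; the item's final
    state is carried in OldImage when the stream view type includes
    old images."""
    return [
        r["dynamodb"]["OldImage"]
        for r in stream_event.get("Records", [])
        if r.get("eventName") == "REMOVE"
    ]

def handler(event, context):
    due = expired_items(event)
    for image in due:
        # Hypothetical: run the job described by the expired item.
        print("due:", image)
    return len(due)
```

Since TTL deletion lags the expiry timestamp, this only works for coarse scheduling, as the quote above implies ("irregular point of time", not precise timing).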
Inevitably, its simple key-value design introduces trade-offs: indexing is quite limited, you can't join tables and you will probably spend a bit of time fine-tuning its read and write provisioning for an optimal cost 3. But its flexibility and simple API, along with thoughtfully designed persistence tables, might be an elegant and affordable solution for your company.
Other "storage first" options
Amazon EventBridge
This is another obscure but intriguing service available in the AWS portfolio. EventBridge is more than an old-fashioned service bus: it is a blistering fast decision tree capable of translating inputs into actions. It can connect with basically everything, from AWS Lambda and AWS Step Functions to Kinesis and SQS. You can even use SNS to trigger a further HTTP request to an external service.
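Publishing to EventBridge is a single `PutEvents` call. A sketch of building one entry; the field names (`Source`, `DetailType`, `Detail`, `EventBusName`) are the real PutEvents parameters, while the values are illustrative:

```python
import json

def make_entry(source: str, detail_type: str, detail: dict,
               bus: str = "default") -> dict:
    """Build one entry for EventBridge's PutEvents API.

    Rules on the target bus match on Source/DetailType/Detail and
    forward the event to Lambda, Step Functions, SQS, etc."""
    return {
        "Source": source,
        "DetailType": detail_type,
        "Detail": json.dumps(detail),
        "EventBusName": bus,
    }

# With boto3 this would be published as:
# boto3.client("events").put_events(Entries=[make_entry(
#     "app.orders", "OrderCreated", {"id": 42})])
```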
Lambda Destinations
With Destinations, you can route asynchronous function results as an execution record to a destination resource without writing additional code. An execution record contains details about the request and response in JSON format including version, timestamp, request context, request payload, response context, and response payload. For each execution status such as Success or Failure you can choose one of four destinations: another Lambda function, SNS, SQS, or EventBridge. Lambda can also be configured to route different execution results to different destinations.
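Destinations are configured on the function, not in its code, via the `PutFunctionEventInvokeConfig` API. A sketch using boto3; the function name and ARNs are placeholders:

```python
def destination_config(on_success_arn: str, on_failure_arn: str) -> dict:
    """Build the DestinationConfig for Lambda's PutFunctionEventInvokeConfig.

    Each ARN may point to another Lambda function, an SNS topic, an SQS
    queue, or an EventBridge event bus."""
    return {
        "OnSuccess": {"Destination": on_success_arn},
        "OnFailure": {"Destination": on_failure_arn},
    }

# Applied with boto3 (placeholder names):
# boto3.client("lambda").put_function_event_invoke_config(
#     FunctionName="my-async-fn",
#     DestinationConfig=destination_config(success_arn, failure_arn),
# )
```

Routing failures to an SQS queue this way is a common replacement for a hand-rolled dead-letter setup.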
-
"A fully managed proprietary NoSQL database service that supports key-value and document data structures. DynamoDB can handle more than 10 trillion requests per day and can support peaks of more than 20 million requests per second" - as was seen in the Amazon DynamoDB description page - 05/Jun/2020.
-
Depending on how you design your tables, TTL and stream listeners, it might be cheaper than spinning up and maintaining a Kafka cluster.
-
The pricing model is not the same pay-as-you-go that you find in most AWS services; instead of paying per request, you pay for the provisioned read and write capacity of your tables. It's true that, a while ago, Amazon introduced auto-scaling capabilities for table provisioning, but you still have to keep its pricing model in mind, otherwise you might run out of budget.
-
Eric's explanation about Lambda Destinations: "There is this really cool thing we announced last Re:Invent called Lambda Destinations. And the way this works is I can run a function and if it is successful then I can just trigger some data into EventBridge, Lambda, SNS or SQS. Or if it fails, I can then trigger data into the same destinations."