Serverless Solution to Offload Polling for Asynchronous Operation Status Using Amazon S3 – InfoQ.com

Nov 01, 2021, 12 min read
by Cristian Gherghinescu, reviewed by Steef-Jan Wiggers

Asynchronous APIs come with several advantages, such as decoupling, scaling, and resilience. However, as there is no such thing as a free lunch, you need to consider the added complexity on both the client and the server side.
Getting the status of the asynchronous operation often involves the client periodically polling for the result. This operation leads to wasted resources on both ends.
This article proposes a solution to redirect the polling part to the Amazon Simple Storage Service (S3).
S3 is a highly available, scalable, and secure object storage service managed by Amazon Web Services (AWS).
The article will present a serverless implementation using AWS Lambda functions, but this is not mandatory if you want to use S3.
You could, for example, use Docker containers as well.
A typical serverless implementation of an asynchronous API on the AWS platform involves the Amazon API Gateway, some lambda functions, an SQS queue, and, in our example, a NoSQL key-value database: DynamoDB. Below you can see the high-level architecture diagram:

For the sake of simplicity, the API has only one resource /order with POST to add a new order and GET /order/{id} to retrieve the order. We assume that creating an order takes some time; therefore, the request is asynchronous. Clients call the endpoint and receive back an order id. With this id, they have to poll the GET endpoint to check when the order is created. Of course, if the clients have a callback endpoint that can be called or if they can receive a notification when the order is created, the polling is not needed.
Even though it is simple to call an endpoint every second or two, polling is inefficient and wastes resources on both the client and the server side. Moreover, some clients cannot expose a webhook endpoint or consume notifications, and sometimes there is simply not enough time to implement these mechanisms.
One way to relieve the server-side part would be to delegate this polling to a managed service from AWS. We can use the Amazon Simple Storage Service (S3) for this.
Amazon S3 is among the first services offered by Amazon Web Services. It is an object storage service that offers high scalability, availability, and performance. Its structure loosely resembles a filesystem: buckets contain objects, and each object consists of a file plus any metadata describing that file.
We can use S3 to store the status of the asynchronous operation as a JSON file, and the clients of the API will call this service instead of polling on our API. In this way, all the traffic from all the clients checking for the status update will be redirected to the S3 API instead of our own API.
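As a rough illustration of the client side, the sketch below polls a status URL until the operation reaches a final state. The status values `COMPLETED` and `FAILED`, the JSON shape, and the function names are assumptions made for this example, not part of the article's API. It also assumes the status file already exists when polling starts.

```python
import json
import time
import urllib.request


def fetch_status(url):
    """GET the status file behind the (presigned) URL and parse it as JSON."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def poll_until_done(url, fetch=fetch_status, interval=2.0,
                    max_attempts=30, sleep=time.sleep):
    """Poll the status URL until a final status is seen or attempts run out.

    The final status names below are hypothetical; use whatever your
    status file actually contains.
    """
    for _ in range(max_attempts):
        document = fetch(url)
        if document.get("status") in ("COMPLETED", "FAILED"):
            return document
        sleep(interval)
    raise TimeoutError("operation did not reach a final status in time")
```

The `fetch` and `sleep` parameters are injectable only so the loop can be exercised without network access; a real client would normally keep the defaults.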
In order not to propagate the credentials or any other authentication mechanism to our API clients, we will use the presigned URL feature from S3. By default, all the buckets and files are private. However, for a limited amount of time, we can share some of the files by using a presigned URL (without exposing AWS security credentials and permissions).
The Lambda function that receives the POST request will generate the presigned URL for the S3 file that holds the operation's status and return it to the client. The name of this S3 file will also be added as an attribute to the message sent to SQS, so that the processing part has it as a reference when it needs to update the status.
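The hand-off described above could look roughly like the following sketch. The attribute name `StatusFileKey`, the function names, and the status document shape are hypothetical. The consumer side assumes the message dictionary has the shape returned by the SQS `receive_message` call (a Lambda SQS event record uses different key casing).

```python
import json

STATUS_KEY_ATTRIBUTE = "StatusFileKey"  # hypothetical attribute name


def enqueue_order(sqs_client, queue_url, order, status_key):
    """Send the order to SQS and attach the S3 key of its status file."""
    sqs_client.send_message(
        QueueUrl=queue_url,
        MessageBody=json.dumps(order),
        MessageAttributes={
            STATUS_KEY_ATTRIBUTE: {
                "DataType": "String",
                "StringValue": status_key,
            }
        },
    )


def update_status(s3_client, bucket, message, status):
    """Overwrite the status file referenced by a received SQS message."""
    status_key = message["MessageAttributes"][STATUS_KEY_ATTRIBUTE]["StringValue"]
    s3_client.put_object(
        Bucket=bucket,
        Key=status_key,
        Body=json.dumps({"status": status}).encode("utf-8"),
        ContentType="application/json",
    )
    return status_key
```

With boto3, `sqs_client` and `s3_client` would be `boto3.client("sqs")` and `boto3.client("s3")`; they are passed in here so the functions stay easy to test.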

The AWS SDKs offer functionality to generate these presigned URLs. Below you can see an example in Python for a GET URL to the object with the key 'OBJECT_KEY' from the S3 bucket 'BUCKET_NAME' that will expire in 10 minutes:
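A minimal sketch of such a call, using the boto3 `generate_presigned_url` client method. The wrapper function name is an assumption for this example, and the S3 client is passed in as a parameter rather than created inside the function.

```python
def create_status_url(s3_client, bucket_name, object_key, expires_in=600):
    """Return a presigned GET URL for the status file.

    expires_in is in seconds; 600 seconds matches the 10 minutes
    mentioned in the text.
    """
    return s3_client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket_name, "Key": object_key},
        ExpiresIn=expires_in,
    )


# Usage with the real SDK (requires boto3 and configured AWS credentials):
#   import boto3
#   url = create_status_url(boto3.client("s3"), "BUCKET_NAME", "OBJECT_KEY")
```

The URL is signed locally with the credentials of the caller, so the resulting link carries only the permissions that the signing identity has on that object.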
For examples in other programming languages, check out AWS documentation.
Note that this functionality can be used as well from Docker containers or self-hosted applications. If you can not use one of the AWS SDKs (Java, .NET, Ruby, PHP, Node.js, Python, or Go), there is also the AWS S3 REST API or the AWS Command Line Interface. It’s not a requirement to use serverless lambda functions.
The Lambda function that returns the presigned URL used for polling could also include in the response a time estimate for when the client could start asking for the operation status. This time estimate can be based on the approximate number of messages in the SQS queue, the approximate number of messages in flight (received by a consumer but not yet deleted, and whose visibility timeout has not yet expired), and the average time it takes to process one request. Below you can see an example in Python on how to get those numbers from an SQS queue:
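A minimal sketch of reading those two counters with the boto3 `get_queue_attributes` call and turning them into an estimate. The function name, the `workers` parameter, and the simple linear formula are assumptions for illustration; a real estimate would depend on how your consumers scale.

```python
def estimate_wait_seconds(sqs_client, queue_url, avg_seconds_per_message,
                          workers=1):
    """Estimate how long until a newly queued message has been processed."""
    response = sqs_client.get_queue_attributes(
        QueueUrl=queue_url,
        AttributeNames=[
            "ApproximateNumberOfMessages",
            "ApproximateNumberOfMessagesNotVisible",
        ],
    )
    attributes = response["Attributes"]
    # SQS returns attribute values as strings.
    waiting = int(attributes["ApproximateNumberOfMessages"])
    in_flight = int(attributes["ApproximateNumberOfMessagesNotVisible"])
    # Assumed model: the backlog is shared evenly between the workers.
    return (waiting + in_flight) * avg_seconds_per_message / workers


# Usage with the real SDK (requires boto3 and configured AWS credentials):
#   import boto3
#   seconds = estimate_wait_seconds(boto3.client("sqs"), queue_url, 1.5)
```

Both counters are approximate by design, so the result is best presented to clients as a hint rather than a guarantee.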
When S3 stores the status of asynchronous operations, the more recent statuses will be queried more frequently, and the old ones might not be read at all after a while. Therefore, depending on your use case, you could take advantage of the different storage classes offered by S3. At the time of writing, these are the available classes and their costs (for the Ireland region):
 
Table source
The storage class of objects is managed with S3 lifecycle rules. For example, you can have a rule specifying that the files are kept in S3 Standard for 30 days (the minimum age S3 requires before a lifecycle transition to Standard-IA), then moved to S3 Standard-IA, and after a further 30 days deleted or moved to S3 Glacier Deep Archive. The lifecycle can be configured through the Amazon S3 console, REST API, AWS SDKs, and AWS CLI. For more information, check out the documentation.
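As an illustration, a lifecycle configuration of this kind can be expressed as the dictionary that the boto3 `put_bucket_lifecycle_configuration` call accepts. The rule ID, the `status/` prefix, the function name, and the day counts are assumptions for this sketch. S3 applies minimum-age and minimum-size constraints to Standard-IA transitions, so check the current documentation before relying on a transition for small status files.

```python
def status_lifecycle_rules(prefix="status/", ia_after_days=30,
                           expire_after_days=60):
    """Build a lifecycle configuration for the status files.

    Objects under the prefix move to Standard-IA after ia_after_days
    and are deleted after expire_after_days (both counted from creation).
    """
    return {
        "Rules": [
            {
                "ID": "status-files",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"}
                ],
                "Expiration": {"Days": expire_after_days},
            }
        ]
    }


# Applying it with the real SDK (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="BUCKET_NAME",
#       LifecycleConfiguration=status_lifecycle_rules(),
#   )
```

If old statuses are never read again, a rule with only the `Expiration` element may be the simpler choice.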
Although all the files and buckets in S3 are private by default, a presigned URL grants access to its file until the URL expires. Anyone holding the presigned URL will be able to read that status file. Therefore, communication with the API should happen only over HTTPS, no sensitive data should be stored in the status file, and the expiry time of the URL should be as short as possible, but not shorter than the actual operation might take.
Another extra protection measure could be taken on the S3 side so that only certain IP ranges are allowed access. This can be achieved with a policy added to the bucket, as exemplified on this AWS documentation page.
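A bucket policy of this kind could be built roughly as follows. This is a sketch, not the policy from the AWS documentation page: the statement ID, the `status/` prefix, and the function name are assumptions. It uses an explicit `Deny` with the `NotIpAddress` condition on `aws:SourceIp`, so reads from outside the listed ranges are rejected even with a valid presigned URL.

```python
import json


def ip_restricted_policy(bucket_name, allowed_cidrs, prefix="status/"):
    """Return a bucket policy (as JSON text) that denies GetObject on the
    status files for requests coming from outside the allowed IP ranges."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyStatusReadsOutsideAllowedRanges",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/{prefix}*",
                "Condition": {
                    "NotIpAddress": {"aws:SourceIp": list(allowed_cidrs)}
                },
            }
        ],
    }
    return json.dumps(policy)


# Applying it with the real SDK (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("s3").put_bucket_policy(
#       Bucket="BUCKET_NAME",
#       Policy=ip_restricted_policy("BUCKET_NAME", ["203.0.113.0/24"]),
#   )
```

Because the statement covers only `s3:GetObject`, the processing side can still write status files from any address, but anything that needs to read them must come from an allowed range.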
Suppose the presigned URL mechanism is not secure enough for your use case. In that case, you could use the AWS Security Token Service (AWS STS) to create and provide your clients with temporary security credentials that can control access to your S3 operation status files. For identity federation, AWS STS supports both Enterprise identity federation (custom identity broker or SAML 2.0) and Web identity federation (login with Google, Facebook, Amazon, or any OpenID Connect compatible identity provider). For more information, check their documentation.
Delegating the polling to S3 will allow the main service to process actual business logic requests instead of constant checks for updates. As a result, our serverless example translates to fewer function invocations and fewer read capacity units consumed from DynamoDB.
Although AWS Lambda functions scale pretty fast and can handle a high number of concurrent requests, you still need to consider the concurrency limits. Depending on the AWS Region, the initial burst limit is between 500 and 3000, applied to all the functions from the account. Not consuming the concurrency with polling will leave more capacity for the rest of the functions. For a full list with all the lambda limits, check out the AWS documentation.
Polling also wastes read request units from DynamoDB. One read request unit represents one strongly consistent read request, or two eventually consistent read requests, for an item up to 4 KB in size. Moreover, if your table is configured in provisioned mode, where you specify the number of read capacity units, some of the requests might get throttled. There is also the on-demand mode, where the capacity adjusts to the traffic. In either mode, polling consumes read capacity without serving any actual business request.
The cost benefits start to show above a million requests. For hundreds of thousands of requests, the difference is small. Below you can see an example of a cost calculation.
We take 100 000 requests and assume that there will be an average of 10 poll requests for each request, therefore a total of 1 million requests. The following calculations have been implemented with AWS Pricing Calculator for the Ireland AWS region.
API Gateway REST API is straightforward: 1,000,000 requests x 0.0000035000 USD = 3.50 USD
For lambda functions we will assume an average of 500 ms execution time and allocate 256 MB of memory:
Total cost for lambda: 2.08 USD + 0.20 USD = 2.28 USD
For DynamoDB we estimate a 10 KB average item size and we will use eventually consistent reads.
Total for reads from Dynamo: 1,500,000.00 total read request units x 0.000000283 USD = 0.42 USD read request cost
Total cost for polling requests would be: 3.50 USD (API Gateway) + 2.28 USD (Lambda) + 0.42 USD (reads from DynamoDB) = 6.20 USD
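The arithmetic behind these figures can be reproduced with a short script. The API Gateway and DynamoDB unit prices are the ones quoted above. The Lambda unit prices (0.20 USD per million requests and 0.0000166667 USD per GB-second) are assumptions that happen to reproduce the 0.20 USD and 2.08 USD components, so check current pricing before reusing them.

```python
import math

POLL_REQUESTS = 1_000_000  # 100,000 operations x 10 polls each

# API Gateway REST API: price per request as quoted in the article.
api_gateway = round(POLL_REQUESTS * 0.0000035, 2)

# Lambda: 500 ms per invocation with 256 MB of memory (assumed unit prices).
gb_seconds = POLL_REQUESTS * 0.5 * (256 / 1024)
lambda_compute = round(gb_seconds * 0.0000166667, 2)
lambda_requests = round(POLL_REQUESTS / 1_000_000 * 0.20, 2)
lambda_total = round(lambda_compute + lambda_requests, 2)

# DynamoDB: a 10 KB item needs ceil(10 / 4) = 3 units for a strongly
# consistent read and half of that for an eventually consistent read.
units_per_read = math.ceil(10 / 4) * 0.5
read_request_units = POLL_REQUESTS * units_per_read
dynamodb_reads = round(read_request_units * 0.000000283, 2)

# The article adds the rounded components, so the script does the same.
polling_total = round(api_gateway + lambda_total + dynamodb_reads, 2)

print(f"API Gateway: {api_gateway:.2f} USD")
print(f"Lambda:      {lambda_total:.2f} USD")
print(f"DynamoDB:    {dynamodb_reads:.2f} USD")
print(f"Total:       {polling_total:.2f} USD")
```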
This cost is slightly overestimated. The Lambda functions might take less than 500 ms to respond, and 128 MB of memory might be enough for them.
For S3 we estimate a 1 GB (100,000 x 10 KB) Standard storage per month:
S3 Data transfer, outbound Internet, tiered pricing for 1 GB:
S3 total cost: 0.92 USD + 0.00 USD = 0.92 USD
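The line items behind the 0.92 USD total are not listed above. One breakdown that is consistent with it is sketched below. All three unit prices and the request counts (one PUT per operation, ten GETs per operation) are assumptions chosen for illustration, not figures taken from the article, so verify them against current S3 pricing.

```python
OPERATIONS = 100_000
POLLS_PER_OPERATION = 10

# Assumed unit prices (USD); not quoted in the article.
STORAGE_PER_GB_MONTH = 0.023
PUT_PER_1000 = 0.005
GET_PER_1000 = 0.0004

# 100,000 status files of about 10 KB each is roughly 1 GB.
storage = round(1 * STORAGE_PER_GB_MONTH, 2)

# Assumes exactly one status write per operation.
put_requests = round(OPERATIONS / 1000 * PUT_PER_1000, 2)

# Every poll becomes one GET against S3.
get_requests = round(OPERATIONS * POLLS_PER_OPERATION / 1000 * GET_PER_1000, 2)

# Outbound data transfer for 1 GB is 0.00 USD according to the article.
data_transfer = 0.00

s3_total = round(storage + put_requests + get_requests + data_transfer, 2)
print(f"S3 total: {s3_total:.2f} USD")
```

If the processing side writes several intermediate statuses per operation, the PUT component grows accordingly, since under these assumed prices a write costs more than ten times as much as a read.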
Note that in order to compare as closely as possible, these calculations include only the costs related to the actual requests. Therefore, any other extra costs are not included, for example, the storage costs for DynamoDB.
The cost difference is not that big. However, it’s included so that you can get an overview of how this is calculated.
Offloading polling to S3 comes with all these benefits, but it also adds extra complexity to the overall solution. You need to involve another service: S3, and create a presigned URL for every operation. If the status files contain any sensitive information, this solution might add a higher risk because anyone getting the presigned URL will have access to that information. Most of the benefits will materialize when there are a lot of calls from many clients, and they are polling at short intervals. Therefore, in a situation with just a few calls from time to time, the main API could also handle the polling traffic without the need to use S3.
The article showed how you could use AWS S3 to handle the polling traffic from an asynchronous API. If you can not implement a notification strategy and the clients need to poll for the operation result, then S3 can be a good candidate to take those calls from your main API. Generating an S3 presigned URL for each operation and returning it to the client so they can call it will allow your compute resources to handle the main business logic of your application instead of calls to check the status of the operation.
The example from the article presented a serverless API. However, this mechanism can also be used from other kinds of applications like those hosted in Docker containers, virtual machines, or even self-hosted. The benefits will start to show for many calls at short intervals. If just a few clients make calls from time to time, adding one more system to the solution might not prove that efficient.
Cristian Gherghinescu has been working in the software development field since 2006. He is currently a Software Architect at Visma, a Norwegian-based company. Cristian started with C# and Java EE and now is focusing on adapting the current solution to the AWS platform. Lately he became enthusiastic about Serverless solutions.
