As cloud providers continue to expand their catalogs of services, it has become increasingly complex for development shops to select the best approach for a new cloud-native architecture.
One popular topic of discussion is whether a function as a service (FaaS)-based approach or container-based approach is appropriate.
At projekt202, we realize that there is no one-size-fits-all solution and like to arm our clients with the knowledge necessary to make informed technology decisions.
Let’s walk through some of the pros and cons to consider with each approach:
A FaaS offering, such as AWS Lambda or Azure Functions, is marketed as the be-all and end-all of cost savings. But is that entirely accurate? The answer is: it depends.
The pricing behind Azure Functions and Lambda is quite similar: both bill as a function of (# of requests) X (GB-sec), where a GB-second is allocated memory (in GB) multiplied by total runtime (in seconds). FaaS cost therefore depends on the number of requests, the amount of allocated memory (MB), and the number of milliseconds each invocation takes to run. An important note is that FaaS cost scales linearly with each of memory, duration, and number of requests.
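As a rough sketch, the billing model can be expressed in a few lines of code. The rates below are illustrative placeholders, not current published prices; the point is the shape of the formula and its linear scaling.

```python
def faas_cost(requests, memory_mb, duration_ms,
              price_per_gb_sec=0.0000166667,        # illustrative per-GB-second rate
              price_per_million_requests=0.20):     # illustrative per-request fee
    """Estimate FaaS spend: GB-seconds consumed plus a small per-request fee."""
    gb_seconds = requests * (memory_mb / 1024) * (duration_ms / 1000)
    compute = gb_seconds * price_per_gb_sec
    request_fee = (requests / 1_000_000) * price_per_million_requests
    return compute + request_fee

# Linear scaling: doubling the request count exactly doubles the bill,
# since both terms are proportional to the number of requests.
base = faas_cost(1_000_000, memory_mb=128, duration_ms=100)
doubled = faas_cost(2_000_000, memory_mb=128, duration_ms=100)
```

Doubling allocated memory or duration likewise doubles the compute term, which is why right-sizing memory is one of the first Lambda cost optimizations.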
SYNCHRONOUS BEHAVIOR & FUNCTION CHAINING
So now that we know how to calculate the cost of FaaS, what other factors can influence the price tag? Let's start with duration. It's important to remember that your function is still running, and still billing, during synchronous calls to other services. So if you're calling another function, a database, or a third-party API, you will be charged for the time your function spends idling.
In a FaaS architecture, you’ll need to code against a few different scenarios. Some of these cases include:
- Your database is under load and your usual 50ms query is now taking 5 sec.
- Your typically reliable third-party API is timing out and you’re waiting for the response.
- Your function has a long call chain. For example, your Lambda needs to call 9 other Lambdas synchronously. Each Lambda bills a minimum of 100ms per invocation (Azure rounds to the nearest 1ms), so although each function executes in 10ms, you will be billed for 10 x 100ms minimum, or 1 full second.
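The call-chain arithmetic above can be sketched as a quick calculation, assuming the 100ms minimum billing increment described in the bullet:

```python
import math

BILLING_INCREMENT_MS = 100  # Lambda's minimum billed duration per invocation

def billed_ms(actual_ms):
    """Round actual runtime up to the nearest billing increment."""
    return math.ceil(actual_ms / BILLING_INCREMENT_MS) * BILLING_INCREMENT_MS

# A chain of 10 lambdas, each finishing in 10 ms, still bills 10 x 100 ms = 1 second.
chain_total = sum(billed_ms(10) for _ in range(10))
```

Note this undercounts the real bill for synchronous chains: the calling function also accrues billed time while it waits on each downstream call.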
Unlike FaaS, containers do not automatically scale up for each incoming request. Instead, the number of containers is either pre-selected or managed by auto scaling, which adds containers based on measurements of latency, host utilization, etc. The most notable impact of this model is that your containers cost the same regardless of whether you're receiving 10k requests or zero; containers do not idle for free like a FaaS solution. The cost of containers will be a function of required resources (vCPU & memory), number of required containers, and total uptime.
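The container cost function can be sketched the same way. Again, the per-resource rates are illustrative placeholders; the key contrast with FaaS is that request count does not appear anywhere in the formula.

```python
def container_cost(vcpu, memory_gb, num_containers, uptime_hours,
                   price_per_vcpu_hour=0.04048,   # illustrative per-vCPU-hour rate
                   price_per_gb_hour=0.004445):   # illustrative per-GB-hour rate
    """Containers bill for provisioned resources and uptime, not requests."""
    hourly = vcpu * price_per_vcpu_hour + memory_gb * price_per_gb_hour
    return hourly * num_containers * uptime_hours

# The bill is identical at 10k requests or zero; only resources, container
# count, and uptime matter.
idle_day = container_cost(vcpu=1, memory_gb=2, num_containers=2, uptime_hours=24)
```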
As an example, let’s use AWS to compare cost between FaaS and containers. AWS has recently released a flurry of cost-cutting measures (reducing overall container spend), but these cost reductions also complicate the decision-making process. Some of these measures include rate reductions, spot instances, and reserved capacity.
To determine container cost, you first need to select your desired runtime in AWS. There are two questions that need to be answered:
- Do you want to use ECS or Kubernetes for your container management & orchestration platform?
- Do you want to manage the underlying EC2 hosts or have AWS be responsible for managing the underlying hosts?
The primary cost difference between container options comes from the answer to the second question: that is, do you want to manage the underlying EC2 hosts? Previously, Fargate carried a hefty premium for the management it offered. After a round of price reductions and the new Compute Savings Plans (reserved capacity), Fargate has become more competitive.
To determine the cost-effectiveness of managing your own hosts, you need to look at your host utilization. In general, the higher your host utilization, the more attractive managing additional layers of your stack becomes. Ignoring the discount of reserved capacity, the break-even point for on-demand pricing comes at around 70% utilization. If you're able to use 70% or more of each EC2 host, then EC2 will be less expensive; if you're utilizing less than 70%, then Fargate is more cost-effective. Using reserved capacity for Fargate will further reduce the cost by up to 50%.
Ignoring other variables and looking strictly at host utilization, the rule of thumb is that Lambda will save at low utilization, Fargate will save at medium utilization, and managing ECS on top of EC2 will save at high utilization.
HIGH AVAILABILITY & HIDDEN COSTS
There are other costs to running your infrastructure beyond container uptime or # of requests for your lambda. For example, what is your approach to maintaining service in the face of failure? Traditionally, achieving this resiliency requires deploying your infrastructure to separate regions, allowing one region to remain in service while the other experiences an outage. A container-based approach requires just this: deploying instances of your software to multiple regions so that your system continues to function if one region is down.
In the above scenario, FaaS offers a unique advantage. Due to the pay nothing while idle model, you are able to operate in a secondary fallback region free of charge. If one region has an outage, the other region is able to automatically scale up so that service continues uninterrupted.
TAKEAWAYS ON COST
A useful over-simplification of cost is to think of lambdas being billed per request, whereas containers are billed per minute. There’s a lot of nuance we’re skipping, but it helps us form some generalizations on cost.
FaaS excels at unpredictable, short-duration, and overall lower-utilization requests. Strong use cases include stream processing, non-latency-sensitive API Gateway endpoints, CRON jobs, or "glue" between various cloud services. At low utilization (sub-10k requests), the cost comparison isn't even close: FaaS typically offers compute prices that are a fraction of a container-based solution.
Containers excel at predictable, long duration, or memory intensive workloads. If you know that 8 p.m. to 10 p.m. is your busy period of the day, it is cost-effective to scale up and handle the requests and then scale down. In general, the higher your predictable overall utilization, the more you can potentially save by managing additional layers of infrastructure, such as the underlying hosts.
One of the strongest arguments against FaaS involves workloads that are sensitive to latency. The primary culprit is the cold start, which occurs the first time a new function instance is invoked.
A cold start is the combined delay that occurs as the cloud provider provisions an execution runtime and your FaaS code goes through any necessary initialization (e.g., your framework does classpath scanning). The time required by the cloud provider to provision your function varies with language choice and package size. A small JS function will start up in roughly 750ms on AWS (2-11 sec on Azure). A 15MB package adds about 3 sec to a Lambda's cold start time and 5 sec on Azure. For this reason, it's important to be aware of the disk space your library dependencies add; the larger the function, the more time it takes to transfer to a new runtime.
A cold start will occur whenever your function is redeployed, is inactive, or receives concurrent requests. Again using AWS as an example, Lambda functions can be reused; however, AWS will begin reaping unused functions after around 4.5 minutes of inactivity, and a function has a near-100% chance of being reaped after 10 minutes of idling. Each concurrent invocation will also create a new Lambda instance, such that 100 concurrent invocations will cause 100 parallel cold starts if no idle instances are available for reuse. It's worth noting that you will not be billed extra for concurrency itself: Lambda doesn't charge for the number of instances it manages, only for the number of requests, memory size, and duration.
There are two invocation types for Lambdas. When dealing with a streaming source, such as Kinesis Streams or DynamoDB Streams, the number of shards per stream is the unit of concurrency: if you have 50 active shards, AWS will provision 50 Lambda instances. When dealing with synchronous source types, such as API Gateway, new instances will be provisioned as concurrent requests occur. These instances will be reused if possible.
At re:Invent 2019, Amazon announced Provisioned Concurrency to reduce the number of cold starts your application will experience. This announcement aims to rid AWS of the cold start complaint; however, it comes at a cost. Provisioned Concurrency allows you to specify a number of concurrent instances that AWS will keep warm for an hourly fee. While this is great for companies seeking FaaS for their next architecture, it goes against Lambda's original vision of "pay per request": users now have to do capacity planning and pay extra for "pre-warmed" capacity. If you're considering Provisioned Concurrency, I would recommend re-evaluating whether FaaS is the right choice for you instead of a managed container service, such as Fargate.
BRING YOUR OWN FRAMEWORK
To avoid lengthy startup times in your code, you may need to re-evaluate the framework and libraries that you're using. Since your code may be booted from scratch on each execution, it's important to avoid the latency and cost caused by extra startup time. This might mean shedding a framework to which your dev shop has become accustomed. For example, Spring Boot may add multiple seconds (or minutes, depending on app size and framework version) to startup time. Hibernate similarly adds multiple seconds to project initialization and, when unoptimized, can slow database interactions by as much as 400%. Also consider lower-level libraries, such as database connection pools. A database connection can be costly to establish, and you may need workarounds for connection pooling to work: it relies on global state, which is available on a per-instance basis, even though Lambdas are commonly thought of as (and should be written to be) stateless.
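The connection-reuse workaround mentioned above is commonly done with module-level state, which survives across warm invocations of the same Lambda instance. A minimal sketch in Python, using sqlite3 as a stand-in for a real database client:

```python
# Module-level state persists between warm invocations of one Lambda instance,
# so an expensive resource like a database connection is created once per instance.
import sqlite3  # stand-in for a real database client such as psycopg2

_connection = None  # initialized lazily; reused on every warm start

def get_connection():
    global _connection
    if _connection is None:
        # This cost is paid only on a cold start; warm invocations skip it.
        _connection = sqlite3.connect(":memory:")
    return _connection

def handler(event, context):
    conn = get_connection()
    row = conn.execute("SELECT 1").fetchone()
    return {"statusCode": 200, "body": str(row[0])}
```

The caveat is the one the paragraph raises: each concurrent instance holds its own connection, so high concurrency can still exhaust a database's connection limit.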
CONTAINER COLD STARTS
Containers can be thought of as having their own "once per instance" cold start. As with Lambdas, an auto-scale event may need to grow your underlying instances; in that case, your Docker image must be pulled from an image repository and your container must boot. The infrastructure time required for this greatly exceeds that of a Lambda cold start. However, there are two primary differences worth noting.
First, the “cold start” time for a container is a “pay once” occurrence. Once your container is running, it remains active until you decide to terminate it.
Second, through the use of load balancers, your new container will not register itself to take incoming traffic until it has fully booted. Despite the container boot time taking longer, this registration event will prevent your consumers from experiencing the latency of this cold start. This is because the container isn’t in service until it is ready to respond to requests.
TAKEAWAYS ON LATENCY
Traffic spikes, particularly ones with concurrent requests, will cause your lambda function to experience an increased number of cold starts. The duration of your cold start is heavily impacted by your language choice and the size of your lambda package. Provisioned capacity is available to combat cold starts, but needing provisioned capacity is a smell that may indicate that FaaS isn’t the ideal choice for your architecture.
If your development shop has found efficiencies with a particular framework, that framework may not be well-suited to a Lambda execution environment. While the architecture of Lambda is attractive, you must weigh the cost and efficiency of your development team.
You must also consider the cost of your consumers experiencing a cold start. While both containers and Lambdas have startup times associated with them, containers pay their (larger) cost only once, and they are shielded from traffic until they're fully booted and registered to accept incoming requests. For use cases that are sensitive to latency, containers win.
COMPLEXITY AND COST OF CHANGE
A microservice architecture divides your solution into separate bounded contexts, allowing each problem space to be encapsulated. This bounded context introduces an API between the seams of your system, creating an unavoidable physical boundary. By preventing direct access to implementation specific code and data, you are led to a loosely coupled architecture. Each microservice of the system now has fewer responsibilities and is easier to update. This enables the release of components independent from one another, reducing the dependencies required for a deployment.
This flexibility comes at a cost. Identifying a bounded context isn’t always simple, especially in greenfield applications. Often, a feature needs to reach critical mass before it can be identified as its own domain. Alternatively, starting with each API endpoint as its own stand-alone service leads to a nanoservice architecture. While a nanoservice architecture sounds fantastic, with great power comes great responsibility. Having such granularity in your distributed system makes it difficult for developers to trace flows of execution, increasing the time and effort to troubleshoot issues. To complete a new feature, you may need to touch a dozen different services. To troubleshoot a bug, you may need to trace through and understand the timing of multiple services, databases, and event queues. The creation of each nanoservice is simple, but the overall time to integrate, monitor and orchestrate increases as the number of services grows.
It is an architect's responsibility to determine the best fit between a monolith, microservice, and nanoservice architecture. Each shines under different use cases. When choosing containers, teams are free to select between these architecture paradigms. When choosing FaaS, nanoservices are pre-selected for you. While the FaaS services will internally be loosely coupled, the testing and integration between them is crucial. Simple problems often require simple solutions. Nanoservices are easy to write, but complex to reason through.
One of the advantages that FaaS offers over a container service is decreased upkeep. Containers allow you to create an image that includes all of your dependencies pre-packaged in a known and tested configuration. While this is incredibly powerful, it requires teams to update these dependencies as new versions and security patches are released.
With a FaaS offering, cloud providers take ownership of this responsibility. Security patches, fleet health, and capacity provisioning are all handled on your behalf. FaaS allows developers to focus on writing code.
The cost of having a cloud provider manage your infrastructure is a loss of control over the available runtimes. For example, the latest version of Node.js or .NET Core may not yet be available.
Containers excel at portability, allowing them to run on any cloud provider or on a developer’s workstation. Enabling developers to test in a “production-like” configuration directly on their laptops greatly eases the development process. Now, developers can test using the same dependencies, same infrastructure, and same database schemas. This reduces the number of surprises that can occur as the delivery pipeline gets closer to production and each environment becomes symmetrical.
FaaS offerings provide limited libraries and support for unit testing a function. For example, AWS offers the AWS Serverless Application Model (SAM), which allows you to build, debug, and test your function locally. This local runtime is a huge step forward, but unit testing is only one piece of the testing pyramid, and one that becomes less critical as we move toward a nanoservice architecture. Integration testing offers the real value with fine-grained services, and that can only truly be accomplished by deploying to the cloud. This deployment costs money and introduces a remote dependency (the cloud) into developers' testing cycles.
When a container is deployed, it is a complete binary designed for portability: it can run on any cloud provider, in an on-prem datacenter, or on a developer's workstation. For FaaS, the deployment unit is source code, which must adhere to the cloud provider's API spec.
Example handler function for AWS Lambda
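A minimal sketch of such a handler (Python runtime; the event field shown is illustrative) makes the provider-dictated signature concrete:

```python
import json

def lambda_handler(event, context):
    """Entry point invoked by AWS Lambda; the (event, context) signature
    and the response shape are dictated by the provider's API spec."""
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```

The handler name, runtime, and memory size are all declared in the function's configuration rather than in the code itself.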
To avoid potential vendor lock-in, you can utilize an abstraction layer such as the Serverless Framework to decouple your code from a specific FaaS. However, vendor lock-in doesn't come only from the FaaS contract you're adhering to. The majority of cloud lock-in comes from the integration of other cloud services, such as proprietary databases, storage, and service buses. If you plan to leave AWS or Azure, your choice of AWS DynamoDB or Azure Blob Storage will likely cause you far more rework than your choice of FaaS.
Since both a container and FaaS implementation will be integrating heavily with other cloud services (cloud database, cloud storage, etc.), vendor lock-in is roughly equivalent with both solutions. If you’re considering writing abstraction layers to protect against lock-in, I would strongly recommend you consider the time investment for doing so. Abstraction layers are often buggy and a race to the bottom, offering only the most basic and generic capabilities offered across all providers.
TIME TO MARKET
If optimizing for time to market, FaaS is a clear winner. Cloud providers offer quick and easy deployment models that forego the need to manage images, service registries, artifact repositories, etc. All that’s needed is the ability to write and deploy source code. Containers require additional infrastructure that requires a time investment to set up.
TAKEAWAYS ON COMPLEXITY
For problems of minimal complexity or a short time to market, FaaS becomes a strong candidate. It is simple to develop a new function and deploy it all while paying minimal cost.
As complexity grows, being able to establish microservices with cohesive bounded contexts is attractive. Unit testing is simple in all solutions, but local integration testing is something only containers offer.
Due to integration with other cloud services, vendor lock-in is a concern regardless of which architecture is selected. However, the real risk of lock-in should be assessed. Lock-in rarely materializes as a problem on a project. Solving for it before it’s a real problem could add unnecessary time and complexity to a project.
There's a lot to consider when evaluating the decision between containers and FaaS. To decide properly, you need to estimate your traffic patterns, understand your development and operations capabilities, and assess the complexity of your problem domain. How predictable is your traffic? Is it possible to go viral? How many concurrent requests will you receive, and what is the cost of a customer suffering a cold start? Consider not only the operating cost of your system, but also the manpower that goes into maintaining a distributed system.
Lastly, your architecture does not need to be only FaaS or only containers. There is a lot of value in decomposing your system and choosing one approach for some components and another for the rest. Many companies choose FaaS for the majority of their architecture, but continue to employ containers for components that are latency sensitive. Conversely, they may choose containers for their microservices, but utilize FaaS to integrate multiple cloud-based services together.
If you have committed to a particular technology and are considering migrating, it is certainly possible to change stacks, but it comes at the cost of manpower. The decision to change technology should not be taken lightly. If you're interested in doing so, this will be the topic of a future blog post.
If you would like assistance with selecting your next cloud-native architecture, projekt202 is available to help guide you through all phases of the software development lifecycle.