The cloud industry has debated SLAs for a long time and has standardized on service credits as the compensation for IaaS when the service provider fails to meet the promised SLA. For example, AWS EC2, EBS, Fargate and ECS offer a 10% service credit when uptime falls below 99.99% but stays at or above 99.0%, and a 30% credit when it falls below 99.0%. Other providers offer similar SLAs. The rationale for this kind of SLA is that, with infrastructure services, end users are responsible for the OS and everything above it. It is the responsibility of the end users to ensure that their applications are resilient to failures and meet their uptime needs. This rationale worked well because end users had control over uptime through application architecture and operational efficiencies.
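To make the credit tiers quoted above concrete, here is a minimal sketch in Python. The function name `service_credit_percent` is my own, purely for illustration, and the thresholds are only the ones mentioned in this post. A real SLA defines how uptime is measured and what the credit applies to, so treat this as arithmetic, not as any provider's actual calculation.

```python
def service_credit_percent(uptime_percent: float) -> int:
    """Map an uptime percentage to a service credit percentage.

    Hypothetical helper: the tiers are the ones quoted in this post
    (10% credit below 99.99% but at or above 99.0%, 30% below 99.0%).
    """
    if not 0.0 <= uptime_percent <= 100.0:
        raise ValueError("uptime_percent must be between 0 and 100")
    if uptime_percent >= 99.99:
        return 0   # SLA met, so no credit is owed
    if uptime_percent >= 99.0:
        return 10  # missed 99.99% but stayed at or above 99.0%
    return 30      # fell below 99.0%


# Print the credit tier for a few sample uptime figures.
for uptime in (99.995, 99.5, 98.7):
    print(f"{uptime}% uptime -> {service_credit_percent(uptime)}% credit")
```

Notice that the credit is capped at a fraction of the bill no matter how bad the outage was, which is exactly why this model depends on end users owning their own resilience.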
The recent Meltdown and Spectre news once again brought the responsibility of end users into focus. In this case, users are responsible for patching the operating system for infrastructure services, whereas cloud providers are responsible for patching higher order abstractions like PaaS and Serverless. This led to a lot of hype around how Serverless helps end customers when something as dramatic as the Meltdown and Spectre vulnerabilities hits users.
When AWS published its blog post about the Meltdown and Spectre vulnerabilities, discussions about user responsibility came up on Twitter, and industry veteran Tal Klein clearly pointed out that in the case of serverless, even though the responsibility to patch is with the cloud service provider, there is no accountability for any data loss or downtime. His tweet got me thinking again about a topic I have been focussed on for quite some time as I advocated higher order abstractions like Serverless as the path forward: SLAs for Cloud Functions offered by major public cloud providers.
Defining SLAs for Serverless
The issue is the lack of SLAs for Serverless functions. Neither AWS Lambda nor Google Cloud Functions offers any SLA. Microsoft Azure also offers no SLA for Azure Functions under the consumption model, but Azure Functions is backed by their 99.95% uptime SLA when used with an App Service Plan (an equivalent of Reserved Instances in AWS). In other words, if you are using Functions as a Service from any of the cloud providers, you are on your own.
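For a sense of what a figure like 99.95% means in practice, here is a rough sketch that converts an uptime percentage into a downtime budget. The helper name `allowed_downtime_minutes` and the 30-day billing period are my assumptions for illustration; an actual SLA defines its own measurement window and exclusions.

```python
def allowed_downtime_minutes(sla_percent: float, days: int = 30) -> float:
    """Return the minutes of downtime an uptime SLA tolerates over a period.

    Hypothetical helper: assumes a simple period of `days` full days and
    ignores maintenance windows or other exclusions a real SLA may define.
    """
    if not 0.0 <= sla_percent <= 100.0:
        raise ValueError("sla_percent must be between 0 and 100")
    total_minutes = days * 24 * 60
    return total_minutes * (100.0 - sla_percent) / 100.0


# Compare the downtime budgets of the uptime levels mentioned in this post.
for sla in (99.99, 99.95, 99.0):
    minutes = allowed_downtime_minutes(sla)
    print(f"{sla}% uptime allows about {minutes:.1f} minutes of downtime in 30 days")
```

Under these assumptions, 99.95% works out to roughly 21.6 minutes of tolerated downtime in a 30-day period. Without any SLA, there is no such number to hold a provider to.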
Let us take a look one level below cloud functions in terms of abstraction: the PaaS or CaaS layer. At this level of abstraction, AWS offers SLAs (with service credits, like their infrastructure services) for both Amazon ECS and Fargate. Google offers an SLA for Google App Engine in both the standard environment and the flexible environment. Microsoft doesn’t offer any SLA for Azure Container Instances because it is a free service, but its availability is driven by the SLA of the underlying Virtual Machines. This is not surprising to me because all these services are billed using instance types (even though the Google App Engine flexible environment is billed slightly differently from the standard environment or other cloud providers, it does require you to pick the underlying VM type). In other words, SLAs are available for infrastructure-like services.
Even though SLAs are not available today for Cloud Functions, they will be available in the future as more and more enterprises start using these services. However, we can take advantage of the current lack of SLAs to have a discussion about the nature of SLAs for Serverless. Should we stick to the service credit idea of infrastructure services, or do we hold cloud service providers more accountable? It is important to keep in mind that end users had more control with IaaS but, with Serverless, lose control over both the environment and, to a certain extent, the applications themselves. Does the idea of service credits even make sense in the case of Serverless? Or are we going to give providers a lifelong free “get out of jail” card? As end users of such higher order services, it is your responsibility to demand the right insurance policy for your business. What do you think?
Don’t forget to vote in this Twitter poll
Do you care about Serverless SLAs? https://t.co/cp6YSLK9wN
— Krish Subramanian (@krishnan) January 9, 2018