Physical Computers
We have come a long way from the massive servers of the dotcom era. Back in those days, server infrastructure was mostly on-premises. A business ran its solutions on a physical server. People used entirely separate servers for different purposes (backups, mail server, web server, etc.). When a server failed to keep up with the growing needs of the company, it was replaced by a newer, faster server. You scaled by getting better hardware. You scaled vertically.
Hypervisors
Then came the era of hypervisors. Virtualization gained momentum with the rise of VMware, and people realized that they could get one rack to rule them all: one rack to run all the various use cases, each provisioned with its own separate virtual machine. This also gave rise to cloud computing, and businesses stopped investing directly in server hardware and chose instead to ‘rent’ virtual servers.
Huge and expensive data centers were managed by cloud providers all over the world. Businesses took advantage of this by provisioning their services globally, using the widest possible array of data centers. This was done mainly to reduce latencies, improve customer experience and to target a bigger market.
This also made software authors think in terms of distributed systems. They wrote software to run not on a single giant computer, but on many mediocre ones in a consistent and reliable way. You scaled horizontally.
You could still scale vertically. In fact, because of virtualization, provisioning more resources became easier. You powered down the VM, adjusted its resources and paid your cloud provider a little extra. Piece of cake.
The underlying physical servers have not disappeared. Cloud providers are now responsible for managing complexities of network interfaces, OS compatibility and other terrifying pathologies.
Containers
Then came containers. Containers were an amazingly lightweight abstraction: a virtual environment with its own operating system files and libraries, which allowed software to be packaged and deployed as a single unit. Like virtual machines, each container ran unaware of other containers, but all of them shared the host’s operating system kernel.
This allowed people to deploy software on servers (physical or virtual, it doesn’t matter) at an even higher level of abstraction. You didn’t care about the production operating system. As long as it supported your containerization technology, it would run your software. Containers are also easier to spin up, which made services more scalable than ever.
This further increased the flexibility of distributed systems. With technologies like Kubernetes you can have legions of containers running a complex array of services. Distributed systems offer a lot of benefits: high availability, robustness and the capability to heal themselves after a node failure.
At the same time, because they are so complex, they are also harder to design, deploy, maintain, monitor and debug. This is against the original trend of abstracting the complexity out of your software and delegating that responsibility to your cloud provider. This is where serverless architecture comes in.
Serverless or Function-as-a-Service (FaaS)
The idea of serverless has gained traction mostly because of AWS Lambda, and here I will be using that as a model to talk about serverless. The principles that FaaS is based on are:
- You pay for what you use
- You don’t have to care about scaling
- You focus on your code, leave infrastructure management to AWS
When no one is accessing your services, the services are not active. This is not the case with traditional hosting solutions, where you pay for a VPS that is always up and running, even if it sits idle doing nothing more useful than listening for a new request.
In serverless architecture your service is not running unless someone actually wants to use it. When a request comes in, a service is created on the fly to handle it.
How does it work?
Your function (for example, a Python, Go, or Java program) sits as a file on AWS Lambda. With this function you associate certain trigger events, such as a request arriving at an API gateway or a new object landing in your S3 bucket, and certain resources, such as a database, another object store, or an EC2 instance.
In response to any of the associated trigger events, AWS Lambda creates a container with your function inside it. The function executes and returns a response. For example, if a new image comes into your S3 bucket, your Lambda function can contain machine learning code that analyses the image and writes its output to DynamoDB (one of AWS’ datastore services).
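To make the S3 example a little more concrete, here is a minimal sketch of what such a function might look like in Python. It assumes the standard S3 event notification layout (a list of `Records`, each carrying a bucket name and an object key). The names `analyse_image`, `save_result` and `RESULTS` are hypothetical stand-ins invented for this illustration: the first replaces the machine learning step, and the other two replace the DynamoDB write so that the sketch runs without any AWS dependencies.

```python
import urllib.parse

# Stand-in for a DynamoDB table (an assumption for this sketch). In a real
# deployment save_result would call the AWS SDK instead of appending here.
RESULTS = []


def analyse_image(bucket, key):
    """Hypothetical placeholder for the machine learning step."""
    return {"label": "unknown", "source": f"s3://{bucket}/{key}"}


def save_result(item):
    """Hypothetical placeholder for writing one item to the datastore."""
    RESULTS.append(item)


def handler(event, context):
    """Entry point that Lambda calls with the S3 event payload."""
    processed = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        item = {"image_key": key, **analyse_image(bucket, key)}
        save_result(item)
        processed.append(key)
    return {"processed": processed}
```

Notice that the function keeps no state between invocations apart from what it writes to the datastore, which is what lets the platform create and discard containers freely.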
You don’t have to pay for an entire server, only for the amount of memory you allocated to your function, the number of requests you get, and how long your function runs.
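As a rough illustration of how those three factors combine, the sketch below estimates a monthly bill from the request count, the average duration and the allocated memory. It assumes compute is billed in GB-seconds (memory multiplied by run time) plus a per-request fee. The function name and the two price parameters are placeholders for this example; real rates vary by region and change over time, so look them up rather than trusting any numbers you plug in here.

```python
def estimate_monthly_cost(requests, avg_duration_ms, memory_mb,
                          price_per_million_requests, price_per_gb_second):
    """Estimate a pay-per-use bill from usage figures and assumed rates."""
    # Compute usage: seconds of run time multiplied by gigabytes of memory.
    gb_seconds = requests * (avg_duration_ms / 1000) * (memory_mb / 1024)
    request_cost = requests / 1_000_000 * price_per_million_requests
    compute_cost = gb_seconds * price_per_gb_second
    return request_cost + compute_cost


# Example call with made-up rates (assumptions, not actual AWS prices).
cost = estimate_monthly_cost(
    requests=1_000_000,
    avg_duration_ms=200,
    memory_mb=512,
    price_per_million_requests=0.20,
    price_per_gb_second=0.00002,
)
print(f"Estimated monthly cost: {cost:.2f}")
```

With zero requests the estimate is zero, which is the whole point of the model: an idle function costs nothing.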
Moreover, you don’t have to worry about scaling containers in response to a heavy incoming workload. If a lot of trigger events happen simultaneously, then AWS will take care of spinning up new containers and scheduling workloads between them and all the other complexities.
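To get a feel for what the platform is doing on your behalf, here is a toy model of that scheduling problem. It assumes each container handles one event at a time and that a container which has finished can be reused for the next event. Under those assumptions, the number of containers needed equals the peak number of overlapping invocations. The function name and the input format are invented for this sketch and do not describe how AWS actually implements scheduling.

```python
def peak_concurrency(invocations):
    """Return the largest number of invocations running at the same time.

    invocations: list of (start_time, duration) pairs, in seconds.
    """
    events = []
    for start, duration in invocations:
        events.append((start, 1))              # an invocation begins
        events.append((start + duration, -1))  # an invocation ends
    # Sort by time; at equal times, ends (-1) come before starts (+1),
    # so a container that has just finished counts as free again.
    events.sort(key=lambda e: (e[0], e[1]))
    running = peak = 0
    for _, delta in events:
        running += delta
        peak = max(peak, running)
    return peak


# Four invocations, each given as (start_time, duration).
workload = [(0, 2), (1, 2), (2, 1), (5, 1)]
print(peak_concurrency(workload))
```

In a container setup you manage yourself, you would have to size your cluster for this peak. With FaaS, the provider absorbs that calculation.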
Not a complete solution
When virtual machines came along, the physical servers didn’t cease to exist. When containers arrived we still used VMs. FaaS is a higher-level abstraction, and it fits really well with the modern design of RESTful APIs, stateless services and lightweight languages and runtimes like Python or Node.js.
However, it still runs on a physical server (managed by AWS, for example), it still listens for incoming requests (you just don’t pay for that directly), and you still need to store data in a persistent fashion, which is why it has integrations for S3, EC2, and other services. It is a useful abstraction nonetheless.