The time for serverless is now – tips for getting started
Just because everyone is talking about something, that doesn’t mean it’s actually worth your time. Chris Wahl shares his experiences getting to grips with serverless technology, what he learned throughout the process, and whether, ultimately, serverless is something worth considering.
The world of IT operations is rife with all sorts of “hot new things” being lauded by thought leaders and vendors alike. When your job is to reliably and consistently deliver services and applications to engage and delight your users, it’s tough to absorb the idea of serverless. Even the name makes it sound like you’re going to be tricked into throwing away your servers somehow!
With that said, the ability to deploy code as functions that run anywhere, with the underlying layers abstracted away, sounded interesting to my team of engineers based on our collective experiences. In this post, I'll go over some of the initial serverless sticker-shock moments from the past few years to help you prepare to bring IT operations into the world of serverless, drive higher value, and ultimately do less manual work.
Where to start?
There are a number of places you can start. My team’s first use case with serverless was tackling a long list of cron jobs and scheduled tasks that were often just soaking up idle CPU time on a job or batch server somewhere. This is often the case with on-premises infrastructure; you will find a little pizza box server or set of virtual machines that spend most of their time just waiting to run code based on dates and times.
In one specific example, a series of data center environments used for demonstration purposes were being reset to their base configurations on a nightly basis. Rather than constructing a container or server somewhere within the data center to do this, all of the baseline functions were migrated over to AWS Lambda functions and triggered based on a CloudWatch Event set to a nightly datetime value. This had two immediate benefits:
- The team no longer had to maintain the cron servers. This eliminated thankless and time-intensive patching, securing, and maintenance tasks.
- The total cost of ownership (TCO) was reduced on both the CapEx side (freed-up hardware resources for other workloads, avoiding new spend) and the OpEx side (the team could focus on other things). The majority of our cron functions now cost only a few pennies a month to run.
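As a concrete illustration of the pattern above, here is a minimal sketch of a scheduled Lambda handler for a nightly reset job. The function and environment names are hypothetical, not the team's actual code; in practice a CloudWatch Events (now EventBridge) rule with a cron expression such as `cron(0 6 * * ? *)` would invoke the handler nightly.

```python
def reset_demo_environment(name):
    """Placeholder for the real reset-to-baseline logic (hypothetical)."""
    return f"{name}: reset to baseline"

def lambda_handler(event, context):
    # The scheduled event carries rule metadata (time, rule ARN), but the
    # reset itself does not depend on it here.
    environments = ["demo-east", "demo-west"]  # illustrative names
    results = [reset_demo_environment(env) for env in environments]
    return {"status": "ok", "results": results}
```

Because the schedule lives in the trigger rather than in a crontab on a server, there is no box to patch or keep alive between runs.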
Operationalize first, optimize later
However, things weren’t just magical and rosy from the first pass. Due to the iterative nature of constructing a function, you’ll often want to start by getting your workflow operational and understanding all of the various requirements, permissions, and non-obvious caveats. Later, after having learned more about how serverless operates, you can make more passes across the functional code to streamline your workflow. For example, once our team had the on-premises workflow operating in serverless functions, we then took another pass to optimize, refactor, and slim things down.
In addition, we learned a few things the hard way:
- Longer function timeouts become necessary when calling back into environments that require attaching the function to a network inside a VPC. This is often due to the cold start period, where the default of 30 seconds is not quite enough.
- Python or Go can offer a much better experience than other scripting languages, such as PowerShell, and can help avoid timeouts caused by cold starts.
- Encrypted environment variables are good for storing tokens and keys. However, that's not enough – you'll still want to rotate these secrets on a regular basis. Our longer-term goal was to use a rotation service backed by an encrypted vault, but encrypted environment variables helped kick-start things in the beginning.
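To make the environment-variable point concrete, here is a small, hedged sketch of the pattern: a helper that reads a token from the function's environment and fails loudly if it is absent. The variable name is illustrative; in Lambda the value can be encrypted at rest with a KMS key, and rotation still has to happen outside the function.

```python
import os

def get_secret(name):
    """Read a token or key from an environment variable.

    In Lambda these can be encrypted at rest with a KMS key (or encrypted
    client-side and decrypted in code). Rotation is not handled here; that
    is the job of an external rotation service or vault.
    """
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"missing required secret: {name}")
    return value
```

Failing fast on a missing secret surfaces misconfiguration at invocation time rather than deep inside a workflow.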
The biggest investment with serverless
As you seek to adopt serverless, you might ask: what was the biggest time investment in the move? The answer is easily documentation, and design documentation in particular. Because your serverless functions are typically standalone, loosely organized, and triggered by a variety of different inputs (API gateways, CloudWatch Events, Lex inputs, and so on), it's a good idea to maintain a high-level workflow document that describes the triggers, functions, and outputs. We use a combination of git-backed repositories containing JSON files along with Confluence pages that embed documentation and link to functions.
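The exact shape of those JSON files isn't specified above, but a descriptor along these hypothetical lines captures the trigger/function/output relationships the text describes. The field names and the small loader are assumptions for illustration only.

```python
import json

# Hypothetical example of a per-function descriptor kept in a git repo:
# it records what triggers the function and what the function produces.
WORKFLOW_DOC = """
{
  "function": "nightly-demo-reset",
  "trigger": {"type": "cloudwatch-event", "schedule": "cron(0 6 * * ? *)"},
  "inputs": [],
  "outputs": ["reset summary posted to Slack"]
}
"""

def load_descriptor(raw):
    """Parse a descriptor and check the fields a reader would rely on."""
    doc = json.loads(raw)
    for key in ("function", "trigger", "outputs"):
        if key not in doc:
            raise ValueError(f"descriptor missing field: {key}")
    return doc
```

Keeping these files in git means the documentation is versioned alongside the functions it describes, and a CI check can validate every descriptor on each commit.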
Today, some of our functions can be called via Slack commands that are triggered via an API Gateway. One, for example, is used to construct new GitHub repositories that have the correct naming standards, contributor file, code of conduct, labels (tags), and license file. An authorized user simply uses the slash command to answer a few questions (without any authority in GitHub) and a new, compliant repository is generated and handed over. This offers the ability to:
- Hand off complicated or security-sensitive workflows to authorized users in a ChatOps style.
- Collaborate on a messaging platform where the team can see what is being done in real time and offer input, troubleshooting, or guidance.
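A Lambda handler behind an API Gateway for a Slack slash command might look roughly like the sketch below. The user IDs, command text, and repository logic are all placeholders; Slack delivers slash-command payloads as a form-encoded body, which `parse_qs` unpacks.

```python
from urllib.parse import parse_qs

ALLOWED_USERS = {"U123ABC"}  # illustrative Slack user IDs

def lambda_handler(event, context):
    # API Gateway passes the Slack slash-command payload through as a
    # form-encoded body; parse_qs returns a dict of lists.
    params = parse_qs(event.get("body", ""))
    user = params.get("user_id", [""])[0]
    repo_name = params.get("text", [""])[0].strip()

    if user not in ALLOWED_USERS:
        return {"statusCode": 200, "body": "Sorry, you are not authorized."}
    if not repo_name:
        return {"statusCode": 200, "body": "Usage: /newrepo <name>"}

    # Real code would call the GitHub API here to create the repository
    # with the standard naming, contributor file, labels, and license.
    return {"statusCode": 200, "body": f"Creating repository {repo_name}..."}
```

Note that the authorization check lives in the function, so the person typing the command never needs GitHub credentials of their own, which is the point of the handoff described above.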
The time for serverless is now
If you’re on the fence about diving into serverless technologies, now is the time to make your move. Getting your hands on the workflows offered by serverless options such as AWS Lambda will put you ahead of the pack in the world of IT operations, and it’s actually quite fun to learn, configure, and operate.