Serverless computing – What is the future for Dev and Ops teams?
Assuming you’ve been paying attention for the last 15 years or so, serverless is just the latest movement in the ongoing Ops switch from tactics to strategy. In this article, Dominic Wellington talks about the real danger to Ops from serverless, the potential downsides of serverless computing and more.
Serverless computing, much like the other variations on the theme of cloud computing from which it inherits, boils down to “it’s someone else’s computer.” In the case of serverless, it’s a little bit more complicated than that. This is where the last vestiges of superficial familiarity with traditional models of IT finally fall away, forcing anyone who is still treating cloud computing as just more of the same old thing to confront the truth.
People can and indeed still do treat an AMI much like a VM, which in turn they manage in more or less the same way they have always managed their physical compute infrastructure. Of course, this misses the point of both virtualisation and cloud computing almost completely, but it does not fail immediately. The problems only become apparent over time, when the predicted improvements in capacity and utilisation rates from virtualising mysteriously fail to materialise. Even worse, a significant proportion of available capacity ends up consumed by zombie VMs which nobody can explain or justify, but which everyone is afraid to shut down in case they turn out to be important.
The reason this happened is that – surprise! – Ops is hard. The idea of serverless computing is to get rid of the day-to-day transactional Ops tasks, letting Dev roll out code much faster, and leaving the infrastructure mostly to manage itself. Instead of trying to “do the DevOps” by having an army of Ops Morlocks toiling away behind the scenes to support the Dev Eloi, with serverless there is no wizard behind the curtain. It really is automated machinery back there, and this frees up developers to get on with building whatever they are building.
Dev teams have mostly taken to the change with enthusiasm. Anything that takes friction out of the deployment process is good, and if it does not require developers to pick up a metaphorical pager, so much the better. That’s not to say that there are no problems with serverless, of course; no chance of that, in this fallen world we live in. After all, if cloud is somebody else’s computer, serverless means that your code is now dependent on someone else’s code, running on someone else’s computer. This sort of thing is great when it works, but the assumption that remote services can and will fail has not yet been fully internalised.
For an example of the sorts of dependencies which are being introduced, look no further than the left-pad debacle in 2016. In case you have managed to erase the relevant memories, left-pad was an 11-line module in NPM which implemented basic string padding. For whatever reason, tons of projects included this as a dependency, and when the developer pulled all of their modules, including left-pad, utter chaos ensued.
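To give a sense of just how trivial the dependency was, here is a rough Python equivalent (the real left-pad was JavaScript; this is an illustrative sketch, not the original code):

```python
def left_pad(s, length, ch=" "):
    """Pad s on the left with ch until it is at least `length` characters."""
    s = str(s)
    pad = max(length - len(s), 0)  # never pad negatively
    return (str(ch) * pad) + s

print(left_pad("7", 3, "0"))   # "007"
print(left_pad("abc", 2))      # already long enough: "abc"
```

Python already ships `str.rjust` and `str.zfill` for exactly this; code this small is usually better inlined or taken from the standard library than pulled in as yet another remote dependency.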
How is serverless relevant to Ops?
So much for the Dev side of serverless – but I’m an Ops guy at heart; I used to be a sysadmin, and even though I’ve drifted a long way from the light with my strategic architectural role at Moogsoft, I still mostly think that way, and I spend a lot of my time with Ops people. Here’s the thing: many Ops people miss the point of serverless because the consumption model of the applications is the same, and they run on top of familiar infrastructure – so what’s the point, exactly? Sure, the developers are all very excited, but how is it relevant to Ops?
Some Ops types even feel threatened: “My job is looking after the servers, and now you’re talking of getting rid of them!” This is the same category error that comes from forklifting physical servers first into VMware and then into the cloud without changing anything in your thinking. If you define your job as putting your hands to the keyboard any time someone wants to get anything done involving IT – which these days means pretty much everything – then yes, that job is going away. Serverless may or may not be the final nail in the coffin, but the lid is already firmly on.
Assuming you’ve been paying attention for the last 15 years or so, serverless is just the latest movement in the ongoing Ops switch from tactics to strategy. Instead of getting actively involved in delivering each and every request, Ops defines the capabilities and parameters of available infrastructure and then hands over both delivery and day-to-day running to automation.
This may sound like NoOps, but rather than kick that particular ant hill, I’d rather go back to the old distinction between operators (basically, tape jockeys) and actual system administrators. If you’re getting involved in day-to-day stuff, you’re not sysadminning right. A proper sysadmin is taking a nap, feet propped up on a decommissioned server, secure in the knowledge that everything is working just fine – because otherwise, something would have told them already.
Microsoft’s own pitch for serverless is this: “What if you could spend all your time building and deploying great apps, and none of your time managing servers?” This doesn’t mean servers aren’t being managed, just that they aren’t managed by hand. No developers or users are aware of or concerned with details of infrastructure, which is as it should be. Utility computing means that compute infrastructure is about as interesting as the electrical grid to outsiders. Sure, it’s vital and we’d all have a bad day if it broke, but unless maintaining it is your actual job, you just plug in and don’t give it a second thought – and the people maintaining it certainly aren’t spending their days driving around the countryside, wiring transformers by hand just because.
So far, so good – but what are the potential downsides of serverless computing?
First of all, because it’s so antithetical to how (some) Ops teams view themselves and their place in the world, there’s a good chance that it’s happening as part of that growing proportion of IT spending that is happening outside the IT budget. Yes, it’s that infamous shadow IT once again. I spend a lot of time with Ops teams, and if you ask them about serverless, often they scoff and imply that while somebody in a lab might be playing with that, nothing serious is happening. Then the very next week you see a press release profiling how the very same company is launching huge new business-critical services on AWS Lambda or whatever. This sort of thing does not happen just once, but repeatedly. The real danger to Ops from serverless is not that it makes them irrelevant, but that they make themselves irrelevant by ignoring it.
Ops can make a real contribution by helping strategise how these new approaches can be adopted safely. After all, there are significant conceptual overheads to serverless architectures for both Dev and Ops.
For a start, it’s good practice to test your stuff before it goes live in production. (Pause for howls of hysterical laughter to die down.) (Longer pause.) (Okay, I think we’re good now.) Even if you have good test coverage for your own code, how do you deal with something like left-pad? Or say Google deprecates something you were relying on, with their usual levels of inscrutability. How do you account for that in your testing?
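One pragmatic answer is to simulate the failure in your test suite instead of waiting for it to happen in production. A minimal sketch, assuming nothing beyond Python’s standard library – the client, service and fallback here are all hypothetical:

```python
from unittest.mock import Mock

def fetch_greeting(client):
    """Call a remote service, falling back to a cached value if it is gone."""
    try:
        return client.get_greeting()
    except ConnectionError:
        return "hello (cached fallback)"

# Simulate the dependency being retired, rather than hoping it breaks in CI:
broken = Mock()
broken.get_greeting.side_effect = ConnectionError("service deprecated")
print(fetch_greeting(broken))  # exercises the fallback path
```

The point is not the mock library; it is that the outage is a first-class test case, written down before the provider makes it real.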
How about pricing? Google talks about going “from prototype to production to planet-scale.” This is great for Dev at the start of building a new thing, because you can get a project off the ground for pennies. But what happens if you get slashdotted? (Yes, showing my age there.) A design decision which could go either way on a whiteboard might have significant business impacts once it hits the real world. If capacity can go from zero to unbounded, so can your spending. A good Ops team that is no longer running around re-imaging servers should be thinking about that sort of thing.
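The arithmetic behind that whiteboard decision is easy to sketch. The rates below are illustrative placeholders, not any provider’s actual pricing, but the shape of the bill – a per-request charge plus memory-seconds of compute – is the common serverless billing model:

```python
# Back-of-the-envelope serverless cost model; rates are assumed, not quoted.
PRICE_PER_MILLION_REQUESTS = 0.20   # dollars, illustrative
PRICE_PER_GB_SECOND = 0.0000167    # dollars, illustrative

def monthly_cost(requests, avg_duration_s, memory_gb):
    """Estimate a month's bill from request count, duration and memory."""
    gb_seconds = requests * avg_duration_s * memory_gb
    return (requests / 1_000_000 * PRICE_PER_MILLION_REQUESTS
            + gb_seconds * PRICE_PER_GB_SECOND)

# A prototype: 100k requests/month, 200 ms each, 128 MB of memory.
print(monthly_cost(100_000, 0.2, 0.128))      # pennies
# The same code after a traffic spike: 500M requests/month.
print(monthly_cost(500_000_000, 0.2, 0.128))  # hundreds of dollars
```

The prototype costs next to nothing; the identical code under a spike costs real money. Nothing in the platform stops that – only an Ops team thinking about limits and budgets does.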
Just because they don’t have to reinstall operating systems doesn’t mean the Ops team is idle, though. Sysadmins can only take those restorative naps if they know for sure that nothing bad is happening to the systems under their care. This once meant that there were monitoring agents running on servers, and that failure conditions had been carefully defined up front: IF this_happens AND that_happens THEN wake_up_sysadmin.
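That old-style rule really is expressible in a few lines; the metric names and thresholds here are invented purely for illustration:

```python
# A toy version of the classic hand-written alerting rule: deterministic
# conditions defined up front, evaluated against per-host metrics.
def should_wake_sysadmin(metrics):
    this_happens = metrics["cpu_load"] > 0.9
    that_happens = metrics["disk_free_gb"] < 5
    return this_happens and that_happens

print(should_wake_sysadmin({"cpu_load": 0.95, "disk_free_gb": 2}))   # True
print(should_wake_sysadmin({"cpu_load": 0.95, "disk_free_gb": 50}))  # False
```

The rule only works if someone can enumerate the failure conditions in advance, and if the metrics are tied to specific machines – exactly the assumptions that serverless breaks.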
These days, platforms are self-instrumented and largely self-propelled. The trick is figuring out what is important to know about and react to, and what can be safely ignored. The complexity of modern IT crossed the threshold where those sorts of deterministic approaches stopped being useful some time ago, but serverless really rubs it in. With services almost entirely divorced from specific bits of infrastructure, old ways of keeping track no longer work.
If Ops does not engage with the Dev teams – who, remember, are already doing serverless, somewhere and somehow, whatever Ops thinks – the result will simply be that Dev will interpret Ops as damage and route around it. Then when something inevitably breaks, it will still be the pagers of Ops teams that go off and interrupt their slumber.
Download the newest JAX Mag issue for free – Serverless vs. containers relationship status: It’s complicated!
Although a lot of people frame the containers-versus-serverless “brawl” as a fight to the death, the two are better partners than rivals: they work best when they work together. The current JAX Magazine issue is all about serverless computing and containers – how they compete with one another and how they complement each other. Experts discuss use cases for both technologies, and whether you should adopt the trend or stick with what’s been working.