Get by with a little help from ML

Predicting 2020 and beyond: Real time is out, predicting the future is in

Tim Armandpour

It’s a tradition for news and information sites to ask industry leaders for their New Year’s predictions. We approached veteran industry leader Tim Armandpour, SVP of engineering at PagerDuty and a former senior director of engineering at PayPal, for his. His prediction is that it will all be about predictions.

The past few weeks have seen publications of all types herald predictions for the new year, just as they do every holiday season. Writing from a DevOps perspective, here’s mine: Predictions will catapult our business to higher levels of excellence.

By that, I mean technology that can predict events, incidents and failures before they happen. Soon, real-time notifications will no longer be good enough. Our industry needs to move past real time because it isn’t fast enough for modern businesses to compete and survive. We need to go one step further – several steps, to be frank – to predict what’s coming before it happens. Doing this means we’ll also need to reassess the way we perceive and utilize automation, specifically machine learning (ML).

SEE ALSO: DevOps – The Troubles Of Automating All The Things

I’m reminded of the author James Gleick’s landmark book from 1999, Faster: The Acceleration of Just About Everything. To make a point about how demanding we’ve become about speed, he used the example of elevators: “A good waiting time is in the neighborhood of 15 seconds. Sometime around 40 seconds, people start to get visibly upset.” One obvious analogy is in the field of customer experience: Consumers are often impatient and move on to other offerings when a digital service or app is unresponsive or slow. Research has shown that 53% of users abandon an application after three seconds of latency.

Gleick’s book, which predicted the need for speed would only accelerate, was published before the introduction and mass adoption of smartphones, tablets and Netflix, not to mention cloud technologies, big data, and ML. Think about how much faster everything is now than it was in 1999 – and how much faster it will get in three, five or 10 years.

When it comes to doing business today, lack of speed kills. Today’s organizations are fixated on the reliability of their technology. But any developer can tell you that the real question is not if it will fail, but when. To meet the need for speed, the success metric for leading companies will shift to resilience – how quickly you can recover from failure – to the point that if I’m your customer, I never actually notice it.

In a very real sense, predicting IT events before they happen is like forecasting the weather. Like meteorologists, engineering teams look for signals and patterns in past events to determine, for example, that there’s an X-percent chance of a Category 3 hurricane strengthening into a Category 5 before it makes landfall.

In our industry, we’re in the middle of a major shift from incident management to incident prevention because we also now have large sets of accurate data that can detect signals, emerging patterns and context to reveal high degrees of probability. That type of data can be leveraged to predict the future and prevent incidents and failures before they happen. And that will prove to be the differentiator for leading companies going forward.

The first step is understanding the context of past scenarios – what transpired, what worked, what didn’t work and bringing it all together to help teams determine the next course of action – not unlike meteorologists archiving historical data on past heat waves and storms. The next step will be fine-tuning monitoring capabilities.
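As a toy illustration of that first step – and purely a sketch of my own, not any particular vendor’s method, with all signal names hypothetical – “learning from past scenarios” can be as simple as estimating, for each monitoring signal, how often it has historically preceded an incident, then scoring a live alert window against those rates:

```python
from collections import defaultdict

def learn_signal_risk(history):
    """Estimate P(incident | signal) from labeled historical windows.

    history: list of (signals, incident_followed) pairs, where `signals`
    is the set of alert names seen in a time window and
    `incident_followed` says whether an incident followed that window.
    """
    seen = defaultdict(int)   # how often each signal appeared
    bad = defaultdict(int)    # how often it preceded an incident
    for signals, incident in history:
        for s in signals:
            seen[s] += 1
            if incident:
                bad[s] += 1
    return {s: bad[s] / seen[s] for s in seen}

def score_window(risk, signals):
    """Naive risk score for a live window: the worst single-signal rate."""
    return max((risk.get(s, 0.0) for s in signals), default=0.0)

# Hypothetical archived windows, like a meteorologist's storm records.
history = [
    ({"cpu_spike", "queue_depth_high"}, True),
    ({"cpu_spike"}, False),
    ({"queue_depth_high", "disk_latency"}, True),
    ({"disk_latency"}, False),
]
risk = learn_signal_risk(history)
print(score_window(risk, {"queue_depth_high"}))  # 1.0: always preceded an incident
```

A production system would obviously use richer features, time-series context, and a real ML model rather than per-signal frequencies, but the shape is the same: archive what transpired, learn which patterns preceded failure, and score what is happening now.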

What’s needed for the type of predicting I’m predicting is a little help from ML.

Having said that, I do worry about our industry being too enamored with the possibilities of ML, because when it comes to technology and solving problems in general, ML is not entirely objective. A high degree of subjectivity exists because, after all, humans develop the software and write the specs that power it. And anytime you hand decisions to algorithms, the cost of being wrong can be precipitously high.

SEE ALSO: The DevOps to AIOps journey and its future

So, the next year – and the next decade – will require us to reposition ML as an ally, not a replacement. This is a technology that teams should lean on to help cut down on inefficiencies, not completely depend on. For ML to be successful, we’ll need to find the sweet spot of putting human power and human thinking first, then adding just the right amount of ML to make things more efficient.

The most important thing to remember is that DevOps – and ITOps, SecOps and Network Ops – has a strong human component. At their core, they all require a creative process. They will always be about people. You can’t just automate operations out of the way.

On the other hand, when it comes to predicting the future in the future, I’ll put my money on ML helping humans to do the best, most accurate, job.


Tim Armandpour

Armed with a 20-year history of maximizing value to users through technological innovation, Tim Armandpour is PagerDuty’s SVP of Engineering. Prior to joining PagerDuty, Tim led product management and engineering teams at Yapstone, a global payment solutions provider. He also led the global engineering teams at PayPal that delivered new product experiences across tablets, mobile devices, and the web, and served as Vice President of Engineering at Zong prior to its acquisition by PayPal. Tim began his career in 1999 as a lead engineer at Yodlee. He is a graduate of the University of California, San Diego, where he received his B.A. in Computer Science, and holds four U.S. patents.
