While most of you were out toasting the Sunday sunrise with a glass of Cristal, after a marathon night of ballroom dancing and karaoke, HfS analyst Brian Robinson was busy poring over the issues surrounding the now-infamous Amazon Cloud outage. So what better than the 1,301st perspective you must be eagerly craving... over to you, Mr Robinson:
Amazon’s Cloud outage created significant buzz in the markets this weekend: Google’s news archive showed that by midday on Friday nearly 500 news sources were reporting on the topic, with over 1,300 articles written. I was surprised to see how little coverage some other world events received, many of which are arguably more important. And then I saw a true media circus: the upcoming Royal Wedding.
A quick review of Google stats shows that over 11,000 articles have been written by between 600 and 1,400 sources. It’s not hard to imagine why the Prince has asked for a two-year hiatus from the intense pressure and scrutiny after the wedding. I bet Amazon wishes it had a similar ‘get out of jail free’ card to use this weekend. But as a public company, it doesn’t. Nor has it yet earned a seat among the world’s largest service providers, a status that has its privileges.
Going forward, Amazon can expect a similar spotlight to highlight both its successes and failures. Why? The Cloud offering is a disruption to the tried and trusted IT and BP services model. Much is at stake and, whether it likes it or not, Amazon is one of a few companies that now sit at the fulcrum. I am certain Amazon will improve its response to, and management of, similar situations should they recur. Just as important, I hope the other participants in the sourcing ecosystem (i.e., buyers, research groups and consultants, and providers) also gain some much-needed perspective:
Buyers: Know your customers’ availability requirements inside and out
Not that Mark Zuckerberg and Facebook need any more airtime, but Mark gets his customers’ uptime requirements. As his character puts it in The Social Network, “…the difference between Facebook and everyone else, we don't crash, ever! If those servers are down for even a day, our entire reputation is irreversibly destroyed!” Mark understands the value his customers put on availability. Do you? Moreover, do you know what you have contracted for and how you are compensated if a failure occurs? If not, it may be worth investing some resources in finding out. Granted, Facebook has a single platform to worry about. Large enterprises run poorly integrated, multi-vendor, multi-technology beasts of systems, so there are more points of failure. Determining exactly who was responsible for an outage could take longer than measuring its impact on customers.
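To make "know what you have contracted for" concrete, here is a minimal sketch of the arithmetic behind an availability target. The function names and the figures in the usage note are illustrative assumptions, not the terms of Amazon's or any other provider's actual agreement.

```python
from datetime import timedelta


def downtime_budget(availability_pct: float, period: timedelta) -> timedelta:
    """Maximum downtime permitted in `period` at a given availability target.

    Hypothetical helper for illustration; real contracts define their own
    measurement windows, exclusions and rounding rules.
    """
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    return period * (1 - availability_pct / 100)


def achieved_availability(outage: timedelta, period: timedelta) -> float:
    """Availability (in percent) actually delivered, given total outage time."""
    if outage > period:
        raise ValueError("outage cannot exceed the measurement period")
    return 100 * (1 - outage / period)
```

For example, `downtime_budget(99.0, timedelta(days=100))` comes to about one day, and a single day-long outage in a 365-day year leaves `achieved_availability` at roughly 99.73 percent. Running the same numbers against your own contract's target and measurement window shows quickly whether an outage breached it.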
Research analysts and consultants: provide pragmatic and objective advice
Traditional media outlets are responsible for reporting on the issues that may impact both customers and businesses. As respected and experienced industry analysts, our responsibility is to further focus discussions on the facts without feeding the media frenzy.
Reviewing the articles written immediately after the incident, I noticed a couple of comments from respected research houses that rained on the future prospects of Cloud computing. Are these houses becoming as conservative as the IT organizations they influence?
Other, more pragmatic analysts shared views similar to my own. For example, Vinnie Mirchandani makes it clear that all service providers have delivery issues but few receive the same media scrutiny. He goes on to propose that all public providers should face the same level of transparency. Although his proposal may not be practical, his point is clear: Amazon has been singled out, not for doing anything abnormal in the industry, but for bringing to market a disruptive service that some look to discredit.
Bottom line, big enterprises do not call the media when their own, mostly ancient, data centers take the shape of pears. We are quite certain this happens more frequently than it does in outsourced data centers of both the “traditional” and “new” variety. Big data center providers are at greater risk, which is why their data centers tend to be more reliable, but they have downtime too. In fact, traditional providers are just more adept at handling it because, well, it happens more often than any of us ever get to hear about.
Providers (really the new guys/gals on the block): experience counts in the lion’s den
The world’s largest service providers have many more years of experience than you do: some have 20, and others are approaching 30. Yes, your technology is disruptive, but that does not mean you need to trudge through the same pitfalls that your predecessors have already navigated.
The best organizations will reflect on what the industry veterans have already learned and incorporate it into their service delivery plans. This includes, for example, robust disaster recovery plans and proper communications plans. Today’s largest providers have perfected how to be profitable while managing their clients’ expectations and guarding against future failures. They did not learn these lessons overnight, and today’s new companies will benefit from hiring managers who bring that experience with them.
And finally here's even more perspective—the questions we should really be asking:
- What is the relative reliability and security of the “new” Cloud offerings from Amazon? Are these offerings objectively a greater risk? Bring us data—not innuendo!
- How much money was lost as a result of this outage? We know the damage done to the Amazon brand, but what about its customers' brands?