Linked by David Adams on Thu 1st Mar 2012 22:53 UTC, submitted by judgen
The outage on Microsoft's Windows Azure cloud computing platform that caused the government's G-Cloud service to go offline was the result of a calculation error caused by the extra day in February due to the leap year. Writing on the Azure blog, the firm's corporate vice president for service and cloud, Bill Laing, said that while the firm had still to fully determine the cause of the issue, the extra day in the month appeared the most likely cause.
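For readers wondering how an extra day can break a calculation at all, here is a minimal Python sketch of one common leap-day pitfall. It is only an illustration: the summary above does not say which calculation failed, and the function names (`naive_one_year_later`, `safe_one_year_later`) and the fall-back-to-28-February policy are assumptions made for this example.

```python
from datetime import date


def naive_one_year_later(d: date) -> date:
    # Hypothetical naive approach: bump the year, keep month and day.
    # Raises ValueError when d is 29 February and the next year is not a leap year.
    return d.replace(year=d.year + 1)


def safe_one_year_later(d: date) -> date:
    # Same idea, but fall back to 28 February when the target date does not exist.
    try:
        return d.replace(year=d.year + 1)
    except ValueError:
        return d.replace(year=d.year + 1, day=28)


leap_day = date(2012, 2, 29)

try:
    naive_one_year_later(leap_day)
    naive_failed = False
except ValueError:
    # There is no 29 February 2013, so the naive version blows up.
    naive_failed = True

print(naive_failed)                   # True
print(safe_one_year_later(leap_day))  # 2013-02-28
```

The naive version works on 365 days out of every 366, which is exactly why this sort of bug tends to slip through testing.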
Thread beginning with comment 509219
Wtf? Really?
by Soulbender on Fri 2nd Mar 2012 03:28 UTC
Soulbender Member since:
2005-08-18

The reason is that they didn't think about leap years? In 2012, this is the error they made? It's not like it's some unexpected event we didn't see coming.
You know, I would have found this acceptable in someone's pet OSS project but not in a global service from MS that you probably pay an arm and a leg for.
If I was the guy who was responsible for this in "the government" I would have been having a serious talk with my account rep already and it would not have been easy for them to convince me to continue using their product.

Reply Score: 4

RE: Wtf? Really?
by Laurence on Fri 2nd Mar 2012 09:14 in reply to "Wtf? Really?"
Laurence Member since:
2007-03-26

The reason is that they didn't think about leap years? In 2012, this is the error they made? It's not like it's some unexpected event we didn't see coming.
You know, I would have found this acceptable in someone's pet OSS project but not in a global service from MS that you probably pay an arm and a leg for.
If I was the guy who was responsible for this in "the government" I would have been having a serious talk with my account rep already and it would not have been easy for them to convince me to continue using their product.


Agreed, but sadly the British government likes expensive and often vastly over-priced contracts with Microsoft, IBM and Oracle simply because they take liability away from the government.

If MS fsck up and take a government service offline, then IT managers within the government just say "not our fault, it's one of our service providers". For the government, contracts like this are just another form of outsourcing and thus it would take something monumental and hugely publicly embarrassing before any government body would even consider switching providers - let alone bring the services back in-house where they really belong.

This is just my experience from when I worked for the British government. Things might be different for the rest of the EU or western world (for their sake, I hope so).

Edited 2012-03-02 09:15 UTC

Reply Parent Score: 3

RE[2]: Wtf? Really?
by lucas_maximus on Fri 2nd Mar 2012 11:22 in reply to "RE: Wtf? Really?"
lucas_maximus Member since:
2009-08-18

Not only Governments but also quite a lot of organisations (I worked in a large charity for 15 months and this was rampant). The higher up you get the more you gotta watch your own backside.

Reply Parent Score: 2

RE[2]: Wtf? Really?
by zima on Thu 8th Mar 2012 23:53 in reply to "RE: Wtf? Really?"
zima Member since:
2005-07-06

sadly the British government likes expensive and often vastly over-priced contracts with Microsoft, IBM and Oracle simply because they take liability away from the government.
If MS fsck up and take a government service offline, then IT managers within the government just say "not our fault, it's one of our service providers"

Everybody likes to outsource responsibility. Certainly in some Central European places one can see a strong "nobody got fired for using Microsoft or Oracle" attitude of sorts...

...and even when the projects, waaaaay down the line, largely prove to be practical failures - those who initially pushed and implemented them have moved on, several times already, each time adding another "success" to their CV - and the more expensive, the more lucrative such "successes" are, the better they look on the CV, it seems.

Reply Parent Score: 2

RE: Wtf? Really?
by B. Janssen on Fri 2nd Mar 2012 09:58 in reply to "Wtf? Really?"
B. Janssen Member since:
2006-10-11

The reason is that they didn't think about leap years? In 2012, this is the error they made? It's not like it's some unexpected event we didn't see coming.
You know, I would have found this acceptable in someone's pet OSS project but not in a global service from MS that you probably pay an arm and a leg for.

Agreed, that's just embarrassing. But...

If I was the guy who was responsible for this in "the government" I would have been having a serious talk with my account rep already and it would not have been easy for them to convince me to continue using their product.

...you would only complain and try to get some monetary recognition out of it, but you wouldn't quit using the service. And you know why. This is not just picking up your ball and going, it's picking up the goal posts, the fences, the benches, the lawn and the parking lot, too. I don't claim to know how large the gov's data is on Azure, but I'm sure it is somewhere in the region where you don't move on a whim.

And on top of that 1 day in 366 is probably well within agreed outage levels (I'd guess they have 99.9%, so they would be covered.)

Reply Parent Score: 2

RE[2]: Wtf? Really?
by Soulbender on Fri 2nd Mar 2012 10:29 in reply to "RE: Wtf? Really?"
Soulbender Member since:
2005-08-18

This is not just picking up your ball and going, it's picking up the goal posts, the fences, the benches, the lawn and the parking lot, too.


In the short run you're probably right but the contract will be renegotiated at some point and I would make damn sure there was a viable alternative at that point. Of course, I would probably not have bought into Azure in the first place so it's a bit moot.

And on top of that 1 day in 366 is probably well within agreed outage levels


Could be, but on the other hand, isn't the cloud all about NOT having these kinds of problems? You know, scalability, redundancy and all that jazz that the sales rep probably fed the gov't.

Reply Parent Score: 2

RE[2]: Wtf? Really?
by cdude on Fri 2nd Mar 2012 14:28 in reply to "RE: Wtf? Really?"
cdude Member since:
2008-09-21

"And on top of that 1 day in 366 is probably well within agreed outage levels (I'd guess they have 99.9%, so they would be covered.)"

Let me show you some magic:
100-1/366*100 => 99.73%
99.73>=99.9 => false
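
The same check as a short Python sketch, for anyone who wants to plug in other numbers. The full day of downtime, the 366-day measurement window and the 99.9% target are the guesses made in this thread, not figures from any actual contract, and the function names are made up for this example.

```python
def availability_percent(downtime_days: float, period_days: float) -> float:
    """Percentage of the period during which the service was up."""
    return 100.0 * (1.0 - downtime_days / period_days)


def allowed_downtime_hours(sla_percent: float, period_days: float) -> float:
    """Downtime budget, in hours, that a given availability target leaves."""
    return (1.0 - sla_percent / 100.0) * period_days * 24.0


# Assumed figures from the thread: one full day down in a 366-day year.
achieved = availability_percent(1, 366)

print(f"{achieved:.2f}%")                                # 99.73%
print(achieved >= 99.9)                                  # False
print(f"{allowed_downtime_hours(99.9, 366):.1f} hours")  # 8.8 hours
```

So a 99.9% target over a leap year would leave a budget of under nine hours, nowhere near a whole day.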

Reply Parent Score: 2