Linked by David Adams on Wed 16th Apr 2008 15:35 UTC, submitted by R_T_F_M
Yankee Group's second annual Server Operating System Reliability survey polled 700 users from 27 countries worldwide. The latest independent, non-sponsored Web-based survey revealed that all versions of UNIX -- which typically carry very high workloads -- are near bulletproof, achieving 99.999% reliability. IBM's AIX UNIX led all server operating systems for reliability with just over 30 minutes of per-server annual downtime, but Hewlett-Packard and Sun Microsystems also got high scores.
Interesting statistics
by kiz01 on Wed 16th Apr 2008 15:56 UTC
kiz01
Member since:
2005-07-06

Those are really interesting numbers. It was interesting to see that Suse got almost the same reliability as Red Hat, as I've always heard that Suse wasn't as stable (but faster). It also looks like Linux in general has really stabilized; it has much less downtime than a year ago.

The Windows numbers were also pretty interesting. I expected (with all of the hype that Windows is just as stable as Unix) that Windows would be at least close to the *nix OSes. Well, so much for hype.

The other interesting number was for HP-UX. It had a really low downtime number, but that was limited to version 11.1 which, as far as I know, does not support Itanium. It's unfortunate that the downtime for Itanium servers (HP-UX 11.23) wasn't mentioned. I know our 16-way Itanium servers have a lot more downtime than I would have thought (although I don't have any exact numbers). Anybody know the reliability of Itanium servers?

Reply Score: 3

RE: Interesting statistics
by gustl on Thu 17th Apr 2008 15:14 UTC in reply to "Interesting statistics"
gustl Member since:
2006-01-19

I think this survey is not representative.

Why should the numbers vary THAT much from one year to the next? There usually is not that much operating system change to justify such a large difference.

I think what we see here is statistical noise. A year has 8760 hours, and looking at values of 0 to 10 hours MUST give a noisy signal. For example, if you measure a current of 1000 A and scale your Y-axis from 999 to 1001, you will likely see "huge" spikes of perhaps 0.5 A. Then you would compare the "Microsoft" power line, which delivers 999.5 A, to the AIX power line, which provides 1000 A.

You simply HAVE to see statistical noise, and this is even more true for uptime. Run a survey with 10,000 responses; then we can start talking about reliability.
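
To make this concrete, here is a rough back-of-envelope simulation in Python (the exponential distribution and the 2-hour "true" mean are my own made-up assumptions, not anything from the survey):

import random

random.seed(42)
TRUE_MEAN_HOURS = 2.0  # assumed "real" average annual downtime

def survey_mean(n_respondents):
    """Average downtime reported by one simulated survey of n admins."""
    reports = [random.expovariate(1.0 / TRUE_MEAN_HOURS)
               for _ in range(n_respondents)]
    return sum(reports) / len(reports)

# Repeat each survey five times and watch the spread of the averages.
for n in (25, 100, 10000):
    runs = [survey_mean(n) for _ in range(5)]
    print(f"n={n:>5}: " + ", ".join(f"{m:.2f}h" for m in runs))

# With n=25 the "average downtime" easily swings by an hour or more
# between repeated surveys; with n=10000 it barely moves. 700 responses
# split across a dozen OS categories is much closer to the noisy end.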

Another flawed piece of art by Laura Didio. This time not pro-Microsoft, but pro-UNIX. She probably got invited to a nice hotel somewhere by IBM to get a list of customers to ask.

Analysts sell out without obviously seeming to do so; that is their business model.

Reply Score: 2

RE[2]: Interesting statistics
by gilboa on Fri 18th Apr 2008 04:08 UTC in reply to "RE: Interesting statistics"
gilboa Member since:
2005-07-06

I think this survey is not representative.


Surveys tend to be, err, inaccurate by design.

Why should the numbers vary THAT much from one year to the next? There usually is not that much operating system change to justify such a large difference.


Numbers -can- vary YoY.
Windows 2K3 might have had a -very- bad year: Microsoft possibly invested more resources in Win2K8 and Vista, taking them from the Win2K3 team; hackers might have figured out how Win2K3 works and started exploiting it; etc.
By itself, the variation in numbers between surveys doesn't necessarily nullify the survey.
As opposed to Laura Didio, that is...

I think what we see here is statistical noise. A year has 8760 hours, and looking at values of 0 to 10 hours MUST give a noisy signal. For example, if you measure a current of 1000 A and scale your Y-axis from 999 to 1001, you will likely see "huge" spikes of perhaps 0.5 A. Then you would compare the "Microsoft" power line, which delivers 999.5 A, to the AIX power line, which provides 1000 A.

You simply HAVE to see statistical noise, and this is even more true for uptime. Run a survey with 10,000 responses; then we can start talking about reliability.


While you might be statistically right, you're completely off the mark here.
I work in the five-9's world. Our software (which runs on a large number of servers - both RHEL and Win2K3) must not log more than ~6m of downtime per year. (Granted, I doubt that we are capable of achieving more than 4/9's - but that's something else...)

Look at it from my employer's perspective - 50% of our software solution runs on Windows 2K3. According to this survey, Windows 2K3 doesn't even come close to logging 3/9's, while RHEL logs close to 4/9's (~30m/y in our own experience - using a highly customized version of RHEL5).
Given the basic requirement for 5/9's and these numbers (statistical noise or not), should my employer risk his neck choosing Windows 2K3? Doubt it.
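
For reference, the nines arithmetic as a quick Python sketch (the hour figures are read off the survey graph plus our own rough logs - illustrative numbers, nothing official):

HOURS_PER_YEAR = 24 * 365  # 8760

def availability(downtime_hours):
    """Availability as a percentage, given annual downtime in hours."""
    return 100.0 * (1.0 - downtime_hours / HOURS_PER_YEAR)

samples = [
    ("Windows 2K3 (survey)", 8.9),
    ("RHEL (survey)", 1.73),
    ("RHEL5, our logs (~30m/y)", 0.5),
    ("five-9's budget", HOURS_PER_YEAR * (1 - 0.99999)),  # ~5.3 minutes
]
for label, hours in samples:
    print(f"{label:<26} {hours:6.2f} h/y -> {availability(hours):.4f}%")

# 8.9 h/y is ~99.90% (not even three 9's); 1.73 h/y is ~99.98%
# (approaching four 9's); even 30m/y is only ~99.994%. Five 9's
# leaves you a little over five minutes per year.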

Another flawed piece of art by Laura Didio. This time not pro-Microsoft, but pro-UNIX. She probably got invited to a nice hotel somewhere by IBM to get a list of customers to ask.

Analysts sell out without obviously seeming to do so; that is their business model.


I must agree.
As much as I like these numbers (and plan on using them to get my employer to port additional products to RHEL instead of Windows 2K3), Yankee Group's surveys have a -very- problematic history (TCO/Get-the-facts).

- Gilboa

Edited 2008-04-18 04:20 UTC

Reply Score: 2

What an achievement
by tomd on Wed 16th Apr 2008 15:57 UTC
tomd
Member since:
2006-10-16

Indeed, what an achievement for IBM. AIX, at 0.60 hours of downtime per year, is almost as good as Mandriva (and Turbolinux) at 0.38 hours per year ;)

Tom

Reply Score: 1

Using this measurement...
by theTSF on Wed 16th Apr 2008 16:07 UTC
theTSF
Member since:
2005-09-27

Downtime average per year may not always be a true test of the OS.

Windows/Linux tend to run on diverse hardware, some of it without hot-swap drives and the like, so hardware failures could account for some of the downtime - versus the big UNIX systems, which have hot-swap drives and failover systems built in at the hardware level.

User-friendliness also plays a role, apart from actual program reliability: if the system goes down, how easy is it for the expert to fix the problem?

Reply Score: 3

RE: Using this measurement...
by mdoverkil on Wed 16th Apr 2008 17:22 UTC in reply to "Using this measurement..."
mdoverkil Member since:
2005-09-30

I would also like to see some hardware statistics. How does downtime with Solaris compare on x86 vs SPARC? What were the most typical causes of downtime from the sample of operating systems?

Reply Score: 3

RE: Using this measurement...
by Ikshaar on Wed 16th Apr 2008 18:12 UTC in reply to "Using this measurement..."
Ikshaar Member since:
2005-07-14

Well, the downtime encompasses the fact that the server was down plus the time it took to bring it back up. So no user-friendliness excuse... which would be questionable anyhow - I find my system very user-friendly ;) If your IT guy does not know how to fix his server, change IT.

Reply Score: 1

RE: Using this measurement...
by lemur2 on Wed 16th Apr 2008 23:22 UTC in reply to "Using this measurement..."
lemur2 Member since:
2007-02-17

Downtime average per year may not always be a true test of the OS.

Windows/Linux tend to run on diverse hardware, some of it without hot-swap drives and the like, so hardware failures could account for some of the downtime - versus the big UNIX systems, which have hot-swap drives and failover systems built in at the hardware level.

User-friendliness also plays a role, apart from actual program reliability: if the system goes down, how easy is it for the expert to fix the problem?


Downtime average per year may or may not be a true test of the OS ... but it is a true test of average downtime per year.

If you are running a server, and you want it to be reliable, what you want to know about is ... downtime average per year.

Reply Score: 2

Where are BSDs?
by vermaden on Wed 16th Apr 2008 17:13 UTC
vermaden
Member since:
2006-11-18

That survey is a joke.

A lot of servers run FreeBSD because of its reliability and performance, yet the survey does not even mention the BSDs. For example, check Netcraft: http://uptime.netcraft.com/perf/reports/performance/Hosters?tn=marc...

Reply Score: 7

RE: Where are BSDs?
by Crono on Wed 16th Apr 2008 17:27 UTC in reply to "Where are BSDs?"
Crono Member since:
2006-11-08

So BSD isn't UNIX or UNIX-based anymore?

Reply Score: 3

RE[2]: Where are BSDs?
by vermaden on Wed 16th Apr 2008 18:26 UTC in reply to "RE: Where are BSDs?"
vermaden Member since:
2006-11-18

So BSD isn't UNIX or UNIX-based anymore?


BSD is UNIX; there are two major trees in UNIX history, SVR4 UNIX and BSD UNIX. But what does that have to do with the survey? The survey even mentions Ubuntu Linux.

All the BSDs (FreeBSD / NetBSD / OpenBSD) use the original 4.4BSD UNIX code as a base.

There are even books about BSD UNIX:

http://amazon.com/Design-Implementation-UNIX-Operating-System/dp/02...
http://amazon.com/BSD-UNIX-Toolbox-Commands-FreeBSD/dp/0470376031/
http://amazon.com/Absolute-OpenBSD-UNIX-Practical-Paranoid/dp/18864...
...

Why do people always ask questions about obvious things?

Reply Score: 2

RE[3]: Where are BSDs?
by Crono on Wed 16th Apr 2008 18:40 UTC in reply to "RE[2]: Where are BSDs?"
Crono Member since:
2006-11-08

What I was TRYING to say:
Why are you complaining that BSDs weren't mentioned???

The latest [...] survey revealed that all versions of UNIX [...]


BSD ⊆ "all versions of UNIX"

Reply Score: 2

RE[4]: Where are BSDs?
by vermaden on Wed 16th Apr 2008 19:43 UTC in reply to "RE[3]: Where are BSDs?"
vermaden Member since:
2006-11-18

Sorry my friend, it seems that I did not get your point; probably a std::misunderstanding.

Edited 2008-04-16 19:46 UTC

Reply Score: 2

Inherent weaknesses
by fretinator on Wed 16th Apr 2008 18:13 UTC
fretinator
Member since:
2005-07-06

Windows will always struggle in this regard due to a couple of inherent weaknesses:

1. File locking. Windows (last I heard) couldn't replace a file that was in use. Most reboots on Windows are due to this problem.

2. Restarting services - many times, Windows servers are rebooted because it is the easiest way to "get things working". Many Windows administrators are not aware of the means of restarting subsystems like the network, etc. The quickest way (and the safest, given problem #1 above) is to reboot the box. On Linux and the like, it is common to do things like /etc/init.d/network restart after a major change.

Reply Score: 3

oh no, not that B**** again
by karl on Wed 16th Apr 2008 19:25 UTC
karl
Member since:
2005-07-06

Laura DiDio is the person who created this report. She is also the one who lost all credibility when She attacked Pamela Jones of Groklaw. Why *anyone* still listens to Her, submits articles from Her, or even links to pages associated with Her, is a mystery to me. I went to the sink to scrub my hands after I clicked on the TFA without realizing She was the author. The content of this article is *absolutely* worthless due to its source.

Please stop contributing to Her by submitting articles. Please stop giving Her any kind of publicity or recognition.

Reply Score: 8

RE: oh no, not that B**** again
by chemical_scum on Thu 17th Apr 2008 03:20 UTC in reply to "oh no, not that B**** again"
chemical_scum Member since:
2005-11-02

Didiot can't even draw the correct conclusions from her own report. She says:

"Additionally, there is far less disparity now, in the number and severity of unplanned server outages and the time that businesses experience on their standard Linux, Windows and UNIX platforms, than at any time in recent memory."

From her graph at:

http://www.iaps.com/exc/yankee-group-2007-2008-server-reliability.p...

You can extract the following numbers:

1. Between 2006 and 2007, Windows 2003 downtime increased from 7.09 to 8.9 hours.

2. All Linux server OS downtimes decreased drastically over the same period.

3. RH downtime decreased from 7.14 to 1.73 hours.

4. SUSE downtime decreased from 4.06 to 1.08 hours.

5. Ubuntu, new in the survey, came in at 1.10 hours.

6. Windows 2000 Server downtime also increased over the same period, to 9.86 hours, achieving the distinction of being the most unreliable server OS in 2007 and beating Windows 2003 Server into second place.

Real conclusion: Windows servers now really stink compared to all Linux server distributions for reliability, and they are getting worse.
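
For what it's worth, the same arithmetic in a few lines of Python (downtime hours read off her graph; Windows 2000 is left out since no 2006 figure is listed above):

# Annual downtime in hours, (2006, 2007), read off Didio's graph:
downtime = {
    "Windows 2003": (7.09, 8.90),
    "Red Hat":      (7.14, 1.73),
    "SUSE":         (4.06, 1.08),
}

for os_name, (h2006, h2007) in downtime.items():
    change = 100.0 * (h2007 - h2006) / h2006
    print(f"{os_name:<13} {h2006:5.2f}h -> {h2007:5.2f}h ({change:+.0f}%)")

# Windows 2003 got ~26% worse while Red Hat and SUSE improved by ~76%
# and ~73% respectively; the 2007 gap between Windows 2003 and SUSE is
# roughly 8x. Hard to read that as "far less disparity".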

Edited 2008-04-17 03:37 UTC

Reply Score: 4

nice...
by 2501 on Wed 16th Apr 2008 20:18 UTC
2501
Member since:
2005-07-14

I think that is why my Slackware server never goes down!

Awesome...

-2501

Reply Score: 1

A waste of space
by TechniCookie on Wed 16th Apr 2008 20:52 UTC
TechniCookie
Member since:
2005-11-09

This survey is very weird. It provides no details on how it was done. How big is the uncertainty on these numbers? There are obvious omissions in the chosen 'unices', and some of the titles of the bars just don't make any sense. The descriptions are really weird, like:

'OpenSource Linux' (given the GPL I thought all Linux was open source)
'Linux from Suse'
'Linux with Suse with Customizations' (can I have some Linux with a bit of Suse and some Customizations on the side, please?)

And it gets even more vague with 'Other Linux' and 'Other Linux with Customizations'. This is meaningless. It conveys absolutely no information.

Reply Score: 2

Ubuntu
by stestagg on Wed 16th Apr 2008 21:10 UTC
stestagg
Member since:
2006-06-03

There's a lot of snobbish disdain for Ubuntu in the systems world but, if this report is to be trusted, Ubuntu comes out pretty well. Un-customized*, the only named OSes/distros that beat it are AIX and SUSE.

Given that Ubuntu is known for being used by less experienced users, these figures are really quite reassuring.

* It's not clear what 'Customized' means here. If they're talking about using custom kernels in production, then I'm assuming that 'they' know what they're doing enough to have stable systems.

Reply Score: 2

Admin reliability
by sb56637 on Thu 17th Apr 2008 04:36 UTC
sb56637
Member since:
2006-05-11

I would hazard a guess that a major factor in the stability statistics is the quality of the administrators. Windows admins are a dime a dozen; many of them are poorly trained on "click, click, DONE" GUI interfaces and often have no idea how the OS functions. On the other hand, UNIX admins, especially the ones for a high-end proprietary OS like AIX, are more scarce, and tend to be highly trained and very experienced.

Edited 2008-04-17 04:42 UTC

Reply Score: 4