I had the pleasure earlier this month of attending a demo day at HP’s Cupertino campus to commemorate the tenth anniversary of the Superdome server, see what’s new in the high-end server market, and learn about what’s going on with HP-UX.
Hardware is really a secondary interest to OSNews, but in this case the hardware is pretty impressive (and it had better be, considering that you can easily spend several million dollars on one of these). What I found most interesting was how much effort HP had put into making the hardware platform upgradable and modular. The Superdome was first released in 2000 with PA-RISC processors, and can be outfitted with as few as one “cell” (a cell is like an overgrown blade) containing 4 CPUs, and that same ten-year-old hardware could have been expanded up to its modern limit of 128 Itanium 2 “Montvale” EPIC processor cores and 2 TB of memory, in two adjacent cabinets with ancillary I/O cabinets.
In contrast to today’s common practice of enterprises relying on a large number of smaller, less-expensive servers that are periodically pulled out and replaced with new hardware, users of these large servers appreciate never having been required to do a “forklift upgrade” in ten years. After seeing photos of large servers being craned into the upper floors of office buildings through sections of removed wall, I can see why.
You can run a mix of older and newer processors, adding cells with newer, faster hardware without having to decommission the older stuff unless you’re running out of slots, as long as they’re run as separate partitions — and partitioning and virtualization are a big part of how people use these servers, as we’ll cover more in-depth later.
Now, on to software. HP invited several bloggers and “new media” types to the event (and paid our travel expenses), and one of the most memorable comments came from blogger (and long-time OSNews reader, I learned) Ben Rockwood, which I’ll have to paraphrase: “I was just surprised to hear that people are still using HP-UX.” If you read Ben’s blog, you’ll learn that he’s a Solaris guy and not a fan of HP-UX, but even so, I think it’s a reasonable reaction. We haven’t heard much about HP-UX in a while. But as Ben points out in his coverage of the event, HP’s decision ten years ago to focus on partitioning and virtualization, thereby letting a company centralize its various computing tasks onto one machine, really gave the Superdome more relevance than Sun’s high-end servers, which seemed more focused on raw-power bragging rights. Since that time, virtualization technology has become more sophisticated and gone downmarket, but HP’s early lead has helped it keep pace.
The Superdome actually supports two kinds of partitioning: “hard partitions,” in which individual cell boards are separated into electrically isolated partitions, and the more familiar virtual partitions we all know about these days. Virtual partitions give you more flexibility, and obviously, if you’re going to get anywhere near the 1000 VMs that the Superdome will support, the majority of them will need to be virtual ones. Since hard partitions support dedicated I/O, they’re best for applications where low I/O latency is a factor. The modular architecture lets you “scale out” with more virtual partitions to perform varied tasks, or “scale up” by adding new CPUs or cells when you need more power.
On the OS front, the Superdome can run HP-UX, Windows, SUSE Linux, Red Hat Linux, and OpenVMS, and can even run them all at once. Of the operating systems running on Superdomes, though, 80% are HP-UX.
In addition to the hardware platform and software tools that help a Superdome user augment the server’s power with new processors (you can add a new cell board without powering down the system) and dynamically monitor and re-allocate computing resources, HP has a really interesting service wherein you can have hardware installed in your Superdome that you won’t use, kept in reserve as a sort of insurance policy, for which you pay 30% of its full cost. If you have a spike in utilization, you can bring that extra capacity online instantly, and you only pay the full price while you’re using it. HP notes that in their experience, companies are running at 15% utilization of their computing resources on average, in large part in order to be prepared for spikes (that never come). It’s like utility computing, only instead of offloading it to the cloud, you just temporarily beef up your existing computer.
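As a back-of-the-envelope sketch of how that reserve-capacity pricing could play out: the 30% reservation figure comes from HP, but the cell price, the pro-rated daily rate, and the number of spike days below are purely hypothetical numbers for illustration.

```python
# Illustrative cost of a reserve-capacity ("insurance policy") cell.
# Only the 30% reservation fraction comes from the article; every other
# number here is a hypothetical assumption.
CELL_PRICE = 100_000           # hypothetical full price of a spare cell, USD
RESERVE_FRACTION = 0.30        # pay 30% up front to keep the cell idle on-site
DAILY_RATE = CELL_PRICE / 365  # hypothetical pro-rated full-price daily rate
spike_days = 20                # hypothetical days/year the capacity is used

reserve_cost = CELL_PRICE * RESERVE_FRACTION + DAILY_RATE * spike_days
buy_outright = CELL_PRICE

print(f"reserve model: ${reserve_cost:,.0f}")   # well under the full price
print(f"buy outright:  ${buy_outright:,.0f}")
```

Under these assumptions, the reserve model costs roughly a third of buying the cell outright, which is why it only makes sense for capacity you expect to need rarely.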
These big servers are expensive: dramatically more expensive per processor cycle than a bunch of racks of smaller servers would be. HP has a couple of answers to this. First is the utilization issue. If you’re getting 15% utilization out of your bank of smaller servers, but have to keep that headroom to avoid overloading some of your more important servers before you can shift resources around, then you may be better off with a more expensive server that balances your computing load well enough to run at 60% utilization while still giving you peace of mind. The second answer is that “companies are spending way too much money on operations and maintenance.”
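The utilization argument can be made concrete with a quick comparison of cost per *useful* unit of compute. The 15% and 60% utilization figures are the ones cited above; both price tags are hypothetical assumptions.

```python
# Cost per utilized unit of compute at the two utilization levels.
# The 15% and 60% figures come from the article; prices are hypothetical.
small_farm_cost = 500_000    # hypothetical: a rack farm of smaller servers, USD
big_server_cost = 2_000_000  # hypothetical: one large partitionable server, USD
small_util = 0.15            # average utilization cited for server farms
big_util = 0.60              # utilization achievable with dynamic partitioning

small_cost_per_useful = small_farm_cost / small_util
big_cost_per_useful = big_server_cost / big_util

print(f"small servers: ${small_cost_per_useful:,.0f} per utilized unit")
print(f"big server:    ${big_cost_per_useful:,.0f} per utilized unit")
```

At these assumed prices the hardware cost per utilized cycle comes out even; it is HP’s second argument, the operations and maintenance labor, that would tip the balance one way or the other.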
I was having lunch with an old colleague of mine the other day, who’s the CIO of a local company, and he was bemoaning the can-do hacker ethos of his staff. Normally we’d think of that as a good thing, but he mentioned that they were planning to hack together an appliance using open source software (software that needed a bit of customization to suit their needs). He pointed out that he could buy a commercial appliance that did what they needed for about $10,000, while what they were proposing would take about $50,000 in labor and six months to get up and running, plus who knows how much more in ongoing maintenance. His staff saw it as a way to advance the state of the art in open source software, which it probably was, but it was also a waste of money and effort that could have gone toward advancing the company’s true business goals.
That’s kind of the point HP was trying to make, I guess. You can save a lot of money by having 100 smaller servers do the job that one larger server can do, but unless you take into account all the labor costs of achieving that goal, you won’t be able to measure accurately which method is really cheaper. And don’t forget that you can’t just add up the salaries of the sysadmins doing the extra work; you also need to account for what they could be doing with their time if their operations and maintenance load were lightened. I’m not totally convinced that the economics of a Superdome-type server would work in most cases, but it’s food for thought. I’d appreciate hearing from OSNews readers about their experience with big servers vs. multiple smaller ones.
Not being a big hardware guy, and not being any kind of HP-UX expert, that about sums up what I took away from the Superdome Tech Day. But one of the nice things about HP inviting a bunch of bloggers to this event is that we can have a little Rashomon moment, and you can see the other participants’ takes on the event. Since each one of us came from a different branch of the tech media tree, it makes for an interesting comparison:
- Ben Rockwood @ Cuddletech comes from a Sun server administration background.
- David Douthitt @ Administratosphere covers HP-UX, OpenVMS, and Linux system administration.
- Andy McCaskey @ SDRnews captured the presentations on video.
- Shane Pitman @ Techvirtuoso covers enterprise technology.
- Saurabh Dubey @ Activewin covers Windows.