When Sysadmins Ruled The Earth – A short story by Cory Doctorow
Media organizations that are traditionally a day late and a dollar short, like InfoWeek, have been talking again about whether you should keep your organization’s IT infrastructure in the cloud or on local servers.
The analogy that’s used in that article is that many industrial businesses used to generate their own power, and used to keep engineers on staff to support that power generation capability. These days, it’s more efficient for them to tap into the grid, because electrical power companies that can operate at scale will generate power at more economical prices. It’s not a bad analogy.
What was left out of the analogy is all of the other things that came along with that physical plant that benefitted an industrial campus. Most of these plants were steam plants in some form or another; they burned something or used waste heat from somewhere to make steam, which is an incredibly efficient way to transmit massive amounts of motive force over short distances. You can easily tap the line for a steam plant to heat office space and break rooms. Steam plants generally require heat exchangers and cooling towers, which can be rigged to provide chilled water even in the summer, providing a way to cool offices so that executives can wear suits in the summertime.
For most small businesses, the target market for software like Microsoft Small Business Server, in-house infrastructure doesn’t make sense as long as you have a sufficiently robust and reliable internet connection. (This generally means that you need two or more internet connections and an edge device that can fail over between them, because even a Metro-E connection provided by Time Warner only comes with an SLA that isn’t worth the paper it’s written on.) If you lose your internet connection during the day because of a fiber cut down the street, or because your latency spikes through the roof, you may as well send everyone home.
No, you probably can’t keep running off of the WRT54G that’s been in the “closet with all the wires” for years and a “business DSL” line from your local telecom monopoly, because they won’t even bother sending a tech out unless your line is dropping more than 25% of its traffic.
On top of that, you still need a techie of some sort around to diagnose that and to speak a language the telecom monopoly can understand. This function is already filled pretty competently for small to mid-sized businesses by numerous managed service providers.
For businesses that do software development in-house, the cloud might make sense. Developer machines are typically powerful enough to run Vagrant or another virtualization solution so that developers can test locally. If they aren’t, replicating your production environment for your QA process can get moderately expensive if you’ve got a decently sized development team and you do continuous integration internally, and that kind of bursty, on-demand workload is exactly what hourly cloud instances are good at.
For businesses that do any significant amount of data processing, the cloud probably doesn’t make any sense. You simply require too much compute. Let’s take a look at some numbers. First, instances are expensive. A node that matches the production environment I typically specify (m3.xlarge: 4 cores, 7.5 GB of RAM, 80 GB of SSD storage) will run you $0.45 per hour. That’s $10.80/day. Obviously, the reserved rate is much cheaper.
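To put that hourly rate in perspective, here’s the back-of-envelope math (on-demand pricing only; the rate is the one quoted above and will drift over time):

```python
# Back-of-envelope cost for a single on-demand instance at the
# $0.45/hr rate quoted above. Reserved pricing is lower; this is
# just the on-demand ceiling.
HOURLY_RATE = 0.45  # USD/hr, m3.xlarge on-demand at the time of writing

daily = HOURLY_RATE * 24
monthly = daily * 30
annual = daily * 365

print(f"${daily:.2f}/day, ${monthly:.2f}/mo, ${annual:.2f}/yr")
# -> $10.80/day, $324.00/mo, $3942.00/yr
```

Multiply that by a production fleet and the reserved-rate discount starts to matter very quickly.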
On top of that, as soon as you hit AWS (or DigitalOcean or Azure), the premise that you can get rid of the expensive engineers goes out the window. You need those people, either on a full time or on a contract basis.
Case Study #1
I have a consulting client that does a (relatively) small amount of data processing and has their servers in a Tier II colocation facility. For a 1/3 rack, they pay $450/mo for power, 10mbit/s up/down, and cooling. (Yes, that is cheap, they’re grandfathered into a contract rate. The Tier IV side of the datacenter would also be more expensive.) The rack has five 1U machines, a 3U tiered SAN chassis with a flash cache, a 1U chassis with backup drives in it, and a Cisco router and firewall. All five 1U machines are approximately double the specs of the AWS instance I specified and they generally host two or more production VMs and a handful of development or testing instances, which means that the overall CPU utilization is rather high.
For comparison pricing, I was able to map the production instances onto various AWS instance types that would meet their needs, but with a big penalty for contract costs, EBS storage, and the database. Using the EC2 calculator, I specified ten m3.xlarge instances at 80% utilization and thirty t1.micro instances that are only on when developers are working (assume 40 hours per week, though it’s probably more like 60 or more once you count the continuous integration machines), which is a decent measure of how many dev and staging instances the group has running around. Worse, the data processing instances ingest data feeds, store them, and regularly re-poll them; this means that they do a lot of IOPS and use a lot of bandwidth during reporting cycles.
Total outlay to purchase the equipment in my client’s rack was $42,000 with warranty; it’s guaranteed to work for at least three years and will probably work for longer. Additionally, starting at the two-year mark, we began swapping in new processing equipment at about $3,000/yr, which together with the $450/mo colocation fee works out to a total annual cost of $8,400/yr. I expect the SAN chassis will need to be replaced at year seven, but that’s outside the horizon of this discussion. The equipment also has residual value that can be recouped by selling it if it hasn’t failed by the time we rotate it out.
What gets really expensive is database traffic. There’s an RDBMS that does a big chunk of the work. It requires consistency (eventual consistency is not good enough), chews a lot of bandwidth, and has about a 3:1 read:write ratio. Every pageview in the web app does about 150 queries (yeah, I know, I’m not the programmer), and every piece of data that we ingest gets written once, read twice, written once again, and then read monthly from there. Our warehouse is largely flat files at this point.
Total cost for AWS, excluding database, is $11,300 one-time fee for reserved instances, and $2200/mo ($26,400/yr) to run. The real joker, price-wise, in the stack is DynamoDB. By the time you include a fully consistent database, you’re looking at over $10,000 per month. You could buy this client’s existing environment every six months and still save money.
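The “buy the environment every six months” claim is easy to sanity-check with the figures above (the $10,000/mo DynamoDB number is the rough estimate from the text):

```python
# Sanity check: six months of the AWS quote vs. buying the client's
# rack outright. All figures are the rough estimates quoted above.
aws_reserved_fee = 11_300   # one-time reserved-instance fee
aws_monthly = 2_200         # EC2 and friends, per month
dynamo_monthly = 10_000     # fully consistent DynamoDB, per month
colo_hardware = 42_000      # the client's rack, purchased with warranty
colo_monthly = 450          # power, bandwidth, and cooling

months = 6
aws_6mo = aws_reserved_fee + (aws_monthly + dynamo_monthly) * months
colo_6mo = colo_hardware + colo_monthly * months

print(aws_6mo, colo_6mo)   # -> 84500 44700
assert aws_6mo > colo_6mo  # AWS costs more than re-buying the rack
```

Roughly $84,500 of AWS against $44,700 for a whole new rack plus six months of colo, so the claim holds with room to spare.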
With a thorough re-engineering of the environment to use a more economical data storage method and to minimize database traffic, we could probably bring the cost down to where we’d only be able to buy the existing environment every two years instead of every six months. AWS just doesn’t seem to make sense for this workload, and that re-engineering would require a significant investment of its own.
Case Study #2
On the other hand, I have another client who already uses AWS heavily (SES, among other services), but primarily runs on two leased servers, one for the database and one for the web. He’s I/O bound during peak time on the leased servers, and they consume a significant portion of his hosting budget. Most of the time, though, those leased servers sit idle. The database ticks along at about 10% CPU utilization. The web server gets bound when a traffic spike coincides with scheduled data processing tasks that exchange data with various providers’ APIs and create summary reports.
This environment, a traditional webapp, is far easier to engineer for the cloud. We’re standing him up on DigitalOcean. While DigitalOcean is a much newer host with a lower availability level and a less feature-rich toolset (no auto-scaling or beanstalks here), we can do most of the work with $5/mo droplets for the load balancer and a couple of web heads. His database server suffers from some of the same constraints as in the previous case, but we can still run it on a $10/mo droplet. Best of all, we can fire up individual small droplets for the data processing jobs and only incur a small hourly charge. This is a significant departure from the way the web app is currently architected, but it’s easy to squish all of the pieces into place with a sufficient application of sysadmin glue.
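The monthly budget for that layout is almost comically small. A quick sketch; the droplet prices are the ones quoted above, while the $0.007/hr rate for a small droplet and the two-hours-a-day batch window are my assumptions, not quoted figures:

```python
# Rough monthly budget for the DigitalOcean layout described above.
# Droplet prices are from the text; the $0.007/hr rate and the batch
# window are assumptions for illustration.
load_balancer = 5       # one $5/mo droplet
web_heads = 2 * 5       # a couple of $5/mo web heads
database = 10           # one $10/mo droplet
base = load_balancer + web_heads + database

batch_hours = 2 * 30            # assume ~2 hours of batch jobs a day
batch = batch_hours * 0.007     # small droplets billed hourly

print(base, round(base + batch, 2))  # -> 25 25.42
```

Even with the batch droplets, the whole thing lands around $25/mo, which is why the sysadmin glue, not the hosting, becomes the dominant cost.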
That glue gets expensive, though. I’m not that cheap these days, even if I’m still easy. (To get along with, of course.)
If you understand your environment thoroughly and you can engineer it for cloud computing, you can do cloud computing pretty cheaply.
If you’re working in a traditional office environment with just a file server, you’re good to go for the cloud, as long as you’re cognizant of the connectivity you’ll need to reach it.
My number one fear in my day-to-day work life as a Systems Administrator who works heavily with data processing workloads is the type of non-technical C-level manager (a CFO, say) who would read one of these articles and think that “the cloud” could replace all of the expensive colocation bills, capital expenditures, and salaries that come with doing everything in-house. Non-technical senior management sometimes has the ability to block or stall essential infrastructure purchases. If that C-level had read the article and were one of my bosses, I might have a difficult time explaining why our workload isn’t suitable for a cloud environment in terms that a non-technical executive would accept. In the past, I’ve seen that lead to a pretty hostile relationship between business management and technology management. Thankfully, that particular discussion is above my pay grade.
Except when I’m working as a consultant. Then I just charge double my rate for that type of work.
I’ve done a little spring cleaning around here — I’ve shifted hosts to DigitalOcean, updated WordPress, and thrown a new theme up.
One or two things in the menu and with the contact form may be broken while I’m changing things around. Stay tuned.
My doctor told me to take it easy over my winter break after a recent injury. I’m utterly incapable of taking it “easy.” He did expressly forbid me from working on my home improvement projects for a month. However, he did not say anything about giving my brain a good workout, even though that’s the part I physically injured. So I thought it might be fun to learn another language and to do some of the cross-platform game development that I’ve been thinking about for the past couple of years…
F# looks interesting. It’s a strongly typed functional language with first-class object-orientation, and its asynchronous (threaded) programming tools are being backported to C# in C# 5.0. It seems like it would make a lot of the (hopefully) asynchronous AI and world-generation programming that an infinite-world RTS requires somewhat simpler. Since there are GTK# bindings available to make basic UI chores easier, I’m wondering if it’s possible to do some moderately complicated game development in it. Initial research indicates that this is definitely the case, and there are some people working on it.
I’ve seen a few other tutorials, but they don’t seem to be using the most recent version of MonoDevelop. Getting started with F# is now insanely easy on OSX as of MonoDevelop 3.x.
Note: This will install F# 2.0 — apparently F# 3.0 is available if you instead install the Mono Project 3.x+; since I had MonoDevelop install it and was running the 2.x (stable) branch of Mono Project, I got F# 2.0.
- Download the Mono Framework from the Mono Project website and install it.
- Download MonoDevelop (I used 3.0.5) from the MonoDevelop download site
- Install the app and run it.
- Click MonoDevelop->Add-In Manager
- Click Gallery, and expand Language Bindings
- Click F# and Install
After the install runs (it took about thirty seconds), you can select F# projects from the new project tool, and the fsi, fsc, and other command-line tools are available and work well.
In the past, I’ve talked about the problems we’ve had getting hardware from major vendors to live up to its specifications and intended use, and getting the vendors to support and fix problems with the hardware that they ship us.
It’s just bitten us in the ass again.
HP’s current generation of DL580s and DL980s will not boot with a bad DIMM or DIMM slot. Period. Even with RAM mirroring or RAM “Pairing” enabled. There is no option in the UEFI menu to bypass the failure and no option to boot anyway. There is, in fact, no way to boot at all with a failed slot, even when the DIMM itself is good and it’s the slot that’s bad.
HP support, after two days of phone calls (aka “wasted time”) insists it works. They finally dispatched a technician, who proved it didn’t work and there was no way to make it work. Ball in HP’s court. We don’t expect to see the ball back any time soon.
Our $30,000 HP boxes won’t boot (and we’ve had a DIMM failure in the same slot in two machines now), but the $4,000 Supermicro ones will still boot. Guess which ones we’ll be buying more of?
The wife of a person who I wish I could count as a close friend has passed away, leaving Kris the single father of three young children who will grow up without their mother. I’m not even sure that words can describe how I feel at this point — I’m part of a small family that is not used to bereavement.
A bit of unwritten PHP/Symfony history: I hired Kris in late 2004 to be a part of the contracting company I was running at the time. It was immediately obvious he was a better programmer (and better person) than I was; it was also immediately obvious that I was a better programmer than I was a business owner — that is to say, I sucked at both roles. The company quickly folded, and I had to lay off Kris. The mayfly of a company was the first exposure that Kris had to the kind of mess that in-house developed PHP frameworks could be.
After I’d had to fold the company, Franya still invited me into the home she, Kris, and Sadie (at the time, a newborn) shared. They stored some of my furniture in their basement for a year until my family could come pick it up. I was too ashamed of myself to pick up my own possessions.
Tonight is my own personal wake for Franya. I remember the times that she tolerated the nickel poker games that Kris invited people over for, I remember her cheering on her father-in-law’s bottom-ranked indoor soccer league team as if it was the next incarnation of the Timbers. There were holes in the floor of the bathroom where the toilet was supposed to be in their home, but there was never a tattered heart in Kris and Franya’s home until tonight.
Rest well, Franya, we will remember you.
I’ve had my Nike+ Fuelband for three days. It’s a neat little bit of technology, and has replaced the simple stride (pendulum) pedometer that I used to take on walks. Buying it is part of my goal to slim myself down. One of the problems that I feel I have is that I don’t always live in the same place. I’m either at my house, my girlfriend’s house, or on the road for work or consulting. As a result, it’s difficult to maintain a sense of how much exercise I get from day to day — after a day in a datacenter, I might feel exhausted and not want to exercise, but need to burn a few more calories to continue my weight loss.
And I do need to lose weight — I ballooned up to almost 200 lbs in the first quarter of this year. I spend most of my time firmly planted at a computer desk… and on days like today, when all of my coworkers are out of the office, there’s not much opportunity for breaks.
The upside of the fuelband is that it’s neat and unobtrusive. The idea is that you wear it all the time (hence the wristwatch function) and over time you build up a picture of when you’re active during the day. It helps motivate people like me who have been known to plant their ass in a chair for twelve hours, breaking only for food and bathroom, to get off of their cans and on their feet.
The downside that I’ve discovered is that I don’t think it’s very accurate, or at least that Fuel doesn’t provide an accurate idea of how much you’re really exercising. As an example, I went on a four-mile walk on rough terrain with a dog and received about 1,000 Fuel points for it. Making the bed and driving to pick up one of my friends resulted in 300 points. How’s that work again? There is no way that in five minutes of bed-making and fifteen minutes of driving I burned nearly a third of the calories (or shuffled a third of the oxygen, or whatever Nike’s Fuel measurement is actually trying to measure) that I burned on a four-mile hike. I am an animated talker, and the band might have been interpreting all of the arm-waving and gesturing I was doing as exercise; if so, the FuelBand should not be worn full time. As for the walk scoring so low, the only thing I can think of is that I didn’t get enough arm movement in because part of the time I was holding the dog’s leash. On the other hand, it accurately measured the distance of the walk and came at least close to approximating how many steps I took. I might have to pull my original pedometer out of wherever it’s hidden to compare results.
Later that same night, I drove the 80 miles back up to my house. The FuelBand apparently doesn’t recognize that you’re sitting still when you’re driving; it registered about 200 fuel points burnt during an hour of freeway driving. The only motion I was engaged in was holding my cell phone to my ear and jawing with my mom. Again, this might be a case where the FuelBand should not be worn full time.
I’ll continue to experiment with it, but so far — my initial interpretation is that it is not very accurate at measuring actual exercise and activity.
The battery life is excellent. The battery was not yet half drained after three partial days of use. However, the only way to check the battery level is to plug the band into a powered USB port on your computer. My laptop only has two USB ports, so that means I need to unplug either my keyboard or my mouse to charge it. Although it charges very quickly, this is annoying. We could use both a battery readout and a wall charger; not having a battery readout seems like a big oversight.
I haven’t experienced the reported synchronization problems. Synchronizing via computer or iPhone is easy and fast. The website is excellent and really helps you break down your activity times. The additional motivation of unlocking ‘accomplishments’ and trophies is a really neat idea that definitely helps motivate me.
I was looking forward to a bit more social network interaction, and to being able to set up groups to compare what my friends who also have Nike Fuelbands are doing. There are at least three of them on Facebook. But there’s no easy way to do this. The only way we can set things up is to do an email invite group with a definite goal. What if we don’t want definite goals? This is another poorly thought-out area.
The band is unobtrusive, extremely comfortable even when working at a computer all day or sweating out in the yard, and is definitely wearable all the time without any issues. I found that even though the sizing guide said I was on the line between medium and large, the medium fits quite nicely without the included expansion joint.
Overall, I’ll give it a 2.5/5 unless I can figure out the weird aspect of the Fuel calculation that seems to reward me more for cleaning house than for taking a long walk. Pluses are the comfort and usability of the band itself. Minuses are the Fuel measurement, having to charge via USB, and some boggling usability and feature choices.
We’ve got a little project running to build out our new cloud environment. More on that later. The first problem is the hardware, which is left over from another project. Except that project never bought RAID cards for the hardware, even though they did buy disks.
The enclosures are 2U SuperMicro machines that support twelve 3.5″ disks but require low-profile expansion cards. The disks are 1TB Western Digital WD1001FALS drives. We have this new policy of testing hardware before we put it into production to make sure that something isn’t seriously wrong with it, so we started out by benchmarking single-drive performance on the crummy onboard ICH10 SATA controller.
Our testing methodology is basic. We’re using fio 1.58 and running a mixed read/write workload across 1GB of files, in order to exceed any cache. Eight of the drives will be in a RAID0 (yes, I know, but we’re basically doing RAIM — Redundant Array of Independent Machines — and it doesn’t matter if we lose a drive…) and the remaining four drives will be used for various system tasks. As a result, we evaluated 16-, 12-, and 8-port cards, since we can keep the four system-task drives on the onboard controller and put the bulk storage on the 8-port card, and we tried to evaluate one card from each major manufacturer where we could.
We used this FIO configuration file for the testing.
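For readers without the original file handy, a job file along these lines reproduces the shape of the test described — eight threads of mixed random read/write, a gigabyte of file per job to blow through any controller cache. The exact parameters here are my reconstruction, not the original configuration:

```ini
; Reconstructed sketch of the job file described in the text, not the
; original. Eight mixed random read/write threads, 1GB of file each,
; direct I/O to defeat caching.
[global]
ioengine=libaio
direct=1
rw=randrw
rwmixread=50
bs=4k
size=1g
iodepth=16
runtime=900

[mixed-load]
numjobs=8
```

The `runtime` cap explains why the well-behaved runs finish in about 90 seconds while the misbehaving card below ran to the full 900.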
Many thanks, by the way, to Joe of Scalable Informatics fame, who helped me with the mostly undocumented fio configuration and with interpreting the results. The intent was to really load down the system, as we’ll occasionally see performance at its worst while we’re rebalancing GlusterFS nodes.
All drives as tested were formatted with XFS, and XFS was given the proper parameters to match the RAID stripe size, block size, and RAID member count. Stripe size did not affect performance, and misconfiguring XFS only knocked a few MB/s off of the performance. The systems being tested were based on Debian Squeeze or Wheezy.
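The alignment parameters in question are XFS’s `su`/`sw` knobs. A sketch of deriving them for an array like this one; the device name and the 64k stripe are illustrative, not the exact production values:

```python
# Derive the XFS alignment flags for a striped array. su is the
# per-disk stripe unit and sw the number of data-bearing members;
# for RAID0 every member carries data, so sw equals the disk count.
# The device name and 64k stripe here are illustrative.
def mkfs_xfs_cmd(device: str, stripe_kb: int, data_disks: int) -> str:
    return f"mkfs.xfs -d su={stripe_kb}k,sw={data_disks} {device}"

print(mkfs_xfs_cmd("/dev/sdb", 64, 8))
# -> mkfs.xfs -d su=64k,sw=8 /dev/sdb
```

As noted above, getting this wrong only cost a few MB/s in these tests, but it’s cheap insurance.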
The bare drive performance was … barely acceptable. These drives are just not fast. We have several hundred of them sitting around unused at this point, but they’re just not fast. I’m aware of this, and aware that if I wanted faster drives, I would need to … buy faster drives. That doesn’t change the fact that a single drive on the onboard ICH10 was faster than a hardware RAID0 array, regardless of card manufacturer. As expected, a RAID0 of four drives on the motherboard did perform better than the single drive.
The number to pay attention to here is “aggrb”, the aggregate bandwidth across all of the threads in the test group, reported separately for reads and writes.
Single drive on onboard controller

Run status group 0 (all jobs):
READ: io=3899.1MB, aggrb=43205KB/s, minb=5389KB/s, maxb=6518KB/s, mint=79407msec, maxt=92432msec
WRITE: io=3854.2MB, aggrb=42698KB/s, minb=5411KB/s, maxb=6329KB/s, mint=79407msec, maxt=92432msec
Disk stats (read/write):
sdb: ios=11269/116, merge=0/4, ticks=990700/7375576, in_queue=11339444, util=99.99%
four-drive md RAID0 on motherboard
READ: io=3865.4MB, aggrb=58034KB/s, minb=7693KB/s, maxb=8684KB/s, mint=56725msec, maxt=68203msec
WRITE: io=3888.4MB, aggrb=58379KB/s, minb=7188KB/s, maxb=9407KB/s, mint=56725msec, maxt=68203msec
Disk stats (read/write):
md3: ios=14434/6379, merge=0/0, ticks=0/0, in_queue=0, util=0.00%, aggrios=3608/1532, aggrmerge=0/18, aggrticks=174913/2779239, aggrin_queue=3007156, aggrutil=98.52%
sdb: ios=3610/1599, merge=0/15, ticks=162252/1948784, in_queue=2111332, util=91.81%
sdd: ios=3622/1615, merge=0/19, ticks=158384/1409576, in_queue=1568244, util=88.70%
sde: ios=3621/1452, merge=0/23, ticks=210640/4865588, in_queue=5220932, util=98.52%
sdf: ios=3581/1465, merge=0/18, ticks=168376/2893008, in_queue=3128116, util=96.82%
Now, on to the hardware cards.
HighPoint RocketRaid 2740
RAID0, 64kb stripe
Run status group 0 (all jobs):
READ: io=3837.8MB, aggrb=29420KB/s, minb=3727KB/s, maxb=4083KB/s, mint=122403msec, maxt=133571msec
WRITE: io=3903.0MB, aggrb=29921KB/s, minb=3672KB/s, maxb=4227KB/s, mint=122403msec, maxt=133571msec
Disk stats (read/write):
sdb: ios=61319/953, merge=0/6219, ticks=7324792/16807752, in_queue=24753304, util=100.00%
Wow! Just to be on the safe side, I put the card in JBOD mode and ran a test against a single disk.
Run status group 0 (all jobs):
READ: io=3928.8MB, aggrb=41145KB/s, minb=5176KB/s, maxb=6050KB/s, mint=85203msec, maxt=97775msec
WRITE: io=3822.0MB, aggrb=40027KB/s, minb=5062KB/s, maxb=5900KB/s, mint=85203msec, maxt=97775msec
Disk stats (read/write):
sdc: ios=62770/523, merge=0/3073, ticks=5474732/8123704, in_queue=15781220, util=99.99%
That’s a bit slower than the same drive on the onboard controller, but it pretty much rules out the cable and the drive being tested: it must be the logic of the card or its driver that’s causing the slowdown.
LSI / SuperMicro AOC-USAS2LP-H8iR
While this card was the fastest of the hardware RAID cards, we couldn’t get a solid result from fio no matter how dumbed-down a configuration we gave it. The test would run overnight before finishing with an error, with one of the eight threads consistently still reading and writing to disk. I’m not sure if this was a bug in fio or a bug in the LSI/SuperMicro card, but SuperMicro’s support was so stellar in pointing fingers at LSI that we never resolved the problem. LSI, in turn, referred us back to SuperMicro, saying they held no responsibility for cards purchased through a third party. Interestingly, we could reproduce similar results by using LVM striping across four drives on any other card. We had so many problems with this card’s BIOS (which has a Windows 3.1-like shell as a configuration tool and doesn’t support USB mice) and drivers (no support for Debian Wheezy or Linux 3.0) that we just gave up on it.
Run status group 0 (all jobs):
READ: io=32206MB, aggrb=36641KB/s, minb=4585KB/s, maxb=4788KB/s, mint=900006msec, maxt=900035msec
WRITE: io=31983MB, aggrb=36387KB/s, minb=4534KB/s, maxb=4796KB/s, mint=900006msec, maxt=900035msec
Disk stats (read/write):
sdd: ios=108411/106883, merge=0/515, ticks=9122710/8694050, in_queue=17816860, util=100.00%
Areca

Out of all of the cards, I liked Areca’s drivers and management interfaces the best.
Run status group 0 (all jobs):
READ: io=3906.8MB, aggrb=41960KB/s, minb=5380KB/s, maxb=5783KB/s, mint=89043msec, maxt=95339msec
WRITE: io=3847.9MB, aggrb=41328KB/s, minb=5085KB/s, maxb=5782KB/s, mint=89043msec, maxt=95339msec
Disk stats (read/write):
sdd: ios=19236/19941, merge=0/965090, ticks=1565890/1470100, in_queue=3035980, util=99.96%
Since this card came closest to the single drive performance, we ended up choosing it.
My methodology here is lacking. The few variables that I could control for were different machines/motherboards of the same model, two different cards from the same manufacturer, different 1TB drives, and multiple runs of tests. We ran tests in RAID5 as well; I didn’t publish the RAID5 results here because they were superfluous: just knock 10MB/s or so off of the RAID0 numbers and you’ve got the RAID5 numbers. We tried different stripe sizes and different filesystem configurations, with explicitly aligned partitions and without. Nothing seemed to make a difference.
I would expect that any RAID0 array would be faster than a single disk. I would further expect that any cache would improve the performance, even in a heavily loaded random read/write environment. These expectations were proven false. I’m not sure why this is, unless no one else has truly measured performance in a random read/write environment, and people mostly rely on benchmarking tools like bonnie++ that aren’t as brutal about keeping queues flooded and randomizing I/O.
If I were working in a real hardware testing lab, I would move on to test these cards with different motherboards, backplanes, and drives. Unfortunately, I’m not — and this article is the result of 3 months of testing, ordering another card, testing again, tinkering with settings, reading obscure mailing lists, tinkering further with settings, and finally ordering another card in exasperation. I’d welcome someone else’s reproduction or refutation of these results with their own hardware, drives, and cards, a critique of my fio configuration and methodology, and suggestions on optimizing the other variables for improved performance.
Stephen Foskett is running a series about server blades — and as usual for someone who gets a lot of trial equipment to review, he’s pretty bullish on them.
After a few years with blades at my current company, I’m not. Unless you need the density that they offer, they’re probably not worth your time and money — and if you can afford them, you can probably afford to lease another rack or a larger cage.
While Stephen does an excellent job of covering the high points of blades, he skips or glosses over the downsides: you reintroduce several single points of failure in the form of the backplane and the modules plugged into the chassis, you take on the extra management overhead of the switches attached to that backplane, and you add heat risk because of the miniaturized, densely packed components.
Think this is doom-and-gloom? We’ve got a bunch of hardware sitting in a pile that says it isn’t. One of our IBM BladeCenter chassis has only one slot that will work in it — the rest of the slots give you strange PCI bus errors, KVM won’t work, or the management module will fail to connect to the hardware that’s installed properly. Since the blade’s backplane and management modules are a part of the chassis, IBM declined to replace it under our parts-and-labor warranty agreement — they said that we’d have to replace the entire chassis at our cost since the chassis is not a Field Replaceable Unit.
Troubleshooting problems with parts or upgrading parts on individual blades is a chore. Again, many of the parts aren’t technically Field Replaceable Units (and this includes parts like on-blade flash disks), so you’ll need to get out your oddball collection of Torx heads. It’s like laptop repair, with fine ribbons and cooling ducts stitching together byzantine layers of circuit boards. And let’s add another negative in — even if you cool the systems appropriately and your cooling systems aren’t overloaded and don’t ever overload, you still face heat death problems after the term of a normal warranty. Many higher ed institutions are starting to buy on a five year lifespan instead of the traditional three year lifespan, so high density systems like blades or thumpers are not an advisable solution there.
Many blade chassis are limited on expansion module space. Depending on your I/O configuration, you need at least six expansion slots to have some semblance of redundancy: two management modules, two I/O bus (Fibre Channel, InfiniBand, 10GbE, SAS, etc.) modules, and two Ethernet switch modules. The IBM BladeCenter S and E chassis options only support four modules. The newer, higher-end options, H and HT, support four high-speed and four legacy slots — keep that in mind when you’re thinking about expanding. Most of the modules only support six ports, which means that you’d need three modules (in the high-speed slots, of which you have four!) to give a full-bandwidth Fibre Channel connection to each of the 14 bays in a BladeCenter H-series — with no redundancy and no way to expand further. For environments that need both Fibre Channel and InfiniBand, you’re pretty much out of luck.
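The module math above is easy to verify with a quick sketch (the bay, port, and slot counts mirror the BladeCenter H example in the text; treat them as illustrative):

```python
import math

# Port math for a 14-bay chassis with 6-port I/O modules and four
# high-speed slots, mirroring the BladeCenter H example above.
bays = 14
ports_per_module = 6
high_speed_slots = 4

modules_needed = math.ceil(bays / ports_per_module)
print(modules_needed)                         # -> 3
assert modules_needed <= high_speed_slots     # fits, barely
assert 2 * modules_needed > high_speed_slots  # no room for a redundant set
```

Three modules fit, but a second, redundant set of three does not, which is the whole problem.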
Let’s not forget that each of the modules usually has a management interface of its own. The Fibre Channel modules have a console that you need to manage separately from any other Fibre Channel interfaces you might have. The switches have a Cisco IOS-like interface, unless you buy actual Cisco modules for your blade center. Why’s that a hassle? Keep in mind that you need to manage VLAN and trunking assignments and limits on both your core switch and your blade center’s switch.
So: High-bandwidth environments need not apply, since shared connections are the rule instead of the exception. Environments where an addition or switch to a new technology might be managed by adding three or four PCI cards to the affected servers need not apply — your chassis won’t have room for it.
For all of those “Features”, you gain the ability to save some floor space … and you pay a lot more.
Let me introduce to you a new technology called the “40 blade server” — you take a 42U rack, set up appropriate power modules on it, and then plug 40 1U servers and a pair of switches in at the top. Sure, there’s a bit more wire, but that’s easily managed. The 1U servers are individually less expensive than server blades and have a host of nice features — such as independent KVM and individual expansion card slots — that you won’t find in any blade server.
Admittedly, one place we have been very happy with “bladed” components is our Cisco routers. The ability to hot-swap modules and fail over between them is nice — but it’s something we could live without. It’s simply a better way to do things in the Cisco world, since the price differential isn’t that high and the equipment lifespan is closer to ten years than to three.
But for compute? Heck with that. I see very few environments where blade centers are a better solution than a rack of 1U servers.
Dad and I are both hobbyist renovators. Dad has been doing it for about thirty years longer than I have, though, and is a LOT better at it than I am. When I bought my house, he sent me a list of products and notes on them. I’m re-posting it here, with some of the discussion edited out, for posterity, reference, and linking purposes. I actually keep a text copy of the entire thing on my phone for reference when I’m away from the internets…
We have a saying in my family — “Ask me how I know!” — which usually means that we’ve screwed something up, bought the cheaper tool or the cheaper product, and ended up regretting it. Each and every one of the below points can be followed with “Ask me how I know!” from personal experience, because I don’t always believe good ol’ Dad either. The below is in completely random order.
- Spackle: MH Ready Patch – Spackle is used for small repairs to drywall, such as picture holes or where you’ve pulled out a molly/plug. MH will stick to lots of things and will harden up well without sagging. It’s solvent based, so it won’t cause problems with oil finishes and can be used on both wood and drywall.
- Caulk: Two products. Polyseamseal is available at Lowe’s, Home Depot, and many other places. Unfortunately, Henkel (owners of Loctite, and makers of the much beloved PL Premium) recently bought the parent company, so the tub/tile stuff is no longer available in matte and they may have changed the formulation. Outside, we use Sashco Big Stretch, which is pretty amazing stuff. It’s the most elastic caulk I’ve ever seen, but you really have to fill the crack with it — and larger cracks fill better unless you can sort of ‘glue’ the two pieces together. I especially like to use it on HardieBoard/Panel, which shifts a lot in the Texas heat. Once it dries, though, it can only be removed with lacquer thinner, if that.
- Paint: Benjamin Moore. Super Spec is contractor-grade crud; I use a LOT of Regal AquaVelvet, and Aura is amazing with dark or highly pigmented colors. We use Satin Impervo (oil) on trim. I’ve recently had decent experience with the HGTV and Duration lines from Sherwin-Williams, but anything except their top of the line has performed poorly for me, both in application and over time.
- Wire Nuts: Use only wire nuts with springs in them, like the tan Ideal ones available at Home Depot. Don’t buy the cheaper mixed bag of Buchanan ones with the different sizes/colors… you’ll spend more time dropping them or having them fall off in the j-box than you will actually screwing them onto wires.
- Screws: Don’t use drywall screws for construction, fencing… or really, anything but hanging drywall. For construction, I put everything together with screws because it makes it WAY easier to take apart later. I’ve done several structural changes to my house and used SPAX screws to hold interior framing together. (I typically use 2 inch or 3 inch screws.) For fences or decks, use fencing or decking screws.
- Paint Brushes: We tend to use Purdy brushes. Remember that you use different brushes for latex and oil; I don’t trust the “all purpose” ones to leave a clean line. My personal favorite brush is a Purdy Cub XL 2″ brush. Wooster brushes are also OK, but I don’t like them anywhere near as much as I like Purdy brushes. Clean them well and they’ll last you a long time.
- Drop Cloths: We put a 3-4 mil plastic drop cloth down with a canvas drop cloth over it. Why? Because without the plastic, the canvas will let the paint soak through. And without the canvas, when you spill or dribble some paint, you will step in it and then track it all over the rest of the house. Ask us how we know…
- Extension cords: Don’t get the cheap 14 or 16 gauge ones. Spend the money for a decent 12 gauge cord. Under load, the thinner gauges drop too much voltage for the tool to perform properly, and most power tools worth the name are going to pull 10–15 amps under load, not to mention any startup spikes or high draws.
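To put some numbers on that, here’s a back-of-the-envelope voltage-drop sketch. The resistances are standard values for copper wire; the 50 ft length and 15 A load are just example figures:

```python
# DC resistance of copper wire at room temperature, ohms per 1000 feet.
OHMS_PER_1000FT = {16: 4.016, 14: 2.525, 12: 1.588}

def voltage_drop(gauge, length_ft, amps):
    # Round trip: current flows down one conductor and back up the other,
    # so the effective wire length is twice the cord length.
    resistance = OHMS_PER_1000FT[gauge] * (2 * length_ft) / 1000.0
    return amps * resistance

# A 15 A saw on a 50 ft cord:
for gauge in (16, 14, 12):
    drop = voltage_drop(gauge, 50, 15)
    print(f"{gauge} AWG: {drop:.1f} V drop ({drop / 120:.0%} of a 120 V circuit)")
```

The 16 gauge cord loses about 6 V (5%) before the tool sees any power at all; the 12 gauge cord cuts that loss by more than half, which is the difference you feel when a saw bogs down mid-cut.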
- Sand Paper: Don’t bother buying Gator. It’s garbage, even the “higher grade” black stuff… the grit doesn’t last long enough and the paper loads up far too quickly with dust and other leavings. I’ve had really, really bad luck with Norton discs leaving horrible dual-action sander swirl marks because of the “universal” five-hole/eight-hole design. The Dad-approved product is Mirka sandpaper.
More blog posts along this line coming down the pipe…