We learned as kids. Here is an article I wrote about it in Mainframe Executive, which has been cross-posted to MainframeZone.
I am also pleased that an article by moi (the REAL storage blogger wannabe) has just hit the wire at ESJ.com. It covers file proliferation and the lame efforts of the industry to scale NAS to accommodate the data burgeon.
Frankly, I could have embellished it a bit further with two developments. One I will treat here, the other I will save for another post.
The more I learn about it, the more I like the Novell File Management Suite. NFMS is about policy-based file management based on user role: a concept I like a lot. I had a chat with them today and one of their really smart folks commented that in this DO MORE WITH LESS environment, people seem to be preoccupied with the WITH LESS component more than the DO MORE component. As we have been ranting about at length here over the past couple of months, folks are still trying to throw hardware at the data burgeon, relying on thin provisioning, de-dupe, on-array tiering, and similar functionality joined at the hip to their array controllers to tackle the difficult problem of data management. Technically speaking, data management has virtually nothing to do with hardware.
Oversubscribing storage via thin provisioning might make a dent in capacity allocation inefficiency, but it is downright idiotic to believe that it does anything whatsoever to economize on storage over the long haul. Plus, as I have argued here, it exposes you to really embarrassing failures if an app ever does a margin call and asks for space it technically owns, but that has been allocated somewhere else.
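For the skeptics, here is a toy model of that margin call. It is a minimal sketch, not any vendor's implementation: the `ThinPool` class, the volume names, and the capacities are all made up for illustration. The point is only that a write can stay within what a volume was promised and still fail because the physical disk has gone elsewhere.

```python
class ThinPool:
    """Toy thin-provisioned pool (hypothetical; not a real array API)."""

    def __init__(self, physical_tb):
        self.physical_tb = physical_tb
        self.promised = {}  # volume name -> TB promised to the app
        self.written = {}   # volume name -> TB actually consumed

    def provision(self, volume, promised_tb):
        # Thin provisioning: the promise is NOT checked against physical disk.
        self.promised[volume] = promised_tb
        self.written[volume] = 0.0

    def write(self, volume, tb):
        if self.written[volume] + tb > self.promised[volume]:
            raise ValueError(f"{volume} exceeded its own allocation")
        if sum(self.written.values()) + tb > self.physical_tb:
            # The app is within its promise, but the disk is not there.
            raise RuntimeError(f"pool exhausted while {volume} was within its allocation")
        self.written[volume] += tb

    def oversubscription_ratio(self):
        return sum(self.promised.values()) / self.physical_tb


# Assumed numbers: 10 TB of real disk, 16 TB of promises.
pool = ThinPool(physical_tb=10)
pool.provision("erp", 8)
pool.provision("mail", 8)
pool.write("erp", 6)
pool.write("mail", 3)

try:
    pool.write("mail", 4)  # mail would be at 7 of its promised 8 TB
    margin_call_failed = False
except RuntimeError as err:
    margin_call_failed = True
    print(f"write failed: {err}")

print(f"oversubscription ratio: {pool.oversubscription_ratio():.1f}x")
```

Nothing in the model reduces the amount of data anyone stores. It only defers the purchase and adds a new way to fail.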
We have already discussed on-array tiering, so I won’t repeat it here except to say that Tears of Storage do not substitute for real tiering, which should be based on a granular understanding of data and its business context, not mindless movements of anonymous bits across infrastructure.
Some proffer de-dupe as a substitute for data management. I call these vendors the 420s, because I suspect they were out smoking a joint when the memo came out to do more with less.
The de-dupe pitch offends intelligence on many levels. Once de-dupe is part of every file system, all of the software and appliances you are currently wasting money on can go bye-bye. For now, it seems more sensible to do de-dupe in software, not hardware. Here’s why:
Today, Sepaton announced a new class of de-dupe rigs. They sell a software-only de-dupe package, but claim that some IT folks actually want a “one stop shop.” So, they are doing the storage plus Sepaton “head” now, as well. At $310,000 for a 30TB rig, that works out to about $10K per TB. Since their software sells for $2500 per TB, that means that they sell a TB of commodity storage for roughly $7500, quite a bit more per TB than the $79 1-TB SATA drive I can buy on NewEgg. In their defense, they say their price is much lower than that of a roughly comparable Data Domain rig from EMC, which proffers a non-clustering (hence inferior in Sepaton’s view) box of rust for $15,665 per TB (not including 10GbE cards).
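If you want to check my math, the per-TB arithmetic fits in a few lines. This is a back-of-the-envelope sketch using only the prices quoted above; the variable names are mine, and it assumes the entire gap between the rig price and the software price is what you pay for disk and chassis. The unrounded figures come out a little higher than the round numbers in the text.

```python
# Prices as quoted in this post (USD); names are illustrative only.
RIG_PRICE = 310_000           # Sepaton rig, storage plus "head"
RIG_CAPACITY_TB = 30
SOFTWARE_PER_TB = 2_500       # Sepaton software-only price
SATA_DRIVE_PER_TB = 79        # 1-TB SATA drive at retail
DATA_DOMAIN_PER_TB = 15_665   # comparison figure cited by Sepaton

rig_per_tb = RIG_PRICE / RIG_CAPACITY_TB
# Assumption: everything that is not software is hardware.
hardware_per_tb = rig_per_tb - SOFTWARE_PER_TB
markup_over_retail = hardware_per_tb / SATA_DRIVE_PER_TB

print(f"rig price per TB:        ${rig_per_tb:,.0f}")
print(f"implied hardware per TB: ${hardware_per_tb:,.0f}")
print(f"multiple of retail SATA: {markup_over_retail:.0f}x")
print(f"Data Domain vs. rig:     {DATA_DOMAIN_PER_TB / rig_per_tb:.2f}x")
```

A bare retail drive is obviously not a fair stand-in for an enclosure with controllers, power, and support, so treat the multiple as a rough indicator of markup rather than a precise one.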
I will let Sepaton and EMC duke it out over the numbers. But I think the comparison is silly on its face.
Let’s look at the reality. A spokesperson for a major telco spoke at a show I attended recently and observed that he had been planning to ditch tape in favor of Data Domain VTLs, until two things happened. First, the costs of management for the DD rigs bit him in the tuccus (you need to stand up another box of drives and manage it separately when you run out of space on the rust). Second, he discovered that he needed to classify data to determine which files to direct to the box so that the rig would be utilized efficiently (aka would give him the best de-dupe ratios). He never dreamed that data management was a prerequisite for getting any kind of value out of that value-add technology that hardware vendors insist on adding to their already bloated array controllers.
Folks, the thing about data management is that it focuses on the data, not on capacity. Novell’s stuff is best of class among the products I have reviewed. Dave Condrey’s team has done an out-fracking-standing job with this software, which lets you set policies for data movement based on user role. Is it perfect? Nope. Will the Britney Spears files that the HR guy downloads at lunch be exposed to the same policies as the files created by his legitimate work effort? Possibly, depending on how you write your exclusion policies. But, now we have a sustainable way to tag a file, expose it to appropriate integrity, protection, and security services, and move it around infrastructure in an intelligent and compliant way. AND IT IS ALL INVISIBLE TO THE USER.
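To make the idea concrete, here is a rough sketch of what role-based placement with exclusions might look like. To be clear, this is NOT the NFMS API or its policy format, which I am not reproducing here: the `RolePolicy` class, the tier names, and the file patterns are all hypothetical, invented purely to illustrate the concept.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatch


@dataclass
class RolePolicy:
    """Hypothetical policy: where files owned by a given role should live."""
    role: str
    target_tier: str
    exclude_patterns: list = field(default_factory=list)


def place_file(filename, owner_role, policies, default_tier="primary"):
    """Return the tier for a file, or 'excluded' if an exclusion rule matches."""
    name = filename.lower()  # compare case-insensitively on every platform
    for policy in policies:
        if policy.role != owner_role:
            continue
        if any(fnmatch(name, pattern) for pattern in policy.exclude_patterns):
            return "excluded"
        return policy.target_tier
    return default_tier  # no policy for this role


# Made-up policies: HR work product is archived, media files are not.
policies = [
    RolePolicy("hr", "compliance-archive", ["*.mp3", "*.avi"]),
    RolePolicy("finance", "replicated-tier"),
]

print(place_file("offer_letter.docx", "hr", policies))
print(place_file("Toxic.MP3", "hr", policies))
print(place_file("build.log", "engineering", policies))
```

The decision keys off who owns the data and what it is, not which array it happens to sit on. Whether the lunchtime downloads escape the archive depends entirely on how well the exclusion list is written.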
The cost per TB? No clue. They sell software on a per-user license basis: MSRP is $37 before volume discount. No magic mushroom inflation of commodity disk price there!
I would like to see any hardware vendor compete with this. That is how you not only do the WITH LESS part, but also the DO MORE part.