Yesterday, after writing here, I jumped on Twitter to get some input from the peanut gallery. Actually, it turned into a rather engaging exchange.
I started with a shout-out to the world that on-array storage tiering struck me as technology for lazy people. I knew that would provoke someone. Shortly after, I had an exchange with a fellow that went something like this…
Inbound tweet: …and tractors are tech for lazy farmers?
Me: Figured that would get a rise. Hey, we used DFHSM in the mainframe world, but it was a software architecture provided by the operating system, not a process performed by DASD. I don’t know that your analogy is valid. Not saying don’t use a computer or an array. Question is whether on-array tiering solves any real problems…
Inbound tweet: Well, as I pointed out recently, an analogy makes the point the analogist wishes.
Basically, I believe there are use cases…and strong use cases for on-array tiering – there may be valid reasons why admins can’t touch the file system to use HSM. If HSM were cheap, easy and worked well, on-array tiering wouldn’t be so damned interesting to a great number of end users.
Me: Would really like to hear your use cases. I know we’re short staffed with a lot of data in our storage, but I have a prob with brain-dead algorithms and overpriced hardware stovepipes. Do this in software.
Tweet from an EMC guy: You’re also relying on the assumption that HSM is available on every system; that’s not necessarily true.
Me: True enough. But tools are getting there for policy based file management, which is what we are really talking about here…
Original Inbound Tweeter: Use case #1 – storage admin doesn’t own the application but is under the gun to lower storage cost. Use case 2 – complexity of data management will increase administrative cost while hopefully lowering capex; no win for HSM. Ok, 2 wasn’t a use case, just a reason, sorry…
Me: I get Use Case 1 all the time. The truth is that policies should be set by the front office regarding when and where data should go…
Inbound Tweeter: Oh, I agree in whole, but the front office doesn’t back it up, doesn’t build strong policy, or constantly makes exceptions…we’re talking reality versus theory.
Me: Maybe we were yesterday. Today we are talking survival. The legal and regulatory mandate, particularly around personal data privacy, is moving from breach notification to breach liability. Seems like understanding your data and applying appropriate services to it based on its business context is becoming a must-have, not a theoretical nice-to-have.
Interesting exchanges happen on Twitter. I also got this note via email today from a fellow who is a former EMCer and preferred not to have his name disclosed…
As someone who just left EMC, I’d rather not post openly in the blog at this point, but I agree with most of what you are pointing out.
The real delusion in the tiered storage approach is that the data owners – the end users – have no incentive to participate in data classification, so the decisions get left to data administrators or worse (software). Just because data is accessed frequently doesn’t mean the access is high priority.
Tiering inside the box will only make sense if a particular application is smart enough to control the layout of its data in partnership with the data owners and then only if it understands how the tiers are organized. But if Oracle or Exchange are really that smart, why do you need an enterprise storage array?
Why indeed?

2 comments
Oy…the case for external intelligent AUTOMATED tiering is already made – it’s solid, and it’s customers, not vendors, doing the speaking…
Talk about obfuscation – front office, business users don’t have time or interest in facilitating the tiering of their files – they get no benefit from it, and wouldn’t know how to do it anyway. They want a limitless supply of high performance, always available capacity – they don’t want to hear about how hard or expensive it is to give it to them – they don’t even want to worry about ensuring their data is safe – just ask one of them when was the last time they backed up their laptop.
Only storage admins, IT architects, operations people, and infrastructure managers care about the HOW of delivering that perfect, limitless supply of capacity. They know they won’t get help from the business, and they also know that the business wouldn’t know what files were valuable if they asked. They know that if they asked the business 3 times in 3 days they would get 3 different answers. The fact is IT DOESN’T MATTER!!!
IT can implement effective, thoughtful, and economically powerful tiering, based only on external metadata and simple business policies. This is truly a case where 80% is good enough. Classification is a red herring. If a file hasn’t been used in 90 days, move it to Tier 2. If the user writes to it, move it back to Tier 1. If it makes sense in your business, move all MP3 to Tier 2. Or move all JPEG to Tier 2. Suddenly, Tier 1 contains frequently updated, relatively valuable data. The actual value doesn’t matter if all of a sudden you aren’t having to back up 80% of your files every day. So what if you still back up a few files of lesser value? You were backing them ALL up yesterday.
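The rules in the paragraph above can be sketched in a few lines of Python. This is only an illustration of the kind of metadata-driven policy being described, not any product's actual logic: the function name `choose_tier`, its parameters, and the assumption that the file-type rule wins over the recency rule are all hypothetical, and the 90-day limit and the MP3/JPEG list are simply the examples from the comment.

```python
import os
from datetime import datetime, timedelta

# Example values from the comment; a real policy would set its own.
IDLE_LIMIT = timedelta(days=90)
TIER2_EXTENSIONS = frozenset({".mp3", ".jpg", ".jpeg"})


def choose_tier(path, last_used, now,
                idle_limit=IDLE_LIMIT, tier2_exts=TIER2_EXTENSIONS):
    """Pick a tier from external metadata only (hypothetical helper).

    path      -- file name; only its extension is inspected
    last_used -- datetime of the most recent read or write
    now       -- datetime at which the policy is evaluated
    """
    ext = os.path.splitext(path)[1].lower()
    # Assumption: the file-type rule takes precedence over recency.
    if ext in tier2_exts:
        return 2
    # Idle for 90 days or more: demote to Tier 2.
    if now - last_used >= idle_limit:
        return 2
    # Recently used (a write refreshes last_used): keep in Tier 1.
    return 1
```

Because a write refreshes `last_used`, running the same function again after the user touches a demoted file sends it back to Tier 1, which is the "move it back" rule. Nothing here looks inside the file or asks the business to classify it.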
For more on the economics here see actual customers giving examples at http://www.techvalidate.com/product-research/f5-arx
Tiering internally may increase utilization of a single array. Tiering to SSD internally or externally definitely can help with performance as Rick Gillett proved in his SNW presentation in Spring 2007 http://www.snia.org/education/tutorials/2007/spring/storage/Accelerating_Application_Performance.pdf
BUT – the value of automated external storage tiering is multi-faceted – including increases in capacity utilization, decreases in disruption to end-users, ability to use less expensive storage, and reduction in backup time and cost. Tiering internally provides only a subset of these, and only the weaker ones.
Jon, response on my blog if that’s OK – http://www.storagegumbo.com/2010/04/dear-jon-use-cases-for-block-level.html