I participated in a tweetchat on auto-tiering today, playing the annoying member of the audience. (The full transcript is here.) It was sponsored by IBM and featured the smart ex-industry analysts at Wikibon, who were very nice to me. I have just completed a column about it for Storage Magazine, but I have a few more thoughts that wouldn't fit in that space.
In advance of my rant, I invite you to check out Wikibon’s thoughts on auto-tiering. They have a nice article about it here.
The panelists were of the opinion that auto-tiering is very much needed to move older data off of performance storage without bothering the poor overworked storage admin. They also asserted that traditional tiering was changing significantly with the introduction of Flash SSD. I didn’t get that. Here is the traditional tiering model c. 1980s from IBM.
Even way back when I was a bright-eyed lad in my first data center role, memory was at the top of the storage food chain. We just moved data quickly from that tier to the lower ones because of cost. Because I am slow-witted, I didn't see how Flash SSD did anything to change the model.
The exception would be the use of Flash SSD as a temporary location to store data that is getting hammered by multiple, concurrent accesses — basically augmenting and optimizing disk. When the data cools, it moves back onto disk (or data access requests are simply repointed back to the original copy of the data that never left the disk).
This strategy uses Flash to optimize disk, serving accesses from the Flash copy while the data is "hot"…
When accesses decrease, data is cold and access requests are repointed to disk. The copy of the data on the Flash SSD is erased.
That’s an interesting loop back on traditional one-direction tiering, but we are still talking about traditional tiering for the most part. X-IO does this really well, by the way, supporting the data movements with its HOT SHEETS patents.
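For the record, that loop is simple enough to sketch. Here is a minimal illustration of heat-based promotion and demotion; the thresholds, names, and data structures are my own assumptions for illustration, not X-IO's patented method or any vendor's shipping algorithm:

```python
# Minimal sketch of heat-based Flash/disk tiering as described above.
# Thresholds, names, and structures are illustrative assumptions only,
# not X-IO's patented method or any vendor's actual algorithm.

PROMOTE_THRESHOLD = 500  # accesses per interval that make a block "hot"
DEMOTE_THRESHOLD = 50    # accesses per interval below which it is "cold"

class Tier:
    """Trivial stand-in for a storage tier holding block copies."""
    def __init__(self, name):
        self.name = name
        self.blocks = set()

class Block:
    def __init__(self, block_id):
        self.block_id = block_id
        self.access_count = 0   # accesses observed in the current interval
        self.on_flash = False   # True while a hot copy lives on Flash

def retier(blocks, flash):
    """Run once per sampling interval. The disk copy never moves;
    Flash only holds a temporary copy while the data is hot."""
    for b in blocks:
        if not b.on_flash and b.access_count >= PROMOTE_THRESHOLD:
            flash.blocks.add(b.block_id)      # copy up; serve reads from Flash
            b.on_flash = True
        elif b.on_flash and b.access_count <= DEMOTE_THRESHOLD:
            flash.blocks.discard(b.block_id)  # erase the Flash copy...
            b.on_flash = False                # ...and repoint I/O back to disk
        b.access_count = 0                    # reset counter for next interval
```

Nothing about that loop overturns the 1980s model; it just adds a round trip to the top tier.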
Other than adding cost to storage, I fail to see why Flash is considered so disruptive to traditional tiering.
Next, someone said that the entire media stack should be deployed as a monolithic unit. (Tape was not included until I brought it up, which made for a nice sidebar about LTFS.) This provoked a discussion of the merits of monolithic storage and of homogeneous storage. They seemed to like it when I said that if I followed Forrester's guidance, I would build all of my storage infrastructure using just one vendor's wares to ensure that I could tier between platforms. Such a strategy would limit you to one of a handful of vendors with a product bench deep enough to provide Flash storage, high-performance disk storage, capacity disk storage…and maybe tape. In fact, if you included tape in your stack, your options for single-vendor sourcing would come down to IBM, HP and maybe Oracle/Sun/STK. (A Dell guy on the chat didn't like the sound of that.)
My suggestion that we might virtualize storage and create shareable pools from heterogeneous infrastructure, then tier between the pools, met with a mixed response. The absolutists said there should be no sharing in high-performance tiered storage, while another camp said what I was thinking: it sure would make things less expensive.
What I was trying to drive at was that it wasn't necessarily the kit you used that made tiering workable; it was the software with which you defined policies on data movement and executed the moves per some trigger or calendar function. That conversation went nowhere, since we were short on time and I guess no one on the call knew of any software that could tier data across targets without being embedded in a proprietary array controller.
Someone mentioned SVC and DataCore Software. I was delighted but left the subject there.
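For the curious, what I mean by controller-independent tiering software is something like the following sketch. The policy fields and trigger types are hypothetical inventions of mine for illustration, not the syntax of SVC, DataCore, or any real product:

```python
# Hypothetical, array-independent tiering policy engine. Every field name
# and trigger type here is an invented illustration, not a real product API.
from datetime import datetime, timedelta

tiering_policies = [
    {
        "name": "demote-stale-data",
        "source_pool": "performance",   # virtualized pool, any vendor's gear
        "target_pool": "capacity",
        "trigger": {"type": "age", "days_since_access": 90},
    },
    {
        "name": "nightly-archive-sweep",
        "source_pool": "capacity",
        "target_pool": "tape-archive",
        "trigger": {"type": "calendar", "cron": "0 2 * * *"},  # 2 AM daily
    },
]

def should_move(policy, last_access, now=None):
    """Evaluate an age-based trigger against a file's last-access time."""
    now = now or datetime.now()
    t = policy["trigger"]
    if t["type"] == "age":
        return now - last_access >= timedelta(days=t["days_since_access"])
    return False  # calendar triggers fire from a scheduler, not per file
```

The point is that the policy speaks in terms of pools, not array models, so any heterogeneous gear virtualized into those pools becomes a legitimate tiering target.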
In the final analysis, I left the discussion thinking that auto-tiering probably doesn't save nearly as much money as the vendors say, and I am almost certain that whatever cost savings accrue are eaten up by hardware costs. Most multi-tier rigs sell commodity performance and capacity components at a huge mark-up because they are pre-integrated and complemented by value-add software. Then there is the matter of vendor lock-in with monolithic arrays. No one wanted to go there.
Oh well, I guess I was a fly in the ointment. I just wish folks would recognize that you can make all of the capacity allocation efficiency improvements that you want (dedupe, compression, thin provisioning, auto-tiering), but you aren't going to win the infrastruggle that way. You need to focus on capacity utilization efficiency: placing data into archives based on business-facing policies and a granular understanding of the data and the business processes and apps that it serves.
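To put a point on the difference, a business-facing placement policy might look something like this sketch; the data classes and retention rules are invented examples, not anyone's actual schedule:

```python
# Sketch of business-facing data placement: classify data by the business
# process it serves, then derive placement from that class. The classes
# and rules below are invented examples for illustration.

DATA_CLASSES = {
    "accounts_payable":    {"archive_after_days": 365, "archive_target": "tape-archive"},
    "engineering_scratch": {"archive_after_days": 30,  "archive_target": "capacity-disk"},
}

def placement_for(business_class, age_days):
    """Decide where data should live based on what it serves, not how hot it is."""
    rule = DATA_CLASSES[business_class]
    if age_days >= rule["archive_after_days"]:
        return rule["archive_target"]
    return "performance-disk"  # default home for active business data
```

Note that the driver here is the business process the data serves, not raw access heat. That is the difference between allocation efficiency and utilization efficiency.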