ISE Launches

by Administrator on April 8, 2008

A key building block of enterprise network storage architecture, first dreamed up at Digital Equipment Corp and Compaq Computers in the late 1990s, has finally seen the light of day. Xiotech, which acquired the assets and management of Seagate’s advanced technology group late last year, released today the world’s fastest and most resilient storage component since papyrus and rock tablets: the Intelligent Storage Element (ISE). Read my VERY POSITIVE review here.

At dinner with Xiotech last evening, the atmosphere was one of glee and foreboding. The techs were clearly delighted with the decision by Casey Powell to introduce this disruptive technology into the market — a daring undertaking that makes Xiotech’s product offerings every bit a competitor to the likes of EMC, NetApp, HDS, IBM and HP and propels them ahead of the pack of knee nippers and ankle biters led by 3PAR.

SPC-1 and SPC-2 performance results, released as of midnight last night, show ISE to be NUMERO UNO. Plus, the on-board drive re-manufacturing processes are worthy of emulation by NASA.

I love this thing and will pay close attention to the efforts of Xiotech to parlay the product into the success story that it deserves to be. ISE, combined with a superior web-services-based management technology from Xiotech (ICON), has the potential to change everything.


LeRoy Budnik April 9, 2008 at 11:53 am

What was the foreboding part?

Administrator April 9, 2008 at 1:43 pm

Hi LeRoy,

My sense is that foreboding was driven by equal parts “How will the rest of the industry treat our announcement: will it be met with a counter-barrage of hype and FUD?” and “Will the trade press folks and analysts get it?”

Same stuff that folks in the industry sweat with any new product announcement, I suppose.

TimC April 9, 2008 at 10:10 pm

I fail to see the *disruption*. A sealed block of disks isn’t exactly my idea of a step forward. What happens when I lose the third disk? Hell, what happens when I lose the second disk? I have two hot spares, then what? I get to buy a whole new block of disks and copy everything over so that I can replace the original?

Sun already tried this with their blade storage system. It was a bad idea then, and failed miserably, and it’s a bad idea now. Having to rebuild 5TB of data because your *block* failed hasn’t ever been a good idea.

sanguy April 10, 2008 at 11:03 am

Still no snapshotting and no built-in replication. Totally bogus. And really, how hard is it to swap out a drive when one goes bad in your enclosure? We have a Xiotech MAG 3D and a Compellent SAN at my site and the total amount of time I’ve spent dealing with bad drives is about 5 minutes. I’m sorry, but I fail to see how this really changes the game.

Administrator April 10, 2008 at 1:44 pm

Thanks for your comments, TimC and Sanguy. I think you are missing the key points though. I will try to make my view more explicit in a white paper or something.

Frankly, I think that if you are looking for SAN-in-a-Box wares or traditional big iron, you don’t see the point. Unfortunately, I had a long talk with someone today who said the presentation around the product seemed to encourage such comparisons rather than to introduce the big differentiation. I wasn’t there so I don’t know.

From where I’m sitting, we have a building block. Software guys will flock around it to add the kind of value that you seem to say is lacking in the product. Frankly, I would prefer all “value add” to be outside the product rather than built in. That way, I can buy the fastest, most resilient, most morphable (in terms of interconnect) datapacs at the lowest price and selectively add services that I need. I don’t pay for shite that I don’t use and I buy manageable boxes at a price that is not severely different than the commodity disk drives inside.

I for one am sick of paying $53K for one TB of HP/HDS storage or $73K for 2TB/360 GB usable of NetApp storage… Why aren’t you?

TimC April 10, 2008 at 11:31 pm

I understand what you’re saying, but it doesn’t change the fact *datapacks* are a HORRIBLE idea. They always have been, and EVERYONE who has tried it has failed. Why would I *EVER* hamstring myself by not being able to swap a single disk? This is the same reason you don’t build a 30-disk RAID group out of 1TB SATA drives. Rebuild is a nightmare.

I don’t know how anyone could actually sell this to an end-user with a straight face. How does that conversation go?

“What happens when a datapack fails?”
“They never fail.”
“So what happens when it fails?”
“Well, if you mirrored to another datapack it’s only a 1 week resync… or about 6 weeks from tape.”

As for pricing… this changes nothing in that arena either. The reason you’re paying that much for a TB of storage isn’t because of the cost of hardware or software, it’s because of the profit margin tagged onto it. This isn’t *lowering* the price of anything. You mean to tell me that’s as cheap as Xiotech could sell this thing for? THAT is a good one. They’re getting theirs just like everyone else…

“$32.25/MB/sec for SPC-2” Wow, THAT’S impressive. What a “game-changer”. Whereas I have a whitebox sitting next to me running ZFS on a bunch of SATA disks in a supermicro case. That runs about $10/MB/sec for SPC-2 and that’s being EXTREMELY conservative. I can serve iscsi, cifs, nfs, or fcp and I could do it cheaper today with better hardware to boot… Oh, I can also swap disks when one fails…

TimC April 10, 2008 at 11:35 pm

As for your comments about building software around this:

1. Buy an FC JBOD
2. Populate it with “low-cost” off-the-shelf disks
3. Load up Solaris
4. Make your own filer on the cheap

Anyone is free to do so.
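
[For what it’s worth, a minimal sketch of what step 4 could look like on Solaris with ZFS, wrapped in Python. The pool name, device names, and NFS-only sharing are illustrative assumptions, not a recommended layout.]

# Hypothetical sketch of TimC's DIY-filer recipe once Solaris is installed:
# carve the JBOD's disks into one ZFS pool and export a filesystem over NFS.
# Device names, pool name, and share settings are illustrative only.
import subprocess

DISKS = ["c1t0d0", "c1t1d0", "c1t2d0", "c1t3d0", "c1t4d0", "c1t5d0"]  # example JBOD devices

def run(cmd):
    print("+", " ".join(cmd))          # echo each command before running it
    subprocess.run(cmd, check=True)    # stop if any step fails

# One raidz2 pool across the disks (tolerates two simultaneous drive failures).
run(["zpool", "create", "tank", "raidz2"] + DISKS)

# A filesystem shared over NFS; CIFS or iSCSI targets would be extra steps.
run(["zfs", "create", "tank/export"])
run(["zfs", "set", "sharenfs=on", "tank/export"])

[A failed disk in a pool like this is swapped with a plain zpool replace, which is exactly the serviceability point TimC is making.]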

jasondbaker April 11, 2008 at 2:52 pm

TimC, you are thinking too small. You have to think of a datapack as a large, fast, resilient, self-healing drive. It is a building block for larger storage. Nobody has built this sort of datapack before. The characteristics of the device cannot be compared to a 10, 20, or 30 disk array.

The pricing is a huge change. I’m not sure if you buy high end storage. If you do you probably know that you pay 2x for the cost of a disk shelf over 5 years (at 20% annual maintenance). So that $25k shelf really costs you $50k. And after 5 years you probably have to replace the whole shelf. The cycle begins once again. Xiotech is basically saying you buy the shelf and it is guaranteed for 5 years with no annual maintenance. In five years I might have to replace it. But when I replace it I will probably be able to buy 2x as much storage for the cost of the original shelf. You are probably looking at a 40%+ reduction in storage costs.
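
[A rough sketch of that maintenance arithmetic, using the $25k shelf and 20% annual maintenance figures above; the inputs are illustrative, not anyone’s actual quote.]

# Back-of-the-envelope 5-year cost comparison using jasondbaker's figures.
shelf_price = 25_000          # up-front cost of a disk shelf
maintenance_rate = 0.20       # annual maintenance as a fraction of purchase price
years = 5

traditional = shelf_price + shelf_price * maintenance_rate * years   # shelf plus 5 years of maintenance
ise_style = shelf_price                                              # 5-year warranty, no annual maintenance

savings = 1 - ise_style / traditional
print(f"traditional: ${traditional:,.0f}   ISE-style: ${ise_style:,.0f}   saving: {savings:.0%}")
# prints: traditional: $50,000   ISE-style: $25,000   saving: 50%

[On these inputs the saving is 50% before even counting the capacity bump at refresh time, which is consistent with the 40%+ figure above.]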

ericmb April 15, 2008 at 11:11 pm

Admin,

I read about this ISE thingy and then I clicked on the link and I seriously fail to see anything new here. I think maybe you suffer from not having worked doing BAU (business as usual) in a large environment of late?

I’m an ex-NetApp employee; my last active assignment was with one of the biggest US banks globally for close to 2 years.
They performed an internal TCO analysis and NetApp came out 3x cheaper than EMC. So please, next time you throw out your figures of “cost,” maybe you should specify that your figures will not fit all environments? Because taking the cost of hardware ONLY is a very simplistic way of looking at it. It just doesn’t work, mate.

Also, Ontap has had disk maintenance for a while now; it’s not new. Disk maintenance is the “on-board disk remanufacturing” the article describes, and it came with Ontap 7.x something. I forget which specific Ontap version it is.

Figures from big sites show 75% fewer disks returned to the manufacturer. Needless to say, this is a time saving in an environment where getting rid of a failed disk is a 2-hour process at least, taking into consideration all the admin/change controls you have to go through to send someone to grab a disk.

all the best,
Eric

Administrator April 16, 2008 at 10:09 am

Hi Eric,

I have been in large environments of late and I don’t see them clamoring for any new technology. They don’t route, as the saying goes, and are content to have FC fabrics and one-size-fits-most storage. A guy from a major NY financial a couple of weeks ago told me that he has to buy what the purchasing department has on its play card (deals made in the front office with the likes of EMC, NetApp, HDS, etc.) — so he can’t even think of taking advantage of anything new. I don’t see the large enterprise space as the early adopter of this technology, regardless of speeds and feeds.

That said, I am delighted that your TCO analysis shows NetApp to be 3x cheaper than EMC in the use cases you are citing. I am in full agreement that TCO covers more than acquisition price (duh); however, the price tag is what we have to work with at the outset. $83K for 360GB of usable capacity off a baseline NetApp filer is pretty steep from where I’m sitting.

I wasn’t aware that OnTap had already done the on-board remanufacturing thing. That will be news to everyone, since this is patented technology purchased from Seagate by the Xiotech folks. Maybe there’s a lawsuit opportunity there and they can get their money back from Seagate. (Tongue in cheek, here.) Please tell me more.

The stat you quote on 75% fewer disks returned to the manufacturer is one I haven’t heard. What is your reference? Certainly not Seagate. They have shared their numbers on drive returns and they are much higher than what you cite. Also, the preponderance of drives returned have no discernible internal fault, which they discover through the use of the testing and refurbishing processes that are on-board ISE now.

Thanks for commenting.

jspiers May 15, 2008 at 11:00 am

I just received a Xiotech advertisement quoting John: “Xiotech… released today the world’s fastest and most resilient storage component since papyrus and rock tablets,” with a link to this page. I’m surprised that they would do this given all the negative comments.

I spent 15 years in the disk drive industry and I’m still not sold on this technology. In fact I’ve considered developing it myself, but as you dig into the details it’s hard to build a business case when it’s full of risky assumptions.

The specifications/assumptions for the product are based on drive reliability models from the manufacturer, which I know from personal experience are full of sunny-day assumptions and are being proved wrong on a regular basis. See http://www.cs.cmu.edu/~bianca/fast/index.html

Granted, many of the “failed” drives that are returned from the field are diagnosed “No Defect Found.” Data has proved that many of these drives fail erroneously because of their environment. For example, if the disk drive is mounted into a poorly designed carrier or chassis, the rotational vibration of the disk drive in the carrier or those adjacent to it can cause drives to generate R/W errors and fail. When you take one of these failed drives out of the carrier and test it on a rubber mat it’s perfectly fine. Hitachi disk drives have incorporated an accelerometer to measure rotational vibration and can adjust their servo system on-the-fly to compensate for it. I know other disk drive makers have implemented this or similar technology as well.

Xiotech’s analysis says that if you mount drives and cool them appropriately it eliminates these “NDF failures”, thus improving overall reliability. This is very true, but all the major storage vendors figured out proper disk drive mounting and cooling years ago, and the sophistication of today’s disk drive servo and read channel technology prevents many of these “environmental failures” anyway.

What the disk drive guys don’t talk about are things like thermal asperities, erasure, media corrosion, firmware bugs, contamination, part defects, and the many things that escape the factory’s tests on a regular basis. All it takes is one contamination event, a bad batch of heads or media, or a firmware bug to wreck your reliability calculations. I could write a book about all the things that can go wrong with drives that are not considered for these types of designs. In fact, even the disk drive guys are still trying to figure out what causes many of their field failures that don’t show up during reliability testing. This is why the big OEMs make the drive guys screen for certain defects across the street from their factory, which I doubt Xiotech is doing. Disk drive guys also bin drives based on quality and ship the best to the big OEMs and pump the rest into the channel, which is where Xiotech probably buys from.

The box appears to be exposed to a double disk fault from BER events during reconstruction onto a hot spare, because there appears to be no RAID 6 or equivalent protection against a 2nd drive failing before its data is successfully reconstructed onto a spare. Background scrubbing and using drives with a BER of 1 in 10^15 or better reduces this risk substantially, but doesn’t eliminate it. Mirroring also reduces it because you only have to read one drive to reconstruct the failed drive.
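
[To put a rough number on that rebuild exposure, a quick sketch; the drive capacity, drive counts, and BER below are illustrative assumptions, not Xiotech’s figures.]

# Rough odds of hitting at least one unrecoverable read error (URE) while
# reading the surviving drives during a rebuild. Capacity, drive counts, and
# BER are illustrative assumptions.
import math

def p_ure_during_rebuild(drives_read, capacity_tb, ber=1e-15):
    bits_read = drives_read * capacity_tb * 1e12 * 8    # bits that must be read without error
    return 1 - math.exp(-ber * bits_read)                # approximates 1 - (1 - ber)**bits_read

print(f"mirror, read 1 x 1TB drive:  {p_ure_during_rebuild(1, 1):.2%}")   # ~0.80%
print(f"RAID-5, read 9 x 1TB drives: {p_ure_during_rebuild(9, 1):.2%}")   # ~6.95%

[Which is the point about mirroring: the fewer bits you have to read back cleanly during reconstruction, the smaller the window for a second fault.]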

What blows their assumptions apart is the varying environment in the data center, which they can’t control. Having a drive replacement once every 1-2 months (a very high failure rate compared to published MTTFs) in a standard array in a poor IT environment is manageable, but if the environment takes out a drive every month in one of these closed boxes it could completely fail in 6 months or less depending on the sparing model used.

What really scares me is their claim that they can fail a disk surface in a drive and use the remaining good surface(s). First of all, you just lost 1/6 to 1/2 of that disk drive’s performance, because all heads read and write in parallel. The more heads you have, the faster the disk drive. If that drive is in a striped set, it degrades the performance of the entire set.
What scares me most about this is the possibility of a head crash, which can result in magnetically charged media particles floating around the drive flipping bits everywhere. This practically guarantees silent data corruption, and T10 DIF may not save you. If a disk drive surface fails in a standard array, the drive is dead, as it should be.

Bottom line:

1. Xiotech’s reliability models are full of “sunny day” assumptions.
2. Having predictable storage behavior when a drive fails is important.
3. Failing only a surface of a drive exposes the system to data loss.
4. Many things can cause one of these boxes to run out of spare drives before their model says, causing unplanned downtime, data migration, and possibly data loss.
5. Today’s failed drive replacement and RAID rebuild model, although cumbersome, delivers known reliability and performance metrics.
6. Upgrading the array with newer drives when they come out is not possible.
7. After 5 years you are forced to completely replace the system, because the warranty is not renewable, and the likelihood that the box will fail in the 6th year is close to 100%.

Seagate peddled this to me and everyone else in the industry. I can’t imagine Xiotech, a former Seagate-owned company, being the only one “smart enough” to sign up.

Someone on another blog said that bashing ISE is a sign that we’re all running scared. Well, that can’t be true in LeftHand’s case, because we are hardware agnostic and would be interested in OEMing this technology once it’s proven in the field.

John
