
Data Domain Brags About File Locks

by Administrator on June 27, 2008

Okay.  So we all know what a litigation hold is, right? 

The Fed or some other plaintiff is suing your company and says all of such and such records need to be secured or locked immediately because they may be germane to the case at bar.  That means you can still access the data, but you are no longer permitted to alter its contents.  For compliance-related holds, such as those required under Sarbanes-Oxley or SEC rules, you need to retain full and unaltered copies for a lengthier period of time, say 10 years.

Data Domain has just released some technology for putting a hold lock on data stored in its, well, domain.  Here’s the story in eWeek.

Using Retention Lock, IT administrators can store deduplicated files in an unalterable state for a specified length of time. This introduction of WORM (write once, read many) for active-archive, high-performance deduplication storage enables enterprises to implement a range of corporate IT governance policies that require data be retained and unchanged for fixed periods of time before removing it.

During the specified Retention Lock period, users cannot change or delete the files, but trusted operators can manage files and space as required. This level of compliance protection is designed to support those regulations that focus on protection from inadvertent or malicious data modification by storage users.

While some industry-specific regulations require immutability guarantees that assume even IT administrators cannot be trusted, many others allow that files may need to be deleted under court order—for example, to protect identity information or to remove other inappropriate confidential information. In these governance regulations, data still needs to be held with retention enforcement, but administrators need file-level policy flexibility.  
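
To make the distinction in that last quoted paragraph concrete, here is a minimal sketch of the two kinds of lock.  It is not Data Domain's interface; the names `LockedFile`, `may_delete`, the role strings, and the `strict` flag are all invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class LockedFile:
    name: str
    locked_until: datetime  # end of the retention period (hypothetical field)


def may_delete(f: LockedFile, role: str, now: datetime, strict: bool = False) -> bool:
    """Return True if `role` may delete `f` at time `now`.

    strict=False models the governance-style lock described in the quote:
    ordinary users are blocked, a trusted operator is not.
    strict=True models regimes that do not trust administrators either.
    """
    if now >= f.locked_until:
        return True   # retention period has expired
    if strict:
        return False  # nobody deletes during retention
    return role == "operator"


now = datetime(2008, 6, 27)
ledger = LockedFile("ledger.csv", now + timedelta(days=3650))
print(may_delete(ledger, "user", now))                   # ordinary user
print(may_delete(ledger, "operator", now))               # trusted operator
print(may_delete(ledger, "operator", now, strict=True))  # strict regime
```

The only difference between the two modes is whether the operator role is exempt during the retention period, which is exactly the point on which regulations are said to differ.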

Cool.  But while we all have our Perry Mason hats on, let’s ask ourselves a question.  Will de-duplicated data, which is what Data Domain is hosting, pass muster with the Fed as “full, complete, and unaltered data”?

A lot of my financial clients don’t think so.  They are excluding from de-duplication platforms data that may fall under SEC rules or other regulatory requirements. 

Moreover, with the changes in the Rules of Evidence, almost any data can be construed as potentially germane to future litigation.  That seems to extend the worry to ALL data, not just specific docs.

Why wasn’t this issue even raised in the eWeek article?  Where are the legal precedents that support the case that de-duped data fits the legal definition of “full and unaltered”?  Both NetApp in our questionnaire, and now Chris Santilli at COPAN (see previous entry), have acknowledged that there is a nagging legal concern here. 

So, I wonder who will become the test case for de-dupe going forward?  And who will fall on their sword if the company gets hit with a suit, followed by a rejection of data that has been subjected to de-duplication? 

Will it be you, Mr. Storage Manager?  We are told in the article that you are now being included in the list of untrustworthies.  This issue is the big elephant in the parlor that no one is talking about.

Additional thoughts:

Readers (those of you who are data managers), I am not trying to fan FUD with this post.  A lot of your vendors are claiming that electronic data is already altered by its placement on disk owing to optimization schemes at the file system and disk mapping level.  In responses received to our de-dupe survey a couple of months ago, it was pretty clear that the bulk of vendors selling you de-dupe technology are sincere in their belief that de-dupe causes no harm and does not increase risk of regulatory non-compliance or other legal issues.

That’s what they say, and they might be right — from a technical perspective.  However, the legal system has an interesting way of taking years to adopt the generally accepted views of the vendor community.  I don’t see anyone in this space offering an additional codicil on their warranty agreement that says they will stand between you and the Justice Department or SEC administrators if a challenge is mounted to the compliancy of electronic records that have been compressed or de-duped.  You are on your own.

The record is clear, from 1977’s Bank of New York software crash forward, that the legal ball rolls downhill and the IT guy is at the bottom of the incline.  You deploy technology like de-dupe because you think it will let you store more bits more economically.  You think doing so will save the planet (or at least reduce some power demand).  You think it will create a second tier repository that will make the restoration of data more expeditious or the transfer of data across networks more cost-effective and less bandwidth intensive.  These are all admirable and lofty goals that seem to be well addressed by de-dupe.

However, in a court of law, they are meaningless.  A good lawyer will rip you to shreds for placing capacity optimization ahead of governance and compliance rules, which protect shareholders, patients, customers, and the common weal generally.  You need to consider the risk to the company (and to your own professional career) and get it in writing from somebody in the front office that they were made aware of the potential pitfalls of the technology, but approved its use.  You might also want to get in writing (and in the presence of witnesses and a notary) a clear statement from your vendor about the known issues with the technology and the fact that it has NOT been approved by any legal entity for compliance use.

As further protection:  Under no circumstances should you scan these written statements and then submit them for long-term storage using de-dupe.  Keep them on paper in a lock box that is fireproof, waterproof, and permanent.

Articles like the eWeek story have a comforting ring, but not when you consider that the data being locked may already be considered corrupted, from a legal standpoint, because it has been de-duped.

Don’t let a blind acceptance of what “everyone who’s anyone” is saying about de-dupe at storage conferences and in the press turn you into the ultimate dupe.  C-Y-A is the rule of the day. 

{ 9 comments }

IndyTgrFan June 30, 2008 at 11:14 am

Words like “compliance” and “regulatory” and “legal entity” seem to get thrown around in articles like this. You say that this isn’t meant to fan FUD, yet how could it be construed any differently?

What are the compliance requirements that people should be worried about? And which “regulatory” agency is responsible for testing the technologies and certifying them? It seems to me that a “requirement” that doesn’t have some “regulated” methodology established for “certifying” acceptable technologies is at best a hollow document that allows vendors to twist and manipulate their interpretation for their purposes and at worst a scare tactic re: LEGAL RAMIFICATIONS!!!

I would guess that without the above in place a good lawyer could just as easily rip to shreds the “requirements document”.

Haven’t these “compliance” issues been what EMC has used for years to convince companies to lock themselves and their data into the Centera platform and its WORM? Doesn’t this technology offer “single instancing”? Why hasn’t the issue been raised before now?


Administrator June 30, 2008 at 2:59 pm

What do you propose I do, Indy? Leave the questions unanswered? I would consider that to be a huge disservice to readers even if it would get me projects with de-dupe vendors.

EMC calls CENTERA “compliance certified” — which it obviously is not. I have ranted about this point here, even challenging their paid analyst defense of the product, and have incurred the wrath of Hopkinton on exactly the same grounds as you are now criticizing my attack on de-dupe.

We should at least try to have legal provide an opinion before we deploy stuff that has specific legal ramifications. As long as de-dupe is just being used with backups and an original version of data is hanging around, I couldn’t care less what you put in the de-dupe repository. However, when the vendor represents the solution as a compliance solution, I have to ask about its compliance features, don’t I?

IndyTgrFan June 30, 2008 at 9:16 pm

You miss the point.

What I’m trying to figure out is WHO gets to say what is or isn’t “compliant”? And WHY?

I’m not all-knowing, but I’ve never been made aware of some overarching entity that specifically tests and certifies technology for “compliance”. Without this, as I asked above, isn’t everything left up to broad interpretation (or you could call it ambiguity)?

Does such an entity exist?

MJ2784 July 1, 2008 at 8:03 am

In the end, I think that it comes down to case law. (Note: I am not a lawyer.) The risk being run is that a lawsuit is brought and the plaintiff’s attorney claims that the storage subsystem used by the defendant changed the data, and so questions the validity of data provided in the case. Now is this really a valid concern? Who knows, probably not, but the result will then be complex legal wranglings where one of the main issues on trial becomes whether the data presented by the storage subsystem is 100% identical to what was originally written.

The plaintiff will throw out all the points they can to try to discredit the storage device and win the case. In the case of the DD, they will use the argument that the technology is based on hashing which has a mathematical probability of hash collision and data corruption. Yes, the likelihood may be small, but it is still there and they will use it.

The more complex the technology gets, the more difficult it can be to prove its reliability, which is one of the reasons why there may be more concerns about subfile dedupe as compared to Single Instance Storage or CAS devices.

While many of the plaintiff’s points may be misleading or just plain wrong, it will cost substantial time and money to defend the claims. This is why Jon is suggesting that a legal opinion be considered. If IT saves $ by implementing the solution, but opens themselves to these legal challenges, does the purchase make sense? There is no clearcut answer and it depends on the individual company and industry.

To close the loop on the case, if the case is won by the defendant then the court must have acknowledged that the storage system was accurate and the device then has court precedent for its accuracy. This means then that future lawsuits will have a difficult time arguing the same point because the court will now have case law supporting the storage solution.

In the end, end users should understand the implication of this technology before purchasing and specifically the legal ramifications. The best place to get feedback on the latter point is someone with legal expertise.
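
The hash-collision argument raised in the comment above can be given a rough size with the standard birthday-bound approximation, p ≈ 1 - exp(-n(n-1) / 2^(b+1)), for n unique chunks and a b-bit fingerprint. The sketch below is only arithmetic under stated assumptions: the fingerprint is treated as an ideal, uniformly random hash, and the figures (2**40 chunks, 160 bits) are hypothetical, not a description of any vendor's product.

```python
import math


def collision_probability(num_chunks: int, hash_bits: int) -> float:
    """Birthday-bound approximation: p ~= 1 - exp(-n(n-1) / 2^(b+1))."""
    exponent = num_chunks * (num_chunks - 1) / (2 * 2**hash_bits)
    # expm1 keeps precision when the exponent is extremely small
    return -math.expm1(-exponent)


# Hypothetical figures: 2**40 unique chunks, 160-bit fingerprints.
p = collision_probability(2**40, 160)
print(f"{p:.3e}")
```

With those assumed inputs the result is on the order of 10^-25. That supports the "likelihood may be small" half of the argument; it says nothing about how a court would weigh it.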

Administrator July 1, 2008 at 10:40 am

Unfortunately, both Congress and the regulatory bodies have a tendency to push out rules and regulations without a) any sort of detailed implementation guide or b) any sort of certification authority. You don’t know if a technology will bite you in the tuccus until a legal case establishes a precedent.

When EMC said that Centera was compliance certified, I asked the same question you are asking. Who certified it? I was told that they simply sent a letter to the SEC describing the product and did not hear back for 30 days (probably because the letter went to a dead letter office and there was no one there who certifies anything). Their contention was that silence equals consent, which is not a valid legal tenet from what I have been told. So, bottom line, it is bullshit.

Now, we have de-duplication providers arguing, in this case, that they have locks to hold data for specified intervals, wrapping themselves in compliance, when it remains to be seen whether de-duplicated files will pass muster with the Man in any case.

We can’t just throw a blanket over the issue and hope it will go away. It is important to repeat that many financials are going to great pains to segregate from de-dupe any data that falls under the aegis of regulations. That seems, to some degree, to defeat the purpose of de-dupe altogether.

That ain’t FUD, it’s just what folks are doing.

Jered July 2, 2008 at 2:40 pm


The point you ought to be focusing on is “trusted operators can manage files and space as required”. This is 100% in violation of rules like SEC 17a-4. A trusted admin can’t be allowed to delete the files, period — retained is retained.

As for whether or not deduplication affects data integrity for things like discovery, claiming it does is FUD plain and simple. As I responded in your dedupe questionnaire:

Dedupe does not change data any more than compression changes data, or traditional file systems change data. Plain old LZW compression gives you a different output bitstream than what went in, with redundant parts removed. Conventional file systems break up files into blocks and scatter those blocks across one or more disks, requiring complicated algorithms to retrieve and return the data. Dedupe is no different. Nonrepudiation requirements are satisfied by the reliability and immutability of the system as a whole, deduplicating or not.

Jered Floyd
CTO, Permabit Technology Corp.
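
The technical claim in the comment above, that dedupe hands back the same bytes that went in, can be illustrated with a toy example. This is a minimal sketch and not any vendor's implementation: it assumes fixed-size chunks and SHA-256 fingerprints, the names `dedupe_store` and `dedupe_restore` are invented here, and zlib's DEFLATE stands in for the LZW compression mentioned in the comment.

```python
import hashlib
import zlib

CHUNK_SIZE = 4  # tiny fixed-size chunks, purely for illustration


def dedupe_store(data: bytes, chunks: dict) -> list:
    """Split data into chunks, store each unique chunk once, return the recipe."""
    recipe = []
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        chunks.setdefault(digest, chunk)  # a repeated chunk is not stored again
        recipe.append(digest)
    return recipe


def dedupe_restore(recipe: list, chunks: dict) -> bytes:
    """Rebuild the original byte stream by following the recipe."""
    return b"".join(chunks[d] for d in recipe)


original = b"ABCDABCDABCDWXYZ"
chunks = {}
recipe = dedupe_store(original, chunks)
restored = dedupe_restore(recipe, chunks)

print(len(recipe), len(chunks))  # chunks referenced vs. chunks actually stored
print(restored == original)      # byte-for-byte comparison after dedupe
print(zlib.decompress(zlib.compress(original)) == original)  # lossless compression
```

The stored form differs from the input (fewer chunks are kept than are referenced), yet the restored stream compares equal to the original. Whether that technical equivalence satisfies a legal definition of "unaltered" is the separate question this thread is arguing about.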

Administrator July 2, 2008 at 6:53 pm

Hi Jered,

I am not disagreeing with any of your points from a technical standpoint. I don’t think that storage admins should decide what is important from a business perspective, deleting as they see fit electronic files of the company. But, you have made my point as well. Letting the IT guy field technology like de-dupe and expose all files to it, without a full understanding of the legal ramifications of doing so, is equally foolhardy.

Deduplication has NOT yet been subjected to the acid test of litigation. Nor has lossy compression to my knowledge.

What you claim about dedupe may be 100% true technically. That doesn’t have anything to do with the law, however.

How many people turn to the courts for justice? Their interpretation of fair and just is often quite different from the interpretation applied by the court based on case law and precedents.

Does Permabit provide any assurances to its customers that legal hassles will not result from deduplicating files? Are you offering your services as an expert witness or proffering your corporate attorneys to catch the legal bullet aimed at your customer if the need arises?  Will you pay whatever penalties or fines that may accrue?  Does your warranty or maintenance agreement so state?

Call it FUD, but I wonder if this is simply just a discussion that you would prefer not to be had at this juncture. I welcome a legal opinion on this matter and think that someone in the dedupe business ought to get just such an opinion in writing from a judge or the American Bar Association or somebody authoritative. Until that happens, we are all engaging in speculation that probably goes well beyond our areas of expertise.

If the question exists, and to many of my financial clients it most certainly does, why not get it out in the open and address it?

Jered July 3, 2008 at 9:50 am


[quote] Deduplication has NOT yet been subjected to the acid test of litigation. Nor has lossy compression to my knowledge. [/quote]

Permabit’s systems are used in the field for the archive of critical data, including data subject to the SEC 17a-4 rule. While I don’t know of any specific cases, I’m quite certain data stored on our storage, incorporating deduplication technology, has been subject to litigation discovery. The integrity of the deduplicated data has never been called into question.

Also, deduplication is more along the lines of lossless, not lossy, compression. But, even for lossy compression, audio recorded onto portable devices that record in the MP3 format has definitely been used without problem in court.

[quote] Does Permabit provide any assurances to its customers that legal hassles will not result from deduplicating files? [/quote]

For our customers with SEC 17a-4 requirements, we assist in the process of filing a letter describing the archiving system and technologies; in doing so the customer can receive a “no action” letter indicating that the SEC does not believe they are in violation of the rules. (This is not quite the same as saying they are not in violation of the rules, a matter on which the SEC does not opine.)

In the event of litigation, I would be happy to be an expert witness to explain the integrity of the recording process. We also have an independent report from Compliant Systems Consulting LLC, finding that Permabit Enterprise Archive satisfies regulatory compliance requirements.

[quote] Call it FUD, but I wonder if this is simply just a discussion that you would prefer not to be had at this juncture. I welcome a legal opinion on this matter [/quote]

You could get an attorney’s opinion, if you really value that. You could probably get contradictory ones for that matter. What does matter is if deduplication is even considered an area to try to attack in a legal discovery case, and evidence is that it is not.

There are far more dangerous areas we should be concerned about if we’re looking for legal opinions. For example, should it be acceptable for companies to use RAID 5 to protect important data, given how vulnerable that is to data loss with modern, high-capacity drives?

[quote] If the question exists, and to many of my financial clients it most certainly does, why not get it out in the open and address it? [/quote]

Consider it addressed. 🙂 Have a great Fourth of July!

Jered Floyd
CTO, Permabit Technology Corp.

Administrator July 3, 2008 at 10:39 am

Thanks for the response, Jered. I think you may have gotten Permabit a discriminator for its product:

“With every sale, Permabit promises to provide an expert witness should any litigation arise around the use of our technology.”  Add it to your sales contract and it’s golden.

Your whitepaper on compliance would make an interesting read.  How does it measure up to EMC’s paid-for whitepaper saying that Centera is compliant?  As previously blogged here, there is a page and a half disclaimer by the company that wrote that one saying that the paper does not replace competent legal counsel and that your experience with the courts may be quite different than what is represented in the paper.

By the way, I agree that RAID 5 alone is not sufficient to protect data in the manner required by many regs.

Happy 4th to you and yours.
