Me and Clouds 3

by Administrator on February 19, 2013

Up till now, I have been conflating two ideas and referring to them collectively as clouds.  The two ideas are outsourcing and service provisioning, what some might call a “public cloud.”

As the industry has insisted on drilling into our vocabulary, this is only one of three types of clouds.  Public clouds (network-based outsourcing services) are what I have been criticizing in the previous installments, but there are also private clouds (internal clouds; might “fog” be a better description?) and “hybrid” clouds, where public and private clouds get together (like JDATE, Christian Mingle or Bootycall.com, but for clouds).

Forget the hybrid stuff for now.  Too many issues: all of the fogginess of private clouds and all of the WAN SLA challenges of public clouds.  Let’s focus for a second on private clouds.

What is a private cloud?  Heck if I know.  From the literature of vendors, I have some descriptive sound bites:

  1. Dynamic allocation and deallocation of resources to workload, which translates into agility
  2. Secure hosting of applications and workload
  3. Predictable service levels
  4. Automated, less labor intensive
  5. Well managed infrastructure
  6. Less costly infrastructure and lower energy costs

Who could argue against such wonderful benefits?  Heck, I like the vendor who promises these things so much, he can marry my daughter.

Truth is, though, the vendor just described a mainframe.  Hey, maybe a mainframe is a cloud.

A few decades ago, mainframes were tricked out with capabilities for multitenancy.  Instantiating apps in Logical Partitions, insulated from one another for security and stability, managed via a coherent system-wide management model and leveraging an infrastructure of de facto standards-conforming equipment, mainframes provided (and continue to provide today) a scalable, well managed, predictable and secure platform for hosting lots of different workload.

Contrast that with x86 hypervisors and you quickly realize that you are comparing tanks to tinkertoys.  Give VMware 30 more years and they might catch up with PR/SM and LPARs, but IBM will have perfected critical components of zEnterprise by then, further extending its lead.

If private cloud means embracing this or that server virtualization technology, seriously flawed as each x86 hypervisor is, it must fail.  By contrast, new technologies for streamlining the delivery of application services to client devices are always worth considering.  But what does that mean?

At one level of analysis, we are talking about rationalizing infrastructure.  We want to drive cost out of commodity gear so that everything is white box.  A server, VMware argues, is a server:  doesn’t really matter if it is Dell or HP or XYZ on the outside of the box.  All of the parts come from one of a handful of mostly Chinese/Taiwanese factories, and most are assembled in Mexico.

As for storage, an array is an array:  a shelf of Seagate disk drives with connectivity, backplane, and controller, which these days is increasingly a 1U rack system running Linux, Windows Server, BSD, etc. (Ask EMC to see a VMAX or VNX boot and look for the Microsoft copyright screens.)  So, we are running a generic server ahead of disk so we can load it up with value-add software and charge a lot more for commodity components.

Anyway, these components, so long as they can be managed (configured, monitored, maintained) in common, are increasingly irrelevant from an architectural perspective.  We should be able to abstract the “value add” away from the hardware and establish a virtualization layer that lets us carve up and allocate/de-allocate the underlying resources as needed by the apps.

Server virtualization “hypervisor” vendors try to do this, but they are locked in an older pre-virtualization operational model: they still instantiate an app on a server then wait for a user to use it.  That is a waste of resources, including the all important one — electrical power.  I would be more interested in a model in which apps are launched directly from storage in response to user requests:  what Parallels was striving to achieve a couple of decades ago when they were competing for mindshare with another start-up at that time, VMware.

Before you start debating the merits of various hypervisors on servers, two things are worth noting.  First, storage virtualization is far more advanced and far more stable than server virtualization.  Maybe this is because of the fundamental simplicity of storage and its function set, or maybe because storage was originally conceived as a shared resource in a DASD pool.  Whatever the reason, virtualizing your storage offers far more potential for cost-containment, risk reduction and improved productivity (business value) than server virtualization done in the current manner using current generation tools.

Virtualizing storage a la DataCore does not disrupt infrastructure plumbing or create performance issues that require additional layers of technology to resolve.  Contrast that with server virtualization a la VMware:  highly disruptive and requiring an ecosystem of partner technology companies whose products exist only to fix problems created by VMware.

Virtualizing via DataCore abstracts value add storage services (mirroring, snapshots, CDP, etc.) away from underlying hardware.  That reduces the storage hardware to its basic raison d’etre and enables its management as a pool of resources from which virtual volumes can be created and presented to workload with corresponding services for data protection, data management, etc. appropriate to the workload.  Volumes can be created and dissolved at will, as apps require.  Migrating data between volumes and pools — and tiering — can be accomplished without much fanfare.  Service levels can be guaranteed, security (encryption) can be applied to volumes or pools as needed.  And all of this is centrally managed, well automated, and not labor intensive.  That is what I might begin to call “private cloud storage.”
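
To make the pool-and-volume idea concrete, here is a minimal Python sketch. It is not DataCore's API or anyone else's: the class names (StoragePool, VirtualVolume), the service labels and the capacity figures are all invented for illustration, and a real product would also have to deal with placement, tiering and failure handling.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualVolume:
    """A volume carved from the pool, tagged with the services it should get."""
    name: str
    size_gb: int
    services: frozenset  # hypothetical labels, e.g. {"mirroring", "snapshots"}


@dataclass
class StoragePool:
    """Hypothetical pool aggregating raw capacity from commodity arrays."""
    capacity_gb: int
    volumes: dict = field(default_factory=dict)

    @property
    def free_gb(self) -> int:
        # Free space is whatever the existing volumes have not claimed.
        return self.capacity_gb - sum(v.size_gb for v in self.volumes.values())

    def create_volume(self, name, size_gb, services=()):
        """Allocate a volume with its workload-appropriate services."""
        if name in self.volumes:
            raise ValueError(f"volume {name!r} already exists")
        if size_gb > self.free_gb:
            raise ValueError("not enough free capacity in the pool")
        volume = VirtualVolume(name, size_gb, frozenset(services))
        self.volumes[name] = volume
        return volume

    def dissolve_volume(self, name):
        """Return a volume's capacity to the pool."""
        del self.volumes[name]


if __name__ == "__main__":
    pool = StoragePool(capacity_gb=1000)
    pool.create_volume("oltp", 400, {"mirroring", "cdp"})
    pool.create_volume("mail", 200, {"snapshots"})
    print(pool.free_gb)   # 400
    pool.dissolve_volume("mail")
    print(pool.free_gb)   # 600
```

The point of the sketch is only that services travel with the volume rather than with the box underneath it.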

In truth, however, there is one fly in the ointment.  While the allocation and deallocation of virtualized and service-appropriate storage capacity to workload is enabled by storage virtualization, this level of management does not substitute for the resource level management that is still required.

We still need to configure and monitor the condition of the hardware and transports to deal with the many, many interruption potentials (aka disasters) that can occur at the bare metal level.  That is the domain of what we jokingly call SRM or storage resource management.  SRM is needed in addition to Storage Services Management (SSM?) to create a truly managed infrastructure that can be allocated and de-allocated to workload in a well-managed-hence-service-level-predictable way.

We could make short work of the SRM requirement if we just implemented a RESTful management stack on all hardware in the infrastructure:  standards based, cheaper than SNMP or SMI-S and ready to go already as proven by X-IO on its ISE Array (check out the code at coretexdeveloper.com).
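
As a rough sketch of why a RESTful stack keeps the SRM side simple: if every device answered a status request with a JSON document, the monitoring code would be little more than a parse and a filter. The payload below is hypothetical. The field names ("array", "components", "state") are my assumptions, not X-IO's schema or any vendor's, and the HTTP fetch itself is left out so the example stays self-contained.

```python
import json
from typing import List

# Hypothetical status document, as a RESTful management endpoint on an array
# might return it. The field names are assumptions, not a vendor schema.
SAMPLE_STATUS = """
{
  "array": "array-01",
  "components": [
    {"id": "drive-0", "state": "ok"},
    {"id": "drive-1", "state": "degraded"},
    {"id": "fan-0", "state": "ok"},
    {"id": "psu-1", "state": "failed"}
  ]
}
"""


def components_needing_attention(status_json: str) -> List[str]:
    """Return the ids of components whose state is anything other than 'ok'."""
    status = json.loads(status_json)
    return [c["id"] for c in status["components"] if c["state"] != "ok"]


if __name__ == "__main__":
    print(components_needing_attention(SAMPLE_STATUS))  # ['drive-1', 'psu-1']
```

The same function would work against any device that returned the same document shape, which is the whole appeal of managing the hardware in common.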

In contrast to storage virtualization, servers present a more challenging virtualization target, in part because server hypervisors are still too limited when it comes to carving up platform assets so they can be allocated efficiently, and also because the application workload hosted on the platform has a high degree of variability.

Yes, I have heard the same tale as you have:  apps are all the same — they read data, process instructions and write data.  True enough.  But there are significant differences in the resources required by a high performance transaction processing database and say email.

A key problem with server hypervisors, and with the consolidation methodology they seem to have been designed to follow, is that consolidating multiple apps into a high availability server complex requires that all potential application host systems be configured with the I/O channels, memory and other resources required by the most demanding hosted app.  I have yet to see a valid model that compares the cost savings from consolidating a mix of small and medium servers into a cluster of very big servers with failover or vMotion, versus leaving apps distributed on smaller servers.  I can tell you that in the shops I have visited, the powerful, overconfigured super server eats a lot more power, and using it to centralize a bunch of apps creates a huge burden on surrounding networks and storage fabrics: more than enough to offset whatever cost savings are thought to accrue to consolidation.
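
The sizing argument can be put into back-of-the-envelope arithmetic. The Python sketch below uses made-up app requirements and a made-up three-host cluster; it is not measured data and not the missing cost model. It models only the one rule stated above, that every failover host is configured for the most demanding app, and it ignores power, licensing, network load and the fact that a host running several apps at once needs room for all of them.

```python
# Illustrative per-app requirements: (memory in GB, I/O channels).
# These numbers are invented for the example, not taken from any real shop.
APPS = {
    "oltp_db": (256, 8),
    "email": (16, 2),
    "file": (16, 2),
    "web": (32, 2),
    "dns": (8, 1),
    "print": (8, 1),
}


def distributed_totals(apps):
    """Each app sits on its own right-sized server: sum the requirements."""
    memory = sum(mem for mem, _ in apps.values())
    channels = sum(ch for _, ch in apps.values())
    return memory, channels


def consolidated_totals(apps, host_count):
    """Every cluster host is sized for the most demanding app it might host."""
    peak_memory = max(mem for mem, _ in apps.values())
    peak_channels = max(ch for _, ch in apps.values())
    return peak_memory * host_count, peak_channels * host_count


if __name__ == "__main__":
    print(distributed_totals(APPS))        # (336, 16)
    print(consolidated_totals(APPS, 3))    # (768, 24)
```

With these invented figures the three-host cluster provisions more than twice the memory of the distributed layout. Whether that holds in a given shop depends entirely on the real numbers, which is why a valid comparative model would be worth having.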

A friend of mine recently introduced me to another limitation of most current server hypervisors:  they can’t divide and allocate processor cores to workload.  He knows this because he just wrote code to do that and is awaiting patent protection.  Maybe once CPU cores can be allocated and deallocated at will to different workloads, server hypervisors will move closer to mainframe workload virtualization capabilities.  On the mainframe, we have consumable processors that can become resources for use by different workloads.

So, bottom line, the concept of a “private cloud” makes sense as a goal of architecture, mainly as a synonym for application-facing infrastructure management.  And who knows, maybe as the IBM SmartCloud guys suggest, this application-facing infrastructure management also has a heavy dose of do it yourself:  providing the tools for end users to provision their own resources with the appropriate attributes and services.  At a minimum, the cloud management layer would need to include tools for presenting SLA conformance data in a manner intelligible to those who hold the purse strings.

Beyond this interpretation, private cloud woo strikes me as meaningless and much too laden with hype.

Later…


rhodesr February 20, 2013 at 1:23 pm

I know that EMC VNX uses Windows for its core OS. I once spent a long debugging session with EMC support via a WebEx session in the Windows OS on a CX4 array. But I have never before heard that VMAX has Windows in its core.

signal_lost March 4, 2013 at 5:35 am

Rhodesr,
Windows is not actually the SAN/NAS OS; it’s just used for management functions. FLARE/DART are Linux/Unix OSes used for SAN/NAS functions. VMAX is not running Windows at its core (I think the management or datamovers might use Windows, but the actual core OS is still custom). John is correct that these systems are open systems with relatively little custom hardware.

“Maybe once CPU cores can be allocated and deallocated at will to different workloads”
I’ve heard this same thing from an AIX admin on the plane to Scottsdale the other day and had to try not to laugh.

While VMware does support CPU affinity for 1:1 allocations of physical resources (I think a terrible idea, but an argument for a different day), it also supports hot adding of CPU (if the OS allows, which Windows Datacenter does) and, more importantly, it also supports MHz-based reservations in resource pools. I can dynamically set a hard or soft reservation for any crazy fraction of a CPU, separately from how many vCPUs I’ve presented to it. I’d argue pooling the MHz and then pooling this is a better play than reserving individual cores, but then again there may be some use case here.

Lastly, for people who REALLY want LPAR-like management on open systems, Hu and the gang over at Hitachi Data Systems will sell you x86 blades that support LPARs (which you can then layer a hypervisor on, so you can virtualize while you virtualize).
