Powered by the Techdirt Insight Community.

Storage Virtualization - Where Should It Go

 

Submitted by Lukas Kubin on May 13th, 2008

Writing about storage virtualization, I should start by defining which kind of virtualization I mean. Strictly speaking, even a simple RAID volume can be called “virtual”, as it is a logical representation of some more complex logic behind it. Don’t worry, I’m not going to write about RAID. Instead, my mind is full of mirrors, snapshots, clusters, recovery sites and a single question: At which layer of the SAN infrastructure should these features live?

Today, we can find storage virtualization implemented mostly in two places:

  1. Built into array controllers, or
  2. running as software installed on a hardware appliance or Fibre Channel switch placed between the arrays and the SAN clients.

The first usually doesn’t go far beyond a semi-working mirroring feature you receive a huge bill for. Also, a common problem with these vendors is that their scope ends at the block level; they don’t really care about host applications. Sure, the primary function of a storage controller is different, and mixing it with the complete stack of virtualization features in one box might create more trouble both in design and during operation.

Many people presume the second method of virtualization to be an in-path obstacle wearing another vendor’s label: a box they have to learn how to manage, pay maintenance fees for, and so on. Don’t bother explaining to them that it’s full of features, that it’s not necessarily a single point of failure, or that it adds only minor latency.

As a result, there is a large set of SAN installations lacking modern virtualization features. Is that bad? It is, I think. Safety features like mirroring with transparent failover or consistent snapshot replication should be an obligatory part of each SAN installed in 2008. At the very least, they should be available as part of storage solutions from SMB up through all the marketing-labeled tiers.

What’s a way to avoid these drawbacks and bring storage virtualization to more SAN users? As always, I believe it’s through simplicity and standardization. Let’s divide each virtualization feature into two parts: one that inevitably needs to be implemented at the controller level, and a second that would reside on the host. There is no need for any third, in-band level in this design.

Suppose you set up synchronous mirroring in such a design. The host would then send the blocks it writes to all arrays configured to be part of the mirror. The benefit is that there is no retransmission from the primary controller to the mirror controller, and no central point in the form of an in-band appliance. In case of an array failure, the host itself selects another array. From this perspective, it could be just an improved MPIO driver. I’m an optimist, so I believe there is a way to write such drivers to be vendor-independent. Thus you could mirror your HP to IBM, LeftHand to EqualLogic ;-)
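To make the idea a bit more concrete, here is a minimal Python sketch of the host-side logic described above: each write fans out to every healthy array, and reads fail over to the next array when one stops responding. The classes InMemoryArray and HostMirror are hypothetical names invented for this illustration; they model the behaviour in memory only and are not any vendor’s MPIO or driver API.

```python
class InMemoryArray:
    """Hypothetical stand-in for one storage array (not a real driver API)."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}     # logical block address -> data
        self.online = True

    def write(self, lba, data):
        if not self.online:
            raise IOError(f"{self.name} is offline")
        self.blocks[lba] = data

    def read(self, lba):
        if not self.online:
            raise IOError(f"{self.name} is offline")
        return self.blocks[lba]


class HostMirror:
    """Host-side synchronous mirror: each write goes to all healthy arrays."""

    def __init__(self, arrays):
        self.arrays = list(arrays)
        self.healthy = {a.name for a in self.arrays}

    def write(self, lba, data):
        acked = 0
        for array in self.arrays:
            if array.name not in self.healthy:
                continue
            try:
                array.write(lba, data)   # sent directly by the host, no relay
                acked += 1
            except IOError:
                # Stop using the failed array; the write continues elsewhere.
                self.healthy.discard(array.name)
        if acked == 0:
            raise IOError("no healthy array accepted the write")
        return acked

    def read(self, lba):
        for array in self.arrays:
            if array.name not in self.healthy:
                continue
            try:
                return array.read(lba)
            except IOError:
                self.healthy.discard(array.name)
        raise IOError("no healthy array could serve the read")


array_a = InMemoryArray("array-a")
array_b = InMemoryArray("array-b")
mirror = HostMirror([array_a, array_b])
mirror.write(0, b"payload")      # lands on both arrays
array_a.online = False           # simulate an array failure
data = mirror.read(0)            # transparently served by array-b
```

The sketch deliberately leaves out the hard part: once a failed array comes back, it has missed writes and must be resynchronized before it can rejoin the mirror. That bookkeeping is exactly where a thin, standardized controller-side layer could help.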

There is already a similar implementation of such out-of-band virtualization in the Fibre Channel world: LSI StoreAge. However, most of its features work on Windows only, and it still requires a hardware appliance to be set up in the SAN. There are no similar implementations in the iSCSI world that I know of.

Having the host take part in storage virtualization brings another advantage: It’s close to the applications. It’s application data we need to protect, not low-level blocks. Application support is essential for creating consistent snapshots and replicating data to remote sites. We could manage SAN data much more safely and in a simpler way if the SAN border moved closer to applications. Of course, some work on standardization has yet to be done here.
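As a rough illustration of why application support matters for snapshots, the following Python sketch shows the usual quiesce, snapshot, resume sequence: the application flushes its pending writes and pauses, the volume is copied, and the application resumes. DemoApp, quiesced and consistent_snapshot are hypothetical names made up for this example, and the “volume” is just a Python list; no real snapshot API is implied.

```python
from contextlib import contextmanager


class DemoApp:
    """Hypothetical application that buffers writes before they reach storage."""

    def __init__(self):
        self.buffer = []      # records not yet flushed to the volume
        self.frozen = False

    def write(self, record):
        if self.frozen:
            raise RuntimeError("application is quiesced")
        self.buffer.append(record)

    def flush(self, volume):
        volume.extend(self.buffer)
        self.buffer.clear()


@contextmanager
def quiesced(app, volume):
    """Flush pending writes and pause the application for the duration."""
    app.flush(volume)
    app.frozen = True
    try:
        yield
    finally:
        app.frozen = False    # always resume, even if the snapshot fails


def consistent_snapshot(app, volume):
    """Copy the volume only while the application is quiesced."""
    with quiesced(app, volume):
        return list(volume)   # stand-in for an array-side snapshot


app = DemoApp()
volume = []
app.write("order-1")
app.write("order-2")
snapshot = consistent_snapshot(app, volume)   # contains both records
app.write("order-3")                          # application keeps running
```

Without the flush step, a block-level snapshot taken by the array alone would miss whatever still sat in the application’s buffer, which is the consistency gap a host-side agent is meant to close.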

To summarize, I see current storage virtualization as too in-band-ish. Although there are some rare efforts to put selected tasks on SAN hosts (e.g. FalconStor’s DiskSafe agent), they stand alone, without further plans to replace the central appliance. If I were an array vendor, I would consider partnering with FalconStor to strengthen the market for interoperable, application-centric SANs, bringing more ways to use “my” arrays.

 

5 Responses to “Storage Virtualization - Where Should It Go”

  1. Joerg Hallbauer Says:

    Lukas,

    You have this exactly backwards, at least for what I believe people are trying to achieve with block storage virtualization. See my blog on this topic (http://joergsstorageblog.blogspot.com) for all the details.

    In the general case I think that BSV should be used as a layer of abstraction between the servers and their storage so that you can “plug and play” any vendor’s storage into your environment transparently and on-line, as well as move data around without disruption. If you can do that, then you can drive down the CAPEX costs of your storage while also reducing your OPEX costs by maintaining a single interface for management of your storage and reducing the costs of “migrations” between arrays by eliminating the outages associated with those “migrations”.

    –joerg

  2. Lukas Kubin Says:

    Joerg, I understand your feelings; I thought this too about two years ago, when I started working with virtualization appliances. Since then, however, I have been trying to find out why many people are not willing to adopt SV in its current form. That led me to write what I did.
    If you look into the history of many IT systems, you’ll find that many of them went through a process of various proprietary forms until they reached some degree of standardization. The vendors agreed on some common concepts as they found it was better for the market as a whole. At that point the technology is usually ready for mass deployment.
    I believe the same can happen with SV. As I wrote, I don’t think of controllers doing many intelligent tasks. However, they could run some thin, “virtualization ready” layer for those features which cannot happen anywhere else. On the other hand, a storage appliance is a part which could be removed to make the whole system simpler.
    As to the management argument: Even though I agree that central management of SV makes some tasks easier, it still doesn’t replace all the tasks you have to perform on the disk systems below it, i.e. RAID management, firmware upgrades etc.

  3. Joerg Hallbauer Says:

    *blink* standards? To quote a friend of mine “what I like about standards is that there are so many to pick from!”

    Can you name a single major initiative in IT in the last 30 years where “vendors agreed on some common concepts”? Heck, let’s go way back: UNIX was supposed to be a unified standard. Can you name a single vendor that’s running a “standard” UNIX? Solaris, HPUX, AIX, etc. They are all different.

    OK, so maybe it’s better in the storage industry? hmmm … how long have we been talking about a single management standard for storage? As far back as I can remember, and there’s still nothing that really works well.

    As for management, I think you miss the point. You would treat a storage array as a “black box” in that environment. I would tell the vendors I want 100 TB of storage, carved up into, say, 20GB LUNs. Then I would lease that box for three years, and never touch it. There’s no reason to unless there is something very seriously wrong with it, in which case I would have the vendor in there helping me fix it. So, no RAID management, no firmware upgrades, no etc. Anyway, that’s the vision, and in that environment, life gets a lot easier for the storage team.

    One last comment, we did the “storage close to the application” bit before, it was called DAS, and we moved away from it for a lot of good reasons I won’t bother to repeat here.

    –joerg

  4. Lukas Kubin Says:

    To name a few standards - you mentioned UNIX, so POSIX comes to mind. The majority of Linux vendors accepted FHS. Ethernet switch vendors integrated 802.3ad with LACP for link aggregation. However, standardization was not the core point of my insight, so why get stuck on it in this discussion?

    Regarding your last comment - perhaps my English doesn’t allow readers to fully understand my thoughts in the insight, sorry for that. By “storage close to application” I don’t really mean something as physically close as DAS was. Let’s take the term less strictly and translate it as storage better integrated with applications, helping them in keeping consistency.

    That’s also one of the reasons for thinking of standardization - there are lots of applications communicating with storage. To help all of them keep their data consistent, a generic layer (multi-platform, in the best case) is a better solution than writing application-specific drivers.

  5. yfeefy Says:

    Seems like you are just describing

    3. host-based virtualization, aka LVM, host-based mirroring

    which is probably one of the oldest forms of storage virtualization…
