
Software-based vs. hardware-based replication solutions

This is in response to a statement in your reply: "I would perform array-to-array mirroring via software, since I am not aware of any hardware-based solution that would permit mirroring between two arrays. Hardware split-mirrors are not the answer since again, all disks must be in the same array."

There are many hardware-based (microcode-level) mirror (or copy) products from IBM, EMC and HDS. These products use disks that are mirrored between different physical arrays. None of the integrated cached disk arrays uses a single backplane; all have at least two, and most now have four. Granted, any single array is in and of itself a single point of failure that could be the downfall. However, most IT shops have clusters at the host level, multiple arrays purchased for different projects that double as array clusters, redundant networks, etc., for high availability.

My question now is: with a little education on storage arrays and their capabilities, would you still recommend a host-based software mirror that uses host-level cycles, or a storage array (hardware microcode) level mirror that frees the host to do the work the host was intended to do?

Let's step back for a second and consider a few things here. My attitude about all things related to availability is one of practicality and pragmatism. It's nice to treat the pursuit of availability as if you have an unlimited budget, but nobody does.

Dual disk arrays are a lovely, though very expensive, idea. But the additional downtime protection that they provide is usually not worth the added expense. (There are very few absolutes in the availability game: you may not be willing to spend for something, but the next guy might be.)

Part of the expense in disk arrays (single or dual) is in the CPU that gets tucked inside. In general, that CPU is significantly more expensive than the CPU that comes inside a modern computer system. If I can reduce the expense of purchasing a CPU that might go inside the disk array and replace it with a cheaper one inside my system, I think it's worth using host-level cycles to do so. But even more than that, it's not whose CPU you use but how many cycles are required and how much time the CPU spends waiting for I/Os to complete.

If you insist on assigning resources only to perform the work that they were "intended to" do, then don't listen to music on your computer, as you have a perfectly good CD player that you can use the way it was intended to be used. And don't dial into your computer network, as telephones were designed for voice only. And as you said, a single-box disk array still has single points of failure, if only the box itself.

Bottom line? Compare price performance. Look at high-end disk arrays for internal mirroring using their own CPUs and compare that to using the same arrays with software-based mirroring and see which method works out to be faster and which method works out to be cheaper. Then, compare both of those to using regular JBODs with software-based mirroring.

If you consider JBODs, make absolutely sure that you can get all of the same functionality through the software that you can get from the high-end disk array. Then, look at price and performance.

When you run into bottlenecks on performance, compare the cost of buying additional capacity under each method and see what that does to your price performance.
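To make that comparison concrete, here is a minimal sketch in Python of the arithmetic involved. The `Option` class, its field names, and every number in the example are invented placeholders for illustration; they are not measurements or vendor prices. Substitute the costs you are quoted and the throughput you measure yourself.

```python
from dataclasses import dataclass


@dataclass
class Option:
    """One candidate configuration (hypothetical structure for illustration)."""
    name: str
    cost: float          # total purchase cost, in any one currency
    throughput: float    # throughput measured in your own benchmark, e.g. MB/s
    upgrade_cost: float  # cost of the next capacity increment
    upgrade_gain: float  # extra throughput that increment is expected to add

    def cost_per_unit(self) -> float:
        # Price/performance: lower is better.
        return self.cost / self.throughput

    def cost_per_unit_after_upgrade(self) -> float:
        # Price/performance once you have bought your way past a bottleneck.
        return (self.cost + self.upgrade_cost) / (self.throughput + self.upgrade_gain)


def rank(options):
    """Return the options sorted from best to worst price/performance."""
    return sorted(options, key=lambda o: o.cost_per_unit())


if __name__ == "__main__":
    # All figures below are invented placeholders, not real prices or results.
    candidates = [
        Option("array, internal mirroring", 500_000, 400, 150_000, 100),
        Option("array, software mirroring", 450_000, 450, 150_000, 100),
        Option("JBOD, software mirroring", 200_000, 250, 40_000, 50),
    ]
    for o in rank(candidates):
        print(f"{o.name}: {o.cost_per_unit():.0f} now, "
              f"{o.cost_per_unit_after_upgrade():.0f} after upgrade")
```

The point of the second method is that the ranking can change once you add capacity, so it is worth computing both figures before deciding.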

Finally, if you do your own performance benchmarks, and you should ALWAYS do them when contemplating large purchases, make absolutely sure that the benchmarks are performed with your own data, not using standard benchmark tools. Use your own data in terms of quantity of data, size of data packets, burstiness, rate, number of parallel users, and so on. The more your test data varies from reality the less valid your testing is. And, it takes very little overall variance to make the benchmark totally invalid.
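One way to benchmark with your own data is to record the I/O requests your application actually issues and replay them against each candidate configuration. The sketch below, in Python, assumes you have already captured such a trace as a list of `(offset, size, is_write)` records; the `replay_trace` function and the trace format are hypothetical, not part of any product. It is single-threaded, so it does not reproduce parallel users or burstiness, and it overwrites data, so point it only at a scratch file on the storage under test.

```python
import os
import time


def replay_trace(path, trace):
    """Replay (offset, size, is_write) records against the file at `path`.

    Returns (elapsed_seconds, bytes_moved). Writes overwrite existing
    data with zeros, so use a scratch file, never live data.
    """
    moved = 0
    start = time.perf_counter()
    with open(path, "r+b") as f:
        for offset, size, is_write in trace:
            f.seek(offset)
            if is_write:
                moved += f.write(b"\0" * size)
            else:
                moved += len(f.read(size))
        # Push buffered writes to the device so they count toward the timing.
        f.flush()
        os.fsync(f.fileno())
    elapsed = time.perf_counter() - start
    return elapsed, moved
```

Run the same trace against each configuration and compare the elapsed times. Keep in mind that the operating system's cache can flatter read results, so use a trace large enough to reflect your real working set.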

In my experience, computer and disk performance have very few absolutes. The most common answer to most good computing questions, especially when related to performance and availability is, "it depends." Never lose sight of that simple statement.

In fact, if you only gain one piece of wisdom from my columns and my writing, I ask that it be this statement, which I repeat here: "The most common answer to most good computing questions, especially when related to performance and availability, is 'it depends'."

Despite that, I will tell you that in my experience, software-based mirroring (and replication for that matter) "tends" to be less expensive and faster than hardware-based. But, please, don't take my word for it. Test it in your own environment with your own data.

Hope this helps.

Editor's note: Do you agree with this expert's response? If you have more to share, post it in one of our discussion forums.
