The recent proposed acquisition of 3Par by Dell and/or HP has made me think a little more about the direction the storage industry is taking in terms of storage array design architecture. Since storage arrays became a category of devices in their own right, we’ve seen the growth of the monolithic array, sometimes called the Enterprise storage array. Hu Yoshida discusses the subject in one of his recent blog posts. Looking at the wide range of storage devices, I’ve categorised arrays into the following groups:
- Monolithic – this architecture is characterised by Hitachi USP, HP XP & EMC DMX and consists of a shared memory architecture and multiple redundant components.
- Multi-Node – these devices use loosely coupled storage “nodes” with a high-speed interconnect providing scalability by adding extra nodes to the storage “cluster”. Products in this category include EMC VMAX and 3Par InServ.
- Closely Coupled Dual Controller – this is the typical “modular” storage architecture characterised by IBM DS8000, EMC CLARiiON, Hitachi AMS and HP EVA.
- Loosely Coupled Dual Controller – this category describes technologies that are capable of device failover but aren’t closely coupled enough to enable individual LUN failover, as the Closely Coupled model permits. This category is characterised by arrays such as NetApp FAS filers and Compellent Storage Center.
- Single Controller – this category covers devices that act as standalone products, including SOHO storage devices such as the Iomega IX4 & Data Robotics Drobo series.
The above list isn’t exhaustive and it’s my own personal categorisation. There are many more vendors of technology than I’ve listed here. In addition, none of these categories qualifies as “Enterprise” in its own right. The use of this term is a hotly debated subject.
Monolithic arrays use a shared cache architecture to connect front-end storage ports to back-end disk. This is shown clearly in the architecture diagrams here, representing the internal connectivity of the EMC DMX and Hitachi USP storage arrays. Each of the memory units is connected to each of the front-end directors and the back-end disk directors. Hitachi divide their cache into two halves for Clusters 1 & 2 in the array; EMC have up to eight cache modules.
This architecture has both advantages and disadvantages. First, having every director connected to all cache modules ensures resources aren’t fragmented; unless cache becomes completely exhausted, there’s always connectivity to another cache module to process a user request. It also doesn’t matter on which port that request comes in; the cache module can process any request from any port to any back-end disk. This connectivity is also beneficial in terms of failure. If a cache module fails, for example, only the cache on that module is lost; in a fully deployed architecture the total cache would drop (by 1/8th in EMC’s case), but front and back-end connectivity would remain the same. With this model it is possible to pair up storage ports and have a single LUN presented from one or more ports with no performance impact; the path length between a storage port and disk adaptor will always be the same.
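To make the failure behaviour concrete, here is a minimal Python sketch of the any-to-any model, assuming eight front-end directors, eight back-end directors and eight cache modules as in the fully deployed EMC case. The function names and the director labels (`FE0`, `BE0` and so on) are my own illustration and not taken from any vendor tooling.

```python
from itertools import product


def build_mesh(directors, cache_modules):
    """Return the full set of director-to-cache links (any-to-any)."""
    return set(product(directors, cache_modules))


def reachable_cache(links, director, failed_modules):
    """Cache modules a director can still reach after some modules fail."""
    return {c for d, c in links if d == director and c not in failed_modules}


def surviving_cache_fraction(cache_modules, failed_modules):
    """Fraction of total cache left, assuming equally sized modules."""
    return (len(cache_modules) - len(failed_modules)) / len(cache_modules)


# Hypothetical fully deployed configuration: 8 FE + 8 BE directors, 8 cache modules.
directors = [f"FE{i}" for i in range(8)] + [f"BE{i}" for i in range(8)]
cache_modules = list(range(8))
links = build_mesh(directors, cache_modules)

failed = {3}  # lose a single cache module
for d in directors:
    # Every director keeps a path to all seven surviving modules.
    assert len(reachable_cache(links, d, failed)) == 7

print(surviving_cache_fraction(cache_modules, failed))
```

Losing one module reduces capacity by an eighth, but no director loses its path to cache, which is the point of the shared design.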
This any-to-any model also has disadvantages. The connectivity is complex, which makes it expensive, and it requires overhead to manage and control the interaction between the various components. In addition, there’s a limit to the practical scalability of this architecture. With eight front-end directors, eight back-end directors and eight cache modules, there are 128 connections in place (8 x 8 x 2). Adding a single cache module requires an additional 16 connections; similarly, adding more front or back-end directors requires more connectivity. Also, monolithic arrays are based on custom components and custom design, increasing the ongoing maintenance and development costs for the hardware.
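The connection arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name `mesh_links` is hypothetical, and it assumes one connection between every director and every cache module.

```python
def mesh_links(front_end, back_end, cache_modules):
    """Director-to-cache connections in a full any-to-any design."""
    return (front_end + back_end) * cache_modules


base = mesh_links(8, 8, 8)
print(base)                         # 128, i.e. 8 x 8 x 2
print(mesh_links(8, 8, 9) - base)   # 16 extra links for one more cache module
print(mesh_links(9, 8, 8) - base)   # 8 extra links for one more front-end director
```

The count grows with the product of directors and cache modules, which is why each additional component gets steadily more expensive to wire in.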
One other point to remember: front and back-end directors have their own processors. It is possible for the traffic across the directors to be unbalanced and for some processors to be more heavily utilised than others. I’ve seen a number of configurations where USP V FED ports are running at 100% processor utilisation due to small block sizes. This means manual load balancing is required, both in initial host placement and subsequently as traffic load increases. This is worth bearing in mind as we move to more highly virtualised environments, as it is likely host port utilisation will start low and rise over time as more virtual machines are created.
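As an illustration of what that manual balancing amounts to, here is a minimal Python sketch of a greedy placement: hosts are sorted by expected load and each one goes to the currently least-utilised port. The host names, port names and load figures are made up, and this is not how any array does it internally; it simply mirrors the planning exercise an administrator performs.

```python
import heapq


def place_hosts(host_load, ports):
    """Greedy placement: heaviest host first, onto the least-loaded port."""
    heap = [(0, port) for port in ports]   # (current load, port name)
    heapq.heapify(heap)
    placement = {}
    # Sort by descending load, then by name so the result is deterministic.
    for host, load in sorted(host_load.items(), key=lambda kv: (-kv[1], kv[0])):
        port_load, port = heapq.heappop(heap)
        placement[host] = port
        heapq.heappush(heap, (port_load + load, port))
    totals = {port: load for load, port in heap}
    return placement, totals


# Hypothetical hosts with estimated load as a percentage of one port processor.
placement, totals = place_hosts({"esx01": 50, "esx02": 30, "db01": 20},
                                ["CL1-A", "CL2-A"])
print(placement)
print(totals)
```

Because virtualised hosts tend to grow over time, the load estimates need revisiting and the placement re-running periodically rather than being treated as a one-off.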
Now that the DMX platform has been put out to pasture in favour of VMAX, it appears Hitachi are the only vendor continuing down the monolithic route. Next time I’ll discuss Multi-Node arrays and why they may (or may not) be a replacement for today’s monolithic devices.