Understanding the Value of Seagate Kinetic


Once in a while a new technology comes along and despite all your best efforts, it is a challenge to put a measure on the value.  So it is with Seagate’s Kinetic announcement, which has been covered across the media recently.  In case you missed it, here’s the press release.  Seagate are claiming it as a breakthrough in hard drive usage as the devices use Ethernet for connectivity and store data as objects (more on that in a moment).  However, I have to say I’m not fully convinced.

Drive & Interface

First of all, let's look at the drive itself.  Generation 1 of the Kinetic HDD is a 4TB device spinning at 5900RPM in a standard 3.5″ form factor.  Rather than use SATA or SAS interfaces, Kinetic uses two 1Gb/s Ethernet connections, presented through a standard Serial Gigabit Media Independent Interface (SGMII) connector that looks like the one found on standard SAS HDDs.  However, today we already have 6Gb/s SAS and SATA interfaces, with the promise of 12Gb/s around the corner.  SAS has plenty of additional features such as tagged command queuing, multi-path I/O and multiple SAS initiators, and is used across the industry as the back-end protocol in most high-end storage arrays.  Other than peer-to-peer replication, I don't see any of these features in the Kinetic specification.  Based on the interface alone, a Kinetic drive will never be able to move more than approximately 100MB/s of data (assuming one active interface and one for redundancy).  For a completely full 4TB drive, that's a transfer time of around 11 hours at best to replicate the content elsewhere.  By contrast, the current range of Constellation drives can already sustain 175MB/s of throughput, which means SAS/SATA is still essential for the best performance.
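
To put some rough numbers on that comparison, here's a simple back-of-the-envelope sketch.  The 100MB/s and 175MB/s figures are the same approximations used above (one active GbE port versus a SAS/SATA Constellation drive), not measured results:

```python
# Rough transfer time for replicating a completely full 4TB drive,
# assuming one active 1GbE port (~100MB/s usable) versus the ~175MB/s
# sustained throughput of a SAS/SATA Constellation drive.
# Illustrative figures only, not vendor-measured numbers.

CAPACITY_TB = 4
BYTES_PER_TB = 10**12          # drive vendors use decimal terabytes

def drain_time_hours(throughput_mb_s: float) -> float:
    """Hours needed to read or replicate the whole drive at a sustained rate."""
    total_mb = CAPACITY_TB * BYTES_PER_TB / 10**6
    return total_mb / throughput_mb_s / 3600

print(f"Kinetic over a single 1GbE port (~100MB/s): {drain_time_hours(100):.1f} hours")
print(f"Constellation over SAS/SATA (~175MB/s):     {drain_time_hours(175):.1f} hours")
```

That works out at roughly 11 hours versus around 6.3 hours for the same drive, purely down to the interface.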

Physical Deployment

Seagate cite an example configuration that stores 60 drives in a 4U chassis (15 drives per 1U).  This in itself presents maintenance problems, as all of the drives will be packed extremely close together.  As an example, take HP's MSA2040 storage array, which supports 12 drives in 2U, all at the front of the unit, a limit dictated by drive size.  Imagine a Kinetic chassis using the same layout.  In 4U this could support 24 drives, with an additional 24 at the rear.  That still leaves 12 drives to be located somewhere inside the chassis and no space for power, the chassis controller or Ethernet connectors, meaning some drives would have to be physically moved to service others.  Seagate's rack configuration shows ten of these chassis in a single rack – 600 drives.  At current specifications, those drives would weigh around 420kg (excluding the controllers) and require a minimum of 7.2kW of power.  They would also be extremely difficult to cool, requiring significant airflow.  We saw a few years back how Copan Systems had issues with floor loading in such dense configurations.
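
For anyone who wants to check the arithmetic, here's a quick sketch.  The per-drive weight and power values are my own ballpark assumptions for a 3.5″ 4TB 5900RPM drive, not published Seagate specifications:

```python
# Back-of-the-envelope rack totals for the configuration described above:
# ten 4U chassis of 60 drives each.  Per-drive weight and power are
# approximate figures for a 3.5" 4TB 5900RPM drive (my assumptions).

DRIVES_PER_CHASSIS = 60
CHASSIS_PER_RACK = 10
DRIVE_WEIGHT_KG = 0.7          # typical weight of a 3.5" HDD
DRIVE_POWER_W = 12             # approximate active power per drive

drives = DRIVES_PER_CHASSIS * CHASSIS_PER_RACK
print(f"Drives per rack: {drives}")
print(f"Drive weight:    {drives * DRIVE_WEIGHT_KG:.0f}kg (excluding chassis and controllers)")
print(f"Drive power:     {drives * DRIVE_POWER_W / 1000:.1f}kW minimum")
```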

Count-Key-Data

One of the main selling points of Kinetic is the use of object-based data storage to and from disk.  Currently, disk drives commit data as either 512-byte or 4KB blocks, accessible via a logical block address (LBA).  Kinetic instead uses a key-value system that stores objects under a user-defined key (limited to 4KB), with a maximum object size of 1MB.  That object size limit is too small to be genuinely useful, as many objects stored on disk will be larger than 1MB (imagine each object is a PowerPoint presentation).  This means additional metadata needs to be kept elsewhere to track objects that have to be sharded, or subdivided, into multiple chunks.  So there is a management overhead.
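
To illustrate that overhead, here's a minimal sketch of the kind of sharding an application layer would need to bolt on.  The put() callable, the chunk-key naming and the manifest format are hypothetical stand-ins of my own, not part of the Kinetic specification:

```python
# Minimal sketch of sharding an object that exceeds the 1MB value limit.
# put(key, value) stands in for whatever the Kinetic client library exposes;
# the chunk-key suffixes and the manifest are hypothetical conventions.

MAX_VALUE = 1 * 1024 * 1024    # 1MB per-object limit
MAX_KEY = 4 * 1024             # 4KB key limit

def put_large_object(put, key: bytes, data: bytes) -> None:
    """Store data under key, splitting it into 1MB chunks if necessary."""
    if len(key) > MAX_KEY:
        raise ValueError("key exceeds the 4KB limit")
    if len(data) <= MAX_VALUE:
        put(key, data)                       # small objects fit directly
        return
    chunks = [data[i:i + MAX_VALUE] for i in range(0, len(data), MAX_VALUE)]
    for n, chunk in enumerate(chunks):
        put(key + b".%06d" % n, chunk)       # one key-value pair per chunk
    # The manifest is the extra metadata referred to above: without it,
    # nothing ties the chunks back together into a single object.
    put(key + b".manifest", str(len(chunks)).encode())
```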

By the way, object storage on disk isn't a new phenomenon.  If we look back at the way mainframe data was stored on disk, we find a similar architecture called Count-Key-Data or CKD, which allowed data to be stored on disk as key/value pairs.  Admittedly, most of the time disks were formatted with consistent, standard-sized blocks, but the principle already exists.  Welcome to the 1960s…

Management

Seagate implies in their description of The Ecosystem that all drives will be accessible to all "nodes" in a configuration (presumably meaning servers or hosts) and that storage servers are no longer required.  They say "a whole layer of processors, memory & associated power/cooling… go away".  Unfortunately this isn't the case.  There still needs to be some process or system to handle failed drives and perform capacity management.  Imagine having no central point of control, with every node able to store data across all drives.  This seems like a useful concept, but what algorithm will those nodes use to choose the most appropriate drive?  Will they use performance or capacity?  Will they know the physical architecture in order to place data replicas across controllers and power boundaries?  Imagine a multi-tenant scenario where tenants are all creating data and consuming capacity.  Without a central repository, each tenant would have to poll every device regularly just to see how much space is available.  And imagine if a tenant had a software bug and didn't commit or save the details of a stored object: it would be easy to end up with hundreds of orphan objects without any easy way of tracing them back to an owner.  Centralised management is still needed.
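
As a simple illustration of the polling problem, here's a sketch contrasting the two approaches.  The get_capacity() callable is a hypothetical stand-in for whatever status query the drives actually expose:

```python
# Sketch of capacity reporting with and without a central repository.
# get_capacity(drive) is a hypothetical stand-in for a per-drive status call.

def free_space_without_catalogue(drives, get_capacity) -> int:
    """Every caller polls every drive itself: N tenants means N x len(drives) queries."""
    return sum(get_capacity(d) for d in drives)

class CapacityCatalogue:
    """A central repository polls each drive once and answers everyone from its cache."""
    def __init__(self, drives, get_capacity):
        self._free = {d: get_capacity(d) for d in drives}   # one poll per drive

    def total_free(self) -> int:
        return sum(self._free.values())
```

The same argument applies to tracking object ownership: without a catalogue of who wrote what, orphaned objects have no owner to be traced back to.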

Enough of the Negativity

OK, so I'm being overly negative.  There are scenarios where this technology could be implemented and used within distributed databases or object stores, and perhaps over time the limitation on object size will be removed.  I understand that and can see those use cases.  However, I don't think Seagate's positioning on efficiency is fair or balanced.  There's a large amount of management overhead still required to keep track of data stored on any disk format; the ability to store data as an object doesn't solve all our problems.  This is definitely a version 1.0 product, but with more work I expect version 2.0, or more probably 3.0, to deliver on what was initially promised.

Related Links

Comments are always welcome; please indicate if you work for a vendor as it’s only fair.  If you have any related links of interest, please feel free to add them as a comment for consideration.


Copyright (c) 2013 – Brookend Ltd, first published on http://architecting.it, do not reproduce without permission.

About Chris M Evans

Chris M Evans has worked in the technology industry since 1987, starting as a systems programmer on the IBM mainframe platform, while retaining an interest in storage. After working abroad, he co-founded an Internet-based music distribution company during the .com era, returning to consultancy in the new millennium. In 2009 Chris co-founded Langton Blue Ltd (www.langtonblue.com), a boutique consultancy firm focused on delivering business benefit through efficient technology deployments. Chris writes a popular blog at http://blog.architecting.it, attends many conferences and invitation-only events and can be found providing regular industry contributions through Twitter (@chrismevans) and other social media outlets.
  • Matt Breitbach

    So I agree that it's an interesting product, but you're getting a little _too_ negative about it.

    1 – Drive interface – The speed at which that drive transfers data is certainly going to be limited to much less than 100MB/sec if there's any hint of randomness; only streaming reads/writes will exceed 100MB/sec.

    2 – The 60-drive layout is being used by several vendors right now, including EMC. The top-loading drive system isn't that difficult to manage or maintain. The power and weight aren't a big concern for a lot of newer datacenters; 7.5kW per rack isn't unheard of.

    3 – Most object-based systems right now seem to be shipping with 12-24 drives, plus CPU/RAM/OS/etc. If you can put an entire rack of these drives behind a smaller set of controllers, you can decrease the controller-per-drive ratio, likely reducing your power draw.

    4 – Count Key Data – There's always going to be some management overhead in any system. There's management overhead in RAID controllers, and there's management overhead in resolving a file to an inode, an I/O for reading that inode from disk, and then an I/O for reading the file from disk. Finding where it makes sense to use traditional block vs object storage is the challenge here. A good read is Doug Beaver's paper about Facebook's Haystack project, which discusses objects vs blocks. http://www.stanford.edu/class/cs240/readings/haystack.pdf

    5 – I think you're misunderstanding the use case here. What I gathered is that these disks are implicitly intended for projects like SwiftStack and Basho Riak, not for a webserver or client to access directly (although they _could_).

    Disclosure – I’m a tech nerd working for a Managed Services provider, and I am in no way an industry professional in this area.

  • http://thestoragearchitect.com/ Chris M Evans

    Matt, thanks for the comment.

    On point 1, as we move to larger objects, surely the throughput does become an issue as the I/O is inherently less random?

    On point 2, I guess I'm implying Seagate isn't being fair in their comparisons; they're taking almost a worst-case example.

    I don't disagree with your point 5 – I think these are aimed at distributed data projects; however, Seagate themselves reference file system examples and multi-tenancy as possibilities.

    It will be interesting to see how quickly we get some real deployments of this.

    Chris

    • Matt Breitbach

      I agree that as we move to larger objects, throughput will begin to become an issue. Fortunately 10Gbit is becoming more available, and we're seeing interesting things like the Intel FM5224 switches coming out that support 2.5Gbit connections (http://www.intel.com/newsroom/kits/atom/c2000/pdfs/Intel_Ethernet_Switch_FM5224_Product_Brief.pdf). If we had active/active 2.5Gbit ports on the drive, I think throughput becomes a moot point. I suspect that at some point we'll see faster interconnects on the drives. I'll give them the benefit of the doubt for a first-gen product.

      As for the density, I'll give you that it's a poor/worst-case comparison. I think there are a lot of people probably doing 48-60 drives per controller already, using back-end SAS loops.

  • http://juku.it/ Enrico Signoretti

    Chris, good points.
    As you already said, this is a v1.0 product; there will be an evolution and some of the current limits will be mitigated.

    I would like to point out that some object storage vendors could take advantage of this technology; some of them already store data without using a filesystem (Caringo and DDN WOS, for example) and this kind of drive validates their approach in some way. I'm sure that some of those ISVs are already looking at it.

    I'm also confident that this technology is already in use in Evault's datacenter to build their Glacier-like service. 😉

    Ciao,
    Enrico

  • Seth

    I would like to see a drive manufacturer do this with an already mature open standard. A drive with embedded AoE (ATA over Ethernet) would be awesome. Unlike iSCSI, it doesn't need IP configuration or authentication. Just plug the drives into their own switch and plug one cable into the server on an additional NIC. A Linux server with the AoE driver loaded treats the drives just like internal ones – no router, no network configuration needed.
