Why You May Want to Keep Your Data Out of the Storage Cloud for Now


Cloud storage is generating a fair amount of interest in the press, among analysts, and even, to some degree, among the end users who actually store their data in the cloud. The big attraction of cloud storage is that it gives companies an economical, always-available pool of storage for their data while off-loading the task of storage management to a third-party provider. On the surface, this sounds great. But companies really need to understand exactly what problems they hope to solve with cloud storage and then use it only under those circumstances.

A review of the current state of cloud storage, and of what current providers (primarily Amazon S3 and Nirvanix) and storage users are saying about it, shows that it is still far from a slam dunk in most corporate environments. Issues that cloud storage still encounters include:

  • Data availability. Cloud storage's promise of making data more widely available on more economical storage is bound to intrigue most companies. Yet according to a recent Byte and Switch article, Amazon S3 guarantees only 99.9% data availability. That is not bad, but it still allows about nine hours of downtime every year, and it is unclear whether that downtime is scheduled or unscheduled. Further, even that guarantee seems unrealistic: according to a recent InformationWeek article, S3 has experienced about 10 hours of unscheduled downtime in the last nine months alone. While companies may not need much of their data most of the time, will businesses ever accept having zero access to their data while their cloud storage provider is offline, if that can occur at any time?
  • Hidden connection charges. The monthly cost per GB (15 to 20 cents) is the rate that everyone likes to cite as the big advantage of cloud storage. However, companies also need to pay attention to the fees for uploading and downloading data, which is where the costs can add up. Every time you transfer data, the cash register rings to the tune of about 18 cents per GB. Again, that may not sound bad, but with services like Amazon S3, something as simple as renaming a file or copying it to another location can trigger uploads and downloads, each of which incurs transfer fees.
  • Platform maturity. The recent unexpected and extended outage of Google's Gmail, which is based on a cloud computing platform, exemplifies how immature these cloud-like platforms still are. Aggravating the situation, Google did not immediately come clean that its Gmail service was offline. Now, granted, Gmail is free for the majority of people who use it, and this outage in no way diminishes the long-term value that companies can eventually expect to derive from cloud storage. However, examples like the Gmail outage help to explain why some of these cloud computing platforms are free.
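
The availability and pricing figures in the list above are easy to sanity-check. The following is only a rough sketch: the function names are hypothetical, the rates are the ones quoted in this post (not current provider pricing), and "nine months" is approximated as three quarters of a 365-day year.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years ignored for simplicity


def annual_downtime_hours(availability_pct: float) -> float:
    """Hours per year a service can be down and still meet its availability figure."""
    return HOURS_PER_YEAR * (1 - availability_pct / 100)


def observed_availability_pct(downtime_hours: float, period_hours: float) -> float:
    """Availability actually delivered over a period, as a percentage."""
    return 100 * (1 - downtime_hours / period_hours)


def monthly_cost_usd(stored_gb: float, transferred_gb: float,
                     storage_rate: float = 0.15,
                     transfer_rate: float = 0.18) -> float:
    """Monthly bill in dollars: storage fee plus upload/download fees.

    Default rates are the per-GB figures cited in this post (low end of the
    15 to 20 cent storage range, about 18 cents per GB transferred).
    """
    return stored_gb * storage_rate + transferred_gb * transfer_rate


if __name__ == "__main__":
    # Downtime permitted by a 99.9% guarantee
    print(f"{annual_downtime_hours(99.9):.2f} hours/year")
    # Roughly 10 hours of downtime in about nine months
    nine_months = 0.75 * HOURS_PER_YEAR  # assumption: nine months = 0.75 year
    print(f"{observed_availability_pct(10, nine_months):.2f}% delivered")
    # Hypothetical workload: 1,000 GB stored, 500 GB moved in a month
    print(f"${monthly_cost_usd(1000, 500):.2f} per month")
```

Under these assumptions, a 99.9% guarantee allows 8.76 hours of downtime a year, while 10 hours of downtime over roughly nine months works out to about 99.85% availability, short of the guaranteed figure. For the hypothetical workload, transfer fees add $90 to a $150 storage bill, which is why they deserve as much attention as the headline per-GB rate.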

There is obviously a host of other reasons I could delve into as to why cloud storage is still an emerging technology and not quite ready for prime time. However, offloading the task of storage management is probably one of the main reasons companies are intrigued by the entire concept of cloud storage. Companies are seeking an easier and more cost-effective way to host their infrequently accessed data. But are companies really ready to turn this data, and its availability, over to the storage cloud? I think not.

My view is that companies are far better off exploring existing products that are based on grid storage architectures, such as Permabit's Enterprise Archive. Grid storage architectures deliver most if not all of the benefits of cloud storage, are already mature, and do not come with the baggage that the storage cloud still carries. Plus, you get additional benefits such as high availability (the ability to survive multiple simultaneous failures and perform upgrades seamlessly) and security with built-in encryption and WORM technologies, all at a price point that is cheaper than tape today.

Companies may someday move some and possibly all of their data into the storage cloud but that day is not today. There are just too many uncertainties and unknowns associated with making a move of that magnitude now, especially when one can address many of those concerns and achieve similar benefits by using existing and more mature products based on grid storage platforms.

1 Comment

Jeff Treuhaft said:

Jerome,

You raise key points here. For primary copy storage in the cloud, Data Availability and Cost of Network are key, but a far more important and useful metric will be Quality of Service.

The public WAN is rife with routers, NAPs and peering points that can each, separately and together, have an impact on an IT administrator's perceived performance.

Unlike other IT infrastructures, storage is the one area where contention for resources at the disk level can lead to large negative performance impacts for customers who are accessing a common infrastructure. Saying a service was "available" 99.99% of the time is not nearly as important or valuable as saying that it was available AND the target latency, throughput, I/O and data integrity promises were also kept throughout the time period.

IT administrators in the enterprise should look for cloud storage systems that:

  • guarantee data integrity (i.e. never lose my data, always protect my data privacy)
  • host and manage data in a system that can independently promise data availability and data access performance regardless of the number of other customers or the size of the volume in the cloud
  • present storage targets that integrate seamlessly into existing enterprise IT environments (full POSIX compliance, snapshots, replication, multi-protocol interfaces, etc.)


Entry Sponsorship

This entry is sponsored by Permabit Technology Corporation

About Permabit Technology Corporation

    Permabit Enterprise Archive is the only enterprise-class, disk-based storage system to archive petabytes of information at a fraction of the cost of tape. The system combines space saving compression and deduplication with multi-petabyte scalability to provide Scalable Data Reduction™ (SDR)