When leveraging cloud storage to complement or replace on-premises storage, one of the biggest factors to consider is how data will get to a cloud storage provider.
The cloud data migration strategy an organization will implement varies by use case. For a cloud archive, for example, how quickly data is transferred to the cloud is not as pertinent since this kind of content is rarely accessed. Meanwhile, organizations moving production data need to consider a cloud migration strategy with as little downtime as possible.
Here are some of the best cloud data migration strategies as they pertain to the top use cases.
Backup may be the most popular use of cloud storage, but it also poses the biggest potential challenge when it comes to getting data to the cloud. Cloud backup products use deduplication and compression to minimize the amount of data transferred. Deduplication, however, works by comparing incoming data against the copy already stored in the cloud, so its full efficiency is realized only after the first backup completes. A cloud data migration strategy for backup therefore focuses on getting the initial baseline data set into the cloud, and there are two basic techniques to make that happen:
- Back up all the data across the Internet connection and wait until it completes. Depending on the amount of data, this process can take up to several weeks. The transfer speeds up as it progresses because deduplication starts working within the first backup, comparing data across backup sets. For example, if 100 Windows virtual machines need to be backed up, once the first Windows VM is backed up, much of the operating system data in the other 99 VMs will not need to be copied again.
- Perform a bulk copy of data on premises and then ship that copy overnight to the cloud storage provider's data center. The cloud provider will copy that data to its storage infrastructure, and the organization can then start sending updates. The initial sync may take longer than a normal backup, but it should not take as long as sending the full copy across the Internet connection.
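To get a feel for which of the two techniques makes sense, it helps to run the arithmetic. The following is a minimal sketch, not a sizing tool: the function names, the 80% link utilization figure and the shipping turnaround are all assumptions chosen for illustration, and it ignores the gains from deduplication and compression.

```python
def transfer_days(data_tb: float, link_mbps: float, utilization: float = 0.8) -> float:
    """Estimate days needed to send data_tb (decimal terabytes) over a link.

    utilization is an assumed fraction of the link that backup traffic
    can actually use; 0.8 is an illustrative guess, not a measured value.
    """
    bits = data_tb * 1e12 * 8                    # terabytes -> bits
    usable_bps = link_mbps * 1e6 * utilization   # megabits/s -> usable bits/s
    return bits / usable_bps / 86_400            # seconds -> days


def faster_seeding_method(data_tb: float, link_mbps: float,
                          shipping_days: float, utilization: float = 0.8):
    """Compare network seeding with shipping a bulk copy.

    shipping_days is a hypothetical end-to-end figure that should include
    the local copy, transit and the provider's ingest time.
    """
    network_days = transfer_days(data_tb, link_mbps, utilization)
    if network_days <= shipping_days:
        return ("network", network_days)
    return ("ship", shipping_days)


if __name__ == "__main__":
    # 50 TB over a 100 Mbps link versus an assumed 5-day shipping turnaround.
    print(round(transfer_days(50, 100), 1))      # about 57.9 days
    print(faster_seeding_method(50, 100, 5))     # ('ship', 5)
```

With these assumed numbers, a 50 TB baseline would occupy a 100 Mbps connection for roughly two months, which is why shipping a bulk copy can be the faster route for large initial data sets.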
A cloud archive has the lowest migration impact because it works by copying data based on the last access date. An organization can move or copy data to a cloud storage provider once no one has accessed it for a predetermined time frame.
Most organizations never access more than 80% of their data again once it is created, so a storage administrator needs to use care when setting this parameter or the initial data transfer could be quite large. An organization may start by archiving data that hasn't been accessed in the last three years. Once that job is complete, it could lower the user-defined setting to copy data that hasn't been accessed in two years, and so on. Ongoing updates to the cloud storage service should be relatively small, as only a few files per day will cross the lack-of-access threshold.
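The staged approach described above can be sketched in a few lines. This is an illustration under stated assumptions rather than how any particular archive product works: the function names and the three-, two- and one-year steps are hypothetical, and it relies on file system access times, which can be coarse or disabled on volumes mounted with options such as relatime or noatime.

```python
import time
from pathlib import Path

SECONDS_PER_YEAR = 365 * 24 * 3600


def archive_candidates(root, min_idle_years, now=None):
    """Return files under root not accessed for at least min_idle_years."""
    now = time.time() if now is None else now
    cutoff = now - min_idle_years * SECONDS_PER_YEAR
    return [
        path for path in Path(root).rglob("*")
        if path.is_file() and path.stat().st_atime <= cutoff
    ]


def staged_plan(root, thresholds_years=(3, 2, 1), now=None):
    """Size each archive pass as the idle threshold is lowered step by step.

    Returns a list of (threshold_years, new_file_count, new_bytes), counting
    only files that were not already picked up by an earlier, stricter pass.
    """
    already_archived = set()
    plan = []
    for years in thresholds_years:
        batch = [p for p in archive_candidates(root, years, now)
                 if p not in already_archived]
        already_archived.update(batch)
        plan.append((years, len(batch), sum(p.stat().st_size for p in batch)))
    return plan
```

Running `staged_plan` against a file share before the first job gives a rough idea of how much data each threshold would send, so the starting setting can be chosen to keep the initial transfer manageable.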
One challenge with a cloud archive is that most of the data is unique; this means deduplication will not be as effective as in the backup use case noted earlier.
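A small experiment shows why redundancy matters so much to deduplication. The sketch below uses fixed-size chunks and SHA-256 hashes, which is a simplifying assumption; commercial products often use variable-size chunking and their own indexing schemes. The data is synthetic, with random bytes standing in for VM images and archive files.

```python
import hashlib
import os


def dedup_ratio(blobs, chunk_size=4096):
    """Return logical bytes divided by bytes actually stored.

    Each blob is split into fixed-size chunks; a chunk is stored only
    the first time its SHA-256 digest is seen.
    """
    seen = set()
    logical = stored = 0
    for blob in blobs:
        for offset in range(0, len(blob), chunk_size):
            chunk = blob[offset:offset + chunk_size]
            logical += len(chunk)
            digest = hashlib.sha256(chunk).digest()
            if digest not in seen:
                seen.add(digest)
                stored += len(chunk)
    return logical / stored if stored else 1.0


# Backup-like data: 100 "VMs" that share the same 64-chunk operating
# system image and each add 4 chunks of unique data.
os_image = os.urandom(64 * 4096)
vms = [os_image + os.urandom(4 * 4096) for _ in range(100)]

# Archive-like data: 100 files of the same size with entirely unique content.
archive = [os.urandom(68 * 4096) for _ in range(100)]

if __name__ == "__main__":
    print(round(dedup_ratio(vms), 1))      # roughly 14.7
    print(round(dedup_ratio(archive), 1))  # 1.0, nothing to deduplicate
```

The shared operating system chunks are stored once for all 100 simulated VMs, while the archive-like set gains nothing because every chunk is different.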
Cloud disaster recovery
With cloud DR, an organization is only concerned with the most recent copy of data that is being accessed and used. Data centers that choose this process typically archive to a different service, or even archive on premises. Most cloud DR products on the market can replicate natively to a cloud provider. A user can also define what data to replicate, such as applications and the most frequently accessed files.
Deduplication will have a small impact here since there is likely redundancy between applications and VMs, but it won't be as impactful as in the backup use case. Cloud DR should be able to finish its initial cloud data migration quickly since it only has to replicate the actively used data, which is 20% of the total at most. Once the baseline working set is in the cloud, only updates need to be sent to the provider.
A data migration strategy for a cloud computing use case typically means an organization has decided to host an entire application in the cloud. Migrating data for cloud computing is, once again, well served by a brute force initial data migration or replication job. Some applications that perform this task will also convert the VM to a cloud version. In other cases, the application is rewritten in the cloud, so all that is needed is the data itself. In both cases, the amount of data transferred is relatively small and typically done one application at a time. There is seldom a case where that data is replicated back to the on-premises data center, so the organization just has to work on this first transfer.
Cloud transfer optimization
A significant capability to consider adding to the process is a cloud transfer optimization utility. These are often virtualized versions of WAN acceleration appliances. They perform a WAN-optimized transfer across the Internet connection, which is much more efficient than a transfer optimized for data efficiency alone. These offerings typically improve transfer speeds by 50% to 100% just by optimizing the IP packets for bulk transfer.
Migrating to the cloud is much easier than it used to be, but the strategy that makes the most sense for your organization depends on how it will use the cloud. There are few one-size-fits-all products, so it's not unusual for an organization to have more than one cloud data migration strategy.