Virtualization technology, widely adopted for its cost efficiency and ease of administration, has implications for data protection and recovery solutions. The data and applications running in virtual machine environments must be protected just as those on physical servers are, but the differences between physical and virtual servers affect your backup strategies. At the same time, virtual machine technology can significantly streamline disaster recovery operations. In both cases, proper planning and deployment of virtual machine environments and backup solutions can positively influence the return on investment (ROI) in these technologies.
Virtual machine technologies such as VMware Infrastructure 3, VMware Server, and Microsoft's Hyper‐V (available in Windows Server 2008) are gaining in popularity. These technologies insert a software layer between the hardware and the operating system (OS), which traditionally ran directly on that hardware. The new software layer, known as a hypervisor, acts as a mediator between CPUs, memory, storage, and other hardware components and one or more OSs.
Figure 1: Virtual machine environments enable server consolidation without sacrificing the benefits of running applications in dedicated OSs.
Some of the key drivers behind this adoption are
With these advantages, though, come differences in the way typical IT operations are performed. Backups, in particular, require special attention.
Moving to virtual machine environments does not change the need to protect these servers at the same level that physical machines are protected. Although virtual machines can be installed from images to new hardware ("raw iron") faster than OSs and applications can be installed and configured from scratch, there is still a need to protect the data on virtual servers. It is reasonable to presume that virtual machines will require similar service level agreements (SLAs), recovery point objectives (RPOs), and recovery time objectives (RTOs) to those of physical servers.
Conventional backup methods, however, can adversely impact some virtual machine environments and slow the performance of other virtual machines. For example, if one virtual machine is running a full backup, the performance of other virtual machines on the same physical server may be so degraded that they no longer meet their response time objectives. In other cases, a systems administrator may be forced to choose between delaying the backup of one virtual machine to allow another virtual machine's backup to complete or risk degrading the performance of all virtual machines on the physical server.
Specialized solutions can improve upon conventional backup methods but at the cost of increased complexity. Take, for example, solutions that offload the backup process to another server. This type of solution offers the ability to perform full and incremental backups of virtual machines, make full image backups of virtual machines, and perform these operations from a centralized management system. It does require some changes to the typical backup system:
This method eliminates the overhead of performing backups from the virtual machines and consolidates it on the proxy server. It does, however, introduce another component (the proxy server) to manage. Another approach is to reduce, rather than shift, the overhead of creating backups.
Backup overhead is reduced when the amount of data copied, transmitted, and stored is reduced. There are a number of ways to do so, including source‐side deduplication and target‐side deduplication; this article discusses the more efficient source‐side deduplication method.
See the first article in this Essential Series for details about source‐side deduplication compared with other data reduction strategies.
With source‐side deduplication, block‐level incremental backups are performed. This is more efficient than file‐level backups because only changed blocks, rather than entire files, are backed up. As a result, backups finish faster with less demand on CPU, disk, and network resources. This approach eliminates the need for a proxy server to reduce the load on virtual servers, which, in turn, eliminates the two‐step recovery process required with a proxy‐based solution. Fewer recovery steps mean less complexity, less room for error, and faster recovery times.
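To make the idea of block‐level, source‐side deduplication concrete, the following is a minimal sketch rather than any vendor's implementation. It assumes fixed 4 KB blocks, SHA‐256 block fingerprints, and an in‐memory dictionary standing in for the backup target; the names `split_blocks`, `backup`, `restore`, and `store` are hypothetical and chosen only for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size; real products may chunk differently


def split_blocks(data, block_size=BLOCK_SIZE):
    """Split a byte string into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def backup(data, store):
    """Back up `data` into `store`, transmitting only blocks the store lacks.

    Returns the recipe (ordered block fingerprints needed to rebuild the data)
    and the number of bytes that actually had to be sent.
    """
    recipe = []
    sent = 0
    for block in split_blocks(data):
        # The fingerprint is computed at the source, before any transfer.
        digest = hashlib.sha256(block).hexdigest()
        if digest not in store:
            store[digest] = block
            sent += len(block)
        recipe.append(digest)
    return recipe, sent


def restore(recipe, store):
    """Rebuild the original bytes from a recipe in a single step."""
    return b"".join(store[digest] for digest in recipe)


# Hypothetical data: four distinct 4 KB blocks standing in for a virtual disk.
store = {}
original = b"".join(bytes([i]) * BLOCK_SIZE for i in range(4))
first_recipe, first_sent = backup(original, store)

# Change a single byte inside the second block, then back up again.
changed = bytearray(original)
changed[5000] = 255
second_recipe, second_sent = backup(bytes(changed), store)

print(first_sent)   # 16384: every block is new on the first backup
print(second_sent)  # 4096: only the one changed block is sent
```

In this sketch the second backup transmits a quarter of the data of the first, and both versions remain restorable from their recipes without an intermediate staging step.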
It should be noted, though, that some applications, such as databases, may perform block‐level operations of their own within the context of logical transactions. Block‐level backup systems must take this into account to ensure that backups represent a consistent database state. If this is not possible, backups should be performed only when there are no changes being made to the database.
Depending on the backup solution and the hypervisor constraints, a data protection system with source‐side deduplication may also offer:
The last feature is especially important if you are required to recover a single user's email folder or a single table within a database. Restoring an entire database, for example, in order to restore a single table can add significantly to the time and space required to perform a restore operation.
Solutions that reduce overhead and reduce complexity yield favorable ROI for a variety of reasons:
Optimizing backups for virtual machines is just one way to improve overall ROI with virtual machines, and it contributes to more effective disaster recovery as well.
As the number of virtual machines in a business grows, so does the need for disaster recovery support. When selecting a backup solution, look for features that support creating virtual machines from a single backup instance. This approach brings several advantages. Perhaps the most important from a disaster recovery perspective is the ability to rapidly recover both virtual and physical servers with the instant availability of virtual machine images. Certain data protection products allow you to iSCSI‐connect and LUN‐map from the virtual server to the backup image, creating a temporary, fully operational virtual machine without any data transfer.
There are cost advantages as well. The ability to recover physical servers to virtual machines reduces the cost of maintaining a disaster recovery center. For example, if a production email server running on a dedicated server completely fails, a virtual machine can be rapidly deployed to a shared server in a disaster recovery site. Performance may be slower but email services can be maintained until a production‐level server is available. In addition, jobs, such as backup jobs, can continue to complete even after migration.
The overall impact is the need for fewer servers in a disaster recovery site. Rather than a worst case scenario of having to support a failover server for every production server that runs a single application, you can consolidate both production and disaster recovery servers. Presumably, virtual machines are already used in production to optimize the number of physical servers needed to meet performance requirements and SLAs. If those SLAs allow for reduced performance during disaster recovery periods, you could further reduce the cost of maintaining a disaster recovery site by running more virtual machines on physical servers during recovery periods. Favorable ROI is also realized by using a shared backup and disaster recovery procedure: the same backups used for data recovery are available for disaster recovery.
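The consolidation argument above comes down to simple arithmetic, sketched here with purely hypothetical figures. The function name `dr_hosts_needed`, the count of 40 production servers, and the consolidation ratios of 8 and 10 virtual machines per host are assumptions for illustration, not figures from any particular deployment.

```python
import math


def dr_hosts_needed(production_servers, vms_per_host):
    """Physical hosts required at the recovery site for a given consolidation ratio."""
    if production_servers < 0 or vms_per_host < 1:
        raise ValueError("server count must be >= 0 and vms_per_host must be >= 1")
    return math.ceil(production_servers / vms_per_host)


servers = 40  # hypothetical number of production servers to protect

one_to_one = dr_hosts_needed(servers, 1)     # a dedicated failover server for each
consolidated = dr_hosts_needed(servers, 8)   # assumed normal consolidation ratio
relaxed_sla = dr_hosts_needed(servers, 10)   # assumed denser packing during recovery

print(one_to_one, consolidated, relaxed_sla)  # 40 5 4
```

Under these assumptions, the recovery site shrinks from 40 hosts to 5, and to 4 if the SLAs tolerate denser packing during recovery periods. The real ratios depend on workload sizing and should be validated against actual performance requirements.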
Virtual machine environments are a staple of enterprise IT infrastructure. The cost and management advantages of virtual environments are compelling, so backup and disaster recovery strategies should adapt to their particular requirements. In the process, though, you will find that the same techniques you deploy to back up virtual machines without undue overhead on physical servers also yield favorable ROI. The lower overhead of source‐side deduplication leaves more resources for virtual machines and their applications and reduces storage costs on backup devices. In addition, the same procedures provide for cost‐effective disaster recovery.