vRA VM Backup, Delete, Restore – What’s the Problem?

Virtual machine lifecycle management is a key feature and function of VMware vRA. Policies can be put in place to control virtual machine requests, approvals, provisioning, expiration, and reclamation. With the EHC solution, we can also include automatic backup and restore in the VM lifecycle.

But what happens when a user destroys a VM, realises the mistake shortly afterwards, and now needs to restore a VM that no longer exists? Well, that’s the thing …

We’ve all been there, whether it was in Production or a Test Lab/PoC. You think you are done with the VM. No need to keep it, is there? Let it go. Delete it. You sure? No. Maybe. Probably. Yeah, what’s the worst that can happen? Ok, <click> & done.

… then obviously after a couple of minutes …

mybaaad

Fortunately vRA provides the mechanism to restrict users from having the absolute power to delete a VM. But it can still happen if the proper policies or guardrails are not put in place.

There are two scenarios to consider here, and unfortunately they are very, VERY different use cases:

  • Restoring from Backup a VM which is still in the vRA inventory
  • Restoring from Backup a VM which has been deleted from the vRA inventory
    • This is especially problematic for MultiMachine deployments

Much depends on the retention and archival periods that should be set in vRA, as well as the VM Actions allowed to vRA users. It also helps to have a backup that we can restore from!

Looking at the typical VM lifecycle in vRA (basic flow below), we want to focus on the Retire and Archive end of things.

vra_vm_lifecycle_basic

First, let’s clarify the difference between Retire, Archive and Destroy.

Retire a VM: An expiration date can be set on the VM Blueprint by means of setting a Lease (see below), or an entitled VM user may elect to retire a VM.

setleaseperiodinbp

When a machine lease expires, or a user manually retires the machine, it is powered off and is then Archived.

Archive a VM: An archived VM is powered off and remains in the vRA (and vCenter) inventory until such time as the specified Archive Period expires. This archive period is specified in the VM Blueprint, as shown below.

setarchiveperiodinbp

Setting this Archive value to zero means there is no archive period, and the VM is destroyed immediately when the lease expires. Otherwise, the machine is destroyed when the archive period expires. (You can reactivate an archived machine by setting its expiration date to a date in the future, which extends its lease, and then powering it back on.)
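To make the lease and archive arithmetic concrete, here is a minimal sketch of how the two blueprint values determine when a VM expires and when it is destroyed. This is an illustrative model only, not vRA code: the names `BlueprintPolicy`, `lifecycle_dates` and `state_on` are hypothetical, and it assumes both periods are counted in whole days.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional, Tuple


@dataclass
class BlueprintPolicy:
    """Hypothetical stand-in for the Lease and Archive values on a blueprint."""
    lease_days: Optional[int]  # None = blank lease, the VM is permanent
    archive_days: int          # 0 = destroy as soon as the lease expires


def lifecycle_dates(provisioned: date,
                    policy: BlueprintPolicy) -> Tuple[Optional[date], Optional[date]]:
    """Return (expiry_date, destroy_date); both are None for a permanent VM."""
    if policy.lease_days is None:
        return None, None
    expiry = provisioned + timedelta(days=policy.lease_days)
    destroy = expiry + timedelta(days=policy.archive_days)
    return expiry, destroy


def state_on(today: date, provisioned: date, policy: BlueprintPolicy) -> str:
    """Classify the VM as 'active', 'archived' or 'destroyed' on a given day."""
    expiry, destroy = lifecycle_dates(provisioned, policy)
    if expiry is None or today < expiry:
        return "active"
    if today < destroy:
        return "archived"  # powered off, still in the vRA and vCenter inventory
    return "destroyed"


policy = BlueprintPolicy(lease_days=30, archive_days=14)
print(lifecycle_dates(date(2024, 1, 1), policy))
print(state_on(date(2024, 2, 1), date(2024, 1, 1), policy))
```

With an archive period of zero the expiry and destroy dates coincide, so the "archived" window, which is the only window in which the VM can simply be reactivated, disappears entirely.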

Destroy a VM: Once a VM is destroyed, it is removed from the vRA inventory, deleted from vCenter, and all of its resources are freed up to be used by vRA for other workloads. An entitled vRA user can elect to Destroy a VM directly rather than Expire it, in which case there is no archive period and the VM is destroyed immediately.

setvmactionsinbp

The ability to Destroy a VM can also be managed/restricted using vRA Entitlements, which can be applied to the appropriate vRA Users and Groups.
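The two guardrails can be pictured as a simple intersection: an action is available to a user only if the blueprint exposes it and an entitlement grants it. The sketch below is a toy model under that assumption, not the vRA API; `Entitlement` and `allowed_actions` are hypothetical names used for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass
class Entitlement:
    """Hypothetical model: a set of users/groups and the actions granted to them."""
    principals: Set[str]
    actions: Set[str]


def allowed_actions(user: str,
                    groups: Iterable[str],
                    blueprint_actions: Iterable[str],
                    entitlements: List[Entitlement]) -> Set[str]:
    """Actions the user may run: granted by an entitlement AND enabled on the blueprint."""
    who = {user} | set(groups)
    granted: Set[str] = set()
    for ent in entitlements:
        if who & ent.principals:  # the user, or one of their groups, is entitled
            granted |= ent.actions
    return granted & set(blueprint_actions)


entitlements = [
    Entitlement(principals={"developers"}, actions={"Power Off", "Expire"}),
    Entitlement(principals={"vm-admins"}, actions={"Power Off", "Expire", "Destroy"}),
]
blueprint = {"Power Off", "Expire", "Destroy"}

print("Destroy" in allowed_actions("alice", ["developers"], blueprint, entitlements))
print("Destroy" in allowed_actions("bob", ["vm-admins"], blueprint, entitlements))
```

In this model, leaving Destroy out of either the blueprint's actions or the user's entitlements is enough to take the button away.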

setentitlement

Official VMware documentation on VM Leases, Archive Periods, Reclamation and more can be found here

The following are some considerations with regard to what happens when the VM is finally deleted. vRA automates the removal of all the related configuration information, and that removal cascades throughout the system:

  • Metadata associated with single and multimachine blueprints, tracked and updated by various systems, is deleted
  • This includes, for example, IPAM updates, firewall rules, micro-segmentation specific to the blueprint, CMDB entries, etc.
  • Resources are deleted and released back to the various resource pools (CPU, memory, IP addresses, etc.)

So, provided you have a recoverable backup, recovering the VMs and data is only the first step: a lot of manual work remains before everything is fully restored.

That’s the main difference between restoring a VM that is still in the vRA inventory (still active, expired, or not yet deleted) and restoring a VM that has been deleted from vRA and vCenter.

EHC Backup-as-a-Service provides automated image-level backup and restore protection for virtual machines managed by vRA. More information is available on Creating a Backup Service Level, which includes a long-term retention period for the VM backup itself, and on Automated VM Protection.

With EHC BaaS, virtual machine Restore can be managed through the vRA portal so long as a VM has not been deleted/destroyed by vRA.

Any VM that has not been deleted retains all of its required IDs and its place in the vRA DB/inventory. In this case, when you perform an image-level restore of the VM with Avamar through the vRA portal, only the .vmdk files are restored within the VM, and its IDs never change within vCenter or vRA.

In the case of a VM that has been destroyed/deleted from vRA (and hence vCenter), a VM restore from the Avamar Admin UI is required. This restore will create a new VM within vCenter, which will then need to be imported into vRA. At that stage the VM itself is back, but we are then faced with the problem of manually restoring and re-registering the VM with the various systems and metadata that existed in its previous incarnation.
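The difference between the two scenarios can be summarised as a small decision helper. This is only a sketch of the steps described in this post: `restore_plan` is a hypothetical function that returns a checklist, it does not call vRA or Avamar, and the list of systems to re-register is taken from the examples above rather than being exhaustive.

```python
from typing import List

# Systems named in this post that lose their records when a VM is destroyed.
SYSTEMS_TO_REREGISTER = ["IPAM", "firewall rules", "micro-segmentation", "CMDB"]


def restore_plan(in_vra_inventory: bool, backup_available: bool) -> List[str]:
    """Return the ordered restore steps for a VM, as a human-readable checklist."""
    if not backup_available:
        raise ValueError("no recoverable backup: nothing to restore from")
    if in_vra_inventory:
        # Active, expired or archived: IDs in vRA and vCenter are unchanged.
        return ["Restore the VM image through the vRA portal"]
    # Destroyed: the restore creates a brand new VM with new IDs.
    steps = [
        "Restore the VM from the Avamar Admin UI (creates a new VM in vCenter)",
        "Import the new VM into vRA",
    ]
    steps += [f"Manually re-register the VM with: {name}" for name in SYSTEMS_TO_REREGISTER]
    return steps


for step in restore_plan(in_vra_inventory=False, backup_available=True):
    print("-", step)
```

One step versus six (or more) is the whole argument for a sensible archive period.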

To wrap up, some key takeaways from this:

  • Do protect your vRA VMs with a Backup (preferably EHC BaaS), and set a suitable long-term retention policy, so that even after the VM is deleted from vRA/vCenter, a full backup of the VM is still available to restore from. If not EHC BaaS, then please use something suitable that you will be able to restore from!!!
  • Set a suitable Lease period in the VM Blueprint. Leave the Lease value blank if the VMs deployed from this Blueprint are intended to be permanent. Maybe it’s better to err on the side of “Leave it be” rather than set too short a Lease period and have the VM archived and then destroyed. Maybe. It depends 🙂
  • Take care to set an appropriate Archive period in the VM Blueprint. The Archive period is your safety net: during it you can re-activate the VM while keeping all environmental settings and other system relationships intact.
  • Only allow the Destroy action on a VM blueprint or for a vRA user/group where required.

Do double-check with everybody you know before clicking that button and deleting the VM. Take a long walk, and then think about it some more 🙂

bigredbutton

Hope that helps!

 

 
