Disaster Recovery

Microsoft have updated their must-read document on Active Directory Forest Recovery.

“The guide contains best-practice recommendations for recovering an Active Directory forest if forest-wide failure renders all domain controllers in the forest incapable of functioning normally. The steps, which you must customize for your particular environment, describe how to recover the entire Active Directory forest to a point in time before the critical malfunction. They also ensure that none of the restored domain controllers replicate from a domain controller with potentially dangerous data.

The steps in this guide apply to Active Directory forests where the domain controllers run Microsoft® Windows Server 2012, Windows Server 2008 R2, Windows Server 2008, and Windows Server 2003 operating systems.”

Please ignore the fact that the document is titled “Windows Server 2008: Planning for Active Directory Forest Recovery”; it covers all supported versions of Windows Server that can run Active Directory.

April 2013 Update.

Download it here.

This week I have been implementing Quest Recovery Manager for Active Directory Forest Edition. As this is a global implementation, I have a requirement for more than one management console (if you know the product, you’ll know what I mean). The official documentation from Quest states that the process for this is:

Step 1: Export Data from Each Backup Registration Catalog to .Xml File
Step 2: Copy the .Xml File to the Forest Recovery Console Computer
Step 3: Provide the Exported Data to the Forest Recovery Console

Syntax

Get-RMADBackup | Export-RMADBackup -Path C:\Backup.xml

Import-RMADBackup -Path C:\Import\Backup.xml | Add-RMADBackup

Straightforward enough, except it did not work. It created a file with only one server in it, not all of the servers that the server was responsible for backing up (and knew about). On closer inspection I noticed that the file size, as the file was being created, kept going 0K, 12K, 0K, 12K…, as if the server details were being exported one at a time, each overwriting the last rather than being appended to the XML file. I managed to open a few of the files as they were being created and confirmed my suspicions.

Solution

I managed to get it to work by running the following commands wrapped in a PowerShell PS1 file.

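# Collect all registered backups into a variable first
$b = Get-RMADBackup
# Export the whole collection in one call instead of piping items one at a time
Export-RMADBackup -Path C:\RMADFE\backup.xml -InputObject $b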

The XML files produced were now all over 1MB (and growing), so I knew that I now had a solution that worked.

(NB. I got around step two by implementing a DFS-R replicated DFS-Namespace and saving all files and scripts in the replicated target.)

Hope this helps and saves you some of the frustration I went through.

Microsoft have updated their Active Directory Forest Recovery whitepaper to reference Windows Server 2008 R2.

Hopefully nobody ever has to go through a forest recovery, but just in case the day ever arises, you should practice a forest recovery regularly. You do not want to be learning how to do this on the fly with the Director of your organisation watching your every move because his entire organisation cannot work.

Note: please ensure you have the latest version of the “Planning for Active Directory Forest Recovery” paper, which was recently updated and then quickly republished to correct an error in the procedures for re-installing DNS. The error is also covered in KB 975654.

Download (updated 18/03/2010)

If you have upgraded your Active Directory from Windows 2000 to Windows Server 2003 SP1, 2008 or 2008 R2 (or if you installed a pristine Windows 2003/2003 R2 forest), there is a high probability that you have overlooked updating the Active Directory tombstone lifetime from 60 days to the new default of 180 days.

The tombstone lifetime needs to exceed the expected replication latency between all domain controllers in a forest, and it should be set correctly because it affects backup cycles and disaster recovery. Restoring domain controllers or objects from a backup that is older than the tombstone lifetime is not permitted, so if you expect to have a 180 day window of opportunity but the value is still set to 60 days, this can cause issues.
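To make the backup window concrete, here is a minimal Python sketch of the age check described above. The function name and the sample dates are hypothetical, and treating a backup whose age exactly equals the tombstone lifetime as still usable is an assumption; the text only says that backups that have exceeded the lifetime cannot be restored.

```python
from datetime import date


def backup_is_restorable(backup_date: date, today: date,
                         tombstone_lifetime_days: int) -> bool:
    """Return True if the backup has not exceeded the tombstone lifetime.

    Assumption: a backup whose age equals the lifetime is still usable.
    """
    age_days = (today - backup_date).days
    return age_days <= tombstone_lifetime_days


# Hypothetical example: a backup taken 90 days before "today".
backup_taken = date(2010, 1, 1)
today = date(2010, 4, 1)

# With the expected 180 day lifetime the backup is still usable,
# but with the overlooked 60 day lifetime it is not.
usable_at_180 = backup_is_restorable(backup_taken, today, 180)
usable_at_60 = backup_is_restorable(backup_taken, today, 60)
```

The same 90 day old backup passes the check at 180 days and fails it at 60, which is the gap that catches people out.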

To determine the tombstone lifetime for the forest using ADSIEdit:

Click Start, Run then type adsiedit.msc.

In ADSI Edit

Select Action

Select Connect to

Select Connection Point

Click Select a well known Naming Context, then select Configuration.

If you want to connect to a different domain controller, in the Computer section click Select or type a domain or server: (Server | Domain [:port]).

Provide the server name or the domain name and the Lightweight Directory Access Protocol (LDAP) port (389), then click OK.

Double-click Configuration, CN=Configuration,DC=ForestRootDomainName, CN=Services, and CN=Windows NT.

Right-click CN=Directory Service, select Properties.

In the Attribute column click tombstoneLifetime.

If the value is <not set>, the default value is in effect as follows:

On a domain controller in a forest that was created on a domain controller running Windows Server 2003 with Service Pack 1 (SP1), Windows Server 2003 with Service Pack 2 (SP2), Windows Server 2008, or Windows Server 2008 R2, the default value is 180 days.

On a domain controller in a forest that was created on a domain controller running Windows 2000 Server, Windows Server 2003, or Windows Server 2003 R2, the default value is 60 days.

If the value of tombstoneLifetime is <not set>, the value is always 60 days.
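The rule in the last statement can be sketched in a few lines of Python. This is only an illustration of that statement, not of how Active Directory evaluates the attribute internally: the function and constant names are hypothetical, and None stands in for an attribute that has no value.

```python
from typing import Optional

# Per the text: an attribute with no value always means 60 days.
UNSET_LIFETIME_DAYS = 60


def effective_tombstone_lifetime(attribute_value: Optional[int]) -> int:
    """Return the lifetime in days implied by the tombstoneLifetime attribute.

    None represents an attribute with no value (hypothetical convention).
    """
    if attribute_value is None:
        return UNSET_LIFETIME_DAYS
    return attribute_value


def below_expected_lifetime(attribute_value: Optional[int],
                            expected_days: int = 180) -> bool:
    """Return True if the effective lifetime is shorter than expected."""
    return effective_tombstone_lifetime(attribute_value) < expected_days
```

For example, `below_expected_lifetime(None)` flags a forest whose attribute was never populated, while `below_expected_lifetime(180)` does not.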

To change the setting from 60 days to 180 days:

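Change the tombstoneLifetime value to 180 if your forest has the incorrect value. The key expression in the above is “created on”: the default depends on the operating system the forest was created on, not on what it has since been upgraded to. This assumes that you have not set this value for some other business reason.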

This has been confirmed with fellow DS MVPs and validated in the source code. The KB article will be updated.

UPDATE. 11/02/10

After further discussions with other MVPs on raising the tombstone lifetime from 60 to 180 days to match the new default, there is one extra factor which needs to be taken into consideration.

Suppose company X has two (or many) domain controllers, one in London and one in Sydney, and the tombstoneLifetime is changed to 180 on the London DC.

By the time garbage collection next runs on the London DC, it has already cleaned up all tombstones older than 60 days, but the London DC must now keep tombstones for 180 days. As a result, the garbage collection process on the London DC has nothing to do for the next 120 days.

Meanwhile, on the other side of the world, the Sydney DC has not yet received the new tombstoneLifetime via replication, so it runs the garbage collection process and cleans up items deleted 60 days ago.

In this scenario the London DC may now hold tombstones that were already cleaned up on the Sydney DC, leading various detection mechanisms to identify them as lingering objects.

The presence of lingering objects will prevent operations like schema updates for the next 120 days. The issue is self-resolving, but having to wait 120 days is not ideal. To avoid this issue, ensure garbage collection does not run, and force replication immediately after making the change to Active Directory to ensure consistency.
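The London and Sydney scenario can be worked through with a small Python sketch. It is a simplified model under stated assumptions, not a description of the real garbage collection algorithm: the function name is hypothetical, ages are whole days, garbage collection is assumed to run at least once a day on every DC, and a tombstone is assumed to become eligible for collection once its age reaches the lifetime in force on that DC.

```python
from typing import List


def tombstones_at_risk(ages_at_change: List[int],
                       old_lifetime: int,
                       new_lifetime: int,
                       replication_lag_days: int) -> List[int]:
    """Return the ages (at the time of the change) of tombstones that a
    lagging DC collects while the updated DC still keeps them.

    ages_at_change: tombstone ages in days when the lifetime is raised.
    replication_lag_days: days until the lagging DC receives the new value.
    """
    at_risk = []
    for age in ages_at_change:
        if age >= old_lifetime:
            # Assumed to have been collected on every DC before the change.
            continue
        age_after_lag = age + replication_lag_days
        collected_on_lagging_dc = age_after_lag >= old_lifetime
        kept_on_updated_dc = age_after_lag < new_lifetime
        if collected_on_lagging_dc and kept_on_updated_dc:
            at_risk.append(age)
    return at_risk


# Hypothetical example: lifetime raised from 60 to 180 days, with the
# change taking 3 days to reach the lagging DC.
risky = tombstones_at_risk([10, 55, 59], old_lifetime=60,
                           new_lifetime=180, replication_lag_days=3)
```

In this model, only tombstones that cross the old 60 day threshold during the replication lag end up inconsistent, and with a lag of zero days the list is empty, which is why forcing replication immediately after the change matters.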

A suggested approach to resolving this issue was to lower the tombstone value to, say, 50 days, wait for that to fully replicate and for the garbage collection process to run, and then increase the tombstoneLifetime value to 180.

Comments welcome.