Best Practices for Implementing Schema Updates, or: How I Learned to Stop Worrying and Love the Forest Recovery

28 May 2012 6:02 PM

Note:  This is general best practice guidance for implementing schema extensions, not the testing of their functionality.  There may be some additional best practices around design and functionality of schema extensions that should be considered.  Understand that the implementation of a schema extension may well succeed, but the functionality around the extension may not behave as expected.

As with any change to the Active Directory infrastructure, the two primary concerns around implementing a schema extension are:

1. Have you tested it, so you can be reasonably sure it will behave as expected when implemented in production?

2. Do you have a roll-back plan?  And is it tested?

Digging into the details of each of these is where things get a little stickier.  However, having personally helped customers with dozens of schema updates, I can honestly say that staying within best practices isn’t that hard, and definitely makes implementation less risky and less stressful.

Have you tested your schema update, so you can be reasonably sure it will behave as expected when implemented in production?

The reason this question gets so sticky is that customers either don’t have a test environment, or they don’t have a test environment that reasonably reflects the production environment.  With respect to testing a schema extension, the best test environment is one that has an identical schema to the production environment.  How can you build and/or maintain a test environment that has a schema that is identical to production?

1. Maintain a test Active Directory environment.  On an ongoing basis, be sure to apply all schema extensions to your test environment that you do to your production environment.

2. Build a test Active Directory environment, then synchronize the schema to production.  Specifically:

a. Start by building the test environment to the same AD version as production.  That is, if all your production DCs are Windows Server 2003 or lower, make sure your test environment has a 2003 schema.  If the production schema has been extended to 2008 R2, apply the 2008 R2 schema extensions to your test environment.

b. Apply any other known production schema extensions to the test environment.  This includes things like Exchange, OCS, Lync or SCCM.

c. Fellow PFE Ashley McGlone has a cool PowerShell script that will analyze your production schema for other extensions, to help you “remember” any other schema extensions.

d. AD LDS (formerly known as ADAM) has an awesome schema analyzer tool that will compare two schemas and prepare an LDIF file so you can actually synchronize the schemas.  You should definitely use this tool to sync any remaining differences between your production and test schemas.

3. Perform a Forest Recovery Test on your production forest.  (Please be sure you isolate your recovery environment when you test forest recovery).  Your recovered forest will most certainly have an identical schema to production.  Perform your schema update test on this recovered environment.

Typically people will shy away from #3 because it seems the hardest (and potentially the most dangerous, if you forget to fully isolate the recovered forest).  However, based on my experience, I think #3 is the best option.  Why?  Because it forces you to do something you should be doing anyway (see the section below), and there is no doubt that the schema in your test/recovered environment will be the same as the schema in production.
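Whichever option you choose, a quick sanity check that the test schema really matches production is cheap.  The sketch below is only an illustration, not a replacement for the AD LDS schema analyzer: it assumes you already have an LDIF export of each schema partition as text (for example, one produced with ldifde), and it compares nothing but the lDAPDisplayName values.  The function names, the sample domain names and the contosoBadgeId attribute are all made up for the example.

```python
def schema_names(ldif_text):
    """Collect lDAPDisplayName values from an LDIF export of a schema partition."""
    names = set()
    for line in ldif_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep or key.strip().lower() != "ldapdisplayname":
            continue
        if value.startswith(":"):
            # "attr:: ..." means a base64-encoded value; this sketch skips those.
            continue
        names.add(value.strip())
    return names


def schema_diff(prod_ldif, test_ldif):
    """Report names present in one export but not in the other."""
    prod = schema_names(prod_ldif)
    test = schema_names(test_ldif)
    return {
        "missing_in_test": sorted(prod - test),
        "extra_in_test": sorted(test - prod),
    }


# Hypothetical, heavily trimmed exports used only to exercise the functions.
prod_export = """\
dn: CN=User,CN=Schema,CN=Configuration,DC=corp,DC=example
lDAPDisplayName: user

dn: CN=contoso-Badge-Id,CN=Schema,CN=Configuration,DC=corp,DC=example
lDAPDisplayName: contosoBadgeId
"""

test_export = """\
dn: CN=User,CN=Schema,CN=Configuration,DC=lab,DC=example
lDAPDisplayName: user
"""

result = schema_diff(prod_export, test_export)
print(result)
```

Anything listed under missing_in_test is a production extension that still needs to be applied to the test forest before your schema update test means much.  An empty report only says the names line up; it does not prove that the attribute and class definitions themselves are identical.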

Do you have a roll-back plan?  And is it tested?

There’s no delicate way of saying this, so I’m just going to say it:

The only supported/guaranteed way to roll back a schema change is a full forest recovery.

Thus, the best (only?) roll-back plan is a well-designed, documented and tested forest recovery plan.  I know it sounds harsh (and it is), but you must be prepared for forest recovery.  A couple points to make this otherwise bitter pill a bit easier to swallow:

1. You should have a documented and tested forest recovery plan anyway.  It’s a general best practice.  You’ve probably been ignoring it for a while, so if you’re serious about a roll-back plan for your schema update, now is the time to get serious about documenting and testing a forest recovery plan.

2. It’s not as hard as it appears.  But it is very unforgiving in the details.  We’ve got a great whitepaper to help you through the details.

3. You can actually kill two birds with one stone here.  The forest recovery test will actually generate a great test environment for testing your schema extension (see option #3, above, for testing schema updates).

If you’ve avoided testing forest recovery this long, I expect you won’t go down without a fight.  Here are some of the “alternatives” I’ve heard people propose as potential roll-back strategies:

1. Disable inbound/outbound replication on the schema master.  Then perform the schema update on the schema master.  Any badness is contained to the schema master.  If something goes bad, blow up the schema master and repair the rest of the forest (seize schema master on another DC and clean out the old schema master).

2. Shut down/stop replication on select DCs.  Do the schema upgrade, and if something goes bad, kill all the DCs that were on-line and may have potentially replicated the “badness”.  Light up the DCs that were offline and repair/restore your forest.

Typically, I don’t like to go down those rabbit-holes.  First, choosing one of those strategies still does not absolve you from needing a documented and tested forest recovery plan.  Second, either of those strategies requires a good bit of work in preparing and executing.  Failure to execute properly could be disastrous.  Third, if I’m upgrading the schema I like to make sure AD replication is healthy before, during and after the update.  Taking DCs offline, or isolating them, significantly impairs the ability to check health, because you need to be on your toes to distinguish real errors from self-inflicted errors (caused by the isolation).  Finally, be aware that for some schema upgrades (ADPREP specifically), Microsoft recommends against disabling replication on the schema master.  Also, check out another strong recommendation against isolation.
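To make the "before, during and after" health check repeatable, you can capture replication status as CSV and scan it for failures.  The sketch below is one possible approach, not an official tool: it assumes CSV text in the shape that repadmin /showrepl * /csv produces, and it assumes the column names "Number of Failures", "Source DSA", "Destination DSA" and "Naming Context".  Verify those headers against your own output before relying on it.  The DC names and sample rows are invented.

```python
import csv
import io


def failing_links(showrepl_csv):
    """Return (source, destination, naming context, failures) for unhealthy links."""
    problems = []
    for row in csv.DictReader(io.StringIO(showrepl_csv)):
        # Treat a missing or empty failure count as zero.
        failures = int((row.get("Number of Failures") or "0").strip() or 0)
        if failures > 0:
            problems.append(
                (row["Source DSA"], row["Destination DSA"],
                 row["Naming Context"], failures)
            )
    return problems


# Hypothetical sample; the header row is an assumption about the CSV layout.
sample = (
    "showrepl_COLUMNS,Destination DSA Site,Destination DSA,Naming Context,"
    "Source DSA Site,Source DSA,Transport Type,Number of Failures,"
    "Last Failure Time,Last Success Time,Last Failure Status\n"
    'showrepl_INFO,Site1,DC01,"CN=Schema,CN=Configuration,DC=corp,DC=example",'
    "Site1,DC02,RPC,0,0,2012-05-28 17:00:00,0\n"
    'showrepl_INFO,Site1,DC03,"CN=Schema,CN=Configuration,DC=corp,DC=example",'
    "Site1,DC02,RPC,3,2012-05-28 17:05:00,2012-05-27 09:00:00,1722\n"
)

for source, dest, context, count in failing_links(sample):
    print(f"{source} -> {dest} [{context}]: {count} consecutive failure(s)")
```

Run the same check before the update, while it replicates and after it completes.  If every DC is on-line and replicating normally, any link that shows up here is a real problem rather than a side effect of isolation.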

Thus, I would recommend investing your valuable resources in a forest recovery test, and a schema extension test (on the recovered forest).  After that, there’s not a lot of value in additional risk-mitigation strategies like schema master isolation.  If you’ve tested the schema extension and validated recovery you’ve done your due diligence, so know the odds are monumentally in your favor.  Schema extensions, especially Microsoft-packaged schema extensions, have a proven and well-tested track record.  And real-life examples of customers needing to perform a production forest-recovery are almost non-existent.

Put it all together and it’s really quite simple

Get yourself in the habit of preparing for all schema extensions with a one-two step.  First, test your forest recovery plans.  Second, test your schema extensions in your recovery environment and in any other test/non-production environments you may have.  The first time you perform the exercise, be sure to document it.  Every subsequent time, be sure to review/update your documentation.  You can then be confident that you’ve done everything possible to ensure the schema extension goes off without a hitch.
