Manage Database Redundancy
Mailbox databases have now been moved to the organizational level as mentioned in Pt 1. Exchange 2010 introduces a radical new method of providing database redundancy. Databases can now be kept as multiple copies on different servers.
- A logical group of mailbox servers is called a 'Database Availability Group' or DAG for short.
- A DAG can contain up to 16 mailbox servers (which can also hold other Exchange roles)
- These servers can be on different subnets
- A DAG can have up to 16 copies of a database (with up to 100 databases per server)
- Within a DAG, one copy of the database is active while the other copies are passive.
- When a change is made to the active copy, it is recorded in the transaction log. When the log file becomes full, it is closed and replicated to the passive copies on other servers. The replicated transaction logs are then replayed into the passive databases, which keeps the passive copies up to date (log shipping and replay).
- If the active database is lost, a passive copy is activated automatically and becomes the active copy. This is called a failover. An administrator can also trigger the change manually, which is called a switchover.
Create a DAG
A DAG consists of three primary components:
- Name
- IP address
- Witness location
The name follows NetBIOS convention and the IP address can be assigned by DHCP (not personally recommended) or set statically. If the servers are on different subnets, the DAG should be given an IP address on each of those networks. Because the DAG uses Server 2008 clustering features, the quorum model is based on a file share witness. The file share witness is used when the number of nodes in the cluster is even, where it has a vote in deciding which node should be active. The witness can be located on any server (but not on a server in the DAG) and its path is configurable.
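To make the voting rule concrete, here is a minimal sketch in plain PowerShell. The function name Get-QuorumSummary is hypothetical (it is not an Exchange or clustering cmdlet), and it only models the rule described above: the witness gets a vote when the node count is even, and a majority of votes is needed.

```powershell
# Hypothetical helper, not part of Exchange: models the quorum rule described above.
function Get-QuorumSummary {
    param([int]$NodeCount)

    # The file share witness only votes when the number of nodes is even.
    $witnessVotes = ($NodeCount % 2 -eq 0)
    $totalVotes = if ($witnessVotes) { $NodeCount + 1 } else { $NodeCount }

    # A majority is more than half of all votes.
    $votesNeeded = [int]([math]::Floor($totalVotes / 2) + 1)

    New-Object PSObject -Property @{
        Nodes        = $NodeCount
        WitnessVotes = $witnessVotes
        TotalVotes   = $totalVotes
        VotesNeeded  = $votesNeeded
    }
}

# Example: the two-node DAG built in this post.
Get-QuorumSummary -NodeCount 2
```

For the two-node DAG in this post, the sketch reports three votes in total, two of which are needed, which is why the witness matters when one of the two servers is down.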
The following commands can be used to construct a DAG:
[PS] New-DatabaseAvailabilityGroup -Name DAG1 -DatabaseAvailabilityGroupIPAddress 192.168.2.100
You might now receive an error like the following:
WARNING: The operation wasn't successful because an error was encountered. You may find more details in log file "C:\ExchangeSetupLogs\DagTasks\dagtask_2010-02-28_21-56-49.338_new-databaseavailabiltygroup.log". The task was unable to find any Hub Transport servers without the Mailbox server role in the local Active Directory site. Please manually specify a witness server.
You will have to specify the witness location manually. If you place the witness on a DC, the “Exchange Trusted Subsystem” security group has to be added to the local Administrators group of that server, and the DC computer account has to be added to the Exchange Trusted Subsystem group. This is not ideal. Best practice is to place the witness on a Hub Transport server (not one in the DAG!). Also, do NOT create the shared folder ahead of time; let the cmdlet do all the work.
To set the witness location as well as creating a DAG, type the following:
[PS] New-DatabaseAvailabilityGroup DAG1 -WitnessServer SRV1.compulinx.com -WitnessDirectory c:\DAG1witness -DatabaseAvailabilityGroupIPAddress 192.168.2.100
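Once the DAG exists, it can help to read its settings back before adding servers. The following is a sketch that assumes the DAG name DAG1 from the command above; the property list is just a selection I find useful.

```powershell
# Read back the DAG that was just created (DAG1 is the name used in this post).
Get-DatabaseAvailabilityGroup -Identity DAG1 |
    Format-List Name, WitnessServer, WitnessDirectory, DatabaseAvailabilityGroupIpv4Addresses
```

The witness server and directory shown should match the values passed to New-DatabaseAvailabilityGroup.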
Add Servers to the DAG
We must now add mailbox servers to the DAG just created. As mentioned, these Exchange servers could in fact hold multiple roles. The servers should have two network cards: one NIC carries replication traffic between servers and the other carries MAPI traffic. Remember that the CAS server role is now the RPC endpoint for Outlook clients. These email clients connect to CAS servers, which in turn communicate with the mailbox servers by RPC (very different from Exchange 2007). This connection to CAS servers is possible because they now run the RPC Client Access Service. This will be discussed in another post (CAS arrays).
Type the following to add a server to your DAG:
[PS] Add-DatabaseAvailabilityGroupServer DAG1 -MailboxServer SRV210
This should take about 20-40 seconds to complete. The cmdlet automatically installs the Failover Clustering component. Failover Clustering is a feature of Windows Server 2008 R2 that is unavailable in Standard Edition, so the servers must run Enterprise Edition.
Now add your second server to the same DAG:
[PS] Add-DatabaseAvailabilityGroupServer DAG1 -MailboxServer SRV211
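If you have several servers to add, the two commands above can be written as a loop instead. This is only a sketch of an equivalent alternative (run it in place of the individual commands, not after them); the variable names are my own.

```powershell
# Server names are the two used in this post; extend the list as needed.
$servers = 'SRV210', 'SRV211'

foreach ($server in $servers) {
    Add-DatabaseAvailabilityGroupServer -Identity DAG1 -MailboxServer $server
}

# List the members that the DAG now reports.
Get-DatabaseAvailabilityGroup -Identity DAG1 | Format-List Name, Servers
```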
If you need to change the location of the witness resource:
[PS] Set-DatabaseAvailabilityGroup DAG1 -WitnessServer SRV212 -WitnessDirectory c:\somelocation
So far we have created a DAG, defined the location of the witness and added two servers to the DAG. The next step is to add a database that needs to be replicated between the servers. Since SRV210 already hosts a database (DB1, which we created earlier), one copy of the database exists. This needs to be replicated to SRV211.
[PS] Add-MailboxDatabaseCopy DB1 -MailboxServer SRV211
[PS] Get-MailboxDatabaseCopyStatus -Identity DB1
The last command should show you that the database has replicated. Notice the replica is 'healthy' and the original and active version is 'mounted'.
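To see how far behind a passive copy is, the same cmdlet can be asked for its queue lengths. A sketch, assuming the database name DB1 used above:

```powershell
# CopyQueueLength: logs waiting to be copied to the passive copy.
# ReplayQueueLength: copied logs waiting to be replayed into the passive copy.
Get-MailboxDatabaseCopyStatus -Identity DB1 |
    Format-Table Name, Status, CopyQueueLength, ReplayQueueLength -AutoSize
```

On a healthy, idle system both queue lengths for the passive copy should be at or near zero.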
When we created the DAG, a network for replication was automatically established. This is called a DatabaseAvailabilityGroupNetwork. Because we have two network cards in our servers you should see two networks: DAGNetwork01 and DAGNetwork02. To see this type the following:
[PS] Get-DatabaseAvailabilityGroupNetwork | ft name,identity,replicationenabled,subnets,interfaces -au
This will show you that these networks are used for replication of database information. Considering our servers have two network cards, we can use one of the networks for MAPI traffic. This is traffic from CAS servers.
To set DAGNetwork02 for MAPI traffic, type the following:
[PS] Set-DatabaseAvailabilityGroupNetwork "DAG1\DAGNetwork02" -ReplicationEnabled $False
Manual Seeding Database Replicas
There are times when you will need to force replication to a database copy because it is out of sync with the active original. This can be caused by the following situations:
- When a replica is brought back online after an extended downtime
- Log file corruption
- Database corruption
- Extended WAN outage (assuming the replicas are in different sites)
Reseeding a replica involves three steps:
- Suspend replication
- Update the replica (reseeding)
- Start replication
1. To suspend replication use the following cmdlet. This will suspend replication to the database replica on SRV211:
[PS] Suspend-MailboxDatabaseCopy db1\srv211
2. To manually reseed the replica database type the following. Notice that you have to delete any existing files for it to work:
[PS] Update-MailboxDatabaseCopy db1\srv211 -SourceServer srv210 -DeleteExistingFiles
3. To resume replication type the following:
[PS] Resume-MailboxDatabaseCopy DB1\SRV211
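If you reseed often, the three steps can be wrapped in a small function. This is a sketch only: Invoke-CopyReseed is a hypothetical name of my own, and the function simply chains the cmdlets shown above and then reports the copy status.

```powershell
# Hypothetical helper: wraps the three reseed steps for one database copy.
function Invoke-CopyReseed {
    param(
        [string]$Copy,          # for example 'DB1\SRV211'
        [string]$SourceServer   # for example 'SRV210'
    )

    # 1. Suspend replication to the copy (skip the confirmation prompt).
    Suspend-MailboxDatabaseCopy -Identity $Copy -Confirm:$false

    # 2. Reseed the copy, removing any existing files first.
    Update-MailboxDatabaseCopy -Identity $Copy -SourceServer $SourceServer -DeleteExistingFiles

    # 3. Resume replication and show the resulting status.
    Resume-MailboxDatabaseCopy -Identity $Copy
    Get-MailboxDatabaseCopyStatus -Identity $Copy
}

Invoke-CopyReseed -Copy 'DB1\SRV211' -SourceServer 'SRV210'
```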
There are two additional settings that affect how database copies handle logs and failover:
- ReplayLagTime: the time that passes before replicated logs are replayed into the passive replicas. This can be useful if you are concerned about replaying a corrupted log into a passive copy.
- TruncationLagTime: the amount of time that passes before a log file can be deleted on a passive copy of the database.
The following example sets the ReplayLagTime to a day and the TruncationLagTime to a week:
[PS] Set-MailboxDatabaseCopy DB1\SRV211 -ReplayLagTime 1.0:0:0 -TruncationLagTime 7.0:0:0
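The lag values use the days.hours:minutes:seconds format. As a quick sanity check, the sketch below parses the two values with the .NET TimeSpan type and then reads the stored settings back from the database. DB1 is assumed to be the database created earlier; substitute your own name if it differs.

```powershell
# Show how the two lag values are interpreted (they should come out as 1 and 7 days).
[TimeSpan]::Parse('1.0:0:0').TotalDays
[TimeSpan]::Parse('7.0:0:0').TotalDays

# Read back the lag settings stored on the database (adjust the name as needed).
Get-MailboxDatabase -Identity DB1 | Format-List Name, ReplayLagTimes, TruncationLagTimes
```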
Failovers (When It All Goes Wrong!)
A failover occurs automatically, with no administrator intervention. You can also swap the active and passive copies manually with the following cmdlet while both copies are still up and running. This is called a switchover.
[PS] Move-ActiveMailboxDatabase "DB1" -ActivateOnServer SRV211 -MountDialOverride:None
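Before a planned switchover it is worth checking that the target copy is healthy and fully caught up. The following is a sketch under my own assumptions: the copy name DB1\SRV211 comes from earlier in this post, and the "healthy with an empty copy queue" rule is my choice rather than an Exchange requirement.

```powershell
# Check the passive copy on the target server before moving the active copy.
$copy = Get-MailboxDatabaseCopyStatus -Identity 'DB1\SRV211'

if ($copy.Status -eq 'Healthy' -and $copy.CopyQueueLength -eq 0) {
    # Target is caught up: perform the switchover.
    Move-ActiveMailboxDatabase 'DB1' -ActivateOnServer SRV211 -MountDialOverride:None
} else {
    Write-Warning "Copy DB1\SRV211 is not ready (status: $($copy.Status), copy queue: $($copy.CopyQueueLength))."
}

# Afterwards, the copy on SRV211 should show as Mounted.
Get-MailboxDatabaseCopyStatus -Identity DB1 | Format-Table Name, Status -AutoSize
```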