It is easy to dismiss the sizing of your Active Directory infrastructure simply based on CPU and RAM resource consumption; the resources generally appear to have minimal usage. However, Active Directory [Domain Services] is a little more complicated than that and Exchange Server is the canary in the coal mine, so to speak. There are some fairly clearly documented metrics for sizing domain controllers, but there are many details that remain.
Before you can go about sizing Active Directory for Exchange, you must first size Exchange properly. I won’t go into significant detail here as this is a vast subject in its own right, but it starts by understanding the best practices for Exchange and applying the Exchange Server Role Sizing Calculator to deliver a sane recommendation on the number and placement of servers. Considerations for input include the number and usage patterns of various mailboxes (profiles); essentially, the tool allows you to define four profiles that include the size of mailboxes, the average message size, and other metrics, and then to define the number of mailboxes in each profile. The profiles will be rather broad as they will be averages of certain types of users. Beyond that, the decision points around resiliency and hardware play into the overall sizing. There are a few broad recommendations related to Exchange sizing:
- Virtualization is not preferred – virtualization is completely supported for Exchange, but it doesn’t make sense for on-premises environments, especially medium-to-large deployments (smaller deployments and hybrid co-existence servers being the exception). It comes with some overhead (10%) and the requirements to properly virtualize mean that resources need to be dedicated to Exchange. It can be done, but doing it properly is rather costly and won’t generally make sense.
- Scale out rather than scale up – this has long been the mantra for Exchange, and it is more apt than ever when deploying Exchange in a highly available configuration because failover of a server causes additional work for each mailbox that is failed over. Adding more servers dilutes the overall user count per server and reduces the number of operations required in any failover scenario.
With that said, a sizing exercise for Exchange will output a key metric with regard to Active Directory sizing: the number of CPU cores required for Exchange in each Active Directory site that contains Exchange servers. Microsoft recommends that any given server have no more than two (2) sockets, with each socket limited to a core count of about 10-12. The main reason here is that as of Exchange Server 2013, most of Exchange is written in .NET managed code, and increasing the number of CPU cores, by itself, will increase the amount of RAM that is preallocated for the system; basically, it inflates the amount of required RAM for no other benefit.
If we were to design a solution for 10 thousand mailboxes and the sizing calculator validated six (6) Exchange Servers in each of two Active Directory sites with 24 CPU cores each, then each Active Directory site would have 144 Exchange Server CPU cores. This is the primary metric we need to begin sizing Active Directory.
Active Directory CPU Cores
The Exchange team published guidance that is based on the number of Exchange Server CPU cores, and there are two recommendations that align with the CPU type used by the domain controllers. If you are still running 32-Bit domain controllers, then the guidance is to have one (1) Active Directory CPU core for every four (4) Exchange Server CPU cores in a site; if you are running 64-Bit domain controllers, the guidance is to have one (1) Active Directory CPU core for every eight (8) Exchange Server CPU cores in a site. The best practice would be to have 64-Bit Windows Server for your domain controllers. If you’re planning to use Exchange Server 2016, your Forest and Domain Functional Levels will need to be at least Windows Server 2008, and Windows Server 2008 R2 and later are only offered in 64-Bit. If this is a point of contention, remediating the overall Active Directory sizing is a good opportunity to move to all 64-Bit domain controllers if you haven’t already done so.
For our environment, 144 Exchange Server CPU cores per site means that we require 36 AD CPU cores with 32-Bit domain controllers (144 / 4) or 18 AD CPU cores with 64-Bit domain controllers (144 / 8). There are some general best practices for sizing domain controllers, but my recommendation would be to stay between 2-4 CPU cores per domain controller. Limiting the number of CPU cores per DC is recommended for a couple of reasons:
- Virtualization – many organizations virtualize domain controllers, along with just about everything else. While the recommendation for Exchange Server is to prefer physical servers, it is fine to have virtualized domain controllers as long as the broader requirements for virtualized DCs are followed. Hypervisors have historically had task prioritization issues with guests that have numerous CPU cores assigned. The ultimate best practice has been to limit guests to a single CPU core whenever possible, but having 18 domain controllers just to have single CPU cores allocated comes with its own challenges. 2-4 is a reasonable compromise. You could also consider choosing a per-DC core count that divides evenly into your required AD CPU cores so that there is no remainder. For 64-Bit domain controllers, 6 CPU cores per DC would mean 3 DCs per site, 4 CPU cores would mean 5 DCs per site (4.5 rounded up), and 2 CPU cores would mean 9 DCs per site. 4 seems the most reasonable in this case.
- Scale out rather than scale up – just as with Exchange, there is cause to scale out domain controllers rather than scaling up. This will be discussed further when we consider resiliency. In addition, there are limits that are per server, so by having more servers, we can serve a greater number of overall operations.
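The core-count arithmetic above can be sketched in a few lines of Python. This is only an illustration of the ratios described in this post; the function names are my own and are not part of the sizing calculator or any Microsoft tool.

```python
import math


def ad_cores_required(exchange_cores: int, is_64bit: bool = True) -> int:
    """AD CPU cores needed in a site, using the 1:8 (64-Bit) or 1:4 (32-Bit) ratio."""
    ratio = 8 if is_64bit else 4
    return math.ceil(exchange_cores / ratio)


def dcs_required(ad_cores: int, cores_per_dc: int) -> int:
    """Domain controllers needed to supply ad_cores, before any redundancy."""
    return math.ceil(ad_cores / cores_per_dc)


# Example from this post: six Exchange servers with 24 cores each in one site.
exchange_cores = 6 * 24
ad_cores = ad_cores_required(exchange_cores, is_64bit=True)

for cores_per_dc in (2, 4, 6):
    print(f"{cores_per_dc} cores per DC -> {dcs_required(ad_cores, cores_per_dc)} DCs")
```

Rounding up matters here: 18 AD cores spread over 4-core DCs is 4.5, which means five DCs rather than four.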
Beyond CPU cores, RAM is another important consideration and it has a rather simple outcome. The best practice is to have 4GB of RAM plus enough RAM to keep the NTDS.DIT file in memory. This file is the Active Directory database itself. Being able to keep it in memory will give you better performance. Simple enough.
NOTE: This recommendation is a simplified model that isn’t quite as precise as others out there, but its final results are usually within 5% of recommendations. Those models start with 2GB of RAM plus the size of the NTDS.DIT, and then account for agents (antivirus, monitoring, security, etc.), utilities, backups, and other services (DNS, DHCP, etc.) that might be loaded on a domain controller.
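As a rough sketch, the two RAM models described above look like this in Python. The NTDS.DIT size and the overhead figure for agents and other services are made-up inputs for illustration only; measure your own environment before relying on either number.

```python
import math


def simple_dc_ram_gb(ntds_dit_gb: float) -> int:
    """Simplified model: 4 GB plus the size of NTDS.DIT, rounded up to whole GB."""
    return math.ceil(4 + ntds_dit_gb)


def detailed_dc_ram_gb(ntds_dit_gb: float, extras_gb: float) -> int:
    """More detailed model: 2 GB plus NTDS.DIT plus everything else on the DC.

    extras_gb is the combined estimate for agents, utilities, backups,
    and other services (DNS, DHCP, etc.).
    """
    return math.ceil(2 + ntds_dit_gb + extras_gb)


# Hypothetical inputs: a 6.5 GB NTDS.DIT and 1.5 GB of agents and services.
print(simple_dc_ram_gb(6.5))
print(detailed_dc_ram_gb(6.5, extras_gb=1.5))
```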
If we go with 4 CPU cores per DC, then it is necessary to have five (5) DCs in each of the datacenters with Exchange servers. But what happens if a DC goes offline? Well, here we rely on our tried and true N+1 guidance. So, given this, we will need six (6) DCs in each datacenter to accommodate this requirement. This also lends itself towards scaling out versus scaling up, because having fewer servers means more CPU cores per server, and each redundant server then adds more to the overall CPU core count for domain controllers: if you have 6 CPU cores per server, each redundant server adds 6 CPU cores, whereas a solution with 4 CPU cores per server adds only 4 CPU cores for each redundant server. In our scenario, the numbers are a little off because 18 does not divide evenly by 4 (it leaves a remainder of 2 CPU cores, which forces a fifth DC); in this situation, 24 CPU cores would be required either way (six 4-core DCs or four 6-core DCs).
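To make the N+1 comparison concrete, here is a small Python sketch that totals the DC count and the CPU cores provisioned for a few per-DC core counts. The function name and return shape are my own, chosen for illustration.

```python
import math


def n_plus_one_plan(ad_cores: int, cores_per_dc: int) -> tuple[int, int]:
    """Return (DC count, total provisioned cores) for one site with N+1 redundancy."""
    n = math.ceil(ad_cores / cores_per_dc)  # DCs needed to carry the load
    total_dcs = n + 1                       # one redundant DC
    return total_dcs, total_dcs * cores_per_dc


# 18 AD CPU cores per site, as in the 64-Bit example in this post.
for cores_per_dc in (2, 4, 6):
    dcs, cores = n_plus_one_plan(18, cores_per_dc)
    print(f"{cores_per_dc} cores per DC: {dcs} DCs, {cores} cores provisioned")
```

With 18 required cores, both the 4-core and 6-core layouts end up provisioning 24 cores, while the 2-core layout provisions 20 at the cost of ten servers to manage.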
Prior to Active Directory, domain operations were not multi-master operations; an environment would have a Primary Domain Controller (PDC) and numerous Backup Domain Controllers (BDC). This changed with Active Directory, with the exception of a few core capabilities known as Flexible Single Master Operations (FSMO) roles. Some of these are per forest (2), and some are per domain (3). In total, a single domain forest would have five of these roles. Guidance has changed over the years and it is now recommended to place all of the FSMO roles on a single server. From an Exchange perspective, this is beneficial because it is also recommended that Exchange not utilize FSMO role holders. Exchange has a capability that allows administrators to exclude specific domain controllers, which is how you would facilitate this requirement.
FSMO capabilities are basically services that the role holder(s) offer mainly to the other domain controllers. If Exchange is contacting the FSMO role holder, it is consuming some of its resources, which can mean that other domain controllers are waiting, however briefly, for the FSMO role holder to respond to their requests. This situation can add authentication latency if it is not accounted for. Some of these services include Network Time Protocol (NTP), which is required for Kerberos, password changes, SID/RID reservations, etc. This means that if an AD site that contains Exchange servers has a FSMO role holder, the server count recommendation goes from N+1 to N+1+1. You should also account for redundancy of the FSMO role holder, which is simple enough if you have the redundant server in the opposite datacenter; that means each of the two datacenters will need that N+1+1 count met.
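The N+1+1 count can be expressed as a short Python sketch. It assumes, as described above, that the FSMO role holder is excluded from Exchange use and therefore does not count toward the N servers that carry Exchange load; the function and parameter names are hypothetical.

```python
import math


def dcs_per_site(ad_cores: int, cores_per_dc: int, hosts_fsmo_holder: bool) -> int:
    """DC count for one site: N for load, +1 for redundancy, +1 for a FSMO holder."""
    n = math.ceil(ad_cores / cores_per_dc)
    total = n + 1          # N+1
    if hosts_fsmo_holder:
        total += 1         # N+1+1: the FSMO role holder is excluded from Exchange
    return total


# Two datacenters, each hosting either the FSMO role holder or its standby.
for site in ("Datacenter A", "Datacenter B"):
    print(site, dcs_per_site(18, 4, hosts_fsmo_holder=True))
```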
Another consideration worth tossing out there is that if you virtualize your domain controllers, consider having a physical server for the FSMO role holder.
You may be thinking to yourself, “My environment seems fine today. Why should I dedicate these resources to remediating toward these guidelines?”
Hidden Performance Issues
There are various symptoms that you can experience if these recommendations are not heeded. These can appear as Outlook repeatedly prompting users for credentials and as unexplained database failovers, which in turn create additional authentication requests. How do you know if you are experiencing these issues? Review the Event Logs on your Exchange Servers and DCs for NETLOGON errors 5816-5819 and Kerberos error 7. In addition, you can monitor your Exchange servers for LDAP Read Time and LDAP Search Time. If your latency on these counters is approaching or exceeding 100ms, or you see these Event Log errors, you have an undersized AD infrastructure relative to Exchange. The best practice is to maintain a latency of <50ms.
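If you export counter samples or event IDs for review, the thresholds above can be applied with something like the following Python sketch. The status labels and the sample values are my own assumptions; only the 50ms and 100ms thresholds and the NETLOGON event ID range come from the guidance above.

```python
# NETLOGON event IDs called out above (5816 through 5819 inclusive).
NETLOGON_EVENT_IDS = set(range(5816, 5820))


def classify_ldap_latency(latency_ms: float) -> str:
    """Bucket an LDAP Read Time or LDAP Search Time sample (in milliseconds)."""
    if latency_ms < 50:
        return "healthy"      # within the <50ms best practice
    if latency_ms < 100:
        return "watch"        # approaching the 100ms mark
    return "undersized"       # at or beyond 100ms


def is_netlogon_symptom(event_id: int) -> bool:
    """True if the event ID is one of the NETLOGON errors listed above."""
    return event_id in NETLOGON_EVENT_IDS


# Hypothetical samples for illustration only.
for sample_ms in (12.0, 74.5, 130.0):
    print(sample_ms, classify_ldap_latency(sample_ms))
```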
If you are migrating from a legacy version of Exchange to a newer version, the new infrastructure will proxy requests through to the legacy environment. If you aren’t experiencing AD issues today and you don’t follow the guidelines presented here, you likely will, because the number of authentication requests will increase: both the new infrastructure and the legacy infrastructure will authenticate users.
If you are planning to consolidate Exchange servers into fewer datacenters, you will also be increasing the load on the AD infrastructure in those sites.
These are some of the primary concerns that exist relative to AD sizing for Exchange. There are many more sizing considerations that are pertinent to more complex AD infrastructures: multi-domain and multi-forest environments. However, the considerations covered here are the basis for recommendations in those environments as well. If I receive feedback requesting a follow-up, I will write up the guidance related to those considerations.