A comprehensive guide to Active Directory troubleshooting

Active Directory has been a core part of enterprise IT environments for decades. It's stable, feature-rich, and scales well as organizations grow. However, as any experienced directory administrator can attest, even the most well-maintained AD environments can run into issues.

From replication problems and DNS misconfigurations to group policy failures and account lockouts, these problems can disrupt access and slow down operations. And while AD is built to be resilient, troubleshooting it can sometimes be difficult.

This guide walks you through the most common Active Directory issues, how to identify their root causes, and the steps to fix them. Let’s get started!

A quick overview of Active Directory

You probably already know this, but for those who don’t, Active Directory (AD) is a Microsoft-built directory service used to manage users, computers, groups, and other networked resources in Windows environments. It helps administrators structure these resources into a clear hierarchy and define who can access what, and under what conditions.

AD works through domain controllers that handle authentication and authorization requests. When a user logs in, accesses a file, or opens an internal app, AD verifies who they are and checks if they have the right permissions. It's also tightly integrated with other services like DNS and Group Policy.

Why is quick troubleshooting of AD issues important?

Here are a few reasons why timely troubleshooting is critical in any Active Directory environment:

  • If AD goes down or starts misbehaving, users may not be able to log in, access shared files, use email, or launch internal apps. The longer it takes to fix, the more disruption it causes to productivity.
  • Something as simple as repeated login failures or frequent account lockouts can lead to a flood of helpdesk tickets. By pinpointing the cause early, you can reduce support overhead.
  • Services like Exchange, SharePoint, VPN access, and even some third-party tools rely on AD for authentication. A small issue in AD can cascade and cause failures in these dependent systems.
  • If Group Policy Objects (GPOs) aren’t getting applied correctly, users may get incorrect permissions, disabled settings, or unrestricted access. This can lead to both operational risks and possible audit failures.
  • If changes made on one domain controller don't sync properly across others, admins and users may see conflicting information or outdated policies.

Authentication failures

Now, let’s start our troubleshooting guide with authentication failures. These are some of the most common and disruptive issues in any AD environment.

Issue 1: Incorrect time synchronization

Kerberos, the default authentication protocol used in AD, is very sensitive to time differences between domain members and controllers.

Symptoms

  • Users receive “Clock skew too great” or “KRB_AP_ERR_SKEW” errors
  • Login attempts randomly succeed or fail

Troubleshooting

  • Ensure all domain-joined systems sync time from a reliable and consistent source, ideally the domain controller with the PDC Emulator FSMO role
  • Run w32tm /query /status on clients and servers to check current time source and sync status
  • Check event logs (System and Directory Service) for time-related errors on both clients and domain controllers
  • Confirm that Group Policy is correctly pushing time sync settings, and that no conflicting local policies or registry settings are overriding them
  • If you are using virtualization, verify that VM guests are not syncing time from the host while also syncing via AD
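
To make the time sensitivity concrete: by default, the Kerberos policy tolerates a clock difference of up to 5 minutes between a client and a domain controller. The short Python sketch below checks two timestamps against that tolerance. The function name and the sample timestamps are illustrative assumptions, and your domain policy may use a different value.

```python
from datetime import datetime, timedelta, timezone

# Default "Maximum tolerance for computer clock synchronization" in the
# Kerberos policy. Adjust if your domain policy uses a different value.
DEFAULT_MAX_SKEW = timedelta(minutes=5)


def exceeds_kerberos_skew(client_time, dc_time, max_skew=DEFAULT_MAX_SKEW):
    """Return True if the two clocks differ by more than max_skew."""
    return abs(client_time - dc_time) > max_skew


# Hypothetical readings taken at the same moment on a DC and two clients.
dc_now = datetime(2024, 1, 15, 9, 0, 0, tzinfo=timezone.utc)
client_ok = dc_now + timedelta(minutes=3)
client_bad = dc_now - timedelta(minutes=7)

print(exceeds_kerberos_skew(client_ok, dc_now))   # False
print(exceeds_kerberos_skew(client_bad, dc_now))  # True
```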

Issue 2: Account lockouts due to repeated failed logins

Repeated login failures can lock out a user account. The usual culprits are background services, mapped drives, or mobile devices that still use an old password.

Symptoms

  • Users suddenly find themselves locked out after multiple failed login attempts
  • Account lockout logs show frequent failures from the same IP or device

Troubleshooting

  • Use the Account Lockout and Management Tools (e.g., LockoutStatus.exe) to track down which system is causing the lockouts
  • Check scheduled tasks, drive mappings, and cached credentials on the user’s machines for outdated passwords
  • Review event ID 4740 on domain controllers (the PDC Emulator is a good place to start) and note the Caller Computer Name to identify where the failed login attempts originate
  • Run net use on affected systems to find persistent connections using old credentials
  • Make sure mobile devices (email apps, VPN clients) have updated passwords after resets
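
Once lockout events have been exported, a quick tally often points straight at the offending device. The sketch below assumes the events are already loaded as dictionaries; the keys `event_id`, `account`, and `caller_computer` and the sample values are hypothetical and depend on how you export the log.

```python
from collections import Counter


def top_lockout_sources(events, limit=3):
    """Count event ID 4740 entries per (account, caller computer) pair."""
    counts = Counter(
        (e["account"], e["caller_computer"])
        for e in events
        if e["event_id"] == 4740
    )
    return counts.most_common(limit)


# Hypothetical export: three lockouts and one unrelated logon event.
sample_events = [
    {"event_id": 4740, "account": "jdoe", "caller_computer": "LAPTOP-17"},
    {"event_id": 4740, "account": "jdoe", "caller_computer": "LAPTOP-17"},
    {"event_id": 4740, "account": "jdoe", "caller_computer": "MAIL-GW"},
    {"event_id": 4624, "account": "jdoe", "caller_computer": "LAPTOP-17"},
]

for (account, source), count in top_lockout_sources(sample_events):
    print(f"{account} locked out {count}x via {source}")
```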

Issue 3: Login failures due to expired passwords

Users are unable to log in if their password has expired.

Symptoms

  • Users see “Your password has expired” messages or cannot log in after a certain date
  • Login attempts work after the password is reset

Troubleshooting

  • Review password expiration policies set in Group Policy or on individual accounts
  • Use net user <username> /domain to check the “Password expires” field
  • In remote or hybrid setups, confirm that users can reach a domain controller to reset passwords
  • Educate users about password expiry warnings and ensure client systems are not suppressing them
  • Consider enabling password expiration notifications through login scripts or system tray alerts
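
If you want to calculate expiry dates yourself, for example to send reminders, the arithmetic is simple. AD stores the pwdLastSet attribute as a Windows FILETIME: the number of 100-nanosecond intervals since January 1, 1601 (UTC). The sketch below converts that value and adds the maximum password age. The function names are illustrative, and the 42-day value is only the Windows default, so substitute the value from your own policy.

```python
from datetime import datetime, timedelta, timezone

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime):
    """Convert a Windows FILETIME (100 ns ticks since 1601-01-01 UTC)."""
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


def password_expiry(pwd_last_set, max_age_days):
    """Return the moment the password expires under the given max age.

    Note: a pwdLastSet of 0 means "must change at next logon", and accounts
    flagged "password never expires" need separate handling.
    """
    return filetime_to_datetime(pwd_last_set) + timedelta(days=max_age_days)


# 133485408000000000 is the FILETIME for 2024-01-01 00:00:00 UTC.
expires = password_expiry(133485408000000000, max_age_days=42)
print(expires.date())  # 2024-02-12
```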

Replication failures

Next up, let’s look at common replication failures and how to fix them.

Issue 1: Replication blocked due to lingering objects

Lingering objects can block replication and cause unexpected behavior.

Symptoms

  • Event ID 1988 or 2042 appears in the Directory Service event logs
  • Repadmin shows replication errors involving lingering objects

Troubleshooting

  • Run repadmin /showrepl and repadmin /replsummary to identify affected domain controllers (DCs)
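  • Use repadmin /removelingeringobjects to remove stale objects from the affected DC, running it with /advisory_mode first to log what would be deleted without changing anything
  • Verify that no DC has gone without replicating for longer than the tombstone lifetime (180 days by default in most current forests)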
  • Avoid restoring DCs from old backups that may contain outdated data
  • Once cleaned up, monitor replication logs to confirm that normal sync resumes
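
The tombstone lifetime check can be sketched in a few lines of Python. The example assumes you have already collected the last successful replication time per DC (for instance from repadmin output) and that the forest uses a 180-day tombstone lifetime. The DC names and dates are hypothetical, and older forests may use 60 days.

```python
from datetime import datetime, timedelta, timezone


def dcs_past_tombstone(last_success, now, tombstone_days=180):
    """Return DCs whose last successful replication is older than the
    tombstone lifetime. These are candidates for lingering objects."""
    limit = timedelta(days=tombstone_days)
    return sorted(dc for dc, ts in last_success.items() if now - ts > limit)


now = datetime(2024, 7, 1, tzinfo=timezone.utc)
last_success = {
    "DC1": datetime(2024, 6, 30, tzinfo=timezone.utc),
    "DC2": datetime(2023, 12, 1, tzinfo=timezone.utc),  # 213 days ago
    "DC3": datetime(2024, 3, 1, tzinfo=timezone.utc),   # 122 days ago
}

print(dcs_past_tombstone(last_success, now))  # ['DC2']
```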

Issue 2: DNS misconfiguration causing replication failures

Replication depends on proper name resolution. If DNS is not set up correctly, DCs may be unable to find each other and sync data.

Symptoms

  • “Target principal name is incorrect” or “RPC server is unavailable” errors in replication logs
  • DNS lookup failures when testing with nslookup or dcdiag

Troubleshooting

  • Ensure all DCs point to the correct internal DNS servers in their network settings
  • Check for missing or incorrect SRV records in the _msdcs.<domain> zone and in the _sites, _tcp, and _udp folders of the domain zone
  • Use dcdiag /test:dns and nltest /dsgetdc:<domain> to verify proper name resolution
  • Clear the resolver cache with ipconfig /flushdns, then re-register the DC’s host records with ipconfig /registerdns
  • Confirm that AD-integrated DNS zones are replicating correctly across DCs

Issue 3: Site link or schedule misconfiguration

If site links or replication schedules are misconfigured, some DCs may not replicate frequently or at all.

Symptoms

  • Some sites show outdated directory information
  • repadmin /showrepl shows long replication gaps between specific domain controllers

Troubleshooting

  • Review Active Directory Sites and Services to ensure all sites have correct subnets and site links defined
  • Check that site link schedules allow for regular replication, especially during business hours
  • If needed, use repadmin /kcc to force recalculation of replication topology
  • For multi-site environments, verify that replication intervals meet business requirements and don’t delay changes unnecessarily

Issue 4: SYSVOL not replicating

Even when AD data replicates fine, GPOs may not apply properly if SYSVOL (which stores scripts and policies) isn’t syncing.

Symptoms

  • New or updated Group Policies don’t apply across domain controllers
  • Event logs show issues with DFSR or FRS

Troubleshooting

  • Identify whether you’re using FRS or DFSR for SYSVOL replication
  • For DFSR, run dfsrdiag backlog to see if replication is stuck or delayed
  • For FRS (older setups), check for event ID 13508 or 13509 in the File Replication Service logs
  • Restart the DFS Replication or FRS services and observe replication behavior
  • Verify that SYSVOL and NETLOGON shares are present and accessible from other DCs
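
Because dfsrdiag backlog checks one sending and one receiving member at a time, it can help to generate the command for every pair of DCs. The sketch below only builds the command strings and does not run anything. It assumes the default SYSVOL replication group name ("Domain System Volume") and replicated folder name ("SYSVOL Share"), and the DC names are hypothetical.

```python
from itertools import permutations


def backlog_commands(dcs):
    """Build one dfsrdiag backlog command per ordered (sender, receiver) pair."""
    return [
        'dfsrdiag backlog /rgname:"Domain System Volume" '
        f'/rfname:"SYSVOL Share" /smem:{src} /rmem:{dst}'
        for src, dst in permutations(dcs, 2)
    ]


for command in backlog_commands(["DC1", "DC2", "DC3"]):
    print(command)
```

Three DCs produce six commands, since backlog is directional and DC1 to DC2 can be healthy while DC2 to DC1 is stuck.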

Group Policy issues

Group Policy is one of the most powerful features in Active Directory, but also one of the trickiest to troubleshoot when it fails. Let’s look at some of the most common Group Policy issues and how to fix them.

Issue 1: Group Policy not applying to users or computers

Sometimes Group Policy Objects (GPOs) just don’t apply, and the reason isn’t always obvious.

Symptoms

  • Users or computers don’t receive expected settings after reboot or login
  • Running gpresult /r or rsop.msc shows missing or unexpected GPOs

Troubleshooting

  • Confirm that the user or computer is in the correct OU where the GPO is linked
  • Check GPO link status in the Group Policy Management Console (GPMC) and make sure it’s not disabled
  • Review the GPO’s security filtering to confirm it targets the correct security group, and that user settings and computer settings are linked where the matching object type lives
  • Use gpresult /h report.html on the affected machine to get a detailed report of what GPOs are being applied and why others are not
  • Make sure “Enforced” and “Block Inheritance” settings are not affecting expected GPO behavior

Issue 2: GPO settings applying inconsistently across users or systems

GPOs sometimes apply on some machines but not others.

Symptoms

  • Some users get the correct settings, while others don’t
  • GPO application works on one system but fails on an identical one

Troubleshooting

  • Use gpupdate /force on affected systems to trigger a manual policy refresh
  • Check for slow or unreachable domain controllers during logon, especially in multi-site environments
  • Confirm that all DCs are replicating SYSVOL properly using dfsrdiag backlog or by reviewing event logs
  • Compare gpresult reports from working and non-working systems to identify missing GPOs
  • Look for WMI filters or loopback policy processing settings that might be applying selectively
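
Comparing two gpresult reports by eye is error-prone, so a simple set comparison can help. The sketch below assumes you have already pulled the list of applied GPO names out of each report. The GPO names shown are made up for illustration.

```python
def gpo_diff(working, failing):
    """Compare applied GPO names from a working and a non-working system."""
    working, failing = set(working), set(failing)
    return {
        "missing_on_failing": sorted(working - failing),
        "extra_on_failing": sorted(failing - working),
    }


# Hypothetical applied-GPO lists copied from two gpresult reports.
working_gpos = ["Default Domain Policy", "Drive Mappings", "Workstation Baseline"]
failing_gpos = ["Default Domain Policy", "Workstation Baseline", "Legacy Proxy Settings"]

print(gpo_diff(working_gpos, failing_gpos))
```

A GPO that appears only on the working system is the first thing to check for link, filtering, or WMI filter differences.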

Issue 3: Scripts not running from Group Policy

Startup, shutdown, logon, or logoff scripts fail to execute even though they are correctly defined in GPOs.

Symptoms

  • Logon scripts don’t map drives or run configured tasks
  • No visible errors, but expected changes from scripts never take effect

Troubleshooting

  • Check that the script files exist in the correct location under the SYSVOL share
  • Ensure scripts have the correct file extension and execution permissions
  • Use Event Viewer under Application and Services Logs > Microsoft > Windows > GroupPolicy > Operational to trace script execution logs
  • If using PowerShell scripts, confirm that the system's execution policy (use Get-ExecutionPolicy to check) allows them to run
  • Review GPO settings under “Scripts” and make sure they are assigned to the correct user or computer context

DNS issues

Next, let’s cover the most common DNS-related issues in AD along with troubleshooting advice on how to fix them.

Issue 1: Clients can’t locate domain controllers

If clients can’t find a domain controller, they won’t be able to log in, join the domain, or apply Group Policy.

Symptoms

  • Login errors like “Domain not available” or “No logon servers available”
  • Running nltest /dsgetdc:<domain> fails to return a domain controller

Troubleshooting

  • Check that the client is using only internal DNS servers and not public ones like 8.8.8.8
  • Use nslookup, ping, or nltest to verify the client can resolve and reach domain controllers
  • Make sure the _ldap and _kerberos SRV records exist in the _msdcs.<domain> zone
  • Validate that the domain controller’s IP is correctly registered in DNS
  • Restart the client’s DNS Client service and flush the DNS cache with ipconfig /flushdns
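
To check the locator records systematically, it helps to list the SRV names a client depends on. The sketch below builds the common ones for a domain and prints an nslookup command for each. The domain corp.example.com and the site name HQ are placeholders, and the list covers only the most common records, not every record Netlogon registers.

```python
def expected_srv_records(domain, site=None):
    """Return common SRV record names used to locate domain controllers."""
    names = [
        f"_ldap._tcp.{domain}",
        f"_kerberos._tcp.{domain}",
        f"_ldap._tcp.dc._msdcs.{domain}",
        f"_kerberos._tcp.dc._msdcs.{domain}",
        f"_ldap._tcp.pdc._msdcs.{domain}",
    ]
    if site:
        # Site-specific record that lets clients find a nearby DC.
        names.append(f"_ldap._tcp.{site}._sites.dc._msdcs.{domain}")
    return names


for name in expected_srv_records("corp.example.com", site="HQ"):
    print(f"nslookup -type=SRV {name}")
```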

Issue 2: Missing or stale SRV records

SRV records help clients and domain controllers locate services like LDAP and Kerberos. If they’re missing or outdated, many AD functions break.

Symptoms

  • Replication, authentication, or Group Policy fails unexpectedly
  • dcdiag reports errors related to DNS SRV records

Troubleshooting

  • Run dcdiag /test:dns /v on each domain controller to identify missing or broken DNS records
  • Confirm that AD-integrated DNS zones are replicating properly between DCs
  • Manually check for SRV records in DNS Manager under the _tcp and _udp folders of the domain zone and under the dc, gc, and pdc folders of the _msdcs.<domain> zone
  • Restart the Netlogon service on affected DCs to trigger re-registration of SRV records
  • Avoid having stale or duplicate A or SRV records pointing to decommissioned or unreachable domain controllers

Issue 3: Duplicate or incorrect host (A) records

Multiple machines with the same hostname or incorrect IPs in DNS can lead to failed logins, slow connections, or replication problems.

Symptoms

  • Users connect to the wrong server or experience random authentication failures
  • ping or nslookup returns inconsistent IPs for the same hostname

Troubleshooting

  • Open DNS Manager and check for duplicate A records for the same hostname
  • Identify and correct any IP conflicts on the network using arp -a or your DHCP server logs
  • Use DHCP with dynamic DNS updates for workstations, and reserve static IPs only for servers and critical systems
  • Make sure each domain controller registers only one A record with the correct IP
  • If cleaning manually, delete old or unused records and re-register valid ones with ipconfig /registerdns
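
In a large zone, duplicates are easier to spot with a script than by scrolling through DNS Manager. The sketch below assumes the A records have been exported as (hostname, IP) pairs. The export step itself is not shown, and the sample data is hypothetical.

```python
from collections import defaultdict


def find_conflicting_a_records(records):
    """Return hostnames (case-insensitive) that map to more than one IP."""
    by_host = defaultdict(set)
    for host, ip in records:
        by_host[host.lower()].add(ip)
    return {h: sorted(ips) for h, ips in by_host.items() if len(ips) > 1}


# Hypothetical export: FS01 has two different addresses, DC01 is consistent.
sample_records = [
    ("FS01", "10.0.0.10"),
    ("fs01", "10.0.0.55"),
    ("DC01", "10.0.0.5"),
    ("DC01", "10.0.0.5"),
]

print(find_conflicting_a_records(sample_records))
```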

Performance issues

Even when AD is technically “working,” poor performance can make logins slow, delay Group Policy application, or cause lag in authentication and replication. Below are common causes of Active Directory performance problems and how to troubleshoot them.

Issue 1: Slow user logins

Slow logins are one of the most common user complaints in AD environments, often caused by delays in policy processing or network lookups.

Symptoms

  • Users report login screens that hang for extended periods
  • Login times vary widely between users or devices

Troubleshooting

  • Run gpresult /h report.html and check for excessive or slow-loading Group Policies
  • Look for network issues like high latency to domain controllers, especially in remote sites
  • Verify that the client is contacting the nearest domain controller by checking nltest /dsgetdc:<domain>
  • Remove or simplify logon scripts that map drives or printers over slow connections
  • Use Event Viewer to review logon processing times and identify specific delays

Issue 2: High CPU or memory usage on domain controllers

When DCs are under heavy load, they may respond slowly to authentication and replication requests.

Symptoms

  • Delays in user logins or Group Policy application
  • CPU usage consistently stays near 90–100% on one or more domain controllers

Troubleshooting

  • Open Task Manager or Resource Monitor on the DC to identify which processes are consuming resources
  • Check for excessive event log entries or frequent authentication attempts that could indicate a brute force attack or misconfigured service
  • Use perfmon to monitor key counters like LSASS CPU usage, LDAP request rate, and disk I/O
  • Look for large numbers of stale or unnecessary objects in AD that may be bloating the database
  • Review installed services and background tasks to ensure the DC is dedicated and not overloaded with non-AD functions

Security issues

Finally, here are common AD security issues and how to detect and fix them before they cause real damage.

Issue 1: Excessive administrative privileges

Over-privileged accounts increase the risk of lateral movement and privilege escalation if compromised.

Symptoms

  • Too many users are members of Domain Admins, Enterprise Admins, or other high-privilege groups
  • No clear tracking of who has what level of access

Troubleshooting

  • Review membership of built-in privileged groups regularly using PowerShell (e.g., Get-ADGroupMember "Domain Admins")
  • Remove users who don’t need permanent admin access and use Just-in-Time (JIT) access when possible
  • Implement role-based access control (RBAC) to separate duties clearly
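  • Monitor for unexpected group membership changes using security event logs (e.g., Event IDs 4728 and 4729 for global groups, 4732 and 4733 for domain local groups, and 4756 and 4757 for universal groups)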
  • Set up alerts or reports for any changes to privileged group memberships
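
A basic alert filter might look like the Python sketch below. Membership changes are logged under different event IDs depending on group scope: 4728 and 4729 for global groups such as Domain Admins, 4732 and 4733 for domain local groups, and 4756 and 4757 for universal groups such as Enterprise Admins. The event dictionary keys and the sample entries are assumptions about how you export the Security log.

```python
# Security event IDs for security-enabled group membership changes.
GROUP_CHANGE_EVENTS = {
    4728: ("global", "member added"),
    4729: ("global", "member removed"),
    4732: ("domain local", "member added"),
    4733: ("domain local", "member removed"),
    4756: ("universal", "member added"),
    4757: ("universal", "member removed"),
}


def privileged_changes(events, watched_groups):
    """Return readable findings for membership changes to watched groups."""
    watched = {g.lower() for g in watched_groups}
    findings = []
    for e in events:
        meaning = GROUP_CHANGE_EVENTS.get(e["event_id"])
        if meaning and e["group"].lower() in watched:
            scope, action = meaning
            findings.append(f'{e["group"]} ({scope}): {action}: {e["member"]}')
    return findings


# Hypothetical exported events.
sample_events = [
    {"event_id": 4728, "group": "Domain Admins", "member": "tempadmin"},
    {"event_id": 4756, "group": "Enterprise Admins", "member": "svc-backup"},
    {"event_id": 4728, "group": "Sales", "member": "jdoe"},
]

for line in privileged_changes(sample_events, ["Domain Admins", "Enterprise Admins"]):
    print(line)
```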

Issue 2: Stale user or computer accounts

Inactive accounts are a common entry point for attackers, especially if they still have valid credentials or access rights.

Symptoms

  • AD contains user or computer accounts that haven’t been used in months
  • Old employee accounts are still enabled

Troubleshooting

  • Use PowerShell to identify inactive accounts (Search-ADAccount -AccountInactive) and flag them for review
  • Disable unused accounts before deletion to allow rollback if needed
  • Set expiration dates or automatic disabling for temporary and contractor accounts
  • Implement a process to regularly review and clean up stale accounts
  • Monitor logons and account activity using event logs or SIEM tools
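
For a recurring review, the staleness rule itself is easy to express. The sketch below assumes last logon times have already been exported per account, with None for accounts that never logged on. The 90-day threshold and the account names are examples only. Keep in mind that the replicated lastLogonTimestamp attribute is updated only periodically and can lag by up to about two weeks, so leave some margin.

```python
from datetime import datetime, timedelta, timezone


def stale_accounts(last_logons, now, inactive_days=90):
    """Return accounts with no logon inside the inactivity window."""
    cutoff = now - timedelta(days=inactive_days)
    return sorted(
        name
        for name, last in last_logons.items()
        if last is None or last < cutoff
    )


now = datetime(2024, 7, 1, tzinfo=timezone.utc)
last_logons = {
    "alice": datetime(2024, 6, 20, tzinfo=timezone.utc),
    "bob": datetime(2024, 1, 10, tzinfo=timezone.utc),
    "svc-old": None,  # never logged on
}

print(stale_accounts(last_logons, now))  # ['bob', 'svc-old']
```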

Conclusion

Active Directory is a valuable enterprise service that is often critical to daily operations. We hope that the insights shared in this guide come in handy the next time your directory starts acting up.

To have better visibility across your AD network, check out the end-to-end AD monitoring tool by Site24x7.
