DevOps · K8s · Volleyball · Travel
Explore NY Stream

Multi-Master Replication Of OpenLDAP Server on CentOS

— ny_wk

OpenLDAP multi-master replication on CentOS lets two or more directory servers accept writes simultaneously and stay in sync using the syncrepl engine, giving you a highly available, load-balanced LDAP backend with no single point of failure. This walkthrough explains the concepts, then shows the exact olcServerID, syncprov, and olcSyncRepl configuration you need on each node, plus the pitfalls and verification steps that decide whether your cluster actually converges.

Why Multi-Master OpenLDAP Replication Matters

A standalone directory is a liability. If the one box holding your users, groups, sudo rules, and host maps goes down, every login and authenticated service that depends on it stalls. Replicated directories are a baseline requirement for any resilient enterprise deployment, and OpenLDAP multi-master replication on CentOS is the most common way to build that resilience on Red Hat-family systems.

Older OpenLDAP documentation framed everything as a single master and a set of read-only slaves. That model was rigid: a database could only ever play one role. Modern OpenLDAP (2.4 and later) replaced those terms with provider (a server that sends updates) and consumer (a server that receives them). The relationship is fluid — a consumer can re-propagate what it receives, so the same server can be a provider and a consumer at the same time. Multi-master is simply the case where every node is both.

Single-Master vs Multi-Master: Which Do You Need?

Before configuring anything, be honest about your write pattern, because that decides the topology.

Aspect              | Single-Master (Provider/Consumer) | Multi-Master (Mirror Mode / N-Way)
Writes accepted on  | One node only                     | Every node
Reads served from   | Any node                          | Any node
Failover for writes | Manual promotion                  | Automatic: keep writing elsewhere
Conflict risk       | None (one writer)                 | Real: concurrent writes can collide
Best for            | Read-heavy, rare writes           | HA, write availability, geo-distribution

If your directory is overwhelmingly read traffic and writes are infrequent and centralized, a single provider with read-only consumers is simpler and safer. Choose multi-master when you genuinely need writes to survive the loss of any node, or when an application must point at a local writable replica. The convenience of writing anywhere comes with a real cost: conflict resolution, which we cover below.

How syncrepl Actually Works

The replication engine is syncrepl, a consumer-side engine that runs as a thread inside slapd. It connects to a provider, performs an initial bulk load of the directory, then keeps the local copy current. It speaks the LDAP Content Synchronization protocol (RFC 4533), which supports two modes:

  • refreshOnly — pull-based. The consumer periodically polls the provider on a set interval and asks for everything that changed since its last sync.
  • refreshAndPersist — push-based. The consumer opens a persistent search and the provider streams changes in near real time. This is the right choice for multi-master, because it minimizes the window in which nodes disagree.

State is tracked with synchronization cookies. The provider maintains a contextCSN for each database: a Change Sequence Number (CSN) that acts as a high-water mark of the most recent committed change (in a multi-master cluster, one value per serverID). Each entry carries an entryCSN and a stable entryUUID that uniquely identifies it even if the DN changes. Because consumers and providers store their state in the same place (the contextCSN on the suffix entry), any node can be promoted or demoted without special handling, which is exactly what makes multi-master possible.
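To make the CSN less abstract, here is a small sketch that splits one into its four "#"-separated components (timestamp, change count, server ID, and modification number). The sample value is invented for illustration and does not come from a real directory.

```shell
#!/bin/sh
# Split a CSN of the form <timestamp>#<count>#<serverID>#<mod> into labelled fields.
parse_csn() {
    csn=$1
    old_ifs=$IFS
    IFS='#'
    # Unquoted expansion splits on '#', giving the four components in order.
    set -- $csn
    IFS=$old_ifs
    printf 'timestamp=%s count=%s sid=%s mod=%s\n' "$1" "$2" "$3" "$4"
}

# Made-up sample value: a change written by the node with serverID 2.
parse_csn '20240115093000.123456Z#000000#002#000000'
# prints: timestamp=20240115093000.123456Z count=000000 sid=002 mod=000000
```

The sid field is the part that comes from the node's server ID, so reading it tells you which node originated a given change.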

The piece that lets a server send updates is the syncprov overlay (the Sync Provider). In a multi-master setup every node loads syncprov so it can both serve and consume changes.

Prerequisites Before You Configure Replication

This guide assumes you already have a working OpenLDAP install on each node — the slapd service running, a populated base DIT, and ideally TLS certificates in place. If you have not done a base install yet, set that up first (package install, DB_CONFIG, the cn=config tree, base schema, and a manager DN) and then return here. Multi-master is a layer on top of a healthy single server, not a substitute for one.

Check the following on every node before you start:

  • Identical schema and base DN. All nodes must share the same suffix (e.g. dc=sysadminshare,dc=local) and the same loaded schema. Mismatched schema breaks replication immediately.
  • Resolvable, stable hostnames. Each node must reach the others by FQDN. Put entries in DNS or /etc/hosts; replication URIs use hostnames, not just IPs.
  • Synchronized clocks. Install and enable chrony (or ntpd) on every node. CSNs embed timestamps; clock skew causes wrong-winner conflict resolution.
  • Firewall and SELinux. Open 389/tcp (LDAP) and 636/tcp (LDAPS) between nodes, and confirm SELinux allows slapd outbound connections.
  • A replication bind identity. Decide which DN each node uses to bind to its peers (commonly the directory manager DN, or a dedicated replication account with read access to the whole tree).

Our reference environment uses three CentOS servers, all acting as masters:

  • ldpsrv1.sysadminshare.local — 192.168.12.10
  • ldpsrv2.sysadminshare.local — 192.168.12.20
  • ldpsrv3.sysadminshare.local — 192.168.12.30
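A quick preflight can catch name-resolution problems before any replication is configured. The sketch below is one possible way to do it, using the three reference hostnames above; the function names peers_of and check_peers are made up for this example.

```shell
#!/bin/sh
# Reference nodes from this guide; adjust for your own environment.
NODES="ldpsrv1.sysadminshare.local ldpsrv2.sysadminshare.local ldpsrv3.sysadminshare.local"

# Print every node except the one given as $1 (this host's FQDN).
peers_of() {
    for n in $NODES; do
        [ "$n" = "$1" ] || printf '%s\n' "$n"
    done
}

# Check that each peer's name resolves; return non-zero if any lookup fails.
check_peers() {
    rc=0
    for p in $(peers_of "$1"); do
        if getent hosts "$p" >/dev/null 2>&1; then
            echo "ok: $p resolves"
        else
            echo "FAIL: $p does not resolve"
            rc=1
        fi
    done
    return $rc
}

# Typical use on a node (not run here):
#   check_peers "$(hostname -f)"
```

This only covers the hostname item in the checklist. Clock sync, firewall rules, and SELinux still need their own checks, for example chronyc tracking for the clock.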

cn=config (olc) vs Legacy slapd.conf

There are two ways to configure slapd, and it matters which one you use.

  • Legacy slapd.conf: a single static text file. You edit it and restart the daemon. Directives are plain keywords such as ServerID, overlay syncprov, and syncrepl, with no olc prefix. It is deprecated, but on a system that runs from slapd.d you can still use it by converting it with slaptest.
  • Modern cn=config (the "olc" dynamic backend): the configuration lives inside the directory itself, under /etc/openldap/slapd.d/, and is changed live with ldapmodify over ldapi:/// — no restart, no downtime. Attributes are prefixed with olc: olcServerID, olcSyncRepl, olcMirrorMode.

On current CentOS/RHEL the dynamic cn=config method is the default and the recommended path — it is what we use for the main walkthrough. The legacy file is shown afterward for reference and for older boxes.

Step-by-Step: Multi-Master with cn=config (Recommended)

Run these steps on the nodes as indicated. You authenticate to the config backend locally using SASL EXTERNAL over the Unix socket (ldapi:///), so root on the box has manage access.

1. Load the syncprov Module on Every Node

The Sync Provider overlay must be available before you can reference it. Create an LDIF and apply it on each node:

  1. Create syncprov_mod.ldif:

    dn: cn=module,cn=config
    objectClass: olcModuleList
    cn: module
    olcModulePath: /usr/lib64/openldap
    olcModuleLoad: syncprov.la

  2. Apply it: ldapadd -Y EXTERNAL -H ldapi:/// -f syncprov_mod.ldif

If the module list already exists, use changetype: modify / add: olcModuleLoad instead of adding a fresh entry.
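For the already-exists case, the modify variant might look like the sketch below. The DN cn=module{0},cn=config is an assumption (it is a common default); confirm yours with ldapsearch -Y EXTERNAL -H ldapi:/// -b cn=config "(objectClass=olcModuleList)" dn and pass it as the first argument if it differs.

```shell
#!/bin/sh
# Emit an LDIF that appends syncprov to an existing module list.
# $1 is the DN of that list; the fallback below is an assumed default.
syncprov_modify_ldif() {
    dn=$1
    [ -n "$dn" ] || dn='cn=module{0},cn=config'
    cat <<EOF
dn: $dn
changetype: modify
add: olcModuleLoad
olcModuleLoad: syncprov.la
EOF
}

# Print the LDIF for review.
syncprov_modify_ldif

# To apply it (not run here):
#   syncprov_modify_ldif | ldapmodify -Y EXTERNAL -H ldapi:///
```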

2. Set a Unique olcServerID on Each Node

Every server in an N-way multi-master cluster needs a unique numeric server ID. This ID is baked into every CSN that node generates, which is how the cluster tells changes apart and breaks ties. Set 1 on the first node, 2 on the second, 3 on the third.

On ldpsrv1, create olcserverid.ldif:

dn: cn=config
changetype: modify
replace: olcServerID
olcServerID: 1 ldap://ldpsrv1.sysadminshare.local
olcServerID: 2 ldap://ldpsrv2.sysadminshare.local
olcServerID: 3 ldap://ldpsrv3.sysadminshare.local

When you bind the ID to a URL like this, every node can carry the same full list — slapd matches its own URL to discover which ID is its own. Apply it: ldapmodify -Y EXTERNAL -H ldapi:/// -f olcserverid.ldif. Run the identical modify on all three nodes.
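Because the same list goes on every node, it can be convenient to generate the LDIF from a single host list rather than typing it by hand. This is only a sketch: the NODES variable and the serverid_ldif function are names invented for the example, and IDs are assigned in list order starting at 1.

```shell
#!/bin/sh
# Reference nodes from this guide, in the order their IDs are assigned.
NODES="ldpsrv1.sysadminshare.local ldpsrv2.sysadminshare.local ldpsrv3.sysadminshare.local"

# Emit the olcServerID modify LDIF, numbering the nodes 1, 2, 3, ...
serverid_ldif() {
    echo "dn: cn=config"
    echo "changetype: modify"
    echo "replace: olcServerID"
    id=1
    for n in $NODES; do
        echo "olcServerID: $id ldap://$n"
        id=$((id + 1))
    done
}

# Print the LDIF for review.
serverid_ldif

# To apply it (not run here):
#   serverid_ldif | ldapmodify -Y EXTERNAL -H ldapi:///
```

Keeping the host list in one place makes it harder to end up with two nodes sharing an ID when a fourth server is added later.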

3. Replicate the Configuration Database (cn=config) Itself

A real strength of the dynamic backend is that you can replicate cn=config too, so future config changes propagate automatically. First give the config database a password (so peers can bind to it) on each node:

  1. Generate a hash: slappasswd — copy the {SSHA}… output.
  2. Create olcdatabase.ldif and set it:

    dn: olcDatabase={0}config,cn=config
    changetype: modify
    add: olcRootPW
    olcRootPW: {SSHA}REPLACE_WITH_YOUR_HASH

  3. Apply: ldapmodify -Y EXTERNAL -H ldapi:/// -f olcdatabase.ldif

Now enable syncprov on the config DB and point each node at all peers. Create configrep.ldif:

dn: olcOverlay=syncprov,olcDatabase={0}config,cn=config
changetype: add
objectClass: olcOverlayConfig
objectClass: olcSyncProvConfig
olcOverlay: syncprov

dn: olcDatabase={0}config,cn=config
changetype: modify
add: olcSyncRepl
olcSyncRepl: rid=001 provider=ldap://ldpsrv1.sysadminshare.local binddn="cn=config" bindmethod=simple credentials=YOUR_CONFIG_PW searchbase="cn=config" type=refreshAndPersist retry="5 5 300 5" timeout=1
olcSyncRepl: rid=002 provider=ldap://ldpsrv2.sysadminshare.local binddn="cn=config" bindmethod=simple credentials=YOUR_CONFIG_PW searchbase="cn=config" type=refreshAndPersist retry="5 5 300 5" timeout=1
olcSyncRepl: rid=003 provider=ldap://ldpsrv3.sysadminshare.local binddn="cn=config" bindmethod=simple credentials=YOUR_CONFIG_PW searchbase="cn=config" type=refreshAndPersist retry="5 5 300 5" timeout=1
-
add: olcMirrorMode
olcMirrorMode: TRUE

Apply: ldapmodify -Y EXTERNAL -H ldapi:/// -f configrep.ldif. The critical flag is olcMirrorMode: TRUE — it tells slapd that this consumer is also a writable provider, which is what makes the cluster multi-master rather than read-only.
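After applying the LDIF it is worth reading the settings back. The helper below is a rough sketch that counts the olcSyncRepl values and checks for olcMirrorMode: TRUE in LDIF read from standard input; summarize_repl is a made-up name, not an OpenLDAP tool.

```shell
#!/bin/sh
# Read cn=config LDIF on stdin and summarize the replication settings found.
summarize_repl() {
    awk '
        /^olcSyncRepl:/          { agreements++ }
        /^olcMirrorMode: TRUE$/  { mirror = 1 }
        END {
            printf "agreements=%d mirrormode=%s\n", agreements, (mirror ? "on" : "off")
        }
    '
}

# Typical use on a node (not run here); ldif-wrap=no keeps each value on one line:
#   ldapsearch -Y EXTERNAL -H ldapi:/// -LLL -o ldif-wrap=no \
#     -b 'olcDatabase={0}config,cn=config' -s base olcSyncRepl olcMirrorMode \
#     | summarize_repl
```

With the three-node configuration above, the expected summary on each node would be three agreements with mirror mode on.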

4. Enable syncprov and Replication on the Data Database

With cn=config replicating, the rest only needs to be applied on one node, because it propagates to the others. Most CentOS installs name the main DB {2}hdb or {2}mdb; check yours with ldapsearch -Y EXTERNAL -H ldapi:/// -b cn=config "(olcDatabase=*)" dn and substitute accordingly.
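If you want to script that lookup, a filter like the following can pick the data database out of the search output. It is a sketch that assumes the backend is one of hdb, mdb, or bdb; data_db_dn is a hypothetical helper name.

```shell
#!/bin/sh
# Read "dn:" lines on stdin and print the DN of any hdb/mdb/bdb database,
# skipping the frontend, config, and monitor databases.
data_db_dn() {
    sed -n 's/^dn: \(olcDatabase={[0-9]*}[hmb]db,cn=config\)$/\1/p'
}

# Typical use on a node (not run here):
#   ldapsearch -Y EXTERNAL -H ldapi:/// -LLL -o ldif-wrap=no \
#     -b cn=config '(olcDatabase=*)' dn | data_db_dn
```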

First add the provider overlay to the data DB (syncprov.ldif):

dn: olcOverlay=syncprov,olcDatabase={2}hdb,cn=config
changetype: add
objectClass: olcOverlayConfig
objectClass: olcSyncProvConfig
olcOverlay: syncprov

Apply it, then configure the replication agreements and required indexes (olcdatabasehdb.ldif):

dn: olcDatabase={2}hdb,cn=config
changetype: modify
add: olcSyncRepl
olcSyncRepl: rid=004 provider=ldap://ldpsrv1.sysadminshare.local binddn="cn=ldapadm,dc=sysadminshare,dc=local" bindmethod=simple credentials=YOUR_DIR_PW searchbase="dc=sysadminshare,dc=local" type=refreshAndPersist retry="5 5 300 5" timeout=1
olcSyncRepl: rid=005 provider=ldap://ldpsrv2.sysadminshare.local binddn="cn=ldapadm,dc=sysadminshare,dc=local" bindmethod=simple credentials=YOUR_DIR_PW searchbase="dc=sysadminshare,dc=local" type=refreshAndPersist retry="5 5 300 5" timeout=1
olcSyncRepl: rid=006 provider=ldap://ldpsrv3.sysadminshare.local binddn="cn=ldapadm,dc=sysadminshare,dc=local" bindmethod=simple credentials=YOUR_DIR_PW searchbase="dc=sysadminshare,dc=local" type=refreshAndPersist retry="5 5 300 5" timeout=1
-
add: olcDbIndex
olcDbIndex: entryUUID eq
-
add: olcDbIndex
olcDbIndex: entryCSN eq
-
add: olcMirrorMode
olcMirrorMode: TRUE

Apply with ldapmodify -Y EXTERNAL -H ldapi:/// -f olcdatabasehdb.ldif. Note the two olcDbIndex lines: treat equality indexes on entryUUID and entryCSN as mandatory on a provider. Replication still functions without them, but every startup scan and incremental sync then has to walk the whole database, which is brutally slow on large directories.

5. Field Reference for the syncrepl Directive

Each olcSyncRepl value is one replication agreement. The key parameters:

  • rid: replica ID, a unique number per agreement on that server.
  • provider: the URI of the peer to pull from. Use ldaps:// for TLS.
  • type: refreshAndPersist (push, recommended) or refreshOnly (poll).
  • interval: poll period for refreshOnly, in dd:hh:mm:ss (ignored by refreshAndPersist).
  • retry: reconnect schedule, e.g. "5 5 300 5" = retry every 5s for 5 tries, then every 300s for 5 tries, then give up (use + in place of the final count to retry forever).
  • searchbase / scope: what subtree to replicate; scope=sub covers everything below the base.
  • binddn / credentials / bindmethod: how this node authenticates to the peer.
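To see how these parameters combine into one value, here is a sketch of a helper that assembles an agreement from its parts. The function name syncrepl_value is invented, and the fixed choices (simple bind, scope=sub, refreshAndPersist, retry forever) are just one reasonable set of assumptions, not the only valid one.

```shell
#!/bin/sh
# Assemble one olcSyncRepl value from its parts.
# Arguments: rid (plain integer), provider URI, bind DN, password, search base.
syncrepl_value() {
    # %03d pads the rid to three digits, e.g. 4 becomes 004.
    printf 'rid=%03d provider=%s binddn="%s" bindmethod=simple credentials=%s searchbase="%s" scope=sub type=refreshAndPersist retry="5 5 300 +"\n' \
        "$1" "$2" "$3" "$4" "$5"
}

# Example using the placeholder names from this guide.
syncrepl_value 4 ldap://ldpsrv1.sysadminshare.local \
    "cn=ldapadm,dc=sysadminshare,dc=local" YOUR_DIR_PW "dc=sysadminshare,dc=local"
```

Pass the rid without leading zeros, since the shell's printf would otherwise read a value such as 008 as invalid octal.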

Securing Replication Traffic with TLS

Replication carries every attribute in your directory, including password hashes, across the network. Never run it in clear text between nodes. Each server presents its own certificate and trusts the others.

  1. Distribute each node's public certificate to its peers (for example into /etc/openldap/certs/):

    scp /etc/pki/tls/certs/ldap1pub.pem ldpsrv2:/etc/openldap/certs/

  2. Fix ownership so slapd can read them: chown ldap:ldap /etc/openldap/certs/ldap1pub.pem
  3. Point the consumer at the trusted CA in each agreement by adding tls_cacert=/etc/openldap/certs/ldap1pub.pem to the olcSyncRepl line, and use provider=ldaps://….
  4. To use StartTLS on port 389 instead of LDAPS, keep provider=ldap://… and add starttls=critical to the agreement so a failed TLS handshake aborts the connection rather than falling back to plaintext.

Use bindmethod=sasl with a Kerberos/GSSAPI mechanism in environments that already run Kerberos; otherwise bindmethod=simple over TLS is acceptable, since TLS encrypts the simple bind credentials.
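As an illustration of the LDAPS variant, the sketch below rewrites a plaintext agreement value so that it uses ldaps:// and verifies the peer against a CA file. The helper name secure_agreement is made up, and adding tls_reqcert=demand is an assumption on top of the steps above (it tells the consumer to insist on a valid peer certificate).

```shell
#!/bin/sh
# Rewrite a plaintext agreement to use LDAPS and a trusted CA file.
# $1 = existing olcSyncRepl value, $2 = path to the CA or peer certificate.
secure_agreement() {
    secured=$(printf '%s' "$1" | sed 's|provider=ldap://|provider=ldaps://|')
    printf '%s tls_cacert=%s tls_reqcert=demand\n' "$secured" "$2"
}

# Example with a shortened agreement and the certificate path used above.
secure_agreement \
    'rid=004 provider=ldap://ldpsrv1.sysadminshare.local type=refreshAndPersist' \
    /etc/openldap/certs/ldap1pub.pem
```

The result is still just a string; it has to be written back to olcSyncRepl with ldapmodify before it takes effect.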

Common Pitfalls and How to Avoid Them

Most multi-master failures are not bugs in OpenLDAP — they are environment problems. Watch for these.

  • Clock skew. CSNs are timestamp-based. If one node's clock is ahead, its changes always "win" tie-breaks and genuinely newer changes from other nodes can be silently ignored. Enforce NTP/chrony everywhere before you trust the cluster.
  • Duplicate or missing serverID. Two nodes sharing an olcServerID corrupts conflict resolution. Each must be unique and present.
  • Write conflicts. Multi-master uses a last-writer-wins policy keyed on the CSN. If two admins modify the same entry on two nodes before replication has carried the first change across, the change with the older CSN is discarded; there is no merge. Design applications so a given entry is normally written from one place, or route writes through a load balancer with sticky sessions to reduce collisions.
  • Delete/rename races. Deleting an entry on one node while modifying it on another can leave orphaned references or "glue" entries. Audit periodically.
  • Schema drift. Adding schema or an overlay on only one node (when cn=config is not replicated) causes that node to reject replicated entries. Keep config in sync.
  • Missing entryCSN/entryUUID indexes. Omitting them does not stop replication but makes provider startup scan the entire DB, which can take many minutes on a big directory.
  • Firewall/SELinux blocks. A node that cannot open an outbound connection to its peer will sit retrying forever; check retry attempts in the log.

Verifying That Replication Works

Configuration applying cleanly is not proof of convergence. Test it explicitly.

  1. Confirm the daemon is healthy. Restart and check status on each node: systemctl restart slapd then systemctl status slapd. Watch the log (/var/log/ldap.log if you routed local4 there) for do_syncrepl connection messages and the absence of bind errors.
  2. Write on one node, read on another. Add a test user on ldpsrv1:

    ldapadd -x -W -D "cn=ldapadm,dc=sysadminshare,dc=local" -f ldaptest.ldif

    Then immediately search for it on ldpsrv2:

    ldapsearch -x -b "dc=sysadminshare,dc=local" "(cn=ldaptest)"

    The entry should appear within seconds.
  3. Prove it is truly multi-master. The real test is a write on the second node. Set the new user's password from ldpsrv2:

    ldappasswd -s password123 -W -D "cn=ldapadm,dc=sysadminshare,dc=local" -x "uid=ldaptest,ou=People,dc=sysadminshare,dc=local"

    In a single-provider setup, a read-only consumer would refuse this write. If it succeeds and the change shows up on the other nodes, multi-master is working.
  4. Compare contextCSN across nodes. This is the definitive convergence check. On each node run:

    ldapsearch -Y EXTERNAL -H ldapi:/// -b "dc=sysadminshare,dc=local" -s base contextCSN

    Every node should report the same set of contextCSN values (one CSN per serverID). Matching CSNs mean the directories are identical and fully synced; values that lag and never catch up point to a broken agreement or a network/clock problem.
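One way to automate the comparison is to save the output of that search from each node into a file and compare the sorted CSN sets. The sketch below assumes one LDIF file per node, collected however you prefer; csn_set and compare_csn are hypothetical helper names.

```shell
#!/bin/sh
# Print the sorted contextCSN values found in one LDIF file.
csn_set() {
    sed -n 's/^contextCSN: //p' "$1" | LC_ALL=C sort
}

# Compare every file against the first one given.
# Returns non-zero if any node's CSN set differs from the reference.
compare_csn() {
    ref=$1
    rc=0
    for f in "$@"; do
        if [ "$(csn_set "$f")" = "$(csn_set "$ref")" ]; then
            echo "in sync: $f"
        else
            echo "DIFFERS: $f"
            rc=1
        fi
    done
    return $rc
}

# Typical use, with one saved search result per node (not run here):
#   compare_csn ldpsrv1.ldif ldpsrv2.ldif ldpsrv3.ldif
```

Sorting first means the order in which a node happens to return its contextCSN values does not cause a false mismatch. On a busy directory, a brief difference can simply mean a change is still in flight, so repeat the check before concluding an agreement is stuck.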

Legacy slapd.conf Equivalent (Older Systems)

On a box still using the static file, the same outcome is expressed inline. The core directives in the database section look like this:

ServerID 1 "ldaps://ldap1.example.com"
ServerID 2 "ldaps://ldap2.example.com"
overlay syncprov
syncprov-checkpoint 10 1
syncprov-sessionlog 100
syncrepl rid=1 provider="ldaps://ldap1.example.com" type=refreshAndPersist retry="5 10 60 +" searchbase="dc=example,dc=com" scope=sub bindmethod=simple binddn="cn=Manager,dc=example,dc=com" credentials="secret" tls_cacert=/etc/openldap/certs/ldap1pub.pem
syncrepl rid=2 provider="ldaps://ldap2.example.com" type=refreshAndPersist retry="5 10 60 +" searchbase="dc=example,dc=com" scope=sub bindmethod=simple binddn="cn=Manager,dc=example,dc=com" credentials="secret" tls_cacert=/etc/openldap/certs/ldap2pub.pem
MirrorMode on

After editing, convert the file into the dynamic format and fix permissions, since modern slapd runs from slapd.d:

  1. rm -rf /etc/openldap/slapd.d/*
  2. slaptest -f /etc/openldap/slapd.conf -F /etc/openldap/slapd.d/
  3. chown -R ldap:ldap /etc/openldap/slapd.d/
  4. systemctl restart slapd

Prefer the native cn=config approach for anything new — the static file is a migration aid, not a destination.

Pointing Clients at the Cluster

The last piece is making clients aware of all masters so they can fail over. On CentOS, list every node when you configure LDAP authentication:

authconfig --enableldap --enableldapauth --ldapserver=ldpsrv1.sysadminshare.local,ldpsrv2.sysadminshare.local,ldpsrv3.sysadminshare.local --ldapbasedn="dc=sysadminshare,dc=local" --enablemkhomedir --update

On newer releases that have retired authconfig, use authselect with an sssd profile and list the URIs in /etc/sssd/sssd.conf. Either way, multiple server entries let a client keep authenticating when one node is down — the entire reason you built multi-master in the first place.
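For the sssd route, the domain section might look roughly like what the sketch below prints. The section name [domain/default] is an assumption, and the URIs use ldaps:// because sssd normally expects an encrypted channel for LDAP authentication. A working sssd.conf needs more than this fragment (an [sssd] section and TLS trust settings, among others).

```shell
#!/bin/sh
# Reference nodes from this guide; adjust for your own environment.
NODES="ldpsrv1.sysadminshare.local ldpsrv2.sysadminshare.local ldpsrv3.sysadminshare.local"

# Emit a minimal sssd domain section listing every node as a failover URI.
sssd_domain_fragment() {
    uris=""
    for n in $NODES; do
        # Join the URIs with commas, without a leading comma.
        uris="${uris:+$uris,}ldaps://$n"
    done
    cat <<EOF
[domain/default]
id_provider = ldap
auth_provider = ldap
ldap_uri = $uris
ldap_search_base = dc=sysadminshare,dc=local
EOF
}

# Print the fragment for review; merge it into /etc/sssd/sssd.conf by hand.
sssd_domain_fragment
```

sssd tries the servers in ldap_uri in the order listed, so the first entry is the preferred node and the rest are fallbacks.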

Key Takeaways

  • Multi-master = every node is both provider and consumer, achieved with syncprov on each node plus olcMirrorMode: TRUE on each database.
  • Each server needs a unique olcServerID, because that ID is embedded in CSNs and drives conflict resolution.
  • Use cn=config with ldapmodify over ldapi:/// for live, no-restart changes; treat slapd.conf as legacy.
  • Synchronized clocks and TLS between nodes are mandatory, not optional — skew breaks conflict resolution and clear text leaks password hashes.
  • Verify by comparing contextCSN on every node and by writing on one node and reading (and writing) on another.

Frequently Asked Questions

What is the difference between mirrormode and N-way multi-master?

They are the same underlying mechanism. MirrorMode (or olcMirrorMode: TRUE) is the flag that turns a replica into a writable master. "N-way multi-master" simply describes running that configuration across N nodes where all of them accept writes. Classic two-node mirror mode often paired the nodes behind a load balancer that sent writes to one at a time; true N-way lets clients write to any node directly.

How does OpenLDAP resolve write conflicts in multi-master?

It uses last-writer-wins based on the Change Sequence Number (CSN), which combines a timestamp, a change count, and the server ID. The change with the higher CSN wins; the other is discarded. There is no field-level merge, which is exactly why synchronized clocks matter and why you should avoid writing the same entry from two places at once.
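The comparison can be illustrated with plain string ordering, since CSNs are fixed-width and sort naturally. This is a sketch with made-up values, assuming both CSNs are in the same format; csn_winner is a hypothetical helper, not part of OpenLDAP.

```shell
#!/bin/sh
# Print whichever of two CSNs sorts later in byte order; that change wins.
csn_winner() {
    printf '%s\n%s\n' "$1" "$2" | LC_ALL=C sort | tail -n 1
}

# Same timestamp and count, so the server ID field (001 vs 002) breaks the tie.
csn_winner '20240115093000.000000Z#000000#001#000000' \
           '20240115093000.000000Z#000000#002#000000'
# prints: 20240115093000.000000Z#000000#002#000000
```

Because the timestamp is the leading field, a node whose clock runs ahead produces CSNs that sort later than everyone else's, which is how clock skew turns into lost updates.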

Do I have to restart slapd after changing the configuration?

No — that is the main advantage of the cn=config backend. Changes applied with ldapmodify over ldapi:/// take effect immediately. You only restart when troubleshooting or when migrating from a legacy slapd.conf that you converted with slaptest.

How do I confirm two nodes are fully in sync?

Search the suffix entry for contextCSN on each node (ldapsearch -b "dc=example,dc=com" -s base contextCSN) and compare the values. When all nodes report the same set of CSNs — one per serverID — the directories are identical. Persistent differences indicate a stuck replication agreement, a firewall block, or clock skew.
