Introduction
MariaDB Galera Cluster is a virtually synchronous multi-master replication solution designed for high availability and strong consistency.

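Unlike traditional asynchronous replication, Galera replicates and certifies each transaction on all nodes before the commit returns to the client. The other nodes may apply the change a moment later, so replica lag is minimal rather than strictly zero, but a committed transaction is not lost when a single node fails.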
It is commonly used for:
- High availability databases
- Automatic failover
- Read scalability
- Fault-tolerant applications
Galera Cluster Architecture
A Galera-enabled MariaDB node consists of several layers:

Core Components
MariaDB Server
Handles:
- SQL execution
- InnoDB storage engine
- MVCC and locking
- Transaction management
wsrep API
The Write Set Replication API connects MariaDB with Galera.
Responsibilities:
- Extract write sets
- Replicate transactions
- Coordinate certification
- Manage cluster synchronization
Galera Replication Provider
Handles:
- Cluster communication
- Membership control
- Conflict detection
- Replication ordering
- Flow control
How Replication Works
Galera uses Write-Set Replication instead of binary log shipping.
Transaction flow:
- Transaction executes locally
- wsrep extracts modified rows
- Write set is sent to all nodes
- Nodes perform certification checks
- Transaction commits on the originating node and is applied on every other node
Example:
UPDATE accounts
SET balance = balance - 100
WHERE id = 10;
The modified row data is replicated before COMMIT succeeds.
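The transaction flow above can be sketched as a toy in-memory model in Python. The `WriteSet` and `Node` classes and the `debit` function are hypothetical names chosen for illustration; this is not the wsrep API, and it assumes certification always passes and ignores networking and ordering.

```python
from dataclasses import dataclass, field


@dataclass
class WriteSet:
    """Hypothetical stand-in for a write set: changed rows keyed by primary key."""
    table: str
    rows: dict  # primary key -> new column values


@dataclass
class Node:
    name: str
    accounts: dict = field(default_factory=dict)  # id -> balance

    def apply(self, ws: WriteSet) -> None:
        # Apply the row images carried by the write set.
        for pk, values in ws.rows.items():
            self.accounts[pk] = values["balance"]


def debit(origin: Node, cluster: list, account_id: int, amount: int) -> WriteSet:
    # 1. Execute locally: compute the new row image.
    new_balance = origin.accounts[account_id] - amount
    # 2. Extract the modified row into a write set.
    ws = WriteSet("accounts", {account_id: {"balance": new_balance}})
    # 3-4. Send to all nodes; certification is assumed to pass in this sketch.
    # 5. Every node, including the origin, applies the same write set.
    for node in cluster:
        node.apply(ws)
    return ws


cluster = [Node(f"node{i}", {10: 500}) for i in (1, 2, 3)]
debit(cluster[0], cluster, account_id=10, amount=100)
print([n.accounts[10] for n in cluster])
```

Running the sketch prints `[400, 400, 400]`: the write set carries the resulting row data, not the SQL statement, so each node ends up with the same balance.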
Certification-Based Replication
Galera uses optimistic concurrency control.
Instead of distributed locking:
- Transactions run independently
- Conflict detection occurs during commit
If two nodes modify the same row simultaneously:
- One transaction succeeds
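- The other fails certification and is rolled back, which the client typically sees as a deadlock error and should retry
Galera identifies modified rows by key during certification, so every table should have a primary key, both for performance and for reliable conflict detection.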
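A minimal sketch of the certification idea, assuming a simplified model in which each write set records the keys it modified and the highest sequence number its origin node had seen when the transaction ran. The `Certifier` class and its fields are hypothetical and much simpler than Galera's real certification index.

```python
from dataclasses import dataclass


@dataclass
class WriteSet:
    origin: str
    keys: frozenset   # primary keys of the rows the transaction modified
    last_seen: int    # highest seqno the origin had applied when the transaction ran


class Certifier:
    """Toy certification index: maps each key to the seqno that last modified it."""

    def __init__(self) -> None:
        self.seqno = 0
        self.last_modified = {}

    def certify(self, ws: WriteSet) -> bool:
        # In this sketch every write set takes a slot in the global order.
        self.seqno += 1
        # Conflict: a key was changed by a write set the origin had not yet seen.
        if any(self.last_modified.get(k, 0) > ws.last_seen for k in ws.keys):
            return False
        for k in ws.keys:
            self.last_modified[k] = self.seqno
        return True


certifier = Certifier()
row = frozenset({("accounts", 10)})
first = certifier.certify(WriteSet("node1", row, last_seen=0))
second = certifier.certify(WriteSet("node2", row, last_seen=0))
# node2 retries after it has seen both earlier write sets.
retry = certifier.certify(WriteSet("node2", row, last_seen=2))
print(first, second, retry)
```

Because every node processes write sets in the same order, each node reaches the same pass or fail verdict on its own, without distributed locks. In the sketch the first writer wins, the concurrent writer fails, and the retry succeeds.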
Group Communication System (GCS)
The Group Communication System ensures:
- Ordered transaction delivery
- Reliable node communication
- Membership tracking
- Failure detection
Galera uses Extended Virtual Synchrony (EVS) to maintain cluster consistency.
Quorum and Split-Brain Prevention
Galera requires a majority of nodes to remain operational.
Example:
3-node cluster -> minimum 2 nodes required
If quorum is lost:
- Cluster becomes non-primary
- Nodes in the non-primary component stop accepting writes (and, by default, reads)
This prevents split-brain scenarios.
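The majority rule can be sketched in a few lines of Python. The function name `has_quorum` is hypothetical, and the sketch assumes every node carries equal weight.

```python
def has_quorum(reachable: int, last_known_size: int) -> bool:
    """True if this partition holds strictly more than half of the last known membership."""
    return 2 * reachable > last_known_size


for total in (2, 3, 4, 5):
    needed = total // 2 + 1
    print(f"{total}-node cluster -> minimum {needed} nodes required")
```

An even split, such as 2 of 4 nodes on each side of a network partition, gives neither side a majority, which is one reason odd node counts are preferred.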
State Transfer Mechanisms
When a node joins the cluster, it synchronizes using:
SST (State Snapshot Transfer)
Full database copy from donor node.
Common method:
wsrep_sst_method=mariabackup
IST (Incremental State Transfer)
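Transfers only the missing transactions from the donor's write-set cache (GCache).
Much faster than SST, but only possible while the donor's GCache still holds every write set the joining node missed.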
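The choice between the two mechanisms can be sketched as a simple comparison of positions. This is an illustrative model, not Galera's actual donor-selection code: the function and parameter names are hypothetical, and `donor_cached_downto` stands for the lowest sequence number still held in the donor's GCache.

```python
def choose_state_transfer(joiner_uuid: str, joiner_seqno: int,
                          cluster_uuid: str, donor_cached_downto: int) -> str:
    """Return "IST" when the donor's cache still holds every missing write set, else "SST"."""
    if joiner_uuid != cluster_uuid:
        return "SST"   # joiner's state belongs to a different cluster history
    if joiner_seqno < 0:
        return "SST"   # joiner's position is unknown
    if donor_cached_downto <= joiner_seqno + 1:
        return "IST"   # the first missing write set is still cached
    return "SST"       # the gap has already been purged from the cache


print(choose_state_transfer("c1", 100, "c1", 50))
print(choose_state_transfer("c1", 100, "c1", 500))
```

The first call returns `IST` because the donor still caches everything after sequence number 100; the second returns `SST` because the cache starts too late. A larger GCache widens the window in which a restarted node can rejoin with IST.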
Flow Control
Slow nodes can cause replication backlog.
Galera uses flow control to pause replication temporarily.
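Important metric:
wsrep_flow_control_paused
This status variable reports the fraction of time (0 to 1) that replication was paused by flow control since the last FLUSH STATUS. High values indicate: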
- Slow disks
- Network latency
- CPU bottlenecks
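A rough sketch of how a paused fraction like this arises, assuming a simplified model in which a node asks the cluster to pause once its receive queue exceeds a limit and resumes once the queue drains below a lower mark. The parameters `fc_limit` and `fc_factor` are loosely modeled on Galera's `gcs.fc_limit` and `gcs.fc_factor` provider options; the default values and the sampling scheme used here are assumptions.

```python
def simulate_flow_control(queue_lengths, fc_limit=16, fc_factor=0.5):
    """Return the fraction of samples during which replication would be paused.

    queue_lengths: receive-queue length of the slow node at each sample.
    fc_limit / fc_factor: assumed thresholds for pausing and resuming.
    """
    if not queue_lengths:
        return 0.0
    paused = False
    paused_samples = 0
    for length in queue_lengths:
        if not paused and length > fc_limit:
            paused = True     # queue too long: ask the cluster to pause
        elif paused and length <= fc_limit * fc_factor:
            paused = False    # queue drained enough: resume
        if paused:
            paused_samples += 1
    return paused_samples / len(queue_lengths)


samples = [2, 10, 18, 14, 9, 8, 3, 1]
print(simulate_flow_control(samples))
```

With these samples the node is paused for 3 of 8 intervals, so the sketch prints `0.375`. A healthy cluster would normally stay close to 0.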
Important Ports
| Port | Purpose |
|---|---|
| 3306 | Client traffic |
| 4567 | Replication |
| 4568 | IST |
| 4444 | SST |
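A small sketch for checking whether these ports are reachable from another machine, using only Python's standard library. The helper names `port_open` and `check_node` are hypothetical, and the host name in the usage note is a placeholder.

```python
import socket

GALERA_PORTS = {3306: "Client traffic", 4567: "Replication", 4568: "IST", 4444: "SST"}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_node(host: str) -> dict:
    # Map each port listed above to its reachability from this machine.
    return {port: port_open(host, port) for port in GALERA_PORTS}
```

Calling `check_node("db1.example.internal")` returns a dictionary of port numbers to booleans. The IST and SST ports are typically only listening while a state transfer is in progress, so a closed 4568 or 4444 on an idle node does not by itself indicate a firewall problem.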
Performance Considerations
Galera performs best with:
- Low-latency networks
- SSD storage
- Small transactions
- Proper indexing
Avoid:
- Large transactions
- WAN latency
- High write contention
Best Practices
Use a Minimum of 3 Nodes
Recommended architecture:
Node1
Node2
Node3
Use InnoDB Only
default_storage_engine=InnoDB
Galera replication is built around transactional InnoDB tables; MyISAM tables are not replicated reliably and should be converted.
Tune Parallel Replication
wsrep_slave_threads=16
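Multiple applier threads let a node apply non-conflicting write sets in parallel, which improves replication throughput on multi-core systems. Treat 16 as a starting point and tune it against the actual workload.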
Galera vs Traditional Replication
| Feature | Async Replication | Galera |
|---|---|---|
| Replication | Asynchronous | Synchronous |
| Replica Lag | Possible | Minimal |
| Failover | Manual | Automatic |
| Multi-Master | Limited | Native |
| Consistency | Eventual | Strong |
Conclusion
Galera Cluster extends MariaDB with:
- Synchronous replication
- Multi-master capability
- Automatic failover
- Strong consistency
Its architecture combines:
- Write-set replication
- Certification-based conflict detection
- Group communication protocols
- Automatic state synchronization
For production environments requiring high availability and minimal downtime, Galera remains one of the most reliable clustering solutions in the MariaDB ecosystem.
