Introduction

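MariaDB Galera Cluster is a virtually synchronous multi-master replication solution designed for high availability and strong consistency.

Unlike traditional asynchronous replication, Galera replicates each transaction's write set to all nodes and certifies it before the commit returns to the client. Every node therefore accepts the same transactions in the same order, and replica lag is limited to the short time a node needs to apply write sets it has already received.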

It is commonly used for:

  • High availability databases
  • Automatic failover
  • Read scalability
  • Fault-tolerant applications

Galera Cluster Architecture

A Galera-enabled MariaDB node consists of several layers:

Core Components

MariaDB Server

Handles:

  • SQL execution
  • InnoDB storage engine
  • MVCC and locking
  • Transaction management

wsrep API

The Write Set Replication API connects MariaDB with Galera.

Responsibilities:

  • Extract write sets
  • Replicate transactions
  • Coordinate certification
  • Manage cluster synchronization

Galera Replication Provider

Handles:

  • Cluster communication
  • Membership control
  • Conflict detection
  • Replication ordering
  • Flow control

How Replication Works

Galera uses Write-Set Replication instead of binary log shipping.

Transaction flow:

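  1. Transaction executes locally on the node that received it
  2. At commit time, wsrep extracts the modified rows into a write set
  3. Write set is sent to all nodes
  4. Every node performs the same certification check
  5. If certification passes, the originating node commits and the other nodes apply the write set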

Example:

UPDATE accounts
SET balance = balance - 100
WHERE id = 10;

The modified row data is replicated before COMMIT succeeds.
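To make the idea of a write set more concrete, here is a minimal Python sketch of the kind of information one carries for the UPDATE above. The class names (RowChange, WriteSet), the node name, and the resulting balance of 400 are all hypothetical and chosen for illustration; Galera's real write-set format is binary and considerably richer.

```python
from dataclasses import dataclass, field


@dataclass
class RowChange:
    """One modified row (hypothetical structure, not Galera's wire format)."""
    table: str
    primary_key: tuple   # identifies the row on every node
    after_image: dict    # column values after the change


@dataclass
class WriteSet:
    """All row changes of one transaction, bundled for replication."""
    origin_node: str
    changes: list = field(default_factory=list)

    def keys(self):
        # The (table, primary key) pairs are what conflict detection compares.
        return {(c.table, c.primary_key) for c in self.changes}


# The UPDATE on accounts.id = 10, assuming the new balance works out to 400.
write_set = WriteSet(origin_node="node1")
write_set.changes.append(
    RowChange(table="accounts", primary_key=(10,), after_image={"balance": 400})
)
print(write_set.keys())
```

The point of the sketch is that the write set carries row data and row keys, not the SQL statement itself.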

Certification-Based Replication

Galera uses optimistic concurrency control.

Instead of distributed locking:

  • Transactions run independently
  • Conflict detection occurs during commit

If two nodes modify the same row simultaneously:

  • One transaction succeeds
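  • The other fails certification, is rolled back, and its client receives a deadlock error that the application should handle by retrying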

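Certification identifies conflicting rows by their keys, so every table should have a primary key. Tables without one are a documented Galera limitation: rows may appear in a different order on different nodes, and locating rows while applying write sets is slower.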
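The certification check can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not Galera's implementation: the Certifier class, the last_seen_position field, and the use of a delivery counter in place of Galera's global sequence number are all illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WriteSet:
    origin: str              # node where the transaction executed
    last_seen_position: int  # last delivered write set the transaction had seen
    keys: frozenset          # (table, primary key) pairs it modified


class Certifier:
    """Deterministic first-committer-wins check, run identically on every node."""

    def __init__(self):
        self.position = 0      # count of write sets delivered so far
        self.last_writer = {}  # key -> position of the last certified writer

    def certify(self, ws):
        self.position += 1
        # Conflict: some key was changed by a write set this transaction never saw.
        for key in ws.keys:
            if self.last_writer.get(key, 0) > ws.last_seen_position:
                return False
        for key in ws.keys:
            self.last_writer[key] = self.position
        return True


certifier = Certifier()
row = ("accounts", 10)
from_node1 = WriteSet("node1", last_seen_position=0, keys=frozenset({row}))
from_node2 = WriteSet("node2", last_seen_position=0, keys=frozenset({row}))

print(certifier.certify(from_node1))  # first writer of the row
print(certifier.certify(from_node2))  # concurrent writer of the same row
```

Because every node receives the same write sets in the same order and runs the same check, all nodes reach the same verdict without exchanging locks.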

Group Communication System (GCS)

The Group Communication System ensures:

  • Ordered transaction delivery
  • Reliable node communication
  • Membership tracking
  • Failure detection

Galera uses Extended Virtual Synchrony (EVS) to maintain cluster consistency.
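Ordered delivery is the property the rest of the design depends on. The Python sketch below illustrates only that property: two nodes receive the same messages in different network arrival order, yet hand them to the database in the same sequence. The OrderedDelivery class and the explicit sequence numbers are assumptions made for illustration; this is not how the EVS protocol itself is implemented.

```python
import heapq


class OrderedDelivery:
    """Buffers out-of-order messages and delivers them strictly by sequence number."""

    def __init__(self):
        self.next_seqno = 1
        self.pending = []    # min-heap of (seqno, payload)
        self.delivered = []  # payloads in the order they were handed over

    def receive(self, seqno, payload):
        if seqno < self.next_seqno:
            return  # duplicate of something already delivered
        heapq.heappush(self.pending, (seqno, payload))
        # Deliver every message that is now contiguous with what came before.
        while self.pending and self.pending[0][0] == self.next_seqno:
            _, ready = heapq.heappop(self.pending)
            self.delivered.append(ready)
            self.next_seqno += 1


node_a, node_b = OrderedDelivery(), OrderedDelivery()
for seqno, payload in [(2, "ws-b"), (1, "ws-a"), (3, "ws-c")]:
    node_a.receive(seqno, payload)
for seqno, payload in [(3, "ws-c"), (2, "ws-b"), (1, "ws-a")]:
    node_b.receive(seqno, payload)

print(node_a.delivered == node_b.delivered)
```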

Quorum and Split-Brain Prevention

Galera requires a majority (more than half) of the nodes to remain connected to each other for the cluster to keep accepting writes.

Example:

3-node cluster -> minimum 2 nodes required

If quorum is lost:

  • The affected nodes switch to a non-primary state
  • They stop accepting writes (and, by default, reads) until quorum is restored

This prevents split-brain scenarios.
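The majority rule is simple arithmetic, shown here as a small Python sketch. The function name has_quorum is hypothetical, and the sketch assumes every node has equal weight and that nodes failed rather than left gracefully; Galera's actual calculation also accounts for node weights and for the membership of the previous primary component.

```python
def has_quorum(reachable_nodes: int, cluster_size: int) -> bool:
    """True when the reachable nodes form a strict majority of the cluster."""
    return 2 * reachable_nodes > cluster_size


for size in (2, 3, 4, 5):
    # Smallest partition that still holds a strict majority.
    needed = size // 2 + 1
    print(f"{size}-node cluster -> minimum {needed} nodes required")
```

Note that an even split never has quorum: in a 2-node or 4-node cluster, losing half the nodes stops writes on both sides, which is one reason odd cluster sizes are preferred.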

State Transfer Mechanisms

When a node joins the cluster, it synchronizes using:

SST (State Snapshot Transfer)

Copies the full database from a donor node to the joining node.

Common method:

wsrep_sst_method=mariabackup

IST (Incremental State Transfer)

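Transfers only the missing write sets, served from the donor's GCache (its write-set cache).

IST is much faster than SST, but it is only possible while the donor's GCache still holds every write set the joining node missed.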
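The choice between the two mechanisms can be sketched as a single comparison. In the Python example below, the function and parameter names are hypothetical; donor_cached_downto stands for the lowest sequence number still held in the donor's GCache (exposed by the wsrep_local_cached_downto status variable). The real decision also checks that both nodes belong to the same cluster history, which this sketch leaves out.

```python
def choose_state_transfer(joiner_seqno, donor_cached_downto):
    """Return "IST" if the donor can replay what the joiner missed, else "SST".

    joiner_seqno: last write set the joiner applied, or None if its state is unknown.
    donor_cached_downto: oldest write set still available in the donor's GCache.
    """
    if joiner_seqno is None or joiner_seqno < 0:
        return "SST"  # no usable local state, a full copy is required
    first_missing = joiner_seqno + 1
    if first_missing >= donor_cached_downto:
        return "IST"  # everything the joiner lacks is still cached
    return "SST"      # the gap is older than the cache reaches


print(choose_state_transfer(joiner_seqno=950, donor_cached_downto=900))
print(choose_state_transfer(joiner_seqno=100, donor_cached_downto=900))
```

A larger GCache lets nodes stay offline longer and still rejoin through IST.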

Flow Control

Slow nodes can cause replication backlog.

When a node's receive queue grows too long, Galera's flow control pauses replication temporarily until that node catches up.

Important metric:

wsrep_flow_control_paused

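This status variable reports the fraction of time (0.0 to 1.0) that replication was paused since the last FLUSH STATUS. High values indicate:

  • Slow disks
  • Network latency
  • CPU bottlenecks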
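The toy simulation below shows how a slow node translates into a paused fraction. It is a rough model under stated assumptions: the function name, the tick-based timing, and the parameter values are illustrative, and the pause and resume thresholds are only loosely modeled on the gcs.fc_limit and gcs.fc_factor provider options.

```python
def paused_fraction(incoming_per_tick, applied_per_tick, ticks,
                    fc_limit=16, fc_factor=0.5):
    """Fraction of ticks during which replication was paused by one node."""
    queue = 0
    paused = False
    paused_ticks = 0
    for _ in range(ticks):
        if not paused:
            queue += incoming_per_tick  # new write sets arrive only while unpaused
        queue = max(0, queue - applied_per_tick)  # the node applies what it can
        if not paused and queue > fc_limit:
            paused = True   # queue too long: ask the cluster to pause
        elif paused and queue <= fc_limit * fc_factor:
            paused = False  # queue drained enough: resume replication
        if paused:
            paused_ticks += 1
    return paused_ticks / ticks


fast_node = paused_fraction(incoming_per_tick=5, applied_per_tick=10, ticks=100)
slow_node = paused_fraction(incoming_per_tick=10, applied_per_tick=5, ticks=100)
print(fast_node, slow_node)
```

A node that applies write sets faster than they arrive never triggers a pause, while a node that applies them at half the arrival rate keeps the whole cluster paused for a large share of the time.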

Important Ports

Port   Purpose
3306   Client traffic
4567   Replication
4568   IST
4444   SST
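When a node cannot join, a quick reachability check of these ports often narrows down firewall problems. The Python sketch below uses only the standard library; the helper names port_open and check_node are hypothetical. Keep in mind that the IST and SST ports may only be listening while a transfer is in progress, so a closed result for them on a healthy node is not necessarily an error.

```python
import socket

# Port numbers and purposes as listed in the table above.
GALERA_PORTS = {
    3306: "Client traffic",
    4567: "Replication",
    4568: "IST",
    4444: "SST",
}


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_node(host):
    """Map each Galera-related port to whether it accepted a TCP connection."""
    return {port: port_open(host, port) for port in GALERA_PORTS}
```

Calling check_node with the address of a cluster member returns a dictionary keyed by port number, which can be printed alongside the purposes in GALERA_PORTS.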

Performance Considerations

Galera performs best with:

  • Low-latency networks
  • SSD storage
  • Small transactions
  • Proper indexing

Avoid:

  • Large transactions
  • WAN latency
  • High write contention

Best Practices

Use a Minimum of 3 Nodes

Recommended architecture: three nodes (Node1, Node2, Node3), so that the cluster keeps quorum when any single node fails.

Use InnoDB Only

default_storage_engine=InnoDB

MyISAM is not transactional, and Galera's support for replicating it is experimental at best, so it is unsafe for clustered data.

Tune Parallel Replication

wsrep_slave_threads=16

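Applying write sets in parallel improves replication throughput on multi-core systems. Treat 16 as a starting point and tune it against the node's core count and workload.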

Galera vs Traditional Replication

Feature        Async Replication   Galera
Replication    Asynchronous        Virtually synchronous
Replica Lag    Possible            Minimal
Failover       Manual              Automatic
Multi-Master   Limited             Native
Consistency    Eventual            Strong

Conclusion

Galera Cluster extends MariaDB with:

  • Synchronous replication
  • Multi-master capability
  • Automatic failover
  • Strong consistency

Its architecture combines:

  • Write-set replication
  • Certification-based conflict detection
  • Group communication protocols
  • Automatic state synchronization

For production environments requiring high availability and minimal downtime, Galera remains one of the most reliable clustering solutions in the MariaDB ecosystem.