How Percona XtraBackup works
Prerequisites
BACKUP_ADMIN is required to query performance_schema.log_status and to use LOCK INSTANCE FOR BACKUP, LOCK TABLES FOR BACKUP, or LOCK BINLOG FOR BACKUP. Additional privileges may be required depending on options: RELOAD, LOCK TABLES, and REPLICATION CLIENT are still needed for scenarios that use FLUSH TABLES WITH READ LOCK or --slave-info. Grant the minimum set for your use case; see Connection and privileges needed for the full list and examples.
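As a minimal sketch of the privilege set described above, the shell function below prints GRANT statements for a backup account. The function name `backup_grants_sql` and the account used in the usage line are hypothetical, and the privilege list only mirrors this section; trim it to what your options need. The account is assumed to exist already.

```shell
# Hypothetical helper: print GRANT statements for a backup account.
# Arguments: account user name, account host.
backup_grants_sql() {
    bg_user="$1"
    bg_host="$2"
    # Unquoted heredoc so the shell expands the two variables.
    cat <<SQL
GRANT BACKUP_ADMIN, RELOAD, LOCK TABLES, REPLICATION CLIENT ON *.* TO '${bg_user}'@'${bg_host}';
GRANT SELECT ON performance_schema.log_status TO '${bg_user}'@'${bg_host}';
SQL
}
```

Reviewing the printed SQL before applying it is the safer habit; for example, `backup_grants_sql bkpuser localhost | mysql -u root -p` would pipe it into the client.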
High-level process: three phases
Percona XtraBackup (PXB) follows a sequential lifecycle. Understanding the three phases helps you reason about backup consistency and where blocking can occur.
| Step | Phase | What happens |
|---|---|---|
| 1 | Hot copy (Backup) | PXB copies data files while the server is running and tracks changes via the redo log. |
| 2 | Make the data consistent (Prepare) | PXB applies the captured redo log to the copied data, then rolls back uncommitted transactions (roll forward + roll back). |
| 3 | Deployment (Restore) | The prepared, consistent data is copied or moved back to the server data directory. |
Under the hood, PXB relies on InnoDB’s crash-recovery model: PXB copies data files (which are momentarily inconsistent), then replays the redo log and applies undo to produce a consistent snapshot. The sections below expand on each phase.
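The three phases map onto three `xtrabackup` invocations. The sketch below wraps each one in a shell function; the function names and the `BACKUP_DIR` default are hypothetical, and connection options are omitted.

```shell
# Hypothetical backup location; override by exporting BACKUP_DIR.
BACKUP_DIR="${BACKUP_DIR:-/data/backups}"

# Phase 1: hot copy while the server is running.
phase_backup()  { xtrabackup --backup --target-dir="$BACKUP_DIR"; }

# Phase 2: apply redo, then roll back uncommitted transactions.
phase_prepare() { xtrabackup --prepare --target-dir="$BACKUP_DIR"; }

# Phase 3: copy the prepared files into the datadir.
# Run this on the target host with the server stopped and an empty datadir.
phase_restore() { xtrabackup --copy-back --target-dir="$BACKUP_DIR"; }

# Back up and prepare in one step; prepare runs only if the backup succeeded.
backup_and_prepare() { phase_backup && phase_prepare; }
```

Keeping restore as a separate function reflects that deployment usually happens later, and often on a different host, than backup and prepare.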
Documentation scope
This guide covers Percona XtraBackup 8.4. Server behavior (for example, performance_schema.log_status) can vary by MySQL or Percona Server version. Verify your build with xtrabackup --version and your server with the server version command (for example, SELECT VERSION();). Check release notes or vendor documentation when in doubt.
Technical deep-dive: Backup phase
This section explains how PXB avoids blocking your database during the hot copy.
Redo log thread: capturing changes in real time
Percona XtraBackup records the log sequence number (LSN) when the backup starts, then copies InnoDB data files. Because the copy takes time, the on-disk files change while copying. A background thread runs for the duration of the backup: the thread watches the redo (transaction) log files and continuously copies new log data. The redo log is written in a round-robin fashion and can be reused; the background thread ensures PXB has captured all changes up to a consistent point.
The backup and locking lifecycle in text form: (1) Phase 1 — Non-blocking: record LSN, copy InnoDB data and redo while a background thread follows new redo. (2) Phase 2 — Lightweight lock: under LOCK INSTANCE FOR BACKUP or LOCK TABLES FOR BACKUP, copy non-InnoDB files. (3) Phase 3 — Final sync: under LOCK BINLOG FOR BACKUP, finish redo copy, fetch binlog/replica coordinates, then unlock and exit. DML continues during Phase 1; Phase 2 blocks only DDL; Phase 3 is a brief hold to capture a consistent binlog position.
The following diagram shows the same flow. If the diagram does not render (for example, in a viewer that does not support Mermaid), use the text description above.
```mermaid
flowchart LR
    subgraph Phase1["Phase 1: Non-blocking"]
        A[Record LSN] --> B[Copy InnoDB data + redo]
        B --> C[Background thread follows redo]
    end
    subgraph Phase2["Phase 2: Lightweight lock"]
        D[LOCK INSTANCE / TABLES FOR BACKUP] --> E[Copy non-InnoDB files]
    end
    subgraph Phase3["Phase 3: Final sync"]
        F[LOCK BINLOG FOR BACKUP] --> G[Finish redo copy]
        G --> H[Fetch binlog/replica coords]
        H --> I[Unlock & exit]
    end
    Phase1 --> Phase2 --> Phase3
```
Redo log consumer (8.4): In MySQL 8.x, the server recycles redo log files quickly under write load. The optional parameter --register-redo-log-consumer (disabled by default) lets PXB register as a redo log consumer at backup start. The server then does not remove a redo log file until PXB (the consumer) has copied that file. The consumer reads the redo log and advances its LSN as it goes; the server may briefly block writes during that process, and it uses the consumer's progress to decide when a redo log file can be purged.
High-write servers
On busy servers, the server can reuse or purge redo log files before PXB has copied them. Enable --register-redo-log-consumer for high-write workloads to reduce the risk of backup failure.
Redo log consumer: disk and server impact
Enabling --register-redo-log-consumer prevents the server from purging a redo log file until PXB has copied it. On high-write systems, the option can therefore retain more redo on disk and increase disk usage (“redo bloat”). Monitor disk space and server I/O when using this option; you trade extra server-side redo retention for backup reliability.
Emergency — if disk becomes critically full: The consumer is released when the backup process exits. Stop the backup (Ctrl+C or send SIGTERM to the xtrabackup process) so the server can purge redo log files again. Resolve disk usage before retrying; consider enabling the consumer only when sufficient disk headroom exists.
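One way to act on the disk-headroom advice is to check free space before registering the consumer. The sketch below is an assumption-laden example, not a recommended threshold: the function name, its arguments, and the minimum-free-space value are all hypothetical, and it assumes the redo log lives on the same filesystem as the directory you pass in.

```shell
# Hypothetical wrapper: register the redo log consumer only when the
# filesystem holding the redo log has enough free space.
# Arguments: directory to check, backup target dir, minimum free KB.
backup_with_consumer() {
    bc_checkdir="$1"
    bc_target="$2"
    bc_min_kb="$3"
    # df -Pk prints one POSIX-format line per filesystem in 1024-byte
    # blocks; the fourth column of the second line is the available space.
    bc_free_kb=$(df -Pk "$bc_checkdir" | awk 'NR == 2 { print $4 }')
    if [ "$bc_free_kb" -lt "$bc_min_kb" ]; then
        echo "only ${bc_free_kb} KB free; refusing to register redo log consumer" >&2
        return 1
    fi
    xtrabackup --backup --register-redo-log-consumer --target-dir="$bc_target"
}
```

For example, `backup_with_consumer /var/lib/mysql /data/backups 10485760` would require roughly 10 GB free before starting. The check runs only once, so monitoring during the backup is still needed.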
Locking hierarchy: minimal blocking
Backup locks are a lightweight alternative to FLUSH TABLES WITH READ LOCK. MySQL 8.4 supports an instance-level backup lock via LOCK INSTANCE FOR BACKUP.
- Phase 1 (Non-blocking): InnoDB data files and redo log are copied while DML continues. No global lock is held during this phase.
- Phase 2 (Lightweight lock): When backup locks are supported, PXB uses them so that non-InnoDB data can be copied without blocking DML on InnoDB. This lock blocks DDL (for example, CREATE, ALTER, DROP) but allows DML (INSERT, UPDATE, DELETE). With the default --lock-ddl=ON, the backup lock is taken at the start of the backup; with --lock-ddl=REDUCED, the lock is taken only after copying InnoDB data. Under the lock, PXB copies non-InnoDB files: .MRG, .MYD, .MYI, .CSM, .CSV, .sdi, and .par.
- Phase 3 (Final sync): PXB uses LOCK BINLOG FOR BACKUP to briefly block operations that would change binary log position or replica coordinates (Exec_Source_Log_Pos, Exec_Gtid_Set). PXB then finishes copying the redo log and fetches binary log coordinates from performance_schema.log_status, after which PXB releases the backup and binlog locks. The binary log position is printed to STDERR (redirect to a file if needed, for example xtrabackup OPTIONS 2> backupout.log), and PXB exits with 0 on success.
Locking is only needed for MyISAM and other non-InnoDB tables after InnoDB data and logs are backed up, so DML on InnoDB is not blocked during the main copy.
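Because the binary log position goes to STDERR and success is signalled by exit status 0, a wrapper can capture both. This is a sketch only; the function name and its arguments are hypothetical.

```shell
# Hypothetical wrapper: keep xtrabackup's STDERR in a log file and
# report success or failure from the exit status.
# Arguments: backup target dir, log file path.
logged_backup() {
    lb_target="$1"
    lb_log="$2"
    if xtrabackup --backup --target-dir="$lb_target" 2> "$lb_log"; then
        echo "backup OK; see $lb_log for the binary log position"
    else
        # In the else branch, $? still holds xtrabackup's exit status.
        lb_status=$?
        echo "backup failed with exit status $lb_status; see $lb_log" >&2
        return "$lb_status"
    fi
}
```

Returning the original exit status lets a scheduler or calling script distinguish a failed backup from a successful one without parsing the log.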
When locks are avoided
When all tables in every schema are InnoDB, PXB can avoid backup locks and obtain binary log coordinates from performance_schema.log_status only. In practice, the mysql system schema often contains non-InnoDB tables (for example, MyISAM or CSV, such as general_log), so backup locks are usually still taken. Treat “lockless” as applying only when you have confirmed that no schema uses MyISAM or other non-InnoDB engines. On Percona Server for MySQL 8.4, log_status is extended to include relay log coordinates, so no extra locks are needed even with --slave-info. On standard MySQL 8.4, FLUSH TABLES WITH READ LOCK is still required when using --slave-info if relay log position is needed.
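To confirm whether any schema still uses a non-InnoDB engine, you can query information_schema.tables. The sketch below wraps such a query in a shell function; the function name is hypothetical, any arguments are passed through to the mysql client as connection options, and excluding the PERFORMANCE_SCHEMA engine is an assumption made here because those tables are not stored as data files.

```shell
# Hypothetical helper: list base tables that are not stored in InnoDB.
# Any arguments are passed to the mysql client as connection options.
list_non_innodb_tables() {
    mysql "$@" --batch --skip-column-names <<'SQL'
SELECT table_schema, table_name, engine
  FROM information_schema.tables
 WHERE table_type = 'BASE TABLE'
   AND engine NOT IN ('InnoDB', 'PERFORMANCE_SCHEMA')
 ORDER BY table_schema, table_name;
SQL
}
```

An empty result suggests the all-InnoDB case described above; any row returned (for example, a CSV or MyISAM table in the mysql schema) means backup locks are likely still taken.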
See Index of files created by Percona XtraBackup for the files created in the backup directory.
Cloud and streaming backups (xbcloud, S3, Azure)
When backing up directly to cloud storage (for example, via xbcloud to S3 or Azure Blob), the backup lifecycle is the same, but the Final Sync phase is subject to network latency: the binlog lock may be held longer while redo and metadata are flushed to the remote endpoint. Plan for a longer binlog lock hold when using streaming or cloud backups in production. See Take a streaming backup and xbcloud binary overview for cloud-specific behavior.
Technical deep-dive: Prepare phase (recovery)
The --prepare step turns the raw backup into a consistent snapshot by applying redo and then undoing uncommitted work. The prepare step aligns InnoDB with the backup’s sync point (or, when FLUSH TABLES WITH READ LOCK is used, the time that lock was taken), so that InnoDB and MyISAM data are consistent with each other.
Part A: Redo application (physical)
XtraBackup applies changes from the redo log directly to page offsets in the tablespace (IBD files). Redo application is a physical operation: it works at the page level, not at the row or transaction level. The redo log can contain changes from uncommitted transactions (the server may flush them to disk before commit), so redo application alone does not guarantee transactional consistency.
Part B: Undo application (logical, with SDI)
After redo, XtraBackup uses the undo log to logically roll back any uncommitted transactions whose changes appear in the redo log. Undo records are typed as INSERT or UPDATE and carry a table_id. To perform rollback, XtraBackup initializes the InnoDB engine and data dictionary, then uses Serialized Dictionary Information (SDI) from the tablespace—a JSON representation of the table—to parse index pages and apply undo operations.
Table metadata (SDI)
In MySQL 8.0+, table definitions live in the tablespace as SDI (Serialized Dictionary Information), not in separate .frm files. During prepare, PXB uses SDI to map table_id to table structure for undo rollback.
Tables are loaded as evictable; PXB maps table_id to tablespace via the data dictionary and loads user tables only when needed for rollback. This design reduces memory and I/O and speeds up prepare and Percona XtraDB Cluster SST.
After --prepare, InnoDB tables are rolled forward to the backup completion point, not rolled back to the start. Both InnoDB and MyISAM tables are consistent with each other at that point.
Prepare is often the bottleneck
The prepare phase is frequently the longest part of recovery, especially for large or incremental backups. You can shorten the prepare phase by:
- --use-memory: Increases memory used during prepare (similar to a buffer pool). Default is 100MB; recommended 1GB–2GB when RAM allows. Only applies to the prepare phase. If you run prepare on the same host as production MySQL, do not allocate memory that the OS or MySQL need; doing so can trigger OOM (Out of Memory) kills. Prefer running prepare on a separate host or leave sufficient headroom. See --use-memory.
- --parallel: From Percona XtraBackup 8.4.0-3 onward, prepare can use multiple threads to apply .delta files (incremental backups). This does not parallelize the initial redo log application on a full backup; setting, for example, --parallel=64 on a full backup will not make redo application multi-threaded. Use a numeric value; minimum recommended is 4 (for example, --parallel=4). See --parallel.
Example: xtrabackup --prepare --use-memory=2G --parallel=4 --target-dir=/data/backups/
Restore phase (deployment)
To restore a backup, use --copy-back or --move-back. XtraBackup reads the target paths from your configuration (for example, datadir, innodb_data_home_dir, innodb_data_file_path, innodb_log_group_home_dir in my.cnf) and ensures the directories exist. XtraBackup copies (or moves) files in a defined order: MyISAM-related files (.MRG, .MYD, .MYI, .CSM, .CSV, .sdi, .par) first, then InnoDB tables and indexes, then log files. File attributes are preserved.
Datadir ownership and permissions
You must set correct ownership and permissions on the datadir before starting the server (for example, chown -R mysql:mysql /var/lib/mysql). Failing to do so is one of the most common causes of “database won’t start” after a restore. Restored files are owned by the user who ran the backup; the server typically expects them to be owned by the mysql system user.
--move-back moves files instead of copying and removes them from the backup directory. Use --move-back when disk space is limited; the backup is consumed and cannot be reused.
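The restore steps above can be sketched as a single shell function. Everything named here is an assumption for illustration: the function name, the systemd service name `mysql`, and the `mysql:mysql` owner. The sketch also assumes the server is already stopped and that the datadir has been emptied or moved aside.

```shell
# Hypothetical restore sequence.
# Arguments: prepared backup dir, server datadir.
restore_backup() {
    rb_backup="$1"
    rb_datadir="$2"
    # Refuse to restore into a datadir that still contains files.
    if [ -n "$(ls -A "$rb_datadir" 2>/dev/null)" ]; then
        echo "datadir $rb_datadir is not empty; refusing to restore" >&2
        return 1
    fi
    # Swap --copy-back for --move-back when disk space is limited.
    xtrabackup --copy-back --target-dir="$rb_backup" --datadir="$rb_datadir" || return 1
    # Fix ownership before the server starts.
    chown -R mysql:mysql "$rb_datadir" || return 1
    systemctl start mysql
}
```

Each step returns early on failure so the server is never started on a partially restored or wrongly owned datadir.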
For full restore procedures, see Restore full, incremental, and compressed backups.