A Simple rsync Backup Script with Smart Excludes

Backups don’t need to be complicated. This post presents a single-file rsync wrapper that handles the common cases: excluding development cruft, resuming interrupted transfers, and optionally verifying checksums.

The Problem

You have directories to back up regularly:

  • Project folders with node_modules and __pycache__ bloat
  • Data directories that shouldn’t include virtual environments
  • Large transfers that might get interrupted

You want a script that:

  • Excludes development artifacts automatically
  • Resumes partial transfers
  • Logs everything for debugging
  • Optionally verifies file integrity

The Script

#!/bin/bash

set -euo pipefail

print_usage() {
    echo "Usage: $0 --src=SOURCE_DIR --dest=DEST_DIR [--verify|-v]"
    echo
    echo "Options:"
    echo "  --src=DIR       Absolute path to source directory"
    echo "  --dest=DIR      Absolute path to destination directory"
    echo "  --verify, -v    Enable checksum-based verification (slower but safer)"
    echo "  --help, -h      Show this help message"
    exit 1
}

# Parse arguments
SRC=""
DEST=""
USE_CHECKSUM=0

for arg in "$@"; do
    case "$arg" in
        --src=*)
            SRC="${arg#*=}"
            ;;
        --dest=*)
            DEST="${arg#*=}"
            ;;
        --verify|-v)
            USE_CHECKSUM=1
            ;;
        --help|-h)
            print_usage
            ;;
        *)
            echo "Unknown argument: $arg"
            print_usage
            ;;
    esac
done

# Validate arguments
if [[ -z "$SRC" || -z "$DEST" ]]; then
    echo "Error: --src and --dest are required."
    print_usage
fi

# Log file
LOGFILE="$HOME/rsync-backup.log"
mkdir -p "$(dirname "$LOGFILE")"

log_message() {
    echo "$1" | tee -a "$LOGFILE"
}

log_message "Starting rsync from $SRC to $DEST on $(date)"
[[ $USE_CHECKSUM -eq 1 ]] && log_message "Checksum verification mode ENABLED"

# Validate source
if [[ ! -d "$SRC" ]]; then
    log_message "Error: Source directory $SRC does not exist"
    exit 1
fi

# Create destination if missing
if [[ ! -d "$DEST" ]]; then
    log_message "Creating destination directory $DEST"
    mkdir -p "$DEST"
fi

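# Create temporary exclude file; the trap removes it on every exit path,
# including an early abort triggered by set -e
EXCLUDE_FILE="$(mktemp /tmp/rsync-exclude.XXXXXX)"
trap 'rm -f "$EXCLUDE_FILE"' EXIT
cat > "$EXCLUDE_FILE" <<'EOF'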
venv/
.venv/
**/node_modules/
**/__pycache__/
**/.mypy_cache/
**/.pytest_cache/
.DS_Store
EOF

# Build rsync options
RSYNC_OPTS=(-aP
    --partial --partial-dir=.rsync-partials
    --log-file="$LOGFILE"
    --exclude-from="$EXCLUDE_FILE"
)

# Add checksum if requested
[[ $USE_CHECKSUM -eq 1 ]] && RSYNC_OPTS+=(--checksum)

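# Run rsync. Testing it directly in the `if` keeps `set -e` from aborting
# the script before the failure can be logged, and lets us capture rsync's
# own exit code rather than the exit code of a later test.
if rsync "${RSYNC_OPTS[@]}" "$SRC" "$DEST"; then
    log_message "Rsync completed successfully on $(date)"
else
    rc=$?
    log_message "Error: Rsync failed with exit code $rc"
    rm -f "$EXCLUDE_FILE"
    exit "$rc"
fi

rm -f "$EXCLUDE_FILE"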

Key Features

Smart Excludes

The script automatically skips common development artifacts:

Pattern              Purpose
venv/, .venv/        Python virtual environments
**/node_modules/     npm dependencies (often gigabytes)
**/__pycache__/      Python bytecode cache
**/.mypy_cache/      Type checker cache
**/.pytest_cache/    Test runner cache
.DS_Store            macOS metadata files

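All of these patterns match at any depth in the directory tree. The ** prefix says so explicitly, and rsync also matches a pattern with no leading slash, such as venv/ or .DS_Store, against the final path component wherever it occurs.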
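To check which paths a pattern list actually drops, a dry run is a cheap test. The sketch below assumes a hypothetical helper name, preview_excludes, which is not part of the script; it lists what rsync would transfer without copying anything.

```shell
#!/bin/bash
# Hypothetical helper (not part of the script above): list what rsync would
# transfer from SRC, honoring the patterns in EXCLUDE_FILE, without copying.
preview_excludes() {
    local src="$1" exclude_file="$2"
    local scratch
    scratch="$(mktemp -d)"   # empty throwaway destination
    # --dry-run reports only; --itemize-changes prints one line per path;
    # the trailing slash on the source means "the contents of src".
    rsync -a --dry-run --itemize-changes \
        --exclude-from="$exclude_file" "${src%/}/" "$scratch/"
    rmdir "$scratch"
}

# Example: preview_excludes /home/user/projects /tmp/my-excludes.txt
```

Any path that appears in the listing would be copied, so an excluded directory showing up there points at a pattern that does not match the way you expected.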

Resumable Transfers

The script uses --partial with a dedicated partial directory:

--partial --partial-dir=.rsync-partials

If a transfer is interrupted mid-file, rsync saves the partial file in .rsync-partials/. On the next run, it resumes from where it left off rather than starting over. This is essential for large files over unreliable connections.
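Because the partial directory is given as a relative path, rsync creates it next to each file being transferred and removes it again once that file completes. Leftover partials are therefore a sign of an interrupted run. A minimal sketch for finding them, where list_partials is a hypothetical helper name:

```shell
#!/bin/bash
# Hypothetical helper: print any partial files left under DEST by an
# interrupted run. The directory name matches the --partial-dir value
# used in the script.
list_partials() {
    local dest="$1"
    find "$dest" -type f -path '*/.rsync-partials/*'
}

# Example: list_partials /mnt/backup/projects
```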

Checksum Verification

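By default, rsync decides whether a file has changed by comparing its size and modification time. With --verify, the script adds --checksum, which makes rsync compare a whole-file checksum for every file instead. That reads every file on both sides, so it is considerably slower: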

# Fast mode (default) - uses mtime and size
./backup.sh --src=/data --dest=/backup

# Verification mode - computes checksums
./backup.sh --src=/data --dest=/backup --verify

Use verification mode when:

  • Backing up to a new destination for the first time
  • Recovering from a failed or interrupted backup
  • Verifying backup integrity periodically
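Note that --verify still copies whatever differs. For a read-only integrity check, --checksum can be combined with --dry-run so rsync only reports. The helper below is a sketch under that assumption; verify_backup is a hypothetical name, and any extra arguments (for example the same --exclude-from file used for the backup) are passed through to rsync.

```shell
#!/bin/bash
# Hypothetical helper: report files whose contents differ between SRC and
# DEST without changing anything. Each line printed is a path rsync would
# update on a real run.
verify_backup() {
    local src="$1" dest="$2"
    shift 2
    # --checksum compares whole-file checksums; --dry-run makes no changes;
    # --itemize-changes prints one line per differing path.
    rsync -r --checksum --dry-run --itemize-changes "$@" \
        "${src%/}/" "${dest%/}/"
}

# Example: verify_backup /data /backup --exclude-from=/tmp/my-excludes.txt
```

Without the exclude file, directories that were skipped during the backup would show up as missing from the destination.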

Logging

All operations log to ~/rsync-backup.log:

Starting rsync from /mnt/extra/ to /mnt/backup/ on Mon Mar  3 14:32:01 MST 2026
Checksum verification mode ENABLED
Rsync completed successfully on Mon Mar  3 14:47:23 MST 2026

The log includes rsync’s detailed transfer log, useful for debugging failed transfers or auditing what changed.
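The script's own messages only carry a date on the start and finish lines, while rsync's log entries are prefixed with a date and time. If you want every line to be sortable the same way, one option is a timestamped variant of log_message. This is a sketch, and the timestamp format is an assumption chosen to resemble rsync's prefix:

```shell
#!/bin/bash
LOGFILE="${LOGFILE:-$HOME/rsync-backup.log}"

# Variant of log_message that prefixes each message with a timestamp
# before echoing it to the console and appending it to the log.
log_message() {
    printf '%s %s\n' "$(date '+%Y/%m/%d %H:%M:%S')" "$1" | tee -a "$LOGFILE"
}

# Example: log_message "Starting rsync from $SRC to $DEST"
```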

Usage Examples

Basic Backup

./backup.sh --src=/home/user/projects --dest=/mnt/backup/projects
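One detail worth knowing: the script passes --src to rsync exactly as given, and rsync treats a trailing slash on the source as significant. A source of /home/user/projects copies the directory itself, so the files land in /mnt/backup/projects/projects, whereas /home/user/projects/ copies its contents into /mnt/backup/projects. If you always want the second behavior, a small normalizing helper is one way to get it. The name with_trailing_slash is hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: make a source path end in exactly one slash so
# rsync copies the directory's contents rather than the directory itself.
with_trailing_slash() {
    local path="$1"
    # Strip all trailing slashes (but leave a bare "/" alone)...
    while [[ "$path" == */ && "$path" != "/" ]]; do
        path="${path%/}"
    done
    if [[ "$path" == "/" ]]; then
        printf '/\n'
    else
        # ...then add a single slash back.
        printf '%s/\n' "$path"
    fi
}

# Example: SRC="$(with_trailing_slash "$SRC")"
```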

External Drive Backup

# Mount point to USB drive
./backup.sh --src=/home/user --dest=/run/media/user/BackupDrive/home

Network Backup (via mounted share)

# NFS or CIFS mount
./backup.sh --src=/var/data --dest=/mnt/nas/backups/data

Periodic Verification

# Weekly cron job with verification
0 3 * * 0 /opt/backup.sh --src=/data --dest=/backup --verify

Extending the Script

Add More Excludes

Edit the heredoc to add project-specific exclusions:

cat > "$EXCLUDE_FILE" <<EOF
venv/
.venv/
**/node_modules/
**/__pycache__/
**/.mypy_cache/
**/.pytest_cache/
.DS_Store
# Add your own
*.log
*.tmp
build/
dist/
.git/
EOF
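If you would rather not edit the script for every project, another option is to append patterns from an optional file. The sketch below assumes a per-user file at ~/.backup-excludes; both that location and the helper name build_exclude_file are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: write the default patterns to a temp file, append
# any extra patterns from an optional file, and print the temp file's path.
build_exclude_file() {
    local extra="${1:-$HOME/.backup-excludes}"   # assumed location
    local out
    out="$(mktemp /tmp/rsync-exclude.XXXXXX)"
    cat > "$out" <<'EOF'
venv/
.venv/
**/node_modules/
**/__pycache__/
**/.mypy_cache/
**/.pytest_cache/
.DS_Store
EOF
    # Extra patterns are optional; skip silently if the file is absent.
    if [[ -r "$extra" ]]; then
        cat "$extra" >> "$out"
    fi
    printf '%s\n' "$out"
}

# Example: EXCLUDE_FILE="$(build_exclude_file)"
```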

Dry Run Mode

Add a --dry-run flag to preview changes:

--dry-run|-n)
    DRY_RUN=1
    ;;

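Initialize DRY_RUN=0 alongside USE_CHECKSUM (set -u aborts the script on an unset variable), then append the flag to the rsync options:

[[ $DRY_RUN -eq 1 ]] && RSYNC_OPTS+=(--dry-run)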

Remote Destinations

rsync natively supports SSH destinations:

./backup.sh --src=/data --dest=user@server:/backup/data

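rsync handles the SSH transport itself, but the script's destination check does not: [[ -d "$DEST" ]] treats user@server:/backup/data as a local path, so as written the script runs mkdir -p on that string and leaves a stray local directory behind before rsync starts. Skip the local directory check when the destination is remote.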
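One way to tell a remote destination from a local one is the rule rsync itself applies: a colon that appears before the first slash marks a remote host. The helper below is a sketch of that rule; is_remote_dest is a hypothetical name, not something the script defines.

```shell
#!/bin/bash
# Hypothetical helper: succeed if DEST looks like an rsync remote
# (host:path, user@host:path, host::module, or rsync://...).
is_remote_dest() {
    local dest="$1"
    [[ "$dest" == rsync://* ]] && return 0
    # Everything up to the first slash. Local paths such as /mnt/backup
    # or ./a:b have no colon in this part.
    local before_slash="${dest%%/*}"
    [[ "$before_slash" == *:* ]]
}

# Example guard around the script's destination check:
#   if ! is_remote_dest "$DEST" && [[ ! -d "$DEST" ]]; then
#       mkdir -p "$DEST"
#   fi
```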

Installation

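# Download (create ~/bin first if it does not exist; -f makes curl fail
# on an HTTP error instead of saving the error page)
mkdir -p ~/bin
curl -fL -o ~/bin/backup.sh https://raw.githubusercontent.com/Derrekito/emergency_backup/main/backup.sh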

# Make executable
chmod +x ~/bin/backup.sh

# Verify rsync is installed
rsync --version

On Arch Linux:

sudo pacman -S rsync

Summary

Feature                        Implementation
Exclude dev artifacts          Temporary exclude file with common patterns
Resume interrupted transfers   --partial --partial-dir
Verify integrity               --checksum flag
Logging                        --log-file + tee to console
Progress display               -P flag (progress + partial)

This script covers the common directory-backup cases in roughly 100 lines. For more complex needs such as incremental snapshots or deduplication, consider tools like restic or borg. For straightforward directory mirroring, rsync with smart defaults gets the job done.

This post is licensed under CC BY 4.0 by the author.