Database, Migrations, Backups, and Replication
This page describes how dinary stores data, how schema migrations
run, the full backup story (local cold copies, hot Litestream
replication to VM 2, and daily off-site backups to Yandex.Disk), and
the restore procedures for each.
Database file
The server stores everything in a single SQLite database:
- `data/dinary.db` — the live database (categories, groups, tags, stores, mappings, expenses, income, and sync-job metadata)
- `data/dinary.db-wal` and `data/dinary.db-shm` — SQLite WAL-mode sidecars that hold in-flight writes before they are checkpointed back into `dinary.db`
All three files belong together. Do not copy just `dinary.db`: the
WAL sidecar can legitimately contain committed transactions that
have not yet been checkpointed. Any backup flow that does not go
through SQLite's `.backup` API must stop the dinary service first.
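Two safe ways to take a copy, as a minimal sketch (the service name and paths are the ones used throughout this page):

# Option 1: online snapshot via SQLite's backup API; the service keeps running
sqlite3 data/dinary.db ".backup '/tmp/dinary-snapshot.db'"

# Option 2: cold copy; a clean service stop checkpoints the WAL, after which
# dinary.db alone is a complete database
sudo systemctl stop dinary
cp data/dinary.db /path/to/backup/dinary.db
sudo systemctl start dinary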
Migrations
Schema changes are managed with yoyo migrations living in
src/dinary/migrations/.
When migrations run automatically
- On application startup, the SQLite DB is opened and `yoyo` applies every outstanding migration against the live file before any request is served.
- During `inv deploy`, the deploy script calls `inv migrate` before restarting the service, so a broken migration fails the deploy instead of a running server.
For a fresh installation no manual migration step is required.
Manual migration
Migrations run automatically on every server start (yoyo tracks applied migrations and only applies new ones). No manual step is needed — just deploy and restart.
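If you ever want to drive the same migrations by hand against a scratch copy, the stock yoyo CLI works on the same directory. A sketch, assuming the standard yoyo-migrations entry point and the paths above:

uv run yoyo apply --batch --database sqlite:///data/dinary.db src/dinary/migrations   # apply outstanding
uv run yoyo list --database sqlite:///data/dinary.db src/dinary/migrations            # applied vs pending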
Integrity check
SQLite ships two pragmas for post-migration / post-restore sanity checks:
- `PRAGMA integrity_check` walks every btree page and reports structural damage (torn pages, index/table mismatches, orphan freelist entries).
- `PRAGMA foreign_key_check` lists every row that violates a declared FK.
Both are read-only and cheap. dinary wraps them in:
inv verify-db # local data/dinary.db
inv verify-db --remote # snapshot of prod DB, checked over SSH
`--remote` first takes a `sqlite3 .backup` snapshot into `/tmp` on
the server and checks that, so you never read a live file whose WAL
is mid-checkpoint.
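Under the hood the tasks are thin wrappers over the two pragmas; the local variant is equivalent to:

sqlite3 data/dinary.db 'PRAGMA integrity_check;'    # prints "ok" on a healthy file
sqlite3 data/dinary.db 'PRAGMA foreign_key_check;'  # no output means no violating rows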
Backups
SQLite is a single-file database, so every backup ultimately boils
down to "produce a transactionally consistent copy of
data/dinary.db". dinary offers two mechanisms depending on how
fresh and how tolerant of latency the copy needs to be.
Cold backup: inv backup
inv backup # copy to ./backups/<timestamp>/
inv backup --dest=./my-backups
This SSHes to the server, runs `sqlite3 .backup` on the live DB to
produce a consistent snapshot in `/tmp`, streams the bytes locally,
and writes `./backups/<timestamp>/dinary.db`. The snapshot is
atomic even while the service is writing — SQLite's online backup
API copies pages under a brief lock and retries torn reads.
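A hedged sketch of those moving parts (the host alias and remote DB path are assumptions; the authoritative logic is the invoke task itself):

ts=$(date -u +%Y-%m-%dT%H%M%SZ)
mkdir -p "./backups/$ts"
ssh ubuntu@dinary "sqlite3 ~/dinary/data/dinary.db \".backup '/tmp/dinary-snapshot.db'\""
scp ubuntu@dinary:/tmp/dinary-snapshot.db "./backups/$ts/dinary.db"
ssh ubuntu@dinary 'rm -f /tmp/dinary-snapshot.db'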
Use `inv backup` before:
- `inv deploy` (the deploy wrapper also runs its own pre-deploy snapshot automatically)
- any manual schema migration
- any ad-hoc DB surgery
Hot replica: Litestream
inv backup is pull-based and fires on demand. For continuous
streaming replication — "my VM 1 got terminated; how much data did
I just lose" — dinary supports Litestream (v0.5.x) as a sidecar
that ships LTX segments to an SFTP target continuously.
Prerequisites
- A second VM you control (VM 2, the Litestream replica host). The reference target is an Oracle Cloud Free Tier `VM.Standard.E2.1.Micro` with Ubuntu 22.04 Minimal, the same shape as VM 1.
- `DINARY_REPLICA_HOST` set in `.deploy/.env` on the operator machine, pointing at VM 2 (e.g. `ubuntu@dinary-replica` via Tailscale MagicDNS).
- VM 1's `~/.ssh/id_ed25519.pub` in VM 2's `~/.ssh/authorized_keys` — this is the trust that lets `litestream.service` on VM 1 push LTX segments over SFTP. Run `ssh-copy-id` manually once (cross-host trust is out of scope for automation from the operator machine); see the sketch after this list.
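One way to bootstrap that trust entirely from the operator machine, assuming the MagicDNS names above (a sketch; plain `ssh-copy-id` from VM 1 works too if VM 1 can still password-auth to VM 2):

# Append VM 1's public key on VM 2; both hops originate on the operator machine
ssh ubuntu@dinary 'cat ~/.ssh/id_ed25519.pub' | \
  ssh ubuntu@dinary-replica 'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys'
# Confirm VM 1 can now reach the SFTP host non-interactively
ssh ubuntu@dinary 'ssh -o BatchMode=yes -o StrictHostKeyChecking=accept-new ubuntu@dinary-replica true && echo trust OK'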
Provisioning VM 2
Run once from the operator machine against VM 2:
inv setup-replica
This installs unattended-upgrades, allocates a 1 GB swap file,
creates `/var/lib/litestream/` with the right ownership so the SFTP
receiver can drop LTX segments into it, and locks down public SSH
(see the Cloud security notes for the rationale). The task is
idempotent — re-running it after a `Persistent=true` reboot or a
Tailscale IP rotation converges cleanly.
setup-replica does NOT install the dinary app service, a public
tunnel, or any Python runtime — VM 2 is intentionally a minimal SFTP
sink. Everything the daily off-site backup needs (rclone, sqlite3,
zstd, the Litestream binary used only for local restore) is added
by inv setup-replica — see "Off-site backup: Yandex.Disk"
below.
One-time Litestream bootstrap on VM 1
- Copy the example config locally and fill in the SFTP target:
cp .deploy.example/litestream.yml .deploy/litestream.yml
# edit .deploy/litestream.yml — set host, user, path, key-path
- Install the Litestream sidecar on VM 1 (now part of
inv setup-replica):
inv setup-replica
This installs the Litestream binary, uploads the config to
/etc/litestream.yml, creates a litestream.service systemd
unit, and starts it. The task is idempotent — re-running
inv setup-replica upgrades the binary and reloads the
config.
- Confirm replication is healthy:
inv status --remote
A healthy sidecar shows an active systemd unit and the managed
DB path listed by litestream databases. An empty output means
the sidecar either never reached the SFTP host or is still
producing its first snapshot (first one lands within seconds of
the first DB write after the sidecar starts).
inv setup-server does not start Litestream automatically even when
.deploy/litestream.yml is present, because the sidecar needs an
already-reachable SFTP host with VM 1's public key in its
authorized_keys — a cross-host trust relationship we cannot set
up from the deploy workstation. Run inv setup-replica
manually once that prerequisite is in place.
What the sidecar does
Litestream v0.5 is a passive replicator: it opens the DB read-only, tails LTX segments out of SQLite's WAL, compacts them into level files, and ships them to the SFTP target. The app never talks to it. If the sidecar crashes, the app keeps writing into the WAL normally — you just stop accumulating replica state until the sidecar restarts. There is no back-pressure; SQLite's checkpoint loop is unaffected.
Default settings in the example config: a full snapshot every hour
and 7 days of LTX history (top-level snapshot: { interval: 1h,
retention: 168h }). That bounds "how far back can I rewind the DB"
to a week and bounds LTX replay on restore to a one-hour window.
These fields are global in v0.5 — they are NOT valid inside a
per-replica block and Litestream refuses to start if they are
placed there.
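For orientation, a sketch of the shape `.deploy/litestream.yml` ends up with under those defaults (hostnames and paths are the ones used elsewhere on this page; treat the field details as an assumption and defer to `.deploy.example/litestream.yml`):

# global in v0.5: top-level keys, never inside a replica block
snapshot:
  interval: 1h        # full snapshot every hour
  retention: 168h     # 7 days of rewindable LTX history

dbs:
  - path: /home/ubuntu/dinary/data/dinary.db
    replicas:
      - type: sftp
        host: dinary-replica
        user: ubuntu
        path: /var/lib/litestream/dinary
        key-path: /home/ubuntu/.ssh/id_ed25519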
Restoring from a replica
On any host with Litestream installed and SSH access to the replica target:
litestream restore -config /path/to/litestream.yml /path/to/output/dinary.db
Litestream reads the most recent snapshot in the replica, replays
LTX forward to the latest committed transaction, and writes a fresh
dinary.db. The restored DB is transactionally consistent — no
PRAGMA integrity_check is required, but running it does not hurt.
Off-site backup: Yandex.Disk (daily, GFS retention)
The Litestream replica on VM 2 is hot (seconds-level RPO) but
co-located with VM 1 in the same cloud provider's region. For
"both VMs went away" scenarios there is a daily off-site backup
pushed from VM 2 to Yandex.Disk, orchestrated by
inv setup-replica.
What it does
Every day at 03:17 UTC (+ 30 min jitter) a systemd oneshot on VM 2:
- Materializes the local Litestream replica at `/var/lib/litestream/dinary` into a plain SQLite file via `litestream restore`.
- Validates the restored file with `PRAGMA integrity_check`. A corruption failure aborts the run without uploading — we refuse to overwrite the Yandex history with a visibly broken snapshot.
- Compresses with `zstd -19` (the ratio is near-optimal on SQLite's repetitive page layout; CPU cost is negligible on the <1 MB input).
- Uploads to `yandex:Backup/dinary/dinary-<UTC-ISO>.db.zst` via `rclone`.
- Prunes Yandex.Disk per the GFS retention policy (see below).
The upload is a plain compressed SQLite file, not an opaque
repository format — any machine with zstd and sqlite3 can open
a snapshot directly without the dinary tooling.
GFS retention
- 7 most-recent daily snapshots.
- 4 most-recent weekly snapshots (newest per ISO week).
- 12 most-recent monthly snapshots (newest per calendar month).
- All yearly snapshots, kept indefinitely (closed years are immutable — any drift between two yearly snapshots of the same closed year signals corruption and is worth keeping forever).
Buckets overlap — a snapshot is pruned only if it belongs to no keeper bucket. On a 10-year horizon this is roughly 29 files total (~9 MB on disk).
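The keep-set computation is small enough to sketch in shell (illustrative only, not the shipped retention script; needs GNU date for the ISO-week key):

mapfile -t snaps < <(rclone lsf yandex:Backup/dinary/ | sort)   # name order == time order

declare -A weekly monthly yearly
for f in "${snaps[@]}"; do
  day=${f#dinary-}; day=${day%%T*}               # e.g. 2026-04-22
  weekly[$(date -u -d "$day" +%G-W%V)]=$f        # later files overwrite: newest per ISO week
  monthly[${day:0:7}]=$f                         # newest per calendar month
  yearly[${day:0:4}]=$f                          # newest per year, kept forever
done

keep=$(
  { printf '%s\n' "${snaps[@]: -7}"                    # 7 newest dailies
    printf '%s\n' "${weekly[@]}"  | sort | tail -n 4   # 4 newest weeklies
    printf '%s\n' "${monthly[@]}" | sort | tail -n 12  # 12 newest monthlies
    printf '%s\n' "${yearly[@]}"                       # every yearly
  } | sort -u
)

for f in "${snaps[@]}"; do                             # overlapping buckets: prune only if in none
  grep -qxF "$f" <<<"$keep" || echo "would prune: $f"
done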
One-time bootstrap
Run once from the operator machine against VM 2 (this also covers the off-site backup bootstrap):
inv setup-replica
The task:
- Installs the apt packages `rclone`, `sqlite3`, and `zstd`, plus the pinned Litestream binary, on VM 2.
- Configures the `yandex:` rclone remote interactively on the first run. If the remote is missing, the task prints a pointer to https://id.yandex.ru/security/app-passwords, prompts for the Yandex login, and reads the app-password via `getpass` (no echo).
Yandex WebDAV does NOT accept your regular Yandex account password. You must create a dedicated app-password under the "Files" (Файлы / WebDAV) category on the page above — Mail, Calendar, or generic tokens are rejected by the WebDAV endpoint. The app-password is revocable from the same page without affecting the main account password.
The plaintext password travels only over the SSH channel to VM 2,
is fed to `rclone obscure -` on stdin, and only the obscured form
is written to `~ubuntu/.config/rclone/rclone.conf`. Plaintext
never lands in argv (`ps` listing), shell history, or disk.
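Roughly, assuming the names above (a sketch; the real task drives this over one SSH session):

read -rs -p 'Yandex app-password: ' pw; echo
obscured=$(printf '%s' "$pw" | ssh ubuntu@dinary-replica rclone obscure -)   # plaintext only on stdin
ssh ubuntu@dinary-replica "rclone config create yandex webdav \
    url=https://webdav.yandex.ru vendor=other user=<login> pass=$obscured --no-obscure"

Only the already-obscured form ever appears in a remote argv.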
After writing the config the task runs rclone lsd yandex: as a
smoke test. Any failure (wrong password, wrong scope, network)
aborts the task and rolls back the partial config so the next run
re-prompts with a clean slate. On a green smoke test the prompt
is skipped on subsequent runs.
- Writes `/usr/local/bin/dinary-backup` and the paired retention script (sketched below); installs and enables `dinary-backup.timer`; triggers one immediate run so the first snapshot is visible on Yandex.Disk within a minute of bootstrap.
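The nightly pipeline condenses to a few commands. A sketch of the script body (the `file://` replica-URL form and exact flags are assumptions; the authoritative version is whatever `inv setup-replica` wrote to `/usr/local/bin/dinary-backup`):

set -euo pipefail
ts=$(date -u +%Y-%m-%dT%H%MZ)
tmp=$(mktemp -d); trap 'rm -rf "$tmp"' EXIT

litestream restore -o "$tmp/dinary.db" file:///var/lib/litestream/dinary   # materialize the replica
sqlite3 "$tmp/dinary.db" 'PRAGMA integrity_check;' | grep -qx ok           # corrupt => abort before upload
zstd -19 -q "$tmp/dinary.db" -o "$tmp/dinary-$ts.db.zst"
rclone copyto "$tmp/dinary-$ts.db.zst" "yandex:Backup/dinary/dinary-$ts.db.zst"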
Why an app-password and not the full OAuth flow: VM 2 is headless,
and the interactive rclone config wizard expects a
laptop-authorize → copy-token dance across machines. An
app-password is equivalent for our access pattern (PUT/DELETE of
uploaded files), can be revoked from the Yandex account UI at any
time, and bootstrap stays end-to-end non-interactive beyond the
password prompt.
The task is idempotent: apt installs are no-op on re-apply,
rclone-remote bootstrap is a no-op once yandex: exists, scripts
and systemd units are overwritten, enable --now is harmless to
re-run.
Watching the daily run
ssh ubuntu@dinary-replica sudo journalctl -u dinary-backup.service -n 50 --no-pager
ssh ubuntu@dinary-replica sudo systemctl list-timers dinary-backup.timer
Freshness monitoring: inv backup-cloud-status
The daily timer failing silently is the worst failure mode — the
off-site snapshot stops refreshing while everything else looks fine.
`inv backup-cloud-status` is the off-VM2 probe:
inv backup-cloud-status # human one-liner; exit 0/1
inv backup-cloud-status --json-output # machine-readable verdict
inv backup-cloud-status --max-age-hours 3 # tighten threshold during an incident
It SSHes into VM 2, runs `rclone lsjson` against the `yandex:`
remote, extracts the UTC timestamp encoded in the newest
`dinary-YYYY-MM-DDTHHMMZ.db.zst` filename, and compares it to
`--max-age-hours` (default 26 h — 24 h + 1 h for the 30 min
timer jitter + 30 min headroom). Sample outputs:
OK: newest dinary-2026-04-22T0317Z.db.zst, age 7.1h, size 198.5 KB (threshold: 26h)
STALE: newest dinary-2026-04-20T0317Z.db.zst, age 49.3h, size 198.1 KB (threshold: 26h)
STALE: no snapshots on yandex:Backup/dinary/ (threshold: 26h)
Because it runs off-VM2, both a dead VM 2 (SSH fails) and a silently-stopped timer (snapshot stale) surface as a non-zero exit code, so one probe covers both failure modes.
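The probe's core fits in a few lines of shell; a sketch with GNU date (`rclone lsf` stands in for the task's `lsjson` call, and the real task also handles the empty-remote case and `--json-output`):

newest=$(ssh ubuntu@dinary-replica rclone lsf yandex:Backup/dinary/ | sort | tail -n 1)
stamp=${newest#dinary-}; stamp=${stamp%.db.zst}                      # e.g. 2026-04-22T0317Z
then_s=$(date -u -d "${stamp:0:10} ${stamp:11:2}:${stamp:13:2}" +%s)
age_h=$(( ($(date -u +%s) - then_s) / 3600 ))
(( age_h <= 26 )) && echo "OK: $newest, age ${age_h}h" \
                  || { echo "STALE: $newest, age ${age_h}h"; exit 1; }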
Wire it into laptop cron via
linux-conf/osx/dinary_backup_check.sh (copied into ~/scripts/
by linux-conf/osx/copy_scripts.sh). That wrapper cds into
~/projects/dinary, runs uv run inv backup-cloud-status, and on
non-zero exit pipes the captured output through send_fail_email
(macOS-side msmtp → Yandex SMTP). Suggested crontab:
17 */6 * * * /Users/<you>/scripts/dinary_backup_check.sh
Four checks per day is cheap and catches a missed 03:17 UTC run within hours rather than the next morning.
Point-in-time restore from Yandex.Disk
inv restore-cloud-backup --list-only # show inventory
inv restore-cloud-backup # restore latest
inv restore-cloud-backup --snapshot 2026-03-15 # specific date
inv restore-cloud-backup --yes # skip confirm
Two intended use cases
- Laptop debug bootstrap. Materialize a recent prod snapshot into `./data/dinary.db` on your workstation to reproduce a bug against real data. Low risk: any overwrite of a local debug DB is recoverable from the auto-saved `data/dinary.db.before-restore-<ts>`.
- Production disaster recovery. Run the task on VM 1 itself (via SSH) when both the local DB and the Litestream replica on VM 2 are unusable. The SSH + `cd ~/dinary` + interactive confirmation hops are intentional friction, so a one-word `inv restore-cloud-backup` on the wrong terminal cannot silently overwrite prod.
`restore-cloud-backup` is local-only — it writes to
`./data/dinary.db` relative to the cwd and has no `--remote` mode.
There is no way to invoke it against a remote host from the
operator machine.
Flags
| Flag | Default | Meaning |
|---|---|---|
| `--snapshot DATE` | latest | Filename date prefix, e.g. `2026-04-22` matches the full `dinary-2026-04-22T0317Z.db.zst` |
| `--list-only` | off | Read-only: enumerate available snapshots and exit |
| `--yes` | off | Skip the "type yes to proceed" gate (preservation backup still happens) |
Preconditions (operator machine, one-time)
- `rclone` installed (`apt install rclone` on Ubuntu, `brew install rclone` on macOS). Already pre-installed on VM 1 by `inv setup-server`, so no manual install is needed during disaster recovery.
- A `yandex:` rclone remote configured locally, pointing at the same Yandex.Disk account used by `inv setup-replica`. If the operator machine never had `inv setup-replica` run from it, configure it once with `rclone config create yandex webdav url=https://webdav.yandex.ru vendor=other user=<login>` (it will prompt for the app-password via `rclone obscure`).
- `sqlite3` + `zstd` installed (both are already on VM 1 via `inv setup-server` and on macOS via `brew install sqlite zstd`).
Safety guarantees
- The snapshot is decompressed into a tmpdir and `PRAGMA integrity_check`'d before any existing `data/dinary.db` is touched. A corrupt snapshot aborts the run with the live DB untouched.
- If `data/dinary.db` exists and is non-empty, it is renamed to `data/dinary.db.before-restore-<UTC-ISO>` before the move. Previous state is always recoverable from the same directory, even with `--yes`.
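The same sequence, sketched (illustrative, not the task source; the snapshot filename is an example):

tmp=$(mktemp -d)
zstd -dq dinary-2026-03-15T0317Z.db.zst -o "$tmp/dinary.db"
sqlite3 "$tmp/dinary.db" 'PRAGMA integrity_check;' | grep -qx ok || exit 1   # live DB untouched on failure
[ -s data/dinary.db ] && mv data/dinary.db "data/dinary.db.before-restore-$(date -u +%Y-%m-%dT%H%M%SZ)"
mv "$tmp/dinary.db" data/dinary.db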
Production disaster recovery runbook
When VM 1's live DB is gone AND the Litestream replica on VM 2 is unusable:
ssh ubuntu@dinary # or the public IP / Tailscale IP
sudo systemctl stop dinary litestream # avoid a half-written DB
cd ~/dinary
inv restore-cloud-backup --snapshot 2026-03-15
# confirmation prompt: shows row count / size / mtime of the
# current DB plus compressed size of the incoming snapshot, then
# asks for literal 'yes'.
sudo systemctl start litestream dinary # resume write + replication
inv verify-db # integrity + FK check
If the live DB is merely stale but intact and you only want the
Yandex snapshot for comparison, run the task in a scratch
directory (so ./data/dinary.db is the snapshot, not prod):
mkdir -p /tmp/restore-preview/data && cd /tmp/restore-preview
inv restore-cloud-backup --snapshot 2026-03-15 --yes
sqlite3 data/dinary.db 'SELECT COUNT(*) FROM expense'
Restore from cold backup
- Stop the running service: `ssh $HOST 'sudo systemctl stop dinary'`.
- Replace `data/dinary.db` with the backed-up file.
- Remove any stale WAL sidecars: `rm -f data/dinary.db-wal data/dinary.db-shm`.
- Start the service: `inv restart-server`.
- Optionally run `inv verify-db` to check integrity.
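The same steps as one sequence, assuming the DB lives under ~/dinary/data on the server (a sketch; adjust paths to your layout):

HOST=ubuntu@dinary                       # or the public IP / Tailscale IP
ssh "$HOST" 'sudo systemctl stop dinary'
scp ./backups/<timestamp>/dinary.db "$HOST:dinary/data/dinary.db"
ssh "$HOST" 'rm -f dinary/data/dinary.db-wal dinary/data/dinary.db-shm'
inv restart-server
inv verify-db --remote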
Practical guidance
- Three layers of redundancy, paired to their failure modes: `inv backup` covers "oops, I nuked the DB during a manual repair"; Litestream to VM 2 covers "oops, I lost VM 1"; Yandex.Disk (`inv setup-replica`) covers "oops, I lost both VMs / lost the whole cloud provider".
- Treat `data/dinary.db` as the source of truth. Do not edit it while the service is running, and never hand-edit the `-wal`/`-shm` sidecars at all.
- The laptop-side DuckDB-over-SQLite analytics workflow (Phase 5 of `.plans/storage-migration.md`) consumes the same Litestream replica — no second backup pipeline is needed for analytics.