
In my last post I gave an introduction to my overall hardware and software setup for backup. But this was the easy part. Backing up several massive Apple Photos libraries—each overflowing with iPhone snapshots, RAW captures from my system camera, videos, panoramas, and AI-generated art—turns out to be a far more complex endeavor than initially anticipated.
For the impatient… You can find the resulting repository, containing an iOS solution (which does not completely satisfy my requirements) as well as a macOS solution (which only runs while my MacBook is “awake”, but is otherwise an A+), here: https://github.com/juangamnik/apple_photos_backup/
Still with me? Great, let’s dive into the story… Over months of testing, scripting, and wrestling with encrypted storage quirks, a reliable, fully automated solution has finally taken shape. In this post, I’ll walk you through the entire journey: from the early missteps to the final macOS-based workflow that ensures every original file (complete with sidecar and edit metadata) eventually lands safely in an encrypted, deduplicated long-term archive. Whether you’re juggling multiple family libraries or simply want bulletproof Apple Photos backups, I hope this detailed account of trials and triumphs helps you avoid the pitfalls and understand exactly how the tooling works.
Disclaimer: This whole issue vanishes if you have a Windows or macOS device running 24/7 with enough disk space to deactivate “Optimize Storage”. If that’s you: don’t read on 🤓.
By the way: Apple encourages you to back up your iCloud Photos!
Initial Situation
Goal: Create incremental, automated backups for multiple Apple Photos libraries (each roughly 800 GB) in original quality, including all sidecar and edit metadata.
Constraints
- Library Composition: Each family member’s library contains iPhone shots, RAW files, videos, screenshots, and AI-generated images. All libraries share a single iCloud Plus plan under “Optimize Storage,” meaning that—with few exceptions—originals reside in iCloud rather than locally.
- Backup Requirements:
- 100% Originals: No downscaled or recompressed media. If a photo exists only in iCloud, it must be downloaded at full resolution before backup.
- Sidecar/Metadata Preservation: XMP/AAE files have to be captured.
- Incremental & Automated: Backups should run frequently (ideally every few minutes) without manual intervention.
- End-to-End Encryption: Data has to be encrypted both in transit and at rest.
- Deduplication: Since multiple libraries share many similar items (e.g., screenshots or shared album imports), storage-efficient deduplication is preferred.
Why This Is Tricky
- iCloud “Optimize Storage” means originals are offloaded as needed, so relying on any client-side cache misses full-resolution files unless we force a download.
- Apple’s sandbox model restricts background services from touching Photos’ private library database without using an approved API, and even approved automation (i.e., Shortcuts) often returns rendered variants, not originals.
- Family Libraries Share iCloud: Each member’s library can pull a high volume of data at unpredictable times (e.g., vacation photo dumps), demanding a robust, incremental sync that can handle bursts.
Approaches I Have Tried (and Discarded)
I tried many things… it cost me a lot of time… please, share my pain.
1. Manual Quarterly Export on a Mac
- Concept: Every three months, start a manual export of all originals plus sidecars via the UI on macOS.
- Downsides:
- Too Infrequent: In practice, the manual export cadence slips to once a year—far too sparse to mitigate recent data-loss risks.
- Human Error: It is easy to miss media files, or delete files unintentionally before they are backed up.
- Tedious: The procedure is time-consuming and annoying.
Ultimately, the manual approach imposes too much operational overhead and too many opportunities for error. I was simply fed up with it.
2. iCloud Photo Downloader on Raspberry Pi
- Concept: Deploy an open-source “iCloud Photo Downloader” service on a Pi (or small Linux box). It logs into iCloud with MFA (using cookies that last 1-2 months), scans all Photos libraries, and downloads originals.
- Downside: Advanced Data Protection (ADP): Apple’s E2E encryption layer prevents any remote service from accessing raw originals, rendering the downloader useless for my needs.
Because Apple’s ADP (a feature I do not want to miss) simply blocks this tool, this avenue quickly hits a hard stop.
3. iPhone-Based Apps (e.g., PhotoSync)
- Concept: Use apps like PhotoSync to pull originals (and sidecars) directly from the device.
- Downsides:
- Cached Renderings: In practice, these apps grab whatever rendition is resident on the device (e.g., a recompressed 1.5 MB JPEG in place of a 30 MB PNG), not the original files.
- Time-Sensitivity: They only reliably get originals immediately after a photo is taken (before iCloud offload occurs), so any older assets are inaccessible.
- One-Size-Fits-All: The apps come with their own workflow, their own way of selecting the media files to back up, and their own encryption tooling. That does not fit my requirements, e.g., that not only new files but also older files that were edited or newly imported should be backed up.
The dream of simply using an app on an “always-on device” that handles the backup process for me demands too many compromises. Dismissed.
4. Apple Shortcuts + JSBox, Scriptable, or Pythonista
- Concept: Since that dream of an “always on device” that backs up for me is so compelling, let’s build an Apple Shortcut that, when triggered, transfers the relevant media files one by one to my backup server via SSH.
- Downsides:
- API Limitations: Shortcuts’ “Find Photos” action, as well as bridges via apps like Scriptable or JSBox, returns rendered JPEG/HEIC variants, showing the same behavior as PhotoSync and the like.
- Potential Angle with Pythonista/Objective-C or Building a Dedicated App: One could theoretically script deeper into Photos, e.g., via Objective-C bridges in Pythonista, but I have not pursued this path due to the initial effort and maintenance complexity (and after trying two scripting apps, I am not sure whether a third would live up to its claims).
Though intriguing in theory, and much more flexible regarding workflow than an out-of-the-box sync app, this approach stalls: the sandbox restrictions on iOS (and of Apple Shortcuts on macOS, for that matter) block access to true originals. Very sad.
The Final Working Solution: macOS + osxphotos
After discarding the other paths, the most reliable way to extract genuine originals—and preserve sidecar/edit metadata—is to leverage osxphotos (or AppleScript / Automator), a Python-based command-line tool that interfaces directly with the Photos database. Combined with FUSE-based encryption (GoCryptFS), SSHFS, and Borg backup, I have built a fully automated, incremental backup pipeline running on a MacBook Air as the backup client.
Infrastructure Overview
- Backup Client:
- MacBook Air running macOS Monterey or later.
- Connected (either at home or via VPN) to the network that hosts the backup server.
- Remote Storage:
- Raspberry Pi serving a GoCryptFS-encrypted volume mounted via SSHFS.
- Borg Backup repository on the backup server, storing immutable, deduplicated archives.
- Automation:
- A LaunchAgent (com.user.trigger_backup_photos.plist) on the MacBook Air fires every 5 minutes (a sketch of the agent follows this overview) to:
- Check whether the Mac is on the home network or connected via VPN.
- Mount the remote GoCryptFS-encrypted directory over SSHFS using public-key authentication, then unlock it locally as a decrypted volume.
- Call osxphotos to sync changed items since the last run.
- Once a month, move the media files into a Borg repository for read-only long-term archiving.
- Encryption & Key Management:
- SSHFS & GoCryptFS:
- Public-key auth is used for SSHFS. The private key’s passphrase is unlocked once per reboot using the local ssh-agent.
- The GoCryptFS passphrase is stored in the macOS Keychain (service name defined by GOCRYPTFS_KEYCHAIN); a sketch of the mount-and-unlock step follows this overview.
- Borg Backup:
- Borg is installed both on the MacBook Air (for local archiving) and on the Pi (for remote usage).
- The Borg repository is encrypted. Its passphrase (or, when exceeding Keychain limits, a split passphrase) is also stored in Keychain under BORG_KEYCHAIN[_1].
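To make the automation concrete, here is a minimal sketch of what such a LaunchAgent might look like, installed from the shell. The label matches the plist name mentioned above; the script path and log locations are hypothetical placeholders, and the actual repository may wire this up differently.

```bash
# Hypothetical sketch: install a LaunchAgent that runs the backup trigger
# script every 5 minutes. The label matches the plist name used in this post;
# the script path and log files are placeholders.
cat > ~/Library/LaunchAgents/com.user.trigger_backup_photos.plist <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.user.trigger_backup_photos</string>
    <key>ProgramArguments</key>
    <array>
        <string>/bin/bash</string>
        <string>/Users/me/bin/trigger_backup_photos.sh</string>
    </array>
    <key>StartInterval</key>
    <integer>300</integer> <!-- every 5 minutes -->
    <key>StandardOutPath</key>
    <string>/tmp/trigger_backup_photos.log</string>
    <key>StandardErrorPath</key>
    <string>/tmp/trigger_backup_photos.err</string>
</dict>
</plist>
EOF

# Load the agent for the current user session.
launchctl load ~/Library/LaunchAgents/com.user.trigger_backup_photos.plist
```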
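The setup steps the trigger script has to perform before exporting anything can be sketched like this: check that the backup server is reachable, mount the encrypted remote directory via SSHFS, and unlock it locally with GoCryptFS using the passphrase stored in the Keychain. The host name, mount points, and Keychain service name are assumptions for illustration; `sshfs -o reconnect`, `gocryptfs -extpass`, and `security find-generic-password -w` are standard options of the respective tools.

```bash
#!/bin/bash
# Hypothetical sketch of the setup phase of a trigger script.
# Host, paths, and the Keychain service name are placeholders.
set -euo pipefail

BACKUP_HOST="backup-pi.local"                  # Raspberry Pi, at home or via VPN
REMOTE_CIPHER_DIR="/mnt/backup/photos_cipher"  # GoCryptFS cipher dir on the Pi
SSHFS_MOUNT="$HOME/mnt/photos_cipher"          # encrypted view on the Mac (SSHFS)
PLAIN_MOUNT="$HOME/mnt/photos_plain"           # decrypted view on the Mac (GoCryptFS)
GOCRYPTFS_KEYCHAIN="photos-backup-gocryptfs"   # Keychain service name (placeholder)

# 1. Only proceed if the backup server is reachable (home network or VPN).
if ! nc -z -w 3 "$BACKUP_HOST" 22 2>/dev/null; then
    echo "Backup server not reachable, skipping this run." >&2
    exit 0
fi

mkdir -p "$SSHFS_MOUNT" "$PLAIN_MOUNT"

# 2. Mount the encrypted remote directory via SSHFS (public-key auth; the key
#    is expected to be unlocked in ssh-agent once per reboot).
if ! mount | grep -q "$SSHFS_MOUNT"; then
    sshfs "$BACKUP_HOST:$REMOTE_CIPHER_DIR" "$SSHFS_MOUNT" -o reconnect
fi

# 3. Unlock the GoCryptFS volume; the passphrase is fetched from the macOS
#    Keychain via the `security` CLI instead of being stored on disk.
if ! mount | grep -q "$PLAIN_MOUNT"; then
    gocryptfs -extpass "security find-generic-password -s $GOCRYPTFS_KEYCHAIN -w" \
        "$SSHFS_MOUNT" "$PLAIN_MOUNT"
fi
```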
How the 5-Minute Sync Works with osxphotos
At the heart of this solution lies osxphotos, a powerful Python library and command-line tool that can query the Photos database directly and export every asset—unedited RAW files, sidecars (XMP and AAE), rendered variants (JPEG/HEIC), Live Photo MOVs, and videos—with the ability to select the files to be exported via filters.
The exports are done incrementally, with the following logic behind them (a command-line sketch follows the list):
- Threshold Source:
- The script reads a “date threshold” from a simple text file (e.g., photos_backup/date_threshold.txt) in the export directory.
- If that file is missing or empty, it falls back to a default threshold specified in photos_backup.conf.
- Incremental vs. Long-Term Branch Depending on Date Threshold:
- Current Month (Threshold ≥ 1st of Current Month): Perform an incremental export: osxphotos compares timestamps in the Photos database against the threshold and only copies new or modified items.
- Past Month (Threshold < 1st of Current Month): First, do a “catch-up” export (in case any edits or retroactive imports have occurred). Then, initiate a long-term backup for the completed month (see details below).
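As a concrete illustration of this branch logic, here is a sketch of how the threshold handling and export call could look. The directory layout and the default threshold are placeholders; `--from-date`, `--download-missing`, `--sidecar`, and `--update` are existing osxphotos export options, though the repository may use a different combination.

```bash
# Hypothetical sketch of the threshold handling and incremental export.
set -euo pipefail

EXPORT_DIR="$HOME/mnt/photos_plain/photos_backup"     # placeholder staging dir
THRESHOLD_FILE="$EXPORT_DIR/date_threshold.txt"
DEFAULT_THRESHOLD="2025-01-01"                        # would come from photos_backup.conf

# Read the date threshold, falling back to the configured default.
THRESHOLD="$(cat "$THRESHOLD_FILE" 2>/dev/null || true)"
THRESHOLD="${THRESHOLD:-$DEFAULT_THRESHOLD}"

# Incremental export: fetch iCloud originals (--download-missing), write XMP
# sidecars, and only copy new or changed items on repeated runs (--update).
osxphotos export "$EXPORT_DIR" \
    --from-date "$THRESHOLD" \
    --download-missing \
    --sidecar XMP \
    --update \
    --verbose

# Decide whether a completed month needs to be archived (long-term branch).
# ISO dates compare correctly as strings.
FIRST_OF_MONTH="$(date +%Y-%m-01)"
if [[ "$THRESHOLD" < "$FIRST_OF_MONTH" ]]; then
    echo "Threshold $THRESHOLD lies in a past month: trigger long-term backup."
    # ... monthly Borg archiving, see the next section ...
fi
```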
Long-Term Backup & Archiving
When the date threshold indicates that a full month (or more) has completed (e.g., it’s June 2, and the threshold is May 1 or earlier), we run a monthly archival routine as part of the same 5-minute cron-like cycle:
- Finalize Monthly Exports: Even if some photos from May have been added or edited on June 1 or 2, osxphotos catches them because we first run the export unconditionally.
- Create a Borg Archive for the Completed Month: The script collects everything in the monthly export and creates a new archive in a Borg repository (see the sketch after this list).
- Purge Archived Data from Local Staging:
- Once Borg signals a successful archive the monthly export is deleted both from the local working directory and the mounted GoCryptFS volume.
- This saves local disk space while ensuring the encrypted Borg repo remains the canonical long-term store.
- Reset the Date Threshold:
- The threshold file is updated to the first day of the current month (e.g., 2025-06-01).
- This means subsequent 5-minute runs will only catch new media files, imports, and edits from June 2025 onward.
- Intentional Duplication for Safety: Since we sync all the way “up to now” before archiving the previous month, some newly-added late-May files might end up in both the “rolling” export (June 1–2) and the May Borg archive. That duplication is by design: Borg’s deduplication ensures no extra storage, but we gain confidence that nothing slips through the cracks.
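Here is a sketch of how that monthly archival step could look in shell. The repository location, the per-month staging directory, and the archive naming scheme are assumptions; `borg create` and the `BORG_PASSPHRASE` environment variable are standard Borg usage, and the split-passphrase variant mentioned above is omitted for brevity.

```bash
# Hypothetical sketch of the monthly long-term archiving step.
# Repository path, archive naming, and Keychain service name are placeholders.
set -euo pipefail

MONTH="2025-05"                                          # the completed month
MONTHLY_EXPORT="$HOME/mnt/photos_plain/photos_backup/$MONTH"
BORG_REPO="$HOME/backup/borg_photos"                     # could also live on the Pi
THRESHOLD_FILE="$HOME/mnt/photos_plain/photos_backup/date_threshold.txt"

# Borg reads its passphrase from the environment; here it is pulled from
# the macOS Keychain (placeholder service name BORG_KEYCHAIN).
export BORG_PASSPHRASE="$(security find-generic-password -s BORG_KEYCHAIN -w)"

# Create a deduplicated archive for the completed month.
borg create --stats --compression lz4 \
    "$BORG_REPO::photos-$MONTH" \
    "$MONTHLY_EXPORT"

# Thanks to `set -e`, the purge below only runs if borg create succeeded.
rm -rf "$MONTHLY_EXPORT"

# Reset the threshold to the first day of the current month.
date +%Y-%m-01 > "$THRESHOLD_FILE"
```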
Conclusion
This backup pipeline finally delivers on every requirement:
- Originals: osxphotos guarantees that each unedited or edited original (including all sidecars) is pulled every time—even if iCloud “Optimize Storage” has offloaded it.
- Incremental: By relying on robust date thresholds, the process only copies new or changed items, minimizing bandwidth and local staging disk usage.
- Automated: A 5-minute LaunchAgent loop means no manual intervention; once configured, it simply hums along (and shows a push notification on the Mac if it fails; see the small sketch after this list).
- End-to-End Encryption: SSH (in transit, from the Mac to the Pi) plus GoCryptFS and Borg (at rest) ensure a zero-trust posture.
- Deduplicated, Immutable Archives: Borg’s chunk-level magic means redundant screenshots or duplicate videos cost nothing extra—and each monthly snapshot is immutable, giving us confidence that corruption or accidental deletion can’t sneak in.
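For completeness, the failure notification mentioned in the list above can be as simple as a one-line AppleScript call; this is just one way the script might surface errors, not necessarily how the repository does it.

```bash
# Hypothetical: surface a backup failure as a macOS notification.
osascript -e 'display notification "Photos backup failed, check the logs." with title "Photos Backup"'
```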
Exciting 🤓.
This article and the tools described here have been written with help from an LLM, which may make errors (like humans do 😇).