Drupal File Migration Approach Separates Records From File Transfers
Moving large Drupal file stores can become a migration bottleneck when every managed file is copied byte for byte through PHP. In a blog post titled "Migrating a Terabyte of Drupal Files Without the Wait," Gordon Heydon of Heydon Consulting outlines a record-first migration approach that registers Drupal file entities and URIs first and moves the underlying binaries separately.
The article is useful for Drupal teams working with large file stores because it separates metadata migration from file transfer. Instead of relying on the file_copy process plugin to stream every source file into the destination site, the approach creates or reuses managed file records, keeps field references valid, and lets filesystem tools handle bulk file movement outside the migration process.
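The "create or reuse" idea can be illustrated with a minimal sketch. This is not code from the original post, and it is not Drupal code: FileRecord, FileRegistry, and register are hypothetical names, and an in-memory dictionary stands in for the destination site's managed file table. The point is only that registering a URI touches no file contents and that repeated runs return the same record.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class FileRecord:
    fid: int  # hypothetical destination file ID
    uri: str  # stream-wrapper URI, e.g. "public://reports/annual.pdf"


class FileRegistry:
    """In-memory stand-in for the destination's managed file records."""

    def __init__(self):
        self._by_uri = {}
        self._next_fid = count(1)

    def register(self, uri):
        """Return the existing record for a URI, or create one.

        No bytes are read or copied here; only the record is created.
        """
        record = self._by_uri.get(uri)
        if record is None:
            record = FileRecord(fid=next(self._next_fid), uri=uri)
            self._by_uri[uri] = record
        return record
```

Because a second call with the same URI returns the record created by the first, any field that already references that file ID stays valid when the migration is run again.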
Heydon describes using HTTP HEAD checks to confirm whether missing binaries exist at the source before adding them to a backfill list. The post also covers using a no-op destination to track downloaded file IDs, relying on rsync before go-live, and using the Stage File Proxy module so local development environments fetch only the public files developers actually view.
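As a rough illustration of the HEAD-check step, the sketch below builds a backfill list using only the Python standard library. It is an assumption-laden example, not the plugin code from the original post: the function names, the base URL argument, and the local_exists callback are all hypothetical. A HEAD request asks the source server for headers only, so no file body is downloaded during the check.

```python
import urllib.parse
import urllib.request


def source_has_file(url, timeout=10.0):
    """Send an HTTP HEAD request; True only if the source answers with 2xx."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # URLError, HTTPError (e.g. 404), and socket timeouts are all OSError.
        return False


def build_backfill_list(relative_paths, source_base_url, local_exists,
                        exists_at_source=source_has_file):
    """Return paths that are missing locally but present at the source.

    local_exists is a hypothetical callback: path -> bool.
    exists_at_source can be swapped out, for example in tests.
    """
    base = source_base_url.rstrip("/")
    backfill = []
    for path in relative_paths:
        if local_exists(path):
            continue  # binary already on disk, nothing to fetch
        url = base + "/" + urllib.parse.quote(path.lstrip("/"))
        if exists_at_source(url):
            backfill.append(path)
    return backfill
```

A list built this way could then be handed to a filesystem tool, for example written to a file and passed to rsync with its --files-from option, so the bulk transfer still happens outside the migration run.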
The article also addresses related migration decisions, including handling private files through a temporary locked-down endpoint, converting legacy file fields into media entities, rewriting unmanaged WYSIWYG file references as CKEditor 5 media embeds, treating decorative legacy images as theme-owned icons, and preserving file URIs across repeated migration runs.
The post includes code examples for process, source, and destination plugins, as well as migration YAML. Teams planning large Drupal file migrations should read the original article for implementation details before adapting the pattern to production projects.


