Dropbox Collaborates with GitHub to Reduce Monorepo Size from 87GB to 20GB

Dropbox engineers have reduced the size of their backend monorepo from 87GB to 20GB by addressing inefficiencies in Git’s storage and delta compression model, improving developer productivity and continuous integration performance. The effort was driven by scaling challenges in a repository that serves as a central integration point for backend services and shared libraries across teams at Dropbox.

As the monorepo grew, engineering teams began experiencing slow clone operations that could take over an hour, along with degraded CI pipeline performance due to repeated fetch and build overhead. The expansion also increased the risk of reaching repository hosting limits. According to Dropbox engineering findings, the issue was not primarily caused by large binaries or accidental commits, but by how Git’s internal compression heuristics handled large sets of related files.

Git uses delta compression to reduce storage by identifying similarities between files and storing differences efficiently. At scale, Dropbox engineers observed that these heuristics produced suboptimal packfiles, resulting in disproportionately large repository growth compared to actual code changes. The mismatch between expected and observed growth prompted a deeper investigation into storage behavior rather than repository content alone.
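Why the choice of delta base matters can be illustrated with a toy sketch. It does not use Git's actual delta format or pack heuristics; as an assumption for illustration, it approximates "storing a file relative to a base" with zlib's preset-dictionary feature, and the helper name `compressed_size` and the synthetic data are hypothetical.

```python
import random
import zlib


def compressed_size(data: bytes, base: bytes = b"") -> int:
    """Bytes needed to store `data`, optionally compressed relative to `base`.

    The preset dictionary stands in for a delta base. This is an analogy,
    not Git's real delta encoding.
    """
    if base:
        comp = zlib.compressobj(level=9, zdict=base)
    else:
        comp = zlib.compressobj(level=9)
    return len(comp.compress(data) + comp.flush())


rng = random.Random(0)


def random_blob(n: int) -> bytes:
    # Random bytes are essentially incompressible on their own, which
    # isolates the effect of the base from ordinary compression.
    return bytes(rng.getrandbits(8) for _ in range(n))


version_1 = random_blob(4000)
# version_2 is version_1 with a small 20-byte edit in the middle.
version_2 = version_1[:2000] + random_blob(20) + version_1[2020:]
unrelated = random_blob(4000)

standalone = compressed_size(version_2)
good_base = compressed_size(version_2, base=version_1)
bad_base = compressed_size(version_2, base=unrelated)

print(f"no base:        {standalone} bytes")
print(f"related base:   {good_base} bytes")
print(f"unrelated base: {bad_base} bytes")
```

With a related base, the second version costs a small fraction of its standalone size; with an unrelated base, almost nothing is saved. A packing heuristic that pairs objects with poor bases behaves like the last case, which is consistent with storage growing faster than the underlying code changes.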

As Ishan Mishra, Senior Software Engineer at Dropbox, noted:

The growth rate didn’t match what we would expect from normal development activity, even at Dropbox’s scale. That suggested the problem wasn’t just what we were storing, but how it was being stored.

The team treated the monorepo as production infrastructure and conducted a detailed analysis of storage patterns. They implemented optimized repacking strategies and adjusted how Git structures object deltas, focusing on improving delta window and depth behavior. Since server-side packing for clone and fetch operations is managed through GitHub infrastructure, Dropbox engineers collaborated with GitHub teams to tune these parameters. Changes were validated in mirrored environments before production rollout to reduce operational risk.
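For readers who want to experiment with delta window and depth on a local mirror, the following is a minimal sketch built on standard Git commands (`git repack` with `-a -d -f --window --depth`, and `git count-objects -v`). The article does not state the values Dropbox and GitHub settled on, so any numbers passed in are placeholders, and the function names are hypothetical. A local repack also does not change what a hosting provider serves on clone or fetch.

```python
import subprocess
from typing import List, Tuple


def build_repack_command(window: int, depth: int) -> List[str]:
    """Build a full-repack command with an explicit delta window and depth."""
    if window < 0 or depth < 0:
        raise ValueError("window and depth must be non-negative")
    return [
        "git", "repack",
        "-a",  # pack everything reachable into a single pack
        "-d",  # remove packs made redundant by the new one
        "-f",  # recompute deltas instead of reusing existing ones
        f"--window={window}",  # candidate objects considered per delta
        f"--depth={depth}",    # maximum length of a delta chain
    ]


def parse_size_pack_kib(count_objects_output: str) -> int:
    """Extract the `size-pack` value (in KiB) from `git count-objects -v`."""
    for line in count_objects_output.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "size-pack":
            return int(value.strip())
    raise ValueError("no size-pack line found")


def pack_size_kib(repo_path: str) -> int:
    """Return the on-disk pack size of the repository at `repo_path`."""
    result = subprocess.run(
        ["git", "count-objects", "-v"],
        cwd=repo_path, check=True, capture_output=True, text=True,
    )
    return parse_size_pack_kib(result.stdout)


def repack_and_compare(repo_path: str, window: int, depth: int) -> Tuple[int, int]:
    """Repack a local mirror and return (size_before_kib, size_after_kib)."""
    before = pack_size_kib(repo_path)
    subprocess.run(build_repack_command(window, depth), cwd=repo_path, check=True)
    return before, pack_size_kib(repo_path)
```

Calling `repack_and_compare("/path/to/mirror.git", window=250, depth=50)` on a disposable mirror would report pack size before and after; the path and both values are illustrative only. Larger windows and depths generally trade repack time and read cost for smaller packs, so comparing several settings on a copy is safer than changing a shared repository.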

As Shailesh Mishra noted in a LinkedIn post:

This was a tooling assumption colliding with repo structure at scale.

Following these optimizations, the repository size decreased from 87GB to 20GB, a reduction of roughly 77 percent. Clone times dropped from over an hour to under 15 minutes, and CI pipelines ran faster because less data had to be transferred and processed. The improvements also reduced the likelihood of hitting repository size limits and shortened developer onboarding times.
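The reported figures can be checked with a few lines of arithmetic. The helper name `percent_reduction` is hypothetical, and the clone-time inputs use the article's bounds (60 and 15 minutes), so that result is a lower bound rather than a measured value.

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage by which `after` is smaller than `before`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100.0


# Repository size: 87 GB before, 20 GB after.
size_cut = percent_reduction(87, 20)

# Clone time: "over an hour" to "under 15 minutes", so the true
# reduction is at least what the boundary values give.
clone_cut_lower_bound = percent_reduction(60, 15)

print(f"size reduction: {size_cut:.1f}%")
print(f"clone time reduction: at least {clone_cut_lower_bound:.0f}%")
```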

Dropbox Git data size reduction (Source: Dropbox Blog Post)

Dropbox engineers emphasized that the main lesson was to treat version control systems as critical infrastructure, because storage behavior directly affects engineering velocity. The work combined tooling-level optimization, cross-organizational collaboration with GitHub, and staged validation to ensure a safe rollout without disrupting developer workflows.
