How Datadog Cut the Size of Its Agent Go Binaries by 77%

After the Datadog Agent grew from 428 MiB to 1.22 GiB over a period of 5 years, Datadog engineers set out to reduce its binary size. They discovered that most Go binary bloat comes from hidden dependencies, disabled linker optimizations, and subtle behaviors in the Go compiler and linker.

"This growth impacted both us and our users: network costs and resource usage increased, perception of the Agent worsened, and it became harder to use the Agent on resource-constrained platforms."

To address this, Datadog software engineer Pierre Gimalac wrote, their approach consisted of auditing imports, isolating optional code, and eliminating reflection and plugin pitfalls to shrink binaries as much as possible.

After analyzing the Agent's growth, Datadog engineers found that it was driven by new features, additional integrations, and large third-party dependencies (e.g., Kubernetes SDKs). In particular, Go's dependency model includes transitive imports, so even a small change can pull in hundreds of packages.

Datadog engineers devised two practical ways to remove unnecessary dependencies: using build tags (//go:build feature_x) to exclude optional code, and moving code into separate packages so that non-optional packages remain as small as possible. Both techniques required systematically auditing imports to identify which files or packages could be excluded from a given build. For example, simply moving one function into its own package removed ~570 packages and ~36 MB of generated code from a binary not using it.
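As a rough illustration of the build-tag technique, the sketch below shows the default half of a two-file pattern. The tag name `feature_x` comes from the example above; the function name `optionalFeature` and the file layout are hypothetical and are not taken from the Agent's code.

```go
//go:build !feature_x

// This file is compiled only when the feature_x tag is NOT set, so the
// build never imports the heavy optional dependency.
// A sibling file guarded by `//go:build feature_x` (not shown) would define
// the same function, import the optional package, and do the real work.
package main

import "fmt"

// optionalFeature is a hypothetical stub standing in for the optional code.
func optionalFeature() string {
	return "feature_x: excluded from this build"
}

func main() {
	fmt.Println(optionalFeature())
}
```

Building with `go build -tags feature_x` would select the sibling file instead of this stub. Because the stub imports nothing beyond `fmt`, the default binary's import graph stays small.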

Auditing dependencies is not an easy task, but the Go ecosystem provides three useful tools to help: go list, which lists all packages used in a build; goda, which visualizes dependency graphs and import chains to help understand why a given dependency is required; and go-size-analyzer, which shows how much space each dependency contributes to a binary.
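To give a sense of what such an audit can look like, here is a minimal sketch that runs `go list -deps ./...` and counts packages per import-path prefix. The grouping rule (first two path elements) is a crude heuristic chosen for this example, and the helper names `moduleRoot` and `countByRoot` are hypothetical; this is not how Datadog performed its audit.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"sort"
	"strings"
)

// moduleRoot collapses an import path to a coarse group. Paths whose first
// element has no dot (e.g. "fmt", "net/http") are lumped together as "std";
// this is a heuristic and may also catch local modules without a dotted path.
func moduleRoot(pkg string) string {
	parts := strings.Split(pkg, "/")
	if !strings.Contains(parts[0], ".") {
		return "std"
	}
	if len(parts) < 2 {
		return parts[0]
	}
	return parts[0] + "/" + parts[1]
}

// countByRoot counts how many packages fall under each group.
func countByRoot(pkgs []string) map[string]int {
	counts := make(map[string]int)
	for _, p := range pkgs {
		counts[moduleRoot(p)]++
	}
	return counts
}

func main() {
	// `go list -deps` prints every package the build depends on, one per line.
	out, err := exec.Command("go", "list", "-deps", "./...").Output()
	if err != nil {
		fmt.Fprintln(os.Stderr, "go list failed:", err)
		os.Exit(1)
	}
	counts := countByRoot(strings.Fields(string(out)))

	roots := make([]string, 0, len(counts))
	for r := range counts {
		roots = append(roots, r)
	}
	// Largest groups first; ties broken alphabetically for stable output.
	sort.Slice(roots, func(i, j int) bool {
		if counts[roots[i]] != counts[roots[j]] {
			return counts[roots[i]] > counts[roots[j]]
		}
		return roots[i] < roots[j]
	})
	for _, r := range roots {
		fmt.Printf("%5d  %s\n", counts[r], r)
	}
}
```

Run from a module's root, this prints a ranked list of dependency groups. It shows where packages come from, not why they are imported or how many bytes they add; those questions are what goda and go-size-analyzer address.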

Besides dependency optimization, Datadog engineers got an additional 20% size reduction by minimizing the use of reflection, which can silently disable some linker optimizations, including dead-code elimination:

"if you use a non-constant method name, the linker can no longer know at build time which methods will be used at runtime. So it needs to keep every exported method of every reachable type, and all the symbols they depend on, which can drastically increase the size of the final binary."
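The difference between the two call shapes can be sketched as follows. The `Greeter` type and both helper functions are hypothetical, written only to contrast a lookup by a runtime string with direct calls the linker can see; they are not taken from the Agent or its dependencies.

```go
package main

import (
	"fmt"
	"reflect"
)

// Greeter is a hypothetical type with two exported methods.
type Greeter struct{}

func (Greeter) Hello() string { return "hello" }
func (Greeter) Bye() string   { return "bye" }

// callDynamic looks the method up by a name known only at run time.
// This is the pattern described above: the linker cannot tell which
// methods are needed, so it has to be conservative about what it keeps.
func callDynamic(v interface{}, name string) string {
	m := reflect.ValueOf(v).MethodByName(name)
	if !m.IsValid() {
		return ""
	}
	return m.Call(nil)[0].String()
}

// callStatic dispatches through direct method calls, so every method
// that can run is visible in the code at build time.
func callStatic(g Greeter, name string) string {
	switch name {
	case "Hello":
		return g.Hello()
	case "Bye":
		return g.Bye()
	}
	return ""
}

func main() {
	fmt.Println(callDynamic(Greeter{}, "Hello"))
	fmt.Println(callStatic(Greeter{}, "Bye"))
}
```

Both helpers live in one file here only for comparison. A binary that still reaches `callDynamic` would presumably keep the conservative behavior, which is why the dynamic lookups had to be removed from dependencies as well as from first-party code.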

To address this issue, they eliminated dynamic reflection where possible, both in their codebase and in dependencies. The latter step required submitting several PRs to projects such as kubernetes/kubernetes, uber-go/dig, google/go-cmp and others.

Another feature that disables dead-code elimination is Go plugins, a mechanism that enables a Go program to dynamically load Go code at runtime. Simply importing the plugin package causes the linker to treat the binary as dynamically linked, "which disables method dead code elimination and even forces the linker to keep all unexported methods". Removing that import yielded an additional ~20% reduction in some builds.

As a final note, Gimalac emphasizes that these improvements were achieved over a six-month period and, most importantly, did not require removing any features. His account includes many more details than can be covered here, so be sure to read it to get the full story.
