A recent report by former npm engineering manager Darcy Clarke found that the npm registry does not validate manifest information against the contents of its corresponding package tarball. This creates a double source of truth that attackers can exploit to hide scripts or dependencies, says Clarke.
At the root of manifest confusion, as Clarke named this vulnerability, lies the fact that the npm API requires maintainers to provide a valid manifest both in the body of the PUT request used to submit a package and in the package.json file uploaded within the package tarball. Since the npm server does not validate the two manifests against each other, they can differ without anyone noticing. This is especially critical for npm clients, since it leaves it ambiguous which manifest is the "real" one. In his article, Clarke showed how to reproduce such inconsistencies using the npm CLI or by accessing the npm API directly.
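To make the mismatch concrete, here is a minimal sketch of the kind of check the registry does not perform: comparing the manifest sent at publish time with the package.json found in the tarball. The function name `diff_manifests`, the list of compared fields, and the sample manifests are all hypothetical and chosen for illustration; this is not npm's own code or Clarke's tooling.

```python
from typing import Any

# Fields compared here are an illustrative choice, not an exhaustive list.
FIELDS_TO_COMPARE = ("name", "version", "dependencies", "scripts")


def diff_manifests(
    registry: dict[str, Any], tarball: dict[str, Any]
) -> dict[str, tuple[Any, Any]]:
    """Return {field: (registry_value, tarball_value)} for each differing field."""
    mismatches: dict[str, tuple[Any, Any]] = {}
    for field in FIELDS_TO_COMPARE:
        registry_value = registry.get(field)
        tarball_value = tarball.get(field)
        # Treat a missing field and an empty value as equivalent.
        if (registry_value or None) != (tarball_value or None):
            mismatches[field] = (registry_value, tarball_value)
    return mismatches


# Hypothetical manifests: the registry copy hides a dependency and a script.
registry_manifest = {"name": "example-pkg", "version": "1.0.0", "dependencies": {}}
tarball_manifest = {
    "name": "example-pkg",
    "version": "1.0.0",
    "dependencies": {"hidden-dep": "^2.0.0"},
    "scripts": {"install": "node setup.js"},
}

for field, (published, packaged) in diff_manifests(
    registry_manifest, tarball_manifest
).items():
    print(f"{field}: registry={published!r} tarball={packaged!r}")
```

In practice the two inputs would come from the registry's metadata response and from the extracted tarball; how to fetch them is left out here on purpose.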
This has huge implications, says Clarke. For example, a package listing on npm may show that the package has no dependencies when it actually has some. Likewise, the listing can show a different package name or version than those in package.json, which opens the door to cache poisoning. Even worse, it can hide the fact that the package will run a script during installation.
All the inconsistencies listed above are vulnerabilities that can be exploited. For example, a package could disguise itself as a different one to trick someone into installing it by mistake, or a hidden dependency could be installed without the user knowing.
This issue, Clarke argues in detail, affects many third-party clients and tools in the npm ecosystem, as well as package managers, so you may want to check how your preferred tool handles it.
Clarke's final suggestion is for all npm users to stop relying on npm registry manifest files and use package.json instead, with the exception of the name and version fields.
In conversation with InfoQ, Sonatype’s security researcher Ax Sharma highlighted that these kinds of inconsistencies are not necessarily malicious and could result from legitimate cloning or forking, or from a developer not cleaning up stale metadata when updating a package. He also added a further nuance to the problem:
Trusting what’s within package.json is no better than trusting what’s advertised on the npmjs page of a package – neither are completely reliable.
The solution rests, according to Sharma, on using security tooling that performs a deeper analysis, such as hash-based analysis of the malicious or vulnerable files – known as advanced binary fingerprinting.
Another useful piece of advice comes from J. M. Rossy on Twitter, who suggests turning scripts off by default.
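In npm, lifecycle scripts can be disabled through the `ignore-scripts` configuration setting. As a complement, the following minimal sketch lists the install-time hooks a manifest declares, which gives an idea of what would run if scripts stayed enabled. The function name `install_time_scripts` and the sample manifest are hypothetical, and, given the issue described above, the input should preferably be the package.json taken from the tarball rather than the registry copy.

```python
from typing import Any

# Lifecycle hooks that npm runs when a package is installed.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def install_time_scripts(manifest: dict[str, Any]) -> dict[str, str]:
    """Return the install-time hooks declared in a manifest's scripts field."""
    scripts = manifest.get("scripts") or {}
    return {hook: scripts[hook] for hook in INSTALL_HOOKS if hook in scripts}


# Hypothetical manifest with one install-time hook and one unrelated script.
manifest = {
    "name": "example-pkg",
    "version": "1.0.0",
    "scripts": {"postinstall": "node setup.js", "test": "node test.js"},
}

for hook, command in install_time_scripts(manifest).items():
    print(f"{hook} would run: {command}")
```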
If you are interested in manifest confusion, do not miss Clarke's original article, which provides many additional insights.