The Intake — Thursday, May 21, 2026

On the substrate

639 malicious @antv package versions published in latest Mini Shai-Hulud npm campaign

Socket Research SecurityWeek The Hacker News

The @antv namespace on npm is a collection of data-visualization libraries that ship as dependencies in a wide range of JavaScript projects. On May 19, 2026, 639 compromised package versions across 323 packages appeared in that namespace — identified by Socket Research. The vector was a compromised maintainer account, not a repository takeover or a build-system intrusion.

The payload is an infostealer. It exfiltrates harvested credentials through an encrypted channel to attacker-controlled infrastructure. Among the packages carrying it: echarts-for-react, which runs at approximately 1.1 million weekly downloads.

May 19 was not the campaign's first wave. Across all Mini Shai-Hulud activity to date, Socket reports 1,055 malicious versions spanning 502 unique packages. The campaign is attributed to TeamPCP. The attribution rests on two overlapping indicators. The first is Dune-themed GitHub repository infrastructure associated with the campaign. The second is an overlap with a Breached forum competition where the worm's source code was released publicly.

Any project pulling from the @antv namespace — including echarts-for-react — carries an infostealer payload risk from the compromised versions.

Replacing a single tokenizer vocabulary entry enables silent tool-call injection across any model format

HiddenLayer Research The Hacker News

A model's tokenizer.json is a vocabulary file — it controls how the model encodes and decodes text at inference time. It ships with the model in Hugging Face repositories and loads automatically. It is not part of the model weights, and changing it does not change the model's architecture.

Three attack techniques on tokenizer.json were documented May 11, 2026 by HiddenLayer researcher Divyanshu Divyanshu. Each is achievable by replacing a single entry in the vocabulary file. The first is URL proxy injection: every URL the model generates is redirected through attacker infrastructure. The second is command substitution: a predicted shell command is swapped for a different one at decode time. The third is tool-call injection: a second tool call is silently appended to every legitimate tool call the model generates.

All three techniques apply across ONNX, SafeTensors, and GGUF model formats. In GGUF, the tokenizer vocabulary is stored in the tokenizer.ggml.tokens metadata field and can be modified directly without touching the weights.

The exposure propagates downstream. A compromised tokenizer reaches every user who pulls the model from Hugging Face. It also reaches every model fine-tuned from a compromised base — the fine-tune inherits the modified vocabulary file.

Any builder pulling models from Hugging Face now has a documented attack surface in tokenizer.json. A single vocabulary entry replacement silently injects additional tool calls — no weight changes, no architecture changes.