Transformer Models Content on InfoQ
News
Transformers v5 Introduces a More Modular and Interoperable Core
Hugging Face has released the first release candidate for Transformers v5, marking a significant evolution from v4, which shipped five years earlier. The library has grown from a specialized model toolkit into a critical resource in AI development, achieving over three million installations daily and more than 1.2 billion total installs.
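For readers unfamiliar with the library, the sketch below shows its high-level pipeline API, one of the entry points the modular core sits beneath. The model id is illustrative, and whether this exact call is unchanged in v5 is an assumption, not something confirmed by the announcement.

```python
# A minimal sketch of the Transformers pipeline API; the "gpt2" checkpoint
# is illustrative, and unchanged v5 behavior is an assumption.
from transformers import pipeline

# Load a small text-generation model from the Hugging Face Hub.
generator = pipeline("text-generation", model="gpt2")

# Generate a short completion.
result = generator("Transformers v5 introduces", max_new_tokens=20)
print(result[0]["generated_text"])
```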
New IBM Granite 4 Models to Reduce AI Costs with Inference-Efficient Hybrid Mamba-2 Architecture
IBM recently announced the Granite 4.0 family of small language models. The family aims to deliver faster inference and significantly lower operational costs while maintaining acceptable accuracy compared with larger models. Granite 4.0 features a new hybrid Mamba-2/transformer architecture that substantially reduces memory requirements, enabling the models to run on much cheaper GPUs at much lower cost.
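Because Granite checkpoints are distributed through the Hugging Face Hub, a reduced memory footprint shows up directly in how the model can be loaded. The sketch below assumes a hypothetical model id; substitute a real Granite 4.0 checkpoint from the Hub.

```python
# A hedged sketch of running a Granite checkpoint with Transformers; the
# model id below is hypothetical -- substitute a real Granite 4.0 id from
# the Hugging Face Hub.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ibm-granite/granite-4.0-h-small"  # hypothetical id, for illustration

tokenizer = AutoTokenizer.from_pretrained(model_id)
# bfloat16 weights plus device_map="auto" keep the memory footprint low,
# which is where the hybrid Mamba-2/transformer design is meant to pay off.
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

inputs = tokenizer("What is Mamba-2?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```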
Dreamer 4: Learning to Achieve Goals from Offline Data through Imagination Training
Researchers from DeepMind have described a new approach for teaching intelligent agents to solve complex, long-horizon tasks by training them exclusively on video footage rather than through direct interaction with the environment. Their new agent, called Dreamer 4, demonstrated the ability to mine diamonds in Minecraft after being trained only on videos, without ever actually playing the game.
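To make the idea of imagination training concrete, the toy sketch below optimizes a policy entirely inside a learned world model, without ever querying a real environment. All networks, dimensions, and the reward head are illustrative assumptions; in Dreamer-style training the world model and reward head would first be fit to offline data, and the actor update also uses learned value estimates, which this simplified version omits. It is not DeepMind's Dreamer 4 implementation.

```python
# Toy imagination training: backpropagate imagined returns through a
# learned dynamics model to update the policy. Everything here is
# illustrative; it is not DeepMind's Dreamer 4 code.
import torch
import torch.nn as nn

STATE, ACTION, HORIZON = 16, 4, 10

# Stand-ins for networks that would normally be pretrained on offline video.
world_model = nn.Sequential(nn.Linear(STATE + ACTION, 64), nn.Tanh(), nn.Linear(64, STATE))
reward_head = nn.Sequential(nn.Linear(STATE, 32), nn.Tanh(), nn.Linear(32, 1))
policy = nn.Sequential(nn.Linear(STATE, 32), nn.Tanh(), nn.Linear(32, ACTION), nn.Tanh())

optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

for step in range(100):
    state = torch.randn(32, STATE)          # batch of imagined start states
    total_reward = torch.zeros(32, 1)
    for _ in range(HORIZON):                # roll out inside the model only
        action = policy(state)
        state = world_model(torch.cat([state, action], dim=-1))
        total_reward = total_reward + reward_head(state)
    loss = -total_reward.mean()             # maximize imagined return
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```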