The Allen Institute for AI (Ai2) research team has introduced OLMo 2, a new family of open-source language models available in 7 billion (7B) and 13 billion (13B) parameter configurations. Trained on up to 5 trillion tokens, these models improve training stability, adopt a staged training process, and draw on more diverse datasets.
OLMo 2's architecture improves on layer normalization by employing RMSNorm, and adds rotary positional embeddings and Z-loss regularization to enhance model robustness. The training process followed a two-stage curriculum: the first stage used the OLMo-Mix-1124 dataset, comprising 3.9 trillion tokens drawn from high-quality sources such as DCLM and Starcoder; the second stage fine-tuned the models on Dolmino-Mix-1124, a curated dataset of 843 billion tokens featuring web-based and domain-specific content.
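To make these two stabilization techniques concrete, the sketch below shows a minimal PyTorch version of RMSNorm and a Z-loss auxiliary penalty added to the standard cross-entropy objective. This is illustrative only, not Ai2's implementation; the epsilon, the Z-loss coefficient, and the dummy vocabulary size are assumptions chosen for the example.

```python
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    """Root-mean-square layer normalization: rescales by the RMS of each
    vector, with no mean subtraction and no bias term."""
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Multiply by the reciprocal root-mean-square of the last dimension.
        rms = x.pow(2).mean(dim=-1, keepdim=True).add(self.eps).rsqrt()
        return self.weight * x * rms

def z_loss(logits: torch.Tensor, coeff: float = 1e-4) -> torch.Tensor:
    """Z-loss penalizes the log of the softmax normalizer Z = sum(exp(logits)),
    discouraging logits from drifting to extreme values during training."""
    log_z = torch.logsumexp(logits, dim=-1)
    return coeff * (log_z ** 2).mean()

# Usage: the penalty is simply added to the usual language-modeling loss.
logits = torch.randn(4, 32000)            # (batch, vocab) dummy logits
targets = torch.randint(0, 32000, (4,))   # dummy next-token targets
loss = nn.functional.cross_entropy(logits, targets) + z_loss(logits)
```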
Techniques like model souping, which averages the weights of multiple checkpoints into a single model to optimize performance, were crucial in producing the final 7B and 13B models. The performance of OLMo 2 sets new benchmarks in open-source language modeling, demonstrating a significant boost across all evaluation tasks compared to its predecessor, OLMo-0424.
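A minimal sketch of what checkpoint souping can look like is shown below: a uniform average of parameter tensors across several saved checkpoints. The file paths and the choice of a simple uniform average are assumptions for illustration, not the exact recipe used for OLMo 2.

```python
import torch

def soup(checkpoint_paths):
    """Average the parameters of several checkpoints into one 'souped' state dict.
    Assumes all checkpoints share the same architecture and parameter names."""
    state_dicts = [torch.load(p, map_location="cpu") for p in checkpoint_paths]
    averaged = {}
    for name in state_dicts[0]:
        stacked = torch.stack([sd[name].float() for sd in state_dicts])
        averaged[name] = stacked.mean(dim=0)
    return averaged

# Hypothetical usage with placeholder checkpoint files:
# merged = soup(["ckpt_run_a.pt", "ckpt_run_b.pt", "ckpt_run_c.pt"])
# model.load_state_dict(merged)
```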
Notably, OLMo 2 7B outperforms Llama-3.1 8B, and OLMo 2 13B surpasses Qwen 2.5 7B, despite utilizing fewer training FLOPs. Evaluation using the Open Language Modeling Evaluation System (OLMES), a suite of 20 benchmarks, confirmed these gains, highlighting strengths in knowledge recall, reasoning, and general language capabilities.
The development of OLMo 2 marks a significant shift in the language modeling landscape, addressing challenges such as training stability and evaluation transparency. By setting a new standard for open-source AI, these models demonstrate the potential of collaborative innovation in advancing artificial intelligence, paving the way for more equitable technological advancements.
The AI community has responded enthusiastically to OLMo 2's launch, recognizing Ai2 for its commitment to open source.
AI researcher Constantine Dee commented on X:
Ai2 has unveiled OLMo 2, the world's leading open-source AI model. Built with transparent datasets and training, it's a game-changer for creating diverse content.
While user Billy462 shared on Reddit:
This release is extremely significant. For those that don't know Allen AI are a research institute who are releasing completely open models. That means that all of their results can be reproduced (and improved upon) from scratch.
The OLMo 2 models are available, along with their weights, data, code, recipes, and intermediate checkpoints. The introduction of OLMES provides structured benchmarks to guide model development and track progress effectively. Additionally, post-training methodologies, including supervised fine-tuning, preference tuning, and reinforcement learning with verifiable rewards, have enhanced the models' instruction-following capabilities.
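Since the weights are openly released, trying the models locally is straightforward with the Hugging Face transformers library. The snippet below is a minimal sketch; the repository id follows Ai2's naming on Hugging Face but should be checked against the official release page, and the prompt is only a placeholder.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Presumed repository id for the 7B base model; verify on Ai2's Hugging Face page.
model_id = "allenai/OLMo-2-1124-7B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Open language models are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```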