Deep Learning Content on InfoQ

AI, ML & Data Engineering

Microsoft Announces Small Language Model Phi-2

Microsoft Research announced Phi-2, a 2.7 billion-parameter Transformer-based language model. Phi-2 is trained on 1.4T tokens of synthetic data generated by GPT-3.5 and outperforms larger models on a variety of benchmarks.

Anthony Alford
on Dec 19, 2023
AI, ML & Data Engineering

Apple Open-sources Apple Silicon-Optimized Machine Learning Framework MLX

Apple's MLX combines familiar APIs, composable function transformations, and lazy computation to create a machine learning framework inspired by NumPy and PyTorch that is optimized for Apple Silicon. Implemented in Python and C++, the framework aims to provide a user-friendly and efficient solution to train and deploy machine learning models on Apple Silicon.

Sergio De Simone
on Dec 17, 2023
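
The lazy computation mentioned in the MLX summary above can be illustrated with a small sketch. This is not the MLX API: the `Lazy` class and its `constant` and `eval` methods are hypothetical names invented for illustration. The sketch only shows the general idea that operations build a graph first and values are computed when explicitly requested.

```python
# Hypothetical illustration of lazy computation; NOT the MLX API.
class Lazy:
    """A node in a deferred computation graph."""

    def __init__(self, fn, *parents):
        self._fn = fn            # function that produces this node's value
        self._parents = parents  # upstream Lazy nodes this node depends on
        self._value = None
        self._done = False

    @staticmethod
    def constant(value):
        # A leaf node that simply returns a fixed value when evaluated.
        return Lazy(lambda: value)

    def __add__(self, other):
        return Lazy(lambda a, b: a + b, self, other)

    def __mul__(self, other):
        return Lazy(lambda a, b: a * b, self, other)

    def eval(self):
        # Evaluate parents first, then this node; cache the result.
        if not self._done:
            args = [p.eval() for p in self._parents]
            self._value = self._fn(*args)
            self._done = True
        return self._value


x = Lazy.constant(3)
y = Lazy.constant(4)
z = x * y + x            # builds a graph; nothing has been computed yet
computed_before_eval = z._done
result = z.eval()        # the computation happens here
```

In this sketch, `z` stays unevaluated until `eval()` is called, which is the property that lets a lazy framework skip or fuse work that is never needed.
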
AI, ML & Data Engineering

Microsoft's Orca 2 LLM Outperforms Models That Are 10x Larger

Microsoft Research released its Orca 2 LLM, a fine-tuned version of Llama 2 that performs as well as or better than models that contain 10x the number of parameters. Orca 2 uses a synthetic training dataset and a new technique called Prompt Erasure to achieve this performance.

Anthony Alford
on Dec 12, 2023
AI, ML & Data Engineering

Google Launches New Multi-Modal Gemini AI Model

On December 6, Alphabet released the first phase of its next-generation AI model, Gemini. Development of Gemini was overseen and driven by CEO Sundar Pichai and Google DeepMind. Gemini is the first model to outperform human experts on MMLU (Massive Multitask Language Understanding), one of the most popular benchmarks for testing the performance of language models.

Andrew Hoblitzell
on Dec 11, 2023
AI, ML & Data Engineering

Stability AI Open-Sources Video Generation Model Stable Video Diffusion

Stability AI released the code and model weights for Stable Video Diffusion (SVD), a video generation AI model. When given an input image as context, the model can generate 25 video frames at a resolution of 576x1024 pixels.

Anthony Alford
on Dec 05, 2023
AI, ML & Data Engineering

Meta Announces Generative AI Models Emu Video and Emu Edit

Meta AI Research announced two new generative AI models: Emu Video, which can generate short videos given a text prompt, and Emu Edit, which can edit images given text-based instructions. Both models are based on Meta's Emu foundation model and exhibit state-of-the-art performance on several benchmarks.

Anthony Alford
on Nov 28, 2023
AI, ML & Data Engineering

Spotify Open-Sources Voyager Nearest-Neighbor Search Library

Spotify Engineering recently open-sourced Voyager, an approximate nearest-neighbor (ANN) search library. Voyager is based on the hierarchical navigable small worlds (HNSW) algorithm and is 10 times faster than Spotify's previous ANN library, Annoy.

Anthony Alford
on Nov 21, 2023
AI, ML & Data Engineering

xAI Introduces Large Language Model Grok

xAI, the AI company founded by Elon Musk, recently announced Grok, a large language model. Grok can access current knowledge of the world via the X platform and outperforms other LLMs of comparable size, including GPT-3.5, on several benchmarks.

Anthony Alford
on Nov 14, 2023
AI, ML & Data Engineering

AWS Unveils Gemini, a Distributed Training System for Swift Failure Recovery in Large Model Training

AWS and Rice University have introduced Gemini, a new distributed training system designed to speed up failure recovery when training large-scale deep learning models. According to the research paper, Gemini stores checkpoints in CPU memory to make failure recovery much faster, overcoming obstacles related to high recovery costs and constrained checkpoint storage capacity.

Daniel Dominguez
on Nov 10, 2023
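
The idea of recovering from a failure using state held in CPU memory, as described in the summary above, can be sketched in a few lines. This is a single-process toy under stated assumptions, not the system from the paper: `InMemoryCheckpointer`, `train`, and the `fail_at` parameter are hypothetical names, and the "training step" is a stand-in that just increments a number.

```python
import copy


class InMemoryCheckpointer:
    """Keeps the most recent training state in host (CPU) memory."""

    def __init__(self):
        self._snapshot = None

    def save(self, step, state):
        # Deep-copy so later updates to `state` do not alter the snapshot.
        self._snapshot = (step, copy.deepcopy(state))

    def restore(self):
        if self._snapshot is None:
            raise RuntimeError("no checkpoint saved yet")
        step, state = self._snapshot
        return step, copy.deepcopy(state)


def train(total_steps, fail_at, checkpoint_every=2):
    """Run a toy training loop with one simulated failure at step `fail_at`."""
    ckpt = InMemoryCheckpointer()
    state = {"weight": 0.0}
    step = 0
    failed = False
    ckpt.save(step, state)
    while step < total_steps:
        if step == fail_at and not failed:
            # Simulated failure: roll back to the last in-memory checkpoint.
            failed = True
            step, state = ckpt.restore()
            continue
        state["weight"] += 1.0  # stand-in for one optimizer update
        step += 1
        if step % checkpoint_every == 0:
            ckpt.save(step, state)
    return state, step


final_state, final_step = train(total_steps=5, fail_at=3)
```

Because the snapshot lives in memory, the rollback costs only a copy; the trade-off in this toy is that a snapshot held by a single process is lost if that process dies, which a real distributed system has to address separately.
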
AI, ML & Data Engineering

Microsoft Releases DeepSpeed-FastGen for High-Throughput Text Generation

Microsoft has announced the alpha release of DeepSpeed-FastGen, a system designed to improve the deployment and serving of large language models (LLMs). DeepSpeed-FastGen combines DeepSpeed-MII and DeepSpeed-Inference and is based on the Dynamic SplitFuse technique. The system currently supports several model architectures.

Andrew Hoblitzell
on Nov 07, 2023
AI, ML & Data Engineering

PyTorch 2.1 Release Supports Automatic Dynamic Shape Support and Distributed Training Enhancements

PyTorch Conference 2023 presented an overview of PyTorch 2.1. ExecuTorch was introduced to enhance PyTorch's performance on mobile and edge devices. The conference also had a focus on community with new members added to the PyTorch Foundation and a Docathon announced.

Andrew Hoblitzell
on Oct 25, 2023
DevOps

TorchServe Potentially Exposed to Remote Code Execution

Israel-based security company Oligo has uncovered multiple vulnerabilities in TorchServe, the tool used to serve PyTorch models, that could allow an attacker to run arbitrary code on vulnerable systems. The vulnerabilities have been promptly fixed in TorchServe version 0.8.2.

Sergio De Simone
on Oct 16, 2023
AI, ML & Data Engineering

Stability AI Releases Generative Audio Model Stable Audio

Harmonai, the audio research lab of Stability AI, has released Stable Audio, a diffusion model for text-controlled audio generation. Stable Audio is trained on 19,500 hours of audio data and can generate 44.1 kHz audio in real time using a single NVIDIA A100 GPU.

Anthony Alford
on Oct 10, 2023
AI, ML & Data Engineering

Unpacking How Ads Ranking Works @ Pinterest: Aayush Mudgal at QCon San Francisco

At QCon San Francisco, Aayush Mudgal gave a talk on Pinterest's ad ranking strategy. Pinterest performs both candidate retrieval and ranking, drawing on user interaction data and the content the user is currently viewing. Neural networks create embeddings for ads and users, and ads whose embeddings are close to a user's embedding are expected to be relevant. Models are trained and deployed on a daily basis.

Roland Meertens
on Oct 03, 2023
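
The embedding-based relevance idea in the summary above can be sketched as a similarity ranking. This is a minimal illustration, not Pinterest's implementation: the functions `cosine_similarity` and `rank_ads`, the choice of cosine similarity as the closeness measure, and the two-dimensional example vectors are all assumptions made for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_ads(user_embedding, ad_embeddings, top_k=2):
    """Return the ids of the top_k ads closest to the user embedding."""
    scored = [
        (cosine_similarity(user_embedding, emb), ad_id)
        for ad_id, emb in ad_embeddings.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [ad_id for _, ad_id in scored[:top_k]]


# Made-up embeddings; real ones would come from trained neural networks.
user = [1.0, 0.0]
ads = {
    "hiking_boots": [0.9, 0.1],
    "tent": [0.7, 0.7],
    "blender": [0.0, 1.0],
}
top = rank_ads(user, ads)
```

Scoring every ad like this is only practical for small candidate sets; at larger scale a retrieval stage would typically narrow the candidates before ranking.
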
AI, ML & Data Engineering

Meta Open-Sources Multilingual Translation Foundation Model SeamlessM4T

Meta recently open-sourced Massively Multilingual & Multimodal Machine Translation (SeamlessM4T), a multilingual translation AI that can translate both speech audio and text data across nearly 100 languages. SeamlessM4T is trained on 1 million hours of audio data and outperforms the current state-of-the-art speech-to-text translation model.

Anthony Alford
on Sep 19, 2023