
OpenAI Developer Day 2024 (SF) Announces Real-Time API, Vision Fine-Tuning, and More

On October 1, OpenAI SF DevDay 2024 introduced several new features in addition to hosting workshops, breakout sessions, and demos. The new features unveiled include a Real-Time API with function calling, vision fine-tuning, model distillation, and prompt caching.

The Real-Time API allows for persistent WebSocket connections, enabling real-time voice interactions. This capability is crucial for applications that require instantaneous responses, such as virtual assistants and real-time translation services. The API lets developers send and receive JSON-formatted events representing various interaction elements such as text, audio, function calls, and interruptions, and it can also handle simultaneous multimodal output.

"While the pricing is around $0.30 per minute, it brings in new possibilities. It even supports function calling, so the AI can perform actions—not just chat," said Jonathan Ijzerman.

The event walked through a code snippet establishing a socket connection, sending a message from the client, and receiving a response from the server.
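A session along those lines might be sketched as follows. This is an illustrative sketch, not OpenAI's demo code: the helper `make_user_message` is our own, the event payloads follow the shapes described in the Realtime API documentation, and the header-passing keyword may vary across `websockets` library versions.

```python
import json

# Realtime API WebSocket endpoint (model name shown as documented at launch).
REALTIME_URL = "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview"

def make_user_message(text):
    """Build a conversation.item.create event carrying a user text message."""
    return {
        "type": "conversation.item.create",
        "item": {
            "type": "message",
            "role": "user",
            "content": [{"type": "input_text", "text": text}],
        },
    }

async def run_session(api_key):
    """Connect, send one message, ask for a response, and read events until done."""
    import websockets  # third-party: pip install websockets
    headers = {"Authorization": f"Bearer {api_key}", "OpenAI-Beta": "realtime=v1"}
    # Note: older websockets versions call this keyword extra_headers instead.
    async with websockets.connect(REALTIME_URL, additional_headers=headers) as ws:
        await ws.send(json.dumps(make_user_message("Hello!")))
        await ws.send(json.dumps({"type": "response.create"}))
        async for raw in ws:
            event = json.loads(raw)
            print(event["type"])
            if event["type"] == "response.done":
                break
```

Because the protocol is plain JSON events over a long-lived socket, the same loop can interleave audio chunks, function-call events, and interruptions.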

The function calling feature, demonstrated through a travel agent app, allows the AI to access external tools and databases, effectively acting as an intermediary that can perform tasks beyond its pre-trained knowledge. OpenAI acknowledged the need for greater user control over safety settings, potentially through a future "safety API."
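The mechanics behind a demo like that can be sketched briefly: the developer declares each tool as a JSON schema, the model emits a function-call event with JSON arguments, and the application dispatches the call locally and returns the result. The tool name `get_flight_status` and its fields below are hypothetical, chosen only to mirror the travel-agent scenario.

```python
import json

# Hypothetical tool declaration the model would receive; name and fields are illustrative.
tools = [{
    "type": "function",
    "name": "get_flight_status",
    "description": "Look up the current status of a flight by flight number.",
    "parameters": {
        "type": "object",
        "properties": {"flight_number": {"type": "string"}},
        "required": ["flight_number"],
    },
}]

def handle_function_call(event):
    """Dispatch a model-emitted function call to local code and return the result."""
    args = json.loads(event["arguments"])
    if event["name"] == "get_flight_status":
        # A real app would query an airline API here; stubbed for the sketch.
        return {"flight_number": args["flight_number"], "status": "on time"}
    raise ValueError(f"unknown tool: {event['name']}")

result = handle_function_call(
    {"name": "get_flight_status", "arguments": '{"flight_number": "UA123"}'}
)
```

The result is then sent back to the model as another event, letting it compose a natural-language answer grounded in live data.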

The o1 model was also used in a coding demo. Developers can leverage o1 not only to generate code but to understand and architect it, as demonstrated by a developer who built an iPhone app by simply describing it to o1. OpenAI did acknowledge that metrics like SWE-bench, which focus on code accuracy, may not fully capture the model's real-world effectiveness in scenarios such as UI development.

OpenAI also announced they were expanding fine-tuning to vision models, allowing developers to customize the models for specific tasks. The fine-tuning framework includes options for adjusting hyperparameters such as epochs and learning rate multipliers. An integration with Weights & Biases provides a toolset for tracking and analyzing fine-tuning jobs, offering insights into model performance. "We continuously run automated safety evals on fine-tuned models and monitor usage to ensure applications adhere to our usage policies," OpenAI noted about safety concerns.
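Concretely, vision fine-tuning reuses the chat-format JSONL training files, with image parts alongside text in the user message. The record builder below is our own illustration; the commented job-creation call follows the shape of the OpenAI fine-tuning API, with placeholder file ID, project name, and hyperparameter values.

```python
import json

def vision_example(image_url, label):
    """One JSONL training record for vision fine-tuning (chat format with an image part)."""
    return {
        "messages": [
            {"role": "user", "content": [
                {"type": "text", "text": "What does this image show?"},
                {"type": "image_url", "image_url": {"url": image_url}},
            ]},
            {"role": "assistant", "content": label},
        ]
    }

line = json.dumps(vision_example("https://example.com/sign.jpg", "A stop sign"))

# Launching the job (requires the openai package and an API key); values are placeholders:
# from openai import OpenAI
# client = OpenAI()
# client.fine_tuning.jobs.create(
#     training_file="file-abc123",
#     model="gpt-4o-2024-08-06",
#     hyperparameters={"n_epochs": 3, "learning_rate_multiplier": 2},
#     integrations=[{"type": "wandb", "wandb": {"project": "vision-ft"}}],
# )
```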

OpenAI introduced a model distillation API and new evaluation tools to make their APIs more affordable. Distillation allows developers to create smaller models while trying to maintain the larger model's performance, which is crucial for deploying AI in environments with limited computational resources. Prompt caching reduces latency by reusing previously processed prompts. Developers may optimize their prompts for caching by structuring them with static content at the beginning and dynamic content at the end, ensuring maximum cache hits. "OpenAI prompt caching is not as big a discount as Gemini and Anthropic, but works without code changes. Let's see how long they cache," said Shawn Wang.
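The static-first structuring advice can be illustrated with a small sketch (the system prompt and examples below are invented): everything that never changes goes first, so repeated requests share a byte-identical prefix the cache can match.

```python
# Keep the static prefix identical across calls so the cache matches it;
# only the final user message varies per request.
STATIC_SYSTEM_PROMPT = (
    "You are a support assistant for Example Corp. "  # stands in for a long, fixed policy text
    "Always answer in a friendly tone."
)
FEW_SHOT_EXAMPLES = [
    {"role": "user", "content": "How do I reset my password?"},
    {"role": "assistant", "content": "Open Settings > Account > Reset password."},
]

def build_messages(user_query):
    """Static content (system prompt, examples) first; the dynamic query last."""
    return [
        {"role": "system", "content": STATIC_SYSTEM_PROMPT},
        *FEW_SHOT_EXAMPLES,
        {"role": "user", "content": user_query},
    ]

a = build_messages("Where is my invoice?")
b = build_messages("Cancel my plan.")
# Everything before the final user message is identical across the two requests,
# which is exactly the shared prefix a prompt cache can reuse.
assert a[:-1] == b[:-1]
```

Putting the dynamic query first instead would break the shared prefix at the very first token, defeating the cache entirely.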

Other events are coming up in London on October 30 and in Singapore on November 21. Developers interested in learning more may refer to the documentation released alongside the event.
