Google has announced a new text-to-speech engine for Wear OS, its Android variant aimed at smartwatches and other wearables. Supporting over 50 languages, the new engine is faster than its predecessor thanks to its use of smaller ML models.
According to Google's Ouiam Koubaa and Yingzhe Li, the new text-to-speech engine is particularly geared towards low-memory devices and the use cases wearables are best suited for, including accessibility services, exercise apps, navigation cues, and reading-aloud apps.
Text-to-speech turns text into natural-sounding speech across more than 50 languages powered by Google’s machine learning (ML) technology. The new text-to-speech engine on Wear OS uses smaller and more efficient prosody ML models to bring faster synthesis on Wear OS devices.
The new text-to-speech engine does not introduce new APIs to synthesize speech, meaning developers can keep using the existing speak method along with the rest of the previously available API.
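For example, a minimal call like the one below, assuming tts is an already-initialized TextToSpeech instance and using placeholder utterance text and ID, should behave the same with the new engine:

// tts is assumed to be an already-initialized TextToSpeech instance
tts.speak("Workout complete", TextToSpeech.QUEUE_FLUSH, null, "workout-summary")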
Developers should keep in mind that the new engine takes about 10 seconds to become ready after the app is initialized. Therefore, apps that want to use speech right after launch should initialize the engine as early as possible by calling TextToSpeech(applicationContext, callback) and synthesize the desired text from the passed callback.
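A minimal Kotlin sketch of this pattern, assuming a hypothetical SpeechHelper class that owns the engine and placeholder utterance text, could look as follows:

import android.content.Context
import android.speech.tts.TextToSpeech

class SpeechHelper(context: Context) : TextToSpeech.OnInitListener {
    // Create the engine as early as possible; it may take around 10 seconds to become ready
    private val tts = TextToSpeech(context.applicationContext, this)

    override fun onInit(status: Int) {
        if (status == TextToSpeech.SUCCESS) {
            // Synthesize the desired text once the engine reports it is ready
            tts.speak("Navigation started", TextToSpeech.QUEUE_FLUSH, null, "nav-start")
        }
    }

    // Release the engine when it is no longer needed
    fun shutdown() = tts.shutdown()
}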
An additional caveat is that the new engine may synthesize speech in a language other than the user's preferred one. This may happen, for example, when placing an emergency call, in which case the language matching the locale the user is actually in takes precedence over the user's chosen UI language.
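As a rough illustration that is not part of Google's announcement, an app that prefers the device's default locale can check whether the engine supports it before speaking; the helper function below is hypothetical:

import java.util.Locale

// Hypothetical helper: request the device's default locale when the engine supports it,
// otherwise leave the engine's own language selection in place
fun speakInPreferredLanguage(tts: TextToSpeech, text: String) {
    val preferred = Locale.getDefault()
    if (tts.isLanguageAvailable(preferred) >= TextToSpeech.LANG_AVAILABLE) {
        tts.setLanguage(preferred)
    }
    tts.speak(text, TextToSpeech.QUEUE_FLUSH, null, "preferred-language")
}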
The new text-to-speech engine can be used on devices running Wear OS 4, released last July, or higher.
Besides text-to-speech synthesis, Wear OS also provides a speech-recognition service through the SpeechRecognizer API, which, however, is not suitable for continuous recognition since it relies on remote services.
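A one-shot recognition sketch using SpeechRecognizer could look like the following; the function name and callback wiring are illustrative, and the RECORD_AUDIO permission is assumed to have already been granted:

import android.content.Context
import android.content.Intent
import android.os.Bundle
import android.speech.RecognitionListener
import android.speech.RecognizerIntent
import android.speech.SpeechRecognizer

// Starts a single recognition session and forwards the top result to onResult
fun startOneShotRecognition(context: Context, onResult: (String) -> Unit) {
    val recognizer = SpeechRecognizer.createSpeechRecognizer(context)
    recognizer.setRecognitionListener(object : RecognitionListener {
        override fun onResults(results: Bundle) {
            results.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION)
                ?.firstOrNull()?.let(onResult)
            recognizer.destroy()
        }
        override fun onError(error: Int) {
            recognizer.destroy()
        }
        // Remaining callbacks left empty for brevity
        override fun onReadyForSpeech(params: Bundle?) {}
        override fun onBeginningOfSpeech() {}
        override fun onRmsChanged(rmsdB: Float) {}
        override fun onBufferReceived(buffer: ByteArray?) {}
        override fun onEndOfSpeech() {}
        override fun onPartialResults(partialResults: Bundle?) {}
        override fun onEvent(eventType: Int, params: Bundle?) {}
    })
    val intent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH)
        .putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM)
    recognizer.startListening(intent)
}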