Want to add a voice to your web app? The Text to Speech (TTS) feature in JavaScript lets you convert text into spoken words, making your app more accessible and engaging for users. Whether you are building a tool for reading articles aloud or offering assistance to visually impaired users, the speechSynthesis API makes it simple.
What Is the Text to Speech JavaScript API?
The Text to Speech API is part of the Web Speech API, which lets browsers convert text into speech through the speechSynthesis object. You can use it to:
- Convert any text into speech.
- Customize speech properties like voice, rate, pitch, and volume.
- Control playback—start, pause, resume, or stop whenever you want.
The best part? The majority of contemporary browsers support it, and you don't need any external libraries to make it work.
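Since support is broad but not universal, it can help to check for the API before relying on it. The sketch below is one way to do that. The helper name `supportsSpeech` is made up for this example, and it takes the global object as a parameter so the check can also be tried outside a browser.

```javascript
// Hypothetical helper: returns true when the given global object
// exposes both pieces of the API used in this tutorial.
function supportsSpeech(globalObject) {
  return Boolean(globalObject) &&
    'speechSynthesis' in globalObject &&
    'SpeechSynthesisUtterance' in globalObject;
}

// In a browser, pass window. Outside a browser (for example in Node.js)
// there is no window, so this block is skipped.
if (typeof window !== 'undefined' && !supportsSpeech(window)) {
  console.warn('Text to Speech is not supported in this browser.');
}
```

If the check fails, you could hide the Speak button or show a short message instead of letting the call throw.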
Getting Started with Text to Speech
Let's move straight to the basic code. First, create a SpeechSynthesisUtterance object with the text you want spoken. Then pass it to speechSynthesis.speak() to have the browser read it out.
Example:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Text to Speech Example</title>
</head>
<body>
<textarea id="textInput" rows="4" cols="50" placeholder="Type something..."></textarea><br>
<button onclick="speak()">Speak</button>
<script>
function speak() {
  const text = document.getElementById('textInput').value;
  const utterance = new SpeechSynthesisUtterance(text);
  window.speechSynthesis.speak(utterance);
}
</script>
</body>
</html>
Explanation:
- The textarea is where users can type what they want to hear.
- When the 'Speak' button is clicked, the speak() function is invoked.
- Inside speak(), the text from the textarea is converted into a SpeechSynthesisUtterance, which the speechSynthesis API then reads out loud.
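One detail the basic example leaves open is what happens with empty input or repeated clicks. speechSynthesis.speak() adds each utterance to a queue, so clicking several times plays the text several times. Here is a minimal sketch of one way to handle both cases. `speakText` and its parameters are hypothetical names, and the synthesizer and utterance constructor are passed in so the logic can be exercised without a browser.

```javascript
// Hypothetical helper. synth is expected to behave like window.speechSynthesis,
// and Utterance like the SpeechSynthesisUtterance constructor.
function speakText(text, synth, Utterance) {
  const trimmed = text.trim();
  if (trimmed === '') {
    return false; // nothing to say
  }
  synth.cancel(); // drop anything still queued or playing
  synth.speak(new Utterance(trimmed));
  return true;
}

// Browser usage (assumes the textarea from the example above):
// speakText(document.getElementById('textInput').value,
//           window.speechSynthesis, SpeechSynthesisUtterance);
```

Whether to cancel or to let utterances queue up is a design choice. Cancelling suits a single Speak button, while queuing suits reading a list of items in order.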
Pretty simple, right? But we can do more than just speak the text—we can control how it sounds.
Adjusting Speech Rate, Pitch, and Volume
Want to slow down the speech or give it a higher pitch? The SpeechSynthesisUtterance object lets you tweak the speech properties to get it just right.
Example:
function speak() {
  const text = document.getElementById('textInput').value;
  const utterance = new SpeechSynthesisUtterance(text);
  // Adjust speech properties
  utterance.rate = 1; // Speed of speech (1 is normal, 0.5 is slower, 2 is faster)
  utterance.pitch = 1; // Pitch of the voice (0 is low, 2 is high)
  utterance.volume = 1; // Volume level (0 is mute, 1 is full volume)
  window.speechSynthesis.speak(utterance);
}
The values shown here are the defaults, so change them to hear the difference. This is handy if you need to make the speech clearer or just want to have some fun with it.
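Each property has a limited range: the Web Speech API specifies 0.1 to 10 for rate, 0 to 2 for pitch, and 0 to 1 for volume. If the values come from user input such as sliders, clamping them first keeps the settings within those ranges. The helper below is an illustrative sketch, and `clampSpeechSettings` is not part of the API.

```javascript
// Allowed ranges and defaults for each SpeechSynthesisUtterance property.
const LIMITS = {
  rate: { min: 0.1, max: 10, fallback: 1 },
  pitch: { min: 0, max: 2, fallback: 1 },
  volume: { min: 0, max: 1, fallback: 1 },
};

// Hypothetical helper: returns a new object with every value forced into
// its allowed range. Values that Number() cannot parse (including undefined)
// fall back to the default.
function clampSpeechSettings(settings = {}) {
  const result = {};
  for (const [name, { min, max, fallback }] of Object.entries(LIMITS)) {
    const value = Number(settings[name]);
    result[name] = Number.isFinite(value)
      ? Math.min(max, Math.max(min, value))
      : fallback;
  }
  return result;
}

// Browser usage:
// Object.assign(utterance, clampSpeechSettings({ rate: 20, pitch: -1 }));
```

Individual speech engines may sound unnatural well before the extremes, so a narrower slider range is often the more practical choice.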
Choosing a Different Voice
Did you know that browsers come with different voices? You can pick from various accents, genders, and languages—whatever suits your app's personality.
Example:
function speak() {
  const text = document.getElementById('textInput').value;
  const utterance = new SpeechSynthesisUtterance(text);
  // Get all available voices (read fresh on every call)
  const voices = window.speechSynthesis.getVoices();
  // Pick a voice (in this case, the third one)
  if (voices.length > 2) {
    utterance.voice = voices[2];
  } else {
    console.log('Not enough voices available.');
  }
  window.speechSynthesis.speak(utterance);
}
// The voice list can arrive after the page loads; log when it is ready.
// Do not call speak() here, or the page would start talking on its own.
window.speechSynthesis.onvoiceschanged = function() {
  console.log('Voices loaded:', window.speechSynthesis.getVoices().length);
};
Different voices can be fun and engaging, adding a unique flair to your application. Since the voices available may vary between browsers and operating systems, it's a good idea to give users some choices.
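Because a fixed index such as voices[2] can point to a different voice on every device, selecting by language is often more predictable. The sketch below shows one possible approach. `pickVoice` is a hypothetical helper, and it relies only on the lang and default properties that each SpeechSynthesisVoice exposes.

```javascript
// Hypothetical helper: prefer a voice whose language tag starts with the
// requested prefix, then the browser's default voice, then the first voice.
// Returns null when the list is empty.
function pickVoice(voices, lang) {
  const wanted = lang.toLowerCase();
  return (
    voices.find((voice) => voice.lang.toLowerCase().startsWith(wanted)) ||
    voices.find((voice) => voice.default) ||
    voices[0] ||
    null
  );
}

// Browser usage:
// const voice = pickVoice(window.speechSynthesis.getVoices(), 'en');
// if (voice) utterance.voice = voice;
```

The same list can also fill a dropdown, using each voice's name and lang as the label, so users can choose for themselves.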
Handling Voices Asynchronously
Sometimes, the voices are not immediately available when the page loads. To make sure your app waits until they have loaded, you can listen for the voiceschanged event.
Example:
let availableVoices = [];
function loadVoices() {
  availableVoices = window.speechSynthesis.getVoices();
  console.log('Loaded voices:', availableVoices);
}
// Some browsers return the voices right away, so try once immediately
loadVoices();
// Others load voices asynchronously and fire voiceschanged when ready
window.speechSynthesis.onvoiceschanged = loadVoices;
function speak() {
  const text = document.getElementById('textInput').value;
  const utterance = new SpeechSynthesisUtterance(text);
  if (availableVoices.length > 0) {
    utterance.voice = availableVoices[0]; // Use the first available voice
  } else {
    console.log('Voices not ready yet!');
  }
  window.speechSynthesis.speak(utterance);
}
This way, speak() uses a loaded voice when one is available and falls back to the browser's default voice while the list is still empty.
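An alternative is to defer any work that needs a voice until the list is non-empty. The following sketch wraps that idea in a callback helper. `whenVoicesReady` is a hypothetical name, and the code assumes the synthesizer supports addEventListener as the specification describes; older browsers may only offer the onvoiceschanged property.

```javascript
// Hypothetical helper: run callback(voices) once a non-empty voice list exists.
// synth is expected to behave like window.speechSynthesis.
function whenVoicesReady(synth, callback) {
  const voices = synth.getVoices();
  if (voices.length > 0) {
    callback(voices); // already loaded, no need to wait
    return;
  }
  function handleChange() {
    const loaded = synth.getVoices();
    if (loaded.length === 0) return; // still empty, keep listening
    synth.removeEventListener('voiceschanged', handleChange);
    callback(loaded);
  }
  synth.addEventListener('voiceschanged', handleChange);
}

// Browser usage:
// whenVoicesReady(window.speechSynthesis, (voices) => {
//   console.log('Loaded voices:', voices.length);
// });
```

The callback runs at most once per call, which makes it a convenient place to populate a voice picker or enable the Speak button.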
Pausing, Resuming, and Cancelling Speech
For longer pieces of text, you might want to give users more control—let them pause, resume, or even stop the speech. Here's how you can do that.
Example:
function pauseSpeech() {
  window.speechSynthesis.pause();
}
function resumeSpeech() {
  window.speechSynthesis.resume();
}
function stopSpeech() {
  window.speechSynthesis.cancel();
}
These functions are useful when users want to manage the speech playback, especially if the content is long.
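These calls are often combined into a single play/pause button. The sketch below picks the next action from the synthesizer's speaking and paused properties. `togglePlayback` and the `startSpeaking` callback are hypothetical names introduced for this example.

```javascript
// Hypothetical helper: decides what one play/pause button should do.
// synth is expected to behave like window.speechSynthesis.
function togglePlayback(synth, startSpeaking) {
  if (synth.paused) {
    synth.resume();
    return 'resumed';
  }
  if (synth.speaking) {
    synth.pause();
    return 'paused';
  }
  startSpeaking(); // nothing is playing, so start from the beginning
  return 'started';
}

// Browser usage (speak is the function defined earlier, toggleButton is
// an assumed button element):
// toggleButton.onclick = () => togglePlayback(window.speechSynthesis, speak);
```

The returned string can be used to update the button label, for example switching between "Pause" and "Resume".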
Conclusion
In this tutorial, you've learned how to implement Text to Speech in JavaScript using the speechSynthesis API. You've seen how to:
- Turn text into speech.
- Adjust the speech's rate, pitch, and volume.
- Choose different voices.
- Control playback with pause, resume, and stop functions.
This feature can add a lot of value to your app, making it more accessible and interactive. Now go ahead and give your app a voice!