Want to add a voice to your web app? The Text to Speech (TTS) feature in JavaScript lets you convert text into spoken words, making your app more accessible and engaging for users. Whether you are building a tool for reading articles aloud or offering assistance to visually impaired users, the speechSynthesis API makes it simple.



What Is the Text to Speech JavaScript API?

The Text to Speech API is part of the Web Speech API. It lets browsers convert text into speech through the speechSynthesis object. With it, you can:

  • Convert any text into speech.
  • Customize speech properties like voice, rate, pitch, and volume.
  • Control playback—start, pause, resume, or stop whenever you want.

The best part? The majority of contemporary browsers support it, and you don't need any external libraries to make it work.
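Since support is broad but not universal, it can help to check for the API before relying on it. Here is a minimal sketch of such a check. The helper name supportsSpeech is made up for illustration and is not part of the API, and it takes the global object as a parameter so it can be tried outside a browser too.

```javascript
// Returns true when both pieces of the speech synthesis API are present
// on the given global object (pass `window` in a browser).
function supportsSpeech(globalObject) {
    return 'speechSynthesis' in globalObject &&
           'SpeechSynthesisUtterance' in globalObject;
}

// Browser usage: let the user know when the API is missing.
if (typeof window !== 'undefined' && !supportsSpeech(window)) {
    console.log('Text to Speech is not supported in this browser.');
}
```

In a real page you would probably hide or disable the Speak button instead of logging a message.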

Getting Started with Text to Speech

Let's go straight to the basic code. First, create a SpeechSynthesisUtterance object with the text you want spoken. Then pass it to speechSynthesis.speak(), and the browser reads it aloud.

Example:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Text to Speech Example</title>
</head>
<body>

    <textarea id="textInput" rows="4" cols="50" placeholder="Type something..."></textarea><br>
    <button onclick="speak()">Speak</button>

    <script>
        function speak() {
            const text = document.getElementById('textInput').value;
            const utterance = new SpeechSynthesisUtterance(text);
            window.speechSynthesis.speak(utterance);
        }
    </script>

</body>
</html>

Explanation:

  1. The textarea is where users can type what they want to hear.
  2. When the 'Speak' button is clicked, the speak() function is invoked.
  3. Inside speak(), the text from the textarea is converted into a SpeechSynthesisUtterance, which the speechSynthesis API then reads out loud.
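One detail worth knowing: speechSynthesis.speak() adds the utterance to a queue, so clicking the button several times makes the browser read the text several times in a row. The sketch below shows one possible way to handle that, along with skipping empty input. The function name speakText and its parameters are assumptions for illustration; the synthesizer and the utterance constructor are passed in so the function can be tried without a browser.

```javascript
// Speak `text`, replacing anything that is already queued.
// In a page, pass window.speechSynthesis and window.SpeechSynthesisUtterance.
function speakText(synth, Utterance, text) {
    const trimmed = text.trim();
    if (trimmed === '') {
        return false; // nothing to say
    }
    synth.cancel(); // drop utterances that are playing or still waiting
    synth.speak(new Utterance(trimmed));
    return true;
}
```

In the page above, the button handler could call speakText(window.speechSynthesis, SpeechSynthesisUtterance, text) instead of creating the utterance itself.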

Pretty simple, right? But we can do more than just speak the text—we can control how it sounds.

Adjusting Speech Rate, Pitch, and Volume

Want to slow down the speech or give it a higher pitch? The SpeechSynthesisUtterance object lets you tweak the speech properties to get it just right.

Example:

function speak() {
    const text = document.getElementById('textInput').value;
    const utterance = new SpeechSynthesisUtterance(text);
    
    // Adjust speech properties
    utterance.rate = 1;   // Speed of speech (0.1 to 10; 1 is normal, 0.5 is slower, 2 is faster)
    utterance.pitch = 1;  // Pitch of the voice (0 to 2; 1 is the default)
    utterance.volume = 1; // Volume level (0 is mute, 1 is full volume)

    window.speechSynthesis.speak(utterance);
}

These three properties let you tune how the speech sounds. The values shown are the defaults, so change them to hear the difference. This is handy when you need slower, clearer speech or just want to have some fun with it.
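If the values come from sliders or other user input, it may be worth keeping them inside the ranges the API defines (rate 0.1 to 10, pitch 0 to 2, volume 0 to 1). Here is a small sketch of that idea; the helper names clamp and applySpeechSettings are assumptions for illustration, not part of the API.

```javascript
// Keep a number inside [min, max].
function clamp(value, min, max) {
    return Math.min(max, Math.max(min, value));
}

// Copy rate, pitch, and volume onto an utterance, clamped to the ranges
// the Web Speech API defines. Missing settings fall back to the defaults.
function applySpeechSettings(utterance, settings = {}) {
    const { rate = 1, pitch = 1, volume = 1 } = settings;
    utterance.rate = clamp(rate, 0.1, 10);
    utterance.pitch = clamp(pitch, 0, 2);
    utterance.volume = clamp(volume, 0, 1);
    return utterance;
}
```

For example, applySpeechSettings(utterance, { rate: 0.8 }) slows the voice a little and leaves pitch and volume at their defaults.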

Choosing a Different Voice

Did you know that browsers come with different voices? You can pick from various accents, genders, and languages—whatever suits your app's personality.

Example:

function speak() {
    const text = document.getElementById('textInput').value;
    const utterance = new SpeechSynthesisUtterance(text);
    
    // Get all available voices
    const voices = window.speechSynthesis.getVoices();
    
    // Pick a voice (in this case, the third one)
    if (voices.length > 2) {
        utterance.voice = voices[2];
    } else {
        console.log('Not enough voices available.');
    }

    window.speechSynthesis.speak(utterance);
}

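// The voice list can load after the page does. speak() calls getVoices()
// on every click, so this handler only needs to report the refreshed list.
window.speechSynthesis.onvoiceschanged = function() {
    const voices = window.speechSynthesis.getVoices();
    console.log('Voices available: ' + voices.length);
};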

Different voices can be fun and engaging, adding a unique flair to your application. Since the voices available may vary between browsers and operating systems, it's a good idea to give users some choices.
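Picking a voice by its position in the list is fragile, because the order differs from one device to the next. Each voice object has a lang property (for example "en-GB"), so one alternative is to search by language. Here is a rough sketch; the helper name findVoiceByLang is an assumption for illustration, and the underscore handling is a defensive guess about how some platforms format language tags.

```javascript
// Return the first voice whose language tag matches `lang`.
// An exact match such as "en-GB" wins; otherwise any voice sharing the
// primary language ("en") is used. Returns null when nothing matches.
function findVoiceByLang(voices, lang) {
    const wanted = lang.toLowerCase();
    const primary = wanted.split('-')[0];
    // Assumption: some platforms may report "en_US" instead of "en-US",
    // so underscores are normalized before comparing.
    const tagOf = (voice) => voice.lang.replace('_', '-').toLowerCase();

    const exact = voices.find((voice) => tagOf(voice) === wanted);
    if (exact) {
        return exact;
    }
    return voices.find((voice) => tagOf(voice).split('-')[0] === primary) || null;
}
```

You could then write utterance.voice = findVoiceByLang(window.speechSynthesis.getVoices(), 'en-GB'). A null result leaves the utterance on the browser's default voice.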

Handling Voices Asynchronously

Sometimes the voices are not available right away when the page loads. To make sure your app waits for them, you can listen for the voiceschanged event.

Example:

let availableVoices = [];

function loadVoices() {
    availableVoices = window.speechSynthesis.getVoices();
    console.log('Loaded voices:', availableVoices);
}

// Some browsers return voices right away; others only after voiceschanged fires
loadVoices();
window.speechSynthesis.onvoiceschanged = loadVoices;

function speak() {
    const text = document.getElementById('textInput').value;
    const utterance = new SpeechSynthesisUtterance(text);

    if (availableVoices.length > 0) {
        utterance.voice = availableVoices[0]; // Use the first available voice
    } else {
        console.log('Voices not ready yet; using the default voice.');
    }

    window.speechSynthesis.speak(utterance);
}

With this in place, speak() uses a loaded voice when one is available and falls back to the browser's default voice when the list is still empty.
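If you would rather wait until the list is actually there, the same event can be wrapped in a Promise. The sketch below is one possible approach, not the only one. The function name waitForVoices and the two-second default timeout are assumptions for illustration; the timeout is there so the promise still settles if the event never fires.

```javascript
// Resolve with the voice list as soon as it is non-empty, or with whatever
// is available once `timeoutMs` has elapsed.
// In a page, pass window.speechSynthesis as `synth`.
function waitForVoices(synth, timeoutMs = 2000) {
    return new Promise((resolve) => {
        const initial = synth.getVoices();
        if (initial.length > 0) {
            resolve(initial); // voices were already loaded
            return;
        }
        let timer = null;
        const finish = () => {
            clearTimeout(timer);
            synth.removeEventListener('voiceschanged', finish);
            resolve(synth.getVoices());
        };
        synth.addEventListener('voiceschanged', finish);
        timer = setTimeout(finish, timeoutMs);
    });
}
```

A caller could then write waitForVoices(window.speechSynthesis).then((voices) => { ... }) and build a voice picker once the promise resolves.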

Pausing, Resuming, and Cancelling Speech

For longer pieces of text, you might want to give users more control—let them pause, resume, or even stop the speech. Here's how you can do that.

Example:

function pauseSpeech() {
    window.speechSynthesis.pause();
}

function resumeSpeech() {
    window.speechSynthesis.resume();
}

function stopSpeech() {
    window.speechSynthesis.cancel();
}

These functions are useful when users want to manage the speech playback, especially if the content is long.
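If you prefer a single play/pause button over three separate ones, the speaking and paused properties on speechSynthesis tell you which action makes sense. Here is a small sketch; the function name togglePlayback and its return values are assumptions for illustration, and the synthesizer is passed in so the logic can be tried without a browser.

```javascript
// Pause if speech is playing, resume if it is paused.
// Returns the action taken so the caller can update a button label.
function togglePlayback(synth) {
    // Check `paused` first: `speaking` stays true while speech is paused.
    if (synth.paused) {
        synth.resume();
        return 'resumed';
    }
    if (synth.speaking) {
        synth.pause();
        return 'paused';
    }
    return 'idle'; // nothing is being spoken
}
```

In a page, a click handler could call togglePlayback(window.speechSynthesis) and set the button text based on the result.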

Conclusion

In this tutorial, you've learned how to implement Text to Speech in JavaScript using the speechSynthesis API. You've seen how to:

  • Turn text into speech.
  • Adjust the speech's rate, pitch, and volume.
  • Choose different voices.
  • Control playback with pause, resume, and stop functions.

This feature can add a lot of value to your app, making it more accessible and interactive. Now go ahead and give your app a voice!


