Harshit Prasad

Software Engineer - @blinkit, all about search systems - data and infrastructure stuff. Loves to talk about distributed systems and open source.

This post explains how I implemented the text-to-speech feature in Susper search.

This post was originally published on the FOSSASIA Blog.

Susper has a voice search feature that gives users a better search experience. We enhanced speech recognition by adding a Speech Synthesis (Text-To-Speech) feature. Speech synthesis should only work when a voice search is attempted.

The idea was to create speech synthesis similar to the market leader's. Here is the link to a YouTube video showing a demo of the feature: Video link

The video demonstrates two cases: if a manual search is used, the feature should not work; if voice search is used, the feature should work.

To implement this feature, we used the Speech Synthesis API, which is available in Google Chrome 33 and above.

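'speechSynthesis' in window can be used to check whether the browser supports this feature. Note that speak() expects a SpeechSynthesisUtterance rather than a plain string, so a quick manual test looks like window.speechSynthesis.speak(new SpeechSynthesisUtterance('Hello world!'));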
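Feature detection can also be wrapped in a small helper. The following is a minimal sketch and not Susper's actual code: the name `supportsSpeechSynthesis` is hypothetical, and the global object is passed in as a parameter (an assumption made here so the function can be exercised outside a browser too).

```typescript
// Hypothetical helper (not part of Susper): reports whether the given
// global object exposes the two members the Speech Synthesis API needs.
function supportsSpeechSynthesis(globalObject: unknown): boolean {
  if (typeof globalObject !== "object" || globalObject === null) {
    return false;
  }
  const candidate = globalObject as Record<string, unknown>;
  return (
    typeof candidate["speechSynthesis"] === "object" &&
    candidate["speechSynthesis"] !== null &&
    typeof candidate["SpeechSynthesisUtterance"] === "function"
  );
}
```

In a browser this would be called as `supportsSpeechSynthesis(window)` before attempting to speak.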

First, we created an interface:

interface IWindow extends Window {
  SpeechSynthesisUtterance: any;
  speechSynthesis: any;
}

Then we created the SpeechSynthesisService class, decorated with @Injectable:

import { Injectable, NgZone } from "@angular/core";

@Injectable()
export class SpeechSynthesisService {
  utterance: any;

  constructor(private zone: NgZone) {}

  // Speaks the given text using the browser's speech synthesis queue.
  speak(text: string): void {
    const { SpeechSynthesisUtterance, speechSynthesis }: IWindow = <IWindow>window;
    this.utterance = new SpeechSynthesisUtterance();
    this.utterance.text = text; // text to be spoken
    this.utterance.lang = "en-US"; // default language
    this.utterance.volume = 1; // can be set between 0 and 1
    this.utterance.rate = 1; // can be set between 0.1 and 10
    this.utterance.pitch = 1; // can be set between 0 and 2

    speechSynthesis.speak(this.utterance);
  }

  // Pauses the queue of utterances.
  pause(): void {
    const { speechSynthesis }: IWindow = <IWindow>window;
    speechSynthesis.pause();
  }
}

The above code implements the Text-To-Speech feature.
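The service hard-codes volume, rate and pitch to 1. If these were ever made configurable, the values would need to stay within the ranges the Web Speech API defines (volume 0 to 1, rate 0.1 to 10, pitch 0 to 2). A minimal sketch of such a guard follows; the `VoiceOptions` type and the `clampVoiceOptions` function are hypothetical names introduced for illustration and are not part of Susper.

```typescript
// Value ranges defined by the Web Speech API for an utterance.
const RANGES = {
  volume: { min: 0, max: 1 },
  rate: { min: 0.1, max: 10 },
  pitch: { min: 0, max: 2 },
} as const;

// Hypothetical options type (an assumption, not Susper code).
interface VoiceOptions {
  volume: number;
  rate: number;
  pitch: number;
}

// Restricts a number to the closed interval [min, max].
function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value));
}

// Returns a copy of the options with each field forced into its valid range.
function clampVoiceOptions(options: VoiceOptions): VoiceOptions {
  return {
    volume: clamp(options.volume, RANGES.volume.min, RANGES.volume.max),
    rate: clamp(options.rate, RANGES.rate.min, RANGES.rate.max),
    pitch: clamp(options.pitch, RANGES.pitch.min, RANGES.pitch.max),
  };
}
```

The clamped values could then be assigned to the utterance inside `speak()` in place of the literal 1s.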

We call speech synthesis only when voice search mode is activated. Here, we used Redux to check whether the mode is 'speech'. When it is, the service should utter the description inside the infobox.

We made the following changes in infobox.component.ts:

import { SpeechSynthesisService } from "../speech-synthesis.service";

speechMode: any;

constructor(private synthesis: SpeechSynthesisService) { }

this.query$ = store.select(fromRoot.getwholequery);

this.query$.subscribe(query => {
    this.keyword = query.query;
    this.speechMode = query.mode;
});

We then added a conditional statement to check whether the mode is 'speech':

// Speak the infobox description only when the mode is 'speech'.
if (this.speechMode === "speech") {
    this.startSpeaking(this.results[0].description);
}

// Hands the description to the speech synthesis service.
startSpeaking(description: string): void {
    this.synthesis.speak(description);
    this.synthesis.pause();
}
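The conditional above reads `this.results[0].description` directly, which would throw if a voice search returned no infobox results. One way to guard against that is to move the decision into a small pure function. This is a sketch under assumptions: the `InfoboxResult` shape and the `textToSpeak` name are hypothetical, and only the `description` field and the 'speech' mode value come from the code above.

```typescript
// Hypothetical shape of one infobox result; only `description`
// is taken from the post, everything else is an assumption.
interface InfoboxResult {
  description?: string;
}

// Returns the text to speak, or null when nothing should be spoken:
// either the mode is not "speech" or there is no usable description.
function textToSpeak(mode: string, results: InfoboxResult[]): string | null {
  if (mode !== "speech") {
    return null;
  }
  const description = results[0]?.description?.trim();
  return description ? description : null;
}
```

Inside the component, the check could then read `const text = textToSpeak(this.speechMode, this.results); if (text) { this.startSpeaking(text); }`.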
