[HOWTO] How To Use Wit.ai, Windows Speech Recognition, Google Cloud Voice Recognition, or Recognissimo in Conversations
Posted: Wed Dec 21, 2016 8:55 pm
(Link to example scenes at end of post. Also includes Windows Speech Recognition, Google, & Recognissimo examples.)
A Dialogue System user developed an interesting way to run conversations. I wanted to post a description here in case it can help others.
Subtitles can be sent to a text-to-speech plugin such as RT-Voice, for which the Dialogue System has very easy integration. This post covers the other direction: letting the player respond simply by speaking.
In place of a traditional response menu UI, his solution listens for keywords, which he refers to as "intents." When editing a response dialogue entry, he puts the intent text into the Menu Text field. When the player speaks a keyword associated with a response, the Dialogue System chooses that response.
To accomplish this, he used wit.ai, an online speech recognition service. He started with the Unity / wit.ai integration at https://github.com/afauch/wit3d
He used the parts of wit3D that initiate recording, save the recording to a file, send it to wit.ai, get back the JSON, and parse the JSON for the intent. He then passes the intent to a simple subclass of UnityUIDialogueUI that he named WitDialogueUI. (When he receives input from wit.ai, he calls a static method in WitDialogueUI.)
Using this technique, he could create a game like The Wayne Investigation on Amazon Echo, a mobile game he could play in the car without having to look at a screen, or an accessible game for visually impaired players. Pretty neat!
The basic code is here:
WitDialogueUI.cs
Code: Select all
using UnityEngine;
using PixelCrushers.DialogueSystem;

public class WitDialogueUI : UnityUIDialogueUI {

    // Use a singleton to allow access from static methods.
    public static WitDialogueUI Instance;

    public bool showMenu = true;   // Show the response menu. Useful for debugging.
    public bool listening = false; // True when listening for a Wit.AI voice command.

    public Response[] responses;

    public override void Awake() {
        base.Awake();
        Instance = this;
    }

    public override void ShowResponses(Subtitle subtitle, Response[] responses, float timeout) {
        base.ShowResponses(subtitle, responses, timeout);
        this.responses = responses; // Remember the responses to check when wit.ai returns an intent.
    }

    public override void HideResponses() {
        base.HideResponses();
        responses = null; // The response menu is done, so there are no responses to check.
    }

    // Receives the intent from the _Handler script (part of wit3d).
    public static void getIntentFromWit3d(string Wit3Dintent) {
        if (string.IsNullOrEmpty(Wit3Dintent)) return;
        if (Instance == null || Instance.responses == null) return; // No response menu is open.
        foreach (var response in Instance.responses) {
            // Exact match between the recognized intent and the entry's Menu Text:
            if (string.Equals(Wit3Dintent, response.formattedText.text)) {
                Instance.OnClick(response); // Simulate a click on a response button.
            }
        }
    }
}
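For context, the "send it to wit.ai" step he describes can be sketched as a UnityWebRequest POST to wit.ai's /speech endpoint. This is an illustrative sketch, not wit3d's actual code: the class name WitSpeechRequest, the serverAccessToken field, and the onJson callback are invented here, and UnityWebRequest.Result requires Unity 2020.1 or newer (older versions check isNetworkError/isHttpError instead).

Code: Select all
using System.Collections;
using UnityEngine;
using UnityEngine.Networking;

// Hypothetical sketch of uploading a recorded WAV clip to wit.ai.
public class WitSpeechRequest : MonoBehaviour {

    public string serverAccessToken = "YOUR_WIT_AI_TOKEN"; // Placeholder; use your app's token.

    public IEnumerator SendClip(byte[] wavData, System.Action<string> onJson) {
        var request = new UnityWebRequest("https://api.wit.ai/speech", "POST");
        request.uploadHandler = new UploadHandlerRaw(wavData);
        request.downloadHandler = new DownloadHandlerBuffer();
        request.SetRequestHeader("Authorization", "Bearer " + serverAccessToken);
        request.SetRequestHeader("Content-Type", "audio/wav");
        yield return request.SendWebRequest();
        if (request.result == UnityWebRequest.Result.Success) {
            // Raw JSON response; parse it for the intent, then pass the
            // intent string to WitDialogueUI.getIntentFromWit3d().
            onJson(request.downloadHandler.text);
        }
    }
}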
Example scene (Wit.ai): WitAI_Example_2017-01-09.unitypackage
Example scene (Windows Speech Recognition): DS_WindowsSpeechRecognitionExample_2020-08-20.unitypackage
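For the Windows Speech Recognition route, the same pattern can be sketched with Unity's built-in KeywordRecognizer (Windows 10 only). This is an illustrative guess at the approach, not the contents of the example package above; KeywordResponseListener and StartListening are hypothetical names. The idea is to feed the current responses' Menu Text in as keywords and route any recognized phrase to the same static method the wit.ai path uses.

Code: Select all
using UnityEngine;
using UnityEngine.Windows.Speech; // Windows 10 only.

// Hypothetical listener that recognizes Menu Text phrases as voice commands.
public class KeywordResponseListener : MonoBehaviour {

    private KeywordRecognizer recognizer;

    // Call with the Menu Text of the current responses, e.g. from a
    // ShowResponses override. KeywordRecognizer requires a non-empty array.
    public void StartListening(string[] keywords) {
        StopListening();
        if (keywords == null || keywords.Length == 0) return;
        recognizer = new KeywordRecognizer(keywords);
        recognizer.OnPhraseRecognized += OnPhraseRecognized;
        recognizer.Start();
    }

    // Call from a HideResponses override to stop listening.
    public void StopListening() {
        if (recognizer != null) {
            recognizer.Dispose();
            recognizer = null;
        }
    }

    private void OnPhraseRecognized(PhraseRecognizedEventArgs args) {
        // Reuse the same entry point the wit.ai path uses:
        WitDialogueUI.getIntentFromWit3d(args.text);
    }
}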