Challenge

Have you ever wondered how conversational AI agents such as Alexa and Siri work? How do they interpret what you say and grasp your intent? How do they then know how to respond appropriately and meaningfully? In this project, you are challenged to create your very own first Voice User Interface (VUI) by building a voice-driven calculator that can perform basic arithmetic operations.

Set up your computer

Voice Calculator (Level: Intermediate)

Introduction

The purpose of this AI project is to introduce the basics of a Voice User Interface (VUI) and to teach you how to design a simple AI system that can understand the user's intent in a spoken calculation question and respond appropriately. Such a voice-driven AI system can be useful in various contexts, such as assistive technologies for people who are visually impaired or elderly. For example, a visually impaired user can use the Voice Calculator to perform calculations verbally without having to type in the details of the calculation.

GUI of Voice Calculator


Important: Please note that for this project you cannot use the Emulator to test your app, as it does not have Speech Recognition capability. Similarly, your mobile device must have Speech Recognition capability for the Voice Calculator to work.

Graphical User Interface (GUI)

The GUI has been created for you in the starter file. Please change the properties of the components as you wish to get the look and feel you want. However, please do not rename the components, as this tutorial will refer to the given names in the instructions.

In the GUI you will notice that there is a Speak button which the user will press to verbally communicate the calculation they wish to have performed. The interface will then display in writing what the Calculator heard and respond, both in writing and verbally, with the result of the calculation. If the Calculator could not hear a meaningful calculation query or could not understand the intent of the user, it will say so.

Initialize numberList

The first thing you’ll tackle is extracting the numbers from the sentence spoken by the user. You’ll use them later when you perform the mathematical operation. To do this, you will first initialize a global variable named numberList, where the numbers in the calculation query will be stored. As this variable will hold a list of numbers, it is initialized to an empty list.

Initialization of global numberList

procedure extractNumbers

Then you will create a procedure called extractNumbers which, when given an input sentence, extracts the numerical values in that sentence and stores them in the global variable numberList. To do this:

  1. choose a procedure and name it extractNumbers
  2. use the settings gear to add an input parameter and call it sentence

procedure extractNumbers

procedure extractNumbers input parameter

  3. set the global variable numberList to the empty list. You need to reinitialize the variable every time you call this procedure, as each calculation the user initiates will use a new pair of numbers
  4. use the split at spaces text block to split the input sentence into a list of its words, and use the for each word in list block to check whether any of the words is a number

for each word

  5. if a word is a number, add it to the global variable numberList

procedure extractNumbers hint

procedure extractNumbers solution

Try this on your own, but if you get stuck, you can click the Hint button.
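For readers who think more easily in text code than in blocks, the steps of extractNumbers can be sketched in Python. This is only a rough analogue, not the App Inventor solution: the names `is_number` and `extract_numbers` are made up for illustration, `float()` stands in for the is number? block (the two may disagree on edge cases such as "inf"), and the sketch returns the list instead of writing to a global variable.

```python
def is_number(word):
    """Rough stand-in for the 'is number?' block: True if the word parses as a number."""
    try:
        float(word)
        return True
    except ValueError:
        return False


def extract_numbers(sentence):
    """Return the numeric words of a sentence, in the order they were spoken."""
    number_list = []  # start from an empty list on every call
    for word in sentence.split(" "):  # split the sentence at spaces
        if is_number(word):
            number_list.append(float(word))
    return number_list


print(extract_numbers("What is 123 times 85"))  # [123.0, 85.0]
print(extract_numbers("Hello how are you"))     # []
```

Starting from an empty list inside the function plays the same role as resetting numberList at the top of the procedure: numbers from a previous question can never leak into the next one.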

Multiplication Intent

As there are many ways for a user to indicate that they would like to perform a multiplication operation, it is essential to recognize all of these different phrasings as a multiplication intent. For example, all of the following statements are different ways of expressing the same multiplication intent:

Note that the key words/symbols/numbers in green define the multiplication intent while the words/symbols in red can be disregarded.

variable multiplicationIntents

Now you will create a global variable multiplicationIntents which will be a list of all the common ways of communicating a multiplication intent with symbols and words:

{ * , x , X , product, multiply, times }

global variable multiplicationIntents
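As a text-based illustration, the same list can be written in Python and used to see which markers show up in a sentence. The helper name `matched_intents` is hypothetical, and the sketch assumes a marker counts as present whenever it occurs anywhere in the sentence.

```python
# The six markers listed in the tutorial for a multiplication intent.
MULTIPLICATION_INTENTS = ["*", "x", "X", "product", "multiply", "times"]


def matched_intents(sentence):
    """List every multiplication marker that occurs somewhere in the sentence."""
    return [marker for marker in MULTIPLICATION_INTENTS if marker in sentence]


print(matched_intents("What is 123 times 85"))  # ['times']
print(matched_intents("multiply 6 by 7"))       # ['multiply']
print(matched_intents("What is 12 X 3"))        # ['X']
print(matched_intents("What is 4 plus 9"))      # []
```

Matching here is case-sensitive, which is one plausible reason the list carries both x and X.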

SpeakButton

Now you will write the code to give functionality to the Speak button. When the Speak button is clicked:

  1. clear the UserTextLabel and CalculatorTextLabel
  2. call the SpeechRecognizer to get the text of what the user has spoken

when SpeakButton Click hint

when SpeakButton Click solution

Try this on your own, but if you get stuck, you can click the Hint button.

when SpeechRecognizer gets text

When the SpeechRecognizer performs its task and returns with a text result:

  1. set the UserTextLabel to this text result. This indicates what the Calculator heard.
  2. extract the numbers from the text result to store them in the global variable numberList using the procedure extractNumbers
  3. set the CalculatorTextLabel to a default statement indicating that the Calculator could not understand what the user asked and inviting them to ask a clear calculation question. For example: “I could not understand. Please ask me a multiplication or addition or subtraction or division question like: What is 123 times 85?”

  4. check that exactly two numbers were extracted from the sentence uttered by the user and, if so, determine the intent:
    • if the intent was multiplication, set CalculatorTextLabel to the product of the two numbers
  5. use the TextToSpeech component to have the Calculator verbally read the contents of the CalculatorTextLabel.

You may find the text block contains any helpful.

text block contains any

In the following example you can see that the contains any block returns true when one of the words in the piece list (“you”) is contained in the input text (“How are you?”).

text block contains any do it
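The behavior described for contains any can be approximated in Python with `any`. This is a sketch under the assumption that the block does plain substring matching; the function name `contains_any` and the extra piece "hello" are illustrative, not taken from the tutorial.

```python
def contains_any(text, pieces):
    """Return True if at least one piece occurs anywhere in text as a substring."""
    return any(piece in text for piece in pieces)


# "you" is one of the pieces and occurs in the text, so the result is True.
print(contains_any("How are you?", ["hello", "you"]))  # True

# None of the pieces occurs in the text.
print(contains_any("How are you?", ["hello", "bye"]))  # False

# Substring matching has a side effect: the marker "x" also matches inside "six".
print(contains_any("What is six plus 2", ["x"]))       # True
```

The last call shows a limitation worth keeping in mind when testing: with substring matching, a short marker such as x can trigger on words that merely contain that letter.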

when SpeechRecognizer gets text hint

when SpeechRecognizer gets text partial solution

Try this on your own, but if you get stuck, you can click the Hint button.

Test your App for Multiplication

Now you will use the AI2 Companion to check that your app works well for a multiplication calculation. Be sure to use AI2 Companion version 2.60 or later; otherwise, the app will give errors. Also note that the Emulator cannot be used for testing, as it does not support Speech Recognition. Try stating your multiplication intent in a variety of ways to make sure that the Calculator responds with the correct product. Also make a non-calculation statement like “Hello how are you doing today?” and check that the Calculator responds appropriately by saying something like “I could not understand. Please ask me a multiplication or addition or subtraction or division question like: What is 123 times 85?”
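The same checks can be rehearsed away from the phone with a small Python sketch of the multiplication-only flow. Everything here is an approximation: `respond` and `extract_numbers` are hypothetical names, typed strings stand in for the speech recognizer's text result, `print` stands in for the labels and TextToSpeech, and results appear as Python floats (for example 10455.0), which may differ from how the app displays them.

```python
MULTIPLICATION_INTENTS = ["*", "x", "X", "product", "multiply", "times"]
DEFAULT_REPLY = (
    "I could not understand. Please ask me a multiplication or addition "
    "or subtraction or division question like: What is 123 times 85?"
)


def extract_numbers(sentence):
    """Collect the words of the sentence that parse as numbers."""
    numbers = []
    for word in sentence.split(" "):
        try:
            numbers.append(float(word))
        except ValueError:
            pass  # not a number, skip this word
    return numbers


def respond(heard):
    """Return the Calculator's reply for one recognized sentence."""
    numbers = extract_numbers(heard)
    reply = DEFAULT_REPLY  # default until a valid calculation is recognized
    if len(numbers) == 2 and any(m in heard for m in MULTIPLICATION_INTENTS):
        reply = str(numbers[0] * numbers[1])
    return reply


for utterance in [
    "What is 123 times 85",
    "multiply 6 and 7",
    "the product of 2.5 and 4",
    "Hello how are you doing today?",
]:
    print(utterance, "->", respond(utterance))
```

The first three utterances should yield 10455.0, 42.0 and 10.0, while the greeting falls through to the default reply because no numbers were extracted.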

Other operations

Now you will create three more global variables for the other operations: additionIntents, subtractionIntents and divisionIntents.

The following symbols/words are common ways of indicating intent for each operation

procedure additionIntents solution

procedure subtractionIntents solution

procedure divisionIntents solution

Try this on your own, but if you get stuck, you can click the Hint button.

Now revise the when SpeechRecognizer1.AfterGettingText event handler to include the extra operations.

Using the settings gear you can access the following version of the if then block which can be helpful:

if then else if hint

when SpeechRecognizer gets text full solution

Try this on your own, but if you get stuck, you can click the Hint button.
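The if / else if structure of the revised handler can be sketched in Python as follows. Treat it as an illustration only: the multiplication list comes from the tutorial, but the addition, subtraction and division lists below are guesses made for this example, the function names are hypothetical, and the division-by-zero reply is an extra safeguard that the tutorial does not mention.

```python
MULTIPLICATION_INTENTS = ["*", "x", "X", "product", "multiply", "times"]
# The next three lists are assumptions made for this sketch.
ADDITION_INTENTS = ["+", "plus", "add", "sum"]
SUBTRACTION_INTENTS = ["-", "minus", "subtract", "difference"]
DIVISION_INTENTS = ["/", "divide", "over"]

DEFAULT_REPLY = (
    "I could not understand. Please ask me a multiplication or addition "
    "or subtraction or division question like: What is 123 times 85?"
)


def contains_any(text, pieces):
    """True if any piece occurs in text as a substring."""
    return any(piece in text for piece in pieces)


def calculate(heard, numbers):
    """Pick the first matching intent and apply it to the two extracted numbers."""
    reply = DEFAULT_REPLY
    if len(numbers) == 2:
        first, second = numbers
        if contains_any(heard, MULTIPLICATION_INTENTS):
            reply = str(first * second)
        elif contains_any(heard, ADDITION_INTENTS):
            reply = str(first + second)
        elif contains_any(heard, SUBTRACTION_INTENTS):
            reply = str(first - second)
        elif contains_any(heard, DIVISION_INTENTS):
            if second != 0:
                reply = str(first / second)
            else:
                reply = "I cannot divide by zero."  # assumption: not in the tutorial
    return reply


print(calculate("What is 7 plus 5", [7.0, 5.0]))   # 12.0
print(calculate("What is 9 minus 4", [9.0, 4.0]))  # 5.0
print(calculate("divide 10 by 4", [10.0, 4.0]))    # 2.5
```

Because the branches are tried in order, a sentence that matches more than one list is handled by the first one that matches, so the order of the checks is part of the design.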

Test Your App again

Congratulations, you have built your first voice-driven AI system. Test it thoroughly to make sure that your Voice Calculator can correctly respond to a variety of different utterances for each operation intended.

Expand Your App

About Youth Mobile Power

A lot of us spend all day on our phones, hooked on our favorite apps. We keep typing and swiping, even when we know the risks phones can pose to our attention, privacy, and even our safety. But the computers in our pockets also create untapped opportunities for young people to learn, connect and transform our communities.

That’s why MIT and YR Media teamed up to launch the Youth Mobile Power series. YR teens produce stories highlighting how young people use their phones in surprising and powerful ways. Meanwhile, the team at MIT is continually enhancing MIT App Inventor to make it possible for users like you to create apps like the ones featured in YR’s reporting.

Essentially: get inspired by the story, get busy making your own app!

The YR + MIT collaboration is supported in part by the National Science Foundation. This material is based upon work supported by the National Science Foundation under Grant No. (1906895, 1906636). Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

Check out more apps and interactive news content created by YR here.