Have you ever wondered how conversational AI agents such as Alexa and Siri work? How do they interpret what you are saying to them and grasp your intent? How do they then know how to respond to you appropriately and meaningfully? In this project you are challenged to create your very own first Voice User Interface (VUI) by building a voice-driven calculator that can do basic arithmetic operations.
The purpose of this AI project is to give you a sense of the basics of a Voice User Interface (VUI) and to teach you how to design a simple AI system that can understand the intent of the user in a verbally stated calculation question and respond appropriately. Such a voice-driven AI system can be useful in various contexts such as when designing assistive technologies for the visually disabled and the elderly. For example, a visually impaired user can use the Voice Calculator to do mathematical calculations verbally without having to type in all the details of the calculation.
Important: You cannot use the Emulator to test your app for this project, as it does not have Speech Recognition capability. Similarly, your mobile device must have Speech Recognition capability for the Voice Calculator to work.
The GUI has been created for you in the starter file. Please change the properties of the components as you wish to get the look and feel you want. However, please do not rename the components, as this tutorial will refer to the given names in the instructions.
In the GUI you will notice that there is a Speak button which the user will press to verbally communicate the calculation they wish to have performed. The interface will then display in writing what the Calculator heard and respond, both in writing and verbally, with the result of the calculation. If the Calculator could not hear a meaningful calculation query or could not understand the intent of the user, it will say so.
The first thing you’ll tackle is extracting the numbers from the sentence spoken by the user. You’ll use them later when you actually perform the mathematical operation. To do this, you will first initialize a global variable named numberList where the numbers in the calculation query will be stored. As this variable will be a list of numbers, it will be initialized to an empty list.
Then you will create a procedure called extractNumbers which when given an input sentence will extract the numerical values in that sentence and store these in the global variable numberList. To do this:
Try this on your own but if you get stuck you can click the Hint button.
As there are many ways for a user to indicate that they would like to perform a multiplication operation, it is essential to recognize all of these different phrasings as a multiplication intent. For example, all of the following statements are different ways of expressing the same multiplication intent:
Note that the key words/symbols/numbers in green define the multiplication intent while the words/symbols in red can be disregarded.
Now you will create a global variable multiplicationIntents which will be a list of all the common ways of communicating a multiplication intent with symbols and words:
{ * , x , X , product, multiply, times }
Now you will write the code to give functionality to the Speak button. When the Speak button is clicked:
Try this on your own but if you get stuck you can click the Hint button.
When the SpeechRecognizer performs its task and returns with a text result:
You may find the text block contains any helpful.
In the following example you can see that the contains any block returns true when one of the words in the piece list (“you”) is contained in the input text (“How are you?”).
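The behavior described for the contains any block can be approximated in Python as a substring check. This is a rough sketch of the behavior as described above, not the block's actual implementation; the function name contains_any and the extra piece "they" are made up for illustration.

```python
def contains_any(text, pieces):
    """Return True if at least one piece appears somewhere inside text."""
    return any(piece in text for piece in pieces)


# "you" appears inside "How are you?", so the result is True.
print(contains_any("How are you?", ["you", "they"]))  # True
# Neither piece appears, so the result is False.
print(contains_any("How are you?", ["we", "they"]))   # False
```

Because this is a substring check, a very short piece such as "x" also matches inside longer words (for example "six"), which is worth keeping in mind when you test your app.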
Try this on your own but if you get stuck you can click the Hint button.
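As a point of comparison, here is one way the logic for handling the recognized text might look in Python for the multiplication-only stage. It is a sketch under stated assumptions: the function name after_getting_text, the wording of the success reply, and the rule that exactly two numbers must be present are all hypothetical, since the tutorial's own blocks may be arranged differently.

```python
import re

# The list given in this tutorial for multiplication intents.
multiplication_intents = ["*", "x", "X", "product", "multiply", "times"]

# Fallback reply, taken from the testing step of this tutorial.
FALLBACK = (
    "I could not understand. Please ask me a multiplication or addition "
    "or subtraction or division question like: What is 123 times 85?"
)


def after_getting_text(result):
    """Return the calculator's reply for the recognized text result."""
    # Assumption: the recognizer reports numbers as whole-number digits.
    numbers = [int(n) for n in re.findall(r"\d+", result)]
    is_multiplication = any(p in result for p in multiplication_intents)
    # Assumption: a valid query contains exactly two numbers.
    if is_multiplication and len(numbers) == 2:
        first, second = numbers
        return f"{first} times {second} is {first * second}"
    return FALLBACK


print(after_getting_text("What is 123 times 85?"))  # 123 times 85 is 10455
print(after_getting_text("Hello how are you doing today?"))  # fallback reply
```

In the app itself, the reply would be both shown on screen and spoken aloud; the sketch only returns the text.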
Now you will use the AI Companion to check that your app works well for a multiplication calculation. Be sure to use AI2 Companion version 2.60 or later; otherwise the app will give errors. Also note that an Emulator cannot be used for testing, as it does not support Speech Recognition. Try to state your multiplication intent in a variety of ways to make sure that the Calculator responds with the correct product. Also make a non-calculation statement like “Hello how are you doing today?” and check that the Calculator responds appropriately by saying something like “I could not understand. Please ask me a multiplication or addition or subtraction or division question like: What is 123 times 85?”
Now you will create three more global variables for the other operations: additionIntents, subtractionIntents and divisionIntents.
The following symbols/words are common ways of indicating intent for each operation:
Try this on your own but if you get stuck you can click the Hint button.
Now revise the when SpeechRecognizer1.AfterGettingText event handler to include the extra operations.
Using the settings gear you can access the following version of the if then block which can be helpful:
Try this on your own but if you get stuck you can click the Hint button.
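The extended if / else if / else structure can be sketched in Python as follows. Only multiplication_intents is taken from this tutorial; the other three lists, the function names, the reply wording, and the divide-by-zero reply are illustrative guesses, not the tutorial's official answer.

```python
import re

# From the tutorial text.
multiplication_intents = ["*", "x", "X", "product", "multiply", "times"]
# Assumptions: plausible intent words for the other three operations.
addition_intents = ["+", "plus", "add", "sum"]
subtraction_intents = ["-", "minus", "subtract", "difference"]
division_intents = ["/", "divide", "quotient"]

NOT_UNDERSTOOD = "I could not understand."


def contains_any(text, pieces):
    """Return True if at least one piece appears somewhere inside text."""
    return any(piece in text for piece in pieces)


def answer(result):
    """Pick an operation from the recognized text and compute the reply."""
    numbers = [int(n) for n in re.findall(r"\d+", result)]
    # Assumption: a valid query contains exactly two numbers.
    if len(numbers) != 2:
        return NOT_UNDERSTOOD
    a, b = numbers
    if contains_any(result, multiplication_intents):
        return f"The answer is {a * b}"
    elif contains_any(result, addition_intents):
        return f"The answer is {a + b}"
    elif contains_any(result, subtraction_intents):
        # Numbers are used in spoken order, so "subtract 4 from 10"
        # would come out as 4 - 10 in this simple sketch.
        return f"The answer is {a - b}"
    elif contains_any(result, division_intents):
        if b == 0:
            return "I cannot divide by zero."
        return f"The answer is {a / b}"
    else:
        return NOT_UNDERSTOOD


print(answer("What is 12 plus 30?"))       # The answer is 42
print(answer("What is 20 divided by 5?"))  # The answer is 4.0
```

The order of the branches matters: the first intent list that matches wins, so a sentence that happens to contain pieces from two lists is treated as the earlier operation.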
Congratulations, you have built your first voice-driven AI system. Test it thoroughly to make sure that your Voice Calculator can correctly respond to a variety of different utterances for each operation intended.
At this point the calculator works much like a beginning language learner listening to a native speaker in a foreign country: it uses a few known key words to identify the speaker’s intent and ignores every other word, hoping they are irrelevant. For example, in the following sentence, all the words in red are irrelevant and can be ignored, while the words in green are highly relevant in defining the intent of the speaker:
Would you be so kind, oh dear amazing Calculator, to tell me the product of the most glorious number 73 and the supremely wondrous quantity 51 ?
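To make the keyword-spotting idea concrete, the short Python sketch below sorts the words of that sentence into the ones the calculator acts on and the ones it ignores. The function name split_relevant is hypothetical, and the sketch compares whole words rather than substrings, which is a simplification chosen here for readability.

```python
import re

multiplication_intents = ["*", "x", "X", "product", "multiply", "times"]

sentence = (
    "Would you be so kind, oh dear amazing Calculator, to tell me the "
    "product of the most glorious number 73 and the supremely wondrous "
    "quantity 51 ?"
)


def split_relevant(text):
    """Separate intent words and numbers from everything else."""
    relevant, ignored = [], []
    for word in text.split():
        token = word.strip(",.?!")  # drop surrounding punctuation
        if not token:
            continue  # the word was punctuation only
        if token in multiplication_intents or re.fullmatch(r"\d+", token):
            relevant.append(token)
        else:
            ignored.append(token)
    return relevant, ignored


relevant, ignored = split_relevant(sentence)
print(relevant)      # ['product', '73', '51']
print(len(ignored))  # every other word is simply skipped
```

Three words out of the whole sentence are enough for the calculator to decide what to do.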
A lot of us spend all day on our phones, hooked on our favorite apps. We keep typing and swiping, even when we know the risks phones can pose to our attention, privacy, and even our safety. But the computers in our pockets also create untapped opportunities for young people to learn, connect and transform our communities.
That’s why MIT and YR Media teamed up to launch the Youth Mobile Power series. YR teens produce stories highlighting how young people use their phones in surprising and powerful ways. Meanwhile, the team at MIT is continually enhancing MIT App Inventor to make it possible for users like you to create apps like the ones featured in YR’s reporting.
Essentially: get inspired by the story, get busy making your own app!
The YR + MIT collaboration is supported in part by the National Science Foundation. This material is based upon work supported by the National Science Foundation under Grant No. (1906895, 1906636). Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.