It’s Monday morning and you get out of bed to get ready for the office. You wonder whether it will rain. You pick up your iPhone 4S, tap the microphone button and ask the question-
‘Siri, Will I need an umbrella today?’
‘The weather is sunny today in Sydney. Hence no need for the umbrella,’ comes the prompt reply.
Welcome to the world of voice recognition and artificial intelligence. With the new iPhone 4S, mobile phones have transformed how we use our electronic devices, offering better productivity and convenience.
The Siri application has taken the world by storm by performing basic tasks such as taking notes, sending email, scheduling appointments and many more, just by following voice commands. It is better than the other speech recognition software we have encountered so far because, instead of relying on predetermined words, Siri actually tries to understand what you say and performs tasks accordingly. This leads to an important question: how does Siri work?
When voice input is given to Siri, a chain of algorithms tries to understand what the command means and produces the output accordingly. The following steps occur when you give a command to Siri:
How Does Siri Work?
- Your voice is picked up by the noise-cancelling microphone of the iPhone, encoded into digital format and stored temporarily.
- The digital data is then sent over your internet connection, through your ISP, to the Siri server in the cloud, which is loaded with predetermined AI algorithms that interpret the data and return suitable feedback to your phone.
- Meanwhile, the digital data on your iPhone is analyzed locally by a built-in recognizer (which communicates with the server). It tries to determine whether the command can be resolved locally (for example, creating contacts or schedules) or whether the phone must connect to the internet to execute it (for more sophisticated commands such as a weather report or sending an email).
- The server runs the data through its statistical algorithms and tries to understand the voice command based on the letters and their order, arriving at a solution. At the same time, the local recognizer on the phone compares the data against a shorter version of the statistical algorithm and arrives at its own solution. Whichever of the two solutions has the higher probability wins, and a go-ahead signal is given to the corresponding channel.
- The response is then passed through a language model, which compares it with a list of probable interpretations to further ensure its accuracy. The phone then produces voice feedback to ensure that the command was properly understood. If the program is satisfied with the result, it executes the task corresponding to what it understood.
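The arbitration between the phone and the server described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Apple's implementation: the `Hypothesis` class, the `LOCAL_INTENTS` list, the function names and the probability values are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """One recognizer's guess at what was said (hypothetical structure)."""
    text: str           # the interpreted command
    probability: float  # the recognizer's confidence, between 0 and 1
    source: str         # "local" (on the phone) or "server" (in the cloud)


# Assumed examples of commands the phone could resolve without the internet.
LOCAL_INTENTS = ("create contact", "make an appointment", "take a note")


def can_resolve_locally(text: str) -> bool:
    """Return True if the command matches one of the assumed on-device intents."""
    lowered = text.lower()
    return any(intent in lowered for intent in LOCAL_INTENTS)


def pick_best(local: Hypothesis, server: Hypothesis) -> Hypothesis:
    """Keep whichever solution has the higher probability (ties go to local)."""
    return local if local.probability >= server.probability else server


if __name__ == "__main__":
    local = Hypothesis("make an appointment with Jack", 0.72, "local")
    server = Hypothesis("make an appointment with Jack", 0.91, "server")
    best = pick_best(local, server)
    print(f"Go-ahead given to the {best.source} channel: {best.text}")
    print("Resolvable on the phone:", can_resolve_locally(best.text))
```

A real recognizer would score many candidate interpretations rather than two, but the selection rule is the same: the solution with the highest probability gets the go-ahead.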
Image © Divya Rawat
For instance, if it understands that you want to make an appointment with a certain contact, Jack, it will ask ‘Do you want to make an appointment with Jack?’ to ensure that it understood your command properly.
If it receives a yes or a similar response (which is scrutinized in the same manner), it will go ahead and create a new entry in the appointment folder, with an appropriate note linked to the contact Jack.
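The confirm-then-execute step with Jack can be sketched as follows. Again, this is a simplified illustration under stated assumptions: the `AFFIRMATIVE` word list, the function names and the dictionary used for an appointment entry are hypothetical, and a real assistant would run the reply through the same recognition pipeline rather than a word lookup.

```python
# Assumed set of replies that count as "yes or a similar response".
AFFIRMATIVE = {"yes", "yeah", "yep", "sure", "ok", "okay"}


def confirmation_prompt(contact: str) -> str:
    """Build the question asked before anything is created."""
    return f"Do you want to make an appointment with {contact}?"


def is_affirmative(reply: str) -> bool:
    """Check whether the first word of the reply is an affirmative one."""
    words = reply.lower().strip(" .!?").split()
    return bool(words) and words[0].strip(",.!?") in AFFIRMATIVE


def confirm_and_create(contact: str, reply: str, appointments: list) -> bool:
    """Add an appointment linked to the contact only if the reply confirms it."""
    if not is_affirmative(reply):
        return False
    appointments.append({"contact": contact, "note": f"Appointment with {contact}"})
    return True


if __name__ == "__main__":
    appointments = []
    print(confirmation_prompt("Jack"))
    created = confirm_and_create("Jack", "Yes, please.", appointments)
    print("Created:", created, appointments)
```

Nothing is written to the appointment list unless the reply is recognized as a confirmation, which mirrors the check described above.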
The beauty of the program is that the whole process takes less than 3 seconds and can provide instant feedback and solutions for your queries. Siri can therefore be regarded as a good start in the field of artificial intelligence. Although it is not perfect for now and makes a lot of mistakes, with further improvement in R&D, the future will be ruled by AI.
Guest author Divya Rawat is a mother and a self-confessed SEO and tech enthusiast. She works as a writer at iNetZeal, which offers a variety of services, including content writing.