Rome was not built in a day, and neither was Amazon Alexa. Here are the key insights from an Amazon blog post that explain why Alexa keeps getting smarter.
Amazon Alexa has been the talk of the town since its launch. Driven by voice commands, it lets users complete a plethora of tasks far more easily than in the past.
But did you know that it is not merely a search engine that works on voice commands? It actually uses AI and machine learning to improve its results based on the patterns in how a consumer uses it.
Here is everything else that makes Amazon Alexa special, very, very special!
Amazon Alexa’s research and development
Rohit Prasad, head scientist on the Alexa team, has explained in a blog post published by Amazon the five major categories that fall under Alexa’s research and development. In simpler words, he has tried to explain what makes Amazon Alexa one of a kind.
He said, “We will continue to make Alexa more useful and delightful by shifting the cognitive burden for more complex tasks from our customers to Alexa. I’m optimistic that our investments in all layers of our AI stack will continue to make Alexa smarter at a breakneck pace.”
Here are the five major categories falling under Amazon Alexa’s research and development that have made it what it is.
Competence by learning new skills and improving existing ones
Alexa learns new skills and works on improving existing ones – a trait that separates leaders from followers.
The blog mentions that the Alexa team has been able to reduce Alexa’s error rate in every location and language Alexa is available in. Prasad said, “Alexa features more than 50,000 skills built by third party developers. We are helping democratize AI through our Alexa Skills Kit.”
Rohit has credited ‘active learning’ as one of the most important techniques behind Alexa’s massive growth in skills. With this technique, the system sorts through training data to extract the examples likely to yield the most significant improvements in accuracy. The researchers on the Alexa team have also found that active learning can reduce the data needed to train a machine learning system by up to a staggering 97 per cent. Active learning has also enabled rapid improvement in Alexa’s language understanding systems.
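To make the idea concrete, here is a minimal sketch of one common active-learning criterion, uncertainty sampling, where the examples a model is least sure about are sent for labeling first. The blog does not say which selection criterion the Alexa team uses, so the function names, the sample utterances and the confidence scores below are all assumptions made purely for illustration.

```python
from __future__ import annotations

# Hypothetical sketch of uncertainty sampling, one common active-learning
# criterion. Nothing here is taken from Alexa's actual systems.


def uncertainty(prob_positive: float) -> float:
    """Score how unsure a binary classifier is; a probability of 0.5 scores highest."""
    return 1.0 - abs(prob_positive - 0.5) * 2.0


def select_for_labeling(pool: dict[str, float], budget: int) -> list[str]:
    """Pick the `budget` utterances the model is least sure about.

    `pool` maps each unlabeled utterance to the model's predicted
    probability that it belongs to the positive class.
    """
    ranked = sorted(pool, key=lambda text: uncertainty(pool[text]), reverse=True)
    return ranked[:budget]


# Made-up utterances and model confidences, for illustration only.
pool = {
    "play some jazz": 0.98,
    "turn it up a bit": 0.55,
    "what's the weather": 0.03,
    "put on the thing from earlier": 0.48,
}
chosen = select_for_labeling(pool, budget=2)
print(chosen)
```

Only the two ambiguous utterances are chosen for annotation, while the two the model already handles confidently are skipped. That is the intuition behind needing far less labeled data.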
Rohit also described a breakthrough that leads to faster development of new deep learning networks. It combines deep learning for natural language understanding with transfer learning, in which a network already trained on a task with a large set of training data is retrained on a related task for which less data is available.
“We are rolling this out in the coming months to all skills. What this will do is give 15 percent relative improvement in accuracy for custom skills with no additional work from the third-party developer,” Rohit mentioned.
Context awareness by sharing info on past customer interactions with Alexa
This category of Alexa’s research and development focuses on improving the results of a particular request based on past customer interactions and the state of the world.
Prasad has explained in the blog that Amazon Alexa has already shown the capability of customising decisions based on the type of device a consumer is interacting with. Two new tools, sound detection technology and whisper detection, take this further: sound detection empowers Alexa to recognize glass breaking, smoke alarms and carbon monoxide alarms, while whisper detection lets it recognize whispered speech.
On these developments, Rohit said, “Both systems use a machine learning network known as a long short-term memory.”
The long short-term memory network processes the incoming audio signal, broken into snippets, and decides whether each snippet is part of a whisper or an alarm. These networks automatically learn features that help differentiate whispers from other sound events.
Put simply, rather than relying on manually engineered features for whisper detection, these networks automatically learn the frequency characteristics of whispered speech.
Expanding Alexa’s knowledge of facts and events
The key point in expanding Amazon Alexa’s knowledge base is that instead of relying on a single knowledge source, researchers on the Alexa team combine heterogeneous knowledge sources. This enables Alexa to provide the best answers to consumer questions.
As quoted in the blog, Rohit said, “In the past 12 months, Amazon’s knowledge team has added billions of data points to Alexa’s knowledge graph, a representation of named entities and their attributes and relationships.”
Enabling more natural interaction with the Alexa voice service
According to Rohit Prasad, researchers use ‘context carryover’ to make conversations with Alexa more natural. For example, if you ask Alexa about today’s weather and then follow up with “What about tomorrow?”, Alexa uses ‘context carryover’ to understand that the second question is still about the weather.
Rohit explained that long short-term memory networks are also used in such instances.
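The weather follow-up can be sketched in a few lines of code. This is a deliberately simple, rule-based illustration of the idea, not the neural approach the blog describes: the `Turn` class, the `GetWeather` intent name and the `city` slot are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Turn:
    """One parsed request: an intent (if recognized) plus its slot values."""
    intent: str | None
    slots: dict[str, str] = field(default_factory=dict)


def carry_over(previous: Turn, current: Turn) -> Turn:
    """Fill in a follow-up turn from the previous turn's context.

    Hypothetical rule: if the new turn has its own intent, treat it as a
    fresh request; otherwise reuse the previous intent and slots, letting
    any newly supplied slot values override the old ones.
    """
    if current.intent is not None:
        return current
    return Turn(previous.intent, {**previous.slots, **current.slots})


# "What's the weather today?" followed by "What about tomorrow?"
first = Turn(intent="GetWeather", slots={"date": "today", "city": "Seattle"})
follow_up = Turn(intent=None, slots={"date": "tomorrow"})
resolved = carry_over(first, follow_up)
print(resolved)
```

The follow-up inherits the weather intent and the city from the first question, and only the date changes.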
Previously, consumers interacting with Alexa had to specify the name of the skill they wanted to invoke. With “natural skill interaction”, Rohit explained, a machine learning system automatically selects the skill that best suits a consumer’s request.
The system has two components: the first shortlists skills based on the consumer’s request, and the second uses more detailed information to choose one skill from that shortlist. ‘Natural skill interaction’ is already available for many skills in the United States.
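The shortlist-then-choose design can be illustrated with a toy example. The blog says the real system is a machine learning system. The sketch below swaps in simple word overlap for both stages so that it stays self-contained, and the skill names, keywords and example phrases are all invented.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    name: str
    keywords: frozenset[str]          # cheap signal used for shortlisting
    example_phrases: tuple[str, ...]  # richer signal used for the final choice


def tokens(text: str) -> set[str]:
    return set(text.lower().split())


def shortlist(request: str, skills: list[Skill], size: int) -> list[Skill]:
    """Stage 1: keep up to `size` skills that share a keyword with the request."""
    words = tokens(request)
    scored = [(len(words & skill.keywords), skill) for skill in skills]
    matches = [pair for pair in scored if pair[0] > 0]
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [skill for _, skill in matches[:size]]


def jaccard(a: set[str], b: set[str]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def choose(request: str, candidates: list[Skill]) -> Skill | None:
    """Stage 2: pick the candidate whose example phrases best match the request."""
    if not candidates:
        return None
    words = tokens(request)

    def best_phrase_score(skill: Skill) -> float:
        return max((jaccard(words, tokens(p)) for p in skill.example_phrases), default=0.0)

    return max(candidates, key=best_phrase_score)


# Invented skills, for illustration only.
skills = [
    Skill("RideBooker", frozenset({"ride", "car", "taxi"}),
          ("get me a ride to the airport", "book a taxi")),
    Skill("CarCare", frozenset({"car", "oil", "tire"}),
          ("when is my car due for an oil change",)),
    Skill("RecipeBox", frozenset({"recipe", "cook"}),
          ("find a recipe for pasta",)),
]
request = "get me a car to the airport"
candidates = shortlist(request, skills, size=2)
chosen = choose(request, candidates)
print([s.name for s in candidates], chosen.name)
```

The first stage is cheap and narrows thousands of skills down to a handful. The second stage can then afford to look more closely at each remaining candidate.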
Self-learning – the process where Alexa learns from experience
We all learn from experience and so does Alexa! Amazon Alexa will start using automatic equivalence class learning, which will enable her to link a failed request and a successful request made by the same user that share a word, and to learn that both refer to the same thing.
The blog explains, “If an Alexa customer in the Seattle area, for instance, requests the satellite radio station Sirius XM Chill, and that request fails, she might rephrase it as Sirius channel 53. An automated system can recognize that these requests share a word (“Sirius”), that the second request was successful, and that both names should be treated as referring to the same entity.”
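The Sirius example from the blog can be sketched as follows. This is only an illustration of the idea under simple assumptions, not Amazon's implementation: the log format, the user IDs and the `learn_equivalences` function are hypothetical, and a real system would need safeguards, such as ignoring common words like “the”.

```python
from __future__ import annotations


def shared_words(a: str, b: str) -> set[str]:
    """Words the two requests have in common, ignoring case."""
    return set(a.lower().split()) & set(b.lower().split())


def learn_equivalences(log: list[tuple[str, str, bool]]) -> dict[str, str]:
    """Map failed requests to the successful rephrasing that followed them.

    `log` holds (user, request, succeeded) entries in chronological order.
    A failed request is linked to the same user's next successful request
    only if the two share at least one word.
    """
    aliases: dict[str, str] = {}
    last_failure: dict[str, str] = {}  # user -> most recent failed request
    for user, request, succeeded in log:
        if not succeeded:
            last_failure[user] = request
            continue
        failed = last_failure.pop(user, None)
        if failed is not None and shared_words(failed, request):
            aliases[failed] = request
    return aliases


# Hypothetical interaction log built around the example from the blog.
log = [
    ("user-1", "Sirius XM Chill", False),
    ("user-2", "play the news", True),
    ("user-1", "Sirius channel 53", True),
]
aliases = learn_equivalences(log)
print(aliases)
```

Once the two names are linked, a later request for the first name can be answered the way the second one was, without the customer having to rephrase.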
Looking at the amount of time and effort researchers and developers put in to make Alexa smarter, it would not be wrong to say that ‘Alexa was not built in a day’!