Quote:
Originally Posted by Quoth
Google Go, Alexa, Cortana, Siri and Facebook Portal are NOT AI or smart. The voice to text is deliberately done on the companies' servers to acquire information about the users (the product). It could be done locally twenty years ago. Phones, tablets and TVs, not just so called "smart speakers" like Echo have been supplying user information to Google/Alphabet, Amazon, Microsoft, Apple, Facebook/Meta for years and this is being used in house illegally and sold to third parties. Poorly paid humans are also used to listen as actual speech recognition (really pattern matching and then a search back end for the text) isn't much better than 25 years ago. It just needs less "training" on site.
No, it really couldn't. A good modern home computer with a beefy GPU can barely do it now with Mozilla DeepSpeech or one of the other options floating around. Correctly detecting arbitrary speech from 10 feet away with a $3 mic setup while your kid blasts Mario music in the background is a very different problem from recognizing speech through a mic jammed up in your face, using software that you spent 2 hours training on your voice and that doesn't have to process any of the context of those words either.
There are a few open source smart AI systems out there. Mycroft, which is probably the biggest of them, passes your voice to a server for processing by default. There are others that don't need a server, like Rhasspy, but you get a limited set of canned phrases that you have to specifically train.