Motivated by this post, I played with a similar setup with our beloved Cubieboard. Although I only tested the speech recognition, the results are pretty encouraging so far.
Since I don’t have a USB microphone (yet), I used a very cheap webcam (about $10 or so). Details about the webcam I used are available here. No drivers are required. I should note that the Cubieboard does have a line-in input that can probably be used with additional electronics.
To reproduce my experiment, the first step is to get the Cubian text-mode SD-card image, available here: http://cubian.org/downloads/. I used the latest version, which at the time of writing is Cubian-base-r7-arm-a10.img.7z.
Directions on how to install Cubian on your SD card are available here: https://github.com/cubieplayer/cubian/wiki/Install-Cubian. Instead of Image Writer, I used Win32 Disk Imager, which I highly recommend.
Once you have Cubian up and running, log in via SSH. The default username is cubie and the password is cubie. To avoid frustration, note that the SSH port is not the default 22 but 36000.
If your webcam is connected and recognized by the Cubieboard, by issuing the following command:
arecord -L [enter]
you should get the following output:
null
    Discard all samples (playback) or generate zero samples (capture)
default:CARD=sunxicodec
    sunxi-CODEC, sunxi PCM
    Default Audio Device
sysdefault:CARD=sunxicodec
    sunxi-CODEC, sunxi PCM
    Default Audio Device
default:CARD=Camera
    USB 2.0 Camera, USB Audio
    Default Audio Device
sysdefault:CARD=Camera
    USB 2.0 Camera, USB Audio
    Default Audio Device
front:CARD=Camera,DEV=0
    USB 2.0 Camera, USB Audio
    Front speakers
surround40:CARD=Camera,DEV=0
    USB 2.0 Camera, USB Audio
    4.0 Surround output to Front and Rear speakers
surround41:CARD=Camera,DEV=0
    USB 2.0 Camera, USB Audio
    4.1 Surround output to Front, Rear and Subwoofer speakers
surround50:CARD=Camera,DEV=0
    USB 2.0 Camera, USB Audio
    5.0 Surround output to Front, Center and Rear speakers
surround51:CARD=Camera,DEV=0
    USB 2.0 Camera, USB Audio
    5.1 Surround output to Front, Center, Rear and Subwoofer speakers
surround71:CARD=Camera,DEV=0
    USB 2.0 Camera, USB Audio
    7.1 Surround output to Front, Center, Side, Rear and Woofer speakers
iec958:CARD=Camera,DEV=0
    USB 2.0 Camera, USB Audio
    IEC958 (S/PDIF) Digital Audio Output
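The recording script further down addresses the webcam as plughw:1,0, that is, card 1, device 0. If the webcam gets a different card number on your board, that string has to change. Here is a minimal sketch for looking the number up. It assumes the usual `arecord -l` line format ("card N: Name [...], device M: ..."), and the function name find_capture_card is made up for this example:

```shell
# Print the ALSA card index of the first capture card whose `arecord -l`
# line contains the given name. Reads `arecord -l` style text on stdin and
# prints nothing when no card matches. (Hypothetical helper, not part of ALSA.)
find_capture_card() {
    name="$1"
    # Assumed line format:
    #   card 1: Camera [USB 2.0 Camera], device 0: USB Audio [USB Audio]
    awk -v n="$name" '
        /^card [0-9]+:/ && index($0, n) {
            sub(/:.*/, "", $2)   # turn "1:" into "1"
            print $2
            exit
        }'
}

# Usage on the board (left commented out, since it needs the hardware):
#   card=$(arecord -l | find_capture_card "Camera")
#   echo "plughw:${card},0"
```

The result can then be passed to arecord with -D in place of the hard-coded "plughw:1,0".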
Just like in the Raspberry Pi tutorial, I used the Google voice recognition functions. First, install ffmpeg using:
sudo apt-get install ffmpeg [enter]
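Besides ffmpeg, the script below also calls arecord, wget and cut. Before running it, it may be worth checking that all of them are on the PATH. This is just a small sketch, and the helper name missing_tools is my own:

```shell
# Print the name of every listed command that cannot be found on the PATH.
# (Hypothetical helper for this example.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Check the commands the speech-to-text script depends on.
missing=$(missing_tools arecord ffmpeg wget cut)
if [ -n "$missing" ]; then
    echo "Missing tools:" $missing
else
    echo "All required tools found."
fi
```

Anything reported as missing can be installed with apt-get, the same way as ffmpeg above.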
The same script works:
#!/bin/bash
echo "Recording... Press Ctrl+C to Stop."
arecord -D "plughw:1,0" -q -f cd -t wav | ffmpeg -loglevel panic -y -i - -ar 16000 -acodec flac file.flac > /dev/null 2>&1
echo "Processing..."
wget -q -U "Mozilla/5.0" --post-file file.flac --header "Content-Type: audio/x-flac; rate=16000" -O - "http://www.google.com/speech-api/v1/recognize?lang=en-us&client=chromium" | cut -d\" -$
echo -n "You Said: "
cat stt.txt
rm file.flac > /dev/null 2>&1
Save the above script to something like stt.sh and make it executable with:
chmod +x stt.sh [enter]
Start the script with:
./stt.sh [enter]
and say something into the microphone. If everything works as it should, the script prints "You Said: " followed by the recognized text.
To use a different language, replace the value of the lang parameter in the request URL (en-us in the script above) with the code of the language you want.
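To avoid editing the URL by hand each time, the language could be turned into a parameter. The following is only a sketch: the function name speech_url is made up, and de-de is just an assumed example of a language code in the same style as en-us, not one I have tested:

```shell
# Build the recognition URL used by the script for a given language code.
# Falls back to en-us when no argument is given. (Hypothetical helper.)
speech_url() {
    lang="${1:-en-us}"
    echo "http://www.google.com/speech-api/v1/recognize?lang=${lang}&client=chromium"
}

speech_url          # default: the same URL as in the script above
speech_url de-de    # assumed example of another language code
```

In the script, the quoted URL passed to wget would then become "$(speech_url de-de)".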
It actually works amazingly well.
This can probably be used with small changes for home automation. Next, add some brains and voice, to get something similar to Siri. Stay tuned.
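As a rough idea of what the home automation part might look like, the recognized text could be matched against a few keywords. This is only a sketch under my own assumptions: the phrases, the action names and the function match_command are all hypothetical, and the matching is naive ("on" also matches inside other words):

```shell
# Map a recognized phrase to an action name.
# Phrases and action names are hypothetical placeholders.
match_command() {
    # Lowercase the text so that matching is case-insensitive.
    text=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case "$text" in
        *light*off*|*off*light*) echo "light_off" ;;
        *light*on*|*on*light*)   echo "light_on" ;;
        *)                       echo "unknown" ;;
    esac
}

match_command "Turn the light on"
match_command "What time is it"
```

With the script above, the input would come from the recognized text, for example match_command "$(cat stt.txt)".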