This weekend I wanted to look some things up on Wikipedia, but the weather was so nice that I didn’t want to be cooped up inside. My trusty Acer Aspire One has developed a problem with its solid state drive following (but possibly not because of) installing Ubuntu Netbook Remix, so I was left with my Ogg player as my only option. I could have used the web-based service Pediaphon to convert the articles, but I wanted a command line option that I could run on my own computer. The solution is a good example of the Unix philosophy of chaining several special-purpose commands together, each feeding its output to the next, to do a job.
Here’s the code:
echo "What do you want me to look up for you on Wikipedia?"; \
read line; \
wget -q http://en.wikipedia.org/wiki/"${line}" -O - | \
html2text -nobs -ascii -o - | \
espeak --stdout | \
ffmpeg -i - "${line}.ogg"
Let’s break it down. First we ask the user to enter a search term. Then we have wget fetch the URL from Wikipedia. We are taking advantage of the fact that Wikipedia has very predictable URLs. The "-q" turns off verbose output, while the "-O -" tells wget to send the page to standard output. This gives us a raw HTML file, which we pipe into html2text to remove the markup.
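To illustrate what "predictable" means here, the URL building can be sketched as a small helper. This is only an illustration: the function name wiki_url is made up for this example, and it assumes the only adjustment needed is swapping spaces for underscores, which is how Wikipedia writes multi-word article titles. Other special characters are left untouched.

```shell
# wiki_url is a hypothetical helper, not part of the original one-liner.
# It builds the article URL for a search term, replacing spaces with
# underscores (assumption: no other characters need escaping).
wiki_url() {
    printf 'http://en.wikipedia.org/wiki/%s\n' "$(printf '%s' "$1" | tr ' ' '_')"
}

wiki_url "Unix philosophy"
```

The result could then be handed to wget in place of the hand-built URL, for example wget -q "$(wiki_url "$line")" -O -.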
By default html2text will try to output additional formatting that is understood by pager programs like less and more. In our case we do not want that, so we use the switches "-nobs -ascii" to suppress the hidden backspace formatting and to use plain ASCII as a simpler encoding. The next option, "-o -", passes the cleaned text to standard output, which we then pipe to espeak.
eSpeak is a compact open source software speech synthesizer that is in the default Ubuntu repositories. I found it a lot easier to set up than Festival, but more importantly it handles redirection much more easily. The only option I need is "--stdout", which tells it to write the spoken text as WAV data to standard output instead of playing it, so that we can pipe it to our old friend ffmpeg.
The "-i -" tells ffmpeg to read the WAV data stream from standard input. The "${line}.ogg" tells it to save the spoken text to an Ogg file with the same name as the search term the user entered. While the audio takes a bit of getting used to, it is a good way to actually learn something useful.
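Since the one-liner depends on four separate programs, it can be handy to check that they are all installed before running it. Here is a minimal sketch of such a check. The helper name missing_tools is invented for this example, and it relies only on the standard shell built-in "command -v".

```shell
# missing_tools is a hypothetical helper: it prints the name of every
# command in its argument list that cannot be found on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# The four programs used in the pipeline above.
missing=$(missing_tools wget html2text espeak ffmpeg)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Please install:" $missing
fi
```

If everything is present the check prints nothing and the pipeline can be run as before.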