Is Google Using Voice Recognition for AdWords?

MTP readers will recall that I was invited to participate in an episode of the Music Biz Podcast about the songwriter class actions against Spotify.  At one point during the course of the podcast, I said that I would be surprised if Spotify got an IPO this year, “even with Goldman Sachs”.

That’s the only time that Goldman Sachs was mentioned in the video, and the podcast otherwise had nothing to do with Goldman Sachs.

I happened to notice that YouTube served an ad for Goldman Sachs as the pre-roll for the podcast.

[Image: Goldman Sachs ad served as the YouTube pre-roll]

So ask yourself this–have you EVER seen an ad for Goldman Sachs on YouTube?  Or anywhere online for that matter?

I have not.  So I got to thinking: why would they serve an ad for Goldman Sachs on that particular video?

Remember “GOOG-411”?  This Google product was the “free” Google directory assistance (very similar to Google Voice). Former Googler (and perhaps soon to be former Yahoo!er) Marissa Mayer told InfoWorld nearly 8 years ago that GOOG-411 was not intended to be what it appeared to be:

You may have heard about our [directory assistance] 1-800-GOOG-411 service. Whether or not free 411 is a profitable business unto itself is yet to be seen. I myself am somewhat skeptical. The reason we really did it is because we need to build a great speech-to-text model … that we can use for all kinds of different things, including video search.

The speech recognition experts that we have say: If you want us to build a really robust speech model, we need a lot of phonemes, which is a syllable as spoken by a particular voice with a particular intonation. So we need a lot of people talking, saying things so that we can ultimately train off of that. … So 1-800-GOOG-411 is about that: Getting a bunch of different speech samples so that when you call up or we’re trying to get the voice out of video [such as from YouTube], we can do it with high accuracy.

That’s right–Google told you the product was doing one thing, but in actual fact it was always intended to be something entirely different.  The real action was in the background where users couldn’t see it.  If Marissa Mayer hadn’t let it slip in an interview, you might never have known.

So what else could Google do with this voice recognition technology you helped them create if you ever used GOOG-411 (or Google Voice or God knows what else, like maybe an Android phone)?

They could use it to convert the soundtrack of YouTube videos into text and sell those words as AdWords keywords to serve ads.  Like if you mentioned Goldman Sachs, serve a Goldman Sachs ad as the pre-roll.  Because you know, it’s like useful.
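
To make the idea concrete, here is a rough sketch of what matching a transcript against advertiser keywords could look like. To be clear, this is my guess at the general shape of such a thing, not anything Google has confirmed: the keyword table, the ad names and the `match_ads` function are all made up for illustration, and the transcript is assumed to have already come out of some speech-to-text step.

```python
import re

# Hypothetical advertiser keyword table (an assumption, not real AdWords data).
AD_KEYWORDS = {
    "goldman sachs": "Goldman Sachs pre-roll",
    "morgan stanley": "Morgan Stanley pre-roll",
}


def match_ads(transcript, keywords=AD_KEYWORDS):
    """Return the ads whose keyword phrase appears in the transcript."""
    # Lowercase the text and keep only runs of letters and digits,
    # so punctuation and spacing don't get in the way of matching.
    text = " ".join(re.findall(r"[a-z0-9]+", transcript.lower()))
    # Pad with spaces so a phrase only matches on whole-word boundaries.
    padded = f" {text} "
    return [ad for phrase, ad in keywords.items() if f" {phrase} " in padded]


# Assumed output of a speech-to-text step run on the video's soundtrack.
transcript = "I would be surprised if Spotify got an IPO this year, even with Goldman Sachs."
print(match_ads(transcript))
```

One passing mention in the soundtrack is all it takes for a match in this sketch, which is exactly the pattern I noticed with the podcast.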

Of course this time the soundtrack was three guys talking, but what is in the soundtrack of a YouTube video much more frequently?