Who took on the Standard Oil men and whipped their ass
Just like he promised he’d do?
Ain’t no Standard Oil men gonna run this state
Gonna be run by little folks like me and you.
Kingfish, by Randy Newman
As we enter the home stretch of the 2016 U.S. Presidential race, it’s hard to ignore Google’s presence in the government. From the MIC Coalition to the Mayday PAC, and from Google’s massive pack of lobbyists prowling the halls of Congress and the White House to its entry into the military-industrial complex, it’s clear that Google is one multinational corporation that does not intend to go the way of Standard Oil.
With so much at stake, what if Google yielded to the moral hazard that consumes its monopoly search engine and for which it is being sued by the European Commission among others? Why not just fix the damn elections using the same data they use to “know what you want before you do”? Sheesh, all this lobbying is such a hassle!
And if Google did fix the elections how would they do it and how would we ever know they did it?
Oh Canada! What Would We Know and How Would We Know It?
This is exactly the issue that surfaced in the recent Canadian national elections. It was discussed in a thoughtful two-part article by Michael Den Tandt in the National Post that highlighted the work of Dr. Robert Epstein and Dr. Ronald E. Robertson, entitled “Democracy at Risk: Manipulating Search Rankings Can Shift Voting Preferences Substantially Without Voter Awareness”. Epstein and Robertson started from this premise:
Internet search rankings have a significant impact on consumer choices, mainly because most users click only on highly ranked results (Agichtein, Brill, Dumais, & Ragno, 2006; Granka, Joachim, & Gay, 2004; Guan & Cutrell, 2007; Joachims et al., 2007; Pan et al. 2007). This is why North American companies now spend more than 20 billion dollars annually to place results at the top of rankings (Econsultancy, 2012; Learmonth, 2010). We conducted an experiment to determine whether the deliberate manipulation of search rankings could also influence the preferences of undecided voters.
Dr. Epstein has also discussed their research in an interview on the PBS NewsHour.
The upshot is that in a close election, search engine manipulation has a pronounced effect on outcomes, according to Epstein and Robertson:
We conclude (1) that the outcomes of real elections—especially tight races—could conceivably be determined by the strategic manipulation of search engine rankings and (2) that the manipulation could be accomplished without people being aware of it.
The recent Canadian election was consistent with the kind of election that Epstein and Robertson had in mind. In fact, Google created a product to collect data about Canadian voters.
Whenever Google tells you they are collecting data, what they are very likely doing in the background is building data profiles of users and segmenting those users by behavior. (Google got into big trouble recently when they were caught profiling student emails through Google Apps for Education.) If you also have a Google account and are logged in (such as a Gmail account, Google Apps for Education or Google Apps for Work), then they can tie your political interests to YOU.
As Google executive Leslie Church also wrote in the National Post (partly in response to Den Tandt’s reporting on the Epstein and Robertson study):
For a company like Google, whose mission is to organize the world’s information and make it universally accessible and useful, civic information has always been a big part of that mission. It’s why we’ve worked on elections projects in over 40 countries since 2007, created a “Know Your Candidates” platform in India to learn about more than 8,000 candidates in the world’s largest democratic election, served up voter information more than 120 million times in Egypt and helped millions of U.S. voters since 2008 access voting information through our Voting Information Project and Civic Information API. It’s why we have made several of Canada’s federal leaders’ debates available on YouTube in real time — a Canadian election “first” — and why we’re working with partners like Elections Canada to make voting information easily accessible on the web and on the smartphone in your pocket.
So why would a company that outspends nearly every multinational on lobbying and is driven to control every government they do business with be so interested in collecting data on voter sentiment around the world? According to Ms. Church’s op-ed:
Our goal in being involved with Canada’s election, as Googlers and Canadian voters ourselves, is to ensure that our users — and today’s voters — get the information that they are looking for, so that they can have their questions answered and make an informed decision when they cast their ballot.
Because puppy dog tails and unicorn scoopers, of course. Because on this thing Google behaves differently than it does on all the other things.
Just Because You’re Paranoid…
Unfair, you say? Too paranoid? Indulge me for a moment. Paranoid would be thinking that Google will collect election data and turn it over to the National Security Agency, and I’m certainly not thinking that.
But Google has a history of creating real products that allow it to collect data in the background for undisclosed purposes, which I call “nondisplay uses.”
Case in point: Remember “GOOG-411”? This Google product was the “free” Google directory assistance (very similar to Google Voice). Former Googler (and soon to be former Yahoo!er) Marissa Mayer told InfoWorld nearly eight years ago that GOOG-411 was not intended to be what it appeared to be:
You may have heard about our [directory assistance] 1-800-GOOG-411 service. Whether or not free 411 is a profitable business unto itself is yet to be seen. I myself am somewhat skeptical. The reason we really did it is because we need to build a great speech-to-text model … that we can use for all kinds of different things, including video search.
The speech recognition experts that we have say: If you want us to build a really robust speech model, we need a lot of phonemes, which is a syllable as spoken by a particular voice with a particular intonation. So we need a lot of people talking, saying things so that we can ultimately train off of that. … So 1-800-GOOG-411 is about that: Getting a bunch of different speech samples so that when you call up or we’re trying to get the voice out of video [such as from YouTube], we can do it with high accuracy.
That’s right: Google told you the product was doing one thing, but in fact it was always intended to be something entirely different. The real action was in the background where users couldn’t see it. If Marissa Mayer hadn’t let it slip in an interview, you might never have known.
And then there’s Google Books. I bet you thought Google Books was the “digital Library of Alexandria.” Did anyone ever tell you that Google uses Google Books for “corpus machine translation”–training its translation tools using a huge library of translated works (that someone else paid to have painstakingly translated)? Here’s a helpful YouTube video about that technique:
And of course there is Gmail, which uses Google’s Content OneBox and PHIL technology, according to Jeff Gould’s excellent and startling story based on disclosures Google made in open court during the Gmail class action (but which were redacted from transcripts). (“PHIL” stands for Probabilistic Hierarchical Inferential Learner):
How does Google’s online profiling work? At its core are the same patented PHIL clustering and concept extraction methods described above. A user (or group of users) can be described by various kinds of clusters. The simplest kind are clusters of terms used in documents created or viewed by the user. Another kind derives from the URLs of documents the user has viewed or perhaps forwarded to others by email or social media. A third kind — the most comprehensive — consists of the concept or category clusters extracted by the PHIL algorithm from documents the user has viewed (web pages, inbound emails) or created (outbound emails, social media posts).
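To make the “simplest kind” of cluster in that passage concrete, here is a minimal Python sketch of a term profile built from documents a user created or viewed. To be clear, this is not PHIL and not anything Google has disclosed; it is plain word counting, and the function name, the stop-word list and the sample documents are all made up for illustration.

```python
from collections import Counter
import re

# Hypothetical stop-word list, kept tiny for illustration only.
STOP_WORDS = {"the", "a", "an", "and", "or", "to", "of", "in",
              "on", "for", "is", "i", "my", "we"}

def term_profile(documents, top_n=3):
    """Return the top_n most frequent non-stop-word terms across documents."""
    counts = Counter()
    for text in documents:
        # Lowercase the text and split it into simple word tokens.
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return [term for term, _ in counts.most_common(top_n)]

# Made-up sample documents, not real user data.
docs = [
    "Where is my polling station for the election",
    "Election candidates in my riding",
    "Candidates debate on pipeline policy before the election",
]

print(term_profile(docs, top_n=2))
```

Even this toy version surfaces “election” and “candidates” as the user’s dominant terms from three short queries, which is the point: it does not take sophisticated technology to tie an interest to a logged-in account.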
So let me ask you again–why is Google so interested in collecting data on voter sentiment around the world?
Ms. Church tells us:
For a company like Google, whose mission is to organize the world’s information and make it universally accessible and useful, civic information has always been a big part of that mission….But to be clear, Google has never re-ranked search results on any topic (including elections) to manipulate user sentiment…
Nobody said Google was “re-ranking”. They just don’t show the “unuseful” bits based on their data profiling.
Dr. Epstein and Dr. Robertson were making a different point:
[W]ith little or no competition among search engines in today’s marketplace, search engine manipulations could be slanted almost entirely toward one candidate or party in election after election with no one the wiser. Dominance by one company in the search engine business, combined with the invisibility of the manipulations, could, over time, subvert the mechanisms that maintain open and free elections.
So did Google steal the Canadian elections? I doubt it. Did they spot trends early? With all their data capability, you have to believe they did.
Did Google have a stake in the outcome of the Canadian elections? Were they able to exploit that data to their advantage? Did they exploit their data to benefit a particular candidate? And what benefit did Google get from their “Civic Information API”? Whatever could it be?
One correction. Ms. Church is no longer at Google. She’s gotten herself a government job working as the chief of staff to the new Prime Minister’s Heritage Minister, a position that has great influence on copyright policy in Canada (and remember that the former Prime Minister’s party just extended the sound recording copyright term to 70 years in Canada). And that’s because she has great experience and deep qualifications in intellectual property matters…no wait. She has none.
Nothing to see here, move along.