Here's what we really know about Google's mysterious search engine
President Donald Trump thinks Google's search engine is "rigged." By featuring more mainstream news outlets and relatively fewer conservative sites in the results he sees, Trump tweeted Tuesday, Google is "suppressing" right-wing views on its platform. Trump escalated his attacks Tuesday afternoon in remarks from the Oval Office, warning that "Google and Twitter and Facebook, they are treading on very, very troubled territory and they have to be careful."
It's easy to see how Trump arrived at this conclusion, because in many ways his experience mirrors that of millions of Americans who've awoken to the dominance of Google – and Facebook, and Twitter – in their everyday lives without being quite certain how it wound up there.
We rely constantly on Google to find out what to buy, which restaurants to eat at and how to get from one place to another. But, partly by design, how Google does its job can still seem deeply mysterious, giving rise to theories about the way it supposedly operates. Is it possible for Google to manipulate your results? Would it?
Google denies that it does. "Search is not used to set a political agenda and we don't bias our results toward any political ideology," the company said Tuesday. "We never rank search results to manipulate political sentiment."
Google's claim may be small comfort to those convinced that their own results are being skewed. But the dust-up between Trump and Google is an important opportunity to shed light on what we do know about Google and its search algorithm.
What happens when you run a Google search?
At a high level, Google's search engine is based on a long list of websites from which Google has already scraped information, using automated software it calls a "crawler." The crawler gathers keywords and other data about sites on the Internet, and at this point billions of Web pages have been analyzed this way.
When users type in a search query, Google takes their request and goes looking in its records for any matches. Then it faces another problem: How to organize all the results.
This is where the more subjective parts of Google's search engine come in. Over 100 factors – from where the user is located to how recently a given webpage was updated – contribute to how highly a certain result may appear. In addition, the company's famous PageRank algorithm, developed by co-founders Larry Page and Sergey Brin, plays a role in determining the authoritativeness of a given source.
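To make the crawl, match and rank steps concrete, here is a minimal sketch in Python. It is an illustration only, not Google's production system: the four pages, their text and their links are invented, the damping factor of 0.85 and the 50 iterations are conventional textbook choices, and the function names are hypothetical. It builds a keyword index from "crawled" pages, finds the pages that match a query, and orders them by a simplified textbook form of PageRank, ignoring the 100-plus other factors mentioned above.

```python
from collections import defaultdict

# Hypothetical toy "web": page -> (page text, outgoing links). Invented data.
PAGES = {
    "a.example": ("local pizza restaurant reviews", ["b.example", "c.example"]),
    "b.example": ("pizza recipes and restaurant news", ["c.example"]),
    "c.example": ("restaurant guide", ["a.example"]),
    "d.example": ("pizza history", ["c.example"]),
}


def build_index(pages):
    """Map each keyword to the set of pages whose text contains it."""
    index = defaultdict(set)
    for url, (text, _links) in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def pagerank(pages, damping=0.85, iterations=50):
    """Textbook PageRank by repeated redistribution of scores along links."""
    n = len(pages)
    rank = {url: 1.0 / n for url in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {url: (1.0 - damping) / n for url in pages}
        for url, (_text, links) in pages.items():
            # A page with no known outgoing links shares its score with all pages.
            targets = [t for t in links if t in pages] or list(pages)
            share = damping * rank[url] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


def search(query, pages, index, rank):
    """Return pages containing every query word, highest-ranked first."""
    words = query.lower().split()
    if not words:
        return []
    matches = set(pages)
    for word in words:
        matches &= index.get(word, set())
    return sorted(matches, key=lambda url: rank[url], reverse=True)


index = build_index(PAGES)
rank = pagerank(PAGES)
results = search("pizza restaurant", PAGES, index, rank)
```

In this toy example, a page scores highly when other well-linked pages point to it, which is the sense in which links stand in for "authoritativeness." Which sources deserve that authority is exactly the kind of judgment the sketch leaves out.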
Google executives are hesitant to discuss the specifics of their software, for fear of encouraging those who may seek to game the algorithm. And a core aspect of Trump's critique is that Google is mistaken in the way that it assigns authority in the first place.
But Google tests its own search results with human raters to ensure that the search engine does what it is meant to do: provide relevant and authoritative results, said Pandu Nayak, the head of Google's search ranking team.
"We've developed a detailed series of guidelines about what it means to be authoritative," said Nayak in an interview. "It's a 160-page document, it's been publicly available on the Web for several years now, and it's our representation about what it means to give relevant and authoritative results. Raters must study it and pass a test" before they can participate in the evaluation process.
Google News' secret sauce
Some of Trump's criticism appears to revolve around which stories appear in Google News.
Recent changes to Google News have turned it into a much more personalized product. The company now applies artificial intelligence and accounts more for your expressed preferences.
This approach has given rise to questions about what determines News results and just how much Google's engineers truly understand about the decisions their AI is making. A consistent theme in current machine learning research is that the algorithms are typically black boxes – often, the only way to determine why an algorithm made a decision is to try to reverse-engineer the logic from the results.
Influencing the sauce?
Still, it appears, companies and individuals can influence Google search results.
Reverse-engineering Google has become something of a cottage industry, particularly in media. Publishers are constantly trying to find ways to compete for visibility on Google News and on Google Search. For example, Google's tendency to favor recency, or "freshness," incentivizes companies such as The Washington Post to create their webpages with metadata keywords that the search engine can easily read.
But not even the best experts can know for sure whether their techniques are working.
"The file that has the smart meta-title on it is the file that people land on for the latest updates," said Megan Chan, director of digital operations at The Post. "A lot of it is trial and error."
While The Post has a relationship with Google on special initiatives such as The Trust Project, an effort to promote integrity in journalism, Google does not use those relationships to provide insights into how its algorithms work, Chan said.
Google is certainly capable of tailoring search results, but often does it less than you think
That media outlets tinker with ways to boost their performance on Google is a byproduct of Google's success and dominance, not evidence of any favoritism by Google, experts say.
"If you're a publisher, it's impossible to simply remove yourself from any interaction with Google," said Chris Pedigo, senior vice president for government affairs at Digital Content Next, a trade association representing online publishers.
Still, by setting up the system this way, Google clearly has some degree of control over how information is presented to the user.
The question for many is how aggressively Google intervenes in this process.
Google's algorithm, particularly for search, is a master algorithm that is applied in real time against each search query as it comes in, according to the company. Although the algorithm itself frequently changes as Google makes tweaks, it is applied identically to each search.
If the results differ from person to person, that could be because one of them is using a browser in incognito mode, which deletes cookies and other tracking data when the session ends. Or they may be searching from different locations, triggering Google's reflex to return local results. Or one may simply be performing a search slightly later than the other, said Christo Wilson, a computer science professor at Northeastern University who has studied Google's search practices for six years.
Wilson's research involves comparing Google searches under different conditions – having one group of testers search Google in incognito mode, for example, while another group uses Google in normal mode. In other studies, Wilson assigned one group to log onto Fox News and another group to browse CNN before performing an identical Google search. What he has found may seem surprising.
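One simple way to picture such a comparison is to measure how much two lists of results overlap. The Python sketch below is an assumption about how that could be done, not a description of Wilson's actual method: the function names and the result lists are invented for illustration. It computes the share of links the two lists have in common (the Jaccard index) and the share of positions where both lists show the same link.

```python
def jaccard_overlap(results_a, results_b):
    """Share of distinct links that appear in both result lists (0.0 to 1.0)."""
    set_a, set_b = set(results_a), set(results_b)
    if not set_a and not set_b:
        return 1.0  # Two empty lists are treated as identical.
    return len(set_a & set_b) / len(set_a | set_b)


def same_position_share(results_a, results_b):
    """Share of positions where both lists show the same link."""
    length = max(len(results_a), len(results_b))
    if length == 0:
        return 1.0
    same = sum(1 for a, b in zip(results_a, results_b) if a == b)
    return same / length


# Hypothetical results for one query under two conditions (invented data).
normal_mode = ["u1", "u2", "u3", "u4"]
incognito_mode = ["u1", "u3", "u2", "u5"]

overlap = jaccard_overlap(normal_mode, incognito_mode)
position_match = same_position_share(normal_mode, incognito_mode)
```

Scores near 1.0 on both measures would mean the two groups saw nearly the same results in nearly the same order.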
"We have never seen big differences," said Wilson. "In fact – what we typically have seen in the past is that your search history – things you've looked for in the past – they only matter for about 10 minutes and even that's not true anymore for most queries."
This may be a function of Google's "bias" toward freshness, Wilson explained. And this is likely what Trump experienced as well.
"The results he's getting are going to be the same results that everyone is getting for those [same] queries," said Wilson, "at least for the U.S."
Overseas, it's a different story
Google's activities in other countries, however, underscore the company's ability to censor results – literally. In China, the search giant has reportedly sought to build a version of its engine that complies with the government's policy of blocking certain sensitive results, such as those related to the Tiananmen Square protests of 1989.
And it wasn't long ago that Eric Schmidt, then the chairman of Google's parent company, Alphabet, publicly considered the possibility of demoting content that's considered hateful or extreme.
"We're very good at detecting what's the most relevant and what's the least relevant," he told Fox Business in 2017. "It should be possible for computers to detect malicious, misleading and incorrect information and essentially have you not see it. We're not arguing for censorship, we're arguing just take it off the page, put it somewhere else ... make it harder to find." – © 2018. Washington Post