(In every sense of “fix”)
The recent mass censorship of the New York Post by Twitter and Facebook has led to calls for anti-monopoly action. Break them up and regulate them, people say.
That sounds like a terrific pain in the ass, and its success is dubious. It would be a fine thing to bring some pain to Big Social, but whether it will change their behavior is doubtful. They can outspend, outlawyer, outflank, or outright capture the regulators. There may be no feasible, believable solution.
But there is a way to fix Google. It will cause them much pain and humiliation, which they richly deserve. And it will reliably protect the public interest.
Open-source the search engine.
Some Google bigwig had to answer to Congress: does Google manipulate search results? No, the bigwig answered, we don’t manually alter search results. Congress thought that answered the question. They’re pretty dumb.
But then it came out that Google’s bias could be explained by its search algorithm. Some internet wise guy said we should all dump Google and use a different search engine, one without an algorithm. I don’t think it was a member of Congress who said that, but it could have been.
It happens that even smart people like doctors and lawyers often have only the vaguest idea of how computers work. “Where are your documents?” “Oh, I keep them in Word.” The average Congresscritter will not have creative, practical, technical solutions in mind. They are likely to recommend a regulator with the power to impose legal sanctions against Google’s public conduct after the fact. I am confident Google can defeat any regulator which works that way.
But Google can’t defeat a review of their database and source code.
Here’s roughly how a search engine works. It takes a list of keywords and then queries a vast database linking keywords to the documents in which they appear. A sophisticated algorithm then decides which documents are most relevant, using criteria such as word frequency, keywords in the same clause, inbound links to the document from other documents with the same keywords, and so on. One key innovation Google developed was a search engine that is almost impossible to cheat. You can’t fake relevance.
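For the technically inclined, here is a toy sketch of that idea in Python. It is emphatically not Google's code: the three-document corpus, the inbound-link counts, the 0.5 link weight, and the function names `build_index` and `search` are all assumptions made up for illustration. It only shows the shape of the thing: a keyword-to-document index, plus a score built from word frequency and inbound links.

```python
from collections import defaultdict

# Hypothetical toy corpus: document id -> text.
DOCS = {
    "a": "cheap flights to paris and cheap hotels",
    "b": "paris travel guide with flights and museums",
    "c": "gardening tips for spring",
}

# Hypothetical inbound-link counts for each document.
INBOUND_LINKS = {"a": 1, "b": 5, "c": 2}


def build_index(docs):
    """Map each keyword to {doc_id: number of times it appears there}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word][doc_id] = index[word].get(doc_id, 0) + 1
    return index


def search(index, query, links, link_weight=0.5):
    """Return matching doc ids, best first.

    Score = keyword frequency + link_weight * inbound links.
    The weight is an arbitrary illustrative choice.
    """
    scores = defaultdict(float)
    for word in query.lower().split():
        for doc_id, count in index.get(word, {}).items():
            scores[doc_id] += count
    for doc_id in scores:
        scores[doc_id] += link_weight * links.get(doc_id, 0)
    return sorted(scores, key=lambda d: scores[d], reverse=True)


index = build_index(DOCS)
print(search(index, "paris flights", INBOUND_LINKS))
```

Both "a" and "b" mention each keyword once, so the better-linked "b" ranks first. The point is that every factor in the ranking is a number sitting in the code or the data, where a reviewer can read it.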
To make this work efficiently, each page needs to be assigned a base relevancy rating for each keyword. If you post a document as a trap for the unwary, with a single keyword pasted in 7,000 times, Google’s algorithm will know it for the garbage it is, and give it negative relevancy. Documents which show indications of relevance get a positive relevancy rating.
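A minimal sketch of a base relevancy rating, again in Python and again purely illustrative. The function name `base_relevancy` and the 30% stuffing threshold are my own assumptions; real engines use far subtler signals. The idea is only that a page whose text is mostly one repeated keyword gets a negative rating instead of a high one.

```python
def base_relevancy(text, keyword, stuffing_threshold=0.3):
    """Return a per-keyword rating for one page.

    Rating is the keyword's share of all words on the page, unless that
    share exceeds stuffing_threshold (a hypothetical cutoff), in which
    case the page is treated as spam and the rating goes negative.
    """
    words = text.lower().split()
    if not words:
        return 0.0
    density = words.count(keyword.lower()) / len(words)
    if density > stuffing_threshold:
        return -density  # keyword-stuffed garbage: negative relevancy
    return density


honest_page = "a short guide to paris museums"
trap_page = "paris " * 7000  # one keyword pasted in 7,000 times

print(base_relevancy(honest_page, "paris") > 0)  # honest page rates positive
print(base_relevancy(trap_page, "paris") < 0)    # trap page rates negative
```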
There are a million ways to game this system. Suppose Google decided that the New York Times’s narratives were most important to the public interest. They could simply up-rate all documents from nytimes.com automatically, making them the first result for all searches.
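To make the point concrete, here is a hypothetical sketch of how small such a thumb on the scale could be. Nothing here is taken from Google; the `DOMAIN_BOOST` table, the boost value, and the function names are invented for illustration. One lookup table is enough to put a favored site on top regardless of how relevant its page actually is.

```python
from urllib.parse import urlparse

# Hypothetical per-site boost table. A single entry reorders every search.
DOMAIN_BOOST = {"nytimes.com": 100.0}


def final_score(url, organic_score, boosts):
    """Organic relevance plus whatever boost the site's domain carries."""
    host = urlparse(url).hostname or ""
    if host.startswith("www."):
        host = host[4:]
    return organic_score + boosts.get(host, 0.0)


def rank(results, boosts):
    """Sort (url, organic_score) pairs by final score, best first."""
    return sorted(results, key=lambda r: final_score(r[0], r[1], boosts),
                  reverse=True)


results = [
    ("https://example.org/story", 9.0),      # highly relevant page
    ("https://www.nytimes.com/story", 1.0),  # barely relevant page
]

print(rank(results, DOMAIN_BOOST)[0][0])  # boosted site wins
print(rank(results, {})[0][0])            # without the table, relevance wins
```

Note that a table like this is trivial to find once someone is allowed to read the code and data, and nearly impossible to prove from the outside.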
There is some evidence that Google ranks keyword combinations, pairing the keywords the user supplied with additional ones supplied by Google itself. So if you Google “Donald Trump”, you get results skewed towards “Donald Trump scandal russia rapist racist supremacist prostitute cheats golf” and so on. “Hillary Clinton beloved feminist icon charity women’s rights”, anyone?
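If something like this is happening, the mechanism could be as simple as the following sketch. It is a guess at the general technique (usually called query expansion), not a description of anything Google is known to run; the `EXPANSIONS` table, the made-up "acme corp" entry, and the function name are all hypothetical.

```python
# Hypothetical expansion table: extra keywords the engine silently
# appends to certain queries before searching.
EXPANSIONS = {
    "acme corp": ["scandal", "lawsuit"],
}


def expand_query(query, expansions=EXPANSIONS):
    """Return the keywords actually searched for.

    The user's own words come first, followed by any terms the table
    attaches to that exact query. Unlisted queries pass through as-is.
    """
    normalized = " ".join(query.lower().split())
    return normalized.split() + expansions.get(normalized, [])


print(expand_query("Acme Corp"))  # user's words plus the injected ones
print(expand_query("gardening"))  # untouched
```

The user never sees the appended terms, only the skewed results. The table itself, though, is plain data.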
That’s how the scam works, and the only way to expose it is to look at the code and the data. Those things can’t be concealed from a trained eye. “It’s on the hard drive,” so to speak.
As much as I fantasize about Google’s pain and suffering, their source code probably shouldn’t be forced out into the open, at least not right away. That would be unprecedented in our legal and Constitutional order. There are other reasons to doubt it would be a good thing.
But being a monopoly power acting against the public interest should erode their right to confidentiality. An investigation could determine with full confidence whether Google was guilty. Ongoing scrutiny could prevent future abuse.
But only if the regulators can see the code and verify on an ongoing basis. That will take a massive, expensive effort. It will entail a raft of novel liabilities and security challenges; the needed expertise to do even the basic task won’t be cheap; and even a regulator with the practical ability to rein in abuse can and will be captured eventually.
Google created a service which the public came to rely on. Google has long abused, and continues to abuse, its position of trust and power. Open-sourcing the critical elements of the search engine would expose Google’s algorithms to independent analysis by experts from all over the world. That is the proven best method to detect bugs, or malice, in software. Failing that, a massive system of truly punitive inquiry into Google’s internal affairs must be instituted and maintained in the Federal Government.
Blow the lid off. Rein them in.