Google's subpoena and some corporate BS
Where do we start... The Justice Department is trying to show that the Child Online Protection Act is constitutional. The act aims to "protect" children from online pornography.
(Personally, I think children would not need to be "protected" from things like violence, drug-related content, strong language, or sexual content, if parents actually talked to their kids. How about trying to work on the problem of bad parenting, rather than trying to "protect" kids from things that will only affect them if their parents don't explain these things to them? But that's something I can get into later. Until then, I recommend this excellent article by David Mills that explains why children do not need to be protected from pornography and why the whole idea is ridiculous - it is only pushed by the religious right (who are against any kind of pornography and love any chance to fight it) and by politicians who have something to gain by making voters worried and/or scared and/or outraged for no good reason. In fact, this article says that ...the law is backed by such people as Jack Samad of the National Coalition for Protection of Children and Families, an Ohio-based religious advocacy group which avows to "encourage and challenge Christians to live sexually pure lives". For the record, let me say I support the idea of fundamentalist Christians living sexually pure lives - maybe this way they'll go extinct. Yeah, don't I wish. Anyways, in the future, I will probably also write a post about why violent videogames and violent movies are not harmful at all. Until then, recommended reading is here and here and here, and if you really want to look at the psychology behind it, here and here. But back to protecting kids from pornography... Where were we?)
Ah, yes, the controversial Child Online Protection Act. The Justice Department asked Google, MSN, Yahoo, and AOL to give them tons of information about things like
-All the searches performed over a period of 1 or 2 months,
-All the results returned in all those searches,
-All the webpage addresses that COULD be returned in a search,
and so on. From that information, the Justice Dept could see how likely porn is to appear among search results where it does not belong.
Yahoo and MSN complied with this request several months ago. AOL also complied, and then tried to deny this. Google, though, did not comply. According to this article:
When trying to negotiate with Google, the Justice Department eventually narrowed that request to a "random sample of 1 million URLs" and "copies of the text of each search string entered onto Google's search engine over a 1-week period."
This, however, does not help in understanding how often porn sites come up in innocent searches. Even if the majority of the sites in Google's index are pornographic, there is no problem if these sites are rarely or never returned for innocent searches. Well, the "problem" (as far as the Child Online Protection Act is concerned) would be if these sites allow users access to pornography without verifying their age, but I don't see how this is related to what people search for, or to what percentage of sites are pornographic, or even to whether these sites come up in search results. Well, I guess the Justice Dept thinks it can cite this kind of data to "prove" whether the Child Online Protection Act is constitutional. We'll see how they do.
But back to Google and privacy issues.
Google's non-compliance was discussed here when it became known that a judge was asked to demand this information from Google. Ramifications of this issue, theories behind motivations, and even the court documents themselves, are linked to (and concisely explained) here.
Supposedly, the reason for Google's non-compliance was so that users' privacy would not be unnecessarily compromised. Like, yeah right.
Two questions can be asked here:
1) What's the REAL reason why Google did not comply? Some people have some good theories that I will basically agree with.
and
2) When data about millions of users' actions online is aggregated for the purpose of analysis, even if this analysis discards the part of the data that specifies which users performed any of these actions... Does that violate the users' privacy? A lot of people think it does. I basically think those people are idiots. We'll get to that later.
Regarding the first question... From this article:
Nicole Wong, an associate general counsel for Google, said the company will fight the government's effort "vigorously."
"Google is not a party to this lawsuit, and the demand for the information is overreaching," Wong said.
...
Privacy consultant Ray Everett-Church, who has consulted with Internet companies facing subpoenas, said Google could argue that releasing the information causes undue harm to its users' privacy.
"The government can't even claim that it's for national security," Everett-Church said. "They're just using it to get the search engines to do their research for them in a way that compromises the civil liberties of other people."
And from this article we learn that...
"Google's acceding to the request would suggest that it is willing to reveal information about those who use its services," wrote U.S. Google attorney Ashok Ramani in an Oct. 10 letter to U.S. Dept of Justice attorney Joel McElvain. "And one can envision scenarios where queries alone could reveal identifying information about a specific Google user, which is another outcome that Google cannot accept"...
This, of course, won Google some favor with some privacy advocates. For years now, privacy advocates have understood that Google is storing vast amounts of data regarding who searches for what and when (and information about which sites they go on to, from the search results). Privacy advocates have feared that this information could be revealed, through hacking, theft, the kind of subpoena seen here, vague "national security needs", or other such ways. These people want their search information to be completely private even if the government needs it to investigate criminals and terrorists. Remember, the Constitution protects them against "unreasonable searches and seizures"... I'm not sure this is unreasonable... I think they're being unreasonable... In any case, Google is at least pretending to side with them, trying to earn back people's trust, trying to make it seem like it wants to fight as hard as it can before giving up these people's precious search terms.
Seeing through this BS, many people like John Battelle offer very good theories about what's really keeping Google from going ahead with this:
Remember this whole goat rodeo (on the size of indexes)? Remember how slippery both Yahoo and Google got when we tried to figure out exactly how many documents were in their indexes? Well, turns out, that's pretty much what the DOJ is trying to do as well. Hence, Google's defense on a "trade secrets" basis.
Apparently, the subpoena originally asked for a lot more than just a million addresses, as reported Thursday. From the motion the DOJ filed to force Google to comply with the subpoena:
"The subpoena asks Google to produce an electronic file containing '[a]ll URL's that are available to be located through a query on your company's search engine as of July 31 2005.'"
and
"all queries that have been entered on your company's search engine between June 1, 2005 and July 31, 2005."
HELLO. You think Google is going to give that over? Me no think so.
So how to fight it? Well, standing up to the DOJ and getting major praise for doing so is a very smart strategy, in my book. As much as I'd love to believe Google is fighting this for heroic reasons, I'd wager that the data has more to do with it.
A little more research turns up the following:
In a letter dated 10 October, 2005, Google lawyer Ashok Ramani objected to the Justice Department's request on the grounds that it could disclose trade secrets and was "overbroad, unduly burdensome, vague and intended to harass".
So I guess it's fair to say that Google is doing this:
1) Because it wants to guard its precious information (only WE mine the information we worked so hard to get!) and its index size,
2) Because it doesn't want the US government to get the impression Google is happy to do research for them when this research ought to be done by the government,
3) Because Google wants the crazy privacy advocates to think that Google's large stores of "private" information are safe, that Google actually values privacy (rather than simply fearing the crazy privacy advocates. It's a subtle difference).
Now, the more complicated question... Does aggregate data violate privacy? MSN, Yahoo and AOL seem to not think so. Google even has a whole series of webpages dedicated to showing the most popular searches every week, in different countries, per category, as well as the search terms being searched for with the most quickly-increasing frequency and with the most quickly-decreasing frequency (fads, interests, and cultural fascinations that are just starting or dying down). This also ties in with all those software programs that send information about your online habits to a centralized server. Many people think this information is private, even if the company only knows that "a user did this" (or, rather, that "more users did this than did that" and "most of the users who did that were also the users who did this"). This even ties in to the supermarket / retail-chain cards I talked about earlier, except that web surfing and searching happen in a more private space than retail shopping... Right? Maybe, maybe not. I'll talk about that next post.