Wednesday, January 25, 2006

Google's subpoena and some corporate BS

For those of you who don't read SlashDot and BoingBoing every day and who don't pay close attention to the latest developments at the Googleplex or to how often Google appears in the news... What are you doing reading this blog? In any case, if you've been living under a rock for the past week, here's the latest controversy Google got itself in. (I'm sure they did it just because, y'know, people were probably getting bored with the Google Book Search debate/lawsuit).

Where do we start... The Justice Department is investigating whether the Child Online Protection Act is constitutional. It aims to "protect" children from online pornography.

(Personally, I think children would not need to be "protected" from things like violence, drug-related content, strong language, or sexual content, if parents actually talked to their kids. How about trying to work on the problem of bad parenting, rather than trying to "protect" kids from things that will only affect them if their parents don't explain these things to them? But that's something I can get into later. Until then, I recommend this excellent article by David Mills that explains why children do not need to be protected from pornography and why the whole idea is ridiculous - it is only pushed by the religious right (who are against any kind of pornography and love any chance to fight it) and by politicians who have something to gain by making voters worried and/or scared and/or outraged for no good reason. In fact, this article says that ...the law is backed by such people as Jack Samad of the National Coalition for Protection of Children and Families, an Ohio-based religious advocacy group which avows to "encourage and challenge Christians to live sexually pure lives". For the record, let me say I support the idea of fundamentalist Christians living sexually pure lives - maybe this way they'll go extinct. Yeah, don't I wish. Anyways, in the future, I will probably also write a post about why violent videogames and violent movies are not harmful at all. Until then, recommended reading is here and here and here, and if you really want to look at the psychology behind it, here and here. But back to protecting kids from pornography... Where were we?)

Ah, yes, the controversial Child Online Protection Act. The Justice Department asked Google, MSN, Yahoo, and AOL to give them tons of information about things like

-All the searches performed over a period of 1 or 2 months,
-All the results returned in all those searches,
-All the webpage addresses that COULD be returned in a search,

and so on. From that information, the Justice dept could see how likely porn is to appear among search results where it does not belong.

Yahoo and MSN complied with this request several months ago. AOL also complied, and then tried to deny this. Google, though, did not comply. According to this article:

When trying to negotiate with Google, the Justice Department eventually narrowed that request to a "random sample of 1 million URLs" and "copies of the text of each search string entered onto Google's search engine over a 1-week period."


This, however, does not help in understanding how often porn sites come up in innocent searches. Even if the majority of the sites in Google's index are pornographic, there is no problem if these sites are rarely or never returned for innocent searches. Well, the "problem" (as far as the Child Online Protection Act is concerned) would be if these sites allow users access to pornography without verifying their age, but I don't see how this is related to what people search for, or to what percentage of sites are pornographic, or even to whether these sites come up in search results. Well, I guess the Justice Dept thinks it can cite this kind of data to "prove" whether the Child Online Protection Act is constitutional. We'll see how they do.

But back to Google and privacy issues.

Google's non-compliance was discussed here when it became known that a judge was asked to demand this information from Google. Ramifications of this issue, theories behind motivations, and even the court documents themselves, are linked to (and concisely explained) here.

Supposedly, the reason for Google's non-compliance was so that users' privacy would not be unnecessarily compromised. Like, yeah right.

Two questions can be asked here:

1) What's the REAL reason why Google did not comply? Some people have some good theories that I will basically agree with.

and

2) When data about millions of users' actions online is aggregated for the purpose of analysis, even if this analysis discards the part of the data that specifies what users performed any of these actions... Does that violate the users' privacy? A lot of people think it does. I basically think those people are idiots. We'll get to that later.

Regarding the first question... From this article:

Nicole Wong, an associate general counsel for Google, said the company will fight the government's effort ``vigorously.''

``Google is not a party to this lawsuit, and the demand for the information is overreaching,'' Wong said.

...

Privacy consultant Ray Everett-Church, who has consulted with Internet companies facing subpoenas, said Google could argue that releasing the information causes undue harm to its users' privacy.

``The government can't even claim that it's for national security,'' Everett-Church said. ``They're just using it to get the search engines to do their research for them in a way that compromises the civil liberties of other people.''


And from this article we learn that...

"Google's acceding to the request would suggest that it is willing to reveal information about those who use its services," wrote U.S. Google attorney Ashok Ramani in an Oct. 10 letter to U.S. Dept of Justice attorney Joel McElvain. "And one can envision scenarios where queries alone could reveal identifying information about a specific Google user, which is another outcome that Google cannot accept"...


This, of course, won Google some favor with some privacy advocates. For years now, privacy advocates have understood that Google is storing vast amounts of data regarding who searches for what and when (and information of which sites they go on to, from the search results). Privacy advocates have feared that this information could be revealed, through hacking, theft, the kind of subpoena seen here, vague "national security needs", or other such ways. These people want their search information to be completely private even if the government needs it to investigate criminals and terrorists. Remember, the constitution protects them against "unreasonable searches and seizures"... I'm not sure this is unreasonable... I think they're being unreasonable... In any case, Google is at least pretending to side with them, trying to earn back people's trust, trying to make it seem like it wants to fight as hard as it can before giving up these people's precious search terms.

Seeing through this BS, many people like John Battelle offer very good theories about what's really keeping Google from going ahead with this:

Remember this whole goat rodeo (on the size of indexes)? Remember how slippery both Yahoo and Google got when we tried to figure out exactly how many documents were in their indexes? Well, turns out, that's pretty much what the DOJ is trying to do as well. Hence, Google's defense on a "trade secrets" basis.

Apparently, the subpoena originally asked for a lot more than just a million addresses, as reported Thursday. From the motion the DOJ filed to force Google to comply with the subpoena:

"The subpoena asks Google to produce an electronic file containing '[a]ll URL's that are available to be located through a query on your company's search engine as of July 31 2005.'"

and

"all queries that have been entered on your company's search engine between June 1, 2005 and July 31, 2005."

HELLO. You think Google is going to give that over? Me no think so.

So how to fight it? Well, standing up to the DOJ and getting major praise for doing so is a very smart strategy, in my book. As much as I'd love to believe Google is fighting this for heroic reasons, I'd wager that the data has more to do with it.


A little more research turns up the following:

In a letter dated 10 October, 2005, Google lawyer Ashok Ramani objected to the Justice Department's request on the grounds that it could disclose trade secrets and was "overbroad, unduly burdensome, vague and intended to harass".


So I guess it's fair to say that Google is doing this:

1) Because it wants to guard its precious information (only WE mine the information we worked so hard to get!) and its index size,

2) Because it doesn't want the US government to get the impression Google is happy to do research for them when this research ought to be done by the government,

3) Because Google wants the crazy privacy advocates to think that Google's large stores of "private" information are safe, that Google actually values privacy (rather than simply fearing the crazy privacy advocates. It's a subtle difference).

Now, the more complicated question... Does aggregate data violate privacy? MSN, Yahoo and AOL seem to not think so. Google even has a whole series of webpages dedicated to showing the most popular searches every week, in different countries, per category, as well as the search terms being searched for with the most quickly-increasing frequency and with the most quickly-decreasing frequency (fads, interests, and cultural fascinations that are just starting or dying down). This also ties in with all those software programs that send information about your online habits to a centralized server. Many people think this is private, even if the company only knows that "a user did this" (or, rather, that "more users did this than did that" and "most of the users who did that were also the users who did this"). This even ties in to the supermarket / retail-chain cards I talked about earlier, except that web surfing and searching happen in a more private space than retail shopping... Right? Maybe, maybe not. I'll talk about that next post.
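To make "aggregate data" a bit more concrete, here is a minimal sketch of what discarding the "who" part of a search log could look like. Everything in it is made up for illustration: the `raw_log` entries, the user IDs, and the `aggregate_queries` helper are all hypothetical, and this is not a description of how Google or anyone else actually processes their logs.

```python
from collections import Counter

# Hypothetical raw log of (user_id, query) pairs. All values are invented.
raw_log = [
    ("user_17", "weather boston"),
    ("user_17", "cheap flights"),
    ("user_42", "weather boston"),
    ("user_99", "cheap flights"),
    ("user_42", "weather boston"),
]


def aggregate_queries(log):
    """Count how often each query appears, dropping who typed it."""
    return Counter(query for _user, query in log)


query_counts = aggregate_queries(raw_log)

# The two most frequent queries, with no user information attached.
top_queries = query_counts.most_common(2)
print(top_queries)
```

Note that the user IDs are gone but the query text itself is kept, which is exactly the part Google's lawyer was worried about in the letter quoted above: a query alone can sometimes identify the person who typed it.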

Sometimes, when I'm alone, I Google myself

Ed Felten once said that "privacy is for Google what security is for Microsoft": Basically, it ought to be the highest priority, and failures in this area could destroy the company's image and its users' trust. "It’s high time for Google to figure out that it is one or two privacy disasters away from becoming just another Internet company".

Thing is, though, Google doesn't even have to endanger/violate people's privacy in order to get in trouble. If it just LOOKS like it's endangering/violating people's privacy, its image is hurt badly. This happens often, since so many people have ridiculous and unreasonable expectations of what ought to be kept private, and a very fuzzy understanding of what information goes where and how. Google is in a tough spot: Even if it does things right, it keeps being misunderstood and wrongly criticized by paranoid privacy advocates and by inexperienced/ignorant users.

I was about to start writing a post on the recent subpoena that Google is resisting. And I thought to myself; Hmmm, Google certainly has worried privacy advocates - and still worries some of them - over a huge variety of issues (most of which, incidentally, are blown way out of proportion by privacy advocates and really are not legitimate cause for concern unless you're an idiot, way too paranoid, or careless about how you let other people use your computer). The list I came up with, originally just a way to start talking about the subpoena, is long enough to get a post all for itself:

So let's see, you've got the anti-phishing Firefox extension (that sends information to Google about which sites you visited), the Toolbar (that sends information to Google about what you search for and, possibly, also about which sites you visited), the Desktop Search software (that sends no information to Google at all but LOOKS like it does, thus causing many stupid people to freak out when they see personal emails and Word docs appear among Google websearch results), the Search History service (that keeps track of what you searched, when, and which sites you visited from the ones in the search results... This made many people mad, mostly before these people realized you have to proactively sign up for this service and then turn it on), the Google Phonebook (that does nothing more than connect bits of information about you that you allowed your phone company to publish in a huge book delivered to every doorstep in town), Google Maps (people have actually requested that their addresses NOT be searchable through Google Maps, and that their houses be removed from the satellite pictures... No, seriously, I swear), the Orkut social-networking website (where people post pictures (and other info) publicly and then get all mad that their pictures were stolen by someone else), the different web-browsers' "AutoComplete" function (that remembers the things you searched for on Google and shows a drop-down list of previous search terms that start with the letter you write in the box... 
This is not a "Google" feature, it's a browser feature, but it happens to reveal to other users of the same computer the fact that you have been searching for odd/inappropriate stuff), the Google Accounts cookie (which keeps you logged in to your Google Account, so if you close a Gmail or Orkut or Google Groups window without first logging out, the next person to use the computer will have access to your Gmail, Orkut, and Groups info/profile/identity), the Google cookie (which supposedly tries to group the searches you do to one individual profile - much like the supermarket savings cards from two posts ago), the Gmail ads (which make it look like someone's reading your email and choosing appropriate ads for it... No, people, it's all automated, no one's reading your email!), the Google Web Accelerator (which blindly and dangerously followed every link it saw, stored webpage info on Google's servers, and could make it look like you were visiting a site under someone else's identity, until it was fixed), the Google search results (that often contain information about you that you think is private, wish were private, or find embarrassing and/or inaccurate and/or defamatory), and of course the good ol' Google search engine itself (that might remember what searches were made from what IP addresses and when, and from which computers, using which cookies... It certainly does remember what searches are done, that at least is known, and any site you visit from a Google search results list will know you came from Google and what you searched on Google to find the site. My personal website is visited by people who search for the oddest things).

As I will eventually mention in this blog (if I didn't already in the parenthetical explanations above), almost all these concerns are either 1: caused by a lack of clear understanding of what is private and what is public, 2: caused by worries that the government will know what you do online (not a problem unless you are a criminal), 3: sheer stupidity, or 4: only a problem if you don't know how to properly use a web browser and cookies (and if you are sloppy about deleting things and logging out of sites), which I guess is the same thing as "3": stupid. In any case, you will only worry about any of the things in the previous paragraph if you do not read the Terms of Use and the Privacy Policy of these products...

...except maybe when it comes to finding supposedly defamatory things when you Google your name - a problem that is not Google's fault (or Google's liability), and in fact indicates that Google's search engine is working very well. This "problem" is caused by you being careless about what information about you gets to webmasters - THEY are the ones that publish what appears in Google's search results, THEY are the ones you have to go after. Theoretically, getting in touch with the webmaster will solve the problem of undesired search results on a search for your name, if you do indeed have any right to be mad over the material written about you. More on this in a future post.

...AND, I guess that Web Accelerator bug really was pretty serious, but it got fixed, it looked more serious than it was (you didn't actually have access to other people's online profiles / logins, it just looked like you did), and its potentially destructive link-following (such as following links that said "remove this" or "delete this") was really not too different from regular search-engine spider / crawler indexing, so only websites developed by inept webmasters were damaged (one wonders how those websites survive the crawling / indexing done by search engine spiders / robots...).

But other than that, Google's services offer no cause for worry, privacy-wise, if your expectations of "what is private" are reasonable, and if you are just a tad mindful while doing "private" things on a computer used by other people. Over the next few posts I will go into more detail about why I feel I can safely say this.

BNM

PS: The title of this post is just a reference to this hilarious T-shirt, now available as a variety of products (and in the proper Google font) here.

PPS: Coming up next: My reaction to Google's subpoena mess.

Monday, January 23, 2006

Customers who have looked at this Sony laptop have also bought... Pants!!!

I just realized that, in my Private self vs Public self post, I said almost all the things I wanted to say about merchant-customer privacy. Once the concept of "your public self versus your private self" is introduced, it becomes trivial to see why the list of things you previously bought (and the times when you bought them) should not be private: It pertains to actions done by your public self in a public place. Unless these things were bought online from a merchant that specifically tells you the fact you bought them will be kept confidential.

There are a few more dimensions to this kind of thing, so while I'm at it, I guess I'll just finish writing up my current thoughts on them.

So, yeah, many stores have a "profile" on you, and this profile is a list of the things you bought and when. Sometimes this is facilitated by a "savings card", sometimes by the store's asking your phone number, or it could be done by matching credit card numbers. Privacy-wise, what do you have to gain or lose with this? Economics-wise, what do you have to gain or lose, what do the other customers have to gain or lose, and what does the store have to gain or lose? (Right now I'm just talking about brick-and-mortar stores; I'll get to online merchants at the end).

Privacy-wise, the issue is clear and I've already addressed it: These purchases were made in a public place, in plain sight of everyone. Expecting them to be "private" in any way is ridiculous.

Now, you may say; "But there's a difference between scattered people possibly seeing (and then probably forgetting) what I bought on any one trip, and a database that precisely and eternally records what I buy on EVERY trip".

To that, I answer; "Not anymore, there's not". It used to be that bits of information made "public" in different places, in different forums, in different spaces, could be expected to never be compiled together. It used to be that you could, in plain sight, buy Item A in Store X and later buy Item B in Store Y, and expect those two pieces of information to never come together (and to never allow anyone to use them to make a "pattern" out of your shopping habits). It used to be that the people who knew of one purchase would not know how to reach information about another purchase somewhere else. Well, with the digital age, that's out the window. The nature of information has changed, the way it is stored and retrieved, in such a way that scattered public information can easily be grouped. All the things you do publicly can be brought together. And it's ridiculous to think that this compiling of different kinds of public information constitutes a violation of privacy. (This will come up again and again, like when I talk about your information appearing in Google search results and about social networking sites).

It's like there's a giant network of computers stalking EVERYONE. But since they only get to see the "public you", then they are not invading your privacy; just compiling public information.

If you don't want that information to be public, then you're going to have to pay cash and to give phony information when the store/business asks you for your phone number and whatnot.

What about the fact that, once a merchant has your name and address, they often sell it to marketers, resulting in you receiving junk mail? Isn't that "private information"? Well, yes and no. When the people in the store asked you for this information, you probably just gave it to them. Did you first ask with whom this information would be shared and under what circumstances? (Did you read the privacy policy?) No? Then you just gave away that information, and you have no right to expect that this information will not be shared.

If this information is so valuable, or so private, why did you give it away so carelessly? People often wonder how mailing lists, search engines, etc, find out information about them. Well, most of the time the people themselves gave the information away without bothering to find out where it was going. They just assumed it would be kept private, that the commercial institutions would pass up the chance to sell it, out of the goodness of their hearts. Um, yeah.

Why do the stores do this, anyways? Is it because they are mischievously curious to know things about YOU? Is it so that, if you buy a certain series of items, the computer at the store warns the FBI that you are probably a terrorist and/or a sex offender? NO! The store could not care less about YOU! It's so that they can have lots of data, which will allow them to group all their customers into a few (or several) sub-groups. It's so that they can say "the people who shop here most often seem to buy these items", and "the people who buy items in category C also seem to like buying items in category Q", and "smaller and more frequent shopping trips seem to include these items, while less-frequent and bigger shopping trips tend to include those items". This kind of information will allow them to better organize their store layout (it's anyone's guess whether that means that the items usually bought in the same trip will be placed closer together or farther apart), it will allow them to decide which products to sell at low prices and which products to sell at high prices (if everyone who buys product X also buys product Y, then we can sell product X at a super low price and advertise this, and meanwhile crank up the price on product Y and hope no one notices they're actually spending more money). It will allow them to figure out the shopping profile of the kind of customer that brings in the most profit, so that the store can tailor their product selection, their marketing, their prices, and their store layout to bring in and profit from that kind of customer.
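For the curious, the "people who buy X also buy Y" part can be sketched in a few lines. This is only a rough illustration under my own assumptions: the `baskets` data is invented, `pair_counts` is a hypothetical helper, and real stores presumably use something far more elaborate than counting pairs.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping trips; each set is one basket. All data is made up.
baskets = [
    {"diapers", "beer", "chips"},
    {"diapers", "beer"},
    {"bread", "milk"},
    {"diapers", "milk", "beer"},
]


def pair_counts(all_baskets):
    """Count how many baskets contain each pair of items."""
    counts = Counter()
    for basket in all_baskets:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts


counts = pair_counts(baskets)

# The pair of items most often bought on the same trip.
most_common_pair = counts.most_common(1)[0]
print(most_common_pair)
```

Notice that nothing in this calculation needs a name, an address, or a phone number. It only needs to know which items ended up in the same basket.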

The stores claim that savings cards "bring savings to our most loyal customers". So if you buy there a lot, each trip will cost less than it would if you did not have the card. This "motivation" to help their "loyal customers" is, of course, BS. While it may be true that a bunch of items paid for with a card will cost less than those same items would without the card, the seldom-mentioned fact is that most prices are RAISED upon the implementation of a card system, so you're paying about the same that you were paying before the card existed - maybe even more, if they're careful about which prices were raised and how much. You're probably not saving money compared to the pre-card days, you're just saving money compared to current non-card purchases. It's a big difference; It means the "savings" are an illusion and are only relative to artificially inflated prices. Ah, and meanwhile, the other people (those without cards) are 1: paying way more, and 2: shopping in a store that will slowly disfavor their "shopping profiles" by taking away the items that they buy and you don't, by raising the prices on the items that they buy and you don't, and by devoting less store space to the items that they buy and you don't. They want their store to cater to the most loyal (read: profitable) customers. They care less about the less-profitable customers, but unless some law says otherwise, there's nothing wrong with that.
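Here is the arithmetic behind that "illusion", as a tiny worked example. The prices are numbers I made up (in cents, to keep the math exact), and `card_savings` is just a hypothetical helper, not data from any actual store.

```python
def card_savings(old_price_cents, shelf_price_cents, card_price_cents):
    """Return (advertised saving, real saving vs. the pre-card price)."""
    advertised = shelf_price_cents - card_price_cents
    real = old_price_cents - card_price_cents
    return advertised, real


# Made-up example: the item cost $2.00 before the card program existed.
# After the program starts, the shelf price becomes $2.40 and the
# "card price" is $2.00.
advertised, real = card_savings(200, 240, 200)
print(advertised, real)
```

With these invented numbers the card holder is told they are saving 40 cents, while actually paying what everyone paid before the card existed, and the shopper without a card pays 40 cents more than they used to.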

All of which sounds very very good to me. It makes excellent business sense. The fact that the store is hiding their true intentions (they say "We want to reward YOU with savings" rather than "We want to make more money off people like YOU"), and the fact that they falsely claim "big savings" over prices that are too high anyways, are a little annoying and dishonest, sure. But the idea of doing all this makes good business sense and will allow for the store to be more profitable, selling only the more profitable items and drawing the more profitable customers.

One interesting side-effect is that if you provide phony information when they ask you for your name, phone number, address, etc etc, it won't make the least bit of difference. They will still get to see your shopping profile, and will still be able to aggregate your shopping data with that of similar customers. At least you won't get junk mail. (But you might also miss out on their preferred-customer-only gift certificates and coupons! Are those worth getting junk mail? Up to you). Some people are so paranoid about privacy, they have put much effort into making fake supermarket savings cards. If tons of people download these and all use effectively the same card, this may harm the supermarket's ability to track their purchases... or it might be doing the supermarket a favor by having a whole class of customers (techie dorks) use a single card!

So you see, this has nothing to do with "your private information" (unless you give it away and don't bother to make sure this information won't be sold to junk-mailers). It has to do with running a business more efficiently.

So far I've been talking about brick-and-mortar stores. Online merchants have an even easier time: They can match sales in their database by address, credit card info, or name, so in their case it's trivial to group their customers into different categories (and thus to find out which kind of customer is most profitable so that more marketing can be aimed at them, and also to find out which prices can be raised based on what items are usually bought along with what items).

Online merchants can go one step beyond: they can create algorithms that recommend items to YOU based on your individual shopping/browsing history and on the purchases made by people with similar histories. Now, it could be said that what you buy online is NOT public, so the privacy policy of the online merchant might (but might not) allow the merchant to share information about you and your shopping habits (and/or your wish list) with anyone. Whatever their privacy policy says, though, this does not prevent them from having a computer look at your shopping history and, with no human intervention, tell YOU that "We have noticed you are interested in [blah]. People who also seem to be interested in [blah] have also looked at and/or bought [this other thing]".

Personally, I think this is GREAT. Amazon.com regularly recommends things to me that I genuinely would be interested in. I usually know about them, but sometimes not. So they make money, I learn (or am reminded) about neat stuff, my privacy is maintained (if I care about that), everyone wins.

The title of this post comes from the fact that a friend of mine, a Microsoft programmer, has gotten Amazon recommendations for items unrelated to the items he bought or looked at. Sure, it may be true that "People who have looked at the Sony Vaio PCG-GRT100P have also bought... Pants!!!", but that's because most people buy pants, not because pants have anything to do with laptop computers. Are users of Sony's laptops more likely to buy pants (or to not know that you can buy pants online) than most other people? Since the items are recommended by correlating shopping/browsing histories, and not by considering the relatedness (or lack thereof) of the items, the results can sometimes be quite humorous. Or controversial.
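The pants effect is easy to reproduce with made-up numbers. The sketch below compares a raw "how many people did both" count with a measure usually called lift, which divides out how popular an item is overall. All the counts are invented, the `lift` helper is my own, and I am not claiming this is how Amazon's recommendations actually work.

```python
def lift(n_total, n_a, n_b, n_both):
    """Lift of B given A: P(B | A) / P(B), computed from raw counts.

    Written as (n_both * n_total) / (n_a * n_b), which is the same ratio.
    A value near 1.0 means A tells you nothing about B.
    """
    return (n_both * n_total) / (n_a * n_b)


# Invented numbers: 1000 customers, 100 of whom looked at the laptop.
N_CUSTOMERS = 1000
N_LAPTOP = 100

# Pants: bought by 800 customers overall, 80 of them laptop viewers.
pants_both = 80
pants_lift = lift(N_CUSTOMERS, N_LAPTOP, 800, pants_both)

# Laptop bag: bought by only 50 customers, 30 of them laptop viewers.
bag_both = 30
bag_lift = lift(N_CUSTOMERS, N_LAPTOP, 50, bag_both)

print(pants_both, pants_lift)
print(bag_both, bag_lift)
```

Ranked by the raw count, pants beat the laptop bag (80 versus 30), so a naive recommender suggests pants. Ranked by lift, pants score no better than chance while the bag stands out, which is the difference between "most people buy pants" and "laptop people buy bags".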

Besides, isn't it just NICE when the people at a store know you? If I shop a lot somewhere, I want the people there to know this, so that when I send them an email or something, they see that my business should be valued. On a related note, most retail stores today cut costs by paying their employees very little, which causes high turn-over rates and does not make for knowledgeable or motivated sales associates. Almost-gone are the days when you could walk into a store and be greeted by the same genuinely cheerful person, year after year, a person who would call you by your name, ask you how the family is doing, ask you how that new (whatever you bought there last, or talked about buying) is working, and say "we just got these in, I bet you're gonna love'em". Personally, I'd feel great if I were treated like that. And it just doesn't happen anymore. Is that mom-and-pop store employee violating my privacy by remembering what I bought and remembering things about me? Sure, maybe it's less frightening when a person does it, not a computer or a huge greedy faceless corporation, but that person is also primarily doing it for the sake of keeping his store profitable - the fact that he gets to be nice in the process is just a welcome side-effect. Or am I being too cynical? Maybe I am.

One last interesting point I could bring up is the concept that the information these stores have on you could literally be treated like property. Heck, the stores go through a lot of trouble and expense to acquire that information - it must be worth something! How much? Could you sell it? Who'd buy it?

It might be useful to treat your "personal information" as property whose ownership is shared between you and the people/company with whom you do business, and who provide you with services. Once you tell them your personal information, their privacy policy is a contract that determines how much they own that information, what they can do with it, with whom they can share it. This concept will be brought up again when I talk about information that ends up online, and about how John Battelle feels about this.

One last thing: Sometimes you purchase something, and the price is something like "$300 or $250 after rebate". You buy the thing, pay $300 plus tax, and then when it comes time to get the rebate, you find out the rebate people want information about you, like your career field and occupation title. That's just wrong - it's deceptive and misleading. It assumes your information has no value, since it says you get the rebate without having to provide anything other than $300 (and proof of purchase). At the same time, it's effectively paying you $50 (or whatever) just for your information. That kinda puts a value on it right there. Of course, you can just lie - but then are YOU being deceptive, "selling" information that is inaccurate? Now everything gets REAL confusing. It becomes necessary to formalize the value of your information before we can safely make progress past this mess.

However, one could say that (under some definitions of "privacy"), the information asked for on the rebate (your career field and occupation title) is not really private. What I mean is, if you ever made that information public - such as mentioning your career field and occupation title in some non-private forum, or to someone you did not know well enough to be sure your "secret" would not be spread - then your career field and occupation title are not private. In other words, unless you keep your career field and occupation title a secret (only telling it to people you trust will not tell others, and never revealing it in public where random people could hear/read it), then asking you for this information is NOT a violation of privacy.

Concisely put

Most of this blog can be reduced to the following:

Do I do things that I wish no one else ever finds out about? Sure. Do I think I have a fundamental right to get away with doing such things?

No, unless these things are done in my home or said through one-on-one correspondence, are legal, and are done with/to people who can be expected to keep quiet.

Yeah, it can be challenging to determine which spaces are private and which are public, especially in the online world. But don't think that, just because you WANT a certain space to be private, it will be. Assume all online spaces are public unless you have carefully determined that they are private.

See, I can be concise if I try really hard! (Better enjoy it, though, as it happens infrequently and never lasts long).

Private self vs Public self

So tonight I was initially planning on writing something about stores that keep track of what you buy - what the stores gain and lose from this, what you gain and lose from this, what other customers gain and lose from this, and what society as a whole gains and loses from this. I thought I would start stimulating my thoughts on the subject by visiting NoCards.org, a site dedicated to revealing the problems caused by supermarket savings-card programs. (In short, prices go up overall, product selection decreases, and poor people (those who least closely match the shopping profile of the most profitable shoppers) are affected the worst. All good points, but you can't really blame a business for optimizing their products and prices like that. Catering to the poor is often not good business. Who knew).

In any case, that site contains a variety of essays about many aspects of the supermarket-card programs. They are very interesting, thoughtful, insightful essays - Check them out. One in particular caught my eye, in their FAQ. The title:

"Why shouldn't my life be an open book? Only people with something to hide worry about surveillance and tracking."

The essay's response:

If you really believe that your life is an open book, let me ask you a few questions. Do you close the door when you go to the bathroom? Do you try not to pass gas in mid-conversation? Do you resist the urge to scratch your genitals or pick your nose while others are watching you? Do you tone down arguments with your mate when your boss walks in the room? Do you sit a little straighter and dress a little nicer when you want to impress someone?

If you answered yes to any of these questions you are a normal human being. You are also -- like it or not -- a privacy advocate. Everything I just named involves a distinction between the private self and the public self.

As far as I'm concerned, as long as we are not hurting anyone, all of our other activities have the same right to be protected from the observation of others. Do you have a terrible singing voice but occasionally like to belt out a tune in the shower? Do you write gushy love poetry? Draw moustaches on photos of supermodels? Drink milk from the carton? Bite your fingernails? Bite your toenails? Suck your thumb? Sleep with a teddy bear? Perform rain dances in your living room? Wear a superman cape to mop your kitchen? Have meaningful conversations with your goldfish? Actually, you probably don't do any of these things (and it's absolutely none of my business if you do), but I'm sure we could each expand the list with a few idiosyncrasies of our own.

Do these sorts of things hurt anybody? Sheesh, no. Are they weird? Yeah, maybe, which is why we might not want our neighbor, or the NBC film crew, or the agent from the NSA watching us do them.

The need to have time alone, to engage in activities (or make purchases) in private is not the guilty response of a person who has something to hide, it is a place of refuge for our psyche which, for good or bad, is highly attuned to other's opinions and needs to let its hair down once in a while. The stresses of living in society with other people are enormous. If we never had the opportunity to relax, be ourselves, do something maybe a little bit weird (though ultimately harmless) I think we'd all go mad.


As far as I'm concerned, that article shot itself in the foot right there at the end. There are two things fundamentally wrong with this argument - one of which is practically pointed out by the text itself:

The fact is, EVERYONE does weird things. We should not judge each other by the weird things we do. Everyone has strange habits, strange curiosities, strange fascinations, strange pleasures. And that's fine. This means everyone should realize that, if someone does something odd and disgusting but ultimately harmless, then there's nothing wrong with that. A person does not have the right to be judgmental about people's weird idiosyncrasies unless that person is completely free of them - which almost no one is. A world where people were comfortable with each other's idiosyncrasies would be a world where everyone realizes how pointless it is to try to be private about everything in fear that someone else disapproves of it.

(To which you might respond: Yes, but sadly we live in a world of irrational, unreasonable, ignorant, judgmental people, and there are such things as standards for civilized social behavior, and we need to pretend to not be weird in order to fit in. As long as that is the case, privacy is needed. To that I would say, you are right, but it makes you think about how absurd, arbitrary, and impossible those standards really are. I will even confess I have hidden aspects of my beliefs from people because I wanted approval from them, but this kind of intolerance is uncivilized, and we should all hope (and ensure) that it will not be around forever).

Now, the second thing wrong with the argument of that essay is even clearer. It is this: Your shopping habits ought to be a part of your public self. Sure, you may have physically disgusting habits and/or interests in strange and disturbing things, but none of this makes itself evident during a day out shopping. The fact is, when you go shopping, you are in public spaces - stores, markets, the street - and the things you buy are visible to anyone who cares to notice. If your disgusting habits or strange interests require the purchase of specialized... stuff, you will probably not find that stuff in the supermarket or at your local electronics store. If you do, then it probably has more "normal" uses. To get that stuff, you will probably have to go online, and the website that caters to your odd interest will probably fulfill your desire for privacy, otherwise they'd go out of business.

Things you do out in public are simply not private. If you're doing something in a public space - a store, a market, a mall, an internet cafe, a restaurant, a school, your workplace, or just out on the street - then this will not, cannot, and should not, be private. Expecting actions done "out in the open" to be private is ridiculous.

Which brings me to the last thing I want to say. Just as acts done in public (where others are watching) cannot be private, it is similarly unnatural to expect that all aspects of an interaction with another entity should be kept private. If you bought something from someone, you can't expect that fact to be private unless that someone said it would be. Take your phone company, for example. What information do they have about you? Your name, where you live, your phone number, with whom you talk on the phone and when, how you pay your phone bills (credit card, bank account). How much of that information did they explicitly tell you they would not share? Why would you expect that they cannot share any of it? (It all comes down to reading the privacy policy so as to have realistic expectations. I'm not saying those things should not be private. They should. I'm just saying that simply assuming they will be private (without having read the privacy policy) is stupid).

What about your interactions with your ISP (which sites you looked at, what your email says) and with your search engine (what you searched for and when, plus where those searches led)? How private ought all THAT to be? That's a whole separate (if related) issue I want to address sometime very soon. Stay tuned!

Sunday, January 22, 2006

A Guide For The Uninitiated

This site lays out the basics of online privacy simply and thoroughly:

http://www.privacyrights.org/fs/fs18-cyb.htm

If you use the internet regularly and have thought at least a little bit about privacy, you should have realized all this. Problem is, a lot of people haven't...

BNM

StupendousMan Begins

Hobbes can always be trusted to make sensible comments (see panel 4):

Calvin and Hobbes - November 2, 1988, by Bill Watterson

Topics to be covered in the near future

- Google and the Child Online Protection Act-related subpoena; Does aggregate information violate privacy? And why does Google not want to give out that information, anyways?

- Software that sends your information to the software-makers' server - does it violate your privacy even if this is in the Privacy Policy, even if no human ever sees this information except as an anonymous drop in an ocean of data? No, of course not! Silly people raising a fuss over the iTunes mini-store, FireFox, Google's anti-phishing plug-in, the Google Toolbar... Google even took an opportunity once to vent about this!

- Merchants that ask you for your address and phone number. Rebates that require you to state an occupation. Supermarket-savings cards, real or fake ones. Merchants that keep track of what each customer buys. How much is that information worth, anyways? Why do we keep giving it out? Ought we not to? Can we sell it? Isn't it advantageous for merchants to figure out who wants what where, so as to be able to distribute and stock more efficiently, and offer lower prices? Don't you want to be able to walk into a store and have the salesperson ask "So, how's that (latest gadget you bought there) workin' out for you?"? Now THAT'S service! You just don't get that anymore.

- A quick look at the Privacy Policies for Google web search and for some popular email services, maybe some hosting services, online social networking services, and so on.

- Why it is that information about you ends up in the Google search results and what you can do about it.

- Why orkut users (read: Brazilian people) don't really care about privacy, and why it is that they are actually wrong when they DO complain about "violations of privacy".

- More and more of our information, and our lives in general, is web-based. What are the implications? What does the future hold?

- Combining information: Bits of information about you which can separately be made public harmlessly could, when brought together, be (or look) dangerous to you. In the computerized world, people need to realize that different bits of information about you from different sources can easily be brought together. If your address and phone number are in the phonebook under your name, then anyone with your phone number can find out where you live. Things you say on different websites, different blogs, different online forums, and different periodicals, can all easily be found. Anything you say (and stuff others say about you) that ends up on the web is public.

- John Battelle speaks his mind on these issues (especially the last ones about information on the web, what sites know what, how much of it can be found by anyone), and I agree with him. (See, I'm not always disagreeable). You can watch this, starting 35 minutes and 45 seconds into this video.

- My thoughts on the controversy over Google's Book Search - because "private information" issues and "intellectual property" issues go hand in hand. My opinion on this particular issue is not one I have seen online; it's a mix between this one and this one.

- My reflections about the many (1, 2, 3, 4, 5, 6, 7, 8, etc) articles and sites out there that are WAY too paranoid about privacy. My favorite is this video (it wasn't an easy choice, there are just so many wild and crazy privacy nuts out there).

- Other random internet-related things.

Anonymity versus Liability

The internet offers many options to be anonymous. You can post anonymously to a blog. You can get yourself a Hotmail or Yahoo email account with some random name and send emails to whomever you want, and these emails will not contain information that says who you are. You can get free hosting and start a whole website without having to say who you are! In all these cases, the illusion of anonymity is only skin-deep, however. The server logs of any of these services will remember what computer accessed these services when all these things were done. They will at the very least remember your ISP, and then your ISP may or may not rat you out when asked (again highlighting the importance of reading that privacy policy). So you're never really anonymous.
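
To make the server-log point concrete, here is a minimal sketch in Python. It assumes a log line in the Common Log Format that many web servers use. The sample line, the IP address (taken from a range reserved for documentation), and the parse_log_line helper are all made up for illustration, and a real service may record more or less than this.

```python
import re

# Hypothetical access-log line in the Common Log Format.
# The IP address is from a range reserved for documentation, not a real visitor.
SAMPLE_LINE = '203.0.113.7 - - [22/Jan/2006:14:03:11 -0800] "POST /comment HTTP/1.1" 200 512'

# Fields: client IP, two identity fields (usually "-"), timestamp, request line, status, size.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)


def parse_log_line(line):
    """Return the fields recorded for one request, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


entry = parse_log_line(SAMPLE_LINE)
print(entry["ip"], entry["time"], entry["path"])
```

Even for an "anonymous" comment, a line like this ties the request to an IP address and a time, which is usually enough to identify the ISP that handed out that address.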

It used to be that many public computer networks - in public libraries, airports, coffee shops, and "free-wireless" places - could be accessed without any self-identifying information required. Now, though, most of them do log who used what computer when.

I think this is great. I think the only reason why you would ever want anonymity online is if you were up to no good, if you were doing something you knew to be illegal or at least motivated by illegitimate and/or malicious purposes. If you and your actions have Right and/or the Law on your side, why would you want anonymity? A common theme in this blog will be "only the guilty have anything to hide".

I do accept that "free speech", protected by the first amendment of the US constitution, is a right, and might include the right to anonymous speech. However, I do not see why it is expected that people/institutions who have servers and sell (or give away) internet access ought to allow anonymous access. Many websites do not allow certain language, or the discussion of certain topics. Does that infringe on the right to "free speech"? Many email providers (like Hotmail and Yahoo mail) limit the number of emails you can send in one day. Does that infringe on the right to "free speech"? In other words, you cannot "speak freely" wherever the heck you want. A company that provides access to the internet has the right to demand that you not speak quite freely while using their services, by restricting your vocabulary, the subjects mentioned, or your ability to be anonymous. Sure, they may choose to allow you to be anonymous when you access the internet, if they feel that choice will bring in more users or something, but it is a choice.

Most importantly, I believe that if something wrong is done online, someone must be liable. I believe the ability to identify the person who did something wrong, or the computer from which the wrong thing was done, is important. Who sent all this spam? Who shared this copyrighted content? Who revealed this secret? Who is looking to make, distribute, or acquire child pornography? Who made these libelous statements? If a provider (be it a library, a cafe, or a wireless spot) allows users to be anonymous and the users do these things, is the provider liable?

I like the United States better than Brazil (where I was born and grew up) for a few reasons. The main one is that, when you do something wrong in the US, you are liable. You are responsible for the consequences of your actions. If something bad happens, someone is at fault and has to pay, usually. This to me is a sign of a highly civilized society: It's hard to get away with malicious stuff. In Brazil, anyone can get away with all kinds of terrible things, and many people regularly do. Law enforcement is a joke, the police are corrupt, there is usually no hope of catching someone who committed a crime against you. Things are better in the US - it is harder to commit a crime and get away with it, and most importantly, there are few if any wrong things you can do and expect to get away with. (As you may guess, I am also a fan of security cameras, and I think the UK is even more civilized than the US when it comes to a variety of things).

For example, take a look at the comments about this blog post by Mark Evans (featured in SearchEngineWatch). The post is about Google sponsoring (i.e. "slowly taking over", probably) some wireless networks in parks in New York. People's comments emphasize their preference - rather, their expectation - for the ability to use these networks truly anonymously. A prime example of over-the-top "what I do is private, even if I do it through servers belonging to 3 different companies, and for free" thinking:

The last time I checked, the Bryant Park free wireless... put the public's interest first. They gave... truly anonymous access. You read the access policy, which states clearly that they intend to offer anonymous access because they think this serves a civic need. You click agree. That's it. No email registration or tracking or anything (unlike comments on this blog site).

If Google's sponsorship has somehow given them any kind of power over the network, then I'd much rather the NYC city taxes that I pay support this network and keep it free and anonymous, than have Google's Central Committee roll out their shady data mining activities and notoriously non-transparent governance. The City needs Bryant Park wireless as-is to prove irrefutably again and again what is possible. Pundits and planners continually assert that anonymous Internet access is impossible, but so far we can point to Bryant Park which has been running quite well since 2001 to show that yes, free anonymous speech on the Internet is sustainable. I, for one, value it greatly and consider it an important part of what our city stands for.


In the near future, I plan on dedicating a post to why Google's data mining is probably not evil (something especially relevant given the recent controversy over the Child-Online-Protection-Act-related subpoena). For now, I just want to say that the above comment does a great job at illustrating how many people

-think that anonymous internet access is a right,

-think Google's data mining is evil

These people are worried they will no longer be able to use the internet without being traced. Much more worrisome is the fact that over 5% of registered domains have phony information about the domain owners. From Slashdot:

According to research carried out by the US Government Accountability Office (GAO) many domain owners are hiding their true identity. The findings could mean that many websites are fronts for spammers, phishing gangs and other net criminals. The report also found that measures to improve information about domain owners were not proving effective...

The GAO took 300 random domain names from each of the .com, .org and .net registries and looked up the centrally held information about their owners. Any user can look up this data via one of the many whois sites on the net. The report found that owner data for 5.14% of the domains it looked at was clearly fake as it used phone numbers such as (999) 999-9999; listed nonsense addresses such as 'asdasdasd' or used invalid zip codes such as 'XXXXX'. In a further 3.65% of domain owner records data was missing or incomplete in one or more fields.


Now THAT is worrisome. Can you own a house or operate a business anonymously? No. Then why can you own and operate a domain anonymously?
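
For the curious, the "clearly fake" test described in that excerpt can be approximated with a few rules. This is only a rough sketch in Python of the kind of check involved, not the GAO's actual method: the looks_fake function, the field names, and the sample records are all hypothetical, and the zip-code rule assumes US-style zip codes.

```python
import re


def looks_fake(record):
    """Flag a contact record whose phone, address, or zip code is obviously bogus.

    Hypothetical heuristic: empty fields are ignored here, since the excerpt
    counts missing data separately from clearly fake data.
    """
    phone_digits = re.sub(r"\D", "", record.get("phone", ""))
    address = record.get("address", "").strip().lower()
    zip_code = record.get("zip", "").strip()

    # A phone number made of one repeated digit, e.g. (999) 999-9999.
    repeated_phone = len(phone_digits) >= 7 and len(set(phone_digits)) == 1
    # An address that is one short chunk of letters repeated, e.g. 'asdasdasd'.
    keyboard_mash = re.fullmatch(r"([a-z]{2,4})\1+", address) is not None
    # Assumes US format: 5 digits, optionally followed by a dash and 4 more.
    bad_zip = bool(zip_code) and re.fullmatch(r"\d{5}(-\d{4})?", zip_code) is None

    return repeated_phone or keyboard_mash or bad_zip


# Made-up records for illustration only.
fake = {"phone": "(999) 999-9999", "address": "asdasdasd", "zip": "XXXXX"}
plausible = {"phone": "(212) 555-0147", "address": "42 Example St", "zip": "10018"}

print(looks_fake(fake), looks_fake(plausible))
```

Rules like these only catch the laziest fakes. A made-up but realistic-looking name and address would pass, so the 5.14% figure is best read as a floor.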

Anonymous speech may be a right. But if I don't want it on my website or running through my server, can you blame me? Am I doing something fundamentally wrong by not allowing it on the computers I pay to operate? If something bad is done through these computers, the authorities will be knocking on my door, and they would get rightfully mad if I was not able to tell them who said what, who did what. It would be irresponsible of me, and would be a big inconvenience. I would almost say "Props to those providers who accept this risk and allow their users to speak anonymously", except these providers are opening the door to all kinds of mischief. Only the guilty have reason to hide behind the wall of anonymity.

(I bet all comments made to this post will be anonymous. Yeah, yeah, you show him!)

Have a good rest-of-the-weekend,

Bernardo

Friday, January 20, 2006

Hello World

Hi. Welcome to my blog.

(Let me say right away that my future posts will not be this long. This is just the first one where I explain my point of view in general. My future posts will consist of a link to an article, an excerpt, and a very short explanation of why I think the people who wrote the article are paranoid idi.. I mean, should not worry about privacy as much as they do).

(Let me also say that if you disagree with any of what I say, please post a comment or send me an email explaining why. I am open-minded and realize I don't know everything, so this blog could be a good learning experience, and ought to be a good place for debate).

My name is Bernardo Malfitano. All the information you could possibly want about me is here. But here are some tidbits that may be useful in this blog: I am Brazilian, but I went to middle-school in the US, and liked the US enough that I came back for college and have not left since (I will become a US citizen in a few months). Right after graduating, I worked for Google for almost a year. I just quit my job at Google so that I could move to LA and look for an aerospace job, which is what I really want to work on.

This blog is primarily a reaction to:

- People who are worried or upset when they discover that there is information about them on the web;

- People who feel uncomfortable having their shopping habits or web-browsing habits tracked by companies who simply lump all that data into huge statistics in a non-personally-identifiable way;

- People who feel uncomfortable having their shopping habits or web-browsing habits tracked by companies who use this information to provide better services, often with no human intervention (i.e. only computers look at this data).

- People who feel they have the right to go online, shop (online or in a store), and use any service on the internet (searching, posting comments, blogging, emailing, having a site), anonymously.

- - - -

My point of view, which applies to all these cases and more, is the following:

When you interact with someone - be it a website (online store, search engine, email, blog, host/server, ISP), a company (a store, your employer, your phone company, your airline, a company you contacted just to ask a question or make a comment), or a person (someone you have a conversation with, someone who sees you walking down the street) - the information describing this interaction belongs jointly to you and to the other entity/website/person/company.

In other words, the information describing this interaction (which allows the other entity to know some things about you) is like a "thing", a piece of property with certain rights of use and ownership. These rights have to be defined in a contract between you and the other entity. The other entity has the right to define this contract however they want. You have the right to not interact with the other entity, if you do not like the terms of this contract over the co-ownership of your information.

In other words, you do not have an inalienable right to give no one any information. You just have a right to not interact with entities who (you feel) take too much information and/or use it in ways you don't like. You also have a right to complain and say "I do not wish for you to require this information, or to use that information in such and such a way", but the other entity may choose to ignore this request and to stand by their terms (you can always walk away).

If these rights of ownership are not defined by the other entity, then you may not just assume them to be what you wish they were. What I mean is: Read the Terms of Service and the Privacy Policy! In some cases (like meeting with a doctor or a lawyer, or writing email), you can be certain that the information regarding this interaction is fairly well protected, and should be. But in any other circumstances, people need to realize that their information may be required, and may be used in any way the other entity finds appropriate. People should not be surprised when this happens. And people should not mind it too much either.

What many people consider to be "private information" is much more public than they realize. Often, it BECAME public through carelessness on the part of the "owner": They interacted with some entity and assumed their information would not be shared. You know what they say about "When you assume..."

People then realize that all kinds of information about them are all over the place, and begin carefully guarding their information to unreasonable levels.

Then, software that records/emits information about your computing/shopping/browsing/searching habits is frowned upon and extensively bashed in BoingBoing and Slashdot and such places. People start liking the more anonymous ways to interact on the internet, and then when these ways are restricted, people become furious and think they have a right to do things anonymously.

Why do you think you have the right to post on blogs anonymously, to go online anonymously, to search anonymously, to shop anonymously? Do you also have the right to use a car, a phone, an airplane, or a hotel room anonymously? All these things involve interactions with other entities. These other entities may decide that they will only allow you to do these things if you agree to not be anonymous.

And why worry about this? Only the guilty have anything to fear. From whom are you hiding? Who cares if these acts are attributed to you? If you did them for innocent or legitimate reasons, then you should not be worried.

Most of all, many of the people with this kind of paranoia are stupid enough to fall for phishing scams, to not shred their financial documents before throwing them in the dumpster, etc.

- - - -

There are two things that motivated me to write this blog:

1) I worked at Google until very recently. I did user support, among other things. This means I got to deal with a lot of people whose orkut accounts were phished (and they think their account was "hacked into" because of "failures in our security system". Idiots). I also got to deal with lots of people who were upset to be in our reverse-lookup phone directory (even though they appeared to not mind having their name, address, and phone number in a book delivered to every house in town) or upset to have information about them on the web (usually in websites belonging to their friends, their churches, the government, or sites like personals and blogs where they entered the info themselves). I got more frustrated with these people each day. I hope it did not show when I wrote them back.

2) All the time - even after I quit my Google job - I read articles on Slashdot and BoingBoing about how evil some program/company is because they take/share user information, thus violating their customers' "privacy".

- - - -

In this blog, I will mention these kinds of articles whenever I see them (which is a couple times a week, roughly) and, well, disagree with them. I invite my readers to disagree with me. Hopefully, we can get some good debate going. If someone posts a comment or sends me an email that is insightful enough, I might even dedicate a post to it. Unless the author objects, of course. (Yes, I realize email IS, and SHOULD be, private).

Whew, I think that's all for now. Again, please feel free to post comments or send me emails if you agree with me, disagree with me, or see something on the internet you think I ought to check out.

Thanks for reading! And Welcome!

Bernardo

PS: I tend to be a patient, open-minded person who really puts a lot of effort into understanding another person's point of view. In many blogs such as God Vs No God, you will see me trying to understand what assumptions and basic values might cause my "opponent" to come to a different conclusion from mine about what is good and what is bad. I tend to dislike blogs where the opposing point of view is cynically made fun of, where the author does not even recognize that the other side might have a point. However, I am so frustrated and annoyed at people's unrealistic and not-well-thought-out privacy paranoia (and computer illiteracy in general), that in this blog I'm just going to let it all out and call them "idiots" and whatnot. I swear I will only do this on the posts themselves, not in the comments. In the comments I will behave like my usual, civil, open-minded, polite, empathetic self. But I don't ever get a chance to write in the inflammatory, confrontational, disrespectful, intolerant, uncaring, narrow-minded tone used by so many vocal people (mostly conservatives, but other groups too), so please allow me to indulge myself here and exercise those muscles just a little bit. Thanks, and sorry if I offend anyone. I swear I'm pretty nice in person.