An experimental, all-day Workshop on Freedom and Privacy by Design took place on April 4, 2000, as part of the 10th annual Conference on Computers, Freedom, and Privacy. It invited over 30 well-known systems architects, implementors, and experts in privacy and usability to investigate, in depth, three different proposals for creating political artifacts: technology intended to bring about particular types of political change -- in this case the reinforcement of certain civil liberties. This article summarizes the rationale and structure of the workshop, the three proposals investigated, and what the workshop and audience members had to say about the issues. It concludes with some information on the workshop's impact to date and what lessons were learned by running such an experiment.
The Workshop on Freedom and Privacy by Design instead explored using technology to bring about strong protections of civil liberties which are guaranteed by the technology itself -- in short, to get hackers, system architects, and implementors strongly involved in CFP and its goals. The intent was to explore the technology of (a) implemented, fielded systems, and (b) what principles and architectures should be developed, including which open problems must be solved, to implement and field novel systems that can be inherently protective of civil liberties. The CFP conference had never before attempted to answer these questions in an intensive, one-day workshop -- it typically has a more policy-oriented focus, concentrating on plenary sessions and a few more informal Birds-of-a-Feather sessions.
The end goal -- not of the workshop itself per se, but of the follow-on activities it may inspire -- is the eventual development of political artifacts. These are technological devices intended to facilitate particular political aims -- in this case, to enable certain civil liberties to be more easily protected worldwide. This is not a new idea -- for example, Code and Other Laws of Cyberspace [Lessig] is a well-known, recent treatment of the hypothesis that software necessarily imposes a particular world-view and political agenda and is rarely if ever value-neutral. However, Lessig's work is merely the most widely known lay work; the idea is quite old amongst those who have done anthropology in the field of computer science. For example, both the classic The Mythical Man-Month [Brooks] and the nearly-unknown The Psychology of Computer Programming [Weinberg] have addressed this topic, and the latter was published in 1971, nearly three decades ago, when the ARPAnet was not even 100 nodes. In addition, the field of computer-supported cooperative work also addresses many of the issues of how computation shapes social behavior, albeit generally at far smaller scales -- that of a workgroup or a corporation -- than the frankly political aims of this workshop.
The workshop was deliberately uneven in its treatment of invited participants versus the audience -- while the audience was allowed to interact and to make comments, priority was given to workshop members most of the time. The intent was to help focus the discussion -- even 30 people is a very large number with which to attempt meaningful interaction. This strategy was reasonably successful, and still allowed quite a bit of substantive, informative participation from the audience.
The topics to be discussed at the workshop were chosen in advance, primarily by myself, with valuable input from other members of the CFP Program Committee and from some of the position papers submitted by those responding to the call for papers. The workshop addressed three topics, to be described in much greater depth below:
Several months before the workshop, participants were given detailed descriptions (about 7600 words total) of the issues that we would be discussing, and also given access to a mailing list to discuss some of the issues beforehand. All participants were strongly urged to arrive at the workshop with a single overhead slide (per topic) representing their viewpoints on the issue; the intent was to encourage prior thought and participation before the day of the workshop. In addition, half a dozen participants were asked in advance to present 5-minute short talks about particular issues, such as the usability and human-factors issues of cryptographic systems in real-world deployment; the politics of mobilizing popular opposition (as was demonstrated by the antinuclear activism in the US of the 60's and 70's); or the motivations and practicalities of basing a business on free software in the early- to mid-90's, before it was fashionable to do so.
The day of the workshop, all participants and audience members received handouts of the information that had been available to participants in advance, which consisted of the detailed descriptions of the issues to be addressed, one-paragraph biographies of each of the participants, and various other supplementary materials. Half a dozen moderators were appointed, in advance, to facilitate the bulk of the discussion, both guiding the conversation and handling the logistics of sequencing the stack of pending comments -- I felt it neither appropriate nor feasible to directly moderate every moment of discussion all day. Participants were identified by large placards on the tables in front of them, and turning these placards sideways was a very effective way for them to request a turn at the floor. The participants were not individually introduced at the start, since even two minutes of introduction per participant would have consumed an hour of workshop time; this was why biographies were distributed in advance to all attendees. Non-electronic visual aids, such as acetate transparencies, whiteboards, and flipchart easels, were emphasized throughout, again to streamline the process and avoid the inevitable unproductive delays that occur when presenters must interface laptops to video projection systems. Finally, several volunteers were solicited, again in advance, to take notes during the workshop, and these notes were later posted to the post-workshop web pages as part of the documentation of the workshop's results.
[WFPD] is a permanent repository of the public record of the workshop. It contains the original call for participation; the submitted position papers from those responding and those who were invited; the descriptive material distributed in advance to participants, and as handouts to everyone during the workshop; the raw notes of the volunteer scribes; other related resources and references; and various public mailing lists via which those who wish to participate in continuing discussion and technological development may coordinate.
The domain name system was initially conceived as a tree-structured, hierarchical naming scheme that was primarily designed to map host names into IP addresses. At the time, it seemed a reasonably elegant solution to the problem of keeping a potentially large database of mappings up-to-date and robustly available worldwide, while allowing delegation of the authority for creating and maintaining these mappings to the individual entities which were responsible for the hosts being named.
This was a wonderful idea in a world of academia, nonprofits, and gentlemen's agreements. But it has lately become obvious that in the real world of money, greed, scoundrels, lawyers, and intellectual property land grabs, the DNS has many political deficiencies, and that these deficiencies directly impact important civil-liberties issues.
It is important to realize that the DNS is not a law of nature -- it is a technological artifact, created to solve a particular problem in a particular environment. The environment for which it was created has now changed, and it no longer solves the problem effectively. It is therefore time to create a new solution.
There are countless bad examples, most of which never make the press in the first place. Additionally, such land grabs lead to very rapid exhaustion of the DNS namespace -- for example, the dominant registrar, Network Solutions, Inc., encourages everyone to grab 3 domain names at once (.com, .net, .org), and why not? NSI will make three times as much money registering them. Finally, the land-grab mentality has led to preemptive grabs of just about every single word [Woodhead] in the English language, mostly by squatters hoping to sell those names to the highest bidder.
The current situation is encouraging land grabs for four reasons:
Thus, the resulting situation is one in which everyone is suing everyone else, the namespace has been exhausted in a very short amount of time, and political attempts to deal with the problem (the creation of ICANN itself; attempts to very slightly increase the number of gTLDs; trademark and dispute resolution procedures; etc.) are having very few socially-positive effects -- they instead look like rearranging deck chairs on the Titanic.
Political chokepoint. The DNS is currently serving as one of the most important political pinch-points on the net, because it gives those with an axe to grind a central place to exert political pressure. If what comes out of your domain is sufficiently unpopular with the local government, authorities can probably arrange to get your domain name mapping yanked. Furthermore, the current situation gives unwarranted authority to domain name registrars in dictating what a name may be -- for example, NSI has historically been arbitrary and capricious about whether or not to register "obscene" names, whether to allow names longer than 25 characters (the DNS itself permits names of up to 255 octets), etc.
Little guys. Despite all this, what happens when one is trying to find the Acme Hardware store just down the street? Unless it is affiliated with a national brand, and unless the national website has a store locator, one is unlikely to ever figure out what its domain name is -- there are just too many other Acmes, and no obvious way to pick out the one that was wanted. If one actually wants to use geographical information to help find a local entity, the situation can be grim, even if the site being sought mentions its location in its web pages. A few search engines can sometimes help, but for the average end-user, the domain might as well not exist, and even search engines are limited: they cannot cover the entire web, they are months behind, most users cannot use them effectively, and often what is sought is an email address or some other service that is not a web page and hence is not indexed by a search engine.
Anonymity. Sometimes it is necessary to say things anonymously. The authors of the Federalist Papers, back around the time of the US Revolution, published many of their political tracts anonymously. Yet domain names, by virtue of their hierarchical arrangement, are easily traceable back to someone higher in the tree who is, by definition, "responsible" for the delegation farther down the tree -- and who can often be pressured to revoke the delegation, thereby effectively seizing control of the domain name and shutting it down. Thus, it is essentially impossible to protect a domain name from retribution while simultaneously advertising its existence to potential correspondents.
Hence, freedom of expression has been greatly compromised by the existing DNS. Being unable to publish anonymously online means that the online world has deprived people of a right that has been repeatedly upheld by courts in the real world both in the US and elsewhere. The vulnerability of the DNS to political manipulation -- plus the ease with which packet flows may be traced -- is driving a number of technical attempts to publish anonymously (FreeNet [FreeNet] et al), and to distribute resources against attacks via the courts (Napster [Napster], Gnutella [Gnutella], et al).
The fundamentally anarchic polity which influenced the early design of the Internet is, alas, fundamentally incompatible with any global namespace. Why is this so? Because global namespaces require enforcement against duplication. So the first principle of a new naming system is to permit duplication of names. This is, after all, how names in the real world operate -- very few people or businesses have globally unique names. Instead, they are disambiguated locally, using several methods (geography, profession [for people] or market segment [for businesses], etc), while remaining globally ambiguous.
If we do this, then the hierarchy that is fundamental to the old DNS -- established precisely so that searches were fast and duplication was impossible -- is no longer strictly necessary. So get rid of it. Instead, design a name system which encourages this proliferation of names. Resolution of conflicts becomes important, as well as the principle that some names will not be resolvable. This principle of irresolution is in fact the main privacy protection in such a system, and this kind of naming system must fundamentally support private name systems.
Such a system, inconceivable in the resource-limited world of mid-80's computers, now seems achievable. The most important problems are unlikely to be technical, but rather sociological and political: how will people re-adapt to a world in which names on the network, just like names in real life, are no longer automatically guaranteed unique? The major stakeholders will fight to the death -- the squatters who have invested in thousands of names, the major corporations, the registrars which expect an infinite stream of money from registrations. How may they be appeased or circumvented?
Let us first examine the strawman implementation. The essentially political questions above were addressed in part by the workshop discussion; we shall turn to that next.
What will life with Smoosh be like?
How might we implement this rather radical proposal?
Clearly, since we are disambiguating names locally, we expect that computers that are "nearby" each other in some way are probably engaging in a collaborative dialog to determine what a name "means" to a user in the local neighborhood of machines.
So we can probably plan on a few megabytes of persistent storage in each computer to hold the (necessarily dynamic) representation of local relationships. It was the scarcity of such storage that led to the centralization present in DNS. That constraint is no longer with us, and any successor system should feel free to break with that tradition of paucity. In addition, once a name is resolved, the chances are probably very high that the same name in the future should be resolved the same way -- we should cache the results locally, both to help the local machine and probably to help neighbors which might need to determine the same thing.
The resulting system might look like a patchwork of cached information about relationships between clusters of machines. Within a patch, resolution might be very fast and require very little additional information. Across patches, something like a negotiation may be required -- followed by some cached information which (a) might tend to merge the patches (or not), and which (b) might look something like a treaty or a trade agreement, implemented at network speeds and maintained, as names change and hosts move, by the caches managed by the cooperating machines.
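As a rough illustration of the caching behavior described above, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Node` class, its `learn` and `resolve` methods, and the idea that a "meaning" is a plain string are all hypothetical, and nothing like this was specified at the workshop. It shows only the basic pattern: answer from the local cache when possible, otherwise ask neighbors, and remember what they said.

```python
from __future__ import annotations


class Node:
    """One machine in a hypothetical neighborhood of cooperating machines."""

    def __init__(self, label: str) -> None:
        self.label = label
        self.neighbors: list[Node] = []
        self.cache: dict[str, str] = {}   # name -> meaning, as last resolved here
        self.lookups_forwarded = 0        # how often we had to ask someone else

    def learn(self, name: str, meaning: str) -> None:
        """Record what a name means from this machine's point of view."""
        self.cache[name] = meaning

    def resolve(self, name: str, _seen: set[str] | None = None) -> str | None:
        """Answer from the local cache, else ask neighbors and cache the reply.

        Returns None when nobody reachable knows the name; some names are
        simply not resolvable from a given neighborhood.
        """
        seen = _seen if _seen is not None else set()
        seen.add(self.label)              # avoid asking the same machine twice
        if name in self.cache:
            return self.cache[name]
        for neighbor in self.neighbors:
            if neighbor.label in seen:
                continue
            self.lookups_forwarded += 1
            meaning = neighbor.resolve(name, seen)
            if meaning is not None:
                self.cache[name] = meaning    # the work is done only once
                return meaning
        return None


# Three machines in a line: a <-> b <-> c. Only c knows the name at first.
a, b, c = Node("a"), Node("b"), Node("c")
a.neighbors = [b]
b.neighbors = [a, c]
c.learn("acme-hardware", "the shop on Main Street")
a.resolve("acme-hardware")   # asks b, which asks c; both a and b now cache it
a.resolve("acme-hardware")   # answered locally, nothing is forwarded
```

After the first lookup, both intermediate machines hold the answer, which is the sense in which a cache helps neighbors as well as the local machine. A real design would also need expiry and some form of the cross-patch negotiation mentioned above, neither of which is attempted here.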
We should also plan some extensions to the user interface, since users may have a pretty good idea of how to disambiguate a reference (they may know the geographic location, etc, of the name they just typed in). Exactly what form this takes is a big question and depends to a large extent on how the rest of the system is implemented.
We also want to give users the ability to manipulate these mappings -- after all, if names are no longer globally unique, every user and every machine should feel free to invent any number of names in any way they want, somehow including additional resolution information that may help to disambiguate the name later. We have now empowered random end-users, effectively, to be able to register any number of domain names, for free.
We should ensure that any two identically-named, but different, SmooshNames can be told apart from each other, even if we do not know which one is the one the user wanted. Each one could have a unique 256-bit random bitstring associated with it, for example -- such a long string, if chosen in a cryptographically random fashion, guarantees uniqueness with extremely high probability without requiring coordination to avoid duplication. (Relying on this identifier, rather than on 32-bit IPv4 addresses or even 128-bit IPv6 addresses, lets a SmooshName keep its identity when the information it maps to changes.) So someone can change their IP address (perhaps dynamically, every time they connect, via DHCP), but people who already know their SmooshName aren't affected by this.
The final system thus consists of machines constantly propagating SmooshNames to each other. The pattern of which machines understand the mapping of particular SmooshNames to 256-bit unique numbers (and particular IP addresses) reflects the communications patterns of their users. Duplicated human-readable SmooshNames are only seen by a particular machine if its social network contains two entities which both claim the same name. These are then further disambiguated contextually, and the resulting computation is cached by all who participated in it -- hence the work is done only once, and even SmooshNames that appear to collide can be told apart both by machines (via their 256-bit unique IDs) and by users (by the disambiguating contextual information).
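To make the identifier idea concrete, the following Python sketch generates 256-bit IDs with the standard library's `secrets` module and shows an address changing while the ID stays fixed. The `SmooshName` record, its field names, and the `rebind` and `collision_bound` helpers are hypothetical names invented for this illustration; the addresses are from the ranges reserved for documentation.

```python
import secrets
from dataclasses import dataclass, field


def new_smoosh_id() -> bytes:
    """Return 256 bits from the operating system's cryptographic random source."""
    return secrets.token_bytes(32)


@dataclass
class SmooshName:
    label: str       # human-readable name; duplicates are allowed
    address: str     # whatever the name currently maps to, e.g. an IP address
    uid: bytes = field(default_factory=new_smoosh_id)

    def rebind(self, new_address: str) -> None:
        """Change the mapping; the identity (uid) is deliberately untouched."""
        self.address = new_address


def collision_bound(n_names: int, bits: int = 256) -> float:
    """Birthday upper bound, n(n-1)/2 divided by 2**bits, on any two IDs colliding."""
    return n_names * (n_names - 1) / 2 / 2**bits


# Two different entities may share a label and still be told apart.
first = SmooshName("sally-smith", "192.0.2.1")
second = SmooshName("sally-smith", "198.51.100.7")

# A new address (say, after a DHCP lease change) does not alter identity.
first.rebind("192.0.2.99")
```

For scale, `collision_bound(10**12)` comes out on the order of 1e-54: even a trillion independently chosen IDs are overwhelmingly unlikely to contain a duplicate, which is why no coordinating registry is needed.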
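The contextual disambiguation just described might be sketched as follows. This is only one possible reading, under stated assumptions: context is modeled as a set of free-form tags, the `Entry` and `LocalNames` classes are hypothetical, and a short string stands in for the 256-bit ID.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    label: str            # human-readable name, possibly shared with others
    uid: str              # stands in for the 256-bit unique ID
    context: frozenset    # e.g. frozenset({"boston", "hardware"})


class LocalNames:
    """What one machine has heard about, plus its cached disambiguations."""

    def __init__(self) -> None:
        self.known: list[Entry] = []
        self.decided: dict[tuple, Entry] = {}

    def hear_about(self, entry: Entry) -> None:
        """Learn a new entry and drop cached decisions it might invalidate."""
        if entry not in self.known:
            self.known.append(entry)
            self.decided = {
                key: value for key, value in self.decided.items()
                if key[0] != entry.label
            }

    def resolve(self, label: str, hints: frozenset = frozenset()) -> list[Entry]:
        """Return every known entry with this label whose context covers the hints.

        An empty list means the name is unresolvable here; more than one
        entry means the user needs to supply further context.
        """
        key = (label, hints)
        if key in self.decided:
            return [self.decided[key]]
        matches = [
            e for e in self.known
            if e.label == label and hints <= e.context
        ]
        if len(matches) == 1:
            self.decided[key] = matches[0]    # cache the unambiguous answer
        return matches


names = LocalNames()
names.hear_about(Entry("acme", "id-1", frozenset({"boston", "hardware"})))
names.hear_about(Entry("acme", "id-2", frozenset({"tucson", "anvils"})))
candidates = names.resolve("acme")                           # two entries
chosen = names.resolve("acme", frozenset({"tucson"}))        # just id-2
```

Note that a machine which has only ever heard of one "acme" never sees a collision at all, matching the observation that duplicates appear only when a machine's own social network contains both claimants.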
One way of thinking about the resulting system is by comparison to physical street addresses in the current world. While they are certainly longer than the typically one-word domain names now used, they are also far less likely to be duplicated, and far less likely to engender battles for their control. And those who address physical letters are perfectly willing to write the small additional amount of information required for this physical addressing mechanism to work, and having local computation available means that most users might only have to write down such an address once, rather than every time it is used.
[WFPD], in its section on DNS replacement, has a section of sample scenarios for how this whole system might work for individuals with common names (e.g., Sally Smith), how users can create and propagate new SmooshName entries, and how the system can help support anonymous speech.
The substantive technical exploration of the issues led to the following insights:
Instead, the social consequences of fielding such a system led to intense and sometimes acrimonious debate. For example, at least one participant was vociferous in his objection to any system that would make it more difficult to find Barnes & Noble on the net (to use his example) than typing barnesandnoble.com at any browser in the world. Others pointed out that this example can only work when one has a single mapping to a globally-known brand (and hence is inapplicable to small companies, individuals, and so forth), but such observations were not deemed sufficiently convincing by the complainant. It was theorized, at least when it comes to web browsing -- which itself makes up the vast majority of online activity for many nonprogrammers -- that many users care almost exclusively about such large commercial interests. Hence, they would likewise be reluctant to embrace any naming system that makes it even slightly harder to find them. After all, most users are unaware of the enormous number of smaller entities in existence (because they have never heard of the vast majority of them) and are therefore unlikely to be motivated by a system that purports to make them easier to find.
The problem of coping with established players in the current DNS imbroglio -- squatters, powerful trademark holders, and registrars -- was not extensively investigated during the workshop. In part, this was because the workshop tried (albeit unsuccessfully) to stay more technically-focused and implementation-oriented than the rest of the CFP conference, and in part because it was unclear how such powerful entities could be appeased. The DNS-registration business alone is currently worth many billions (computed simply via multiplying the typical price charged by a registrar times the number of active DNS entries), and major trademark holders also control billion-dollar aggregations of resources.
There are serious implications involved in attempting to explain to users how such a distributed, amorphous, and complicated system might actually function. It was generally agreed that the details of the user interface, as presented to users in fielded implementations, would critically impact user acceptance of the technology and its consequent wide deployment and usability. Studies of apparently simpler systems, such as the PGP application [Whitten], demonstrate that even small details can have large effects on whether users are able to use a cryptographic system correctly; Smoosh is presumably no exception.
A side-discussion about the existing DNS, and whether adding a large number (e.g., hundreds) of gTLDs would help, was inconclusive, with one faction asserting that such a large number could effectively discourage land-grabs, and another faction asserting that large corporations would aggressively pursue any identifiable domain which included one of their trademarks as a substring, no matter how large the namespace examined.
The fact that most users who use the network use it exclusively to browse the web -- and that most of those people use domain names, and not search engines, yellow pages, or other directory services as their primary means of navigation -- caused quite a bit of discussion about names qua naming versus names for navigation. The eventual upshot of the discussion appeared to be that simply solving the problem of names qua naming -- e.g., the DNS overlay idea -- was itself a sufficiently big idea that getting into more sophisticated navigational issues should wait for some other implementation. This certainly seems entirely reasonable from a technical standpoint, but the widespread and consistently-confused discussion during the workshop about naming things versus finding things bodes badly for getting this right in a way that users can understand, barring an unusually lucid and capable -- and widely-deployed -- implementation.
Deployment of the resulting system is an important consideration as well, and one that was briefly addressed by the workshop. The DNS existed long before the Web; as a result, when web browsers were first created, they naturally expected to use the DNS to identify resources, and hence all web browsers understand how to interact with the DNS. What would motivate the major vendors of web-browsing software today -- for all intents and purposes, Microsoft and Netscape/AOL -- to include in their web browsers any software which makes it easier for users to find otherwise-marginalized, small, non-trademark-holding entities on the web? This would seem to be inimical to the interests of both Microsoft and AOL, both of which are building media empires in an attempt to exclude smaller players from the table. Getting these vendors to include such a politically contrary idea as Smoosh -- no matter how well-implemented and no matter how easy for users to understand -- could prove difficult. Whether this is a real problem that requires solving is not known; the discussion in the workshop was speculative.
It is very important to get businesses actively involved in protecting civil liberties where those liberties are threatened by information systems, because most technological means to fix the situation require the sort of widespread deployment and support which is only feasible in a business environment. Yet most business models are directly antithetical to these goals. They make money off data-mining their customers' purchases and either using them internally, or selling them to third parties. They work to reduce their liability and the complexity of doing business through any means necessary, even if that means selling their customers' privacy up the river (e.g., they will not optimize for shielding customers from subpoenas unless the business itself finds answering lots of subpoenas to be costing it too much money). And their customers, for the most part, either do not care, or have little choice because there is great uniformity in this business behavior across many market segments.
How, then, do we motivate businesses to attempt to protect their customers' civil liberties? This is even thornier if their customers don't care, and worse yet if doing so puts them at any sort of competitive disadvantage (either in ease-of-use of the business's product, or profit margin relative to competitors).
A few solutions spring to mind, but they are each incomplete, and the total set is no doubt massively incomplete. The workshop was asked to investigate methods that might be better than strawmen such as:
This session was kicked off by two 5-minute presentations -- the first on analogies from other forms of activism (particularly antinuclear activism) and how they might be used in the context of information privacy, by Philips; and the second on the compatibility of free software and for-profit businesses, drawing examples from Cygnus Software, by its founder, Gilmore.
Gilmore's main point -- that free software and a profitable business were perfectly compatible -- was well-received by the workshop. While this may seem relatively obvious at the turn of the millennium, it was far from obvious when Cygnus was started in the early 90's. Indeed, it was widely believed that it would be impossible to make money in this way, and, as a result, Cygnus had its market segment virtually to itself, since other businesses were reluctant to enter because they had convinced themselves that it would be ruinous to do so.
The description of this part of the workshop included a strawman of a "privacy Chernobyl". This turned out to be a poor choice, because approximately half of the workshop leapt to the conclusion that a reasonable course of action would be to encourage such a Chernobyl, in the hope of then forcing a panicky political response that might lead to governmental regulation. It was generally acknowledged that such a reaction, by nature of being an ill-thought-out emergency response, might well do more harm than good. Discussion continued to return to the idea of encouragement of a Chernobyl despite various assertions to the contrary; obviously, the whole idea is potentially inflammatory and a public debate on privacy which includes such concepts must be carefully framed lest those involved start out with the wrong assumptions.
One major problem, mentioned by several participants, is that the majority of US citizens seem unconcerned about particular violations of civil liberties -- such as various forms of ubiquitous surveillance and detailed, individualized data mining -- and hence are unmotivated to learn about the issues or to take action. Thus, there is little political will for large-scale reforms. It was emphasized by a few that a visceral, motivating incident was generally required to get most people involved, but this then tended to focus on the above idea of a privacy Chernobyl. It was also fairly obvious to all concerned that incentive programs (such as supermarket discount cards) are generally very successful at getting people to give up the privacy of their transactional and purchasing data in exchange for relatively trivial monetary rewards.
It was also observed that occasionally one might be surprised at how willing businesses are to protect people's privacy (for instance), if one manages to point out how they are failing to do so and if it does not negatively affect their profit margin. An example of writing to many OEMs and their resultant change in behavior was cited. On the other hand, a similar example, detailed by someone involved with the relevant organization, showed how apparently innocuous reuse of database fields could accidentally link fields that were intended to be separated and lead to, in this case, the widespread violation of the privacy of thousands of users of a computerized dating service. Here, even though the business initially tried to do the right thing, a careless decision made years after the database was first designed managed to undo its initial good intentions.
This session turned out to have little to say at a deep technical level, and did not expose any novel ways to motivate businesses to change their models to be inherently protective of civil liberties. While the discussion was informative for many present, it suffered from not actually being in the presence of many businesspeople whose behavior might well have been changed by it. This is a continuing problem in education -- many businesspeople are insufficiently aware of what incentives and technologies might exist for them to change their business practices.
We define cash here to mean something very much like paper money. For something to be cash, it must have the properties of:
We are not necessarily talking about a micropayments system, i.e., one that is efficient for very small transactions, such as fractions of a cent. People have been trying to bring cash to the net for years, and the prospects do not look much better now than they did years ago. Efforts have been stymied by at least four major factors:
And yet real cash has numerous advantages of privacy, bounded liability, and the fact that everyone is a "merchant" in the same way that anyone can hand a paper dollar to anyone else without the actions of an intermediary.
One potential solution to the deployment issue, as proposed to the workshop, might be to convince a credit-card company to sell prepaid cash cards. These would work like anonymous phone cards -- one might walk into a 7-11 and trade a $10 bill for a debit card worth $10 (or, perhaps, a tiny bit less, to pay the card provider). If these were efficient for small enough amounts, one might use a new card for each transaction, hence making individual transactions unlinkable. Paper money is used to initialize the cards because it is already anonymous and ubiquitous. This scenario sidesteps thorny issues such as deployment of a cryptographic infrastructure.
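The prepaid-card scenario can be sketched in a few lines of Python. All of the names here (`CardIssuer`, `sell_card`, `spend`) and the 25-cent fee are assumptions made purely for illustration; the proposal itself specifies no interface, and a real issuer would face the problems listed next.

```python
from __future__ import annotations

import secrets


class CardIssuer:
    """Hypothetical issuer of anonymous prepaid cards, with amounts in cents."""

    def __init__(self, fee_cents: int = 25) -> None:
        self.fee_cents = fee_cents              # the "tiny bit less" kept by the issuer
        self.balances: dict[str, int] = {}      # card number -> remaining cents

    def sell_card(self, cash_cents: int) -> str:
        """Trade paper money for a card; no buyer identity is requested or stored."""
        if cash_cents <= self.fee_cents:
            raise ValueError("payment does not cover the issuer's fee")
        number = secrets.token_hex(8)           # random, so numbers reveal no sequence
        self.balances[number] = cash_cents - self.fee_cents
        return number

    def spend(self, number: str, amount_cents: int) -> bool:
        """Debit a card; refuse unknown cards, non-positive amounts, and overdrafts."""
        balance = self.balances.get(number, 0)
        if amount_cents <= 0 or amount_cents > balance:
            return False
        self.balances[number] = balance - amount_cents
        return True


issuer = CardIssuer()
card = issuer.sell_card(1000)      # a $10 bill buys a card worth $9.75
issuer.spend(card, 975)            # one purchase, then the card is discarded
```

Because the only record the issuer holds is a random card number and a balance, two purchases made with two different cards share nothing that links them. Reusing one card across several purchases would reintroduce linkability, which is why the scenario depends on cards being cheap enough to use once.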
What are the problems with this scenario? There are several:
Workshop discussion of this proposal tended to focus on a small number of issues:
While it was understood that no real design or implementation work could possibly happen in such a large group and under such tight time pressures -- such work typically is most successful in small, isolated groups given the time to think carefully about the problem -- nonetheless discussion tended to stay more on policy and less on detailed design issues than had originally been intended. Increasing pressure on participants, via more-aggressive moderation, to avoid unproductive conversational loops and digressions might have helped. In addition, while every effort was made to get participants to prepare extensively beforehand and hence increase the rate at which members came up to speed at the workshop itself, not everyone had the time to do so -- this was a group of busy professionals who, of necessity, had other demands on their time as well.
In retrospect, the workshop should probably have been made available afterwards in RealAudio format, as was done for the plenary sessions of this year's CFP. It was originally assumed that 8 hours of conversation amongst a very large group would be essentially incomprehensible to those who weren't present; given that participants and audience members were quite good at identifying themselves each time they spoke, and given the rather structured nature of the discourse, a RealAudio record would probably have been more useful than was originally supposed. However, the workshop is fairly well-represented by the several sets of notes taken by the volunteer scribes.
While this experimental workshop had high goals and lived up to some but not all of them, it was nonetheless deemed reasonably successful by a large number of its participants and audience members. A similar workshop, addressing perhaps more-tractable design issues and keeping in mind some of the lessons learned this time, would likely be of value.
This workshop could not have happened without the support of Lorrie Cranor, chair of CFP2000, and the support of the rest of the CFP2000 Program Committee. The initial idea, though not the implementation details, of replacing the DNS with a system that did not guarantee unique naming is due to Eric Hughes. The workshop received partial funding from the National Science Foundation. I am also indebted to my co-moderators, the volunteer scribes, the rest of the workshop participants, and the audience members, many of whom had insightful and helpful suggestions.
[Brooks] Frederick Brooks, Jr, The Mythical Man-Month, Addison-Wesley, 1975 and (reprinted with corrections) 1982.
[CFP] The 10th annual Computers, Freedom, and Privacy conference (CFP2000), http://www.cfp2000.org/
[FreeNet] http://freenet.sourceforge.net/
[Gnutella] http://gnutella.wego.com/
[Lessig] Lawrence Lessig, Code and Other Laws of Cyberspace, Basic Books, June 2000.
[Napster] http://www.napster.com/
[RFC] P.V. Mockapetris, RFC1034: Domain names - concepts and facilities and P.V. Mockapetris, RFC1035: Domain names - implementation and specification
[Weinberg] Gerald Weinberg, The Psychology of Computer Programming, Van Nostrand Reinhold Co, 1971.
[Whitten] Alma Whitten, "Why Johnny Can't Encrypt: A Usability Evaluation of PGP 5.0", http://www.cs.cmu.edu/~alma/johnny.pdf
[Woodhead] Robert Woodhead, Selfpromotion.com, http://www.selfpromotion.com/domainfun.t
[WFPD] The CFP2000 Workshop on Freedom and Privacy by Design, http://www.cfp2000.org/workshop/materials/
The following people were participants in the workshop: