Tracking user activity on the Web using methods other than those defined for the purpose by the Web platform (“unsanctioned tracking”) is harmful to the Web, for a variety of reasons. This Finding details the TAG's stance on different forms of tracking, and how they should be addressed.
This document has been produced by the W3C Technical Architecture Group (TAG). The TAG approved this finding at its July 2015 F2F. Please send comments on this finding to the publicly archived TAG mailing list www-tag@w3.org (archive).
When you use the Web, the sites you visit — including advertisements, analytics services, and other included content on them — use various tools to collect information about who you are and what you do on the site. This is very common on the Web; many sites that you browse will share what you do on them with several others — in some cases, dozens.
Collectively, tracking technologies form the basis of common Web features like shopping carts, persistent site preferences, and behavioral advertising, which allows many Web sites to fund themselves.
Some tracking mechanisms are defined by Web standards, and their design takes into account user needs for privacy and control over data flows. One of the best-known and most widespread is cookies [[RFC6265]]. More recently, other mechanisms such as [[webstorage]] have been standardized to complement cookies.
In particular, browsers provide explicit ways for you to limit when standards-defined tracking technologies are used, either directly or with extensions. For example, a privacy-conscious user can choose to use a cookie blocker, or manually delete cookies. As such, the standards-defined tracking technologies are effectively “opt out” — while they are on by default, you remain in control of them, as long as you accept that sites may not work as well (or at all) if you don't allow their use.
Standards-defined tracking mechanisms also have the benefit of transparency. Users can inspect cookies and other locally stored data, and user agents can notify the user that a site has stored data. Tools have been developed that enable users who want to be aware of the tracking of their online activity to document and visualize the use of cookies and tracking pixels; Lightbeam is one example.
In practice, many end users do not understand the details of local storage mechanisms or their use for tracking. However, tracking that is based upon standards is visible enough that researchers, advocates and regulators can use tools to identify and evaluate its privacy-sensitive behavior. This work is important input to building tools that help users manage their privacy appropriately.
However, sites also track user activity outside of these well-defined mechanisms. Examples include:

- Browser fingerprinting — identifying a browser by the unique combination of characteristics it exposes, such as installed fonts and plugins;
- “Supercookies” — storing identifiers in locations that are not cleared when the user clears cookies, so that they can be resurrected later;
- Header enrichment — network intermediaries injecting identifying information into requests as they pass through.
Unlike standards-defined tracking, the operation of these unsanctioned techniques is not defined by Web standards, is not user-visible, and is not under user control. If you use the same browser to visit two different sites, it is technically possible for the sites to identify your browser and correlate your behavior between them (and any other site that they work with). While there are a few legitimate uses of such methods (e.g., combating Denial of Service attacks, or providing greater certainty about user identity for sites such as banks), unsanctioned tracking is often used for purposes that many consider malicious.
There is ample evidence that many sites already use such unsanctioned tracking methods. For more information, see resources like Panopticlick, Evercookie, and FPDetective.
Staying in control of personal data is important to many people, because data about a person — in particular their activity on the Web — can be used to understand how they think, work and live. Users expect that their browsing information will be kept relatively private. This trust, and the user's ability to control their own experience, are fundamental to how the Web works.
Recognizing the importance of this information in monetary terms, the World Economic Forum has classified personal data as “a new asset class” — with the implication that if you are unable to control your data, you are on the losing side of a forced transaction.
Furthermore, tracking users' activity without their consent or knowledge is also a blatant violation of the human right to privacy [[udhr]].
As a result, a growing body of legal, social and technical constraints have developed around the use of standards-based tracking technology on the Web. Because they are well-defined, it is possible to discuss and regulate their use, as well as build tools to understand, visualize and control them.
For example, the EU Cookie Directive regulates the use of cookies in that jurisdiction; browsers have cookie control interfaces and extensions; and researchers can plot how cookies are used on the Web.
Unsanctioned tracking, on the other hand, offers few such affordances; it is difficult (and sometimes impossible) to detect using purely technical means in the browser. It stems not from a well-defined specification, but from the exploitation of certain aspects of how the Web works.
The aggregate effect of unsanctioned tracking is to undermine user trust in the Web itself. Moreover, if browsers cannot isolate activity between sites and offer users control over their data, they are unable to act as trusted agents for the user.
Notably, unsanctioned tracking can be harmful even if non-identifying data is shared, because it provides the linkage among disparate information streams across contextual boundaries. For example, the sharing of an opaque fingerprint among a set of unrelated online purchases can provide enough information to enable advertisers to determine that the user of that browser is pregnant — and hence to target her with pregnancy-specific advertisements even before she has disclosed her pregnancy.
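The linkage problem described above can be sketched as follows: two sites that share no account data can still join their logs on an opaque fingerprint. The records and field names are invented for illustration.

```javascript
// Sketch of cross-context linkage: join two sites' purchase logs on
// a shared opaque fingerprint. Data and field names are invented.
function linkByFingerprint(logA, logB) {
  const byPrint = new Map();
  for (const rec of logA) byPrint.set(rec.print, [rec.item]);
  for (const rec of logB) {
    if (byPrint.has(rec.print)) byPrint.get(rec.print).push(rec.item);
  }
  // Keep only fingerprints that were seen on both sites.
  return [...byPrint.entries()].filter(([, items]) => items.length > 1);
}

const siteA = [{ print: "9f2c", item: "prenatal vitamins" }];
const siteB = [
  { print: "9f2c", item: "unscented lotion" },
  { print: "77b1", item: "garden hose" },
];

// Neither purchase identifies the user on its own; linked together,
// they support a sensitive inference about the user's life.
const linked = linkByFingerprint(siteA, siteB);
```

The fingerprint itself carries no personal information; the harm comes entirely from its ability to correlate activity across contexts the user reasonably expected to be separate.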
We have had numerous discussions throughout the Web community about limiting the browser fingerprinting “surface area” that a browser exposes, by reducing the variability in how browsers behave. In those discussions, we have tried to consider the full span of characteristics about a user, their browser and their activities that may be tracked.
While reducing fingerprinting surface area may mitigate some kinds of unsanctioned tracking, it is inadequate to foil a determined adversary. The variety of documented techniques for browser fingerprinting, from enumerating the extensions installed in the browser to examining exactly how fonts are displayed on screens, continues to increase as new features are developed.
As an extreme example, it has now been shown possible [[spy-sandbox]] to “listen” to the CPU on a computer to detect mouse, network and other activity, using only some JavaScript in a Web page. This information can then be used in the machine fingerprint.
In this environment, it is impractical for specification design to eliminate fingerprinting; not only would such restriction severely hobble the capability of the Web, it would also break a substantial amount of existing content. Moreover, theory confirms that we cannot expect to eliminate these problems on a general-purpose system: From a theoretical perspective, eliminating browser fingerprinting is essentially the same problem as eliminating covert channels [[confinement]].
As a result, we cannot solve the issues that unsanctioned tracking raises through solely technical means. At times, they may be more appropriately addressed through policy (e.g., legislation and/or regulation).
Therefore, the TAG:

- finds that unsanctioned tracking is actively harmful to the Web, because it is not under the control of users and is not transparent;
- believes that, because combating fingerprinting is difficult, new Web specifications should take reasonable measures to avoid adding unneeded fingerprinting surface area — although added surface area should not, on its own, be a reason to reject a new feature;
- encourages browser vendors to expose appropriate controls to users who wish to minimize their fingerprinting surface area; and
- encourages policy makers to be aware that unsanctioned tracking may be most effectively addressed through policy rather than technology alone.
The TAG is happy to provide guidance to community members who need specific advice regarding fingerprinting in their specifications.