New features make the web a stronger and livelier platform. Throughout the feature development process — planning, drafting, and implementing — there are both foreseeable and unexpected security and privacy risks. These risks may arise from the nature of the feature, some of its part(s), or unforeseen interactions with other features and may be mitigated through careful design and consideration of security and privacy design patterns.
Standardizing web features presents unique challenges because the web standardization process inherently requires multiple independent implementations of a feature. As a result, descriptions, protocols, and algorithms need to be specified strictly so that they can be broadly adopted by vendors with large user bases.
A user agent is the user’s agent. Feature authors should keep in mind that if their feature carries privacy risks, user agents may implement it in a compatibility-breaking way to protect user privacy. This is especially likely when features are not clearly defined, or when privacy exposures are identified but left unmitigated (or left to user agents to mitigate): vendors may then implement their own compatibility-breaking mitigations, which puts wide adoption of the feature at risk.
This is why each Working Group needs to consider security and privacy by default. This consideration is mandatory. Furthermore, the impact of a mechanism on privacy should be assessed from the ground up, during each iteration of the specification. The questionnaire and guidance presented here are meant to help authors consider security and privacy in their feature as early as possible, beginning with ideation in Community Groups and continuing through to formalization in Working Groups. The guidance should help authors develop standards with privacy by default, so that the questionnaire can be answered simply and straightforwardly.
Providing adequate and informative answers to the questions, for example by giving context or explaining the reasoning behind certain choices, will also help in the wide security and privacy review performed later on, within the W3C review process and beyond.
This document encourages early review by posing a number of questions that you, as an individual reader, writer, or contributor of a specification, can ask.
This document does not attempt to define what privacy is (in a Web context). Instead, privacy is the sum of what is contained in this document. This may not be exactly what most readers would assume, but privacy is a complicated concept with a rich history that spans many disciplines, and there remains confusion over its meaning.
The audience of this document is general:
The editors and contributors who are responsible for the development of the feature,
The W3C TAG, who receive the questionnaire along with the review request, in line with the W3C Process,
An external audience (developers, designers, etc.) wanting to understand the possible security and privacy implications.
2. How To Use The Questionnaire
Thinking about security and privacy risks and mitigations early in a project is the best approach: it helps ensure the privacy of your feature at an architectural level, and it ensures that the resulting descriptions, protocols, and algorithms incorporate privacy by default rather than relying on possible implementation mitigations.
The Privacy Interest Group (PING) recommends that a feature group review the guidance and questionnaire when first considering their feature, and meet with PING at that time to discuss any questions they have about how the guidance and questionnaire intersect with their feature at a conceptual level. After the feature group has developed their feature with the guidance and questionnaire informing their development process, the group should bring an early draft of their feature specification, including a Privacy Considerations section, to PING for review. From there the feature group should iterate on their design.
When requesting a Technical Architecture Group review, include the completed questionnaire, along with a description of changes or observations made during the design process. This allows external reviewers to understand the rationale, as well as the challenges and evolution of the feature, with respect to security and privacy.
It is understandable that developers may not always have the data needed to see the broader picture and possible implications, for example in relation to other existing web functionality. The answers to the questionnaire are meant as help and input for the people who will nonetheless make security and privacy remarks or carry out the assessment.
3. Threat Models
To consider security and privacy it is convenient to think in terms of threat models, a way to illuminate the possible risks.
There are some concrete privacy concerns that should be considered when developing a feature for the web platform:
Surveillance: Surveillance is the observation or monitoring of an individual’s communications or activities.
Stored Data Compromise: Stored data compromise occurs when end systems do not take adequate measures to secure stored data from unauthorized or inappropriate access.
Intrusion: Intrusion consists of invasive acts that disturb or interrupt one’s life or activities.
Misattribution: Misattribution occurs when data or communications related to one individual are attributed to another.
Correlation: Correlation is the combination of various pieces of information related to an individual or that obtain that characteristic when combined.
Identification: Identification is the linking of information to a particular individual to infer an individual’s identity or to allow the inference of an individual’s identity.
Secondary Use: Secondary use is the use of collected information about an individual without the individual’s consent for a purpose different from that for which the information was collected.
Disclosure: Disclosure is the revelation of information about an individual that affects the way others judge the individual.
Exclusion: Exclusion is the failure to allow individuals to know about the data that others have about them and to participate in its handling and use.
In the mitigations section, this document outlines a number of techniques that can be applied to mitigate these risks.
Enumerated below are some broad classes of threats that should be considered when developing a web feature.
3.1. Passive Network Attackers
A passive network attacker has read-access to the bits going over the wire between users and the servers they’re communicating with. She can’t modify the bytes, but she can collect and analyze them.
Due to the decentralized nature of the internet, and the general level of interest in user activity, it’s reasonable to assume that practically every unencrypted bit that’s bouncing around the network of proxies, routers, and servers you’re using right now is being read by someone. It’s equally likely that some of these attackers are doing their best to understand the encrypted bits as well, including storing encrypted communications for later cryptanalysis (though that requires significantly more effort).
The IETF’s "Pervasive Monitoring Is an Attack" document [RFC7258] is useful reading, outlining some of the impacts on privacy that this assumption entails.
Governments aren’t the only concern; your local coffee shop is likely to be gathering information on its customers, and your ISP at home is likely to be doing the same.
3.2. Active Network Attackers
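An active network attacker has both read- and write-access to the bits going over the wire between users and the servers they’re communicating with, so she can modify traffic in flight as well as observe it. Such modification is not always malicious: ISPs and caching proxies regularly cache and compress images before delivering them to users in an effort to reduce data usage. This can be especially useful for users on low-bandwidth, high-latency devices like phones.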
If your ISP is willing to modify substantial amounts of traffic flowing through it for profit, it’s difficult to believe that state-level attackers will remain passive.
3.3. Same-Origin Policy Violations
The same-origin policy is the cornerstone of security on the web; one origin should not have direct access to another origin’s data (the policy is more formally defined in Section 3 of [RFC6454]). A corollary to this policy is that an origin should not have direct access to data that isn’t associated with any origin: the contents of a user’s hard drive, for instance. Various kinds of attacks bypass this protection in one way or another. For example:
Cross-site scripting attacks involve an attacker tricking an origin into executing attacker-controlled code in the context of a target origin.
Data leakage occurs when bits of information are inadvertently made available cross-origin, either explicitly via CORS headers [CORS], or implicitly, via side-channel attacks like [TIMING].
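The origin comparison that underlies the policy can be sketched as follows. This is a minimal illustration, assuming an origin is modeled as the (scheme, host, port) tuple described in [RFC6454]; the `OriginTuple` type and the `isSameOrigin` function are hypothetical names introduced here, not part of any specification.

```typescript
// Hypothetical model of an origin as a (scheme, host, port) tuple.
interface OriginTuple {
  scheme: string;
  host: string;
  port: number;
}

// Two origins match only if all three components match.
// Scheme and host are compared case-insensitively.
function isSameOrigin(a: OriginTuple, b: OriginTuple): boolean {
  return (
    a.scheme.toLowerCase() === b.scheme.toLowerCase() &&
    a.host.toLowerCase() === b.host.toLowerCase() &&
    a.port === b.port
  );
}
```

Under this sketch, a different scheme, a different subdomain, or a different port each produce a distinct origin, which is why a feature that exposes data across any of those boundaries needs explicit justification.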
3.4. Third-Party Tracking
The simplest example is injecting a link to a site that behaves differently under specific conditions, for example depending on whether or not the user is logged in to that site. This may reveal that the user has an account on the site.
3.5. Legitimate Misuse
Even when powerful features are made available to developers, not every use of them is a good idea or justified; in fact, data privacy regulations around the world may put limits on certain uses of data. In a first-party context, a legitimate website is potentially able to use powerful features to learn about the user’s behavior or habits. For example:
Tracking the user while browsing the website via mechanisms such as mouse move tracking
Behavioral profiling of the user based on the usage patterns
Accessing powerful features that make it possible to reason about the user’s system, the user themselves, or the user’s surroundings, such as a webcam, Web Bluetooth, or sensors
This point is admittedly different from the others. It underlines that even if something is possible, it does not mean it should always be done, and that a privacy impact assessment or even an ethical assessment may need to be considered. When designing a specification with security and privacy in mind, both use and misuse cases should be in scope.
4. Questions to Consider
4.1. How does the specification deal with personal information allowing to single out the user?
Personal data is information relating to individuals. Singling out the user is possible when some unique information or traits are collected: personal data may allow a specific person to be identified, either on its own or in combination with other information. The exact definition of what is considered personal data (or PII) varies around the world and is rapidly evolving, but may include, among other things, a home address, an email address, birthdates, usernames, fingerprints, biometric data, and health information, as well as identifiers such as cookies or IP addresses, preferences, affinities, beliefs, and much more, depending on the context.
If the specification under consideration exposes personal data to the web, it’s important to consider ways to mitigate the impacts. For instance:
A feature which uses biometric data (fingerprints or retina scans) should refuse to expose the raw data to the web, instead using the raw data only to unlock some origin-specific and ephemeral secret and transmitting that secret instead.
Including a factor of user mediation should be considered, in order to ensure that no data is exposed without a user’s explicit choice (and hopefully understanding). One way to achieve this may be the use of the Permissions API [PERMISSIONS], or additional dialogs like those in the Payment Request API [PAYMENT-REQUEST-API].
Users should be made aware if and when their private data may be exposed to the web.
If the specification deals with or introduces identifiers (persistent or not), the process should be documented, along with the nature of the identifier and its lifetime, whether it is limited to a particular origin or not.
4.2. How does this specification deal with high-value data?
form-action values to further mitigate the risk of exfiltration.
4.3. Might this specification introduce new state for an origin that persists across browsing sessions?
Service Workers [SERVICE-WORKERS] intercept all requests made by an origin, allowing sites to function perfectly even when offline. A maliciously-injected service worker, however, would be devastating (as documented in that spec’s security considerations section). The specification mitigates the risks that an active network attacker or XSS vulnerability presents by requiring an encrypted and authenticated connection in order to register a service worker.
Platform-specific DRM implementations might expose origin-specific information in order to help identify users and determine whether they ought to be granted access to a specific piece of media. These kinds of identifiers should be carefully evaluated to determine how abuse can be mitigated; identifiers which a user cannot easily change are very valuable from a tracking perspective, and protecting the identifiers from an active network attacker is an important concern.
Indexed DB, etc. all allow an origin to store information about a user, and retrieve it later, directly or indirectly. User agents mitigate the risk that these kinds of storage mechanisms will form a persistent identifier by offering users the ability to wipe out the data contained in these types of storage.
Fingerprinting is a technique for establishing a unique identifier based on the capabilities of a specific feature alone (e.g. Canvas), or on a combination of features. Specifications and user agents should address the risk of fingerprinting by carefully considering the surface of available information, and the relative differences between software and hardware stacks. Sometimes reducing fingerprintability comes down to specific operational considerations, for example providing information in the same order (e.g. a list of fonts), but sometimes it is not so simple.
Some readout information that is subject to change may still act as a short-term identifier, and possibly introduce a risk of misuse (examples: Leaking Battery, Battery Status Not Included). Good examples are the readout of ambient light level [AMBIENT-LIGHT] or of battery status [BATTERY-STATUS-API].
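The point above about providing information in the same order can be illustrated with a short sketch. It assumes a hypothetical `exposeFontList` helper that a user agent might apply before revealing an installed-font list; the name and the choice of a plain code-unit sort are assumptions made for illustration, and a stable order removes only the entropy carried by ordering, not the entropy of the set itself.

```typescript
// Hypothetical helper: normalizes a font list before it is exposed,
// so that installation order cannot contribute to a fingerprint.
function exposeFontList(installed: string[]): string[] {
  // Drop duplicates without relying on insertion order semantics.
  const unique = installed.filter((font, i) => installed.indexOf(font) === i);
  // Sort with a locale-independent comparison so every user agent
  // with the same fonts reports them in the same order.
  return unique.sort((x, y) => (x < y ? -1 : x > y ? 1 : 0));
}
```

Two devices with the same fonts installed in a different sequence then produce identical output.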
4.4. Does this specification expose persistent, cross-origin state to the web?
The GL_RENDERER string exposed by some WebGL implementations improves performance in some kinds of applications, but does so at the cost of adding persistent state to a user’s fingerprint. These kinds of device-level details should be carefully weighed to ensure that the costs are outweighed by the benefits.
The NavigatorPlugins list exposed via the DOM practically never changes for most users. Some user agents have taken steps to reduce the entropy introduced by disallowing direct enumeration of the plugin list.
Unexpected cases also arise, such as that of [BATTERY-STATUS-API], where the battery level readout made it possible to infer the battery capacity as reported by the operating system.
4.5. Does this specification expose any other data to an origin that it doesn’t currently have access to?
As noted above in §3.3 Same-Origin Policy Violations, the same-origin policy is an important security barrier that new features need to carefully consider. If a specification exposes details about another origin’s state, or allows POST or GET requests to be made to another origin, the consequences can be severe.
Content Security Policy [CSP] unintentionally exposed redirect targets cross-origin by allowing one origin to infer details about another origin through violation reports (see [HOMAKOV]). The working group eventually mitigated the risk by reducing a policy’s granularity after a redirect.
Beacon [BEACON] allows an origin to send POST requests to an endpoint on another origin. The working group decided that this feature didn’t add any new attack surface above and beyond what normal form submission entails, so no extra mitigation was necessary.
4.6. Does this specification enable new script execution/loading mechanisms?
HTML Imports [HTML-IMPORTS] create a new script-loading mechanism, using `link` rather than `script`, which might be easy to overlook when evaluating an application’s attack surface. The working group noted this risk, and ensured that they required reasonable interactions with Content Security Policy’s `script-src` directive.
New string-to-script mechanism? (e.g. `eval()` or `setTimeout([string], ...)`)
What about style?
4.7. Does this specification allow an origin access to a user’s location?
A user’s location is highly-desirable information for a variety of use cases. It is also, understandably, information which many users are reluctant to share, as it can be both highly identifying, and potentially creepy. New features which make use of geolocation information, or which expose it to the web in new ways should carefully consider the ways in which the risks of unfettered access to a user’s location could be mitigated. For instance:
Geolocation information can serve many use cases at a much less granular precision than the user agent can offer. For instance, a restaurant recommendation can be generated by asking for a user’s city-level location rather than a position accurate to the centimeter.
A recent Geofencing proposal [GEOFENCING] ties itself to service workers and therefore to encrypted and authenticated origins.
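The idea of serving a use case at reduced precision can be sketched as a simple rounding step. This is an illustrative example only: `LatLon` and `coarsenPosition` are hypothetical names, and rounding to a fixed number of decimal places is one assumed approach among several (one decimal degree of latitude corresponds to roughly 11 km).

```typescript
// Hypothetical coordinate pair in decimal degrees.
interface LatLon {
  latitude: number;
  longitude: number;
}

// Rounds both coordinates to the given number of decimal places,
// discarding precision the use case does not need.
function coarsenPosition(p: LatLon, decimals: number): LatLon {
  const factor = Math.pow(10, decimals);
  return {
    latitude: Math.round(p.latitude * factor) / factor,
    longitude: Math.round(p.longitude * factor) / factor,
  };
}
```

A specification could let the caller state the precision it needs and have the user agent apply a step like this before any value reaches the origin.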
4.8. Does this specification allow an origin access to sensors on a user’s device?
Powerful features that allow querying information about the user’s system or environment may open interesting new use cases, but they also expand risks and change threat models. Examples of such features include sensors and communication channels such as Bluetooth or USB.
4.9. Does this specification allow an origin access to aspects of a user’s local computing environment?
Features that make it possible to modify or query screen sizes or installed fonts may be useful, but in some contexts they might introduce risks. Additionally, functionality that makes it possible to reason about the user’s computing environment, such as by means of Bluetooth or USB, should be accounted for, whether the identifiers employed are long-term or not.
[AMBIENT-LIGHT] and [GENERIC-SENSORS] have an extensive discussion around the security and privacy risks
4.10. Does this specification allow an origin access to other devices?
Accessing other devices, both via network connections and via direct connection to the user’s machine (e.g. via Bluetooth, NFC, or USB), could expose vulnerabilities: some of these devices were not created with web connectivity in mind and may be inadequately hardened against malicious input or against use on the web.
The Network Service Discovery API [DISCOVERY] recommends CORS preflights before granting access to a device, and requires user agents to involve the user with a permission request of some kind. The spec’s "Security and privacy considerations" section has more details.
Likewise, the Web Bluetooth specification [BLUETOOTH] has an extensive discussion of "Security and privacy considerations", which is worth reading as an example for similar work.
Direct connections might also be used to bypass security checks that other APIs would provide. For example:
Attackers used the WebUSB API to access other sites’ credentials on a hardware security key, bypassing same-origin checks in an early U2F API. [YUBIKEY-ATTACK]
4.11. Does this specification allow an origin some measure of control over a user agent’s native UI?
This concerns interaction with the UI, such as modification of browser display, or even such things as displaying a common pop-up with origin-controlled input, such as Notifications.
4.12. Does this specification expose temporary identifiers to the web?
Some identifiers may be difficult to pinpoint, but they revolve around information that is stable over the short term (seconds, minutes, days) and may be available to web origins, whether across origins or even across browsers.
Ambient Light readout may be perceived in a similar manner
Information such as ConnectionType and rtt exposed by [NETWORK-INFORMATION-API]
4.13. Does this specification distinguish between behavior in first-party and third-party contexts?
Section 2.1 of [FIRST-PARTY-ONLY] defines "first-party" in line with existing browser behavior (Chrome and Firefox).
4.14. How does this specification work in the context of a user agent’s Private Browsing mode?
Some web features behave differently in private browsing modes than in normal browsing modes. This is not always desirable. Features should work in such a way that the website cannot determine that the user is in a private browsing mode. This includes graceful degradation: when a feature stops functioning, it should do so in a way that still does not reveal the browsing mode. Some examples:
Ideally, the feature would work in such a way that the website would not be able to determine that the user was in private browsing mode.
Less ideally, the feature wouldn’t work, but the website still wouldn’t be able to distinguish a private browsing context from simply being denied permission to use the feature (for instance).
Least ideally, the feature wouldn’t exist at all in private browsing mode, which means that the user wouldn’t be exposing data, but the website can probably tell that the user is in that state.
[PAYMENT-REQUEST-API] allowed the detection of “incognito” mode.
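The "less ideal" option above, where a disabled feature is indistinguishable from a denied permission, can be sketched as follows. The `AccessDecision` and `FeatureResult` types and the `useFeature` function are hypothetical, and the error name is chosen only for illustration; the point is that both failure paths return exactly the same observable result.

```typescript
// Hypothetical stored permission decision for an origin.
type AccessDecision = "granted" | "denied";

// Hypothetical result shape observed by the website.
interface FeatureResult {
  ok: boolean;
  error?: string;
  data?: string;
}

function useFeature(decision: AccessDecision, privateMode: boolean): FeatureResult {
  // Private browsing and an ordinary denial take the same branch,
  // so the website cannot tell the two states apart from the result.
  if (privateMode || decision === "denied") {
    return { ok: false, error: "NotAllowedError" };
  }
  return { ok: true, data: "feature data" };
}
```

A real design would also need to keep timing and any side effects identical across the two paths, which this sketch does not model.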
4.15. Does this specification persist data to a user’s local device?
User agents are able to store persistent data to user’s devices. The simplest examples are browsing history, cookies, or localStorage.
How should user agent’s "Clear browsing data" functionality work with this data? Are there caches that the user agent needs to be particularly careful with?
4.16. Does this specification have a "Security Considerations" and "Privacy Considerations" section?
Documenting the various concerns and potential abuses in "Security Considerations" and "Privacy Considerations" sections of a document is a good way to help implementers and web developers understand the risks that a feature presents, and to ensure that adequate mitigations are in place. If it seems like a feature does not have security or privacy impacts, then say so inline in the spec section for that feature:
There are no known security or privacy impacts of this feature.
Saying so explicitly in the specification serves several purposes:
- Shows that a spec author/editor has explicitly considered security and privacy when designing a feature.
- Provides some sense of confidence that there might be no such impacts.
- Challenges security and privacy minded individuals to think of and find even the potential for such impacts.
- Demonstrates the spec author/editor’s receptivity to feedback about such impacts.
- Demonstrates a desire that the specification should not be introducing security and privacy issues
When saying this, however, the crucial aspect is to actually consider security and privacy. All new specifications must have security and privacy considerations sections to be considered for wide review. Interesting features added to the web platform have generally had security and/or privacy impacts.
4.17. Does this specification allow downgrading default security characteristics?
4.18. Does this specification allow the persistent monitoring of user behavior?
Users on the web can be monitored by a variety of means, for example using access to sensors that potentially reveal mobility patterns (e.g. accelerometer, light sensor, Web Bluetooth), or directly on the web, such as by monitoring mouse movements, which are biometric data conveying rich information.
5. Mitigation Strategies
5.1. Secure Contexts
In the presence of an active network attacker, offering a feature to an insecure origin is the same as offering that feature to every origin (as the attacker can inject frames and code at will). Requiring an encrypted and authenticated connection in order to use a feature can mitigate this kind of risk.
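A gate of this kind can be sketched as below. This is a deliberately simplified approximation and not the algorithm a user agent actually runs: `OriginTuple`, `isLikelySecureOrigin`, and `featureAvailability` are hypothetical names, and the check covers only encrypted schemes and loopback hosts. In a browser, page script can read the real result from `window.isSecureContext`.

```typescript
// Hypothetical model of an origin as a (scheme, host, port) tuple.
interface OriginTuple {
  scheme: string;
  host: string;
  port: number;
}

// Simplified assumption: encrypted schemes and loopback hosts are
// treated as secure; everything else is not.
function isLikelySecureOrigin(o: OriginTuple): boolean {
  const scheme = o.scheme.toLowerCase();
  if (scheme === "https" || scheme === "wss") return true;
  const host = o.host.toLowerCase();
  return host === "localhost" || host === "127.0.0.1" || host === "[::1]";
}

// The feature is simply absent for origins that fail the check.
function featureAvailability(o: OriginTuple): "available" | "unavailable" {
  return isLikelySecureOrigin(o) ? "available" : "unavailable";
}
```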
5.2. Explicit user mediation
If a feature has privacy or security impacts that are endemic to the feature itself, then one valid strategy for exposing it to the web is to require user mediation before granting an origin access. For instance, [GEOLOCATION-API] reveals a user’s location, and wouldn’t be particularly useful if it didn’t; user agents generally gate access to the feature on a permission prompt which the user may choose to accept.
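The gating logic described above can be sketched as a small decision function. Everything here is hypothetical: `StoredDecision`, `AccessOutcome`, and `resolveAccess` are illustrative names, and the prompt is passed in as a callback (`askUser`) so that the sketch stays independent of any particular user interface.

```typescript
// Hypothetical record of what the user previously chose for an origin.
type StoredDecision = "granted" | "denied" | "unset";

interface AccessOutcome {
  allowed: boolean;
  prompted: boolean;
}

// A remembered choice is honored without prompting again; otherwise
// the user is asked, and nothing is exposed unless they accept.
function resolveAccess(stored: StoredDecision, askUser: () => boolean): AccessOutcome {
  if (stored === "granted") return { allowed: true, prompted: false };
  if (stored === "denied") return { allowed: false, prompted: false };
  return { allowed: askUser(), prompted: true };
}
```

The default path is denial: access is granted only through an earlier explicit grant or an affirmative answer to the prompt.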
Designing such prompts is difficult. Choosers are good. Walls of text are bad.
Bring in some of felt@'s ideas here.
5.3. Drop the feature
The simplest way to mitigate the potential negative security or privacy impacts of a feature is to drop the feature, or at least to discuss the possibility of doing so. Every feature in a spec should be considered guilty (of harming security and/or privacy) until proven otherwise.
Every specification should seek to be as small as possible, even if only for the reasons of reducing and minimizing security/privacy attack surface(s).
By doing so we can reduce the overall security (and privacy) attack surface of not only a particular feature, but of a module (related set of features), a specification, and the overall web platform.
It is always a good strategy to consider the kinds of data a new feature is processing. For example, new features allowing the readout of data may want to adopt specific privacy strategies such as reducing the precision of the data (quantization) or reducing the frequency of readouts, in line with standard privacy engineering practices. Examples:
[BATTERY-STATUS-API] “The user agent should not expose high precision readouts”
[SENSORS-API] “Limit maximum sampling frequency”, “Reduce accuracy”
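The two strategies named above, quantization and frequency reduction, can be sketched as follows. The `quantize` function and `SampleThrottle` class are hypothetical illustrations, and the step count and interval are arbitrary values a specification would need to choose for its own data.

```typescript
// Rounds a value in [0, 1] to one of `steps` evenly spaced levels,
// e.g. steps = 20 reports a battery level in 5% increments.
function quantize(value: number, steps: number): number {
  return Math.round(value * steps) / steps;
}

// Drops samples that arrive sooner than `minIntervalMs` after the
// last sample that was let through.
class SampleThrottle {
  private lastEmitted: number | null = null;

  constructor(private readonly minIntervalMs: number) {}

  shouldEmit(nowMs: number): boolean {
    if (this.lastEmitted !== null && nowMs - this.lastEmitted < this.minIntervalMs) {
      return false;
    }
    this.lastEmitted = nowMs;
    return true;
  }
}
```

Timestamps are passed in explicitly rather than read from a clock, which keeps the sketch deterministic and easy to reason about.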
Some features potentially supply very sensitive data, and it is the responsibility of the end developer, system owner, or manager to realize this and act accordingly in the design of their system. Some uses may warrant conducting a privacy impact assessment, especially when data relating to individuals may be processed. Examples:
[GENERIC-SENSORS] advises considering a privacy impact assessment.
6. How to Use the questionnaire
To ensure good designs, security and privacy should be considered as early as possible. This questionnaire facilitates this: the questions should be considered early in the specification development process and kept in mind as the specification matures, with the answers updated as it evolves. This questionnaire should not be used as a “check box” exercise before requesting final publication; acting in this manner does not help improve privacy or security on the Web. Each question needs to be considered, and any privacy or security concerns need to be described, along with a possible mitigation strategy. A one-word answer (“yes” / “no”) is not a good approach; an explanatory description is expected. The questions in the questionnaire are more about “why” and “how” than about “if”.
The questionnaire is expected to be filled in before a specification obtains W3C Working Draft status, and before a review is requested, in keeping with Privacy by Design principles. The questionnaire and its answers should not be included in the specification itself. It is preferable to keep them in a standard and easily available place, with a link available in the TAG repository.