diff --git a/pep-0xxx.rst b/pep-0xxx.rst new file mode 100644 index 00000000000..57ea8046867 --- /dev/null +++ b/pep-0xxx.rst @@ -0,0 +1,1566 @@ +PEP: XXX +Title: A Unified TLS API for Python +Version: $Revision$ +Last-Modified: $Date$ +Author: Cory Benfield , + Christian Heimes +Status: Draft +Type: Standards Track +Content-Type: text/x-rst +Created: 17-Oct-2016 +Python-Version: 3.7 +Post-History: 11-Jan-2017, 19-Jan-2017, 02-Feb-2017, 09-Feb-2017 + + +Abstract +======== + +This PEP would define a standard TLS interface in the form of a collection of +abstract base classes. This interface would allow Python implementations and +third-party libraries to provide bindings to TLS libraries other than OpenSSL +that can be used by tools that expect the interface provided by the Python +standard library, with the goal of reducing the dependence of the Python +ecosystem on OpenSSL. + + +Rationale +========= + +In the 21st century it has become increasingly clear that robust and +user-friendly TLS support is an extremely important part of the ecosystem of +any popular programming language. For most of its lifetime, this role in the +Python ecosystem has primarily been served by the `ssl module`_, which provides +a Python API to the `OpenSSL library`_. + +Because the ``ssl`` module is distributed with the Python standard library, it +has become the overwhelmingly most-popular method for handling TLS in Python. +An extraordinary majority of Python libraries, both in the standard library and +on the Python Package Index, rely on the ``ssl`` module for their TLS +connectivity. + +Unfortunately, the preeminence of the ``ssl`` module has had a number of +unforeseen side-effects that have had the effect of tying the entire Python +ecosystem tightly to OpenSSL. 
This has forced Python users to use OpenSSL even +in situations where it may provide a worse user experience than alternative TLS +implementations, which imposes a cognitive burden and makes it hard to provide +"platform-native" experiences. + + +Problems +-------- + +The fact that the ``ssl`` module is built into the standard library has meant +that all standard-library Python networking libraries are entirely reliant on +the OpenSSL that the Python implementation has been linked against. This +leads to the following issues: + +* It is difficult to take advantage of new, higher-security TLS without + recompiling Python to get a new OpenSSL. While there are third-party bindings + to OpenSSL (e.g. `pyOpenSSL`_), these need to be shimmed into a format that + the standard library understands, forcing projects that want to use them to + maintain substantial compatibility layers. + +* For Windows distributions of Python, they need to be shipped with a copy of + OpenSSL. This puts the CPython development team in the position of being + OpenSSL redistributors, potentially needing to ship security updates to the + Windows Python distributions when OpenSSL vulnerabilities are released. + +* For macOS distributions of Python, they need either to be shipped with a copy + of OpenSSL or linked against the system OpenSSL library. Apple has formally + deprecated linking against the system OpenSSL library, and even if they had + not, that library version has been unsupported by upstream for nearly one + year as of the time of writing. The CPython development team has started + shipping newer OpenSSLs with the Python available from python.org, but this + has the same problem as with Windows. + +* Many systems, including but not limited to Windows and macOS, do not make + their system certificate stores available to OpenSSL. This forces users to + either obtain their trust roots from elsewhere (e.g. `certifi`_) or to + attempt to export their system trust stores in some form. 
+ + Relying on `certifi`_ is less than ideal, as most system administrators do + not expect to receive security-critical software updates from PyPI. + Additionally, it is not easy to extend the `certifi`_ trust bundle to include + custom roots, or to centrally manage trust using the `certifi`_ model. + + Even in situations where the system certificate stores are made available to + OpenSSL in some form, the experience is still sub-standard, as OpenSSL will + perform different validation checks than the platform-native TLS + implementation. This can lead to users experiencing different behaviour on + their browsers or other platform-native tools than they experience in Python, + with little or no recourse to resolve the problem. + +* Users may wish to integrate with TLS libraries other than OpenSSL for many + other reasons, such as OpenSSL missing features (e.g. TLS 1.3 support), or + because OpenSSL is simply too large and unwieldy for the platform (e.g. for + embedded Python). Those users are left with the requirement to use + third-party networking libraries that can interact with their preferred TLS + library or to shim their preferred library into the OpenSSL-specific ``ssl`` + module API. + +Additionally, the ``ssl`` module as implemented today limits the ability of +CPython itself to add support for alternative TLS backends, or remove OpenSSL +support entirely, should either of these become necessary or useful. The +``ssl`` module exposes too many OpenSSL-specific function calls and features to +easily map to an alternative TLS backend. + + +Proposal +======== + +This PEP proposes to introduce a few new Abstract Base Classes in Python 3.7 to +provide TLS functionality that is not so strongly tied to OpenSSL. It also +proposes to update standard library modules to use only the interface exposed +by these abstract base classes wherever possible. There are three goals here: + +1. 
To provide a common API surface for both core and third-party developers to + target their TLS implementations to. This allows TLS developers to provide + interfaces that can be used by most Python code, and allows network + developers to have an interface that they can target that will work with a + wide range of TLS implementations. +2. To provide an API that has few or no OpenSSL-specific concepts leak through. + The ``ssl`` module today has a number of warts caused by leaking OpenSSL + concepts through to the API: the new ABCs would remove those specific + concepts. +3. To provide a path for the core development team to make OpenSSL one of many + possible TLS backends, rather than requiring that it be present on a system + in order for Python to have TLS support. + +The proposed interface is laid out below. + + +Interfaces +---------- + +There are several interfaces that require standardisation. Those interfaces +are: + +1. Configuring TLS, currently implemented by the `SSLContext`_ class in the + ``ssl`` module. +2. Providing an in-memory buffer for doing in-memory encryption or decryption + with no actual I/O (necessary for asynchronous I/O models), currently + implemented by the `SSLObject`_ class in the ``ssl`` module. +3. Wrapping a socket object, currently implemented by the `SSLSocket`_ class + in the ``ssl`` module. +4. Applying TLS configuration to the wrapping objects in (2) and (3). Currently + this is also implemented by the `SSLContext`_ class in the ``ssl`` module. +5. Specifying TLS cipher suites. There is currently no code for doing this in + the standard library: instead, the standard library uses OpenSSL cipher + suite strings. +6. Specifying application-layer protocols that can be negotiated during the + TLS handshake. +7. Specifying TLS versions. +8. Reporting errors to the caller, currently implemented by the `SSLError`_ + class in the ``ssl`` module. +9. Specifying certificates to load, either as client or server certificates. +10. 
Specifying which trust database should be used to validate certificates + presented by a remote peer. +11. Finding a way to get hold of these interfaces at run time. + +For the sake of simplicity, this PEP proposes to take a unified approach to +(2) and (3) (that is, buffers and sockets). The Python socket API is a +sizeable one, and implementing a wrapped socket that has the same behaviour as +a regular Python socket is a subtle and tricky thing to do. However, it is +entirely possible to implement a *generic* wrapped socket in terms of wrapped +buffers: that is, it is possible to write a wrapped socket (3) that will work +for any implementation that provides (2). For this reason, this PEP proposes to +provide an ABC for wrapped buffers (2) but a concrete class for wrapped sockets +(3). + +This decision has the effect of making it impossible to bind a small number of +TLS libraries to this ABC, because those TLS libraries *cannot* provide a +wrapped buffer implementation. The most notable of these at this time appears +to be Amazon's `s2n`_, which currently does not provide an I/O abstraction +layer. However, even this library's maintainers consider this a missing +feature and are `working to add it`_. For this reason, it is safe to assume that a concrete +implementation of (3) in terms of (2) will be a substantial effort-saving +device and a great tool for correctness. Therefore, this PEP proposes doing +just that. + +Obviously, (5) doesn't require an abstract base class: instead, it requires a +richer API for configuring supported cipher suites that can be easily updated +with supported cipher suites for different implementations. + +(9) is a thorny problem, because in an ideal world the private keys associated +with these certificates would never end up in-memory in the Python process +(that is, the TLS library would collaborate with a Hardware Security Module +(HSM) to provide the private key in such a way that it cannot be extracted from +process memory). 
Thus, we need to provide an extensible model of providing +certificates that allows concrete implementations the ability to provide this +higher level of security, while also allowing a lower bar for those +implementations that cannot. This lower bar would be the same as the status +quo: that is, the certificate may be loaded from an in-memory buffer or from a +file on disk. + +(10) also represents an issue because different TLS implementations vary wildly +in how they allow users to select trust stores. Some implementations have +specific trust store formats that only they can use (such as the OpenSSL CA +directory format that is created by ``c_rehash``), and others may not allow you +to specify a trust store that does not include their default trust store. + +For this reason, we need to provide a model that assumes very little about the +form that trust stores take. The "Trust Store" section below goes into more +detail about how this is achieved. + +Finally, this API will split the responsibilities currently assumed by the +`SSLContext`_ object: specifically, the responsibility for holding and managing +configuration and the responsibility for using that configuration to build +wrapper objects. + +This is necessary primarily to support functionality like Server Name +Indication (SNI). In OpenSSL (and thus in the ``ssl`` module), the server has +the ability to modify the TLS configuration in response to the client telling +the server what hostname it is trying to reach. This is mostly used to change +the certificate chain so as to present the correct TLS certificate chain for the +given hostname. The specific mechanism by which this is done is by returning +a new `SSLContext`_ object with the appropriate configuration. + +This is not a model that maps well to other TLS implementations. Instead, we +need to make it possible to provide a return value from the SNI callback that +can be used to indicate what configuration changes should be made. 
This means +providing an object that can hold TLS configuration. This object needs to be +applied to specific ``TLSWrappedBuffer`` and ``TLSWrappedSocket`` objects. + +For this reason, we split the responsibility of `SSLContext`_ into two separate +objects. The ``TLSConfiguration`` object is an object that acts as a container +for TLS configuration: the ``ClientContext`` and ``ServerContext`` objects are +objects that are instantiated with a ``TLSConfiguration`` object. All three +objects would be immutable. + +.. note:: The following API declarations uniformly use type hints to aid + reading. Some of these type hints cannot actually be used in practice + because they are circularly referential. Consider them more a + guideline than a reflection of the final code in the module. + +Configuration +~~~~~~~~~~~~~ + +The ``TLSConfiguration`` concrete class defines an object that can hold and +manage TLS configuration. The goals of this class are as follows: + +1. To provide a method of specifying TLS configuration that avoids the risk of + errors in typing (this excludes the use of a simple dictionary). +2. To provide an object that can be safely compared to other configuration + objects to detect changes in TLS configuration, for use with the SNI + callback. + +This class is not an ABC, primarily because it is not expected to have +implementation-specific behaviour. The responsibility for transforming a +``TLSConfiguration`` object into a useful set of configuration for a given TLS +implementation belongs to the Context objects discussed below. + +This class has one other notable property: it is immutable. This is a desirable +trait for a few reasons. The most important one is that it allows these objects +to be used as dictionary keys, which is potentially extremely valuable for +certain TLS backends and their SNI configuration. 
On top of this, it frees +implementations from needing to worry about their configuration objects being +changed under their feet, which allows them to avoid needing to carefully +synchronize changes between their concrete data structures and the +configuration object. + +This object is extendable: that is, future releases of Python may add +configuration fields to this object as they become useful. For +backwards-compatibility purposes, new fields are only appended to this object. +Existing fields will never be removed, renamed, or reordered. + +The ``TLSConfiguration`` object would be defined by the following code:: + + ServerNameCallback = Callable[[TLSBufferObject, Optional[str], TLSConfiguration], Any] + + + _configuration_fields = [ + 'validate_certificates', + 'certificate_chain', + 'ciphers', + 'inner_protocols', + 'lowest_supported_version', + 'highest_supported_version', + 'trust_store', + 'sni_callback', + ] + + + _DEFAULT_VALUE = object() + + + class TLSConfiguration(namedtuple('TLSConfiguration', _configuration_fields)): + """ + An immutable TLS Configuration object. This object has the following + properties: + + :param validate_certificates bool: Whether to validate the TLS + certificates. This switch operates at a very broad scope: either + validation is enabled, in which case all forms of validation are + performed including hostname validation if possible, or validation + is disabled, in which case no validation is performed. + + Not all backends support having their certificate validation + disabled. If a backend does not support having their certificate + validation disabled, attempting to set this property to ``False`` + will throw a ``TLSError`` when this object is passed into a + context object. + + :param certificate_chain Tuple[Tuple[Certificate],PrivateKey]: The + certificate, intermediate certificate, and the corresponding + private key for the leaf certificate. 
These certificates will be + offered to the remote peer during the handshake if required. + + The first Certificate in the list must be the leaf certificate. All + subsequent certificates will be offered as intermediate additional + certificates. + + :param ciphers Tuple[Union[CipherSuite, int]]: + The available ciphers for TLS connections created with this + configuration, in priority order. + + :param inner_protocols Tuple[Union[NextProtocol, bytes]]: + Protocols that connections created with this configuration should + advertise as supported during the TLS handshake. These may be + advertised using either or both of ALPN or NPN. This list of + protocols should be ordered by preference. + + :param lowest_supported_version TLSVersion: + The minimum version of TLS that should be allowed on TLS + connections using this configuration. + + :param highest_supported_version TLSVersion: + The maximum version of TLS that should be allowed on TLS + connections using this configuration. + + :param trust_store TrustStore: + The trust store that connections using this configuration will use + to validate certificates. + + :param sni_callback Optional[ServerNameCallback]: + A callback function that will be called after the TLS Client Hello + handshake message has been received by the TLS server when the TLS + client specifies a server name indication. + + Only one callback can be set per ``TLSConfiguration``. If the + ``sni_callback`` is ``None`` then the callback is disabled. If the + ``TLSConfiguration`` is used for a ``ClientContext`` then this + setting will be ignored. 
+ + The ``callback`` function will be called with three arguments: the + first will be the ``TLSBufferObject`` for the connection; the + second will be a string that represents the server name that the + client is intending to communicate (or ``None`` if the TLS Client + Hello does not contain a server name); and the third argument will + be the original ``TLSConfiguration`` that configured the + connection. The server name argument will be the IDNA *decoded* + server name. + + The ``callback`` must return a ``TLSConfiguration`` to allow + negotiation to continue. Other return values signal errors. + Attempting to control what error is signaled by the underlying TLS + implementation is not specified in this API, but is up to the + concrete implementation to handle. + + The Context will do its best to apply the ``TLSConfiguration`` + changes from its original configuration to the incoming connection. + This will usually include changing the certificate chain, but may + also include changes to allowable ciphers or any other + configuration settings. 
+ """ + __slots__ = () + + def __new__(cls, validate_certificates: Optional[bool] = None, + certificate_chain: Optional[Tuple[Tuple[Certificate], PrivateKey]] = None, + ciphers: Optional[Tuple[Union[CipherSuite, int]]] = None, + inner_protocols: Optional[Tuple[Union[NextProtocol, bytes]]] = None, + lowest_supported_version: Optional[TLSVersion] = None, + highest_supported_version: Optional[TLSVersion] = None, + trust_store: Optional[TrustStore] = None, + sni_callback: Optional[ServerNameCallback] = None): + + if validate_certificates is None: + validate_certificates = True + + if ciphers is None: + ciphers = DEFAULT_CIPHER_LIST + + if inner_protocols is None: + # Use a tuple, not a list, so the configuration stays hashable. + inner_protocols = () + + if lowest_supported_version is None: + lowest_supported_version = TLSVersion.TLSv1 + + if highest_supported_version is None: + highest_supported_version = TLSVersion.MAXIMUM_SUPPORTED + + return super().__new__( + cls, validate_certificates, certificate_chain, ciphers, + inner_protocols, lowest_supported_version, + highest_supported_version, trust_store, sni_callback + ) + + def update(self, validate_certificates=_DEFAULT_VALUE, + certificate_chain=_DEFAULT_VALUE, + ciphers=_DEFAULT_VALUE, + inner_protocols=_DEFAULT_VALUE, + lowest_supported_version=_DEFAULT_VALUE, + highest_supported_version=_DEFAULT_VALUE, + trust_store=_DEFAULT_VALUE, + sni_callback=_DEFAULT_VALUE): + """ + Create a new ``TLSConfiguration``, overriding some of the settings + on the original configuration with the new settings. 
+ """ + if validate_certificates is _DEFAULT_VALUE: + validate_certificates = self.validate_certificates + + if certificate_chain is _DEFAULT_VALUE: + certificate_chain = self.certificate_chain + + if ciphers is _DEFAULT_VALUE: + ciphers = self.ciphers + + if inner_protocols is _DEFAULT_VALUE: + inner_protocols = self.inner_protocols + + if lowest_supported_version is _DEFAULT_VALUE: + lowest_supported_version = self.lowest_supported_version + + if highest_supported_version is _DEFAULT_VALUE: + highest_supported_version = self.highest_supported_version + + if trust_store is _DEFAULT_VALUE: + trust_store = self.trust_store + + if sni_callback is _DEFAULT_VALUE: + sni_callback = self.sni_callback + + return self.__class__( + validate_certificates, certificate_chain, ciphers, + inner_protocols, lowest_supported_version, + highest_supported_version, trust_store, sni_callback + ) + + + +Context +~~~~~~~ + +We define two Context abstract base classes. These ABCs define objects that +allow configuration of TLS to be applied to specific connections. They can be +thought of as factories for ``TLSWrappedSocket`` and ``TLSWrappedBuffer`` +objects. + +Unlike the current ``ssl`` module, we provide two context classes instead of +one. Specifically, we provide the ``ClientContext`` and ``ServerContext`` +classes. This simplifies the APIs (for example, there is no sense in the server +providing the ``server_hostname`` parameter to ``ssl.SSLContext.wrap_socket``, +but because there is only one context class that parameter is still available), +and ensures that implementations know as early as possible which side of a TLS +connection they will serve. Additionally, it allows implementations to opt-out +of one or either side of the connection. For example, SecureTransport on macOS +is not really intended for server use and has an enormous amount of +functionality missing for server-side use. 
This would allow SecureTransport +implementations to simply not define a concrete subclass of ``ServerContext`` +to signal their lack of support. + +One of the other major differences to the current ``ssl`` module is that a +number of flags and options have been removed. Most of these are self-evident, +but it is worth noting that ``auto_handshake`` has been removed from +``wrap_socket``. This was removed because it fundamentally represents an odd +design wart that saves very minimal effort at the cost of a complexity increase +both for users and implementers. This PEP requires that all users call +``do_handshake`` explicitly after connecting. + +As much as possible implementers should aim to make these classes immutable: +that is, they should prefer not to allow users to mutate their internal state +directly, instead preferring to create new contexts from new TLSConfiguration +objects. Obviously, the ABCs cannot enforce this constraint, and so they do not +attempt to. + +The ``Context`` abstract base class has the following class definition:: + + TLSBufferObject = Union[TLSWrappedSocket, TLSWrappedBuffer] + + + class _BaseContext(metaclass=ABCMeta): + @abstractmethod + def __init__(self, configuration: TLSConfiguration): + """ + Create a new context object from a given TLS configuration. + """ + + @property + @abstractmethod + def configuration(self) -> TLSConfiguration: + """ + Returns the TLS configuration that was used to create the context. + """ + + + class ClientContext(_BaseContext): + def wrap_socket(self, + socket: socket.socket, + server_hostname: Optional[str]) -> TLSWrappedSocket: + """ + Wrap an existing Python socket object ``socket`` and return a + ``TLSWrappedSocket`` object. ``socket`` must be a ``SOCK_STREAM`` + socket: all other socket types are unsupported. + + The returned SSL socket is tied to the context, its settings and + certificates. 
The socket object originally passed to this method + should not be used again: attempting to use it in any way will lead + to undefined behaviour, especially across different TLS + implementations. To get the original socket object back once it has + been wrapped in TLS, see the ``unwrap`` method of the + TLSWrappedSocket. + + The parameter ``server_hostname`` specifies the hostname of the + service which we are connecting to. This allows a single server to + host multiple SSL-based services with distinct certificates, quite + similarly to HTTP virtual hosts. This is also used to validate the + TLS certificate for the given hostname. If hostname validation is + not desired, then pass ``None`` for this parameter. This parameter + has no default value because opting-out of hostname validation is + dangerous, and should not be the default behaviour. + """ + buffer = self.wrap_buffers(server_hostname) + return TLSWrappedSocket(socket, buffer) + + @abstractmethod + def wrap_buffers(self, server_hostname: Optional[str]) -> TLSWrappedBuffer: + """ + Create an in-memory stream for TLS, using memory buffers to store + incoming and outgoing ciphertext. The TLS routines will read + received TLS data from one buffer, and write TLS data that needs to + be emitted to another buffer. + + The implementation details of how this buffering works are up to + the individual TLS implementation. This allows TLS libraries that + have their own specialised support to continue to do so, while + allowing those without to use whatever Python objects they see fit. + + The ``server_hostname`` parameter has the same meaning as in + ``wrap_socket``. + """ + + + class ServerContext(_BaseContext): + def wrap_socket(self, socket: socket.socket) -> TLSWrappedSocket: + """ + Wrap an existing Python socket object ``socket`` and return a + ``TLSWrappedSocket`` object. ``socket`` must be a ``SOCK_STREAM`` + socket: all other socket types are unsupported. 
+ + The returned SSL socket is tied to the context, its settings and + certificates. The socket object originally passed to this method + should not be used again: attempting to use it in any way will lead + to undefined behaviour, especially across different TLS + implementations. To get the original socket object back once it has + been wrapped in TLS, see the ``unwrap`` method of the + TLSWrappedSocket. + """ + buffer = self.wrap_buffers() + return TLSWrappedSocket(socket, buffer) + + @abstractmethod + def wrap_buffers(self) -> TLSWrappedBuffer: + """ + Create an in-memory stream for TLS, using memory buffers to store + incoming and outgoing ciphertext. The TLS routines will read + received TLS data from one buffer, and write TLS data that needs to + be emitted to another buffer. + + The implementation details of how this buffering works are up to + the individual TLS implementation. This allows TLS libraries that + have their own specialised support to continue to do so, while + allowing those without to use whatever Python objects they see fit. + """ + + +Buffer +~~~~~~ + +The buffer-wrapper ABC will be defined by the ``TLSWrappedBuffer`` ABC, which +has the following definition:: + + class TLSWrappedBuffer(metaclass=ABCMeta): + @abstractmethod + def read(self, amt: int) -> bytes: + """ + Read up to ``amt`` bytes of data from the input buffer and return + the result as a ``bytes`` instance. + + Once EOF is reached, all further calls to this method return the + empty byte string ``b''``. + + May read "short": that is, fewer bytes may be returned than were + requested. + + Raise ``WantReadError`` or ``WantWriteError`` if there is + insufficient data in either the input or output buffer and the + operation would have caused data to be written or read. + + May raise ``RaggedEOF`` if the connection has been closed without a + graceful TLS shutdown. Whether this is an exception that should be + ignored or not is up to the specific application. 
+ + As at any time a re-negotiation is possible, a call to ``read()`` + can also cause write operations. + """ + + @abstractmethod + def readinto(self, buffer: Any, amt: int) -> int: + """ + Read up to ``amt`` bytes of data from the input buffer into + ``buffer``, which must be an object that implements the buffer + protocol. Returns the number of bytes read. + + Once EOF is reached, all further calls to this method return + ``0``. + + Raise ``WantReadError`` or ``WantWriteError`` if there is + insufficient data in either the input or output buffer and the + operation would have caused data to be written or read. + + May read "short": that is, fewer bytes may be read than were + requested. + + May raise ``RaggedEOF`` if the connection has been closed without a + graceful TLS shutdown. Whether this is an exception that should be + ignored or not is up to the specific application. + + As at any time a re-negotiation is possible, a call to + ``readinto()`` can also cause write operations. + """ + + @abstractmethod + def write(self, buf: Any) -> int: + """ + Write ``buf`` in encrypted form to the output buffer and return the + number of bytes written. The ``buf`` argument must be an object + supporting the buffer interface. + + Raise ``WantReadError`` or ``WantWriteError`` if there is + insufficient data in either the input or output buffer and the + operation would have caused data to be written or read. In either + case, users should endeavour to resolve that situation and then + re-call this method. When re-calling this method users *should* + re-use the exact same ``buf`` object, as some backends require that + the exact same buffer be used. + + This operation may write "short": that is, fewer bytes may be + written than were in the buffer. + + As at any time a re-negotiation is possible, a call to ``write()`` + can also cause read operations. + """ + + @abstractmethod + def do_handshake(self) -> None: + """ + Performs the TLS handshake. 
Also performs certificate validation + and hostname verification. + """ + + @abstractmethod + def cipher(self) -> Optional[Union[CipherSuite, int]]: + """ + Returns the CipherSuite entry for the cipher that has been + negotiated on the connection. If no connection has been negotiated, + returns ``None``. If the cipher negotiated is not defined in + CipherSuite, returns the 16-bit integer representing that cipher + directly. + """ + + @abstractmethod + def negotiated_protocol(self) -> Optional[Union[NextProtocol, bytes]]: + """ + Returns the protocol that was selected during the TLS handshake. + This selection may have been made using ALPN, NPN, or some future + negotiation mechanism. + + If the negotiated protocol is one of the protocols defined in the + ``NextProtocol`` enum, the value from that enum will be returned. + Otherwise, the raw bytestring of the negotiated protocol will be + returned. + + If ``Context.set_inner_protocols()`` was not called, if the other + party does not support protocol negotiation, if this socket does + not support any of the peer's proposed protocols, or if the + handshake has not happened yet, ``None`` is returned. + """ + + @property + @abstractmethod + def context(self) -> Context: + """ + The ``Context`` object this buffer is tied to. + """ + + @property + @abstractmethod + def negotiated_tls_version(self) -> Optional[TLSVersion]: + """ + The version of TLS that has been negotiated on this connection. + """ + + @abstractmethod + def shutdown(self) -> None: + """ + Performs a clean TLS shut down. This should generally be used + whenever possible to signal to the remote peer that the content is + finished. + """ + + @abstractmethod + def receive_from_network(self, data): + """ + Receives some TLS data from the network and stores it in an + internal buffer. 
+ """ + + @abstractmethod + def peek_outgoing(self, amt): + """ + Returns the next ``amt`` bytes of data that should be written to + the network from the outgoing data buffer, without removing it from + the internal buffer. + """ + + @abstractmethod + def consume_outgoing(self, amt): + """ + Discard the next ``amt`` bytes from the outgoing data buffer. This + should be used when ``amt`` bytes have been sent on the network, to + signal that the data no longer needs to be buffered. + """ + + +Socket +~~~~~~ + +The socket-wrapper class will be a concrete class that accepts two items in its +constructor: a regular socket object, and a ``TLSWrappedBuffer`` object. This +object will be too large to recreate in this PEP, but will be submitted as part +of the work to build the module. + +The wrapped socket will implement all of the socket API, though it will have +stub implementations of methods that only work for sockets with types other +than ``SOCK_STREAM`` (e.g. ``sendto``/``recvfrom``). That limitation can be +lifted as-and-when support for DTLS is added to this module. + +In addition, the socket class will include the following *extra* methods on top +of the regular socket methods:: + + class TLSWrappedSocket: + def do_handshake(self) -> None: + """ + Performs the TLS handshake. Also performs certificate validation + and hostname verification. This must be called after the socket has + connected (either via ``connect`` or ``accept``), before any other + operation is performed on the socket. + """ + + def cipher(self) -> Optional[Union[CipherSuite, int]]: + """ + Returns the CipherSuite entry for the cipher that has been + negotiated on the connection. If no connection has been negotiated, + returns ``None``. If the cipher negotiated is not defined in + CipherSuite, returns the 16-bit integer representing that cipher + directly. 
+ """ + + def negotiated_protocol(self) -> Optional[Union[NextProtocol, bytes]]: + """ + Returns the protocol that was selected during the TLS handshake. + This selection may have been made using ALPN, NPN, or some future + negotiation mechanism. + + If the negotiated protocol is one of the protocols defined in the + ``NextProtocol`` enum, the value from that enum will be returned. + Otherwise, the raw bytestring of the negotiated protocol will be + returned. + + If ``Context.set_inner_protocols()`` was not called, if the other + party does not support protocol negotiation, if this socket does + not support any of the peer's proposed protocols, or if the + handshake has not happened yet, ``None`` is returned. + """ + + @property + def context(self) -> Context: + """ + The ``Context`` object this socket is tied to. + """ + + def negotiated_tls_version(self) -> Optional[TLSVersion]: + """ + The version of TLS that has been negotiated on this connection. + """ + + def unwrap(self) -> socket.socket: + """ + Cleanly terminate the TLS connection on this wrapped socket. Once + called, this ``TLSWrappedSocket`` can no longer be used to transmit + data. Returns the socket that was wrapped with TLS. + """ + + + +Cipher Suites +~~~~~~~~~~~~~ + +Supporting cipher suites in a truly library-agnostic fashion is a remarkably +difficult undertaking. Different TLS implementations often have *radically* +different APIs for specifying cipher suites, but more problematically these +APIs frequently differ in capability as well as in style. Some examples are +shown below: + +OpenSSL +^^^^^^^ + +OpenSSL uses a well-known cipher string format. This format has been adopted as +a configuration language by most products that use OpenSSL, including Python. 
+This format is relatively easy to read, but has a number of downsides: it is +a string, which makes it remarkably easy to provide bad inputs; it lacks much +detailed validation, meaning that it is possible to configure OpenSSL in a way +that doesn't allow it to negotiate any cipher at all; and it allows specifying +cipher suites in a number of different ways that make it tricky to parse. The +biggest problem with this format is that there is no formal specification for +it, meaning that the only way to parse a given string the way OpenSSL would is +to get OpenSSL to parse it. + +OpenSSL's cipher strings can look like this:: + + 'ECDH+AESGCM:ECDH+CHACHA20:DH+AESGCM:DH+CHACHA20:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:RSA+AESGCM:RSA+AES:!aNULL:!eNULL:!MD5' + +This string demonstrates some of the complexity of the OpenSSL format. For +example, it is possible for one entry to specify multiple cipher suites: the +entry ``ECDH+AESGCM`` means "all cipher suites that include both +elliptic-curve Diffie-Hellman key exchange and AES in Galois Counter Mode". +More explicitly, that will expand to four cipher suites:: + + "ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256" + +That makes parsing a complete OpenSSL cipher string extremely tricky. Add to +that the fact that there are other meta-characters, such as "!" (exclude all cipher +suites that match this criterion, even if they would otherwise be included: +"!MD5" means that no cipher suites using the MD5 hash algorithm should be +included), "-" (exclude matching ciphers if they were already included, but +allow them to be re-added later if they get included again), and "+" (include +the matching ciphers, but place them at the end of the list), and you get an +*extremely* complex format to parse.
On top of this complexity it should be +noted that the actual result depends on the OpenSSL version, as an OpenSSL +cipher string is valid so long as it contains at least one cipher that OpenSSL +recognises. + +OpenSSL also uses different names for its ciphers than the names used in the +relevant specifications. See the manual page for ``ciphers(1)`` for more +details. + +The actual API inside OpenSSL for the cipher string is simple:: + + char *cipher_list = ; + int rc = SSL_CTX_set_cipher_list(context, cipher_list); + +This means that any format that is used by this module must be able to be +converted to an OpenSSL cipher string for use with OpenSSL. + +SecureTransport +^^^^^^^^^^^^^^^ + +SecureTransport is the macOS system TLS library. This library is substantially +more restricted than OpenSSL in many ways, as it has a much more restricted +class of users. One of these substantial restrictions is in controlling +supported cipher suites. + +Ciphers in SecureTransport are represented by a C ``enum``. This enum has one +entry per cipher suite, with no aggregate entries, meaning that it is not +possible to reproduce the meaning of an OpenSSL cipher string like +"ECDH+AESGCM" without hand-coding which categories each enum member falls into. + +However, the names of most of the enum members are in line with the formal +names of the cipher suites: that is, the cipher suite that OpenSSL calls +"ECDHE-ECDSA-AES256-GCM-SHA384" is called +"TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384" in SecureTransport. + +The API for configuring cipher suites inside SecureTransport is simple:: + + SSLCipherSuite ciphers[] = {TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384, ...}; + OSStatus status = SSLSetEnabledCiphers(context, ciphers, sizeof(ciphers) / sizeof(ciphers[0])); + +SChannel +^^^^^^^^ + +SChannel is the Windows system TLS library.
+ +SChannel has extremely restrictive support for controlling available TLS +cipher suites, and additionally adopts a third method of expressing what TLS +cipher suites are supported. + +Specifically, SChannel defines a set of ``ALG_ID`` constants (C unsigned ints). +None of these constants refers to an entire cipher suite; each refers instead +to an individual algorithm. Some examples are ``CALG_3DES`` and ``CALG_AES_256``, +which refer to the bulk encryption algorithm used in a cipher suite, +``CALG_DH_EPHEM`` and ``CALG_RSA_KEYX`` which refer to part of the key exchange +algorithm used in a cipher suite, ``CALG_SHA1`` and ``CALG_MD5`` which refer to +the message authentication code used in a cipher suite, and ``CALG_ECDSA`` and +``CALG_RSA_SIGN`` which refer to the signing portions of the key exchange +algorithm. + +This can be thought of as the half of OpenSSL's functionality that +SecureTransport doesn't have: SecureTransport only allows specifying exact +cipher suites, SChannel only allows specifying *parts* of the cipher +suite, and OpenSSL allows both. + +Determining which cipher suites are allowed on a given connection is done by +providing a pointer to an array of these ``ALG_ID`` constants. This means that +any suitable API must allow the Python code to determine which ``ALG_ID`` +constants must be provided. + + +Network Security Services (NSS) +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +NSS is Mozilla's crypto and TLS library. It's used in Firefox, Thunderbird, +and as an alternative to OpenSSL in multiple libraries, e.g. curl. + +By default, NSS comes with a secure configuration of allowed ciphers. On some +platforms such as Fedora, the list of enabled ciphers is globally configured +in a system policy. Generally, applications should not modify cipher suites +unless they have specific reasons to do so. + +NSS has both process-global and per-connection settings for cipher suites. It +does not have a concept of SSLContext like OpenSSL.
SSLContext-like behavior +can be easily emulated. Specifically, ciphers can be enabled or disabled +globally with ``SSL_CipherPrefSetDefault(PRInt32 cipher, PRBool enabled)``, +and ``SSL_CipherPrefSet(PRFileDesc *fd, PRInt32 cipher, PRBool enabled)`` +for a connection. The cipher ``PRInt32`` number is a signed 32-bit integer +that directly corresponds to a registered IANA id, e.g. ``0x1301`` +is ``TLS_AES_128_GCM_SHA256``. Contrary to OpenSSL, the preference order +of ciphers is fixed and cannot be modified at runtime. + +Like SecureTransport, NSS has no API for aggregated entries. Some consumers +of NSS have implemented custom mappings from OpenSSL cipher names and rules +to NSS ciphers, e.g. ``mod_nss``. + + +Proposed Interface +^^^^^^^^^^^^^^^^^^ + +The proposed interface for the new module is influenced by the combined set of +limitations of the above implementations. Specifically, as every implementation +*except* OpenSSL requires that each individual cipher be provided, there is no +option but to provide that lowest-common-denominator approach. + +The simplest approach is to provide an enumerated type that includes a large +subset of the cipher suites defined for TLS. The values of the enum members +will be their two-octet cipher identifier as used in the TLS handshake, +stored as a 16-bit integer. The names of the enum members will be their +IANA-registered cipher suite names. + +As of now, the `IANA cipher suite registry`_ contains over 320 cipher suites. +A large portion of the cipher suites are irrelevant for TLS connections to +network services. Other suites specify deprecated and insecure algorithms +that are no longer provided by recent versions of implementations.
The enum +does not contain ciphers with: + +* key exchange: NULL, Kerberos (KRB5), pre-shared key (PSK), Secure Remote + Password (TLS-SRP) +* authentication: NULL, anonymous, export grade, Kerberos (KRB5), + pre-shared key (PSK), Secure Remote Password (TLS-SRP), DSA cert (DSS) +* encryption: NULL, ARIA, DES, RC2, export grade 40-bit +* PRF: MD5 +* SCSV cipher suites + +3DES, RC4, SEED, and IDEA are included for legacy applications. Furthermore, +five additional cipher suites from the TLS 1.3 draft (draft-ietf-tls-tls13-18) +are included. TLS 1.3 does not share any cipher suites with TLS 1.2 and +earlier. The resulting enum will contain roughly 110 suites. + +Because of these limitations, and because the enum doesn't contain every +defined cipher, and also to allow for forward-looking applications, all parts +of this API that accept ``CipherSuite`` objects will also accept raw 16-bit +integers directly. + +Rather than populate this enum by hand, we have a `TLS enum script`_ that +builds it from Christian Heimes' `tlsdb JSON file`_ (warning: +large file) and the `IANA cipher suite registry`_. The TLSDB also opens up the +possibility of extending the API with additional querying functions, +such as determining which TLS versions support which ciphers, if that +functionality is found to be useful or necessary. + +If users find this approach to be onerous, a future extension to this API can +provide helpers that can reintroduce OpenSSL's aggregation functionality.
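The two ideas above (accepting raw 16-bit integers alongside enum members, and a possible future aggregation helper) can be sketched roughly as follows. This is only an illustration under stated assumptions: the enum here is a three-member stand-in for the full listing, and ``normalise_cipher`` and ``suites_matching`` are hypothetical names that are not part of the proposed API.

```python
from enum import IntEnum
from typing import List, Union


class CipherSuite(IntEnum):
    # Cut-down stand-in: three members copied from the full enum.
    TLS_RSA_WITH_AES_128_GCM_SHA256 = 0x009c
    TLS_AES_128_GCM_SHA256 = 0x1301
    TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 = 0xc02f


def normalise_cipher(value: Union[CipherSuite, int]) -> Union[CipherSuite, int]:
    """Hypothetical helper: return the enum member when the identifier is
    known, otherwise pass the raw 16-bit integer through unchanged."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("cipher suite identifiers are 16-bit integers")
    try:
        return CipherSuite(value)
    except ValueError:
        # Not in the enum: keep the plain integer.
        return int(value)


def suites_matching(*tokens: str) -> List[CipherSuite]:
    """Hypothetical aggregation helper: every suite whose IANA name
    contains all of the given underscore-delimited tokens."""
    return [
        suite for suite in CipherSuite
        if all(token in suite.name.split("_") for token in tokens)
    ]
```

Matching on whole name tokens rather than substrings is a deliberate choice in this sketch: a substring test for ``DH`` would also match ``ECDH`` and ``ECDHE`` suites, which is exactly the kind of ambiguity that makes OpenSSL's string format hard to reason about.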
+ +:: + + class CipherSuite(IntEnum): + TLS_RSA_WITH_RC4_128_SHA = 0x0005 + TLS_RSA_WITH_IDEA_CBC_SHA = 0x0007 + TLS_RSA_WITH_3DES_EDE_CBC_SHA = 0x000a + TLS_DH_RSA_WITH_3DES_EDE_CBC_SHA = 0x0010 + TLS_DHE_RSA_WITH_3DES_EDE_CBC_SHA = 0x0016 + TLS_RSA_WITH_AES_128_CBC_SHA = 0x002f + TLS_DH_RSA_WITH_AES_128_CBC_SHA = 0x0031 + TLS_DHE_RSA_WITH_AES_128_CBC_SHA = 0x0033 + TLS_RSA_WITH_AES_256_CBC_SHA = 0x0035 + TLS_DH_RSA_WITH_AES_256_CBC_SHA = 0x0037 + TLS_DHE_RSA_WITH_AES_256_CBC_SHA = 0x0039 + TLS_RSA_WITH_AES_128_CBC_SHA256 = 0x003c + TLS_RSA_WITH_AES_256_CBC_SHA256 = 0x003d + TLS_DH_RSA_WITH_AES_128_CBC_SHA256 = 0x003f + TLS_RSA_WITH_CAMELLIA_128_CBC_SHA = 0x0041 + TLS_DH_RSA_WITH_CAMELLIA_128_CBC_SHA = 0x0043 + TLS_DHE_RSA_WITH_CAMELLIA_128_CBC_SHA = 0x0045 + TLS_DHE_RSA_WITH_AES_128_CBC_SHA256 = 0x0067 + TLS_DH_RSA_WITH_AES_256_CBC_SHA256 = 0x0069 + TLS_DHE_RSA_WITH_AES_256_CBC_SHA256 = 0x006b + TLS_RSA_WITH_CAMELLIA_256_CBC_SHA = 0x0084 + TLS_DH_RSA_WITH_CAMELLIA_256_CBC_SHA = 0x0086 + TLS_DHE_RSA_WITH_CAMELLIA_256_CBC_SHA = 0x0088 + TLS_RSA_WITH_SEED_CBC_SHA = 0x0096 + TLS_DH_RSA_WITH_SEED_CBC_SHA = 0x0098 + TLS_DHE_RSA_WITH_SEED_CBC_SHA = 0x009a + TLS_RSA_WITH_AES_128_GCM_SHA256 = 0x009c + TLS_RSA_WITH_AES_256_GCM_SHA384 = 0x009d + TLS_DHE_RSA_WITH_AES_128_GCM_SHA256 = 0x009e + TLS_DHE_RSA_WITH_AES_256_GCM_SHA384 = 0x009f + TLS_DH_RSA_WITH_AES_128_GCM_SHA256 = 0x00a0 + TLS_DH_RSA_WITH_AES_256_GCM_SHA384 = 0x00a1 + TLS_RSA_WITH_CAMELLIA_128_CBC_SHA256 = 0x00ba + TLS_DH_RSA_WITH_CAMELLIA_128_CBC_SHA256 = 0x00bc + TLS_DHE_RSA_WITH_CAMELLIA_128_CBC_SHA256 = 0x00be + TLS_RSA_WITH_CAMELLIA_256_CBC_SHA256 = 0x00c0 + TLS_DH_RSA_WITH_CAMELLIA_256_CBC_SHA256 = 0x00c2 + TLS_DHE_RSA_WITH_CAMELLIA_256_CBC_SHA256 = 0x00c4 + TLS_AES_128_GCM_SHA256 = 0x1301 + TLS_AES_256_GCM_SHA384 = 0x1302 + TLS_CHACHA20_POLY1305_SHA256 = 0x1303 + TLS_AES_128_CCM_SHA256 = 0x1304 + TLS_AES_128_CCM_8_SHA256 = 0x1305 + TLS_ECDH_ECDSA_WITH_RC4_128_SHA = 0xc002 + 
TLS_ECDH_ECDSA_WITH_3DES_EDE_CBC_SHA = 0xc003 + TLS_ECDH_ECDSA_WITH_AES_128_CBC_SHA = 0xc004 + TLS_ECDH_ECDSA_WITH_AES_256_CBC_SHA = 0xc005 + TLS_ECDHE_ECDSA_WITH_RC4_128_SHA = 0xc007 + TLS_ECDHE_ECDSA_WITH_3DES_EDE_CBC_SHA = 0xc008 + TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA = 0xc009 + TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA = 0xc00a + TLS_ECDH_RSA_WITH_RC4_128_SHA = 0xc00c + TLS_ECDH_RSA_WITH_3DES_EDE_CBC_SHA = 0xc00d + TLS_ECDH_RSA_WITH_AES_128_CBC_SHA = 0xc00e + TLS_ECDH_RSA_WITH_AES_256_CBC_SHA = 0xc00f + TLS_ECDHE_RSA_WITH_RC4_128_SHA = 0xc011 + TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA = 0xc012 + TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA = 0xc013 + TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA = 0xc014 + TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256 = 0xc023 + TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384 = 0xc024 + TLS_ECDH_ECDSA_WITH_AES_128_CBC_SHA256 = 0xc025 + TLS_ECDH_ECDSA_WITH_AES_256_CBC_SHA384 = 0xc026 + TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256 = 0xc027 + TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384 = 0xc028 + TLS_ECDH_RSA_WITH_AES_128_CBC_SHA256 = 0xc029 + TLS_ECDH_RSA_WITH_AES_256_CBC_SHA384 = 0xc02a + TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256 = 0xc02b + TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 = 0xc02c + TLS_ECDH_ECDSA_WITH_AES_128_GCM_SHA256 = 0xc02d + TLS_ECDH_ECDSA_WITH_AES_256_GCM_SHA384 = 0xc02e + TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 = 0xc02f + TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 = 0xc030 + TLS_ECDH_RSA_WITH_AES_128_GCM_SHA256 = 0xc031 + TLS_ECDH_RSA_WITH_AES_256_GCM_SHA384 = 0xc032 + TLS_ECDHE_ECDSA_WITH_CAMELLIA_128_CBC_SHA256 = 0xc072 + TLS_ECDHE_ECDSA_WITH_CAMELLIA_256_CBC_SHA384 = 0xc073 + TLS_ECDH_ECDSA_WITH_CAMELLIA_128_CBC_SHA256 = 0xc074 + TLS_ECDH_ECDSA_WITH_CAMELLIA_256_CBC_SHA384 = 0xc075 + TLS_ECDHE_RSA_WITH_CAMELLIA_128_CBC_SHA256 = 0xc076 + TLS_ECDHE_RSA_WITH_CAMELLIA_256_CBC_SHA384 = 0xc077 + TLS_ECDH_RSA_WITH_CAMELLIA_128_CBC_SHA256 = 0xc078 + TLS_ECDH_RSA_WITH_CAMELLIA_256_CBC_SHA384 = 0xc079 + TLS_RSA_WITH_CAMELLIA_128_GCM_SHA256 = 0xc07a + 
TLS_RSA_WITH_CAMELLIA_256_GCM_SHA384 = 0xc07b + TLS_DHE_RSA_WITH_CAMELLIA_128_GCM_SHA256 = 0xc07c + TLS_DHE_RSA_WITH_CAMELLIA_256_GCM_SHA384 = 0xc07d + TLS_DH_RSA_WITH_CAMELLIA_128_GCM_SHA256 = 0xc07e + TLS_DH_RSA_WITH_CAMELLIA_256_GCM_SHA384 = 0xc07f + TLS_ECDHE_ECDSA_WITH_CAMELLIA_128_GCM_SHA256 = 0xc086 + TLS_ECDHE_ECDSA_WITH_CAMELLIA_256_GCM_SHA384 = 0xc087 + TLS_ECDH_ECDSA_WITH_CAMELLIA_128_GCM_SHA256 = 0xc088 + TLS_ECDH_ECDSA_WITH_CAMELLIA_256_GCM_SHA384 = 0xc089 + TLS_ECDHE_RSA_WITH_CAMELLIA_128_GCM_SHA256 = 0xc08a + TLS_ECDHE_RSA_WITH_CAMELLIA_256_GCM_SHA384 = 0xc08b + TLS_ECDH_RSA_WITH_CAMELLIA_128_GCM_SHA256 = 0xc08c + TLS_ECDH_RSA_WITH_CAMELLIA_256_GCM_SHA384 = 0xc08d + TLS_RSA_WITH_AES_128_CCM = 0xc09c + TLS_RSA_WITH_AES_256_CCM = 0xc09d + TLS_DHE_RSA_WITH_AES_128_CCM = 0xc09e + TLS_DHE_RSA_WITH_AES_256_CCM = 0xc09f + TLS_RSA_WITH_AES_128_CCM_8 = 0xc0a0 + TLS_RSA_WITH_AES_256_CCM_8 = 0xc0a1 + TLS_DHE_RSA_WITH_AES_128_CCM_8 = 0xc0a2 + TLS_DHE_RSA_WITH_AES_256_CCM_8 = 0xc0a3 + TLS_ECDHE_ECDSA_WITH_AES_128_CCM = 0xc0ac + TLS_ECDHE_ECDSA_WITH_AES_256_CCM = 0xc0ad + TLS_ECDHE_ECDSA_WITH_AES_128_CCM_8 = 0xc0ae + TLS_ECDHE_ECDSA_WITH_AES_256_CCM_8 = 0xc0af + TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256 = 0xcca8 + TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256 = 0xcca9 + TLS_DHE_RSA_WITH_CHACHA20_POLY1305_SHA256 = 0xccaa + + +Enum members can be mapped to OpenSSL cipher names:: + + >>> import ssl + >>> ctx = ssl.SSLContext(ssl.PROTOCOL_TLS) + >>> ctx.set_ciphers('ALL:COMPLEMENTOFALL') + >>> ciphers = {c['id'] & 0xffff: c['name'] for c in ctx.get_ciphers()} + >>> ciphers[CipherSuite.TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256] + 'ECDHE-RSA-AES128-GCM-SHA256' + + +For SecureTransport, these enum members directly refer to the values of the +cipher suite constants. For example, SecureTransport defines the cipher suite +enum member ``TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384`` as having the value +``0xC02C``. 
Not coincidentally, that is identical to its value in the above +enum. This makes mapping between SecureTransport and the above enum very easy +indeed. + +For SChannel there is no easy direct mapping, due to the fact that SChannel +configures ciphers instead of cipher suites. This represents an ongoing +concern with SChannel, which is that it is very difficult to configure in a +specific manner compared to other TLS implementations. + +For the purposes of this PEP, any SChannel implementation will need to +determine which ciphers to choose based on the enum members. The result may be +more permissive than the requested cipher suite list intends, or it may be +more restrictive, depending on the choices of the implementation. This PEP +recommends that it be more restrictive, but of course this cannot be enforced. + + +Protocol Negotiation +~~~~~~~~~~~~~~~~~~~~ + +Both NPN and ALPN allow for protocol negotiation as part of the TLS +handshake. While NPN and ALPN are, at their fundamental level, built on top of +bytestrings, string-based APIs are frequently problematic as they allow for +errors in typing that can be hard to detect. + +For this reason, this module would define a type that protocol negotiation +implementations can pass and be passed. This type would wrap a bytestring to +allow for aliases for well-known protocols. This allows us to avoid the +problems inherent in typos for well-known protocols, while allowing the full +extensibility of the protocol negotiation layer if needed by letting users pass +byte strings directly. + +:: + + class NextProtocol(Enum): + H2 = b'h2' + H2C = b'h2c' + HTTP1 = b'http/1.1' + WEBRTC = b'webrtc' + C_WEBRTC = b'c-webrtc' + FTP = b'ftp' + STUN = b'stun.nat-discovery' + TURN = b'stun.turn' + +TLS Versions +~~~~~~~~~~~~ + +It is often useful to be able to restrict the versions of TLS you're willing to +support.
There are many security advantages in refusing to use old versions of +TLS, and some misbehaving servers will mishandle TLS clients advertising +support for newer versions. + +The following enumerated type can be used to gate TLS versions. Forward-looking +applications should almost never set a maximum TLS version unless they +absolutely must, as a TLS backend that is newer than the Python that uses it +may support TLS versions that are not in this enumerated type. + +This enumerated type also defines two extra members that can always +be used to request either the lowest or highest TLS version supported by an +implementation. + +:: + + class TLSVersion(Enum): + MINIMUM_SUPPORTED = auto() + SSLv2 = auto() + SSLv3 = auto() + TLSv1 = auto() + TLSv1_1 = auto() + TLSv1_2 = auto() + TLSv1_3 = auto() + MAXIMUM_SUPPORTED = auto() + + +Errors +~~~~~~ + +This module would define four base classes for use with error handling. Unlike +many of the other classes defined here, these classes are not abstract, as +they have no behaviour. They exist simply to signal certain common behaviours. +Backends should subclass these exceptions in their own packages, but needn't +define any behaviour for them. + +In general, concrete implementations should subclass these exceptions rather +than throw them directly. This makes it somewhat easier to determine which +concrete TLS implementation is in use during debugging of unexpected errors. +However, this is not mandatory. + +The definitions of the errors are below:: + + class TLSError(Exception): + """ + The base exception for all TLS-related errors from any backend. + Catching this error should be sufficient to catch *all* TLS errors, + regardless of what backend is used. + """ + + class WantWriteError(TLSError): + """ + A special signaling exception used only when non-blocking or + buffer-only I/O is used.
This error signals that the requested + operation cannot complete until more data is written to the network, + or until the output buffer is drained. + + This error should only be raised when it is completely impossible + to write any data. If a partial write is achievable then this should + not be raised. + """ + + class WantReadError(TLSError): + """ + A special signaling exception used only when non-blocking or + buffer-only I/O is used. This error signals that the requested + operation cannot complete until more data is read from the network, or + until more data is available in the input buffer. + + This error should only be raised when it is completely impossible to + read any data. If a partial read is achievable then this should not + be raised. + """ + + class RaggedEOF(TLSError): + """ + A special signaling exception used when a TLS connection has been + closed gracelessly: that is, when a TLS CloseNotify was not received + from the peer before the underlying TCP socket reached EOF. This is a + so-called "ragged EOF". + + This exception is not guaranteed to be raised in the face of a ragged + EOF: some implementations may not be able to detect or report the + ragged EOF. + + This exception is not always a problem. Ragged EOFs are a concern only + when protocols are vulnerable to length truncation attacks. Any + protocol that can detect length truncation attacks at the application + layer (e.g. HTTP/1.1 and HTTP/2) is not vulnerable to this kind of + attack and so can ignore this exception. + """ + + +Certificates +~~~~~~~~~~~~ + +This module would define an abstract X509 certificate class. This class would +have almost no behaviour, as the goal of this module is not to provide all +possible relevant cryptographic functionality that could be provided by X509 +certificates. Instead, all we need is the ability to signal the source of a +certificate to a concrete implementation.
+ +For that reason, this certificate implementation defines only constructors. In +essence, the certificate object in this module could be as abstract as a handle +that can be used to locate a specific certificate. + +Concrete implementations may choose to provide alternative constructors, e.g. +to load certificates from HSMs. If a common interface emerges for doing this, +this module may be updated to provide a standard constructor for this use-case +as well. + +Concrete implementations should aim to have Certificate objects be hashable if +at all possible. This will help ensure that TLSConfiguration objects used with +an individual concrete implementation are also hashable. + +:: + + class Certificate(metaclass=ABCMeta): + @abstractclassmethod + def from_buffer(cls, buffer: bytes): + """ + Creates a Certificate object from a byte buffer. This byte buffer + may be either PEM-encoded or DER-encoded. If the buffer is PEM + encoded it *must* begin with the standard PEM preamble (a series of + dashes followed by the ASCII bytes "BEGIN CERTIFICATE" and another + series of dashes). In the absence of that preamble, the + implementation may assume that the certificate is DER-encoded + instead. + """ + + @abstractclassmethod + def from_file(cls, path: Union[pathlib.Path, AnyStr]): + """ + Creates a Certificate object from a file on disk. This method may + be a convenience method that wraps ``open`` and ``from_buffer``, + but some TLS implementations may be able to provide more-secure or + faster methods of loading certificates that do not involve Python + code. + """ + + +Private Keys +~~~~~~~~~~~~ + +This module would define an abstract private key class. Much like the +Certificate class, this class has almost no behaviour in order to give as much +freedom as possible to the concrete implementations to treat keys carefully. + +This class has all the caveats of the ``Certificate`` class. 
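The "opaque handle" idea shared by ``Certificate`` and the private key class can be sketched as below. This is a minimal illustration, not a proposed implementation: ``InMemoryCertificate`` is a hypothetical name, the ABC is a trimmed copy repeated only so the example runs on its own, and a real backend would more likely pass the bytes straight to its TLS library than keep them in Python.

```python
import os
import re
from abc import ABCMeta, abstractmethod
from pathlib import Path

# A series of dashes, "BEGIN CERTIFICATE", and another series of dashes.
_PEM_PREAMBLE = re.compile(rb"-+BEGIN CERTIFICATE-+")


class Certificate(metaclass=ABCMeta):
    """Trimmed copy of the ABC, repeated so the sketch is self-contained."""

    @classmethod
    @abstractmethod
    def from_buffer(cls, buffer):
        ...

    @classmethod
    @abstractmethod
    def from_file(cls, path):
        ...


class InMemoryCertificate(Certificate):
    """Hypothetical backend class: a hashable handle around raw bytes."""

    def __init__(self, data, is_pem):
        self.data = bytes(data)
        self.is_pem = is_pem

    @classmethod
    def from_buffer(cls, buffer):
        # re.match anchors at the start, so the preamble must come first;
        # anything else is assumed to be DER.
        return cls(buffer, _PEM_PREAMBLE.match(buffer) is not None)

    @classmethod
    def from_file(cls, path):
        # Accepts str, bytes, or pathlib.Path.
        return cls.from_buffer(Path(os.fsdecode(path)).read_bytes())

    def __eq__(self, other):
        if not isinstance(other, InMemoryCertificate):
            return NotImplemented
        return (self.data, self.is_pem) == (other.data, other.is_pem)

    def __hash__(self):
        # Hashable, as recommended, so configuration objects can be too.
        return hash((self.data, self.is_pem))
```

Defining ``__eq__`` and ``__hash__`` together means two handles built from the same bytes compare equal and collapse to one entry in a set or dict key.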
+ +:: + + class PrivateKey(metaclass=ABCMeta): + @abstractclassmethod + def from_buffer(cls, + buffer: bytes, + password: Optional[Union[Callable[[], Union[bytes, bytearray]], bytes, bytearray]] = None): + """ + Creates a PrivateKey object from a byte buffer. This byte buffer + may be either PEM-encoded or DER-encoded. If the buffer is PEM + encoded it *must* begin with the standard PEM preamble (a series of + dashes followed by the ASCII bytes "BEGIN", the key type, and + another series of dashes). In the absence of that preamble, the + implementation may assume that the private key is DER-encoded + instead. + + The key may additionally be encrypted. If it is, the ``password`` + argument can be used to decrypt the key. The ``password`` argument + may be a function to call to get the password for decrypting the + private key. It will only be called if the private key is encrypted + and a password is necessary. It will be called with no arguments, + and it should return either bytes or bytearray containing the + password. Alternatively, a bytes or bytearray value may be supplied + directly as the password argument. It will be ignored if the + private key is not encrypted and no password is needed. + """ + + @abstractclassmethod + def from_file(cls, + path: Union[pathlib.Path, bytes, str], + password: Optional[Union[Callable[[], Union[bytes, bytearray]], bytes, bytearray]] = None): + """ + Creates a PrivateKey object from a file on disk. This method may + be a convenience method that wraps ``open`` and ``from_buffer``, + but some TLS implementations may be able to provide more-secure or + faster methods of loading private keys that do not involve Python + code. + + The ``password`` parameter behaves exactly as the equivalent + parameter on ``from_buffer``. + """ + + +Trust Store +~~~~~~~~~~~ + +As discussed above, loading a trust store represents an issue because different +TLS implementations vary wildly in how they allow users to select trust stores.
+For this reason, we need to provide a model that assumes very little about the +form that trust stores take. + +This problem is the same as the one that the Certificate and PrivateKey types +need to solve. For this reason, we use the exact same model, by creating an +opaque type that can encapsulate the various means that TLS backends may open +a trust store. + +A given TLS implementation is not required to implement all of the +constructors. However, it is strongly recommended that a given TLS +implementation provide the ``system`` constructor if at all possible, as this +is the most common validation trust store that is used. Concrete +implementations may also add their own constructors. + +Concrete implementations should aim to have TrustStore objects be hashable if +at all possible. This will help ensure that TLSConfiguration objects used with +an individual concrete implementation are also hashable. + +:: + + class TrustStore(metaclass=ABCMeta): + @abstractclassmethod + def system(cls) -> TrustStore: + """ + Returns a TrustStore object that represents the system trust + database. + """ + + @abstractclassmethod + def from_pem_file(cls, path: Union[pathlib.Path, bytes, str]) -> TrustStore: + """ + Initializes a trust store from a single file full of PEMs. + """ + + +Runtime Access +~~~~~~~~~~~~~~ + +A not-uncommon use case for library users is to want to allow the library to +control the TLS configuration, but to want to select what backend is in use. +For example, users of Requests may want to be able to select between OpenSSL or +a platform-native solution on Windows and macOS, or between OpenSSL and NSS on +some Linux platforms. These users, however, may not care about exactly how +their TLS configuration is done. + +This poses a problem: given an arbitrary concrete implementation, how can a +library work out how to load certificates into the trust store? 
There are two +options: either all concrete implementations can be required to fit into a +specific naming scheme, or we can provide an API that makes it possible to grab +these objects. + +This PEP proposes that we use the second approach. This grants the greatest +freedom to concrete implementations to structure their code as they see fit, +requiring only that they provide a single object that has the appropriate +properties in place. Users can then pass this "backend" object to libraries +that support it, and those libraries can take care of configuring and using the +concrete implementation. + +All concrete implementations must provide a method of obtaining a ``Backend`` +object. The ``Backend`` object can be a global singleton or can be created by a +callable if there is an advantage in doing that. + +The ``Backend`` object has the following definition:: + + Backend = namedtuple( + 'Backend', + ['client_context', 'server_context', + 'certificate', 'private_key', 'trust_store'] + ) + +Each of the properties must provide the concrete implementation of the relevant +ABC. This ensures that code like this will work for any backend:: + + trust_store = backend.trust_store.system() + + +Changes to the Standard Library +=============================== + +The portions of the standard library that interact with TLS should be revised +to use these ABCs. This will allow them to function with other TLS backends. +This includes the following modules: + +- asyncio +- ftplib +- http +- imaplib +- nntplib +- poplib +- smtplib +- urllib + + +Migration of the ssl module +--------------------------- + +Naturally, we will need to extend the ``ssl`` module itself to conform to these +ABCs. This extension will take the form of new classes, potentially in an +entirely new module. This will allow applications that take advantage of the +current ``ssl`` module to continue to do so, while enabling the new APIs for +applications and libraries that want to use them. 
+ +In general, migrating from the ``ssl`` module to the new ABCs is not expected +to be one-to-one. This is normally acceptable: most tools that use the ``ssl`` +module hide it from the user, and so refactoring to use the new module should +be invisible. + +However, a specific problem comes from libraries or applications that leak +exceptions from the ``ssl`` module, either as part of their defined API or by +accident (which is easily done). Users of those tools may have written code +that tolerates and handles exceptions from the ``ssl`` module being raised: +migrating to the ABCs presented here would potentially cause the exceptions +defined above to be thrown instead, and existing ``except`` blocks will not +catch them. + +For this reason, part of the migration of the ``ssl`` module would require that +the exceptions in the ``ssl`` module alias those defined above. That is, they +would require the following statements to all succeed:: + + assert ssl.SSLError is tls.TLSError + assert ssl.SSLWantReadError is tls.WantReadError + assert ssl.SSLWantWriteError is tls.WantWriteError + +The exact mechanics of how this will be done are beyond the scope of this PEP, +as they are made more complex due to the fact that the current ``ssl`` +exceptions are defined in C code, but more details can be found in +`an email sent to the Security-SIG by Christian Heimes`_. + + +Future +====== + +Major future TLS features may require revisions of these ABCs. These revisions +should be made cautiously: many backends may not be able to move forward +swiftly, and will be invalidated by changes in these ABCs. This is acceptable, +but wherever possible features that are specific to individual implementations +should not be added to the ABCs. The ABCs should restrict themselves to +high-level descriptions of IETF-specified features. + +However, well-justified extensions to this API absolutely should be made. 
The +focus of this API is to provide a unifying lowest-common-denominator +configuration option for the Python community. TLS is not a static target, and +as TLS evolves so must this API. + + +Credits +======= + +This document has received extensive review from a number of individuals in the +community who have substantially helped shape it. Detailed review was provided +by: + +* Alex Chan +* Alex Gaynor +* Antoine Pitrou +* Ashwini Oruganti +* Donald Stufft +* Ethan Furman +* Glyph +* Hynek Schlawack +* Jim J Jewett +* Nathaniel J. Smith +* Nick Coghlan +* Paul Kehrer +* Steve Dower +* Steven Fackler +* Wes Turner +* Will Bond + +Further review was provided by the Security-SIG and python-ideas mailing lists. + + +Copyright +========= + +This document has been placed in the public domain. + + +.. _ssl module: https://docs.python.org/3/library/ssl.html +.. _OpenSSL Library: https://www.openssl.org/ +.. _PyOpenSSL: https://pypi.org/project/pyOpenSSL/ +.. _certifi: https://pypi.org/project/certifi/ +.. _SSLContext: https://docs.python.org/3/library/ssl.html#ssl.SSLContext +.. _SSLSocket: https://docs.python.org/3/library/ssl.html#ssl.SSLSocket +.. _SSLObject: https://docs.python.org/3/library/ssl.html#ssl.SSLObject +.. _SSLError: https://docs.python.org/3/library/ssl.html#ssl.SSLError +.. _MSDN articles: https://msdn.microsoft.com/en-us/library/windows/desktop/mt490158(v=vs.85).aspx +.. _TLS enum script: https://github.com/tiran/tlsdb/blob/master/tlspep_ciphersuite.py +.. _tlsdb JSON file: https://github.com/tiran/tlsdb/blob/master/tlsdb.json +.. _IANA cipher suite registry: https://www.iana.org/assignments/tls-parameters/tls-parameters.xhtml#tls-parameters-4 +.. _an email sent to the Security-SIG by Christian Heimes: https://mail.python.org/pipermail/security-sig/2017-January/000213.html +.. _s2n: https://github.com/awslabs/s2n +.. _working to add it: https://github.com/awslabs/s2n/issues/358 + +.. 
+ Local Variables: + mode: indented-text + indent-tabs-mode: nil + sentence-end-double-space: t + fill-column: 70 + coding: utf-8 + End: