python · brettcannon · Dec 14, 2016 · Dec 10, 2016 · brettcannon · Dec 13, 2016
diff --git a/pep-0536.txt b/pep-0536.txt
@@ -0,0 +1,183 @@
+PEP: 536
+Title: Final Grammar for Literal String Interpolation
+Version: $Revision$
+Last-Modified: $Date$
+Author: Philipp Angerer <[email protected]>
+Status: Draft
+Type: Standards Track
+Content-Type: text/x-rst
+Created: 11-Dec-2016
+Python-Version: 3.7
+Post-History: 12-Dec-2016
+
+Abstract
+========
+
+PEP 498 introduced Literal String Interpolation (or “f-strings”).
+The expression portions of those literals, however, are subject to
+certain restrictions.  This PEP proposes a formal grammar lifting
+those restrictions, promoting “f-strings” to “f expressions” or f-literals.
+
+This PEP expands upon the f-strings introduced by PEP 498,
+so this text requires familiarity with PEP 498.
+
+Terminology
+===========
+
+This text will refer to the existing grammar as “f-strings”,
+and the proposed one as “f-literals”.
+
+Furthermore, it will refer to the ``{}``-delimited expressions in
+f-literals/f-strings as “expression portions” and the static string content
+around them as “string portions”.
+
+Motivation
+==========
+
+The current implementation of f-strings in CPython relies on the existing
+string parsing machinery and post-processing of its tokens.  This results in
+several restrictions on the possible expressions usable within f-strings:
+
+#. It is impossible to use the quote character delimiting the f-string
+   within the expression portion::
+
+    >>> f'Magic wand: { bag['wand'] }'
+                                 ^
+    SyntaxError: invalid syntax
+
+#. A previously considered way around it would lead to escape sequences
+   in executed code and is prohibited in f-strings::
+
+    >>> f'Magic wand { bag[\'wand\'] } string'
+    SyntaxError: f-string expression portion cannot include a backslash
+
+#. Comments are forbidden even in multi-line f-strings::
+
+    >>> f'''A complex trick: {
+    ... bag['bag']  # recursive bags!
+    ... }'''
+    SyntaxError: f-string expression part cannot include '#'
+
+#. Expression portions need to wrap ``':'`` and ``'!'`` in parentheses::
+
+    >>> f'Useless use of lambdas: { lambda x: x*2 }'
+    SyntaxError: unexpected EOF while parsing
+
+These limitations serve no purpose from a language user's perspective and
+can be lifted by giving f-literals a regular grammar without exceptions
+and implementing it using dedicated parse code.
+
+Rationale
+=========
+
+.. https://mail.python.org/pipermail/python-ideas/2016-August/041727.html
+
+The restrictions mentioned in Motivation_ are non-obvious and counter-intuitive
+unless the user is familiar with the f-strings’ implementation details.
+
+As mentioned, a previous version of PEP 498 allowed escape sequences
+anywhere in f-strings, including as ways to encode the braces delimiting
+the expression portions and in their code.  They would have been expanded
+before the code was parsed, which would have had several ramifications:
+
+#. It would not be clear to human readers which portions are expressions
+   and which are strings.  Great material for an “obfuscated/underhanded
+   Python challenge”.
+#. Syntax highlighters are good at parsing nested grammar, but not
+   at recognizing escape sequences.  ECMAScript 2016 (JavaScript) allows
+   escape sequences in its identifiers [1]_ and the author knows of no
+   syntax highlighter able to correctly highlight code making use of this.
+
+As a consequence, the expression portions would be harder to recognize
+with and without the aid of syntax highlighting.  With the new grammar,
+it is easy to extend syntax highlighters to correctly parse
+and display f-literals:
+
+.. raw:: html
+
+   <pre><span style=color:#ff5500>f'Magic wand: </span><span style=color:#3daee9>{</span>bag[<span style=color:#bf0303>'wand'</span>]<span style=color:#3daee9>:^10}</span><span style=color:#ff5500>'</span></pre>
+
+.. This is the output of kate-syntax-highlighter when given that code
+   (with some quotes stripped)
+
+Highlighting expression portions with possible escape sequences would
+mean creating a modified copy of all rules of the complete expression
+grammar, accounting for the possibility of escape sequences in keywords,
+delimiters, and all other language syntax.  One such duplication would
+yield one level of escaping depth and would have to be repeated for each
+deeper level of escaping in a nested f-literal.  This is because no
+highlighting engine known to the author supports expanding escape
+sequences before applying rules to a certain context.  Nesting contexts,
+however, is a standard feature of all highlighting engines.
+
+Familiarity also plays a role: Arbitrary nesting of expressions
+without expansion of escape sequences is available in every single
+other language employing a string interpolation method that uses
+expressions instead of just variable names. [2]_
+
+Specification
+=============
+
+PEP 498 specifies f-strings as follows, but places restrictions on them::
+
+    f ' <text> { <expression> <optional !s, !r, or !a> <optional : format specifier> } <text> ... '
+
+All restrictions mentioned in PEP 498 are lifted from f-literals,
+as explained below:
+
+#. Expression portions may now contain strings delimited with the same
+   kind of quote that is used to delimit the f-literal.
+#. Backslashes may now appear within expressions just like anywhere else
+   in Python code.  In case of strings nested within f-literals,
+   escape sequences are expanded when the innermost string is evaluated.
+#. Comments, using the ``'#'`` character, are possible only in multi-line
+   f-literals, since comments are terminated by the end of the line
+   (which makes closing a single-line f-literal impossible).
+#. Expression portions may contain ``':'`` or ``'!'`` wherever
+   syntactically valid.  The first ``':'`` or ``'!'`` that is not part
+   of an expression has to be followed by a valid coercion or format specifier.
+
+A remaining restriction not explicitly mentioned by PEP 498 is line breaks
+in expression portions.  Since strings delimited by single ``'`` or ``"``
+characters are expected to be single line, line breaks remain illegal
+in expression portions of single-line strings.
+
+.. note:: Is lifting of the restrictions sufficient,
+   or should we specify a more complete grammar?
+
+Backwards Compatibility
+=======================
+
+f-literals are fully backwards compatible with f-strings,
+and expand the syntax considered legal.
+
+Reference Implementation
+========================
+
+TBD
+
+References
+==========
+
+.. [1] ECMAScript ``IdentifierName`` specification
+   ( http://ecma-international.org/ecma-262/6.0/#sec-names-and-keywords )
+
+   Yes, ``const cthulhu = { H̹̙̦̮͉̩̗̗ͧ̇̏̊̾Eͨ͆͒̆ͮ̃͏̷̮̣̫̤̣Cͯ̂͐͏̨̛͔̦̟͈̻O̜͎͍͙͚̬̝̣̽ͮ͐͗̀ͤ̍̀͢M̴̡̲̭͍͇̼̟̯̦̉̒͠Ḛ̛̙̞̪̗ͥͤͩ̾͑̔͐ͅṮ̴̷̷̗̼͍̿̿̓̽͐H̙̙̔̄͜\u0042: 42 }`` is valid ECMAScript 2016
+
+.. [2] Wikipedia article on string interpolation
+   ( https://en.wikipedia.org/wiki/String_interpolation )
+
+Copyright
+=========
+
+This document has been placed in the public domain.
+
+
+..
+   Local Variables:
+   mode: indented-text
+   indent-tabs-mode: nil
+   sentence-end-double-space: t
+   fill-column: 70
+   coding: utf-8
+   End: