Python Essentials

Examining syntax rules

There are nine fundamental syntax rules in section 2.1 of the Python Language Reference. We'll summarize those rules here:

  1. There are two species of statements: simple and compound. Simple statements must be complete on a single logical line. A compound statement starts with a single logical line and must contain indented statements. The initial clause of a compound statement ends with a : character. It's possible, using rules 5 and 6, to join a number of physical lines together to create a single logical line.
    • Here's a typical simple statement, complete in a single logical line:
      from decimal import Decimal
    • Here's a typical compound statement with a nested simple statement, spread across two logical lines:
      if a > b:
          print(a, "is larger")
  2. A physical line ends with \n. The Windows sequence \r\n is also accepted, on any platform.
  3. A comment starts with # and continues to the end of the physical line. It will end the logical line.
    • Here's an example of a comment:
      from fractions import Fraction # We'll use this to improve accuracy
  4. A special comment can be used to annotate the file encoding. This is generally not needed, since most IDEs and text editors handle the file encoding politely. We should generally save Python files in UTF-8 encoding. Older files may be saved in ASCII.
  5. Physical lines can be joined explicitly into a logical line using the \ as an escape character in front of the physical end-of-line character. This is rarely used and generally discouraged.
  6. Physical lines can be joined implicitly into a logical line using (), [], or {}; these must pair properly for the logical line to be complete. An expression beginning with ( can span multiple physical lines until there is a matching ). This is used frequently and is strongly encouraged.
    • Here's an example of a statement that relies on () to join four physical lines into one logical line:
      print(
          "big number",
          2 ** 2048
      )
  7. Blank lines contain only spaces, tabs and newlines. The interactive REPL uses a blank line to end a compound statement; the REPL is the only context in which a blank line is meaningful.
  8. Leading whitespace is required to properly group statements inside the clauses of compound statements. Either spaces or tabs can be used to indent. Consistency is essential. A four-space indent is widely used and strongly encouraged.
  9. Except at the beginning of the line, where it determines the nesting of compound statements, whitespace can be used freely between tokens. Note that there are some preferences regarding precisely how spaces are used within a statement; the Python Enhancement Proposal (PEP) number 8 provides some advice. See https://www.python.org/dev/peps/pep-0008/ for fodder for endless disputes.
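
The difference between physical and logical lines in rules 1, 5, and 6 can be observed with the standard library's tokenize module, which emits a NEWLINE token at the end of each logical line and an NL token for a line break that does not end one. The following is a minimal sketch rather than anything taken from the Language Reference; the sample source text and the variable names are invented for illustration.

```python
import io
import tokenize

# Hypothetical sample: three physical lines that form two logical lines.
source = (
    "total = (1 +\n"
    "         2)\n"
    "print(total)\n"
)

tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

# NEWLINE marks the end of a logical line; NL is a physical line break
# that does not end a logical line (here, the one inside the parentheses).
logical_lines = sum(1 for tok in tokens if tok.type == tokenize.NEWLINE)
joined_breaks = sum(1 for tok in tokens if tok.type == tokenize.NL)
physical_lines = len(source.splitlines())

print(physical_lines, logical_lines, joined_breaks)
```

The sample text has three physical lines but only two logical lines, because the open ( keeps the first statement going across the first line break.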

Perhaps the most important two rules are rule 6 and rule 8. Rule 6 means that it is very common to use (), [], and {} to force multiple physical lines to be joined into a single logical line.
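
As a small sketch of rule 6 in practice, here is one statement for each kind of bracket, followed by the rule 5 alternative for comparison. The names total, primes, limits, and total_explicit are made up for this example.

```python
# Parentheses around a long expression.
total = (
    1
    + 2
    + 3
)

# Square brackets around a list display.
primes = [
    2, 3,
    5, 7,
]

# Braces around a dict display.
limits = {
    "low": 0,
    "high": 10,
}

# The rule 5 alternative: an explicit backslash join.
# It works, but a stray space after the \ is a syntax error.
total_explicit = 1 \
    + 2 \
    + 3

print(total, primes, limits, total_explicit)
```

Each bracketed statement is a single logical line, however many physical lines it occupies. The backslash version computes the same value as the parenthesized one, but it depends on invisible end-of-line details, which is one reason it is discouraged.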

Rule 8 requires that our indentation is done consistently: indents and outdents must be matched. While it's legal to use tabs, spaces, or any haphazard (but consistent) mix of the two, a four-space indent is highly recommended. Tabs are discouraged because they're hard to distinguish from spaces. Most editors can be set to replace the tab key with four spaces. A good text editor can recognize the basics of Python syntax and can handle indents and outdents gracefully.
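
To see why consistency matters, we can hand the compiler a block whose two nested lines look aligned but are indented differently. This is only a sketch, assuming Python 3; the source text and the names mixed_source and outcome are invented for the demonstration.

```python
# Hypothetical source text: the first nested line is indented with a tab,
# the second with eight spaces. They may look aligned in an editor.
mixed_source = "if True:\n\tx = 1\n        y = 2\n"

try:
    compile(mixed_source, "<mixed-indent demo>", "exec")
except IndentationError as error:
    # TabError is a subclass of IndentationError.
    outcome = type(error).__name__
else:
    outcome = "compiled"

print(outcome)
```

Under Python 3 this reports a TabError, because whether the two lines match depends on how wide a tab is taken to be. Sticking to spaces avoids the question entirely.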

Tip

Use () to allow a statement to span multiple physical lines; avoid \ at end-of-line.

Use a four-space indent.

Also note that Python will merge adjacent strings when parsing the source. We can have code that looks like this:

>>> message = ("Hello"
... "world")
>>> message
'Helloworld'

This assignment statement used a gratuitous () pair to allow the logical line to span multiple physical lines. The expression is simply two adjacent strings, "Hello" and "world". When Python parses the source text, these two adjacent strings are merged; only a single string is used when evaluating the statement.

Additionally, note that the REPL prompt changed from >>> to ... because the REPL recognized the first physical line as a partial statement. This is a handy reminder that our statement isn't complete. When the final ) was parsed, the statement was complete and the prompt switched back to >>>.
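
One way to confirm that adjacent strings are merged during parsing, and not when the statement is evaluated, is to parse the expression with the standard library's ast module and look at the result. This is a minimal sketch, assuming Python 3.8 or later, where string literals are represented as ast.Constant nodes; the names tree and node are just for illustration.

```python
import ast

# The same expression as in the REPL session, parsed without evaluating it.
tree = ast.parse('("Hello"\n"world")', mode="eval")
node = tree.body

# Assumption: Python 3.8+, where string literals parse to ast.Constant.
print(type(node).__name__, repr(node.value))
```

The parsed expression is a single constant holding 'Helloworld'; there is no separate concatenation step left for the statement to perform.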