
Types in Python

A bird's-eye view of the typing features in Python 3.x

Jun 10, 2021 · 11 min read

Introduction

Python is a dynamically typed language. This means that the Python interpreter does type checking only as code runs, and the type of a variable is allowed to change over its lifetime.

However, Python has included a gradual type system since version 3.5, through PEP 484.

Gradual Typing is the ability of a type system to have expressions that are typed (and thus checked) while also having some that are untyped. The final program is still valid and executable by the interpreter.
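A minimal sketch of what this means in practice. The two function names are hypothetical, chosen only for illustration: one is annotated (so a static checker can verify its callers), the other is left untyped, and both live in the same runnable program.

```python
def typed_double(x: int) -> int:
    # Annotated: a static checker can verify that callers pass an int.
    return x * 2


def untyped_double(x):
    # Not annotated: a static checker treats x as dynamically typed.
    return x * 2


# Both functions are valid Python and run side by side.
print(typed_double(21))
print(untyped_double("ab"))
```

The interpreter itself ignores the difference; only an external checker treats the two functions differently.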

In this article, we're going to take a look at how typing annotations were introduced, what tools we can use today to enforce typings, and then run through the typing capabilities that Python 3.9 offers.

Function Annotations

Python 3 introduced support for function annotations through PEP 3107. It was essentially a way to attach arbitrary metadata to function parameters and return values, for any tool or library to use. For instance, consider the following function that sums two numbers:

def add(a, b):
    return a + b

In Python 3, it is possible to add annotations by using : for the parameters and -> for the return value:

def add(a: "first number", b: "second number"):
    return a + b

The annotations do not modify the function's behavior in any way. Such data is stored in the mutable __annotations__ dictionary of the function itself and can be retrieved and also modified:

def add(a: "first number", b: "second number"):
    return a + b

print(add.__annotations__)

# {'a': 'first number', 'b': 'second number'}

Technically speaking, the annotations are a shortcut to set up the __annotations__ dictionary of a function.
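As a small sketch of that equivalence (the function names add_annotated and add_manual are made up for this example), the dictionary can also be assigned by hand and then modified like any other dictionary:

```python
def add_annotated(a: "first number", b: "second number"):
    return a + b


def add_manual(a, b):
    return a + b


# Setting the dictionary by hand has the same effect as the annotation syntax.
add_manual.__annotations__ = {"a": "first number", "b": "second number"}
print(add_annotated.__annotations__ == add_manual.__annotations__)

# The dictionary is mutable, so entries can be added or changed afterwards.
add_manual.__annotations__["return"] = "sum of the numbers"
print(add_manual.__annotations__)
```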

Any expression is a valid annotation. So the following is also valid:

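from math import sqrt

operation = "square root"


def do_math(a: "Should be a positive number" if operation == "square root" else "Should be a number"):
    # The annotation above is an ordinary expression; it does not affect this body.
    if operation == "square root":
        return sqrt(a)
    return a**2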

and the corresponding __annotations__ dictionary will contain the evaluated expression:

print(do_math.__annotations__)

# {'a': 'Should be a positive number'}

Things get more interesting when we start to annotate function parameters and return values with dictionaries. Consider, for instance, the following annotated function:

def subtract(
    a: {"description": "first number", "exampleValue": 10},
    b: {"description": "second number", "exampleValue": 15}
) -> {"description": "subtraction of the numbers"}:
    return a - b

print(subtract.__annotations__)

# {'a': {'description': 'first number', 'exampleValue': 10},
#  'b': {'description': 'second number', 'exampleValue': 15},
#  'return': {'description': 'subtraction of the numbers'}}

Any other Python program can read such metadata and use it. In this specific case, a hypothetical program could use such annotations to display or even generate documentation. pydoc, for instance, displays the type annotations in the generated artifacts.
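To make the idea concrete, here is a rough sketch of such a program. The describe helper is hypothetical (it is not part of pydoc or any library) and assumes every annotation is a dictionary that may carry a "description" key:

```python
def subtract(
    a: {"description": "first number", "exampleValue": 10},
    b: {"description": "second number", "exampleValue": 15},
) -> {"description": "subtraction of the numbers"}:
    return a - b


def describe(func):
    """Build a plain-text description from dictionary annotations."""
    lines = [func.__name__]
    for name, meta in func.__annotations__.items():
        # The return annotation is stored under the special "return" key.
        label = "returns" if name == "return" else name
        lines.append(f"  {label}: {meta.get('description', 'n/a')}")
    return "\n".join(lines)


print(describe(subtract))
```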

Type Annotations

Because any expression is a valid annotation, it is possible to level up the game by annotating the function parameters with the type constructors that represent the parameter type and have a linter check them out for us.

Linters are programs that statically analyze the source code of a project and, based on their configuration, point out sections that might be problematic. In some cases, they can also suggest corrections. In this specific case, a linter can take such annotations, track all the function usages, check whether the type passed as an argument corresponds to the declared one, and raise an alert when the types do not align. This is exactly what mypy does.

Consider the following function concatenating a string to a constant, and let's use type constructors as type annotations:

def greeting(name: str) -> str:
    return "Hello " + name

greeting(10)

Now let's install mypy and run the checker on the newly created file:

# Create a new directory for your project
mkdir python-typings

# Make your project directory the current directory
cd python-typings
 
pip3 install mypy

touch greetings.py

# Open the file with your favorite editor and paste the content of the snippet above

$ mypy ./greetings.py

greetings.py:4: error: Argument 1 to "greeting" has incompatible type "int"; expected "str"
Found 1 error in 1 file (checked 1 source file)

mypy has identified a misuse of the function's typings and alerted us with an appropriate error message. It does so by looking at the annotations that the function provides and checking them against all the function calls it is able to detect, alerting when mismatches are found.

Just as we have done previously, let's take a look at the annotations on the function:

print(greeting.__annotations__)

# {'name': <class 'str'>, 'return': <class 'str'>}

This is a very different model from, for example, the one that TypeScript uses.

While TypeScript code needs to be transpiled (a procedure that, among other things, strips all the types out) in order to produce regular untyped JavaScript, Python has no real type erasure. The annotations are left in place and can even be accessed at runtime.
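Because the annotations survive at runtime, a program can inspect them itself. The sketch below uses typing.get_type_hints from the standard library; the check_args helper is a hypothetical name, and it only handles plain classes such as str or int (not generics like List[str]):

```python
from typing import get_type_hints


def greeting(name: str) -> str:
    return "Hello " + name


def check_args(func, **kwargs):
    """Return the names of keyword arguments whose value does not match the hint."""
    hints = get_type_hints(func)
    return [
        name
        for name, value in kwargs.items()
        if name in hints and not isinstance(value, hints[name])
    ]


print(check_args(greeting, name="Clark"))  # no mismatches expected
print(check_args(greeting, name=10))       # "name" is expected to be reported
```

This is not how mypy works (mypy never runs the code); it only illustrates that the type information is still there once the program is running.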

mypy is available as a command-line tool or, in case you are not a fan of the terminal, through a VS Code extension or a PyCharm plugin. These integrations can significantly improve your experience when writing code, since they report type errors directly in your IDE and in real time.

Now that we know how type annotations work, let's take a tour of the capabilities of the typing system.

Typing Fundamentals

All the primitives offered by Python are supported through a type constructor. So any variable, function parameter, and return value can be naturally typed with int, str, float, bool, or None.

On top of these, there are a number of special types that are provided for a better experience:

  • Any: specifies an unconstrained type, effectively skipping the type check for anything annotated with it
  • NoReturn: specifies a function that does NOT return in any case. Infinite loops or functions that always raise an exception are some examples
  • Literal: constrains the parameter or the return value to be only one of the provided possible values, effectively subtyping a type

from typing import Literal

def return_literal() -> Literal[1, 2, 3, 4]:
    t: int = 10 
    # return t # Incompatible return value type (got "int", expected "Union[Literal[1], Literal[2], Literal[3], Literal[4]]")
    return 4 # OK
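The other two special types can be sketched in the same spirit. The function names fail and stringify are invented for this example:

```python
from typing import Any, NoReturn


def fail(message: str) -> NoReturn:
    # Never returns normally: it always raises.
    raise RuntimeError(message)


def stringify(value: Any) -> str:
    # Any disables checking for this parameter, so every argument is accepted.
    return str(value)


print(stringify(42))
print(stringify([1, 2]))

try:
    fail("boom")
except RuntimeError as exc:
    print("raised:", exc)
```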

Aggregates

Aggregates are collections of related information we wish to treat as a unit. Python's typing module supports them too.

Dictionaries

Dictionaries can be annotated by creating a subclass of the provided TypedDict class:

from typing import TypedDict


class Movie(TypedDict):
    name: str
    year: int


m: Movie = {"name": "The Avengers", "year": 2018}

It is also possible to use the Movie constructor with keyword arguments, yielding the same kind of dictionary:

m = Movie(name="Batman Begins", year=2005)

All the variable annotations in a module are held in the __annotations__ dictionary of the module itself, so print(__annotations__) is legitimate.

mypy will also raise an alert in case the code tries to access a key that has not been declared:

m["ratings"] = 5 # TypedDict "Movie" has no key 'ratings'

By default, all the keys specified in the typed dictionary must be present. It is possible to relax this constraint through the total boolean value:

class Movie(TypedDict, total=False):
    name: str
    year: int


m: Movie = {"year": 2018} # The name is missing, but this is still ok

With total=False, any of the declared keys can be omitted. The keys that are present must still be declared ones, with values of the declared type.

Be aware that Optional[T] does not make a key optional. It only allows the value to be None, and the key must still be present:

from typing import Optional, TypedDict


class Movie(TypedDict):
    name: str
    year: Optional[int]


m: Movie = {"name": "Batman Begins", "year": None}
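Since Optional[int] still requires the key to be present, a TypedDict that lets some keys be left out needs a different approach. One option that works on Python 3.9 is to inherit a total=False class from a regular one. The class names MovieBase and MovieWithYear below are hypothetical:

```python
from typing import TypedDict


class MovieBase(TypedDict):
    # Keys declared here are required.
    name: str


class MovieWithYear(MovieBase, total=False):
    # Keys declared here may be omitted.
    year: int


m1: MovieWithYear = {"name": "Batman Begins"}
m2: MovieWithYear = {"name": "Batman Begins", "year": 2005}

# TypedDict classes expose which keys are required and which are not.
print(MovieWithYear.__required_keys__)
print(MovieWithYear.__optional_keys__)
```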

Optional[T] can also be used to annotate function parameters that accept None:

from typing import Optional, Union


def greet(name: Optional[str], surname: Union[str, None]):
    n = name or "Clark"
    s = surname or "Kent"
    # return "Hello, " + name + " " + s  # rejected by the type checker: name may be None
    return "Hello, " + n + " " + s

You can see that we have also used Union[str, None], which is effectively the same thing as Optional[str].
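A short sketch of both points: the two spellings compare equal at runtime, and it is the default value (not Optional) that lets a caller leave an argument out. This greet variant with a default is an illustration, not the function above:

```python
from typing import Optional, Union

# Optional[str] is shorthand for Union[str, None].
print(Optional[str] == Union[str, None])


def greet(name: Optional[str] = None) -> str:
    # The default value, not Optional, is what allows omitting the argument.
    return "Hello, " + (name or "Clark")


print(greet())
print(greet("Bruce"))
```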

The Mapping type can be used when we have fewer constraints about the keys/values of the dictionary we want to operate with:

from collections.abc import Mapping

m: Mapping[str, int] = {"year": 2018, "randomProperty": 1000}

You can see that, as long as the key of the dictionary is a string and the value is an integer, the dictionary is valid, and we do not have to specify the property names ahead of time.

While dict is by far the most common class implementing Mapping, it is not the only one (types.MappingProxyType is another). It is advisable to annotate with the least specific type that fits, so that the code keeps working when callers pass a different mapping implementation. The same applies to list, as we'll see shortly.
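As a sketch of why the less specific type helps, the hypothetical function total below only reads from its argument, so it accepts both a plain dict and a read-only types.MappingProxyType view:

```python
from collections.abc import Mapping
from types import MappingProxyType


def total(values: Mapping[str, int]) -> int:
    # Only reads from the mapping, so any Mapping implementation works.
    return sum(values.values())


plain = {"year": 2018, "randomProperty": 1000}
read_only = MappingProxyType(plain)

print(total(plain))
print(total(read_only))
```

Had the parameter been annotated as a dict, a type checker would be expected to reject the second call even though it works at runtime.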

Sequences

Python has the Sequence[T] base class that can be used to model a number of classes, such as list, str, tuple, and bytes. The Sequence protocol only supports accessing items (but not modifying them), so you will need to type according to your use case:

from collections.abc import Sequence


seq: Sequence[int] = [1, 2, 3, 4]
seq[0] = 10  # Unsupported target for indexed assignment ("Sequence[int]")

lst: list[int] = [1, 2, 3, 4]
lst[0] = 10  # This is OK

In case you're a fan of immutability, you've just gained a powerful tool to clearly mark the fact that a function will NOT modify the sequence passed to it.
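A minimal sketch of such a function (average is a made-up name): annotating the parameter as Sequence[int] documents that the input is only read, and lets callers pass either a list or a tuple.

```python
from collections.abc import Sequence


def average(values: Sequence[int]) -> float:
    # Reads items only; the annotation signals the input is not modified.
    return sum(values) / len(values)


print(average([1, 2, 3, 4]))  # called with a list
print(average((1, 2, 3, 4)))  # called with a tuple
```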

Type Guard

Type Guard is a feature found in different languages (including TypeScript), where the Type System is capable of narrowing down a type in a block based on a condition.

For instance, if we consider this code:

from typing import Optional


def func(val: Optional[str]) -> str:
    if val is not None:
        return val
    return "default"

The type checker, by statically analyzing the code, is able to understand that val is a string (and only a string) when the condition of the if statement is true, and narrows it down to just None when the condition is false.
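The same kind of narrowing applies to isinstance checks. In this sketch (the function name label is hypothetical), each branch can use operations that are only valid for the narrowed type:

```python
from typing import Union


def label(value: Union[int, str]) -> str:
    if isinstance(value, int):
        # Narrowed to int here, so arithmetic is allowed.
        return "number " + str(value + 1)
    # Narrowed to str here, so string methods are allowed.
    return "text " + value.upper()


print(label(41))
print(label("abc"))
```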

There are cases where type narrowing cannot be applied based on static information only, and in such cases, custom type guards can come to help. For instance, consider this code example:

from typing import List


def is_str_list(val: List[object]) -> bool:
    return all(isinstance(x, str) for x in val)

lst: List[object] = ["hello", "world", 1, 3, 4]


def capitalize_first_element(lst: List[object]) -> str:
    if is_str_list(lst):
        return lst[0].capitalize()
    return str(lst[0]).capitalize()

If we run mypy on it, we will receive the following error:

code.py:12: error: "object" has no attribute "capitalize"

The Type System is not able to infer that, when is_str_list returns True, it's safe to assume that the entire list contains only str values, for which the capitalize method exists.

Custom Type Guards allow us to instruct the Type Checker about a type "assumption" based on a boolean value.

from typing import List, TypeGuard


def is_str_list(val: List[object]) -> TypeGuard[List[str]]:
    return all(isinstance(x, str) for x in val)

By using this function instead of the one provided above, the Type Checker will be able to refine the type from List[object] to List[str] (when the return value is True), where the capitalize method is defined for every element.
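Putting the pieces together, a complete sketch might look like the following. TypeGuard lives in the typing module from Python 3.10; the fallback import assumes the typing_extensions package is installed on older versions.

```python
from typing import List

try:
    from typing import TypeGuard  # Python 3.10+
except ImportError:
    # Assumption: typing_extensions is installed on older versions.
    from typing_extensions import TypeGuard


def is_str_list(val: List[object]) -> TypeGuard[List[str]]:
    return all(isinstance(x, str) for x in val)


def capitalize_first_element(lst: List[object]) -> str:
    if is_str_list(lst):
        # lst is treated as List[str] here, so capitalize() type-checks.
        return lst[0].capitalize()
    return str(lst[0]).capitalize()


print(capitalize_first_element(["hello", "world"]))
print(capitalize_first_element([1, "world"]))
```

At runtime the guard is just a function returning a boolean; the TypeGuard annotation only changes what the type checker assumes inside the if branch.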

The accuracy of the type guard function is left to the developer. If the function returns an incorrect value, the type checker has no way to figure that out, and it will narrow the type on the wrong branch. For instance, let's say we have a bug in our function:

from typing import List, TypeGuard


def is_str_list(val: List[object]) -> TypeGuard[List[str]]:
    return all(not isinstance(x, str) for x in val)

In such a case, the type checker will still assume the list of objects contains only strings when the function returns True, even if that's semantically incorrect, leading to possible bugs.

Custom type guards are not available in Python 3.9; they are added in Python 3.10 through PEP 647. The typing_extensions package also provides TypeGuard for earlier versions.

Conclusion

Typing in Python works: it is a success story. It has been gradually integrated into the language with no significant modifications, and it does not alter the dynamic nature of the language. While the feature set is far from complete, the path taken is definitely the right one and, in some circumstances, might lead to better software.