Introduction
Python is a dynamically typed language. This means that the Python interpreter does type checking only as code runs, and the type of a variable is allowed to change over its lifetime.
However, Python has included a gradual type system for a long time, through PEP 484.
Gradual Typing is the ability of a type system to have expressions that are typed (and thus checked) while also having some that are untyped. The final program is still valid and executable by the interpreter.
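As a minimal sketch of what this means in practice, the snippet below mixes an annotated function with an unannotated one. The function names are hypothetical and chosen only for illustration. A static type checker can verify calls to the first function against its declared types, while the second stays dynamically typed. The interpreter runs both the same way.

```python
# Annotated: a static type checker can verify calls against these types.
def double(x: int) -> int:
    return x * 2


# Unannotated: the parameter is dynamically typed, so any argument is accepted.
def legacy_double(x):
    return x * 2


print(double(21))           # 42
print(legacy_double("ab"))  # abab
```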
In this article, we'll look at how typing annotations were introduced, which tools we can use today to enforce typings, and then run through the typing capabilities that Python 3.9 offers.
Function Annotations
Python 3 introduced support for function annotations through PEP 3107. It was essentially a way to attach arbitrary metadata to function parameters and return values for other code to use. For instance, consider the following function that sums two numbers:
    def add(a, b):
        return a + b
In Python 3, it is possible to add annotations by using : for the parameters and -> for the return value:
    def add(a: "first number", b: "second number"):
        return a + b
The annotations do not modify the function's behavior in any way. The data is stored in the function's mutable __annotations__ dictionary, where it can be retrieved and also modified:
    def add(a: "first number", b: "second number"):
        return a + b

    print(add.__annotations__)
    # {'a': 'first number', 'b': 'second number'}
Technically speaking, annotations are a shortcut for populating the __annotations__ dictionary of a function.
Any expression is a valid annotation, so the following is also valid:
    from math import sqrt

    operation = "square root"

    def do_math(
        a: "Should be a positive number" if operation == "square root" else "Should be a number"
    ):
        if operation == "square root":
            return sqrt(a)
        return a**2
and the corresponding __annotations__ will contain the evaluated expression:
    print(do_math.__annotations__)
    # {'a': 'Should be a positive number'}
Things get more interesting when we start to annotate function parameters and return values with dictionaries. Consider, for instance, the following annotated function:
    def subtract(
        a: {"description": "first number", "exampleValue": 10},
        b: {"description": "second number", "exampleValue": 15},
    ) -> {"description": "subtraction of the numbers"}:
        return a - b

    print(subtract.__annotations__)
    # {'a': {'description': 'first number', 'exampleValue': 10}, 'b': {'description': 'second number', 'exampleValue': 15}, 'return': {'description': 'subtraction of the numbers'}}
Any other Python program can read such metadata and use it. In this specific case, a hypothetical program could use the annotations to display or even generate documentation. pydoc, for instance, displays the annotations in the generated artifacts.
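To make the idea concrete, here is a rough sketch of such a program. The describe helper is hypothetical (it is not part of pydoc or any library) and assumes every annotation is a dictionary with a "description" key, as in the subtract example above.

```python
def subtract(
    a: {"description": "first number", "exampleValue": 10},
    b: {"description": "second number", "exampleValue": 15},
) -> {"description": "subtraction of the numbers"}:
    return a - b


def describe(func):
    """Hypothetical helper: render a function's annotations as plain text."""
    lines = [f"{func.__name__}:"]
    for name, meta in func.__annotations__.items():
        # The return annotation is stored under the special "return" key.
        label = "returns" if name == "return" else name
        lines.append(f"  {label}: {meta['description']}")
    return "\n".join(lines)


print(describe(subtract))
# subtract:
#   a: first number
#   b: second number
#   returns: subtraction of the numbers
```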
Type Annotations
Because any expression is a valid annotation, it is possible to level up the game by annotating the function parameters with the type constructors that represent the parameter type and have a linter check them out for us.
Linters are programs that statically analyze the source code of a project and, based on their configuration, point out sections that might be problematic. In some cases, they can also suggest corrections. Here, a linter can read the annotations, track every call to the function, check whether the type of each argument matches the declared one, and alert when the types do not align. This is exactly what mypy does.
Consider the following function concatenating a string to a constant, and let's use type constructors as type annotations:
    def greeting(name: str) -> str:
        return "Hello " + name

    greeting(10)
Now let's install mypy and run the checker on the newly created file:
    # Create a new directory for your project
    $ mkdir python-typings
    # Make your project directory the current directory
    $ cd python-typings
    $ pip3 install mypy
    $ touch greetings.py
    # Open the file with your favorite editor and paste the content of the snippet above
    $ mypy ./greetings.py
    greetings.py:4: error: Argument 1 to "greeting" has incompatible type "int"; expected "str"
    Found 1 error in 1 file (checked 1 source file)
mypy has identified a misuse of the function and alerted us with an appropriate error message. It does so by reading the annotations that the function provides and checking them against every call it is able to detect, reporting any mismatch.
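It is worth stressing that this check happens only in the static tool. The sketch below, which uses a hypothetical repeat function, shows that the interpreter itself does not enforce annotations. A call that a type checker would reject still runs, as long as the operations inside the function happen to work for the value passed.

```python
def repeat(text: str, times: int) -> str:
    return text * times


# A type checker would flag this call, since a list is not a str.
# The interpreter ignores the annotation and simply runs the code.
result = repeat([1, 2], 2)
print(result)  # [1, 2, 1, 2]
```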
Just as we have done previously, let's take a look at the annotations on the function:
    print(greeting.__annotations__)
    # {'name': <class 'str'>, 'return': <class 'str'>}
This is a very different model from, for example, the one that TypeScript uses.
TypeScript code needs to be transpiled (a step that, among other things, strips all the types out) to produce regular untyped JavaScript. Python, by contrast, has no real type erasure: the annotations are left in place and can even be accessed at runtime.
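As a small illustration of runtime access, the standard library's typing.get_type_hints returns the annotations of a function as a dictionary of real type objects. The isinstance check at the end is only a sketch of what a runtime consumer could do with them, not something Python does on its own.

```python
from typing import get_type_hints


def greeting(name: str) -> str:
    return "Hello " + name


hints = get_type_hints(greeting)
print(hints)  # {'name': <class 'str'>, 'return': <class 'str'>}

# Illustrative only: compare a candidate argument with the declared type.
value = 10
is_valid = isinstance(value, hints["name"])
print(is_valid)  # False
```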
mypy is available as a command-line tool or, in case you are not a fan of the terminal, through a VS Code extension or a PyCharm plugin. These integrations can significantly improve the experience of writing code, since they report type errors directly in your IDE in real time.
Now that we know how type annotations work, let's take a tour of the typing system's capabilities.
Typing Fundamentals
All the primitives offered by Python are supported through a type constructor, so any variable, function parameter, and return value can be naturally typed with int, str, float, bool, or None.
On top of these, a number of special types are provided for a better experience:
Any: specifies an unconstrained type, effectively skipping the type check for anything annotated with it
NoReturn: specifies a function that does NOT return in any case. Infinite loops or functions that always raise an exception are some examples
Literal: constrains the parameter or the return value to be only one of the provided values, effectively subtyping a type
    from typing import Literal

    def return_literal() -> Literal[1, 2, 3, 4]:
        t: int = 10
        # return t  # Incompatible return value type (got "int", expected "Union[Literal[1], Literal[2], Literal[3], Literal[4]]")
        return 4  # OK
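The other two special types from the list above can be sketched in the same way. The function names fail and label are hypothetical. The first always raises, so NoReturn describes it accurately. The second accepts anything because its parameter is annotated with Any.

```python
from typing import Any, NoReturn


def fail(message: str) -> NoReturn:
    # This function never returns normally: it always raises.
    raise RuntimeError(message)


def label(value: Any) -> str:
    # Any disables checking for this parameter, so every argument is accepted.
    return f"{type(value).__name__}: {value!r}"


print(label(10))    # int: 10
print(label("ten")) # str: 'ten'

try:
    fail("boom")
except RuntimeError as exc:
    caught = str(exc)
print(caught)       # boom
```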
Aggregates
Aggregates are collections of related information we wish to treat as a unit. Python's typing module supports them too.
Dictionaries
Dictionaries can be annotated by creating a subclass of the provided TypedDict class:
    from typing import TypedDict

    class Movie(TypedDict):
        name: str
        year: int

    m: Movie = {"name": "The Avengers", "year": 2018}
It is also possible to use the Movie constructor with keyword arguments, yielding the same kind of result:
    m = Movie(name="Batman Begins", year=2005)
All the variable annotations on a module are held in the __annotations__ dictionary of the module itself, so print(__annotations__) is legit.
mypy will also alert us when the code tries to access a key that has not been declared:
m["ratings"] = 5 # TypedDict "Movie" has no key 'ratings'
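Keep in mind that this protection exists only in the type checker. As the following sketch shows, a TypedDict value is a plain dict at runtime, so the interpreter accepts the undeclared key without complaint.

```python
from typing import TypedDict


class Movie(TypedDict):
    name: str
    year: int


m: Movie = {"name": "The Avengers", "year": 2018}
print(type(m))  # <class 'dict'>

# mypy reports an error here, but the interpreter does not.
m["ratings"] = 5
print(m)  # {'name': 'The Avengers', 'year': 2018, 'ratings': 5}
```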
By default, all the keys specified in the typed dictionary must be present. It is possible to relax this constraint through the total boolean flag:
    class Movie(TypedDict, total=False):
        name: str
        year: int

    m: Movie = {"year": 2018}  # The name is missing, but this is still OK
Every key becomes optional, but any key that is present is still checked: name must be a string and year must be an integer.
If a key must always be present but may have no value, annotate it with Optional[T]. Note that Optional relaxes the value (it may be None), not the presence of the key:
    from typing import Optional

    class Movie(TypedDict):
        name: str
        year: Optional[int]

    m: Movie = {"name": "Batman Begins", "year": None}
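When some keys should be mandatory and others may be left out entirely, one common approach is to split the definition across two classes and apply total=False only to the subclass. The sketch below assumes Python 3.9 or later, where TypedDict classes expose __required_keys__ and __optional_keys__. The MovieBase name is hypothetical.

```python
from typing import TypedDict


class MovieBase(TypedDict):
    name: str  # always required


class Movie(MovieBase, total=False):
    year: int  # may be omitted entirely


full: Movie = {"name": "Batman Begins", "year": 2005}
partial: Movie = {"name": "Batman Begins"}  # accepted: year is optional

print(Movie.__required_keys__)  # frozenset({'name'})
print(Movie.__optional_keys__)  # frozenset({'year'})
```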
Optional[T] can also be used to annotate function parameters that accept None:
    from typing import Optional, Union

    def greet(name: Optional[str], surname: Union[str, None]):
        n = name or "Clark"
        s = surname or "Kent"
        # return "Hello, " + name + " " + s  # rejected by mypy: name may be None
        return "Hello, " + n + " " + s
You can see that we have also used Union[str, None], which is effectively the same thing as Optional[str].
The Mapping type can be used when we have fewer constraints on the keys and values of the dictionary we want to operate with:
    from collections.abc import Mapping

    m: Mapping[str, int] = {"year": 2018, "randomProperty": 1000}
You can see that, as long as the key of the dictionary is a string and the value is an integer, the dictionary is valid, and we do not have to specify the property names ahead of time.
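One practical benefit of annotating with Mapping is that a function accepts any mapping implementation, not only dict. The following sketch uses a hypothetical total function and the standard library's read-only MappingProxyType to illustrate this.

```python
from collections.abc import Mapping
from types import MappingProxyType


def total(values: Mapping[str, int]) -> int:
    # Only read access is needed, so any Mapping implementation works.
    return sum(values.values())


plain = {"year": 2018, "randomProperty": 1000}
read_only = MappingProxyType(plain)  # a read-only view over the dict

print(total(plain))      # 3018
print(total(read_only))  # 3018
```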
While dict is by far the most common class implementing Mapping, it is advisable to annotate with the least specific type that works, so that a function keeps accepting any other mapping passed to it in the future. The same applies to list, as we'll see shortly.
Sequences
Python has the Sequence[T] base class, which can be used to model a number of classes, such as list, str, tuple, and bytes. The Sequence protocol only supports reading items (not modifying them), so you will need to choose the type according to your use case:
    from collections.abc import Sequence

    seq: Sequence[int] = [1, 2, 3, 4]
    seq[0] = 10  # Unsupported target for indexed assignment ("Sequence[int]")

    lst: list[int] = [1, 2, 3, 4]
    lst[0] = 10  # This is OK
In case you're a fan of immutability, you've just gained a powerful tool to clearly signal that a function will NOT modify the sequence passed to it.
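As a short sketch of that idea, the hypothetical average function below declares a Sequence parameter. It therefore accepts both a list and a tuple, and a type checker would reject any attempt to assign to an item inside the function body.

```python
from collections.abc import Sequence


def average(values: Sequence[float]) -> float:
    # Read-only access: indexing, iteration, and len() are all allowed.
    return sum(values) / len(values)


print(average([1.0, 2.0, 3.0]))  # 2.0
print(average((1.0, 2.0, 3.0)))  # 2.0
```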
Type Guard
Type Guard is a feature found in different languages (including TypeScript), where the Type System is capable of narrowing down a type in a block based on a condition.
For instance, if we consider this code:
    from typing import Optional

    def func(val: Optional[str]) -> str:
        if val is not None:
            return val
        return "default"
The type checker, by statically analyzing the code, is able to understand that val is a string (and only a string) when the condition on the first line is true, and that it narrows down to just None when the condition is false.
There are cases where type narrowing cannot be applied based on static information only, and in such cases custom type guards can help. For instance, consider this code example:
    from typing import List

    def is_str_list(val: List[object]) -> bool:
        return all(isinstance(x, str) for x in val)

    lst: List[object] = ["hello", "world", 1, 3, 4]

    def capitalize_first_element(lst: List[object]) -> str:
        if is_str_list(lst):
            return lst[0].capitalize()
        return str(lst[0]).capitalize()
If we run mypy on it, we will receive the following error:
    code.py:10: error: "object" has no attribute "capitalize"
The Type System is not able to infer that, when is_str_list returns True, it is safe to assume that the entire list contains only str values, for which the capitalize method exists.
Custom Type Guards allow us to tell the Type Checker about a type "assumption" based on a boolean value.
    from typing import List, TypeGuard

    def is_str_list(val: List[object]) -> TypeGuard[List[str]]:
        return all(isinstance(x, str) for x in val)
By using this function instead of the one provided above, the Type Checker will be able to refine List[object] to List[str] (when the return value is True), where the capitalize method is defined for every element.
The accuracy of the type guard function is left to the developer. If the function returns an incorrect value, the type checker has no way to figure that out, and it will follow the wrong branch. For instance, let's say we have a bug in our function:
    from typing import List, TypeGuard

    def is_str_list(val: List[object]) -> TypeGuard[List[str]]:
        return all(not isinstance(x, str) for x in val)
In such a case, the type checker will still assume that the list contains only strings when the function returns True, even though that is semantically incorrect, leading to possible bugs.
Unfortunately, custom type guards are not available in Python 3.9; they are added in Python 3.10 through PEP 647.
Conclusion
Typing in Python works, and it is a success story. It has been gradually integrated into the language with no significant modifications, and it does not alter the language's dynamic nature. While the features are far from complete, the path taken is definitely the right one and, in some circumstances, might lead to better software.
About the author
Vincenzo Chianese
API Architect