Python Type Hinting: To hint or to cast?

During a code review, a colleague, Zachary Paden, asked me why I was calling the typing.cast function on my variables rather than creating temporary variables just to type hint. Well, just as he didn’t know about cast, I didn’t know that this approach worked. Being the nerdz that we are, he decided to measure the performance of each approach.

To begin with, let’s go with a function that is not adequately typed.

def func(data: dict) -> dict:
    return data["subdata"]

Mypy yields the following error message: Returning Any from function declared to return "Dict[Any, Any]"

First solution: Casting

from typing import cast

def casting(data: dict) -> dict:
    return cast(dict, data["subdata"])
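Worth keeping in mind: cast only talks to the type checker. At runtime it hands back its second argument untouched and validates nothing. A minimal sketch to illustrate this (the names payload, same_object and not_a_dict are mine, for illustration only):

```python
from typing import cast


def casting(data: dict) -> dict:
    # cast() returns its second argument unchanged at runtime.
    return cast(dict, data["subdata"])


payload = {"subdata": {"more": "data"}}

# The function returns the very same object that is stored in the dictionary.
same_object = casting(payload) is payload["subdata"]

# No validation happens: a string "cast" to dict is still a string.
not_a_dict = cast(dict, "oops")
print(same_object, type(not_a_dict).__name__)
```

So whatever casting costs, it is the cost of a function call and of evaluating its arguments, not of any checking.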

Second solution: extract the sub-dictionary into a temporary variable that is properly typed.

def hinting(data: dict) -> dict:
    subdata: dict = data["subdata"]
    return subdata
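One way to see what each function does at runtime is to look at the global names its bytecode refers to. This is a rough sketch, relying on CPython's __code__.co_names attribute; the variable names casting_names and hinting_names are mine:

```python
from typing import cast


def casting(data: dict) -> dict:
    return cast(dict, data["subdata"])


def hinting(data: dict) -> dict:
    # Annotations on local variables are not evaluated when the function runs.
    subdata: dict = data["subdata"]
    return subdata


# co_names lists the global names and attributes the function body looks up.
casting_names = casting.__code__.co_names
hinting_names = hinting.__code__.co_names

print("casting looks up:", casting_names)
print("hinting looks up:", hinting_names)
```

The casting body has to look up both cast and dict on every call, while the annotation in hinting leaves no trace in the function body.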

Both approaches are perfectly valid, yet one is noticeably faster than the other (up to ~10x in Zach's measurements). Which one is it?

To measure, we used IPython and the %timeit magic command. For the casting solution, the import is excluded from the measurements.

In [1]: from typing import cast
   ...:
   ...: def casting(data: dict) -> dict:
   ...:     return cast(dict, data["subdata"])
   ...:

In [2]: def hinting(data: dict) -> dict:
   ...:     subdata: dict = data["subdata"]
   ...:     return subdata
   ...:

In [3]: data = {"subdata": {"more" : "data"}}

In [4]: %timeit casting(data)
52.1 ns ± 0.0996 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

In [5]: %timeit hinting(data)
33.9 ns ± 0.0581 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
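If you don't have IPython at hand, roughly the same measurement can be done with the standard library's timeit module. This is only a sketch: the helper best_ns_per_call and its default loop counts are my own choices, and the numbers will differ from machine to machine.

```python
import timeit
from typing import cast


def casting(data: dict) -> dict:
    return cast(dict, data["subdata"])


def hinting(data: dict) -> dict:
    subdata: dict = data["subdata"]
    return subdata


data = {"subdata": {"more": "data"}}


def best_ns_per_call(func, number=200_000, repeat=5):
    """Return the best observed time per call, in nanoseconds."""
    timings = timeit.repeat(
        "f(d)",
        globals={"f": func, "d": data},  # the import cost is not measured
        number=number,
        repeat=repeat,
    )
    return min(timings) / number * 1e9


print(f"casting: {best_ns_per_call(casting):.1f} ns per call")
print(f"hinting: {best_ns_per_call(hinting):.1f} ns per call")
```

Note that this reports the best run rather than the mean ± std. dev. that %timeit prints, so the figures are not directly comparable with the ones above.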

We observe that the annotated version is ~35% faster. But in his experiment, Zach had a 90% difference. Why was this? Well, his code was typed more precisely than the one above. Here is a new version with more precise annotations.

In [6]: def casting(data: dict) -> dict:
   ...:     return cast(dict[str, str], data["subdata"])
   ...:

In [7]: def hinting(data: dict) -> dict:
   ...:     subdata: dict[str, str] = data["subdata"]
   ...:     return subdata
   ...:

In [8]: %timeit casting(data)
116 ns ± 0.105 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

In [9]: %timeit hinting(data)
33.9 ns ± 0.0523 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

Well well, this time the annotated version is ~71% faster than the casting version. Or rather, the casting version is ~2.23 times slower than before (116 ns versus 52.1 ns). Let’s make the type heavier to see the new performance impact.

In [10]: data = {"subdata": {"way" : {"more" : "data"}}}

In [11]: def casting(data: dict) -> dict:
    ...:     return cast(dict[str, dict[str, str]], data["subdata"])
    ...:

In [12]: def hinting(data: dict) -> dict:
    ...:     subdata: dict[str, dict[str, str]] = data["subdata"]
    ...:     return subdata
    ...:

In [13]: %timeit casting(data)
182 ns ± 0.0948 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

In [14]: %timeit hinting(data)
33.9 ns ± 0.0621 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
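For anyone who wants to check the percentages and ratios quoted in this post, here is the arithmetic spelled out. The timings are copied from the %timeit output above; the helper names percent_faster and times_slower are mine.

```python
# Mean timings in nanoseconds, copied from the %timeit runs above.
casting_ns = {
    "dict": 52.1,
    "dict[str, str]": 116.0,
    "dict[str, dict[str, str]]": 182.0,
}
hinting_ns = 33.9  # identical in all three runs


def percent_faster(slow: float, fast: float) -> float:
    """How much less time the fast version takes, as a percentage of the slow one."""
    return (slow - fast) / slow * 100


def times_slower(slow: float, fast: float) -> float:
    """How many times longer the slow version takes."""
    return slow / fast


for label, ns in casting_ns.items():
    print(
        f"{label}: hinting is {percent_faster(ns, hinting_ns):.1f}% faster, "
        f"casting takes {times_slower(ns, hinting_ns):.2f}x as long"
    )
```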

Findings:

  1. The version that casts is ~1.57 times slower than before.
  2. The performance of the annotating version is stable. Very stable.
  3. The performance of the version that casts is subject to the complexity of the type.
  4. I’m really nerdy to take the time to blog about nanoseconds of performance.

😂
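A likely explanation for finding 3: the first argument to cast is an ordinary expression, so dict[str, dict[str, str]] is evaluated on every call, and each subscript builds a new alias object. The annotation on a local variable is never evaluated, which would explain why hinting stays flat. If you want to keep cast, one option is to build the type once at module level. This is a sketch I have not timed, and the names SubData, casting_inline and casting_hoisted are hypothetical (Python 3.9+ for the dict[...] syntax):

```python
import types
from typing import cast

# Hypothetical module-level alias, built once at import time.
SubData = dict[str, dict[str, str]]


def casting_inline(data: dict) -> dict:
    # The subscript expressions run on every call.
    return cast(dict[str, dict[str, str]], data["subdata"])


def casting_hoisted(data: dict) -> dict:
    # Each call only looks up the prebuilt alias.
    return cast(SubData, data["subdata"])


data = {"subdata": {"way": {"more": "data"}}}

# The alias is a real runtime object, not just a hint for the type checker.
alias_is_runtime_object = isinstance(SubData, types.GenericAlias)
print(alias_is_runtime_object, casting_inline(data) is casting_hoisted(data))
```

Both functions return the same object, so the choice only affects how much work happens per call, and the call to cast itself still remains.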
