Python code formatters comparison: Black, autopep8 and YAPF

Following some discussions at work and the team's wish to adopt a Python code formatter, I set out to explore a few of them. Needless to say, the contenders had to aim for pep8 compliance. Here are my findings on three of them.

Black

  • Source: https://github.com/ambv/black
  • Version at time of writing: 18.6b1
  • First commit: Wed Mar 14 12:55:32 2018 -0700
  • Watch count: 94
  • Stars: 4626
  • Forks:  174
  • Issues: 47

Black is what I would call a strict formatter. It applies its style guide even where pep8 was not violated. Black is highly opinionated and has close to zero configuration. As the readme states:

Black reformats entire files in place. It is not configurable.

Clear enough. That’s a design decision. There are in fact only two configurable formatting options: the maximum line length and whether or not to normalize string quotes/prefixes. That’s it. If you are curious to learn why Black formats the way it does, the readme explains the rationale behind most of its choices.

Here is an example of formatting done by Black that I personally dislike. For this input:

 

data = {
    'key': {
        'subkey': 'value',
        'foo': 'bar'
    }
}

It reformats as:

data = {"key": {"subkey": "value", "foo": "bar"}}

But that’s just me, and I get it. One of the points of having a formatter is to make code uniform without having discussions about it. Black takes this to another level: there is hardly any discussion possible even at the configuration level. If you really insist on having it NOT touch some sections of code, you can surround them with # fmt: off and # fmt: on comments.
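As a small sketch of what that looks like in practice, the markers are ordinary comments placed around the code to protect. The dict is the sample from this post, and the variable named other is made up for illustration.

```python
# fmt: off
data = {
    'key': {
        'subkey': 'value',
        'foo': 'bar'
    }
}
# fmt: on

# Code outside the fmt: off / fmt: on pair is formatted as usual.
other = {"answer": 42}
```

The markers only affect layout; the values are the same either way.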

One nice feature: the --check flag. Use it in your favourite CI tool to make the build fail if the code is not formatted as Black would format it.
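For a CI script written in Python, a minimal sketch could look like the following. The helper names black_command and describe_check_result are hypothetical and not part of Black. The flags --line-length, --skip-string-normalization and --check are Black's own, as are the exit codes of --check (0: nothing would change, 1: some files would be reformatted, 123: internal error).

```python
from typing import List, Sequence


def black_command(
    paths: Sequence[str],
    line_length: int = 88,
    skip_string_normalization: bool = False,
    check: bool = False,
) -> List[str]:
    """Build a Black command line (hypothetical helper, not part of Black)."""
    cmd = ["black", "--line-length", str(line_length)]
    if skip_string_normalization:
        # Leave string quotes and prefixes exactly as written.
        cmd.append("--skip-string-normalization")
    if check:
        # Report through the exit code instead of rewriting files.
        cmd.append("--check")
    cmd.extend(paths)
    return cmd


def describe_check_result(returncode: int) -> str:
    """Translate the exit code of `black --check` into a short message."""
    if returncode == 0:
        return "formatted"
    if returncode == 1:
        return "needs formatting"
    return "error"
```

The command list can be handed to subprocess.run, and its returncode passed to describe_check_result to decide whether the build should fail.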

A last note: I have not experienced any issues with Black myself, but some of my colleagues have. It’s still very new, so I suppose that is to be expected.

autopep8

  • Source: https://github.com/hhatto/autopep8
  • Version at time of writing: 1.3.5
  • First commit: Thu Dec 30 05:27:29 2010 +0900
  • Watch count: 60
  • Stars: 2253
  • Forks:  163
  • Issues: 66

autopep8 is what I would call a loose formatter. Its aim is fixing pep8 errors, not making the code uniform. If we take the two code samples above, in the Black section, they are both pep8 compliant, so autopep8 would not change either of them. Because it only modifies code that is not pep8 compliant, it cannot relieve you of managing the uniformity of coding styles by hand.
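To make the point concrete, here is a sketch using only the standard library. It shows that the two dict samples from the Black section parse to the same syntax tree. They differ only in layout and quote style, so a tool that only fixes pep8 violations has no reason to prefer one over the other. The names EXPANDED, COLLAPSED and same_syntax_tree are made up for this illustration.

```python
import ast

EXPANDED = """
data = {
    'key': {
        'subkey': 'value',
        'foo': 'bar'
    }
}
"""

COLLAPSED = """
data = {"key": {"subkey": "value", "foo": "bar"}}
"""


def same_syntax_tree(first: str, second: str) -> bool:
    """Return True when two sources parse to the same abstract syntax tree."""
    # ast.dump leaves out line and column numbers by default, so only the
    # structure and the values are compared, not the layout.
    return ast.dump(ast.parse(first)) == ast.dump(ast.parse(second))


print(same_syntax_tree(EXPANDED, COLLAPSED))  # prints True
```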

Basically, autopep8 is great at helping with pep8 compliance, and that’s it.

YAPF

  • Source: https://github.com/google/yapf
  • Version at time of writing: 0.22.0
  • First commit: Wed Mar 18 13:36:07 2015 -0700
  • Watch count: 200
  • Stars: 7427
  • Forks:  496
  • Issues: 156

What does YAPF stand for? I don’t know. Maybe “Yet Another Python Formatter”? But that’s just a guess.

YAPF is made by Google, but as the readme states:

YAPF is not an official Google product (experimental or otherwise), it is just code that happens to be owned by Google.

Like Black, it is what I would call a strict formatter. One major difference: it can be configured. It comes with three built-in styles (pep8, google and chromium), but the documentation doesn’t bother highlighting the differences between them. On top of that, you can fine-tune your style of choice with “knobs”, as they call them. Again, the documentation fails to explain clearly what some of them do. Configurations can be saved to a file that is looked up at launch.

YAPF also has a “leave this section alone” functionality with # yapf: disable/enable.
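A short sketch of both forms described in the YAPF readme: a disable/enable pair around a block, and a trailing comment that covers a single literal. The two matrices are made-up data whose hand-aligned layout is worth preserving.

```python
# yapf: disable
IDENTITY = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]
# yapf: enable

# A trailing comment on the closing bracket covers just that literal.
ROTATION = [
    [0, -1],
    [1,  0],
]  # yapf: disable
```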

I would have loved to see a flag like Black’s --check to validate the formatting. Since YAPF doesn’t provide anything similar, I crafted a bash command that does the job: it fails as soon as the diff contains at least one line.

yapf --diff --recursive . | wc -l | xargs test 0 -eq
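The same check can be sketched in Python for teams whose CI scripts are not written in bash. This assumes the yapf executable is on the PATH. The function names is_formatted and yapf_check are hypothetical; only the --diff and --recursive flags come from the command above.

```python
import subprocess
from typing import Sequence


def is_formatted(diff_output: str) -> bool:
    """An empty diff means YAPF would not change anything."""
    return diff_output == ""


def yapf_check(paths: Sequence[str] = (".",)) -> bool:
    """Run `yapf --diff --recursive` and report whether the diff is empty."""
    result = subprocess.run(
        ["yapf", "--diff", "--recursive", *paths],
        stdout=subprocess.PIPE,
        universal_newlines=True,  # decode stdout as text
    )
    return is_formatted(result.stdout)
```

A CI entry point could then call sys.exit(0 if yapf_check() else 1) to fail the build when the diff is not empty.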

Sadly for me, though, if you look at the first code sample again, I cannot get YAPF to leave it alone either.


As mentioned earlier, I started digging into this topic after a colleague introduced us to Black. As a team, we decided not to use it because it behaves in ways we disagree with. As for autopep8, I am already using it, but it doesn’t make the code uniform, so we are looking for a bigger weapon. YAPF seems like a strong contender.

Clearly, there are more formatters out there that I did not try. Are any of them worth my attention? Mention them in the comments!

Comments

  1. yapf allows dicts (and lists etc.) to flow onto multiple lines if you put a comma after the final element.

    Example:

    This
    a = [
        1,
        2,
        3
    ]

    becomes:
    a = [1, 2, 3]

    …but this will remain as it is
    a = [
        1,
        2,
        3,
    ]

    1. This actually makes a lot of sense. At least IMHO. Without the trailing comma, you tell the formatter/future you/others that you do not expect this code to change. In this case it is only reasonable to fold it onto one line if it is not too long. But if you do put a comma after the last element, you tell all of them that you expect other elements to be appended.

      I tried this lovely trick with Black and guess what: it also leaves the tree-like dict alone and doesn’t reformat it.

  2. Black also lets you keep dictionaries laid out in a “dictionary way” if you put a comma after the last element.

  3. I’ve never understood the desperate need for strict formatting conventions. Formatting helps readability but if you are struggling to read your code because of formatting issues then that means you have far, far bigger issues to worry about. After 40 years of reading code in just about any popular language and format you can think of, I’ve never, ever found formatting a limiting factor in my comprehension of code. Variable naming, inconsistent ordering of function parameters, lack of useful comments, too many comments, too long files, too short files, all of these things yes. Even McCabe’s complexity metrics can help. But formatting? Not really once you get beyond a few basic rules that Python mainly enforces anyway!

    1. I see two main reasons to use an auto-formatter.
      1. Readability. I agree with you though, I don’t think this is a huge deal.
      2. Stop nitpicking and invest the time saved in what matters. Some people can’t help but reformat the files they are working on to a style they prefer. Yes, you can have discussions to tell them to stop, but if you have a tool you don’t even have to have that discussion. And then, sometimes you just do your work, open a pull request, and reviewers tell you that you did not respect the style of the file you are working on, or the style of the project or team. That is on top of contributors who can be sloppy for real. Once more, you can have discussions to clear that matter up, but having a tool eliminates them entirely so your team, again, can focus more on what matters.

      1. I would say I agree more with Alan. Most of the time the problem is hiding somewhere else, and in your case it is definitely the sloppy person who can’t keep things in the team’s order. If that person can’t manage such a small task as not reformatting the whole project, or staying in a certain box for the sake of the workflow, how can you trust such a person not to cause more severe issues in parts which actually matter? It’s great to have consistent formatting for the entire team, but readability and a clear code base mostly come from other places.

    2. Well said, and I tend to agree, @Alan. I have not used these formatters though, so I have no experience of how much they reduce the time a team wastes nitpicking over formatting. I can imagine some teams would benefit from having the formatter essentially END the discussion so that everyone can focus on what matters more, which is all the things you said. Thank you for this blog post, OP @Frank.
