Last week, my team open-sourced the configuration library we have been refining for the software components we work on.
Opset – A library for simplifying the configuration of Python applications at all stages of deployment. [github] [pypi]
Although the version you can see online is version 1, in reality it’s closer to version 3. Unsurprisingly, we went through many iteration cycles before we came up with something that pleased all the members of the team.
The library's documentation is thorough, but here is a quick overview.
It works with fixed layers of configuration. Which layers apply depends on whether you are running or testing the application. Only the first layer is required.
The first layer is the “default.yml” file. It is the skeleton of the configuration and is meant to be committed to version control. It should contain every key the application requires, and it is perfectly fine to set sane default values, as long as they are not secrets. Any key/value defined in the other layers is ignored if there is no corresponding key in “default.yml”.
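The filtering rule can be sketched in plain Python. This is my own illustration of the behavior described above, not Opset's actual implementation; the function name is hypothetical.

```python
# Hypothetical sketch of the layering rule: keys that do not exist in
# default.yml are dropped when an override layer is merged on top of it.
def merge_layer(defaults: dict, overrides: dict) -> dict:
    """Recursively merge overrides into defaults, ignoring unknown keys."""
    merged = {}
    for key, default_value in defaults.items():
        override_value = overrides.get(key)
        if isinstance(default_value, dict) and isinstance(override_value, dict):
            merged[key] = merge_layer(default_value, override_value)
        elif key in overrides:
            merged[key] = override_value
        else:
            merged[key] = default_value
    return merged

defaults = {"db": {"host": "localhost", "password": None}, "debug": False}
local = {"db": {"password": "hunter2", "unknown_key": 1}, "typo": True}

config = merge_layer(defaults, local)
# "unknown_key" and "typo" are silently ignored because default.yml
# does not define them; only "db.password" is overridden.
```

This is what makes “default.yml” an effective skeleton: a typo in an override file cannot introduce a stray key.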
The second layer is made of two files, “local.yml” and “unit_test.yml”, loaded in regular mode and in test mode respectively. Their values override whatever exists in “default.yml”. This is usually where you put your secrets. These files are not meant to be committed to version control; this layer is for working on the project locally.
The third layer is made of environment variables. Environment variables that follow a certain naming pattern override values from the first two layers. In regular mode, this is the last layer. It is useful both during local development and in production/staging environments where variables are injected into, say, a Docker container.
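The idea of mapping environment variables onto configuration keys can be sketched as follows. The `MYAPP_<SECTION>_<KEY>` convention here is an assumption for illustration, as is the helper name; see Opset's documentation for the exact pattern the library uses.

```python
# Illustrative sketch only: assumes a MYAPP_<SECTION>_<KEY> naming
# convention and a single level of sections under each top-level key.
def env_overrides(app_name: str, environ: dict) -> dict:
    prefix = app_name.upper() + "_"
    overrides: dict = {}
    for name, value in environ.items():
        if not name.startswith(prefix):
            continue
        section, _, key = name[len(prefix):].partition("_")
        overrides.setdefault(section.lower(), {})[key.lower()] = value
    return overrides

fake_env = {"MYAPP_DB_HOST": "db.staging.internal", "PATH": "/usr/bin"}
overrides = env_overrides("myapp", fake_env)
# → {"db": {"host": "db.staging.internal"}}
```

This is what makes the layer convenient for containers: the orchestrator injects plain environment variables, and they land on the right configuration keys without any file on disk.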
In test mode, there is a fourth layer called the “config_values”. For projects using pytest, these values are usually set in the top-level conftest.py file. They override all of the other layers and are meant to be forced values for the tests. They act as a safeguard against layers 2 and 3: a developer might have configured their local environment to use a staging database, so you force the connection string in the tests to something else to make sure the tests don't reset that database.
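Putting the four layers together, the precedence in test mode can be sketched like this. The helper and the values are illustrative, not Opset's API.

```python
# Sketch of the precedence described above: later layers win,
# and keys unknown to the defaults are ignored (as in layer 1).
def resolve(defaults: dict, unit_test: dict, env: dict, config_values: dict) -> dict:
    config = dict(defaults)
    for layer in (unit_test, env, config_values):
        for key, value in layer.items():
            if key in defaults:
                config[key] = value
    return config

config = resolve(
    defaults={"db_url": "sqlite://", "debug": False},
    unit_test={"db_url": "postgres://staging/app"},  # a dev's local override
    env={},
    config_values={"db_url": "sqlite://:memory:"},   # forced by the test suite
)
# config_values wins, shielding the staging database from the test run
```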
This library has worked well for us for a while now, which is why we felt ready to open-source it. Thanks to the earlier iteration cycles I mentioned, the library is quite mature, and I would recommend it without hesitation.
Extra: Configuration vs. Parameters
While the library was being presented internally at Element AI, some people asked whether it could support an arbitrary number of layers instead of the fixed set it provides. Their use case is launching many experiments, changing some configuration values at each launch.
My opinion: This is the difference between configurations and parameters.
A configuration is something that doesn’t change often from one run to another.
A parameter is something that changes quite often, like function arguments or command-line arguments.
So, in my opinion, this is not quite the right tool for their purpose. What could be done, though, is to use it to provide the default values and combine it with command-line arguments to override whatever shouldn't stay at the default.
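That combination can be sketched with the standard library's argparse: configuration supplies the defaults, and the command line carries the per-run parameters. Here `config` is a plain dict standing in for values loaded from the configuration layers.

```python
import argparse

# Sketch: configuration provides defaults, command-line arguments
# act as per-experiment parameters. `config` stands in for the
# values loaded from the configuration layers.
config = {"learning_rate": 0.001, "epochs": 10}

parser = argparse.ArgumentParser()
parser.add_argument("--learning-rate", type=float, default=config["learning_rate"])
parser.add_argument("--epochs", type=int, default=config["epochs"])

# One experiment overrides only the learning rate; epochs falls
# back to the configured default.
args = parser.parse_args(["--learning-rate", "0.01"])
# args.learning_rate == 0.01, args.epochs == 10
```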