Facebook Hydra is a new open-source framework that aims to speed up the creation of Python applications by simplifying the implementation of common functionality such as command-line argument handling, configuration management, and logging.
Facebook developed Hydra to accelerate development of several research projects where the ability to cope with changing requirements was key.
Hydra offers an innovative approach to composing an application’s configuration, allowing changes to a composition through configuration files as well as from the command line. This addresses challenges that can arise when modifying a config, such as having to maintain many slightly different copies of a configuration, or adding custom logic to override individual configuration values.
One of its main goals is reducing the boilerplate code you usually need to write to handle command-line arguments, file-based configuration, logging, and so on. Hydra also provides a pluggable architecture intended to enable future extensions, such as seamlessly running your code on a cloud provider.
One of the mechanisms used to reduce boilerplate is a convention for how you specify your app's configuration. In particular, a configuration is composed from multiple sources that form a hierarchy and that can be overridden from the command line. For example, if you have a config.yaml configuration file containing a number of configuration options for your program, you can use it through Hydra seamlessly:
import hydra

@hydra.main(config_path='config.yaml')
def my_app(cfg):
    # use cfg configuration options...
    print(cfg)
If on a specific run you want to override one or more configuration values, you can provide the new values on the command line:
$ python my_app.py db.user=root db.pass=1234
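To make the override syntax concrete, here is a minimal Python sketch of how dotted key=value arguments could be folded into a nested configuration. It is not Hydra's implementation: the apply_overrides helper and the base values are hypothetical, and values are kept as plain strings rather than being type-converted.

```python
import copy
from typing import Any, Dict, List


def apply_overrides(cfg: Dict[str, Any], overrides: List[str]) -> Dict[str, Any]:
    """Return a copy of cfg with dotted `key=value` overrides applied.

    Hypothetical helper for illustration only; not part of Hydra.
    """
    result = copy.deepcopy(cfg)
    for item in overrides:
        dotted_key, _, raw_value = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = result
        # Walk (or create) the intermediate levels named by the dotted key.
        for part in parents:
            node = node.setdefault(part, {})
        # Values stay as strings in this sketch.
        node[leaf] = raw_value
    return result


# Hypothetical contents of config.yaml, written as a plain dict.
base = {"db": {"driver": "mysql", "user": "app", "pass": "secret"}}
cfg = apply_overrides(base, ["db.user=root", "db.pass=1234"])
print(cfg)
```

The base dictionary is left untouched, so the file-based defaults and the per-run values remain clearly separated.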
Hydra also makes it easy to handle alternative groups of configuration options. For example, you could have two configuration files, one to, say, connect to a MySQL database, and the other to a PostgreSQL database. On each run of your program, you can choose which configuration file to use by specifying it on the command line like this:
$ python my_app.py db=postgresql
$ python my_app.py db=mysql db.timeout=20
Configuration files are stored in a single directory and organized hierarchically using the filesystem. Hydra mirrors the filesystem hierarchy through the cfg map, which is passed to your app. This lets you organize your configuration options in independent spaces and compose them according to your needs. For example, in addition to configuration files for PostgreSQL and MySQL, you could have configuration files that describe a number of database schemas you would like to work with, then decide at launch time which combination of database and schema to use on that particular run:
$ python my_app.py db=postgresql schema=school
$ python my_app.py db=mysql schema=home
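The composition described above can be sketched in plain Python. This is an illustration under stated assumptions, not Hydra's code: the CONFIG_GROUPS dictionary stands in for YAML files on disk (for instance db/mysql.yaml or schema/school.yaml), all option names and values in it are invented, and the compose function is a hypothetical helper.

```python
import copy
from typing import Any, Dict, List

# Hypothetical config groups. With Hydra these would be YAML files in the
# configuration directory; the options below are made up for the example.
CONFIG_GROUPS: Dict[str, Dict[str, Dict[str, Any]]] = {
    "db": {
        "mysql": {"driver": "mysql", "port": 3306, "timeout": 10},
        "postgresql": {"driver": "postgresql", "port": 5432, "timeout": 10},
    },
    "schema": {
        "school": {"tables": ["students", "teachers"]},
        "home": {"tables": ["rooms", "devices"]},
    },
}


def compose(args: List[str]) -> Dict[str, Any]:
    """Build a config from group selections, then apply dotted overrides."""
    pairs = [arg.partition("=")[::2] for arg in args]  # (key, value) tuples
    cfg: Dict[str, Any] = {}
    # First pass: group selections such as db=mysql pick one alternative.
    for key, value in pairs:
        if key in CONFIG_GROUPS:
            cfg[key] = copy.deepcopy(CONFIG_GROUPS[key][value])
    # Second pass: dotted overrides such as db.timeout=20 (kept as strings).
    for key, value in pairs:
        if key not in CONFIG_GROUPS:
            *parents, leaf = key.split(".")
            node = cfg
            for part in parents:
                node = node.setdefault(part, {})
            node[leaf] = value
    return cfg


cfg = compose(["db=mysql", "schema=home", "db.timeout=20"])
print(cfg["db"]["driver"], cfg["schema"]["tables"], cfg["db"]["timeout"])
```

Selections are applied before overrides so that an individual value given on the command line always wins over the value from the chosen group, whatever the argument order.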
Conveniently, Hydra uses shell tab completion to guide you through the configurations and sub-configurations you can use on the command line, so you do not need to remember all allowed combinations. As an additional bonus, Hydra creates an output directory for each run of your program and stores there the configuration active for that run along with any output files. This is ideal when you want to run multiple experiments and keep track of the results so you can compare them at the end.
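The per-run output directory idea can be approximated with the standard library alone. The sketch below is an assumption-laden illustration, not Hydra's behavior: the start_run helper, the date/time directory naming, and the config.json file name are all chosen for the example, and Hydra's actual layout and file format may differ.

```python
import json
import tempfile
from datetime import datetime
from pathlib import Path


def start_run(cfg: dict, root: Path) -> Path:
    """Create a fresh directory for this run and save the active config in it.

    Hypothetical helper: directory naming and file format are assumptions.
    """
    now = datetime.now()
    run_dir = root / now.strftime("%Y-%m-%d") / now.strftime("%H-%M-%S-%f")
    run_dir.mkdir(parents=True, exist_ok=True)
    # Keep a copy of the configuration next to the run's other outputs.
    (run_dir / "config.json").write_text(json.dumps(cfg, indent=2))
    return run_dir


# A temporary directory keeps the example from leaving files behind.
with tempfile.TemporaryDirectory() as tmp:
    run_dir = start_run({"db": {"driver": "mysql"}}, Path(tmp))
    (run_dir / "result.txt").write_text("accuracy=0.9\n")  # made-up output
    print(sorted(p.name for p in run_dir.iterdir()))
```

Because each run's configuration sits beside its results, two experiments can later be compared by diffing their saved configurations.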
Last but not least, Hydra includes logging facilities that aim to reduce setup cost and are fully integrated with Hydra's configuration management:
import logging

import hydra

# A logger for this file
log = logging.getLogger(__name__)

@hydra.main()
def my_app(_cfg):
    log.info("Info level message")
    log.debug("Debug level message")

if __name__ == "__main__":
    my_app()
Indeed, you can set which log level to display and switch debug logging on and off at the file level, either from the command line or through configuration files:
$ python my_app.py hydra.verbose=[__main__,hydra]
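For readers unfamiliar with per-logger verbosity, the following sketch shows the underlying idea using only Python's standard logging module. It is not how Hydra wires things internally: the configure_logging helper and the myapp.db and myapp.web logger names are hypothetical.

```python
import logging
from typing import Iterable


def configure_logging(verbose: Iterable[str] = ()) -> None:
    """Show INFO and above by default, and DEBUG for the named loggers only.

    Hypothetical helper illustrating selective verbosity.
    """
    logging.basicConfig(format="[%(levelname)s] %(name)s: %(message)s")
    # Default threshold for every logger that has no level of its own.
    logging.getLogger().setLevel(logging.INFO)
    for name in verbose:
        # Loggers listed here also let DEBUG records through.
        logging.getLogger(name).setLevel(logging.DEBUG)


configure_logging(verbose=["myapp.db"])

db_log = logging.getLogger("myapp.db")
web_log = logging.getLogger("myapp.web")

db_log.debug("shown: myapp.db is in the verbose list")
web_log.debug("hidden: myapp.web inherits the INFO default")
web_log.info("shown: INFO passes the default threshold")
```

Since loggers obtained with logging.getLogger(__name__) are named after their module, listing module names is enough to get file-level control over debug output.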
As mentioned, Facebook plans to grow Hydra's feature set by leveraging its pluggable architecture. Hydra is available on GitHub under the MIT license.