Technical Notes
Detailed documentation generated from the docstrings is available under API/.
This page gives technical notes beyond API/: how the different components of model-ensembler
work together, plus accompanying notes to aid developers. It is not intended to be read from start to finish.
model_ensembler
batcher
Contains execution core code.
Contains the BatchExecutor class, which monitors all runs independently of one another and executes batches based on the configuration. It can control the submission rate into SLURM.
Note:
- It relies on the host processor staying alive.
- It relies on workflow picking itself up.
- It is not aware of state.
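As an illustration of what controlling the submission rate can look like, the sketch below caps the number of concurrently active runs with an asyncio.Semaphore. It is not the BatchExecutor implementation: the names run_batch, execute, peak_active and max_active are hypothetical, and the sleep stands in for submitting a job and waiting on it.

```python
import asyncio


async def run_batch(name, limiter, log):
    # Hypothetical stand-in for one run: the semaphore caps how many are active.
    async with limiter:
        log.append(("start", name))
        await asyncio.sleep(0.01)  # placeholder for submitting and waiting on a job
        log.append(("end", name))


async def execute(run_names, max_active=2):
    # max_active is an assumed setting, not a documented configuration key.
    limiter = asyncio.Semaphore(max_active)
    log = []
    await asyncio.gather(*(run_batch(n, limiter, log) for n in run_names))
    return log


def peak_active(log):
    # Replay the event log to find the largest number of simultaneous runs.
    active = peak = 0
    for event, _ in log:
        active += 1 if event == "start" else -1
        peak = max(peak, active)
    return peak


log = asyncio.run(execute(["run-a", "run-b", "run-c", "run-d"], max_active=2))
```

Because everything here lives in the memory of one process, the sketch shares the limitation noted above: if the host process dies, the record of what was running is lost.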
cli
cli.py provides the main CLI entrypoint (the model_ensemble command), and parses various arguments to control it.
model_ensemble calls the BatchExecutor, which is fed the configuration and backend options through the
parsed arguments.
config
model-ensemble.json defines the schema which a configuration should follow.
config.py will use this to validate the configuration file it is given.
config.py:
- Contains the YAMLConfig class, which validates the YAML file against the model-ensemble.json schema.
- Contains Task and TaskArrayMixin. Task obtains tasks from TaskSpec in YAMLConfig and stores Tasks as an array. TaskArrayMixin obtains tasks from batch object members.
- Contains the EnsembleConfig class, which represents an ensemble (collects YAMLConfig and TaskArrayMixin).
- Contains the Batch class, which represents a batch (collects BatchSpec and TaskArrayMixin).
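The mixin pattern described above can be sketched roughly as follows. This is a simplified illustration, not the code in config.py: the Task fields, the task_fields tuple, the load_tasks method and the pre_run/post_run keys are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    # Hypothetical shape of a task: a name plus keyword arguments.
    name: str
    args: dict = field(default_factory=dict)


class TaskArrayMixin:
    # Assumed names of the members that hold task lists.
    task_fields = ("pre_run", "post_run")

    def load_tasks(self, spec):
        # Turn each list of plain dicts in the spec into a list of Task objects.
        for key in self.task_fields:
            entries = spec.get(key, [])
            setattr(self, key, [Task(e["name"], e.get("args", {})) for e in entries])


class Batch(TaskArrayMixin):
    # A batch combines its own spec values with the task lists from the mixin.
    def __init__(self, spec):
        self.name = spec["name"]
        self.load_tasks(spec)


batch = Batch({"name": "batch1", "pre_run": [{"name": "check", "args": {"limit": 10}}]})
```

The point of the mixin is that the ensemble and the batch can share the same task-loading logic while keeping their own spec handling separate.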
exceptions
Contains a TemplatingError exception. Other exceptions are handled in tasks/exceptions.py.
runners
Core execution functions for the batcher, for example functionality to asynchronously run a list of tasks.
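A minimal sketch of running a list of tasks asynchronously, using asyncio subprocesses. The function names run_command and run_task_list are hypothetical, and the policy of running tasks in order and stopping at the first non-zero exit code is an assumption made for the example, not a statement about how runners behaves.

```python
import asyncio
import sys


async def run_command(args):
    # Launch the command without blocking the event loop and capture its output.
    proc = await asyncio.create_subprocess_exec(
        *args,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    stdout, _ = await proc.communicate()
    return proc.returncode, stdout.decode().strip()


async def run_task_list(commands):
    # Assumed policy: tasks run in order and the list stops at the first failure.
    results = []
    for args in commands:
        code, out = await run_command(args)
        results.append((code, out))
        if code != 0:
            break
    return results


commands = [
    [sys.executable, "-c", "print('prepare')"],
    [sys.executable, "-c", "import sys; sys.exit(3)"],
    [sys.executable, "-c", "print('never reached')"],
]
results = asyncio.run(run_task_list(commands))
```

While one task list awaits its subprocess, the event loop is free to progress other runs, which is what makes this style suitable for monitoring many runs from a single process.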
templates
Contains the functionality to render batch templates and prepare directories for their transfer to run directories.
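To make the idea concrete, the sketch below renders template files into a run directory. It uses string.Template from the standard library purely as a stand-in; the templating engine, the .tmpl suffix, the render_templates function and the context keys are assumptions for illustration and may differ from what templates actually does.

```python
import tempfile
from pathlib import Path
from string import Template


def render_templates(template_dir, run_dir, context):
    # Render every *.tmpl file into run_dir, dropping the .tmpl suffix.
    run_dir.mkdir(parents=True, exist_ok=True)
    rendered = []
    for src in sorted(template_dir.glob("*.tmpl")):
        text = Template(src.read_text()).substitute(context)
        dest = run_dir / src.stem
        dest.write_text(text)
        rendered.append(dest)
    return rendered


with tempfile.TemporaryDirectory() as tmp:
    template_dir = Path(tmp) / "templates"
    template_dir.mkdir()
    (template_dir / "job.sh.tmpl").write_text(
        "#!/bin/bash\necho run=$run_name nodes=$nodes\n"
    )
    outputs = render_templates(
        template_dir,
        Path(tmp) / "runs" / "run-a",
        {"run_name": "run-a", "nodes": 4},
    )
    rendered_name = outputs[0].name
    rendered_text = outputs[0].read_text()
```

Template.substitute raises KeyError when the context is missing a placeholder, which is the kind of failure a dedicated templating exception would typically wrap.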
utils
Contains general-purpose functionality, such as argument handling and logging.
tasks
Submodule which contains generic tasks, utilities and exceptions:
- exceptions.py: contains exceptions which relate to the tasks (e.g. *ProcessingException for processing failures).
- hpc.py: contains HPC-related task methods, such as checking the number of SLURM jobs.
- sys.py: contains all methods for system-related tasks, such as rsyncing directory contents.
- utils.py: contains general implementation and functionality related to tasks.
cluster
Submodule which contains backend-specific functionality, currently SLURM and local (slurm.py, dummy.py).
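One way to picture backend-specific functionality is a lookup from backend name to the function that builds a submission command. The sketch below is hypothetical: the registry, the function names and the choice of bash for the local backend are assumptions, and only sbatch is taken from SLURM itself.

```python
def slurm_command(script):
    # sbatch is the standard SLURM submission command.
    return ["sbatch", script]


def dummy_command(script):
    # Assumed local behaviour: run the script directly with bash.
    return ["bash", script]


# Hypothetical registry keyed by backend name.
BACKENDS = {"slurm": slurm_command, "dummy": dummy_command}


def build_submit_command(backend, script):
    try:
        builder = BACKENDS[backend]
    except KeyError:
        raise ValueError(f"Unknown backend: {backend!r}") from None
    return builder(script)


slurm_cmd = build_submit_command("slurm", "job.sh")
local_cmd = build_submit_command("dummy", "job.sh")
```

Keeping the backend choice behind a single lookup means a new backend only needs one new function and one new registry entry.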