`homelette.organization`

The homelette.organization submodule contains classes for organizing workflows.

Task is an object orchestrating model generation and evaluation.

Model is an object used for storing information about generated models.

Tutorials

For an introduction to homelette’s workflow, Tutorial 1 is useful. Assembling custom pipelines is discussed in Tutorial 7.

Classes

The following classes are part of this submodule:

Task
Model

class homelette.organization.Task(task_name: str, target: str, alignment: Type[Alignment], task_directory: str = None, overwrite: bool = False)

Class for directing modelling and evaluation.

It is designed for the modelling of one target sequence from one or multiple templates.

If an already existing folder with models is specified, the Task object will load those models in automatically. In this case, it can also be used exclusively for evaluation purposes.

Parameters:

task_name (str) – The name of the task
target (str) – The identifier of the protein to model
alignment (Alignment) – The alignment object that will be used for modelling
task_directory (str, optional) – The directory that will be used for this modelling task (default is creating a new one based on the task_name)
overwrite (bool, optional) – Determines what happens if a directory already exists for the given task_name or task_directory: the directory and all its contents are overwritten (True), or the models it contains are imported (False) (default is False)

Variables:

task_name (str) – The name of the task
task_directory (str) – The directory that will be used for this modelling task (default is to use the task_name)
target (str) – The identifier of the protein to model
alignment (Alignment) – The alignment object that will be used for modelling
models (list) – List of models generated or imported by this task
routines (list) – List of modelling routines executed by this task

Return type:

None
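The two outcomes of the overwrite parameter can be illustrated with a minimal standard-library sketch. This is not homelette's implementation: the helper name prepare_task_directory is hypothetical, and the assumption that models sit as .pdb files directly inside the task directory is made only for illustration.

```python
import shutil
import tempfile
from pathlib import Path


def prepare_task_directory(task_directory: str, overwrite: bool = False) -> list:
    """Return the names of PDB files present after applying `overwrite`.

    Hypothetical helper that only mirrors the documented behaviour.
    """
    path = Path(task_directory)
    if path.exists() and overwrite:
        # overwrite=True: discard the directory and everything in it
        shutil.rmtree(path)
    # Create the directory if it is missing (or was just removed)
    path.mkdir(parents=True, exist_ok=True)
    # overwrite=False: models that are already present are picked up
    return sorted(p.name for p in path.glob("*.pdb"))


with tempfile.TemporaryDirectory() as tmp:
    task_dir = Path(tmp) / "example_task"
    task_dir.mkdir()
    (task_dir / "model_1.pdb").write_text("END\n")

    kept = prepare_task_directory(str(task_dir), overwrite=False)
    wiped = prepare_task_directory(str(task_dir), overwrite=True)
```

With overwrite=False the existing file is found and would be available for evaluation only; with overwrite=True the directory is emptied first.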

execute_routine(tag: str, routine: Type[routines.Routine], templates: Iterable, template_location: str = '.', **kwargs) → None

Generates homology models using a specified modelling routine.

Parameters:

tag (str) – The identifier associated with this combination of routine and template(s). It has to be unique among all routines executed by the same Task object
routine (Routine) – The routine object used to generate the models
templates (list) – The iterable containing the identifier(s) of the template(s) used for model generation
template_location (str, optional) – The location of the template PDB files. They should be named according to their identifiers in the alignment, e.g. for a sequence named “1WXN” to be used as a template, a PDB file named “1WXN.pdb” is expected in the specified template location (default is the current working directory)
**kwargs – Named parameters passed directly on to the Routine object when the modelling is performed. Check the documentation of the Routine object you intend to use to make sure it accepts the parameters you pass on

Return type:

None
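The file naming convention for templates lends itself to a quick check before a routine is run. The sketch below is a stand-alone illustration and not part of homelette: the helper missing_template_files and the identifier "TEMPLATE_B" are hypothetical, and only the "<identifier>.pdb" convention is taken from the description above.

```python
import os
import tempfile


def missing_template_files(templates, template_location="."):
    """List the expected '<identifier>.pdb' paths that do not exist."""
    expected = [
        os.path.join(template_location, f"{name}.pdb") for name in templates
    ]
    return [path for path in expected if not os.path.isfile(path)]


with tempfile.TemporaryDirectory() as tmp:
    # Only one of the two templates has a PDB file in the template location
    open(os.path.join(tmp, "1WXN.pdb"), "w").close()
    missing = missing_template_files(["1WXN", "TEMPLATE_B"], template_location=tmp)
    missing_names = [os.path.basename(path) for path in missing]
```

An empty result means every identifier passed as templates has a matching file in template_location.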

evaluate_models(*args: Type[evaluation.Evaluation], n_threads: int = 1) → None

Evaluates models using one or multiple evaluation metrics.

Parameters:

*args (Evaluation) – Evaluation objects that will be applied to the models
n_threads (int, optional) – Number of threads used for model evaluation (default is 1, which deactivates parallelization)

Return type:

None
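The role of n_threads can be pictured with a small sketch in which every evaluation is applied to every model, serially when n_threads is 1 and through a worker pool otherwise. Everything here is an assumption made for illustration: models are plain dictionaries rather than Model objects, evaluations are plain callables, and the use of ThreadPoolExecutor says nothing about how homelette parallelizes internally.

```python
from concurrent.futures import ThreadPoolExecutor


def evaluate_all(models, evaluations, n_threads=1):
    """Apply every evaluation callable to every model (hypothetical helper)."""

    def evaluate_one(model):
        for evaluation in evaluations:
            evaluation(model)

    if n_threads == 1:
        # A single thread means no parallelization at all
        for model in models:
            evaluate_one(model)
    else:
        with ThreadPoolExecutor(max_workers=n_threads) as pool:
            # list() waits for completion and re-raises worker exceptions
            list(pool.map(evaluate_one, models))


def name_length(model):
    """Toy metric: store the length of the file name in the model's info."""
    model["info"]["name_length"] = len(model["model_file"])


models = [
    {"model_file": "a.pdb", "info": {}},
    {"model_file": "model_b.pdb", "info": {}},
]
evaluate_all(models, [name_length], n_threads=2)
```

Each evaluation writes its result into the info dictionary of the model it was applied to, which mirrors the purpose of the info attribute described for Model.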

get_evaluation() → pandas.DataFrame

Returns the evaluation for all models as a pandas DataFrame.

Returns: DataFrame containing all model evaluations
Return type: pd.DataFrame
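The four Task calls documented above are typically chained. The following sketch uses only the signatures listed on this page; the wrapper function run_modelling_task, the task name "example_task" and the tag "example_tag" are placeholders, and the alignment, routine and evaluation objects are expected to be supplied by the caller.

```python
def run_modelling_task(target, alignment, routine, templates, evaluations,
                       template_location="."):
    """Chain the documented Task calls and return the evaluation table."""
    # Imported lazily so the sketch can be loaded without homelette installed
    from homelette.organization import Task

    task = Task(
        task_name="example_task",  # placeholder name
        target=target,
        alignment=alignment,
        overwrite=False,  # import models if the directory already exists
    )
    task.execute_routine(
        tag="example_tag",  # placeholder; must be unique within this task
        routine=routine,
        templates=templates,
        template_location=template_location,
    )
    task.evaluate_models(*evaluations, n_threads=1)
    return task.get_evaluation()
```

Wrapping the calls in a function keeps the identifiers that depend on a concrete alignment out of the sketch; Tutorial 1 shows the same sequence of steps with real data.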

class homelette.organization.Model(model_file: str, tag: str, routine: str)

Interface used to interact with created protein structure models.

Parameters:

model_file (str) – The file location of the PDB file for this model
tag (str) – The tag that was used when generating this model (see Task.execute_routine for more details)
routine (str) – The name of the routine that was used to generate this model

Variables:

model_file (str) – The file location of the PDB file for this model
tag (str) – The tag that was used when generating this model (see Task.execute_routine for more details)
routine (str) – The name of the routine that was used to generate this model
info (dict) – Dictionary that can be used to store metadata about the model (e.g. the results of evaluation metrics)

Return type:

None

parse_pdb() → pandas.DataFrame

Parses the ATOM and HETATM records of the PDB file into a pandas DataFrame. This is useful for giving evaluation methods access to data from the PDB file.

Return type: pd.DataFrame

Notes

Information is extracted according to the PDB file specification (version 3.30) and columns are named accordingly. See https://www.wwpdb.org/documentation/file-format for more information.
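As an illustration of what extraction according to the specification involves, the sketch below reads ATOM and HETATM lines by their fixed column positions. It is a stand-alone example and not homelette's code: the function parse_atom_records is hypothetical, it returns a list of dictionaries instead of a DataFrame to stay within the standard library, and the field names follow the specification but may differ from the column names homelette uses.

```python
def parse_atom_records(pdb_text):
    """Extract ATOM/HETATM fields using the fixed columns of the PDB format."""
    records = []
    for line in pdb_text.splitlines():
        if line[0:6].strip() not in ("ATOM", "HETATM"):
            continue
        line = line.ljust(80)  # tolerate lines with trailing blanks removed
        records.append({
            "record": line[0:6].strip(),     # columns 1-6
            "serial": int(line[6:11]),       # columns 7-11
            "name": line[12:16].strip(),     # columns 13-16
            "altLoc": line[16].strip(),      # column 17
            "resName": line[17:20].strip(),  # columns 18-20
            "chainID": line[21].strip(),     # column 22
            "resSeq": int(line[22:26]),      # columns 23-26
            "iCode": line[26].strip(),       # column 27
            "x": float(line[30:38]),         # columns 31-38
            "y": float(line[38:46]),         # columns 39-46
            "z": float(line[46:54]),         # columns 47-54
            "occupancy": float(line[54:60]),   # columns 55-60
            "tempFactor": float(line[60:66]),  # columns 61-66
            "element": line[76:78].strip(),  # columns 77-78
            "charge": line[78:80].strip(),   # columns 79-80
        })
    return records


EXAMPLE_PDB = "\n".join([
    "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67           N",
    "ATOM      2  CA  MET A   1      26.266  25.413   2.842  1.00 10.38           C",
    "HETATM    3  O   HOH A 101      10.000  11.500  12.250  1.00 20.00           O",
    "END",
])

atoms = parse_atom_records(EXAMPLE_PDB)
```

A list of dictionaries like this can be turned into a table with pandas.DataFrame(atoms) if pandas is available.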

get_sequence() → str

Retrieve the 1-letter amino acid sequence of the PDB file associated with the Model object.

Returns: Amino acid sequence
Return type: str
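Deriving a 1-letter sequence from a structure amounts to visiting each residue once and translating its 3-letter name. The sketch below shows one way to do this from already parsed atom records; it is not homelette's implementation. The function sequence_from_records is hypothetical, the input is a list of dictionaries with the specification's field names, and mapping unknown residue names to "X" is an assumption.

```python
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def sequence_from_records(records):
    """Build a 1-letter sequence, counting each residue only once."""
    seen = set()
    letters = []
    for record in records:
        # A residue is identified by chain, residue number and insertion code
        key = (record["chainID"], record["resSeq"], record.get("iCode", ""))
        if key in seen:
            continue
        seen.add(key)
        # Assumption: residues outside the 20 standard ones become "X"
        letters.append(THREE_TO_ONE.get(record["resName"], "X"))
    return "".join(letters)


example_records = [
    {"chainID": "A", "resSeq": 1, "resName": "MET", "name": "N"},
    {"chainID": "A", "resSeq": 1, "resName": "MET", "name": "CA"},
    {"chainID": "A", "resSeq": 2, "resName": "GLY", "name": "N"},
    {"chainID": "A", "resSeq": 2, "resName": "GLY", "name": "CA"},
    {"chainID": "A", "resSeq": 3, "resName": "LYS", "name": "CA"},
]
sequence = sequence_from_records(example_records)
```

Several atoms of the same residue contribute a single letter, so the five records above describe three residues.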

rename(new_name: str) → None

Rename the PDB file associated with the Model object.

Parameters: new_name (str) – New name of the PDB file
Return type: None
