Tutorial 8: Automatic Alignment Generation
[1]:
import homelette as hm
Introduction
Welcome to the eighth tutorial for homelette, in which we will explore homelette’s tools for automated alignment generation.
The alignment is a central step in homology modelling, and its quality strongly influences the final models. In general, how difficult it is to create a solid sequence alignment depends mainly on how closely related the target and template are. If they share a high sequence identity, the alignment is easy to construct and the modelling process will most likely be successful.
Note
As a rule of thumb, targets with more than 50-60% sequence identity to a template are considered straightforward to model, while anything below 30% sequence identity is very challenging.
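To make the rule of thumb concrete, here is a minimal sketch of how a percent identity could be computed from two rows of an alignment. The function name `percent_identity` and the choice to count only columns where both sequences have a residue are assumptions made for this example; homelette and other tools may normalize differently (for instance by target length), so the numbers need not match the tables shown later in this tutorial.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where both sequences have a residue.

    Hypothetical helper for illustration; not part of homelette.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # keep only columns without a gap in either sequence
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


# toy rows made up for this example
print(percent_identity("GTVKVYLP-NK", "NTIRVFLPQNK"))  # 60.0
```

With this definition, the toy pair shares 6 identical residues over 10 gap-free columns, which would put it at the upper end of the "straightforward" range.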
homelette has methods that can automatically generate an alignment given a query sequence. However, these methods hide some of the complexity of generating good alignments. Use them at your own discretion, especially for target sequences with low sequence identity to any template.
Note
Be careful with automatically generated alignments if your protein of interest has no closely related templates.
After these words of caution, let’s look at the implemented methods:
alignment.AlignmentGenerator_pdb: Query the PDB, then align the identified sequences locally with Clustal Omega.
alignment.AlignmentGenerator_hhblits: Local database search against the PDB70 database.
alignment.AlignmentGenerator_from_aln: For when you already have an alignment ready, but want to make use of homelette’s processing of templates and alignments.
Method 1: Querying RCSB and Realignment of template sequences with Clustal Omega
This class performs a three-step process:
Template identification: the RCSB is queried with the target sequence (internally, RCSB uses MMseqs2) [1, 2] (get_suggestion).
Alignment: the sequences of the identified templates are aligned locally using Clustal Omega [3, 4] (get_suggestion).
Template processing: the template structures are downloaded and processed together with the alignment (get_pdbs).
Afterwards, the templates should be ready for homology modelling.
For a practical demonstration, let’s find some templates for ARAF:
[2]:
gen = hm.alignment.AlignmentGenerator_pdb.from_fasta('data/alignments/ARAF.fa')
# gen = hm.alignment.AlignmentGenerator_pdb(
# sequence = 'GTVKVYLPNKQRTVVTVRDGMSVYDSLDKALKVRGLNQDCCVVYRLIKGRKTVTAWDTAIAPLDGEELIVEVL',
# target = 'ARAF')
An AlignmentGenerator can be initialized in two ways: either with a sequence or from a FASTA file. Both ways are shown above.
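The two initialization routes carry the same information: a target name and a sequence. As a rough illustration of what reading a single-record FASTA file involves, here is a small stand-alone parser. It is a sketch under assumptions, not homelette's implementation of from_fasta: the function name `read_single_fasta` is made up, and the header `>ARAF` is assumed here because the actual contents of data/alignments/ARAF.fa are not shown in this tutorial.

```python
from typing import Tuple


def read_single_fasta(text: str) -> Tuple[str, str]:
    """Return (name, sequence) for the first record in FASTA-formatted text.

    Hypothetical helper for illustration; not part of homelette.
    """
    name = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                break  # a second record starts here; ignore the rest
            parts = line[1:].split()
            name = parts[0] if parts else ""
        elif name is not None:
            chunks.append(line)  # sequence lines may be wrapped
    if name is None:
        raise ValueError("no FASTA header found")
    return name, "".join(chunks)


# assumed file contents: header name plus the ARAF sequence from the cell above
example = """>ARAF
GTVKVYLPNKQRTVVTVRDGMSVYDSLDKALKVRGLNQ
DCCVVYRLIKGRKTVTAWDTAIAPLDGEELIVEVL
"""
name, sequence = read_single_fasta(example)
print(name, len(sequence))
```

The resulting pair corresponds to the target and sequence arguments of the commented-out constructor call.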
In the next step we use this sequence to generate an initial alignment:
[3]:
gen.get_suggestion()
Querying PDB...
Query successful, 16 found!
Retrieving sequences...
Sequences succefully retrieved!
Generating alignment...
Alignment generated!
As we can see from the output, we are querying the PDB and extracting potential templates. Then, an alignment is generated.
We can have a first look at the suggested templates as such:
[4]:
gen.show_suggestion()
[4]:
index | template | coverage | identity |
---|---|---|---|
0 | 1C1Y_2 | 100.0 | 60.27 |
1 | 1GUA_2 | 100.0 | 60.27 |
2 | 4G0N_2 | 100.0 | 60.27 |
3 | 4G3X_2 | 100.0 | 60.27 |
4 | 6VJJ_2 | 100.0 | 60.27 |
5 | 6XGU_2 | 100.0 | 60.27 |
6 | 6XGV_2 | 100.0 | 60.27 |
7 | 6XHA_2 | 100.0 | 60.27 |
8 | 6XHB_2 | 100.0 | 60.27 |
9 | 6XI7_2 | 100.0 | 60.27 |
10 | 7JHP_2 | 100.0 | 60.27 |
11 | 3KUC_2 | 100.0 | 58.90 |
12 | 3KUD_2 | 100.0 | 58.90 |
13 | 3NY5_1 | 100.0 | 58.90 |
14 | 6NTD_2 | 100.0 | 53.42 |
15 | 6NTC_2 | 100.0 | 52.05 |
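The coverage and identity columns in the table above can guide which templates to keep. The following sketch shortlists templates with plain Python on a few rows copied from that table. The thresholds and the function name `shortlist` are arbitrary assumptions chosen for illustration, and the snippet deliberately does not call any homelette method for removing sequences from the alignment.

```python
# (template, coverage, identity) rows copied from the table above
suggestions = [
    ("1C1Y_2", 100.0, 60.27),
    ("3KUC_2", 100.0, 58.90),
    ("6NTD_2", 100.0, 53.42),
    ("6NTC_2", 100.0, 52.05),
]


def shortlist(rows, min_identity=55.0, min_coverage=90.0):
    """Return template names that pass both (assumed) thresholds."""
    return [
        name
        for name, coverage, identity in rows
        if identity >= min_identity and coverage >= min_coverage
    ]


keep = shortlist(suggestions)
print(keep)  # ['1C1Y_2', '3KUC_2']
```

Whether such a cut-off is sensible depends on the project; with many near-identical templates, as here, it may matter more to pick structures of good quality than to maximize identity.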
[5]:
gen.alignment.print_clustal(70)
ARAF -------------GTVKVYLPNKQRTVVTVRDGMSVYDSLDKALKVRGLNQDCCVVYRLI---KGRKTVT
1C1Y_2 ------------SNTIRVFLPNKQRTVVNVRNGMSLHDCLMKALKVRGLQPECCAVFRLLHEHKGKKARL
1GUA_2 --------PSKTSNTIRVFLPNKQRTVVNVRNGMSLHDCLMKALKVRGLQPECCAVFRLLHEHKGKKARL
3KUC_2 --------PSKTSNTIRVFLPNKQRTVVRVRNGMSLHDCLMKKLKVRGLQPECCAVFRLLHEHKGKKARL
3KUD_2 --------PSKTSNTIRVFLPNKQRTVVNVRNGMSLHDCLMKKLKVRGLQPECCAVFRLLHEHKGKKARL
3NY5_1 MGHHHHHHSHMQKPIVRVFLPNKQRTVVPARCGVTVRDSLKKALMMRGLIPECCAVYRIQ---DGEKKPI
4G0N_2 -----------TSNTIRVFLPNKQRTVVNVRNGMSLHDCLMKALKVRGLQPECCAVFRLLHEHKGKKARL
4G3X_2 ------------SNTIRVFLPNKQRTVVNVRNGMSLHDCLMKALKVRGLQPECCAVFRLLHEHKGKKARL
6NTC_2 --------GAMDSNTIRVLLPNQEWTVVKVRNGMSLHDSLMKALKRHGLQPESSAVFRLLHEHKGKKARL
6NTD_2 --------GAMDSNTIRVLLPNHERTVVKVRNGMSLHDSLMKALKRHGLQPESSAVFRLLHEHKGKKARL
6VJJ_2 ---------SKTSNTIRVFLPNKQRTVVNVRNGMSLHDCLMKALKVRGLQPECCAVFRLLHEHKGKKARL
6XGU_2 ---------SKTSNTIRVFLPNKQRTVVNVRNGMSLHDCLMKALKVRGLQPECCAVFRLLHEHKGKKARL
6XGV_2 ---------SKTSNTIRVFLPNKQRTVVNVRNGMSLHDCLMKALKVRGLQPECCAVFRLLHEHKGKKARL
6XHA_2 ---------SKTSNTIRVFLPNKQRTVVNVRNGMSLHDCLMKALKVRGLQPECCAVFRLLHEHKGKKARL
6XHB_2 ---------SKTSNTIRVFLPNKQRTVVNVRNGMSLHDCLMKALKVRGLQPECCAVFRLLHEHKGKKARL
6XI7_2 ---------SKTSNTIRVFLPNKQRTVVNVRNGMSLHDCLMKALKVRGLQPECCAVFRLLHEHKGKKARL
7JHP_2 ------------SNTIRVFLPNKQRTVVNVRNGMSLHDCLMKALKVRGLQPECCAVFRLLHEHKGKKARL
ARAF AWDTAIAPLDGEELIVEVL---------------------------------------------------
1C1Y_2 DWNTDAASLIGEELQVDFL---------------------------------------------------
1GUA_2 DWNTDAASLIGEELQVDFL---------------------------------------------------
3KUC_2 DWNTDAASLIGEELQVDFL---------------------------------------------------
3KUD_2 DWNTDAASLIGEELQVDFL---------------------------------------------------
3NY5_1 GWDTDISWLTGEELHVEVLENVPLTTHNF-----------------------------------------
4G0N_2 DWNTDAASLIGEELQVDFL---------------------------------------------------
4G3X_2 DWNTDAASLIGEELQVDFL---------------------------------------------------
6NTC_2 DWNTDAASLIGEELQVDFL---------------------------------------------------
6NTD_2 DWNTDAASLIGEELQVDFL---------------------------------------------------
6VJJ_2 DWNTDAASLIGEELQVDFL---------------------------------------------------
6XGU_2 DWNTDAASLIGEELQVDFLDHVPLTTHNFARKTFLKLAFCDICQKFLLNGFRCQTCGYKFHEHCSTKVPT
6XGV_2 DWNTDAASLIGEELQVDFLDHVPLTTHNFARKTFLKLAFCDICQKFLLNGFRCQTCGYKFHEHCSTKVPT
6XHA_2 DWNTDAASLIGEELQVDFLDHVPLTTHNFARKTFLKLAFCDICQKFLLNGFRCQTCGYKFHEHCSTKVPT
6XHB_2 DWNTDAASLIGEELQVDFLDHVPLTTHNFARKTFLKLAFCDICQKFLLNGFRCQTCGYKFHEHCSTKVPT
6XI7_2 DWNTDAASLIGEELQVDFLDHVPLTTHNFARKTFLKLAFCDICQKFLLNGFRCQTCGYKFHEHCSTKVPT
7JHP_2 DWNTDAASLIGEELQVDFLDHVPLTTHNFARKTFLKLAFCDICQKFLLNGFRCQTCGYKFHEHCSTKVPT
ARAF ------
1C1Y_2 ------
1GUA_2 ------
3KUC_2 ------
3KUD_2 ------
3NY5_1 ------
4G0N_2 ------
4G3X_2 ------
6NTC_2 ------
6NTD_2 ------
6VJJ_2 ------
6XGU_2 MCVDWS
6XGV_2 MCVDWS
6XHA_2 MCVDWS
6XHB_2 MCVDWS
6XI7_2 MCVDWS
7JHP_2 MCVDW-
After potentially filtering out some sequences, we can proceed with the next step: downloading the structures for our templates, comparing the sequences of the templates with the residues present in the template structures, and making adjustments to both the structures and the alignment if necessary.
[6]:
gen.get_pdbs()
Guessing template naming format...
Template naming format guessed: polymer_entity!
Checking template dir...
Template dir found!
Processing templates:
1C1Y downloading from PDB...
1C1Y downloaded!
1C1Y_B: Chain extracted!
1C1Y_B: Alignment updated!
1C1Y_B: PDB processed!
1GUA downloading from PDB...
1GUA downloaded!
1GUA_B: Chain extracted!
1GUA_B: Alignment updated!
1GUA_B: PDB processed!
3KUC downloading from PDB...
3KUC downloaded!
3KUC_B: Chain extracted!
3KUC_B: Alignment updated!
3KUC_B: PDB processed!
3KUD downloading from PDB...
3KUD downloaded!
3KUD_B: Chain extracted!
3KUD_B: Alignment updated!
3KUD_B: PDB processed!
3NY5 downloading from PDB...
3NY5 downloaded!
3NY5_A: Chain extracted!
3NY5_A: Alignment updated!
3NY5_A: PDB processed!
3NY5_B: Chain extracted!
3NY5_B: Alignment updated!
3NY5_B: PDB processed!
3NY5_C: Chain extracted!
3NY5_C: Alignment updated!
3NY5_C: PDB processed!
3NY5_D: Chain extracted!
3NY5_D: Alignment updated!
3NY5_D: PDB processed!
4G0N downloading from PDB...
4G0N downloaded!
4G0N_B: Chain extracted!
4G0N_B: Alignment updated!
4G0N_B: PDB processed!
4G3X downloading from PDB...
4G3X downloaded!
4G3X_B: Chain extracted!
4G3X_B: Alignment updated!
4G3X_B: PDB processed!
6NTC downloading from PDB...
6NTC downloaded!
6NTC_B: Chain extracted!
6NTC_B: Alignment updated!
6NTC_B: PDB processed!
6NTD downloading from PDB...
6NTD downloaded!
6NTD_B: Chain extracted!
6NTD_B: Alignment updated!
6NTD_B: PDB processed!
6VJJ downloading from PDB...
6VJJ downloaded!
6VJJ_B: Chain extracted!
6VJJ_B: Alignment updated!
6VJJ_B: PDB processed!
6XGU downloading from PDB...
6XGU downloaded!
6XGU_B: Chain extracted!
6XGU_B: Alignment updated!
6XGU_B: PDB processed!
6XGV downloading from PDB...
6XGV downloaded!
6XGV_B: Chain extracted!
6XGV_B: Alignment updated!
6XGV_B: PDB processed!
6XHA downloading from PDB...
6XHA downloaded!
6XHA_B: Chain extracted!
6XHA_B: Alignment updated!
6XHA_B: PDB processed!
6XHB downloading from PDB...
6XHB downloaded!
6XHB_B: Chain extracted!
6XHB_B: Alignment updated!
6XHB_B: PDB processed!
6XI7 downloading from PDB...
6XI7 downloaded!
6XI7_B: Chain extracted!
6XI7_B: Alignment updated!
6XI7_B: PDB processed!
7JHP downloading from PDB...
7JHP downloaded!
7JHP_C: Chain extracted!
7JHP_C: Alignment updated!
7JHP_C: PDB processed!
Finishing... All templates successfully
downloaded and processed!
Templates can be found in
"/home/homelette/workdir/templates".
get_pdbs will check all chains of a template and download those with the correct sequence.
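How the chains are matched internally is not shown in this tutorial. As one plausible way to think about it, the sketch below accepts a chain if its observed residues appear, in order, within the template sequence from the alignment, which tolerates residues that are unresolved in the structure. The function names, the `min_fraction` cut-off and the toy sequences are all assumptions for illustration and should not be read as homelette's actual logic.

```python
def is_ordered_subsequence(observed: str, reference: str) -> bool:
    """True if `observed` can be obtained from `reference` by deleting residues."""
    remaining = iter(reference)
    # `res in remaining` advances the iterator until res is found
    return all(res in remaining for res in observed)


def matching_chains(chains: dict, reference: str, min_fraction: float = 0.5) -> list:
    """Return IDs of chains compatible with the reference sequence.

    Hypothetical helper: a chain must cover at least `min_fraction` of the
    reference length and be an ordered subsequence of it.
    """
    selected = []
    for chain_id, observed in chains.items():
        long_enough = len(observed) >= min_fraction * len(reference)
        if long_enough and is_ordered_subsequence(observed, reference):
            selected.append(chain_id)
    return selected


# toy data made up for this example
reference = "MGHHQKPIVRVFLPNKQ"
chains = {"A": "HQKPIVRVFLPNKQ", "B": "QKPIVRVFLPN", "X": "AAGGTTCCAA"}
print(matching_chains(chains, reference))  # ['A', 'B']
```

This mirrors the log above for 3NY5, where four chains of the same entry are extracted and each receives its own row in the updated alignment.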
[7]:
gen.alignment.print_clustal(70)
ARAF -------------GTVKVYLPNKQRTVVTVRDGMSVYDSLDKALKVRGLNQDCCVVYRLI---KGRKTVT
1C1Y_B ------------SNTIRVFLPNKQRTVVNVRNGMSLHDCLMKALKVRGLQPECCAVFRLLHEHKGKKARL
1GUA_B -------------NTIRVFLPNKQRTVVNVRNGMSLHDCLMKALKVRGLQPECCAVFRLLHEHKGKKARL
3KUC_B -------------NTIRVFLPNKQRTVVRVRNGMSLHDCLMKKLKVRGLQPECCAVFRLLHEHKGKKARL
3KUD_B -------------NTIRVFLPNKQRTVVNVRNGMSLHDCLMKKLKVRGLQPECCAVFRLLHEHKGKKARL
3NY5_A ---------H-QKPIVRVFLPNKQRTVVPARCGVTVRDSLKKAL--RGLIPECCAVYRIQ------KKPI
3NY5_B --------SH-QKPIVRVFLPNKQRTVVPARCGVTVRDSLKKAL--RGLIPECCAVYRIQ-----EKKPI
3NY5_C -----------QKPIVRVFLPNKQRTVVPARCGVTVRDSLKKAL--RGLIPECCAVYRIQ------KKPI
3NY5_D ---------H-QKPIVRVFLPNKQRTVVPARCGVTVRDSLKKAL--RGLIPECCAVYRI-------KKPI
4G0N_B -----------TSNTIRVFLPNKQRTVVNVRNGMSLHDCLMKALKVRGLQPECCAVFRLLHEHKGKKARL
4G3X_B ------------SNTIRVFLPNKQRTVVNVRNGMSLHDCLMKALKVRGLQPECCAVFRLLHEHKGKKARL
6NTC_B -------------NTIRVLLPNQEWTVVKV---MSLHDSLMKALKRHGLQPESSAVF---------KARL
6NTD_B ------------SNTIRVLLPNHERTVVKVRNGMSLHDSLMKALKRHGLQPESSAVF-----------RL
6VJJ_B ------------SNTIRVFLPNKQRTVVNVRNGMSLHDCLMKALKVRGLQPECCAVFRLLHEHKGKKARL
6XGU_B ------------SNTIRVFLPNKQRTVVNVRNGMSLHDCLMKALKVRGLQPE-CAVFRLLHEHKGKKARL
6XGV_B ------------SNTIRVFLPNKQRTVVNVRNGMSLHDCLMKALKVRGLQPE-CAVFRLLHEHKGKKARL
6XHA_B ------------SNTIRVFLPNKQRTVVNVRNGMSLHDCLMKALKVRGLQPE-CAVFRLLHEHKGKKARL
6XHB_B ------------SNTIRVFLPNKQRTVVNVRNGMSLHDCLMKALKVRGLQPE-CAVFRLLHEHKGKKARL
6XI7_B -------------NTIRVFLPNKQRTVVNVRNGMSLHDCLMKALKVRGLQPECCAVFRLLH----KKARL
7JHP_C ------------SNTIRVFLPNKQRTVVNVRNGMSLHDCLMKALKVRGLQPECCAVFRLL-----KKARL
ARAF AWDTAIAPLDGEELIVEVL---------------------------------------------------
1C1Y_B DWNTDAASLIGEELQVDFL---------------------------------------------------
1GUA_B DWNTDAASLIGEELQVDFL---------------------------------------------------
3KUC_B DWNTDAASLIGEELQVDFL---------------------------------------------------
3KUD_B DWNTDAASLIGEELQVDFL---------------------------------------------------
3NY5_A GWDTDISWLTGEELHVEVLENVPLT---------------------------------------------
3NY5_B GWDTDISWLTGEELHVEVLENVPLTTH-------------------------------------------
3NY5_C GWDTDISWLTGEELHVEVLENVPLTTH-------------------------------------------
3NY5_D GWDTDISWLTGEELHVEVLENVPL----------------------------------------------
4G0N_B DWNTDAASLIGEELQVDFL---------------------------------------------------
4G3X_B DWNTDAASLIGEELQVDFL---------------------------------------------------
6NTC_B DWNTDAASLIGEELQVDF----------------------------------------------------
6NTD_B DWNTDAASLIGEELQVD-----------------------------------------------------
6VJJ_B DWNTDAASLIGEELQVDFL---------------------------------------------------
6XGU_B DWNTDAASLIGEELQVDFLDHVPLTTHNFARKTFLKLAFCDICQKFLLNGFRCQTCGYKFHEHCSTKVPT
6XGV_B DWNTDAASLIGEELQVDFLDHVPLTTHNFARKTFLKLAFCDICQKFLLNGFRCQTCGYKFHEHCSTKVPT
6XHA_B DWNTDAASLIGEELQVDFLDHVPLTTHNFARKTFLKLAFCDICQKFLLNGFRCQTCGYKFHEHCSTKVPT
6XHB_B DWNTDAASLIGEELQVDFLDHVPLTTHNFARKTFLKLAFCDICQKFLLNGFRCQTCGYKFHEHCSTKVPT
6XI7_B DWNTDAASLIGEELQVDFLDHVPLTTHNFARKTFLKLAFCDICQKFLLNGFRCQTCGYKFHEHCSTKVPT
7JHP_C DWNTDAASLIGEELQVDFLDH--LTTHNFARKTFLKLAFCDICQKFLLNGFRCQTCGYKFHEHCSTKVPT
ARAF ------
1C1Y_B ------
1GUA_B ------
3KUC_B ------
3KUD_B ------
3NY5_A ------
3NY5_B ------
3NY5_C ------
3NY5_D ------
4G0N_B ------
4G3X_B ------
6NTC_B ------
6NTD_B ------
6VJJ_B ------
6XGU_B MCVDWS
6XGV_B MCVDWS
6XHA_B MCVDWS
6XHB_B MCVDWS
6XI7_B MCV---
7JHP_C MCVDW-
Now we can directly use these templates for homology modelling:
[8]:
# initialize task
t = gen.initialize_task(task_name = 'Tutorial8', overwrite = True)
# create a model per template
templates = [temp for temp in t.alignment.sequences.keys() if temp != 'ARAF']
for template in templates:
    t.execute_routine(
        tag = f'test_{template}',
        routine = hm.routines.Routine_automodel_default,
        templates = [template],
        template_location = './templates/'
    )
[9]:
# inspect models
t.models
[9]:
[<homelette.organization.Model at 0x7f22492f4340>,
<homelette.organization.Model at 0x7f22492f45b0>,
<homelette.organization.Model at 0x7f229829a610>,
<homelette.organization.Model at 0x7f2273b6afa0>,
<homelette.organization.Model at 0x7f2273b38ee0>,
<homelette.organization.Model at 0x7f22491c0e50>,
<homelette.organization.Model at 0x7f22491bf070>,
<homelette.organization.Model at 0x7f22491bf880>,
<homelette.organization.Model at 0x7f22491c5760>,
<homelette.organization.Model at 0x7f22491c5a00>,
<homelette.organization.Model at 0x7f22491c8310>,
<homelette.organization.Model at 0x7f22491c8820>,
<homelette.organization.Model at 0x7f22491b0f10>,
<homelette.organization.Model at 0x7f22491c96a0>,
<homelette.organization.Model at 0x7f22491c9b80>,
<homelette.organization.Model at 0x7f22491c8af0>,
<homelette.organization.Model at 0x7f22492f49d0>,
<homelette.organization.Model at 0x7f22491bfbe0>,
<homelette.organization.Model at 0x7f2273b38040>]
Method 2: HHSuite
This class is built on the hhblits query function of HHSuite3 [5].
It has the same interface as AlignmentGenerator_pdb, except for some different settings for the alignment generation with get_suggestion.
It should also be noted that technically, this approach does not generate a multiple sequence alignment, but rather a combined alignment of lots of pairwise alignments of query to template. These pairwise alignments are combined on the common sequence they are all aligned to.
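The idea of combining pairwise alignments on their common query sequence can be sketched in a few lines. The code below is an illustration under assumptions, not the code used by HHSuite or homelette: the function name `merge_on_query` and the toy sequences are made up. Each pairwise alignment is split into template residues aligned to a query position and template residues inserted between query positions; insertion slots are then padded to the widest insertion seen in any pairwise alignment.

```python
def merge_on_query(pairwise):
    """Merge (name, query_row, template_row) pairwise alignments on the query.

    Hypothetical helper for illustration. Returns a dict of equal-length rows.
    """
    query = pairwise[0][1].replace("-", "")
    n = len(query)
    parsed = []
    for name, q_row, t_row in pairwise:
        if len(q_row) != len(t_row):
            raise ValueError(f"rows for {name} differ in length")
        if q_row.replace("-", "") != query:
            raise ValueError(f"{name} is not aligned to the common query")
        inserts = [""] * (n + 1)  # template residues before query position i
        matches = []              # template character aligned to each query residue
        pos = 0
        for q, t in zip(q_row, t_row):
            if q == "-" and t == "-":
                continue          # empty column, nothing to place
            if q == "-":
                inserts[pos] += t
            else:
                matches.append(t)
                pos += 1
        parsed.append((name, inserts, matches))

    # each insertion slot must be wide enough for the longest insertion
    widths = [max(len(ins[i]) for _, ins, _ in parsed) for i in range(n + 1)]

    def build(inserts, matches):
        out = []
        for i in range(n):
            out.append(inserts[i].ljust(widths[i], "-"))
            out.append(matches[i])
        out.append(inserts[n].ljust(widths[n], "-"))
        return "".join(out)

    merged = {"query": build([""] * (n + 1), list(query))}
    for name, inserts, matches in parsed:
        merged[name] = build(inserts, matches)
    return merged


# toy pairwise alignments made up for this example
result = merge_on_query([("A", "AC-GT", "ACTGT"), ("B", "ACGT", "A-GT")])
for name, row in result.items():
    print(f"{name:6}{row}")
```

Note that templates are never compared with each other in this scheme: residues that two templates insert at the same slot end up in the same columns without having been aligned, which is why the result is not a true multiple sequence alignment.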
(This code is commented out since it requires big databases to run, which are not part of the docker container.)
[10]:
# gen = hm.alignment.AlignmentGenerator_hhblits.from_fasta('data/alignments/ARAF.fa')
# gen.get_suggestion(database_dir='/home/philipp/Downloads/hhsuite_dbs/')
# gen.get_pdbs()
# gen.show_suggestion()
# t = gen.initialize_task()
Method 3: Using pre-computed alignments
If you already have an alignment computed, but want to make use of get_pdbs in order to download the templates and process the alignment and the template structures, you can also load your alignment into an AlignmentGenerator object:
[11]:
# initialize an alignment generator from a pre-computed alignment
gen = hm.alignment.AlignmentGenerator_from_aln(
    alignment_file = 'data/alignments/unprocessed.fasta_aln',
    target = 'ARAF')
gen.show_suggestion()
gen.alignment.print_clustal(70)
gen.get_pdbs()
gen.alignment.print_clustal(70)
ARAF -------------GTVKVYLPNKQRTVVTVRDGMSVYDSLDKALKVRGLNQDCCVVYRLI---KGRKTVT
3NY5 MGHHHHHHSHMQKPIVRVFLPNKQRTVVPARCGVTVRDSLKKALMMRGLIPECCAVYRIQ---DGEKKPI
4G0N -----------TSNTIRVFLPNKQRTVVNVRNGMSLHDCLMKALKVRGLQPECCAVFRLLHEHKGKKARL
ARAF AWDTAIAPLDGEELIVEVL----------
3NY5 GWDTDISWLTGEELHVEVLENVPLTTHNF
4G0N DWNTDAASLIGEELQVDFL----------
Guessing template naming format...
Template naming format guessed: entry!
Checking template dir...
Template dir found!
Processing templates:
3NY5 downloading from PDB...
3NY5 downloaded!
3NY5_A: Chain extracted!
3NY5_A: Alignment updated!
3NY5_A: PDB processed!
3NY5_B: Chain extracted!
3NY5_B: Alignment updated!
3NY5_B: PDB processed!
3NY5_C: Chain extracted!
3NY5_C: Alignment updated!
3NY5_C: PDB processed!
3NY5_D: Chain extracted!
3NY5_D: Alignment updated!
3NY5_D: PDB processed!
4G0N downloading from PDB...
4G0N downloaded!
4G0N_B: Chain extracted!
4G0N_B: Alignment updated!
4G0N_B: PDB processed!
Finishing... All templates successfully
downloaded and processed!
Templates can be found in
"./templates/".
ARAF -------------GTVKVYLPNKQRTVVTVRDGMSVYDSLDKALKVRGLNQDCCVVYRLI---KGRKTVT
3NY5_A ---------H-QKPIVRVFLPNKQRTVVPARCGVTVRDSLKKAL--RGLIPECCAVYRIQ------KKPI
3NY5_B --------SH-QKPIVRVFLPNKQRTVVPARCGVTVRDSLKKAL--RGLIPECCAVYRIQ-----EKKPI
3NY5_C -----------QKPIVRVFLPNKQRTVVPARCGVTVRDSLKKAL--RGLIPECCAVYRIQ------KKPI
3NY5_D ---------H-QKPIVRVFLPNKQRTVVPARCGVTVRDSLKKAL--RGLIPECCAVYRI-------KKPI
4G0N_B -----------TSNTIRVFLPNKQRTVVNVRNGMSLHDCLMKALKVRGLQPECCAVFRLLHEHKGKKARL
ARAF AWDTAIAPLDGEELIVEVL----------
3NY5_A GWDTDISWLTGEELHVEVLENVPLT----
3NY5_B GWDTDISWLTGEELHVEVLENVPLTTH--
3NY5_C GWDTDISWLTGEELHVEVLENVPLTTH--
3NY5_D GWDTDISWLTGEELHVEVLENVPL-----
4G0N_B DWNTDAASLIGEELQVDFL----------
Again, for every template structure, homelette finds which chains fit the sequence and then extracts all of them.
Of course, if your alignment and template(s) are already processed, it is perfectly fine to use the Alignment class directly as we have done in the previous tutorials.
Implementing own methods
While not discussed in Tutorial 4, AlignmentGenerator objects are also building blocks in the homelette framework, and custom versions can be implemented. All AlignmentGenerator child classes so far inherit from the AlignmentGenerator abstract base class, which contains some useful functionality for writing your own alignment generators, in particular the get_pdbs function.
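The inheritance pattern described here can be illustrated without homelette installed. In the sketch below, `BaseGenerator` is a stand-in written only for this example; it is not homelette's AlignmentGenerator class, and its attributes and methods are assumptions. The point is the division of labour: the base class provides shared functionality, while each child class supplies its own way of finding templates in get_suggestion.

```python
from abc import ABC, abstractmethod


class BaseGenerator(ABC):
    """Stand-in for an abstract alignment generator (NOT homelette's class)."""

    def __init__(self, sequence: str, target: str = "target"):
        self.sequence = sequence
        self.target = target
        self.alignment = None  # filled by get_suggestion

    @abstractmethod
    def get_suggestion(self):
        """Fill self.alignment with the target and candidate templates."""

    def show_suggestion(self):
        # shared behaviour that child classes inherit unchanged
        return sorted(name for name in self.alignment if name != self.target)


class FixedTemplateGenerator(BaseGenerator):
    """Toy child class that 'finds' templates in a user-supplied dictionary."""

    def __init__(self, sequence, target, library):
        super().__init__(sequence, target)
        self.library = library

    def get_suggestion(self):
        # toy criterion: keep entries with the same length as the target
        hits = {name: seq for name, seq in self.library.items()
                if len(seq) == len(self.sequence)}
        self.alignment = {self.target: self.sequence, **hits}


gen_demo = FixedTemplateGenerator("ACDE", "query", {"T1": "ACDF", "T2": "AC"})
gen_demo.get_suggestion()
print(gen_demo.show_suggestion())  # ['T1']
```

For a real custom generator, the homelette API documentation should be consulted for the methods and attributes the actual abstract base class expects.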
Further Reading
Congratulations on finishing the tutorial about alignment generation in homelette.
Please note that there are other tutorials, which will teach you more about how to use homelette.
Tutorial 1: Learn about the basics of homelette.
Tutorial 2: Learn more about already implemented routines for homology modelling.
Tutorial 3: Learn about the evaluation metrics available with homelette.
Tutorial 4: Learn about extending homelette’s functionality by defining your own modelling routines and evaluation metrics.
Tutorial 5: Learn about how to use parallelization in order to generate and evaluate models more efficiently.
Tutorial 6: Learn about modelling protein complexes.
Tutorial 7: Learn about assembling custom pipelines.
References
[1] Rose, Y., Duarte, J. M., Lowe, R., Segura, J., Bi, C., Bhikadiya, C., Chen, L., Rose, A. S., Bittrich, S., Burley, S. K., & Westbrook, J. D. (2021). RCSB Protein Data Bank: Architectural Advances Towards Integrated Searching and Efficient Access to Macromolecular Structure Data from the PDB Archive. Journal of Molecular Biology, 433(11), 166704. https://doi.org/10.1016/J.JMB.2020.11.003
[2] Steinegger, M., & Söding, J. (2017). MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets. Nature Biotechnology, 35(11), 1026–1028. https://doi.org/10.1038/nbt.3988
[3] Sievers, F., Wilm, A., Dineen, D., Gibson, T. J., Karplus, K., Li, W., Lopez, R., McWilliam, H., Remmert, M., Söding, J., Thompson, J. D., & Higgins, D. G. (2011). Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Molecular Systems Biology, 7(1), 539. https://doi.org/10.1038/MSB.2011.75
[4] Sievers, F., & Higgins, D. G. (2018). Clustal Omega for making accurate alignments of many protein sequences. Protein Science, 27(1), 135–145. https://doi.org/10.1002/PRO.3290
[5] Steinegger, M., Meier, M., Mirdita, M., Vöhringer, H., Haunsberger, S. J., & Söding, J. (2019). HH-suite3 for fast remote homology detection and deep protein annotation. BMC Bioinformatics, 20(1), 1–15. https://doi.org/10.1186/S12859-019-3019-7
Session Info
[12]:
# session info
import session_info
session_info.show(html = False, dependencies = True)
-----
homelette 1.4
pandas 1.5.3
session_info 1.0.0
-----
PIL 7.0.0
altmod NA
anyio NA
asttokens NA
attr 19.3.0
babel 2.12.1
backcall 0.2.0
certifi 2022.12.07
chardet 3.0.4
charset_normalizer 3.1.0
comm 0.1.2
cycler 0.10.0
cython_runtime NA
dateutil 2.8.2
debugpy 1.6.6
decorator 4.4.2
executing 1.2.0
fastjsonschema NA
idna 3.4
importlib_metadata NA
importlib_resources NA
ipykernel 6.21.3
ipython_genutils 0.2.0
jedi 0.18.2
jinja2 3.1.2
json5 NA
jsonschema 4.17.3
jupyter_events 0.6.3
jupyter_server 2.4.0
jupyterlab_server 2.20.0
kiwisolver 1.0.1
markupsafe 2.1.2
matplotlib 3.1.2
modeller 10.4
more_itertools NA
mpl_toolkits NA
nbformat 5.7.3
numexpr 2.8.4
numpy 1.24.2
ost 2.3.1
packaging 20.3
parso 0.8.3
pexpect 4.8.0
pickleshare 0.7.5
pkg_resources NA
platformdirs 3.1.1
prometheus_client NA
promod3 3.2.1
prompt_toolkit 3.0.38
psutil 5.5.1
ptyprocess 0.7.0
pure_eval 0.2.2
pydev_ipython NA
pydevconsole NA
pydevd 2.9.5
pydevd_file_utils NA
pydevd_plugins NA
pydevd_tracing NA
pygments 2.14.0
pyparsing 2.4.6
pyrsistent NA
pythonjsonlogger NA
pytz 2022.7.1
qmean NA
requests 2.28.2
rfc3339_validator 0.1.4
rfc3986_validator 0.1.1
send2trash NA
sitecustomize NA
six 1.12.0
sniffio 1.3.0
stack_data 0.6.2
swig_runtime_data4 NA
tornado 6.2
traitlets 5.9.0
urllib3 1.26.15
wcwidth NA
websocket 1.5.1
yaml 6.0
zipp NA
zmq 25.0.1
-----
IPython 8.11.0
jupyter_client 8.0.3
jupyter_core 5.2.0
jupyterlab 3.6.1
notebook 6.5.3
-----
Python 3.8.10 (default, Nov 14 2022, 12:59:47) [GCC 9.4.0]
Linux-4.15.0-206-generic-x86_64-with-glibc2.29
-----
Session information updated at 2023-03-15 23:40