homelette.pdb_io
The homelette.pdb_io submodule contains an object for parsing and manipulating PDB files. There are several constructor functions that can read PDB files or download them from the internet.
Functions and classes
Functions and classes present in homelette.pdb_io are listed below:
- homelette.pdb_io.read_pdb(file_name: str) → PdbObject
Reads PDB from file.
- Parameters:
file_name (str) – PDB file name
- Return type:
PdbObject
Notes
If a PDB file with multiple MODELs is read, only the first model will be conserved.
- homelette.pdb_io.download_pdb(pdbid: str) → PdbObject
Download PDB from the RCSB.
- Parameters:
pdbid (str) – PDB identifier
- Return type:
PdbObject
Notes
If the downloaded PDB file contains multiple MODELs, only the first model will be conserved.
- class homelette.pdb_io.PdbObject(lines: Iterable)
Object encapsulating functionality regarding the processing of PDB files
- Parameters:
lines (Iterable) – The lines of the PDB
- Variables:
lines – The lines of the PDB, filtered for ATOM and HETATM records
- Return type:
None
Notes
Please construct instances of PdbObject using the constructor functions read_pdb and download_pdb.
If a PDB file with multiple MODELs is read, only the first model will be conserved.
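The two behaviours described above (keeping only ATOM and HETATM records, and keeping only the first MODEL) can be illustrated with a minimal, standard-library sketch. This is not homelette's implementation; the function name first_model_coordinate_lines and the abbreviated example records are hypothetical and serve only to show the idea.

```python
def first_model_coordinate_lines(lines):
    """Keep ATOM/HETATM records and stop at the end of the first MODEL."""
    kept = []
    for line in lines:
        # The record name occupies the first six columns of a PDB line.
        record = line[:6].strip()
        if record == "ENDMDL":
            # End of the first model: everything after it is ignored.
            break
        if record in ("ATOM", "HETATM"):
            kept.append(line.rstrip("\n"))
    return kept


# Abbreviated, hypothetical records (coordinates omitted for brevity).
example = [
    "HEADER    HYPOTHETICAL EXAMPLE",
    "MODEL        1",
    "ATOM      1  N   MET A   1",
    "HETATM    2  O   HOH A 101",
    "ENDMDL",
    "MODEL        2",
    "ATOM      1  N   MET A   1",
    "ENDMDL",
]
filtered = first_model_coordinate_lines(example)
```

Here filtered holds the ATOM and HETATM lines of the first model only; the header and the second model are dropped.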
- write_pdb(file_name) → None
Write PDB to file.
- Parameters:
file_name (str) – The name of the file to write the PDB to.
- Return type:
None
- parse_to_pd() → pandas.DataFrame
Parses the PDB into a pandas DataFrame.
- Return type:
pandas.DataFrame
Notes
Information is extracted according to the PDB file specification (version 3.30) and columns are named accordingly. See https://www.wwpdb.org/documentation/file-format for more information.
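Because PDB coordinate records are fixed-width, extraction "according to the specification" amounts to slicing each line at fixed column positions. The sketch below shows this for a single ATOM record, using only the standard library. The field names follow the ATOM record definition in the PDB format; whether homelette uses exactly these column names is an assumption, and COLUMNS and parse_atom_line are hypothetical helpers, not part of homelette.

```python
# (start, end, field, converter): zero-based, end-exclusive slices that
# correspond to the 1-based column ranges of an ATOM/HETATM record.
COLUMNS = [
    (0, 6, "record", str),
    (6, 11, "serial", int),
    (12, 16, "name", str),
    (16, 17, "altLoc", str),
    (17, 20, "resName", str),
    (21, 22, "chainID", str),
    (22, 26, "resSeq", int),
    (26, 27, "iCode", str),
    (30, 38, "x", float),
    (38, 46, "y", float),
    (46, 54, "z", float),
    (54, 60, "occupancy", float),
    (60, 66, "tempFactor", float),
    (76, 78, "element", str),
]


def parse_atom_line(line):
    """Slice one fixed-width coordinate record into a dict of typed fields."""
    return {
        field: convert(line[start:end].strip())
        for start, end, field, convert in COLUMNS
    }


# One illustrative ATOM record, assembled column by column.
line = (
    "ATOM  "       # columns 1-6: record name
    "    1"        # columns 7-11: atom serial number
    " "            # column 12: unused
    " N  "         # columns 13-16: atom name
    " "            # column 17: alternate location indicator
    "MET"          # columns 18-20: residue name
    " "            # column 21: unused
    "A"            # column 22: chain ID
    "   1"         # columns 23-26: residue sequence number
    " "            # column 27: insertion code
    "   "          # columns 28-30: unused
    "  27.340"     # columns 31-38: x
    "  24.430"     # columns 39-46: y
    "   2.614"     # columns 47-54: z
    "  1.00"       # columns 55-60: occupancy
    "  9.67"       # columns 61-66: temperature factor
    "          "   # columns 67-76: unused
    " N"           # columns 77-78: element symbol
)
row = parse_atom_line(line)
```

A list of such dictionaries, one per record, is the kind of input from which a pandas DataFrame can be built.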
- get_sequence(ignore_missing: bool = True) → str
Retrieve the 1-letter amino acid sequence of the PDB, grouped by chain.
- Parameters:
ignore_missing (bool) – Controls how unmodelled residues are handled. If True, they are ignored when generating the sequence (default). If False, each of them is represented in the sequence by the character X.
- Returns:
Amino acid sequence
- Return type:
str
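The effect of ignore_missing can be illustrated with a small, self-contained sketch. It is not homelette's implementation: it assumes that unmodelled residues show up as gaps in the residue numbering within a chain, it returns a dict per chain rather than the single string that get_sequence returns, and THREE_TO_ONE and sequence_by_chain are hypothetical names.

```python
# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def sequence_by_chain(residues, ignore_missing=True):
    """Build one-letter sequences per chain from (chain, number, name) tuples.

    Assumption: a jump in residue numbering within a chain marks
    unmodelled residues. Non-standard residue names are not handled.
    """
    sequences = {}
    last_number = {}
    for chain, number, name in residues:
        seq = sequences.get(chain, "")
        if not ignore_missing and chain in last_number:
            # Insert one X per residue number skipped since the last residue.
            gap = number - last_number[chain] - 1
            seq += "X" * max(gap, 0)
        seq += THREE_TO_ONE[name]
        sequences[chain] = seq
        last_number[chain] = number
    return sequences


# Hypothetical residues: chain A lacks residues 3 and 4.
residues = [("A", 1, "MET"), ("A", 2, "ALA"), ("A", 5, "GLY"), ("B", 1, "SER")]
ignored = sequence_by_chain(residues, ignore_missing=True)
marked = sequence_by_chain(residues, ignore_missing=False)
```

With ignore_missing=True, chain A yields MAG; with ignore_missing=False, the two skipped residues appear as X, giving MAXXG.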
- get_chains() → list
Extract all chains present in the PDB.
- Return type:
list
- transform_extract_chain(chain) → PdbObject
Extract chain from PDB.
- Parameters:
chain (str) – The chain ID to be extracted.
- Return type:
PdbObject
- transform_renumber_residues(starting_res: int = 1) → PdbObject
Renumber residues in PDB.
- Parameters:
starting_res (int) – Residue number to start renumbering at (default 1)
- Return type:
PdbObject
Notes
Missing residues in the PDB (i.e. unmodelled) will not be considered in the renumbering. If multiple chains are present in the PDB, numbering will be continued from one chain to the next one.
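The renumbering rule in the note above (gaps are closed, and numbering continues across chains) can be sketched as follows. This is an illustration on plain tuples rather than homelette's code; the function renumber and the example identifiers are hypothetical.

```python
def renumber(residue_ids, starting_res=1):
    """Assign consecutive numbers to distinct (chain, residue number) pairs.

    residue_ids holds one entry per atom, in file order, so atoms of the
    same residue share an identifier and receive the same new number.
    """
    mapping = {}
    next_number = starting_res
    for res_id in residue_ids:
        if res_id not in mapping:
            # First atom of a new residue: take the next free number,
            # regardless of gaps or of the chain the residue belongs to.
            mapping[res_id] = next_number
            next_number += 1
    return [mapping[res_id] for res_id in residue_ids]


# Hypothetical per-atom identifiers: chain A lacks residues 7 and 8.
atoms = [("A", 5), ("A", 5), ("A", 6), ("A", 9), ("B", 1), ("B", 2)]
from_one = renumber(atoms)
from_ten = renumber(atoms, starting_res=10)
```

In this sketch chain A becomes 1, 2, 3 despite the gap in the original numbering, and chain B continues with 4 and 5 instead of restarting at 1.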
- transform_change_chain_id(new_chain_id) → PdbObject
Replace chain ID for every entry in PDB.
- Parameters:
new_chain_id (str) – New chain ID.
- Return type:
PdbObject
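In the PDB format the chain ID is a single character in column 22 of each coordinate record, so both transform_extract_chain and transform_change_chain_id can be thought of as operations on that one column. The sketch below works on plain strings and is not homelette's implementation; atom_line, extract_chain and change_chain_id are hypothetical helpers, and the records are abbreviated (no coordinates).

```python
def atom_line(serial, res_name, chain, res_seq):
    """Build an abbreviated ATOM record with the chain ID in column 22."""
    return f"ATOM  {serial:>5}  N   {res_name:>3} {chain}{res_seq:>4}"


def extract_chain(lines, chain):
    """Keep only records whose chain ID (column 22, index 21) matches."""
    return [line for line in lines if line[21] == chain]


def change_chain_id(lines, new_chain_id):
    """Overwrite the chain ID column of every record."""
    return [line[:21] + new_chain_id + line[22:] for line in lines]


lines = [atom_line(1, "MET", "A", 1), atom_line(2, "SER", "B", 1)]
only_b = extract_chain(lines, "B")
renamed = change_chain_id(only_b, "A")
```

Chaining the two steps, as done here, extracts chain B and relabels it as chain A while leaving all other columns untouched.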
- transform_filter_res_name(selection: Iterable, mode: str = 'out') → PdbObject
Filter PDB by residue name.
- Parameters:
selection (Iterable) – The residue names to filter by
mode (str) – Filtering mode. If mode = “out”, the selection will be filtered out (default). If mode = “in”, everything except the selection will be filtered out.
- Return type:
PdbObject
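The two filtering modes can be illustrated with a short sketch that works on plain record strings. It is not homelette's implementation; record_line and filter_res_name are hypothetical helpers, the records are abbreviated (no coordinates), and raising ValueError for an unknown mode is an assumption of this sketch.

```python
def record_line(record, serial, name, res_name, chain, res_seq):
    """Build an abbreviated coordinate record (residue name in columns 18-20)."""
    return f"{record:<6}{serial:>5}  {name:<3} {res_name:>3} {chain}{res_seq:>4}"


def filter_res_name(lines, selection, mode="out"):
    """Filter records by residue name.

    mode="out" removes the selected residue names;
    mode="in" keeps only the selected residue names.
    """
    selection = set(selection)
    if mode == "out":
        return [l for l in lines if l[17:20].strip() not in selection]
    if mode == "in":
        return [l for l in lines if l[17:20].strip() in selection]
    raise ValueError("mode must be 'in' or 'out'")


lines = [
    record_line("ATOM", 1, "N", "MET", "A", 1),
    record_line("HETATM", 2, "O", "HOH", "A", 101),
    record_line("ATOM", 3, "N", "SER", "A", 2),
]
without_water = filter_res_name(lines, ["HOH"], mode="out")
only_water = filter_res_name(lines, ["HOH"], mode="in")
```

A typical use of the "out" mode, as in this example, is stripping water molecules (HOH) from a structure before further processing.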