PDB structure without hydrogens

Structures downloaded directly from the PDB database often lack hydrogen atoms, which are required, for example, for predicting hydrogen bonds. In such a case, we can use the addMissingAtoms() function. It predicts the positions of hydrogen atoms in the protein structure using one of two available methods: openbabel or pdbfixer.

To use either method, we need to install the corresponding additional Python package. For Anaconda users, the installation commands are the following:

Installation of Openbabel:

conda install -c conda-forge openbabel

Installation of PDBfixer:

conda install -c conda-forge pdbfixer
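Before choosing a method, it can help to confirm which of the two packages is importable in the current environment. The following is a minimal sketch using only the standard library. The helper name available_backends is hypothetical, and the sketch assumes the packages expose importable modules named openbabel and pdbfixer.

```python
from importlib.util import find_spec


def available_backends(modules=("openbabel", "pdbfixer")):
    """Map each module name to True if it can be found in this environment.

    The module names are assumptions based on the package names above.
    find_spec only locates a top-level module; it does not import it.
    """
    return {name: find_spec(name) is not None for name in modules}


for name, found in available_backends().items():
    status = "available" if found else "not installed"
    print(f"{name}: {status}")
```

If a package is reported as not installed, the corresponding conda command above should be run in the same environment that runs ProDy.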

Add missing hydrogen atoms to the structure

We start by fetching the PDB file for entry 5KQM (5kqm.pdb). Openbabel requires the PDB file to be in the local folder, so the file must be downloaded and saved before the missing hydrogens can be added. The new file is saved under the same name with the additional prefix ‘addH_’.

In [1]: from prody import *

In [2]: from pylab import *

In [3]: import matplotlib

In [4]: ion()   # turn interactive mode on

Both Openbabel and PDBfixer require the PDB file to be saved in the local directory, so it needs to be downloaded first.

In [5]: fetchPDB('5kqm', compressed=False)
'5kqm.pdb'

Once the PDB file is in the local directory, we can choose between Openbabel and PDBfixer to add the missing hydrogen atoms to the protein structure:

Openbabel:

In [6]: PDBname = '5kqm.pdb'

In [7]: addMissingAtoms(PDBname, method='openbabel')
@> Hydrogens were added to the structure. Structure addH_5kqm.pdb is saved in the local directry.

PDBfixer:

In [8]: addMissingAtoms(PDBname, method='pdbfixer')
@> Hydrogens were added to the structure. New structure is saved as addH_5kqm.pdb.
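Both log messages report the output name addH_5kqm.pdb. A small sketch of how that name can be derived and checked before parsing is given here. The helper hydrogenated_path is hypothetical, and placing the output next to the input file is an assumption; the text above only states that the prefix is added to the file name.

```python
from pathlib import Path


def hydrogenated_path(pdb_name, prefix="addH_"):
    """Return the expected output path: the input file name with a prefix.

    Assumption: the output sits in the same folder as the input file.
    """
    path = Path(pdb_name)
    return path.with_name(prefix + path.name)


out_path = hydrogenated_path("5kqm.pdb")
if out_path.exists():
    print(f"Found {out_path}; it can be parsed next.")
else:
    print(f"{out_path} not found; run addMissingAtoms first.")
```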

Next, we can parse the saved structure with hydrogen atoms into ProDy and analyze it in the same way as in the previous paragraph.

In [9]: atoms = parsePDB('addH_'+str(PDBname)).select('protein')
@> 2800 atoms and 1 coordinate set(s) were parsed in 0.03s.
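As a quick sanity check that hydrogens were actually written to the new file, the atom records can be counted independently of ProDy. The sketch below is not part of ProDy. The helper count_hydrogens is hypothetical and relies on the fixed-column PDB layout, where the element symbol occupies columns 77-78. When that field is blank, it falls back to the atom name in columns 13-16, which is only a heuristic.

```python
def count_hydrogens(pdb_lines):
    """Count ATOM/HETATM records whose element is hydrogen."""
    count = 0
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        # Element symbol: columns 77-78 (0-based slice 76:78).
        element = line[76:78].strip().upper()
        if not element:
            # Heuristic fallback: drop leading digits from the atom name
            # (e.g. "1HB" becomes "HB") and take its first letter.
            name = line[12:16].strip().lstrip("0123456789")
            element = name[:1].upper()
        if element == "H":
            count += 1
    return count
```

For example, opening addH_5kqm.pdb with open() and passing the file object to count_hydrogens should give a non-zero count if hydrogens were added.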