Measurement Tools

This module defines a class and functions for comparing coordinate data and measuring quantities.

buildDistMatrix(atoms1, atoms2=None, unitcell=None, format='mat', seqsep=None)

Returns distance matrix. When atoms2 is given, a distance matrix with shape (len(atoms1), len(atoms2)) is built. When atoms2 is None, a symmetric matrix with shape (len(atoms1), len(atoms1)) is built. If unitcell array is provided, periodic boundary conditions will be taken into account.

Parameters:

atoms1 (Atomic, numpy.ndarray) – atom or coordinate data
atoms2 (Atomic, numpy.ndarray) – atom or coordinate data
unitcell (numpy.ndarray) – orthorhombic unitcell dimension array with shape (3,)
format (str) – format of the resulting array, one of 'mat' (matrix, default), 'rcd' (arrays of row indices, column indices, and distances), or 'arr' (only array of distances)
seqsep (int) – if provided, distances will only be measured between atoms whose resnum values differ by at least seqsep.
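The effect of the unitcell argument can be illustrated with a minimal pure-Python sketch of the minimum-image convention for an orthorhombic cell. The helper names min_image_distance and dist_matrix are hypothetical and are not part of ProDy; the sketch only shows the idea behind periodic boundary handling and is not the library's implementation.

```python
import math

def min_image_distance(a, b, unitcell=None):
    """Distance between points a and b. When `unitcell` (three side
    lengths of an orthorhombic cell) is given, each component of the
    difference is wrapped to its nearest periodic image."""
    total = 0.0
    for k in range(3):
        d = a[k] - b[k]
        if unitcell is not None:
            # shift by the nearest whole number of cell lengths
            d -= unitcell[k] * round(d / unitcell[k])
        total += d * d
    return math.sqrt(total)

def dist_matrix(coords1, coords2=None, unitcell=None):
    """Nested list with shape (len(coords1), len(coords2)); symmetric
    when coords2 is omitted."""
    if coords2 is None:
        coords2 = coords1
    return [[min_image_distance(a, b, unitcell) for b in coords2]
            for a in coords1]

coords = [(0.0, 0.0, 0.0), (9.0, 0.0, 0.0), (0.0, 3.0, 4.0)]
plain = dist_matrix(coords)
wrapped = dist_matrix(coords, unitcell=(10.0, 10.0, 10.0))
```

In this toy input the first two points are 9 units apart without a unit cell but only 1 unit apart across the periodic boundary of a 10-unit cell.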

calcDistance(atoms1, atoms2, unitcell=None)

Returns the Euclidean distance between atoms1 and atoms2. Arguments may be Atomic instances or NumPy arrays. Shape of numpy arrays must be ([M,]N,3), where M is number of coordinate sets and N is the number of atoms. If unitcell array is provided, periodic boundary conditions will be taken into account.

Parameters:

atoms1 (Atomic, numpy.ndarray) – atom or coordinate data
atoms2 (Atomic, numpy.ndarray) – atom or coordinate data
unitcell (numpy.ndarray) – orthorhombic unitcell dimension array with shape (3,)

calcGyradius(atoms, weights=None): Calculate radius of gyration of atoms.

calcCenter(atoms, weights=None): Returns geometric center of atoms. If weights is given, it must be a flat array with length equal to the number of atoms. The center of mass can be calculated by setting weights equal to atom masses, i.e. weights=atoms.getMasses().
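As an illustration of how weights enter these two quantities, here is a minimal sketch using plain Python lists of (x, y, z) tuples. The function names center and gyradius are hypothetical stand-ins, and the weighted radius of gyration shown is the textbook definition, assumed rather than taken from ProDy's source.

```python
import math

def center(coords, weights=None):
    """Weighted average position; uniform weights give the geometric center."""
    if weights is None:
        weights = [1.0] * len(coords)
    wsum = sum(weights)
    return tuple(sum(w * c[k] for w, c in zip(weights, coords)) / wsum
                 for k in range(3))

def gyradius(coords, weights=None):
    """Square root of the weighted mean squared distance from the center."""
    if weights is None:
        weights = [1.0] * len(coords)
    cen = center(coords, weights)
    d2 = sum(w * sum((c[k] - cen[k]) ** 2 for k in range(3))
             for w, c in zip(weights, coords))
    return math.sqrt(d2 / sum(weights))

coords = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
geometric = center(coords)                       # midpoint of the two points
mass_center = center(coords, weights=[1.0, 3.0])  # pulled toward the heavier point
rg = gyradius(coords)
```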

calcAngle(atoms1, atoms2, atoms3, radian=False): Returns the angle between atoms, in degrees unless radian=True.

calcDihedral(atoms1, atoms2, atoms3, atoms4, radian=False): Returns the dihedral angle between atoms, in degrees unless radian=True.

getAngle(coords1, coords2, coords3, radian=False): Returns the bond angle in degrees unless radian=True.

getDihedral(coords1, coords2, coords3, coords4, radian=False): Returns the dihedral angle in degrees unless radian=True.
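The geometry behind these functions can be sketched with the standard vector formulas. The helpers angle and dihedral below are hypothetical, operate on plain (x, y, z) tuples, and use one common sign convention for the dihedral; whether that sign matches ProDy's output is an assumption.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def angle(c1, c2, c3, radian=False):
    """Angle at c2 formed by the vectors c2->c1 and c2->c3."""
    v1, v2 = _sub(c1, c2), _sub(c3, c2)
    cos_t = _dot(v1, v2) / math.sqrt(_dot(v1, v1) * _dot(v2, v2))
    theta = math.acos(max(-1.0, min(1.0, cos_t)))  # clamp rounding noise
    return theta if radian else math.degrees(theta)

def dihedral(c1, c2, c3, c4, radian=False):
    """Torsion about the c2-c3 axis (sign convention is an assumption)."""
    b0, b1, b2 = _sub(c1, c2), _sub(c3, c2), _sub(c4, c3)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # project b0 and b2 onto the plane perpendicular to the central bond
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    phi = math.atan2(_dot(_cross(b1, v), w), _dot(v, w))
    return phi if radian else math.degrees(phi)

right_angle = angle((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0))
trans = dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0),
                 (0.0, 0.0, 1.0), (-1.0, 0.0, 1.0))
cis = dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0),
               (0.0, 0.0, 1.0), (1.0, 0.0, 1.0))
```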

calcOmega(residue, radian=False, dist=4.1): Returns ω (omega) angle of residue in degrees. This function checks the distance between Cα atoms of two residues and raises an exception if the residues are disconnected. Set dist to None to skip this check.

calcPhi(residue, radian=False, dist=4.1): Returns φ (phi) angle of residue in degrees. This function checks the distance between Cα atoms of two residues and raises an exception if the residues are disconnected. Set dist to None to skip this check.

calcPsi(residue, radian=False, dist=4.1): Returns ψ (psi) angle of residue in degrees. This function checks the distance between Cα atoms of two residues and raises an exception if the residues are disconnected. Set dist to None to skip this check.
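The connectivity check described for the three backbone-angle functions can be pictured as a simple distance test between two Cα positions. The helper check_connected below is hypothetical, and both the ValueError type and the strict "greater than" comparison are assumptions; the documentation above states only that an exception is raised.

```python
import math

def check_connected(ca1, ca2, dist=4.1):
    """Raise ValueError when two C-alpha positions are farther apart than
    `dist`; passing dist=None disables the check."""
    if dist is None:
        return
    if math.dist(ca1, ca2) > dist:
        raise ValueError('residues appear to be disconnected')

# Consecutive residues joined by a trans peptide bond sit about 3.8 A apart,
# so this pair passes with the default cutoff.
check_connected((0.0, 0.0, 0.0), (3.8, 0.0, 0.0))

# A chain break would fail the default check but passes when it is disabled.
check_connected((0.0, 0.0, 0.0), (9.0, 0.0, 0.0), dist=None)
```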

calcMSF(coordsets)

Calculate mean square fluctuation(s) (MSF). coordsets may be an instance of Ensemble, TrajBase, or Atomic. For trajectory objects, e.g. DCDFile, frames will be considered after they are superposed. For other ProDy objects, coordinate sets should be aligned prior to MSF calculation.

Note that using trajectory files that store 32-bit coordinates will result in lower precision in calculations. Over 10,000 frames this may result in up to 5% difference from the values calculated using 64-bit arrays. To ensure higher-precision calculations for DCDFile instances, you may use the astype argument, i.e. astype=float, to automatically recast coordinate data to double-precision (64-bit) floating-point format.

calcRMSF(coordsets)

Returns root mean square fluctuation(s) (RMSF). coordsets may be an instance of Ensemble, TrajBase, or Atomic. For trajectory objects, e.g. DCDFile, frames will be considered after they are superposed. For other ProDy objects, coordinate sets should be aligned prior to RMSF calculation.
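For reference, the two quantities can be written out in a few lines of plain Python. The sketch below assumes the frames are already superposed and uses the usual definitions (per-atom mean squared deviation from the average position, and its square root); the function names msf and rmsf are hypothetical and do not reproduce ProDy's trajectory handling.

```python
import math

def msf(coordsets):
    """Per-atom mean square fluctuation.

    coordsets is a list of frames, each a list of (x, y, z) tuples for the
    same atoms in the same order."""
    n_frames = len(coordsets)
    n_atoms = len(coordsets[0])
    result = []
    for i in range(n_atoms):
        # average position of atom i over all frames
        mean = [sum(frame[i][k] for frame in coordsets) / n_frames
                for k in range(3)]
        # average squared displacement from that mean position
        fluct = sum(sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
                    for frame in coordsets) / n_frames
        result.append(fluct)
    return result

def rmsf(coordsets):
    """Per-atom root mean square fluctuation."""
    return [math.sqrt(value) for value in msf(coordsets)]

frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]
msf_values = msf(frames)
rmsf_values = rmsf(frames)
```

In this toy input the first atom never moves and the second oscillates by one unit around its mean position.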

calcDeformVector(from_atoms, to_atoms, weights=None): Returns deformation from from_atoms to to_atoms as a Vector instance.

buildADPMatrix(atoms)

Returns a 3Nx3N symmetric matrix containing anisotropic displacement parameters (ADPs) along the diagonal as 3x3 super elements.

In [1]: from prody import *

In [2]: protein = parsePDB('1ejg')

In [3]: calphas = protein.select('calpha')

In [4]: adp_matrix = buildADPMatrix(calphas)
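The layout of such a matrix can be sketched without ProDy. The example below assumes each atom's six anisotropic values are ordered as in the PDB ANISOU record (U11, U22, U33, U12, U13, U23); the helper names adp_block and build_adp_matrix are hypothetical, and the sketch fills only the 3x3 diagonal blocks as described above.

```python
def adp_block(u):
    """Symmetric 3x3 tensor from six values ordered
    (U11, U22, U33, U12, U13, U23); this ordering is an assumption."""
    u11, u22, u33, u12, u13, u23 = u
    return [[u11, u12, u13],
            [u12, u22, u23],
            [u13, u23, u33]]

def build_adp_matrix(anisous):
    """3N x 3N nested list with one 3x3 ADP block per atom on the diagonal."""
    n = len(anisous)
    matrix = [[0.0] * (3 * n) for _ in range(3 * n)]
    for i, u in enumerate(anisous):
        block = adp_block(u)
        for r in range(3):
            for c in range(3):
                matrix[3 * i + r][3 * i + c] = block[r][c]
    return matrix

# Two made-up atoms, purely for illustration
toy = build_adp_matrix([
    (0.10, 0.20, 0.30, 0.01, 0.02, 0.03),
    (0.40, 0.50, 0.60, 0.04, 0.05, 0.06),
])
```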

calcADPAxes(atoms, **kwargs)

Returns a 3Nx3 array containing principal axes defining anisotropic displacement parameter (ADP, or anisotropic temperature factor) ellipsoids.

Parameters:

atoms (Atomic) – a ProDy object for handling atomic data
fract (float) – For an atom, if the fraction of anisotropic displacement explained by its largest axis/eigenvector is less than the given value, all axes for that atom will be set to zero. Values larger than 0.33 and smaller than 1.0 are accepted.
ratio2 (float) – For an atom, if the ratio of the second-largest eigenvalue to the largest eigenvalue is less than or equal to the given value, all principal axes for that atom will be returned. Values less than 1 and greater than 0 are accepted.
ratio3 (float) – For an atom, if the ratio of the smallest eigenvalue to the largest eigenvalue is less than or equal to the given value, all principal axes for that atom will be returned. Values less than 1 and greater than 0 are accepted.
ratio (float) – Same as ratio3.

Keyword arguments fract, ratio2, or ratio3 can be used to set principal axes to 0 for atoms showing relatively lower degree of anisotropy.

The 3Nx3 array contains N 3x3 matrices, one for each given atom. Columns of these 3x3 matrices are the principal axes, weighted by the square root of their eigenvalues. The first column of each matrix corresponds to the largest principal axis.

The direction of the principal axes for an atom is determined based on the correlation of the axes vector with the principal axes vector of the previous atom.
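The three anisotropy criteria can be restated as a small decision function on the eigenvalues of one atom's ADP tensor. The function keep_axes is a hypothetical helper written from the parameter descriptions above; in particular, the order in which the criteria are tried when more than one is supplied is an assumption.

```python
def keep_axes(eigvals, fract=None, ratio2=None, ratio3=None):
    """Return True when an atom's principal axes would be kept and False
    when they would be set to zero.

    eigvals: the three eigenvalues of the atom's ADP tensor, in any order."""
    small, middle, large = sorted(eigvals)
    if fract is not None:
        # share of the total displacement explained by the largest axis
        return large / (small + middle + large) >= fract
    if ratio2 is not None:
        return middle / large <= ratio2
    if ratio3 is not None:
        return small / large <= ratio3
    return True  # no criterion given: keep every atom

strongly_anisotropic = (1.0, 1.0, 8.0)
nearly_isotropic = (3.0, 3.0, 4.0)
kept = keep_axes(strongly_anisotropic, fract=0.5)
zeroed = not keep_axes(nearly_isotropic, fract=0.5)
```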

In [1]: from prody import *

In [2]: protein = parsePDB('1ejg')

In [3]: calphas = protein.select('calpha')

In [4]: adp_axes = calcADPAxes(calphas)

In [5]: adp_axes.shape

These can be written in NMD format as follows:

In [6]: nma = NMA('ADPs')

In [7]: nma.setEigens(adp_axes)

In [8]: nma

In [9]: writeNMD('adp_axes.nmd', nma, calphas)

calcADPs(atom)

Calculate anisotropic displacement parameters (ADPs) from anisotropic temperature factors (ATFs).

atom must have ATF values set for ADP calculation. ADPs are returned as a tuple, i.e. (eigenvalues, eigenvectors).

pickCentral(obj, weights=None): Returns Atom or Conformation that is closest to the center of obj, which may be an Atomic or Ensemble instance. See also the pickCentralAtom() and pickCentralConf() functions.

pickCentralAtom(atoms, weights=None): Returns Atom that is closest to the center, which is calculated using calcCenter().

pickCentralConf(ens, weights=None): Returns Conformation that is closest to the center of ens. In addition to Ensemble instances, Atomic instances are accepted as the ens argument. In this case a Selection with the central coordinate set as active will be returned.

calcInertiaTensor(coords): Calculate inertia tensor from coords.

calcPrincAxes(coords, turbo=True): Calculate principal axes from coords.

calcDistanceMatrix(coords, cutoff=None)

Calculate matrix of distances between coordinates within cutoff. Other matrix entries are set to maximum of calculated distances.

Parameters:

coords (ndarray, Atomic) – a coordinate set or an object with getCoords() method.
cutoff (None, float) – cutoff distance for searching the KDTree. Default (None) is to use the length of the longest coordinate axis.
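The fill rule for entries beyond the cutoff can be illustrated with a brute-force sketch. It deliberately replaces the KDTree search with a double loop, treats the cutoff as inclusive (an assumption), and uses the hypothetical name distance_matrix_with_cutoff; it is meant only to show how out-of-range entries end up equal to the largest calculated distance.

```python
import math

def distance_matrix_with_cutoff(coords, cutoff):
    """Symmetric nested-list matrix: pairs within `cutoff` get their actual
    distance, all other off-diagonal entries get the largest distance
    that was calculated."""
    n = len(coords)
    within = {}
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            if d <= cutoff:
                within[(i, j)] = d
    fill = max(within.values(), default=0.0)
    matrix = [[0.0 if i == j else fill for j in range(n)] for i in range(n)]
    for (i, j), d in within.items():
        matrix[i][j] = matrix[j][i] = d
    return matrix

points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
dm = distance_matrix_with_cutoff(points, cutoff=2.5)
```

Here only the pairs (0, 1) and (1, 2) fall within the cutoff, so every other off-diagonal entry is filled with 2.0, the larger of the two calculated distances.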

assignBlocks(atoms, res_per_block=None, secstr=False, **kwargs)

Assigns blocks to protein from atoms using a block size of res_per_block or secondary structure information if secstr is True.

Returns an array of block IDs and an AtomMap corresponding to protein atoms.

Parameters:

atoms (Atomic) – atoms to be assigned blocks
res_per_block (int) – number of residues per block. The last block may be smaller or larger than this. Default is None, allowing secstr to be used easily instead.
secstr (bool) – use secondary structure information to assign blocks. Default is False, allowing res_per_block to be used easily instead. Any set of strings that can be retrieved by getSecstr() is acceptable, including from PDB header, DSSP or STRIDE.
shortest_block (int) – smallest number of residues to be included in a block before merging with the previous block. Default is 4, as smaller numbers can cause problems for distance matrices.
longest_block (int) – largest number of residues to be included in a block before splitting it in half. Default is the length of the protein, so it isn't triggered.
min_dist_cutoff (Number) – minimum distance of a residue from others beyond which it is not included in the same block as them using findSubgroups(). Default is 20 Å, which was found to work well with res_per_block=10.
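The interplay of res_per_block and shortest_block can be pictured with a simplified sketch that numbers residues only. The function assign_block_ids is hypothetical; it ignores secondary structure, longest_block and min_dist_cutoff, returns plain block IDs rather than an AtomMap, and its merge rule for a short final block is an assumption based on the parameter descriptions above.

```python
def assign_block_ids(n_residues, res_per_block, shortest_block=4):
    """Block ID for each of n_residues consecutive residues.

    Residues are grouped in chunks of res_per_block; a trailing chunk with
    fewer than shortest_block residues is merged into the previous block."""
    block_ids = [i // res_per_block for i in range(n_residues)]
    if not block_ids:
        return block_ids
    last = block_ids[-1]
    if last > 0 and block_ids.count(last) < shortest_block:
        block_ids = [last - 1 if b == last else b for b in block_ids]
    return block_ids

merged = assign_block_ids(23, res_per_block=10)  # trailing 3 residues are merged
kept = assign_block_ids(25, res_per_block=10)    # trailing 5 residues stay separate
```

With 23 residues the final block ends up larger than res_per_block, which matches the note that the last block may be smaller or larger than the requested size.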