Measurement Tools
This module defines a class and methods for comparing coordinate data and measuring quantities.
-
buildDistMatrix
(atoms1, atoms2=None, unitcell=None, format='mat', seqsep=None)
Returns a distance matrix. When atoms2 is given, a distance matrix with shape (len(atoms1), len(atoms2)) is built. When atoms2 is None, a symmetric matrix with shape (len(atoms1), len(atoms1)) is built. If a unitcell array is provided, periodic boundary conditions will be taken into account.
Parameters:
- atoms1 (Atomic, numpy.ndarray) – atom or coordinate data
- atoms2 (Atomic, numpy.ndarray) – atom or coordinate data
- unitcell (numpy.ndarray) – orthorhombic unitcell dimension array with shape (3,)
- format (str) – format of the resulting array, one of 'mat' (matrix, default), 'rcd' (arrays of row indices, column indices, and distances), or 'arr' (only array of distances)
- seqsep (int) – if provided, distances will only be measured between atoms with resnum differences that are greater than or equal to seqsep.
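The matrix shapes and the periodic boundary handling described above can be sketched with plain NumPy. This is a minimal illustration, not ProDy's implementation: dist_matrix_sketch is a hypothetical helper, the minimum-image wrapping assumes an orthorhombic cell, and taking the 'rcd'-style triple from the upper triangle is an assumption.

```python
import numpy as np

def dist_matrix_sketch(coords1, coords2=None, unitcell=None):
    """Pairwise distances with optional orthorhombic minimum-image wrapping."""
    coords1 = np.asarray(coords1, dtype=float)
    coords2 = coords1 if coords2 is None else np.asarray(coords2, dtype=float)
    # Difference vectors with shape (len(coords1), len(coords2), 3)
    diff = coords1[:, None, :] - coords2[None, :, :]
    if unitcell is not None:
        cell = np.asarray(unitcell, dtype=float)  # shape (3,)
        # Minimum-image convention: shift each component by whole box lengths
        diff -= cell * np.round(diff / cell)
    return np.sqrt((diff ** 2).sum(axis=-1))

coords = np.array([[0.0, 0.0, 0.0], [9.0, 0.0, 0.0], [0.0, 3.0, 4.0]])
mat = dist_matrix_sketch(coords)                                # symmetric (3, 3)
wrapped = dist_matrix_sketch(coords, unitcell=[10.0, 10.0, 10.0])

# 'rcd'-style triple (assumed here to cover the upper triangle only)
rows, cols = np.triu_indices(len(coords), k=1)
dists = mat[rows, cols]
```

With a 10 x 10 x 10 cell, the first two points are closer through the boundary than directly, so the wrapped matrix reports a shorter distance for that pair.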
-
calcDistance
(atoms1, atoms2, unitcell=None)
Returns the Euclidean distance between atoms1 and atoms2. Arguments may be Atomic instances or NumPy arrays. The shape of NumPy arrays must be ([M,]N,3), where M is the number of coordinate sets and N is the number of atoms. If a unitcell array is provided, periodic boundary conditions will be taken into account.
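To make the ([M,]N,3) shape convention concrete, the following NumPy sketch computes per-atom distances between two coordinate sets and a single reference set. distance_sketch is a hypothetical stand-in that relies on NumPy broadcasting; it does not handle Atomic instances or unit cells.

```python
import numpy as np

def distance_sketch(a, b):
    """Euclidean distance along the last axis; inputs broadcast together."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return np.sqrt(((a - b) ** 2).sum(axis=-1))

# Two coordinate sets (M=2) of three atoms (N=3) against one reference set
coordsets = np.zeros((2, 3, 3))
coordsets[1] += 1.0                  # second set is shifted by (1, 1, 1)
reference = np.zeros((3, 3))

per_atom = distance_sketch(coordsets, reference)   # shape (2, 3)
```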
-
calcCenter
(atoms, weights=None)
Returns the geometric center of atoms. If weights is given, it must be a flat array with length equal to the number of atoms. The center of mass can be calculated by setting weights equal to atom masses, i.e. weights=atoms.getMasses().
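The difference between the geometric center and a weighted center can be sketched as follows. center_sketch is a hypothetical helper operating on a plain (N, 3) array, and the weights are made-up values standing in for masses.

```python
import numpy as np

def center_sketch(coords, weights=None):
    """Unweighted mean of coordinates, or weighted mean if weights are given."""
    coords = np.asarray(coords, dtype=float)
    if weights is None:
        return coords.mean(axis=0)
    weights = np.asarray(weights, dtype=float)   # flat array, one value per atom
    return (coords * weights[:, None]).sum(axis=0) / weights.sum()

coords = np.array([[0.0, 0.0, 0.0], [4.0, 0.0, 0.0]])
geometric = center_sketch(coords)                        # midpoint of the two atoms
mass_like = center_sketch(coords, weights=[1.0, 3.0])    # pulled toward the heavier atom
```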
-
calcAngle
(atoms1, atoms2, atoms3, radian=False)
Returns the angle between atoms in degrees, or in radians if radian is True.
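A minimal NumPy sketch of the angle calculation is given below. It assumes the vertex of the angle is the second point, which is the usual convention but is not stated in the text above; angle_sketch is a hypothetical name.

```python
import numpy as np

def angle_sketch(p1, p2, p3, radian=False):
    """Angle at p2 formed by the points p1-p2-p3 (vertex assumed to be p2)."""
    v1 = np.asarray(p1, dtype=float) - np.asarray(p2, dtype=float)
    v2 = np.asarray(p3, dtype=float) - np.asarray(p2, dtype=float)
    cosine = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    # Clip to guard against round-off pushing the cosine outside [-1, 1]
    angle = np.arccos(np.clip(cosine, -1.0, 1.0))
    return angle if radian else np.degrees(angle)

right_angle = angle_sketch([1, 0, 0], [0, 0, 0], [0, 1, 0])
right_angle_rad = angle_sketch([1, 0, 0], [0, 0, 0], [0, 1, 0], radian=True)
```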
-
calcDihedral
(atoms1, atoms2, atoms3, atoms4, radian=False)
Returns the dihedral angle between atoms in degrees, or in radians if radian is True.
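A dihedral angle can be computed from four points with the standard atan2 formulation, sketched here in NumPy. dihedral_sketch is a hypothetical helper and is not taken from ProDy; the sign convention shown is the one this particular formula produces.

```python
import numpy as np

def dihedral_sketch(p1, p2, p3, p4, radian=False):
    """Signed dihedral angle defined by four points."""
    p1, p2, p3, p4 = (np.asarray(p, dtype=float) for p in (p1, p2, p3, p4))
    b1 = p2 - p1
    b2 = p3 - p2
    b3 = p4 - p3
    n1 = np.cross(b1, b2)          # normal of the plane through p1, p2, p3
    n2 = np.cross(b2, b3)          # normal of the plane through p2, p3, p4
    # The atan2 form keeps the sign and is stable near 0 and 180 degrees
    y = np.dot(np.cross(n1, n2), b2 / np.linalg.norm(b2))
    x = np.dot(n1, n2)
    angle = np.arctan2(y, x)
    return angle if radian else np.degrees(angle)

cis = dihedral_sketch([1, 0, 0], [0, 0, 0], [0, 0, 1], [1, 0, 1])
quarter_turn = dihedral_sketch([1, 0, 0], [0, 0, 0], [0, 0, 1], [0, 1, 1])
```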
-
calcOmega
(residue, radian=False, dist=4.1)
Returns the ω (omega) angle of residue in degrees. This function checks the distance between Cα atoms of two residues and raises an exception if the residues are disconnected. Set dist to None to skip this check.
-
calcPhi
(residue, radian=False, dist=4.1)
Returns the φ (phi) angle of residue in degrees. This function checks the distance between Cα atoms of two residues and raises an exception if the residues are disconnected. Set dist to None to skip this check.
-
calcPsi
(residue, radian=False, dist=4.1)
Returns the ψ (psi) angle of residue in degrees. This function checks the distance between Cα atoms of two residues and raises an exception if the residues are disconnected. Set dist to None to skip this check.
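The connectivity check shared by the three backbone angle functions can be pictured as a simple distance test. The sketch below is an assumption about its general shape: check_connected is a hypothetical helper, the exception type is chosen for illustration, and whether the comparison is strict is not specified in the text above.

```python
import numpy as np

def check_connected(ca_a, ca_b, dist=4.1):
    """Raise ValueError if two C-alpha positions are farther apart than dist."""
    if dist is None:              # None disables the check entirely
        return
    separation = np.linalg.norm(
        np.asarray(ca_a, dtype=float) - np.asarray(ca_b, dtype=float))
    if separation > dist:
        raise ValueError(
            'residues appear disconnected: CA-CA distance is '
            '{0:.2f}'.format(separation))

# About 3.8 is a typical spacing for consecutive C-alpha atoms, so this passes
check_connected([0.0, 0.0, 0.0], [3.8, 0.0, 0.0])
# A larger gap is accepted only when the check is switched off
check_connected([0.0, 0.0, 0.0], [6.0, 0.0, 0.0], dist=None)
```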
-
calcMSF
(coordsets)
Calculates mean square fluctuation(s) (MSF). coordsets may be an instance of Ensemble, TrajBase, or Atomic. For trajectory objects, e.g. DCDFile, frames will be considered after they are superposed. For other ProDy objects, coordinate sets should be aligned prior to MSF calculation.
Note that using trajectory files that store 32-bit coordinates will result in lower precision in calculations. Over 10,000 frames this may result in up to 5% difference from the values calculated using 64-bit arrays. To ensure higher-precision calculations for DCDFile instances, you may use the astype argument, i.e. astype=float, to automatically recast coordinate data to double-precision (64-bit) floating-point format.
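For coordinate sets already held in memory and already aligned, MSF reduces to the mean squared deviation of each atom from its average position. The NumPy sketch below illustrates this with a hypothetical msf_sketch helper; it recasts 32-bit input to 64-bit before computing, mirroring the precision advice above.

```python
import numpy as np

def msf_sketch(coordsets):
    """Per-atom mean square fluctuation from an (M, N, 3) array of aligned frames."""
    coordsets = np.asarray(coordsets, dtype=np.float64)   # recast to 64-bit
    mean = coordsets.mean(axis=0)                         # average structure, (N, 3)
    # Squared displacement per atom per frame, then averaged over frames
    return ((coordsets - mean) ** 2).sum(axis=-1).mean(axis=0)

frames = np.array([[[0, 0, 0], [1, 0, 0]],
                   [[2, 0, 0], [1, 0, 0]]], dtype=np.float32)
msf = msf_sketch(frames)   # first atom moves by +/-1 around its mean, second is fixed
```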
-
calcRMSF
(coordsets)
Returns root mean square fluctuation(s) (RMSF). coordsets may be an instance of Ensemble, TrajBase, or Atomic. For trajectory objects, e.g. DCDFile, frames will be considered after they are superposed. For other ProDy objects, coordinate sets should be aligned prior to RMSF calculation.
Note that using trajectory files that store 32-bit coordinates will result in lower precision in calculations. Over 10,000 frames this may result in up to 5% difference from the values calculated using 64-bit arrays. To ensure higher-precision calculations for DCDFile instances, you may use the astype argument, i.e. astype=float, to automatically recast coordinate data to double-precision (64-bit) floating-point format.
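When frames are read one at a time, RMSF can be accumulated from running sums rather than a stored array, which is where accumulator precision starts to matter. The sketch below assumes this running-sum approach for illustration only; rmsf_streaming_sketch is a hypothetical helper, frames are assumed to be aligned already, and the accumulators are kept in 64-bit.

```python
import numpy as np

def rmsf_streaming_sketch(frames):
    """RMSF from an iterable of aligned (N, 3) frames, read one at a time."""
    total = None
    sqsum = None
    count = 0
    for frame in frames:
        frame = np.asarray(frame, dtype=np.float64)   # recast before accumulating
        if total is None:
            total = np.zeros_like(frame)
            sqsum = np.zeros_like(frame)
        total += frame
        sqsum += frame ** 2
        count += 1
    # <x^2> - <x>^2 per coordinate, summed over x, y, z for each atom
    msf = (sqsum / count - (total / count) ** 2).sum(axis=1)
    return np.sqrt(np.maximum(msf, 0.0))   # clip tiny negative round-off

frames = [np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]),
          np.array([[4.0, 0.0, 0.0], [1.0, 1.0, 1.0]])]
rmsf = rmsf_streaming_sketch(frames)
```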
-
calcDeformVector
(from_atoms, to_atoms, weights=None)
Returns the deformation from from_atoms to to_atoms as a Vector instance.
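Conceptually, a deformation vector is the per-atom displacement between two matched coordinate sets, flattened to length 3N. The NumPy sketch below shows only that core idea: deform_vector_sketch is a hypothetical helper, it returns a plain array rather than a Vector, and it leaves weights out because their handling is not described above.

```python
import numpy as np

def deform_vector_sketch(from_coords, to_coords):
    """Flattened displacement (length 3N) from one coordinate set to another."""
    from_coords = np.asarray(from_coords, dtype=float)
    to_coords = np.asarray(to_coords, dtype=float)
    if from_coords.shape != to_coords.shape:
        raise ValueError('coordinate sets must have the same shape')
    return (to_coords - from_coords).ravel()

start = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
end = [[1.0, 0.0, 0.0], [1.0, 3.0, 1.0]]
deformation = deform_vector_sketch(start, end)
```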
-
buildADPMatrix
(atoms)
Returns a 3Nx3N symmetric matrix containing anisotropic displacement parameters (ADPs) along the diagonal as 3x3 super elements.
In [1]: from prody import *
In [2]: protein = parsePDB('1ejg')
In [3]: calphas = protein.select('calpha')
In [4]: adp_matrix = buildADPMatrix(calphas)
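The layout of the 3Nx3N matrix can be sketched in NumPy by placing one symmetric 3x3 tensor per atom on the block diagonal. adp_matrix_sketch is a hypothetical helper; it assumes each atom contributes six values in the PDB ANISOU order (U11, U22, U33, U12, U13, U23), and the numbers below are made up.

```python
import numpy as np

def adp_matrix_sketch(anisous):
    """Place one symmetric 3x3 tensor per atom on the diagonal of a 3N x 3N matrix."""
    anisous = np.asarray(anisous, dtype=float)   # shape (N, 6)
    n_atoms = len(anisous)
    matrix = np.zeros((3 * n_atoms, 3 * n_atoms))
    for i, (u11, u22, u33, u12, u13, u23) in enumerate(anisous):
        block = np.array([[u11, u12, u13],
                          [u12, u22, u23],
                          [u13, u23, u33]])
        matrix[3 * i:3 * (i + 1), 3 * i:3 * (i + 1)] = block
    return matrix

# Two atoms with illustrative values only
anisous = [[0.10, 0.20, 0.30, 0.01, 0.02, 0.03],
           [0.40, 0.50, 0.60, 0.04, 0.05, 0.06]]
adp = adp_matrix_sketch(anisous)
```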
-
calcADPAxes
(atoms, **kwargs)
Returns a 3Nx3 array containing principal axes defining anisotropic displacement parameter (ADP, or anisotropic temperature factor) ellipsoids.
Parameters:
- atoms (Atomic) – a ProDy object for handling atomic data
- fract (float) – For an atom, if the fraction of anisotropic displacement explained by its largest axis/eigenvector is less than the given value, all axes for that atom will be set to zero. Values larger than 0.33 and smaller than 1.0 are accepted.
- ratio2 (float) – For an atom, if the ratio of the second-largest eigenvalue to the largest eigenvalue is less than or equal to the given value, all principal axes for that atom will be returned. Values less than 1 and greater than 0 are accepted.
- ratio3 (float) – For an atom, if the ratio of the smallest eigenvalue to the largest eigenvalue is less than or equal to the given value, all principal axes for that atom will be returned. Values less than 1 and greater than 0 are accepted.
- ratio (float) – Same as ratio3.
Keyword arguments fract, ratio2, or ratio3 can be used to set principal axes to 0 for atoms showing a relatively lower degree of anisotropy.
The 3Nx3 array contains N 3x3 matrices, one for each given atom. Columns of these 3x3 matrices are the principal axes, weighted by the square root of their eigenvalues. The first columns correspond to the largest principal axes.
The direction of the principal axes for an atom is determined based on the correlation of the axes vector with the principal axes vector of the previous atom.
In [1]: from prody import *
In [2]: protein = parsePDB('1ejg')
In [3]: calphas = protein.select('calpha')
In [4]: adp_axes = calcADPAxes(calphas)
In [5]: adp_axes.shape
These can be written in NMD format as follows:
In [6]: nma = NMA('ADPs')
In [7]: nma.setEigens(adp_axes)
In [8]: nma
In [9]: writeNMD('adp_axes.nmd', nma, calphas)
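For a single atom, the weighting and ordering of principal axes, together with the ratio3 filter, can be sketched with an eigendecomposition. principal_axes_sketch is a hypothetical helper working on one 3x3 tensor; it does not reproduce the direction matching against the previous atom, and clipping negative eigenvalues to zero is an assumption.

```python
import numpy as np

def principal_axes_sketch(tensor, ratio3=None):
    """Columns are principal axes scaled by sqrt(eigenvalue), largest first."""
    vals, vecs = np.linalg.eigh(np.asarray(tensor, dtype=float))  # ascending order
    vals = np.clip(vals, 0.0, None)           # guard against negative round-off
    vals, vecs = vals[::-1], vecs[:, ::-1]    # reorder so the largest comes first
    if ratio3 is not None and vals[2] / vals[0] > ratio3:
        return np.zeros((3, 3))               # too isotropic: axes set to zero
    return vecs * np.sqrt(vals)               # scale each column by sqrt(eigenvalue)

anisotropic = np.diag([4.0, 1.0, 0.25])
isotropic = np.eye(3)

axes = principal_axes_sketch(anisotropic, ratio3=0.5)   # kept: 0.25 / 4 <= 0.5
zeroed = principal_axes_sketch(isotropic, ratio3=0.5)   # dropped: 1 / 1 > 0.5
lengths = np.linalg.norm(axes, axis=0)                  # column lengths
```

The column lengths equal the square roots of the eigenvalues, so the first column is the longest axis of the ellipsoid.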
-
calcADPs
(atom)
Calculates anisotropic displacement parameters (ADPs) from anisotropic temperature factors (ATFs).
atom must have ATF values set for ADP calculation. ADPs are returned as a tuple, i.e. (eigenvalues, eigenvectors).
-
pickCentral
(obj, weights=None)
Returns the Atom or Conformation that is closest to the center of obj, which may be an Atomic or Ensemble instance. See also the pickCentralAtom() and pickCentralConf() functions.
-
pickCentralAtom
(atoms, weights=None)
Returns the Atom that is closest to the center, which is calculated using calcCenter().
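Picking the central atom amounts to computing a (possibly weighted) center and taking the atom with the smallest distance to it. The sketch below works on a plain coordinate array and returns an index rather than an Atom; pick_central_index is a hypothetical name.

```python
import numpy as np

def pick_central_index(coords, weights=None):
    """Index of the point closest to the (optionally weighted) center."""
    coords = np.asarray(coords, dtype=float)
    center = np.average(coords, axis=0, weights=weights)
    # Squared distances are enough for finding the minimum
    return int(np.argmin(((coords - center) ** 2).sum(axis=1)))

coords = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [5.0, 0.0, 0.0]]
unweighted = pick_central_index(coords)                        # center at x = 2
weighted = pick_central_index(coords, weights=[1.0, 1.0, 10.0])  # center at x = 4.25
```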
-
pickCentralConf
(ens, weights=None)
Returns the Conformation that is closest to the center of ens. In addition to Ensemble instances, Atomic instances are accepted as the ens argument. In this case a Selection with the central coordinate set as active will be returned.
-
calcDistanceMatrix
(coords, cutoff=None)
Calculates a matrix of distances between coordinates within cutoff. Other matrix entries are set to the maximum of the calculated distances.
Parameters:
- coords (ndarray, Atomic) – a coordinate set or an object with a getCoords() method.
- cutoff (None, float) – cutoff distance for searching the KDTree. Default (None) is to use the length of the longest coordinate axis.
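The effect of the cutoff on the resulting matrix can be sketched with a brute-force calculation. This swaps the KDTree search for a full pairwise computation, which is only practical for small inputs; cutoff_distance_matrix_sketch is a hypothetical helper and requires an explicit cutoff.

```python
import numpy as np

def cutoff_distance_matrix_sketch(coords, cutoff):
    """Distances within cutoff; all other entries take the largest kept distance."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    within = dist <= cutoff                     # the diagonal is always within
    return np.where(within, dist, dist[within].max())

coords = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [5.0, 0.0, 0.0]]
result = cutoff_distance_matrix_sketch(coords, cutoff=2.0)
```

Here only the first two points lie within the cutoff of each other, so every pair involving the third point is filled with the largest retained distance.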
-
assignBlocks
(atoms, res_per_block=None, secstr=False, **kwargs)
Assigns blocks to the protein from atoms using a block size of res_per_block, or secondary structure information if secstr is True.
Returns an array of block IDs and an AtomMap corresponding to protein atoms.
Parameters:
- atoms (Atomic) – atoms to be assigned blocks
- res_per_block (int) – number of residues per block. The last block may be smaller or larger than this. Default is None, allowing secstr to be used easily instead.
- secstr (bool) – use secondary structure information to assign blocks. Default is False, allowing res_per_block to be used easily instead. Any set of strings that can be retrieved by getSecstr() is acceptable, including from a PDB header, DSSP, or STRIDE.
- shortest_block (int) – smallest number of residues to be included in a block before merging with the previous block. Default is 4, as smaller numbers can cause problems for distance matrices.
- longest_block (int) – largest number of residues to be included in a block before splitting it in half. Default is the length of the protein, so it isn't triggered.
- min_dist_cutoff (Number) – minimum distance of a residue from others beyond which it is not included in the same block as them using findSubgroups(). Default is 20 Å, which was found to work well with res_per_block=10.
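The interaction between res_per_block and shortest_block can be illustrated with a simplified sketch that works on residue indices only. assign_blocks_sketch is a hypothetical helper: it ignores secondary structure, longest_block, and min_dist_cutoff, and its merge rule for a short final block is an assumption based on the parameter descriptions above.

```python
import numpy as np

def assign_blocks_sketch(n_residues, res_per_block, shortest_block=4):
    """Block ID per residue; a too-short final block joins the previous one."""
    blocks = np.arange(n_residues) // res_per_block
    last = blocks[-1]
    if last > 0 and np.count_nonzero(blocks == last) < shortest_block:
        blocks[blocks == last] = last - 1   # merge the short tail block
    return blocks

merged = assign_blocks_sketch(23, res_per_block=10)   # tail of 3 joins block 1
kept = assign_blocks_sketch(25, res_per_block=10)     # tail of 5 stays as block 2
merged_sizes = np.bincount(merged)
kept_sizes = np.bincount(kept)
```

This is why the last block may end up smaller or larger than res_per_block: a short remainder is either kept as its own block or absorbed into its neighbor.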