Signature Dynamics of Protein Families (SignDy)

This module defines functions for signature dynamics (SignDy), analyzing normal modes obtained for conformations in an ensemble.

class ModeEnsemble(title=None)[source]

A collection of ENMs calculated for conformations in an Ensemble or PDBEnsemble.

addModeSet(modeset, weights=None, label=None, matched=False, reweighted=False)[source]

Adds a modeset or modesets to the mode ensemble.

delModeSet(index)[source]

Removes a modeset or modesets from the mode ensemble.

getArray(mode_index=0)[source]

Returns a sdarray of row arrays.

getAtoms()[source]

Returns associated atoms of the mode ensemble.

getEigval(mode_index=0)[source]

Returns eigenvalue of a given mode index with respect to the reference.

getEigvals(mode_indices=None)[source]

Returns a sdarray of eigenvalues across modesets.

getEigvec(mode_index=0, sign_correction=True)[source]

Returns a sdarray of eigenvector across modesets.

getEigvecs(mode_indices=None, sign_correction=True)[source]

Returns a sdarray of eigenvectors across modesets.
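The sign_correction option matters because an eigenvector and its negative describe the same mode, so signs can differ arbitrarily between modesets. The following is a minimal sketch of one way to make signs consistent, assuming the correction flips every vector whose dot product with the reference (here the first row) is negative. The helper name align_signs is hypothetical, and this is not taken from the library's implementation.

```python
import numpy as np

def align_signs(vectors):
    """Flip rows so each has a non-negative dot product with row 0.

    `vectors` has shape (n_modesets, n_dof); row 0 plays the role of
    the reference modeset.
    """
    vectors = np.asarray(vectors, dtype=float)
    dots = vectors @ vectors[0]               # projection onto the reference
    signs = np.where(dots < 0, -1.0, 1.0)     # -1 where the direction is reversed
    return vectors * signs[:, None]

vecs = np.array([[1.0, 2.0, -1.0],
                 [-0.9, -2.1, 1.1],   # same mode, opposite sign
                 [1.1, 1.9, -0.8]])
aligned = align_signs(vecs)
```

After alignment, averaging the rows gives a meaningful mean shape instead of contributions that partially cancel.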

getIndex(mode_index=0)[source]

Returns indices of modes matched to the reference modeset.

getIndices(mode_indices=None)[source]

Returns indices of modes in the mode ensemble.

getLabels()[source]

Returns the labels of the mode ensemble.

getMatchingStatus()[source]

Returns the matching status of each mode ensemble.

getModeSets(index=None)[source]

Returns the modeset of the given index. If index is None then all modesets are returned.

getReweightingStatus()[source]

Returns the reweighting status of each mode ensemble.

getTitle()[source]

Returns the title of the mode ensemble.

getVariance(mode_index=0)[source]

Returns variances of a given mode index with respect to the reference.

getVariances(mode_indices=None)[source]

Returns a sdarray of variances across modesets.

getWeights()[source]

Returns a copy of weights.

is3d()[source]

Returns True if the model is 3-dimensional.

isMatched()[source]

Returns whether the modes are matched across ALL modesets in the mode ensemble.

isReweighted()[source]

Returns whether the modes are reweighted across ALL modesets in the mode ensemble.

match(turbo=False, method=None)[source]

Matches the modes across mode sets according to the mode overlaps.

Parameters:
  • turbo (bool, int) – if True, the computation is performed in parallel, with the number of threads equal to the number of CPUs. Passing an integer sets the number of threads explicitly. Default is False
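To illustrate what matching by overlap means, here is a rough sketch that pairs each reference mode with the mode it overlaps most strongly. It assumes a simple greedy assignment on the absolute dot products of unit-length mode vectors; the function name match_by_overlap is hypothetical, and the library's own matching routine may use a different assignment strategy.

```python
import numpy as np

def match_by_overlap(ref_modes, modes):
    """Return, for each reference mode, the index of the best unused mode.

    Both inputs have shape (n_dof, n_modes) with unit-length columns.
    """
    overlaps = np.abs(ref_modes.T @ modes)    # |cosine| between every pair
    order = [-1] * overlaps.shape[0]
    used = set()
    # Visit pairs from highest to lowest overlap (greedy assignment).
    for flat in np.argsort(overlaps, axis=None)[::-1]:
        i, j = np.unravel_index(flat, overlaps.shape)
        if order[i] == -1 and j not in used:
            order[i] = int(j)
            used.add(int(j))
    return order

ref = np.eye(3)
swapped = ref[:, [1, 0, 2]]   # same modes with the first two exchanged
order = match_by_overlap(ref, swapped)
```

Indexing the second modeset's columns with the returned order puts its modes in the same sequence as the reference.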
numAtoms()[source]

Returns number of atoms.

numModeSets()[source]

Returns number of modesets in the instance.

numModes()[source]

Returns number of modes in the instance (not necessarily maximum number of possible modes).

reorder()[source]

Reorders the modes across mode sets according to their collectivity

reweight()[source]

Reweights the modes based on matched orders.

setAtoms(atoms)[source]

Sets the atoms of the mode ensemble.

setLabels(labels)[source]

Sets the labels of the mode ensemble.

setMatchingStatus(status)[source]

Sets the matching status of the mode ensemble.

setReweightingStatus(status)[source]

Sets the reweighting status of the mode ensemble.

setWeights(weights)[source]

Set atomic weights.

undoMatching()[source]

Restores the original orders of modes

undoReweighting()[source]

Restores the original weighting of modes

class sdarray[source]

A class for representing a collection of arrays. It is derived from ndarray, and the first axis is reserved for indexing the collection.

sdarray functions exactly the same as ndarray, except that sdarray.mean(), sdarray.std(), sdarray.max(), and sdarray.min() are overridden. Average, standard deviation, minimum, maximum, etc. are weighted and calculated over the first axis by default. “sdarray” stands for “signature dynamics array”.
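The idea of weighted statistics over the first axis can be sketched with plain NumPy. The example below assumes one weight per entry and a population-style weighted variance; the numbers are made up for illustration, and the exact formula used by sdarray may differ.

```python
import numpy as np

# Three modesets (axis 0) with values for four atoms (axis 1).
values = np.array([[1.0, 2.0, 3.0, 4.0],
                   [3.0, 2.0, 1.0, 0.0],
                   [2.0, 2.0, 2.0, 2.0]])
# Hypothetical weights; a zero excludes that entry from the statistics.
weights = np.array([[1.0, 1.0, 1.0, 1.0],
                    [1.0, 1.0, 1.0, 0.0],
                    [2.0, 1.0, 1.0, 1.0]])

# Weighted mean over modesets, one value per atom.
mean = np.average(values, axis=0, weights=weights)
# Weighted variance about that mean, then its square root.
variance = np.average((values - mean) ** 2, axis=0, weights=weights)
std = np.sqrt(variance)
```

The zero weight in the last column shows why weighting is useful: an entry that is missing in one conformation can be left out of the average without dropping the whole column.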

Note for developers: please read the following article about subclassing ndarray before modifying this class:

https://docs.scipy.org/doc/numpy-1.14.0/user/basics.subclassing.html
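For orientation, the subclassing pattern described in that article looks roughly like the toy class below. WeightedArray is a hypothetical name used only for illustration; it is not the sdarray implementation, and it carries the weights along unchanged instead of slicing them with the data.

```python
import numpy as np

class WeightedArray(np.ndarray):
    """Toy ndarray subclass that carries a `weights` attribute."""

    def __new__(cls, data, weights=None):
        # View-cast the input so the result is an instance of this class.
        obj = np.asarray(data, dtype=float).view(cls)
        obj.weights = weights
        return obj

    def __array_finalize__(self, obj):
        # Called for explicit construction, view casting, and slicing;
        # copy the attribute from the parent array when there is one.
        if obj is None:
            return
        self.weights = getattr(obj, "weights", None)

arr = WeightedArray([[1.0, 2.0], [3.0, 4.0]], weights=[0.5, 0.5])
view = arr[:1]   # slicing goes through __array_finalize__
```

Without __array_finalize__, the slice would be a WeightedArray with no weights attribute at all, which is the main pitfall the article warns about.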

getArray()[source]

Returns the signature as a NumPy array.

getLabels()[source]

Returns the labels of the signature.

getTitle()[source]

Returns the title of the signature.

getWeights()[source]

Returns the weights of the signature.

is3d()[source]

Returns True if the model is 3-dimensional.

max(axis=0, **kwargs)[source]

Calculates the maximum values of the sdarray over modesets (axis=0).

mean(axis=0, **kwargs)[source]

Calculates the weighted average of the sdarray over modesets (axis=0).

min(axis=0, **kwargs)[source]

Calculates the minimum values of the sdarray over modesets (axis=0).

numAtoms()[source]

Returns the number of atoms assuming it is represented by the second axis.

numModeSets()[source]

Returns the number of modesets in the instance

setWeights(weights)[source]

Sets the weights of the signature.

std(axis=0, **kwargs)[source]

Calculates the weighted standard deviations of the sdarray over modesets (axis=0).

weights

Returns the weights of the signature.

calcEnsembleENMs(ensemble, model='gnm', trim='reduce', n_modes=20, **kwargs)[source]

Calculates normal modes for each member of ensemble.

Parameters:
  • ensemble (PDBEnsemble) – ensemble for whose members normal modes will be computed
  • model (str) – type of ENM to be calculated. It can be either ‘anm’ or ‘gnm’
  • trim (str) – method used to trim the model. It can be ‘trim’, ‘slice’, or ‘reduce’. If set to ‘trim’, the parts that are not in the selection are simply removed
  • n_modes – number of modes to be computed
  • turbo (bool) – if True, the computation is performed in parallel using Pool, with the number of threads equal to the number of CPUs. Assigning a number instead specifies the number of threads to use. Note that in a script, if __name__ == '__main__' is necessary to protect your code when multi-tasking. See https://docs.python.org/2/library/multiprocessing.html for details. Default is False
  • match (bool) – whether the modes should be matched using matchModes(). Default is True
  • method (function) – an alternative function used to match the modes. Default is None
Returns:

ModeEnsemble

showSignature1D(signature, linespec='-', **kwargs)[source]

Show signature using showAtomicLines().

Parameters:
  • signature (sdarray) – the signature dynamics to be plotted
  • linespec (str) – line specifications that will be passed to showAtomicLines()
  • atoms (Atomic) – an object with method getResnums() for use on the x-axis.
  • alpha (float) – the transparency of the band(s).
  • range (bool) – whether to show the minimum and maximum values. Default is True
psplot(signature, linespec='-', **kwargs)

Show signature using showAtomicLines().

Parameters:
  • signature (sdarray) – the signature dynamics to be plotted
  • linespec (str) – line specifications that will be passed to showAtomicLines()
  • atoms (Atomic) – an object with method getResnums() for use on the x-axis.
  • alpha (float) – the transparency of the band(s).
  • range (bool) – whether to show the minimum and maximum values. Default is True
showSignatureAtomicLines(y, std=None, min=None, max=None, atoms=None, **kwargs)[source]

Show the signature dynamics data using showAtomicLines().

Parameters:
  • y (ndarray) – the mean values of signature dynamics to be plotted
  • std (ndarray) – the standard deviations of signature dynamics to be plotted
  • min (ndarray) – the minimum values of signature dynamics to be plotted
  • max (ndarray) – the maximum values of signature dynamics to be plotted
  • linespec (str) – line specifications that will be passed to showAtomicLines()
  • atoms (Atomic) – an object with method getResnums() for use on the x-axis.
showSignatureMode(mode_ensemble, **kwargs)[source]

Show signature mode profile.

Parameters:
  • mode_ensemble (ModeEnsemble) – mode ensemble from which to extract an eigenvector. If this is not indexed already then index 0 is used by default
  • atoms (Atomic) – atoms for showing residues along the x-axis. Default option is to use mode_ensemble.getAtoms()
  • scale (float) – scaling factor. Default is 1.0
showSignatureDistribution(signature, **kwargs)[source]

Show the distribution of signature values using hist().

showSignatureCollectivity(mode_ensemble, **kwargs)[source]

Show the distribution of signature collectivities using showSignatureDistribution().

showSignatureSqFlucts(mode_ensemble, **kwargs)[source]

Show signature profile of square fluctuations.

Parameters:
  • mode_ensemble (ModeEnsemble) – mode ensemble from which to calculate square fluctuations
  • atoms (Atomic) – atoms for showing residues along the x-axis. Default option is to use mode_ensemble.getAtoms()
  • scale (float) – scaling factor. Default is 1.0
  • show_zero (bool) – whether to show a grey line at y=0. Default is False
calcEnsembleSpectralOverlaps(ensemble, distance=False, turbo=False, **kwargs)[source]

Calculate the spectral overlaps between each pair of conformations in the ensemble.

Parameters:
  • ensemble – an ensemble of structures or ENMs
  • distance (bool) – if set to True, spectral overlap will be converted to spectral distance via arccos.
  • turbo (bool) – if True, extra memory will be used to remember previous calculation results to accelerate the next calculation, so this option is particularly useful if spectral overlaps of the same ensemble are calculated repeatedly, e.g. using different numbers of modes. Note that for a single calculation, turbo will compromise the speed. Default is False
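The conversion controlled by the distance option can be sketched as follows. The helper name overlap_to_distance is hypothetical, and clamping the input is an assumption added here to guard against floating-point round-off; it is not confirmed to be part of the library's behavior.

```python
import math

def overlap_to_distance(overlap):
    """Convert a spectral overlap to an angle in radians via arccos."""
    # Clamp so round-off cannot push the value just outside [-1, 1].
    clamped = max(-1.0, min(1.0, overlap))
    return math.acos(clamped)

identical = overlap_to_distance(1.0)    # perfect overlap gives zero distance
orthogonal = overlap_to_distance(0.0)   # no overlap gives pi / 2
```

Because arccos is monotonically decreasing, a larger overlap always maps to a smaller distance, so rankings of similarity are preserved.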
calcSignatureSqFlucts(mode_ensemble, **kwargs)[source]

Get the signature square fluctuations of mode_ensemble.

Parameters:
  • mode_ensemble – an ensemble of ENMs
  • norm (bool) – whether to normalize the square fluctuations. Default is True
  • scale (bool) – whether to rescale the square fluctuations based on the reference. Default is False
calcSignatureCollectivity(mode_ensemble, masses=None)[source]

Calculate average collectivities for a ModeEnsemble.

calcSignatureFractVariance(mode_ensemble)[source]

Calculate signature fractional variance for a ModeEnsemble.

calcSignatureModes(mode_ensemble)[source]

Calculate mean eigenvalues and eigenvectors and return a new GNM or ANM object containing them.

calcSignatureCrossCorr(mode_ensemble, norm=True)[source]

Calculate the signature cross-correlations based on a ModeEnsemble instance.

Parameters:
  • mode_ensemble – an ensemble of ENMs
  • norm (bool) – whether to normalize the cross-correlations. Default is True
showSignatureCrossCorr(mode_ensemble, std=False, **kwargs)[source]

Show average cross-correlations using showAtomicMatrix(). By default, origin=lower and interpolation=bilinear keyword arguments are passed to this function, but user can overwrite these parameters. See also calcSignatureCrossCorr().

Parameters:
  • ensemble – an ensemble of structures or ENMs, or a signature profile
  • atoms (Atomic) – an object with method getResnums() for use on the x-axis.
showVarianceBar(mode_ensemble, highlights=None, **kwargs)[source]

Show the distribution of variances (cumulative if multiple modes) using histogram().

Parameters:
  • mode_ensemble (ModeEnsemble) – an ensemble of modes whose variances are displayed
  • highlights (list) – labels of conformations whose locations on the bar will be highlighted by arrows and texts
  • fraction (bool) – whether the variances should be weighted or not. Default is True
showSignatureVariances(mode_ensemble, **kwargs)[source]

Show the distribution of signature variances using showSignatureDistribution().

calcSignatureOverlaps(mode_ensemble, diag=True, collapse=False)[source]

Calculate average mode-mode overlaps for a ModeEnsemble.

If diag is True (default) then only diagonal values will be calculated. Otherwise, the whole overlap matrices will be calculated.

By default (collapse is False), the whole overlap matrices are returned as a 4-dimensional sdarray that is a matrix of overlap matrices.

If collapse is True then these will be collapsed together, giving a 2-dimensional array for full matrices. This operation is not defined for diagonal values.

showSignatureOverlaps(mode_ensemble, **kwargs)[source]

Show a curve of mode-mode overlaps against mode number with shades for standard deviation and range

Parameters:
  • diag (bool) – Whether to calculate the diagonal values only. Default is False and showMatrix() is used. If set to True, showSignatureAtomicLines() is used.
  • std (bool) – Whether to show the standard deviation matrix when diag is False (and whole matrix is shown). Default is False, meaning the mean matrix is shown.

saveModeEnsemble(mode_ensemble, filename=None, atoms=False, **kwargs)[source]

Save mode_ensemble as filename.modeens.npz. If filename is None, title of the mode_ensemble will be used as the filename, after " " (white spaces) in the title are replaced with "_" (underscores). Upon successful completion of saving, filename is returned. This function makes use of savez_compressed() function.
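The file-naming rule described above can be written out as a small helper. The function name mode_ensemble_filename is hypothetical, and the sketch assumes an explicit filename is used as given with the suffix appended; how the library treats a filename that already carries the suffix is not covered here.

```python
def mode_ensemble_filename(title, filename=None):
    """Build the output name from an explicit filename or from the title."""
    # Fall back to the title, with spaces replaced by underscores.
    stem = filename if filename is not None else title.replace(" ", "_")
    return stem + ".modeens.npz"

name = mode_ensemble_filename("kinase family ensemble")
```

With the title above, the derived name is kinase_family_ensemble.modeens.npz.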

loadModeEnsemble(filename, **kwargs)[source]

Returns ModeEnsemble instance after loading it from file (filename). This function makes use of numpy.load() function. See also saveModeEnsemble().

saveSignature(signature, filename=None, **kwargs)[source]

Save signature as filename.sdarray.npz. If filename is None, title of the signature will be used as the filename, after " " (white spaces) in the title are replaced with "_" (underscores). Upon successful completion of saving, filename is returned. This function makes use of savez_compressed() function.

loadSignature(filename, **kwargs)[source]

Returns sdarray instance after loading it from file (filename). This function makes use of numpy.load() function. See also saveSignature().

calcSubfamilySpectralOverlaps(mode_ens, subfamily_dict, **kwargs)[source]

Calculate average spectral overlaps (or distances) within and between subfamilies in a mode ensemble defined using a dictionary where each key is an ensemble member and the associated value is a subfamily name.

To use a range of modes, please index the mode ensemble e.g. mode_ens=mode_ensemble[:,3:20] to use modes 4 to 20 inclusive. Alternatively, there is the option to provide first and last keyword arguments, which would be used as the 3 and 20 above.
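The relationship between the zero-based slice and the one-based mode numbers can be checked with an ordinary Python list. The total of 30 modes is an arbitrary choice for illustration.

```python
n_modes = 30
mode_numbers = list(range(1, n_modes + 1))   # 1-based mode numbers

selected = mode_numbers[3:20]                # same slice as mode_ensemble[:, 3:20]
first_mode, last_mode = selected[0], selected[-1]
n_selected = len(selected)
```

The slice starts at index 3 (mode 4) and stops before index 20, so the last mode included is mode 20, for 17 modes in total.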

Parameters:
  • mode_ens (ModeEnsemble) – an ensemble of modes corresponding to a set of modes for each family member
  • subfamily_dict (dict) – a dictionary providing a subfamily label for each family member
  • first (int) – the first index for a range of modes
  • last (int) – the last index for a range of modes
  • remove_small (bool) – whether to remove small subfamilies with fewer than 4 members. Default is True
  • return_reordered_subfamilies (bool) – whether to return the reordered subfamilies in addition to the matrix. Default is False
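The averaging within and between subfamilies can be pictured as taking block means of a pairwise matrix. The sketch below works on a made-up matrix and uses a hypothetical helper, subfamily_averages. It assumes subfamilies with fewer than four members are dropped and that self-pairs on the diagonal are included in the within-family mean, which may not match the library's exact treatment.

```python
import numpy as np
from collections import Counter

def subfamily_averages(matrix, labels, subfamily_dict, min_size=4):
    """Average a pairwise matrix over blocks defined by subfamily membership."""
    names = [subfamily_dict[label] for label in labels]
    counts = Counter(names)
    # Keep only subfamilies with at least `min_size` members.
    kept = sorted(name for name, n in counts.items() if n >= min_size)
    index = {name: [i for i, x in enumerate(names) if x == name] for name in kept}
    result = np.zeros((len(kept), len(kept)))
    for a, fam_a in enumerate(kept):
        for b, fam_b in enumerate(kept):
            block = matrix[np.ix_(index[fam_a], index[fam_b])]
            result[a, b] = block.mean()
    return result, kept

# Nine members: four in family A, four in B, and one in C (too small to keep).
labels = ["m%d" % i for i in range(9)]
families = dict(zip(labels, "AAAABBBBC"))
same = np.array([[families[p] == families[q] for q in labels] for p in labels])
matrix = np.where(same, 1.0, 0.2)   # toy overlaps: high within, low between
averages, kept = subfamily_averages(matrix, labels, families)
```

The result is a small matrix with one row and column per retained subfamily, which is the kind of summary the plotting counterpart displays.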

showSubfamilySpectralOverlaps(mode_ens, subfamily_dict, **kwargs)[source]

Calculate and show the matrix of spectral overlaps or distances averaged over subfamilies. Inputs are the same as calcSubfamilySpectralOverlaps plus the following and those of showDomainBar if you wish.

Parameters:
  • show_subfamily_bar (bool) – whether to show the subfamilies as colored bars using showDomainBar. Default is False
calcSignaturePerturbResponse(mode_ensemble, **kwargs)[source]

Calculate the signature perturbation response scanning based on a ModeEnsemble instance.

Parameters:
  • mode_ensemble – an ensemble of ENMs