Chromatin clustering

calcGNMDomains(modes, method=Discretize, **kwargs)

Uses spectral clustering to separate structural domains in chromosomes and proteins.

Parameters:
  • modes (ModeSet) – GNM modes used for segmentation
  • method (func) – Label assignment algorithm used after Laplacian embedding of loci.
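The two stages named above (Laplacian embedding, then label assignment) can be illustrated with a minimal NumPy sketch. It is not the library's implementation: the helper laplacian_embedding and the toy graph are hypothetical, and skipping the zero mode is an assumption. With a single mode, thresholding the sign of the embedding serves as a trivial label assignment.

```python
import numpy as np

def laplacian_embedding(kirchhoff, n_modes):
    """Return row-normalized low-frequency eigenvectors of a graph Laplacian.

    Hypothetical helper for illustration only; not part of the documented API.
    """
    # eigh returns eigenvalues in ascending order for symmetric matrices
    eigvals, eigvecs = np.linalg.eigh(kirchhoff)
    # Assumption: drop the trivial zero mode (index 0), keep the next n_modes
    V = eigvecs[:, 1:n_modes + 1]
    # Scale each row (one row per node or locus) to unit length
    norms = np.linalg.norm(V, axis=1, keepdims=True)
    norms[norms == 0] = 1.0
    return V / norms

# Toy contact graph: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3
adjacency = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    adjacency[i, j] = adjacency[j, i] = 1.0
kirchhoff = np.diag(adjacency.sum(axis=1)) - adjacency

V = laplacian_embedding(kirchhoff, n_modes=1)
# Simplest possible label assignment: the sign of the single embedded coordinate
labels = (V[:, 0] > 0).astype(int)
```

With more than one mode, the sign test no longer suffices, which is where a label assignment function such as the ones documented here comes in.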

KMeans(V, **kwargs)

Performs k-means clustering on V using sklearn.cluster.KMeans(). See the scikit-learn documentation for details.

Parameters:
  • V (ndarray) – row-normalized eigenvectors for the purpose of clustering.
  • n_clusters (int) – specifies the number of clusters.
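As a rough illustration of the underlying call, the sketch below runs sklearn.cluster.KMeans directly on a synthetic, row-normalized matrix. The toy data, n_init, and random_state are assumptions made for reproducibility; how the wrapper forwards its keyword arguments is not shown here.

```python
import numpy as np
from sklearn import cluster

rng = np.random.default_rng(0)
# Toy stand-in for V: 20 rows near (1, 0) and 20 rows near (0, 1)
V = np.vstack([
    rng.normal([1.0, 0.0], 0.05, size=(20, 2)),
    rng.normal([0.0, 1.0], 0.05, size=(20, 2)),
])
# Row-normalize, as the V parameter expects
V /= np.linalg.norm(V, axis=1, keepdims=True)

# One integer label per row of V
model = cluster.KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(V)
```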

Hierarchy(V, **kwargs)

Performs hierarchical clustering on V. The function is essentially a wrapper around two scipy functions, linkage and fcluster. See scipy.cluster.hierarchy.linkage() and scipy.cluster.hierarchy.fcluster() for an explanation of the arguments. Only the arguments that differ from those of scipy are listed here.

Parameters:
  • V (ndarray) – row-normalized eigenvectors for the purpose of clustering.
  • inconsistent_percentile (double) – if the clustering criterion for scipy.cluster.hierarchy.fcluster() is inconsistent and the threshold t is not given (the default), the function uses the percentile specified by this argument as the threshold.
  • n_clusters (int) – specifies the maximal number of clusters. If this argument is given, the function automatically sets criterion to maxclust and t to n_clusters.
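The two behaviours described above can be sketched with scipy alone. This is an approximation under stated assumptions, not the function's actual code: the ward linkage method, the toy data, and the 99th percentile are arbitrary choices, and taking the percentile over the inconsistency coefficients returned by scipy.cluster.hierarchy.inconsistent() is a guess at what "percentile" refers to.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster, inconsistent

rng = np.random.default_rng(0)
# Toy stand-in for V: two well-separated groups of 20 rows each
V = np.vstack([
    rng.normal([1.0, 0.0], 0.05, size=(20, 2)),
    rng.normal([0.0, 1.0], 0.05, size=(20, 2)),
])
V /= np.linalg.norm(V, axis=1, keepdims=True)

# Build the linkage matrix once; 'ward' is an arbitrary choice for this sketch
Z = linkage(V, method='ward')

# Case 1: n_clusters given -> criterion 'maxclust' with t equal to n_clusters
labels_max = fcluster(Z, t=2, criterion='maxclust')

# Case 2 (assumption): no t given -> derive t from a percentile of the
# inconsistency coefficients (column 3 of the matrix from inconsistent())
R = inconsistent(Z)
t = np.percentile(R[:, 3], 99)
labels_inc = fcluster(Z, t=t, criterion='inconsistent')
```

Note that fcluster returns labels starting at 1, not 0.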

Discretize(V, **kwargs)

Adapted from discretize(). For copyright information, see LICENSE.rst.

showLinkage(V, **kwargs)

Shows the dendrogram of hierarchical clustering on V. See scipy.cluster.hierarchy.dendrogram() for details.

Parameters:
  • V (ndarray) – row-normalized eigenvectors for the purpose of clustering.
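For orientation, the sketch below computes a dendrogram with scipy directly and inspects its leaf order without drawing anything. The ward linkage method and the toy matrix are assumptions; passing no_plot=True merely avoids the need for a plotting backend in this example.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram

# Toy stand-in for V: six rows forming two tight groups
V = np.array([
    [1.00, 0.00], [0.99, 0.10], [0.98, -0.10],
    [0.00, 1.00], [0.10, 0.99], [-0.10, 0.98],
])
V /= np.linalg.norm(V, axis=1, keepdims=True)

Z = linkage(V, method='ward')
# no_plot=True returns the tree layout as a dict instead of drawing it
tree = dendrogram(Z, no_plot=True)
leaf_order = tree['leaves']  # row indices of V in left-to-right leaf order
```

Dropping no_plot=True draws the tree on the current matplotlib axes.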

GaussianMixture(V, **kwargs)

Performs clustering on V using Gaussian mixture models. The function uses sklearn.mixture.GaussianMixture(). See the scikit-learn documentation for details.

Parameters:
  • V (ndarray) – row-normalized eigenvectors for the purpose of clustering.
  • n_clusters (int) – specifies the number of clusters.
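A minimal sketch of the underlying scikit-learn call follows. The scikit-learn estimator names its size parameter n_components; that n_clusters maps onto it is an assumption, as are the toy data and random_state.

```python
import numpy as np
from sklearn import mixture

rng = np.random.default_rng(0)
# Toy stand-in for V: two groups of 20 rows in three dimensions
V = np.vstack([
    rng.normal([1.0, 0.0, 0.0], 0.1, size=(20, 3)),
    rng.normal([0.0, 1.0, 0.0], 0.1, size=(20, 3)),
])
V /= np.linalg.norm(V, axis=1, keepdims=True)

# Fit a two-component mixture and assign each row to its most likely component
gmm = mixture.GaussianMixture(n_components=2, random_state=0)
labels = gmm.fit_predict(V)
# Soft assignments: one probability per row and component
probabilities = gmm.predict_proba(V)
```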

BayesianGaussianMixture(V, **kwargs)

Performs clustering on V using Gaussian mixture models with variational inference. The function uses sklearn.mixture.BayesianGaussianMixture(). See the scikit-learn documentation for details.

Parameters:
  • V (ndarray) – row-normalized eigenvectors for the purpose of clustering.
  • n_clusters (int) – specifies the number of clusters.
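The variational variant can be sketched the same way with sklearn.mixture.BayesianGaussianMixture. In scikit-learn, n_components acts as an upper bound and unneeded components can receive negligible weight; whether n_clusters is forwarded as n_components is an assumption, and the toy data, max_iter, and random_state are illustrative choices.

```python
import numpy as np
from sklearn import mixture

rng = np.random.default_rng(0)
# Toy stand-in for V: two groups of 20 rows in three dimensions
V = np.vstack([
    rng.normal([1.0, 0.0, 0.0], 0.1, size=(20, 3)),
    rng.normal([0.0, 1.0, 0.0], 0.1, size=(20, 3)),
])
V /= np.linalg.norm(V, axis=1, keepdims=True)

# n_components is an upper bound here; the fit may leave some components unused
bgmm = mixture.BayesianGaussianMixture(
    n_components=4, max_iter=500, random_state=0
)
labels = bgmm.fit_predict(V)
# Mixture weights show how much each component is actually used
weights = bgmm.weights_
```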