aspire.classification package

Submodules

aspire.classification.averager2d module

class aspire.classification.averager2d.AligningAverager2D(composite_basis, src, alignment_basis=None, image_stacker=None, dtype=None)

Bases: Averager2D

Subclass supporting averagers which perform an aligning stage.

Parameters:
  • composite_basis – Basis to be used during class average composition (eg hi res Cartesian/FFB2D).

  • src – Source of original images.

  • alignment_basis – Optional, basis to be used only during alignment (eg FSPCA).

  • image_stacker – Optional, provide a user defined ImageStacker instance, used during image stacking (averaging). Defaults to MeanImageStacker.

  • dtype – Numpy dtype to be used during alignment.

abstract align(classes, reflections, basis_coefficients)

During this process rotations, reflections, shifts and correlations properties will be computed for aligners.

rotations is an (src.n, n_nbor) array of angles, which should represent the rotations needed to align images within that class. rotations is measured in CCW radians.

shifts is None or an (src.n, n_nbor) array of 2D shifts which should represent the translation needed to best align the images within that class.

correlations is an (src.n, n_nbor) array representing a correlation like measure between classified images and their base image (image index 0).

Subclasses should implement and extend this method.

Parameters:
  • classes – (src.n, n_nbor) integer array of img indices.

  • reflections – (src.n, n_nbor) bool array of corresponding reflections.

  • basis_coefficients – (n_img, self.alignment_basis.count) basis coefficients.

Returns:

(rotations, shifts, correlations)

average(classes, reflections, coefs=None)

This subclass assumes we get alignment details from the align method. Otherwise, see Averager2D.average.

class aspire.classification.averager2d.Averager2D(composite_basis, src, dtype=None)

Bases: ABC

Base class for 2D Image Averaging methods.

Parameters:
  • composite_basis – Basis to be used during class average composition (eg FFB2D)

  • src – Source of original images.

  • dtype – Numpy dtype to be used during alignment.

abstract average(classes, reflections, coefs=None)

Combines images using stacking in self.composite_basis.

Subclasses should implement this (for example, EM algorithms use radically different averaging).

Should return an Image source of synthetic class averages.

Parameters:
  • classes – class indices, referring to src. (src.n, n_nbor).

  • reflections – Bool representing whether to reflect image in classes. (n_classes, n_nbor)

  • coefs – Optional basis coefs (could avoid recomputing). (src.n, coef_count)

Returns:

Stack of synthetic class average images as Image instance.
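
The stacking step described above can be illustrated with a minimal sketch. It uses plain Python lists rather than ASPIRE objects, assumes the images are already aligned, and ignores reflections; the helper names mean_stack and average_classes are hypothetical and are not part of ASPIRE.

```python
def mean_stack(images):
    """Pixel-wise mean of a list of equally sized 2D images (lists of rows)."""
    if not images:
        raise ValueError("need at least one image")
    n = len(images)
    rows, cols = len(images[0]), len(images[0][0])
    return [
        [sum(img[r][c] for img in images) / n for c in range(cols)]
        for r in range(rows)
    ]


def average_classes(stack, classes):
    """For each row of `classes`, average the images it indexes in `stack`."""
    return [mean_stack([stack[i] for i in members]) for members in classes]


# Three tiny 2x2 "images" and two classes of two neighbors each.
stack = [
    [[0.0, 2.0], [4.0, 6.0]],
    [[2.0, 2.0], [4.0, 2.0]],
    [[4.0, 0.0], [0.0, 0.0]],
]
classes = [[0, 1], [1, 2]]
averages = average_classes(stack, classes)
```

Each row of classes produces one synthetic average, so the output has one image per class.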

class aspire.classification.averager2d.BFRAverager2D(composite_basis, src, alignment_basis=None, n_angles=360, dtype=None)

Bases: AligningAverager2D

This performs a Brute Force Rotational alignment.

For each class, constructs n_angles rotations of all class members, and then identifies the angle yielding the largest correlation (dot).

See AligningAverager2D, adds:

Parameters:

n_angles – Number of brute force rotations to attempt, defaults to 360.
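
As a rough illustration of the brute force search, the sketch below works on a 1D angular profile, where rotating an image corresponds to a circular shift. It is a simplified stand-in, not the ASPIRE implementation; rotate_profile and brute_force_rotation are hypothetical names.

```python
import math


def rotate_profile(profile, k):
    """Circularly shift an angular profile by k samples (k * 2*pi / n radians)."""
    n = len(profile)
    return [profile[(i - k) % n] for i in range(n)]


def brute_force_rotation(base, other):
    """Try every circular shift of `other`; return (angle in radians, best dot)."""
    n = len(base)
    best_k, best_dot = 0, -math.inf
    for k in range(n):
        candidate = rotate_profile(other, k)
        dot = sum(b * c for b, c in zip(base, candidate))
        if dot > best_dot:
            best_k, best_dot = k, dot
    return 2 * math.pi * best_k / n, best_dot


n_angles = 360
# A single smooth bump centered at sample 90.
base = [math.exp(-((i - 90) / 10.0) ** 2) for i in range(n_angles)]
# The same profile rotated by -25 samples; the search should undo this.
other = rotate_profile(base, -25)
angle, score = brute_force_rotation(base, other)
```

The cost grows linearly with n_angles for each pair of images, which is why the exhaustive search is described as brute force.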

align(classes, reflections, basis_coefficients)

Performs the actual rotational alignment estimation, returning parameters needed for averaging.

class aspire.classification.averager2d.BFSRAverager2D(composite_basis, src, alignment_basis=None, n_angles=360, radius=None, dtype=None)

Bases: BFRAverager2D

This performs a Brute Force Shift and Rotational alignment. It is potentially expensive to brute force this search space.

For each pair of x_shifts and y_shifts, performs BFR, then returns the rotation and shift yielding the best results.

See AligningAverager2D and BFRAverager2D, adds: radius

Parameters:
  • n_angles – Number of brute force rotations to attempt, defaults to 360.

  • radius – Brute force translation search radius. Defaults to src.L//16.
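
The outer shift loop can be sketched as follows. This is an illustration only: it assumes the search region is a disc of integer shifts, represents images as sparse dictionaries, and replaces the inner rotational search with a plain dot product. The names shift_grid and best_shift are hypothetical.

```python
def shift_grid(radius):
    """All integer (dx, dy) shifts inside a disc of the given radius."""
    return [
        (dx, dy)
        for dx in range(-radius, radius + 1)
        for dy in range(-radius, radius + 1)
        if dx * dx + dy * dy <= radius * radius
    ]


def best_shift(base, other, radius):
    """Return ((dx, dy), score) maximizing the dot product of base and shifted other."""

    def score(dx, dy):
        # Images are sparse dicts mapping (x, y) to a pixel value.
        return sum(
            value * base.get((x + dx, y + dy), 0.0) for (x, y), value in other.items()
        )

    return max(((s, score(*s)) for s in shift_grid(radius)), key=lambda t: t[1])


L = 64
radius = L // 16  # the documented default for this class
base = {(5, 5): 1.0, (5, 6): 0.5, (6, 5): 0.25}
other = {(3, 6): 1.0, (3, 7): 0.5, (4, 6): 0.25}  # base translated by (-2, +1)
shift, score = best_shift(base, other, radius)
```

The number of candidate shifts grows roughly with the square of radius, and each candidate multiplies the cost of the rotational search.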

align(classes, reflections, basis_coefficients)

See AligningAverager2D.align

class aspire.classification.averager2d.BFSReddyChatterjiAverager2D(composite_basis, src, alignment_src=None, radius=None, dtype=None)

Bases: ReddyChatterjiAverager2D

Brute Force Shifts (Translations) - ReddyChatterji (Log-Polar) Rotations

For each shift within radius, attempts rotational match using ReddyChatterji. When averaging, performs shift before rotations.

Adapted from Reddy and Chatterji (1996), An FFT-Based Technique for Translation, Rotation, and Scale-Invariant Image Registration, IEEE Transactions on Image Processing, vol. 5, no. 8, August 1996.

This method intentionally does not use any of ASPIRE’s basis so that it may be used as a reference for more ASPIRE approaches.

Parameters:
  • alignment_basis – Basis to be used during alignment. For current implementation of ReddyChatterjiAverager2D this should be None. Instead see alignment_src.

  • src – Source of original images.

  • composite_basis – Basis to be used during class average composition.

  • alignment_src – Optional, source to be used during class average alignment. Must be the same resolution as src.

  • radius – Brute force translation search radius. Defaults to src.L//8.

  • dtype – Numpy dtype to be used during alignment.

align(classes, reflections, basis_coefficients)

Performs the actual rotational alignment estimation, returning parameters needed for averaging.

average(classes, reflections, coefs=None)

See Averager2D.average.

class aspire.classification.averager2d.EMAverager2D(composite_basis, src, dtype=None)

Bases: Averager2D

Citation needed.

Parameters:
  • composite_basis – Basis to be used during class average composition (eg FFB2D)

  • src – Source of original images.

  • dtype – Numpy dtype to be used during alignment.

class aspire.classification.averager2d.FTKAverager2D(composite_basis, src, dtype=None)

Bases: Averager2D

Factorization of the translation kernel for fast rigid image alignment. Rangan, A.V., Spivak, M., Anden, J., & Barnett, A.H. (2019).

Parameters:
  • composite_basis – Basis to be used during class average composition (eg FFB2D)

  • src – Source of original images.

  • dtype – Numpy dtype to be used during alignment.

class aspire.classification.averager2d.ReddyChatterjiAverager2D(composite_basis, src, alignment_src=None, dtype=None)

Bases: AligningAverager2D

Attempts rotational estimation using Reddy Chatterji log polar Fourier cross correlation. Then attempts shift (translational) estimation using cross correlation.

When averaging, performs rotations then shifts.
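
The order of operations matters because rotations and shifts do not commute. The short sketch below applies both orders to a single point using a counterclockwise rotation; it is a generic geometric illustration, not ASPIRE code.

```python
import math


def rotate(point, theta):
    """Rotate a 2D point counterclockwise about the origin by theta radians."""
    x, y = point
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y)


def shift(point, delta):
    """Translate a 2D point by delta."""
    return (point[0] + delta[0], point[1] + delta[1])


theta = math.pi / 2
delta = (1.0, 0.0)
p = (1.0, 0.0)

rotate_then_shift = shift(rotate(p, theta), delta)  # approximately (1, 1)
shift_then_rotate = rotate(shift(p, delta), theta)  # approximately (0, 2)
```

Alignment parameters estimated under one convention therefore need to be applied in the same order when averaging.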

Note, it may be possible to iterate this algorithm…

Adapted from Reddy and Chatterji (1996), An FFT-Based Technique for Translation, Rotation, and Scale-Invariant Image Registration, IEEE Transactions on Image Processing, vol. 5, no. 8, August 1996.

This method intentionally does not use any of ASPIRE’s basis so that it may be used as a reference for more ASPIRE approaches.

Parameters:
  • composite_basis – Basis to be used during class average composition.

  • src – Source of original images.

  • alignment_src – Optional, source to be used during class average alignment. Must be the same resolution as src.

  • dtype – Numpy dtype to be used during alignment.

align(classes, reflections, basis_coefficients)

Performs the actual rotational alignment estimation, returning parameters needed for averaging.

average(classes, reflections, coefs=None)

This averages classes performing rotations then shifts. Otherwise is similar to AligningAverager2D.average.

aspire.classification.class2d module

class aspire.classification.class2d.Class2D(src, n_nbor=100, seed=None, dtype=None)

Bases: ABC

Base class for 2D Image Classification methods.

Base constructor of an object for classifying 2D images.

Parameters:
  • src – ImageSource or subclass, provides images.

  • n_nbor – Number of nearest neighbors to compute.

  • seed – Optional RNG seed to be passed to random methods, (example Random NN).

  • dtype – Numpy dtype, defaults to src.dtype.

abstract classify()

Classify the images from Source into classes with similar viewing angles.

Returns classes and associated metadata (classes, reflections, distances)

aspire.classification.class_selection module

Selecting the “best” classes is an area of active research.

Here we provide an abstract base class with two naive approaches as concrete implementations.

RandomClassSelector will select random indices from across the entire dataset, with RNG controlled by seed.

TopClassSelector will select the first n_classes in order. This may be useful for debugging and development.

Additionally we provide a few methods that have been used historically, along with a few classes which should aid in constructing new methods.

class aspire.classification.class_selection.BandedSNRImageQualityFunction

Bases: ImageQualityFunction

Computes the ratio of variance of central pixels of image to pixels in a configurable outer band.
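
A minimal sketch of this kind of score is shown below, using plain Python lists. The radius arguments (center_radius, band_inner, band_outer) are hypothetical names chosen for the illustration and do not reflect the actual configuration options of the class.

```python
import math
from statistics import pvariance


def banded_snr(image, center_radius, band_inner, band_outer):
    """Variance of central pixels divided by variance of an outer band.

    The three radius arguments are hypothetical names for this sketch.
    """
    L = len(image)
    mid = (L - 1) / 2.0  # geometric center of an L x L grid
    center, band = [], []
    for row in range(L):
        for col in range(L):
            r = math.hypot(row - mid, col - mid)
            if r < center_radius:
                center.append(image[row][col])
            elif band_inner <= r < band_outer:
                band.append(image[row][col])
    return pvariance(center) / pvariance(band)


def toy_pixel(row, col, L=8):
    """Weak deterministic background everywhere, strong checker signal near the center."""
    mid = (L - 1) / 2.0
    background = 0.1 * ((3 * row + 5 * col) % 4)
    signal = 2.0 * ((row + col) % 2) if math.hypot(row - mid, col - mid) < 2 else 0.0
    return background + signal


image = [[toy_pixel(r, c) for c in range(8)] for r in range(8)]
score = banded_snr(image, center_radius=2, band_inner=3, band_outer=4.5)
```

An image whose center carries more contrast than its outer band scores well above 1, while a featureless image scores near 1.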

class aspire.classification.class_selection.BandpassImageQualityFunction

Bases: ImageQualityFunction

Replicate behavior of MATLAB cryo_sort_stack_bandpass method.

class aspire.classification.class_selection.BumpWeightedImageQualityMixin

Bases: WeightedImageQualityMixin

ImageQualityMixin to apply a [0,1] bump function.

class aspire.classification.class_selection.BumpWeightedVarianceImageQualityFunction

Bases: BumpWeightedImageQualityMixin, VarianceImageQualityFunction

Computes the variance of pixels after weighting with Bump function.

class aspire.classification.class_selection.ClassSelector

Bases: ABC

Abstract interface for class selection.

property quality_scores

All ClassSelector subclasses should assign a quality score array the same length as the selection output.

Function range is currently not limited, but [0,1] is favorable. Currently there is not an expectation that one quality scoring system relates to another, or that the score is a proper metric. Quality scores are only required to be a self consistent ordering.

For subclasses like TopClassSelector and RandomClassSelector where no quality information is derived, the associated _quality_scores should be set to zeros by _select.

select(classes, reflections, distances)

Using the provided arguments, calls internal _select method, checks selection is sane, and returns an array representing an ordered index into classes.

Parameters:
  • classes – (n_img, n_nbor) array of image indices

  • reflections – (n_img, n_nbor) boolean array of reflections between classes[i][0] and classes[i][j]

  • distances – (n_img, n_nbor) array of distances between classes[i][0] and classes[i][j]

Returns:

array of indices into classes

class aspire.classification.class_selection.DistanceClassSelector

Bases: ClassSelector

Selects top classes based on lowest mean distance as estimated by distances.

Note that distances is the Nearest Neighbors distances, and in the case of RIR this is a small rotationally invariant feature vector. For methods based on class average images, see subclasses of GlobalClassSelector.
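
Ordering by lowest mean distance can be sketched in a few lines. The function name select_by_mean_distance is hypothetical, and the sketch uses nested lists in place of an (n_img, n_nbor) array.

```python
def select_by_mean_distance(distances):
    """Return class indices ordered by ascending mean neighbor distance."""
    means = [sum(row) / len(row) for row in distances]
    return sorted(range(len(distances)), key=lambda i: means[i])


distances = [
    [0.0, 0.9, 1.1],  # class 0: mean about 0.67
    [0.0, 0.2, 0.4],  # class 1: mean 0.2
    [0.0, 0.5, 0.7],  # class 2: mean 0.4
]
selection = select_by_mean_distance(distances)
```

The returned list is an ordered index into classes, matching the shape of output that select describes.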

class aspire.classification.class_selection.GlobalClassSelector(averager, quality_function, heap_size_limit_bytes=2000000000.0)

Bases: ClassSelector

Extends ClassSelector for methods that require passing over all class average images.

Initializes a GlobalClassSelector.

Because GlobalClassSelectors must compute all class averages, a heap cache maintains the top class averages as scored by quality_function. If you have the memory, we recommend setting the cache to be > n_classes*img_size*img_size*img.dtype.

Parameters:
  • averager – An Averager2D subclass.

  • quality_function – Function that takes an image and returns numeric quality score. This score will be used to sort the classes. Users may provide a callable function, but extending ImageQualityFunction is recommended. For example, this module provides methods for variance and SNR based quality.

  • heap_size_limit_bytes – Max heap size in Bytes. Defaults to 2GB, 0 will disable.
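
To get a feel for the heap limit, the sketch below estimates how many images fit in a given byte budget and keeps only the best scored entries with a bounded min-heap. It assumes the img.dtype term above means bytes per pixel (itemsize); heap_capacity and top_scored are hypothetical helpers, not ASPIRE functions.

```python
import heapq


def heap_capacity(heap_size_limit_bytes, img_size, itemsize):
    """How many img_size x img_size images fit within the byte limit."""
    return int(heap_size_limit_bytes // (img_size * img_size * itemsize))


def top_scored(scored_ids, capacity):
    """Keep the `capacity` highest (score, image_id) pairs using a min-heap."""
    if capacity <= 0:
        return []  # in this sketch, a zero-sized heap keeps nothing
    heap = []
    for score, image_id in scored_ids:
        if len(heap) < capacity:
            heapq.heappush(heap, (score, image_id))
        elif score > heap[0][0]:
            # Evict the current worst entry in favor of the better one.
            heapq.heapreplace(heap, (score, image_id))
    return sorted(heap, reverse=True)


# 2 GB budget, 256 x 256 images, 8 bytes per pixel (float64).
capacity = heap_capacity(2e9, img_size=256, itemsize=8)
kept = top_scored([(0.3, 10), (0.9, 11), (0.1, 12), (0.7, 13)], capacity=2)
```

With these numbers the default limit holds 3814 double precision images of size 256, so larger datasets would need a bigger limit to cache every class average.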

property heap_ids

Return the image ids currently in the heap.

property heap_idx_map

Return map of image ids to heap position currently in the heap.

class aspire.classification.class_selection.GlobalWithRepulsionClassSelector(*args, **kwargs)

Bases: GreedyClassRepulsionMixin, GlobalClassSelector

Extends ClassSelector for methods that require passing over all class average images and also GreedyClassRepulsionMixin.

Sets optional exclude_k. All other args and **kwargs are passed to super().

GreedyClassRepulsionMixin is similar to cryo_select_subset from MATLAB, but MATLAB found exclude_k iteratively based on a desired result set size.

Parameters:

exclude_k – Number of neighbors from each class to exclude. Defaults to all neighbors.

class aspire.classification.class_selection.GreedyClassRepulsionMixin(*args, **kwargs)

Bases: object

Mixin to overload class selection based on excluding classes we’ve already seen as neighbors of another class.

If the classes are well sorted (by some measure of quality), we assume the best representation is the first seen.

Sets optional exclude_k. All other args and **kwargs are passed to super().

GreedyClassRepulsionMixin is similar to cryo_select_subset from MATLAB, but MATLAB found exclude_k iteratively based on a desired result set size.

Parameters:

exclude_k – Number of neighbors from each class to exclude. Defaults to all neighbors.
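
The greedy exclusion idea can be sketched as below. It assumes classes are visited in quality order and that entry 0 of each class is its base image; greedy_repulsion is a hypothetical name and the details may differ from the mixin's actual behavior.

```python
def greedy_repulsion(ordered_class_ids, classes, exclude_k=None):
    """Keep a class unless its base image was already seen as a neighbor.

    `ordered_class_ids` is assumed sorted from best to worst quality.
    """
    excluded = set()
    selection = []
    for cls in ordered_class_ids:
        if classes[cls][0] in excluded:
            continue  # already represented by an earlier, better class
        selection.append(cls)
        neighbors = classes[cls][1:]  # entry 0 is the base image itself
        if exclude_k is not None:
            neighbors = neighbors[:exclude_k]
        excluded.update(neighbors)
    return selection


classes = [
    [0, 1, 2],
    [1, 0, 3],
    [2, 0, 4],
    [3, 4, 1],
    [4, 3, 2],
]
order = [0, 1, 2, 3, 4]
all_neighbors = greedy_repulsion(order, classes)
one_neighbor = greedy_repulsion(order, classes, exclude_k=1)
```

Excluding fewer neighbors per class yields a larger, less diverse selection, which is the trade-off exclude_k controls.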

class aspire.classification.class_selection.ImageQualityFunction

Bases: ABC

A callable image quality scoring function.

The main advantage to using this class is to gain access to a grid caching and Image/Numpy conversion.

class aspire.classification.class_selection.NeighborVarianceClassSelector

Bases: ClassSelector

Selects classes based on variances of distances.

Note that distances is the Nearest Neighbors distances, and in the case of RIR this is a small rotationally invariant feature vector. For methods based on class average images, see subclasses of GlobalClassSelector.

class aspire.classification.class_selection.NeighborVarianceWithRepulsionClassSelector(*args, **kwargs)

Bases: GreedyClassRepulsionMixin, NeighborVarianceClassSelector

Selects top classes based on highest contrast with GreedyClassRepulsionMixin.

Sets optional exclude_k. All other args and **kwargs are passed to super().

GreedyClassRepulsionMixin is similar to cryo_select_subset from MATLAB, but MATLAB found exclude_k iteratively based on a desired result set size.

Parameters:

exclude_k – Number of neighbors from each class to exclude. Defaults to all neighbors.

class aspire.classification.class_selection.RampWeightedImageQualityMixin

Bases: WeightedImageQualityMixin

ImageQualityMixin to apply a linear ramp.

class aspire.classification.class_selection.RampWeightedVarianceImageQualityFunction

Bases: RampWeightedImageQualityMixin, VarianceImageQualityFunction

Computes the variance of pixels after weighting with Ramp function.

class aspire.classification.class_selection.RandomClassSelector(seed=None)

Bases: ClassSelector

Parameters:

seed – RNG seed, de

class aspire.classification.class_selection.TopClassSelector

Bases: ClassSelector

class aspire.classification.class_selection.VarianceImageQualityFunction

Bases: ImageQualityFunction

Computes the variance of pixels.

class aspire.classification.class_selection.WeightedImageQualityMixin

Bases: ABC

Extends ImageQualityFunction with a radial grid weighted function for use in user defined _function calls.

weights(L)

Returns 2D array of weights for a given resolution L. Computes and caches on first request.

Parameters:

L – resolution in pixels

Returns:

2d weight array for LxL grid.
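
A cached radial weight grid might look like the sketch below. The ramp shape (1 at the center, falling linearly to 0 at radius L/2) is an assumption made for illustration, and ramp_weights is a hypothetical function rather than the library's method.

```python
import math
from functools import lru_cache


@lru_cache(maxsize=None)
def ramp_weights(L):
    """L x L weights: 1 at the grid center, falling linearly to 0 at radius L/2."""
    mid = (L - 1) / 2.0
    max_r = L / 2.0
    return tuple(
        tuple(
            max(0.0, 1.0 - math.hypot(row - mid, col - mid) / max_r)
            for col in range(L)
        )
        for row in range(L)
    )


w = ramp_weights(5)
w_again = ramp_weights(5)  # served from the cache, not recomputed
```

Caching by L avoids rebuilding the grid for every image scored at the same resolution.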

aspire.classification.legacy_implementations module

aspire.classification.legacy_implementations.bispec_2drot_large(coef, freqs, eigval, alpha, sample_n, seed=None)

Typical values: alpha 1/3, sample_n 4000.

aspire.classification.legacy_implementations.bispec_operator_1(freqs)

aspire.classification.legacy_implementations.pca_y(x, k, num_iters=2, seed=None)

PCA using QR factorization.

See:

An algorithm for the principal component analysis of large data sets. Halko, Martinsson, Shkolnisky, Tygert, SIAM 2011.

Parameters:
  • x – Data matrix

  • k – Number of estimated Principal Components.

  • num_iters – Number of dot product applications.

Returns:

(left Singular Vectors, Singular Values, right Singular Vectors)

aspire.classification.reddy_chatterji module

aspire.classification.reddy_chatterji.reddy_chatterji_register(images, reflection, mask=None, do_cross_corr_translations=True, dtype=None)

Compute the Reddy Chatterji method registering images[1:] to images[0].

This differs from papers and published scikit implementations by computing the fixed base images[0] pipeline once then reusing it.

Parameters:
  • images – Image data (m_img, L, L)

  • reflection – Image reflections (m_img,)

  • mask – Support of image. Defaults to disk with radius images.shape[-1]//2.

  • do_cross_corr_translations – Solve translations by using cross correlation (log polar) method.

  • dtype – Specify dtype. Defaults to infer from images.dtype

Returns:

(rotations, shifts, correlations) corresponding to images

aspire.classification.rir_class2d module

class aspire.classification.rir_class2d.RIRClass2D(src, pca_basis=None, fspca_components=None, alpha=0.3333333333333333, sample_n=4000, bispectrum_components=300, n_nbor=100, bispectrum_freq_cutoff=None, large_pca_implementation='legacy', nn_implementation='legacy', bispectrum_implementation='legacy', batch_size=512, dtype=None, seed=None)

Bases: Class2D

Constructor of an object for classifying 2D images using Rotationally Invariant Representation (RIR) algorithm.

At a high level this consumes a Source instance src, and an FSPCA Basis pca_basis.

Yield class averages by first performing classify, then performing output.

Z. Zhao, Y. Shkolnisky, A. Singer, Rotationally Invariant Image Representation for Viewing Direction Classification in Cryo-EM. (2014)

Parameters:
  • src – Source instance, for classification.

  • pca_basis – Optional FSPCA Basis instance

  • fspca_components – Optionally set number of components (top eigvals) to keep from full FSPCA. Default value of None will infer from pca_basis when provided, otherwise defaults to 400.

  • alpha – Amplitude Power Scale, default 1/3 (eq 20 from RIIR paper).

  • sample_n – Threshold for random sampling of bispectrum coefs. Default 4000, high values such as 50000 reduce random sampling.

  • n_nbor – Number of nearest neighbors to compute.

  • bispectrum_freq_cutoff – Truncate (zero) high k frequencies above (int) value, defaults off (None).

  • large_pca_implementation – See pca.

  • nn_implementation – See nn_classification.

  • bispectrum_implementation – See bispectrum.

  • batch_size – Chunk size (typically number of images) for batched methods.

  • dtype – Optional dtype, otherwise taken from src.

  • seed – Optional RNG seed to be passed to random methods, (example Random NN).

Returns:

RIRClass2D instance to be used to compute bispectrum-like rotationally invariant 2D classification.

bispectrum(coef)

All bispectrum implementations should consume a stack of fspca coef and return bispectrum coefficients.

Parameters:

coef – complex steerable coefficients (eg. from FSPCABasis).

Returns:

tuple of arrays (coef_b, coef_b_r)

classify(diagnostics=False)

This is the high level method to perform the 2D image classification.

The stages of this method are intentionally modular so they may be swapped for other implementations.

Parameters:

diagnostics – Optionally plots distribution of distances

nn_classification(coef_b, coef_b_r)

Takes in features as pair of arrays (coef_b, coef_b_r), each having shape (n_img, features) where features = min(self.bispectrum_components, n_img).

Result is array (n_img, n_nbor) with entry i representing index i into class input img array (src).

To extend with an additional Nearest Neighbor algo, add as a private method and list in nn_implementations.

Parameters:
  • coef_b

  • coef_b_r

Returns:

Tuple of classes, refl, dists where classes is an integer array of indices representing image ids, refl is a bool array representing reflections (True is refl), and distances is an array of distances as returned by NN implementation.
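
A brute force nearest neighbor search over both feature sets can be sketched as below. It uses small real valued feature vectors and Euclidean distance for simplicity, and the reflected features in the example are simply negated as a stand-in; nn_with_reflections is a hypothetical name and this is not one of the listed nn_implementations.

```python
import math


def nn_with_reflections(coef_b, coef_b_r, n_nbor):
    """Rank neighbors of each image by the smaller of two distances.

    For images i and j, compare coef_b[i] against coef_b[j] and against
    coef_b_r[j]; a smaller reflected distance marks j as a reflection.
    """
    classes, refl, dists = [], [], []
    for fi in coef_b:
        candidates = []
        for j, (fj, fj_r) in enumerate(zip(coef_b, coef_b_r)):
            d, d_r = math.dist(fi, fj), math.dist(fi, fj_r)
            candidates.append((min(d, d_r), j, d_r < d))
        candidates.sort()
        top = candidates[:n_nbor]
        classes.append([j for _, j, _ in top])
        refl.append([r for _, _, r in top])
        dists.append([d for d, _, _ in top])
    return classes, refl, dists


coef_b = [[1.0, 0.0], [1.2, 0.0], [-1.0, 0.1], [5.0, 5.0]]
# Stand-in "reflected" features: negated copies, for illustration only.
coef_b_r = [[-x, -y] for x, y in coef_b]
classes, refl, dists = nn_with_reflections(coef_b, coef_b_r, n_nbor=3)
```

Each image is its own nearest neighbor at distance zero, so entry 0 of every class is the base image, consistent with the convention used elsewhere in this package.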

pca(M)

Any PCA implementation here should return both coef_b and coef_b_r that are (n_img, n_components).

n_components is typically self.bispectrum_components. However, for small problems it may return n_components = n_img, since that would be the smallest dimension.

To extend class with an additional PCA like method, add as private method and list in large_pca_implementations.

Parameters:

M – Array (n_img, m_features), typically complex.

Returns:

Tuple of arrays (coef_b, coef_b_r).

Module contents