Class Averaging Architecture

ASPIRE now contains a broad collection of configurable and extensible components which can be combined to create class averaging solutions tailored to different datasets. The architecture was designed to be modular and to encourage experimentation. Lower level components are aggregated into a high level interface by ClassAvgSource instances. Starting there, this document descends into each contributing component.

ClassAvgSource

ClassAvgSource is the fully customizable base class which links components together into a cohesive source that can be used with other ASPIRE components. A power user can instantiate each required component and assign it here for complete control.
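
To make the composition concrete, here is a minimal sketch of how a classifier, a class selector, and an averager could be chained behind a single images() call. Everything in it is hypothetical: ToyClassAvgSource and the three toy_* functions are stand-ins written for this illustration using only NumPy, and they are not ASPIRE's classes or signatures. The toy classifier uses plain Euclidean distance, which is not rotationally invariant.

```python
import numpy as np


def toy_classifier(src, n_nbors=2):
    """Nearest neighbors by plain Euclidean distance (not rotation invariant)."""
    flat = src.reshape(len(src), -1)
    dist = np.linalg.norm(flat[:, None, :] - flat[None, :, :], axis=-1)
    class_indices = np.argsort(dist, axis=1, kind="stable")[:, :n_nbors]
    class_distances = np.take_along_axis(dist, class_indices, axis=1)
    class_refl = np.zeros_like(class_indices, dtype=bool)  # toy: no reflections
    return class_indices, class_refl, class_distances


def toy_selector(class_indices, class_refl, class_distances):
    """Keep every class, in the original order."""
    return np.arange(len(class_indices))


def toy_averager(src, indices, refl):
    """Arithmetic mean of the class members, with no alignment."""
    return src[indices].mean(axis=0)


class ToyClassAvgSource:
    """Hypothetical stand-in that wires the three stages together."""

    def __init__(self, src, classifier, class_selector, averager):
        self.src = src
        self.classifier = classifier
        self.class_selector = class_selector
        self.averager = averager

    def images(self):
        class_indices, class_refl, class_distances = self.classifier(self.src)
        selection = self.class_selector(class_indices, class_refl, class_distances)
        return np.stack(
            [self.averager(self.src, class_indices[c], class_refl[c]) for c in selection]
        )


# Four constant 2x2 "images"; values 0 and 1 are close, as are 10 and 11.
src = np.stack([np.full((2, 2), v) for v in (0.0, 1.0, 10.0, 11.0)])
avg_src = ToyClassAvgSource(src, toy_classifier, toy_selector, toy_averager)
averages = avg_src.images()
```

Swapping any one of the three callables changes the behavior of the whole source without touching the others, which is the property the architecture is aiming for.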

        classDiagram
    class ClassAvgSource{
        src: ImageSource
        classifier: Class2D
        class_selector: ClassSelector
        averager: Averager2D
        +images()
        }

    ClassAvgSource o-- ImageSource
    ClassAvgSource o-- Class2D
    ClassAvgSource o-- ClassSelector
    ClassAvgSource o-- Averager2D

    class ImageSource{
        +images()
    }
    class Class2D{
        +classify()
    }
    class ClassSelector{
        +select()
    }
    class Averager2D{
        +average()
    }
    

While that allows for full customization, two helper classes are provided that supply defaults as a jumping-off point. Both of these helper sources require only an input Source to be instantiated. They can still be fully customized, but they start with sensible defaults, so users only need to instantiate the specific components they wish to configure.

        classDiagram
   ClassAvgSource <|-- DebugClassAvgSource
   ClassAvgSource <|-- DefaultClassAvgSource
   class DebugClassAvgSource{
      src: ImageSource
      classifier: RIRClass2D
      class_selector: TopClassSelector
      averager: BFRAverager2D
      +images()
   }
   class DefaultClassAvgSource{
      version="0.11.0"
      src: ImageSource
      classifier: RIRClass2D
      class_selector: NeighborVarianceWithRepulsionClassSelector
      averager: BFSRAverager2D
      +images()
   }
    

DebugClassAvgSource is designed for use in testing, documentation, and development because it defaults to the simplest components while also maintaining the original input source index ordering. That is, the first 10 class averages from DebugClassAvgSource should correspond with the first 10 source images, without requiring any index mapping.

DefaultClassAvgSource applies the most sensible defaults available in the current ASPIRE release. It accepts a version string, such as 0.11.0, which selects a specific configuration. Pinning the version should allow users to repeat a similar experiment across releases as ASPIRE implements improved methods. When a version is not provided, DefaultClassAvgSource defaults to the latest version available.

Classifiers

Classifiers take an image Source and attempt to group its images into classes of similar viewing angle, up to reflection. All Class2D instances are expected to implement a classify method which returns (class_indices, class_refl, class_distances). The three returned variables are expected to be 2D NumPy arrays in a neighbor network format having shape (src.n, n_nbors). To retrieve the set of input source indices for the first class’s neighbors, we would use class_indices[0,:]. The first entry of each row is the index of the reference image used for classification, so for the first class class_indices[0,0]=0. The actual underlying image would be input_src.images[0], or more generally input_src.images[class_indices[c,0]] for some class c.
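
The neighbor network format can be illustrated with a small hand-written example. The arrays below are made-up values for five images with three neighbors each, not output from any ASPIRE classifier. The zero self-distance in the first column is an assumption of this toy.

```python
import numpy as np

n_images, n_nbors = 5, 3

# Row c lists the source indices in class c; column 0 is the reference image.
class_indices = np.array(
    [
        [0, 3, 4],
        [1, 2, 0],
        [2, 1, 4],
        [3, 0, 4],
        [4, 3, 0],
    ]
)

# True where a neighbor matches its reference only after reflection.
class_refl = np.zeros((n_images, n_nbors), dtype=bool)
class_refl[0, 2] = True

# Toy distances in feature space (assumed zero for the reference itself).
class_distances = np.array(
    [
        [0.0, 0.1, 0.3],
        [0.0, 0.2, 0.4],
        [0.0, 0.2, 0.5],
        [0.0, 0.1, 0.2],
        [0.0, 0.2, 0.3],
    ]
)

first_class = class_indices[0, :]    # source indices of class 0
reference_ids = class_indices[:, 0]  # reference image of every class

c = 2
members = class_indices[c]           # images that would be averaged for class c
needs_flip = class_refl[c]           # which of those members are reflected
```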

No further class selection or ordering occurs during classification. Those methods are broken out into other components.

Currently ASPIRE has a single classification algorithm known as RIRClass2D. This algorithm uses multiple applications of PCA in conjunction with bispectrum analysis to identify nearest neighbors in a rotationally invariant feature space.

        classDiagram
   class Class2D{
       +classify()
   }
 Class2D <|-- RIRClass2D
    

Class Selectors

Class Selectors consume the output of Class2D and attempt to order and/or filter classes down to a selection. Selecting the “best” classes in cryo-EM problems is still an area of active research. Some common methods are provided, along with an extensible base interface.

Generally, Class Selection comes in two flavors depending on what information is required to perform the selection.

Local Class Selectors

For “Local” class selection, we will attempt to use only the information returned from Class2D. In the case of RIRClass2D this would primarily be a network of distances as measured in the compressed feature space.

This approach has two main advantages. First, this information has already been computed as part of classification. Second, it allows us to register and stack only a relatively small subset of the “best” classes. Because registration and alignment are computationally expensive, this can reduce pipeline run times by an order of magnitude.

        classDiagram
   class ClassSelector{
      +select()
      }
    ClassSelector <|-- TopClassSelector
    ClassSelector <|-- RandomClassSelector
    ClassSelector <|-- NeighborVarianceClassSelector
    ClassSelector <|-- DistanceClassSelector
    ClassSelector o-- GreedyClassRepulsionMixin
    

Global Class Selectors

Global Class Selection techniques first compute the entire collection of registered and aligned class averages, then compute some quality measure on all classes.

Many classic experiments computed the variance of each class-averaged image and sorted the classes so that those with the highest variance came first. This is sometimes referred to as contrast. Often the classes were also selected to avoid views already seen. This can now be accomplished by using the VarianceImageQualityFunction in a GlobalWithRepulsionClassSelector.

An SNR based approach is also provided, and a bandpass method should be implemented in a future release. Again, these components are fully customizable and the base interfaces were designed with algorithm developers in mind.
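
A global selection of this kind can be sketched as scoring every class average and keeping the best few in a bounded heap. The code below is a simplified illustration using Python's heapq and NumPy. The names variance_quality and top_k_by_quality are hypothetical, and the way the heap is used here is an assumption about the general idea, not a description of ASPIRE's implementation.

```python
import heapq

import numpy as np


def variance_quality(image):
    """Score a single class average by its pixel variance."""
    return float(np.var(image))


def top_k_by_quality(averages, quality_fn, k):
    """Return the ids of the k highest scoring averages, best first."""
    heap = []  # min-heap of (score, class_id); the weakest kept class is on top
    for class_id, img in enumerate(averages):
        item = (quality_fn(img), class_id)
        if len(heap) < k:
            heapq.heappush(heap, item)
        else:
            # Push the new item, then drop whichever item is now the smallest.
            heapq.heappushpop(heap, item)
    return [class_id for _, class_id in sorted(heap, reverse=True)]


# Toy "class averages": the same pattern at four different contrasts.
base = np.arange(16, dtype=float).reshape(4, 4)
toy_averages = [base * s for s in (1.0, 3.0, 0.5, 2.0)]
best = top_k_by_quality(toy_averages, variance_quality, k=2)
```

Because the heap never holds more than k entries, memory use stays bounded no matter how many class averages are scored.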

To implement concrete GlobalClassSelector instances, leverage the subcomponents described below.

        classDiagram
    ClassSelector <|-- GlobalClassSelector
    class GlobalClassSelector{
        averager: Averager2D
        function: ImageQualityFunction
        heap_size: int
        }
    GlobalClassSelector *-- ImageQualityFunction
    GlobalClassSelector ..> Heap
    GlobalClassSelector <|-- GlobalWithRepulsionClassSelector


    class ImageQualityFunction{
       -_function
       +__call__()
       }
    ImageQualityFunction o-- WeightedImageQualityMixin
    ImageQualityFunction <|-- BandedSNRImageQualityFunction
    ImageQualityFunction <|-- VarianceImageQualityFunction
    ImageQualityFunction <|-- BandpassImageQualityFunction_TBD

    class WeightedImageQualityMixin{
        -_weight_function
    }
    WeightedImageQualityMixin <|-- RampWeightedImageQualityMixin
    WeightedImageQualityMixin <|-- BumpWeightedImageQualityMixin

    VarianceImageQualityFunction <|-- RampWeightedVarianceImageQualityFunction
    RampWeightedImageQualityMixin <|-- RampWeightedVarianceImageQualityFunction
    VarianceImageQualityFunction <|-- BumpWeightedVarianceImageQualityFunction
    BumpWeightedImageQualityMixin <|-- BumpWeightedVarianceImageQualityFunction
    

Class Repulsion

Class repulsion refers to techniques used to avoid classes based on some criterion. Currently we provide GreedyClassRepulsionMixin, but this mix-in class can be used as a model for implementing alternate schemes.

GreedyClassRepulsionMixin is based on the following intuition. Assume the selection has in fact ordered the classes so that the “best” classes occur first. It follows that the “best” expression of a viewing angle locus will be the first seen. Now assume the classifier returns classes with closest viewing angles (up to reflections). Then the classes formed by neighbors of the current expression are inferior expressions of the same locus and can be excluded. The number of neighbors repelled for each selected class is tunable.

In practice, GreedyClassRepulsionMixin is designed to be mixed into any other ClassSelector. Note that repulsion can (and will) dramatically reduce the population of class averages returned.
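
The greedy intuition can be sketched in a few lines. The function below is a hypothetical illustration, not the mix-in itself: it walks an already ordered list of class ids, keeps each class that has not been excluded, and then excludes the classes whose reference images are among that class's first n_exclude neighbors. It assumes class ids coincide with reference image indices, that is class_indices[c, 0] == c.

```python
import numpy as np


def greedy_repulsion(ordered_class_ids, class_indices, n_exclude):
    """Keep classes in order, repelling the neighbors of every kept class."""
    excluded = set()
    kept = []
    for c in ordered_class_ids:
        c = int(c)
        if c in excluded:
            continue
        kept.append(c)
        # Column 0 is the class's own reference, so start at column 1.
        excluded.update(int(i) for i in class_indices[c, 1 : 1 + n_exclude])
    return kept


# Toy neighbor network for 5 images with 3 members per class.
toy_indices = np.array(
    [
        [0, 3, 4],
        [1, 2, 0],
        [2, 1, 4],
        [3, 0, 4],
        [4, 3, 0],
    ]
)
ordered = np.arange(5)  # pretend the selector ranked the classes 0, 1, 2, 3, 4
mild = greedy_repulsion(ordered, toy_indices, n_exclude=1)
aggressive = greedy_repulsion(ordered, toy_indices, n_exclude=2)
```

Raising n_exclude from 1 to 2 shrinks the selection in this toy, which mirrors the reduction in returned class averages described above.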

Image Quality Functions

The ImageQualityFunction interface provides a consistent way to bring your own function to measure the quality of a single aligned and registered class average. This function should operate on a single Image, with conversions and broadcasting being handled behind the scenes.

An example would be VarianceImageQualityFunction which computes and returns variance.

Another advantage of using the class is that it exposes and manages a grid cache, which is handy to avoid recomputing the same grid for every image when using spatial methods.

WeightedImageQualityMixin

WeightedImageQualityMixin is designed to mix with subclasses of ImageQualityFunction, extending them with a weighted image mask applied prior to the image quality function call.

Two concrete examples are provided, BumpWeightedVarianceImageQualityFunction and RampWeightedVarianceImageQualityFunction, which apply the respective weight functions prior to the variance calculation.

Again, WeightedImageQualityMixin exposes and manages a grid cache, this time for grid weights.
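
The mix-in pattern described here can be sketched with ordinary Python multiple inheritance. All class names below are hypothetical toys, and the ramp shape (weight 1 at the center falling linearly to 0 at unit radius) is an arbitrary choice for illustration that should not be read as ASPIRE's weight definition. The sketch assumes square images.

```python
import numpy as np


class ToyQualityFunction:
    """Base class: owns a grid cache and defers scoring to _function."""

    def __init__(self):
        self._grid_cache = {}

    def _function(self, img):
        raise NotImplementedError

    def _radial_grid(self, L):
        # Build the L x L radius grid once per size and reuse it afterwards.
        if L not in self._grid_cache:
            x = np.linspace(-1.0, 1.0, L)
            xx, yy = np.meshgrid(x, x)
            self._grid_cache[L] = np.hypot(xx, yy)
        return self._grid_cache[L]

    def __call__(self, img):
        return self._function(np.asarray(img, dtype=float))


class ToyVarianceQualityFunction(ToyQualityFunction):
    def _function(self, img):
        return float(np.var(img))


class ToyRampWeightedMixin:
    """Multiplies the image by a radial ramp before the wrapped function runs."""

    def _weight_function(self, L):
        return np.clip(1.0 - self._radial_grid(L), 0.0, 1.0)

    def __call__(self, img):
        img = np.asarray(img, dtype=float)
        return super().__call__(img * self._weight_function(img.shape[-1]))


class ToyRampWeightedVariance(ToyRampWeightedMixin, ToyVarianceQualityFunction):
    """The mix-in comes first so its __call__ wraps the variance function."""


flat_image = np.ones((8, 8))
plain_score = ToyVarianceQualityFunction()(flat_image)
weighted_fn = ToyRampWeightedVariance()
weighted_score = weighted_fn(flat_image)
weighted_fn(flat_image)  # second call reuses the cached grid
```

A constant image has zero variance on its own, but a nonzero score once the ramp is applied, which shows that the weighting really does happen before the quality function is evaluated.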

Averagers

Averagers consume from a Source and return averaged images defined by class network arguments class_indices and class_refl. You may find the terms averaging and stacking used interchangeably in this context, so know that averaging does not always imply arithmetic mean.

Some averaging techniques, namely those subclassing AligningAverager2D, have distinct alignment and averaging stages. Others, such as expectation-maximization (EM), may perform these steps internally and provide only an opaque averaging stage.
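
The two-stage idea (align, then stack) can be sketched with a deliberately crude brute-force search. The code below is a toy under stated assumptions: it tries only the four 90 degree rotations available through np.rot90, undoes a flagged reflection with np.flipud, and scores candidates by their inner product with the reference. The function names are hypothetical, and a real aligning averager would search a much finer set of rotations and possibly shifts.

```python
import numpy as np


def align(reference, neighbor, reflected):
    """Return the rotated (and optionally flipped) neighbor that best matches."""
    if reflected:
        neighbor = np.flipud(neighbor)
    candidates = [np.rot90(neighbor, k) for k in range(4)]
    scores = [float(np.sum(reference * cand)) for cand in candidates]
    return candidates[int(np.argmax(scores))]


def average_class(images, indices, refl):
    """Align every class member to the reference (column 0), then take the mean."""
    reference = images[indices[0]]
    aligned = [reference]
    for i, r in zip(indices[1:], refl[1:]):
        aligned.append(align(reference, images[i], bool(r)))
    return np.mean(aligned, axis=0)


# Toy class: the reference, a rotated copy, and a rotated plus reflected copy.
base = np.arange(16, dtype=float).reshape(4, 4)
toy_images = np.stack([base, np.rot90(base), np.flipud(np.rot90(base, 2))])
toy_class = np.array([0, 1, 2])
toy_refl = np.array([False, False, True])
class_average = average_class(toy_images, toy_class, toy_refl)
```

Since every member here is an exact transformed copy of the reference, a correct alignment stage makes the average equal to the reference itself.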

        classDiagram
     class Averager2D{
         basis: Basis
         src: ImageSource
         +average()
     }
     Averager2D ..> ImageStacker
     Averager2D <|-- AligningAverager2D
     class AligningAverager2D{
         align()
     }
     ImageSource *-- Averager2D
     Averager2D <|-- EMAverager2D_TBD
     Averager2D <|-- FTKAverager2D_TBD
     AligningAverager2D <|-- BFRAverager2D
     BFRAverager2D <|-- BFSRAverager2D
     AligningAverager2D <|-- ReddyChetterjiAverager2D
     ReddyChetterjiAverager2D <|-- BFSReddyChetterjiAverager2D
    

Each AligningAverager2D can be configured to use a custom ImageStacker if desired.

ImageStacker

ImageStacker provides an interface for the common task of stacking images. Implementations for common stacking methods are provided and should work for both Image and (1D) coefficient stacks. Users experimenting with advanced stacking are responsible for selecting an ImageStacker method appropriate for their data.

Note that the ASPIRE default is naturally MeanImageStacker.
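
To show how the stacking methods differ, here is a minimal NumPy sketch of four reductions over the first axis of a stack. The functions are hypothetical illustrations. In particular, sigma_rejection_stack is a simple mean-and-standard-deviation variant and does not attempt the Gaussian or FWHM options named in the diagram below, and the winsorized version clips each pixel at symmetric percentiles. Because each function reduces only along axis 0, the same code applies to image stacks and to 1D coefficient stacks.

```python
import numpy as np


def mean_stack(stack):
    return np.mean(stack, axis=0)


def median_stack(stack):
    return np.median(stack, axis=0)


def sigma_rejection_stack(stack, sigma=3.0):
    """Mean of the samples lying within sigma standard deviations, per pixel."""
    mu = stack.mean(axis=0)
    sd = stack.std(axis=0)
    keep = np.abs(stack - mu) <= sigma * sd
    counts = keep.sum(axis=0)
    total = np.where(keep, stack, 0.0).sum(axis=0)
    # Fall back to the plain mean wherever every sample was rejected.
    return np.where(counts > 0, total / np.maximum(counts, 1), mu)


def winsorized_stack(stack, percentile=10.0):
    """Clip each pixel to its [percentile, 100 - percentile] range, then average."""
    lo = np.percentile(stack, percentile, axis=0)
    hi = np.percentile(stack, 100.0 - percentile, axis=0)
    return np.clip(stack, lo, hi).mean(axis=0)


# Ten aligned 2x2 images of ones, with a single outlier pixel in the first image.
toy_stack = np.ones((10, 2, 2))
toy_stack[0, 0, 0] = 101.0

stacked_mean = mean_stack(toy_stack)
stacked_median = median_stack(toy_stack)
stacked_sigma = sigma_rejection_stack(toy_stack, sigma=2.0)
stacked_winsor = winsorized_stack(toy_stack, percentile=10.0)
```

In this toy the plain mean is pulled far from 1 by the single outlier, while the median and sigma rejection results ignore it and the winsorized result is only mildly affected.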

        classDiagram
     class ImageStacker{
         stack()
     }
     class SigmaRejectionImageStacker{
         sigma
     }
     class WinsorizedImageStacker{
         percentile
     }
     ImageStacker <|-- MeanImageStacker
     ImageStacker <|-- MedianImageStacker
     ImageStacker <|-- SigmaRejectionImageStacker
     SigmaRejectionImageStacker .. Gaussian
     SigmaRejectionImageStacker .. FWHM
     ImageStacker <|-- WinsorizedImageStacker