The Resolution Revolution Is Still Ongoing
High-Resolution Ab-Initio Reconstruction: Extending Cryo-EM to Smaller Particles with CryoSPARC™
Over the past two decades, cryo-EM has revolutionized structural biology, enabling high-resolution structure determination of macromolecules and protein complexes that were previously inaccessible. By capturing multiple conformational states from a single dataset, cryo-EM has provided unique insights into molecular heterogeneity and dynamics.
Despite this progress, important limitations remain. One long-standing challenge is the structural determination of small proteins, typically below ~50 kDa.
In a recent preprint, Kookjoo Kim, Huan Li, and Oliver B. Clarke challenge this size limit by implementing a new data-processing workflow in CryoSPARC, termed high-resolution heterogeneous ab initio reconstruction (HR-HAIR).

The Challenge of Small Proteins: Why Bother?
In many structural biology pipelines, cryo-EM is simply reserved for large protein complexes, while smaller proteins are redirected to complementary techniques such as X-ray crystallography. While pragmatic, this division overlooks key advantages that cryo-EM brings to the study of small enzymes and regulatory proteins.
Cryo-EM enables the direct observation of sample composition with minimal manipulation. How often are monomers, dimers, or transient subcomplexes discarded during data processing without a second look? And how often might these “minor” species be central to biologically or therapeutically relevant mechanisms of association, dissociation, or regulation?
Access to structures of small particles can therefore inform mechanism, target engagement, and assembly equilibria, critical parameters in drug discovery and development pipelines.

Acting on the Workflow: Focusing on Data Processing
When sample optimization for small particles and hardware improvements fail (or are too time-consuming to justify in a production environment), the natural next step is to focus on the third pillar of the cryo-EM workflow: data processing. This is the approach taken by Kookjoo Kim, Huan Li, and Oliver B. Clarke in their recent preprint.
The authors identified a critical bottleneck in single-particle analysis (SPA) of small proteins: the estimation of initial particle orientations during ab initio reconstruction. At low resolution, small proteins often lack sufficient structural features for reliable alignment, which raises a simple but powerful question posed by Kim et al.: what happens if ab initio reconstruction is run using high spatial frequency information?
Their answer is high-resolution heterogeneous ab initio reconstruction (HR-HAIR), a CryoSPARC-based workflow that derives initial particle orientations from high-frequency signal. By incorporating higher resolution information from the start, increasing the number of iterations, and using very small resolution step sizes between iterations, HR-HAIR bypasses the need for reliable low-resolution features and converges on interpretable initial maps. Using this approach, the authors demonstrated that high-resolution ab initio reconstruction can enable structural determination of proteins below 30 kDa.
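To make the idea of "very small resolution step sizes" concrete, the sketch below builds a per-iteration resolution schedule and compares a coarse ramp with a fine one. It is only an illustration: the function names, the even stepping in spatial frequency, and all numeric values are assumptions of this example, not parameters taken from the preprint or from CryoSPARC.

```python
def resolution_schedule(start_res_A, final_res_A, n_iters):
    """Return one resolution limit (in Å) per iteration.

    Assumption of this sketch: the limit is ramped evenly in spatial
    frequency (1/Å) from start_res_A to final_res_A.
    """
    if n_iters < 2:
        raise ValueError("n_iters must be at least 2")
    if not (start_res_A >= final_res_A > 0):
        raise ValueError("expected start_res_A >= final_res_A > 0")
    f_start = 1.0 / start_res_A
    f_final = 1.0 / final_res_A
    step = (f_final - f_start) / (n_iters - 1)
    return [1.0 / (f_start + i * step) for i in range(n_iters)]


def max_frequency_step(schedule):
    """Largest jump in spatial frequency (1/Å) between consecutive iterations."""
    freqs = [1.0 / r for r in schedule]
    return max(b - a for a, b in zip(freqs, freqs[1:]))


# Hypothetical values chosen for illustration only.
coarse = resolution_schedule(35.0, 12.0, n_iters=5)   # few, large steps
fine = resolution_schedule(9.0, 4.0, n_iters=50)      # many, small steps

print("coarse max step (1/Å):", round(max_frequency_step(coarse), 4))
print("fine   max step (1/Å):", round(max_frequency_step(fine), 4))
```

With more iterations spread over a narrower, higher-frequency band, each iteration changes the resolution limit only slightly, which is the qualitative behavior the workflow relies on.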
The workflow was validated on several publicly available datasets, resolving iPKAc (39 kDa) to 2.7 Å and a hemoglobin αβ dimer (29 kDa) to ~4 Å. Notably, a 37 kDa Aca2–RNA complex was reconstructed directly from a blob-picked particle stack in a single HR-HAIR run, followed by local refinement, without prior 2D or 3D classification. This workflow highlights how software advances alone can extend the practical reach of cryo-EM.
The fact that ab initio reconstruction alone is able to generate interpretable maps for iPKAc and Aca2-RNA, even in the presence of significant preferred orientation, suggests that the use of this approach for high resolution reconstruction and classification may be currently underestimated [...]

One important limitation, however, is that CryoSPARC’s Ab-Initio Reconstruction does not perform half-set splitting during this stage, leaving subsequent local refinement susceptible to overfitting and resolution overestimation.

The New Homogeneous Ab-Initio Refinement (BETA)
Building on these findings, CryoSPARC v5 introduces a new reconstruction strategy directly inspired by the HR-HAIR method. The new Homogeneous Ab-Initio Refinement (BETA) job adapts CryoSPARC’s stochastic gradient descent–based ab initio algorithm into a gold-standard refinement framework that preserves the independence of the two half-sets and corresponding half-maps throughout reconstruction.
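The gold-standard principle can be sketched in a few lines: particles are split once into two disjoint half-sets that are processed independently, and resolution is read from the Fourier shell correlation (FSC) between the two half-maps at the conventional 0.143 threshold. The code below is a generic illustration, not CryoSPARC's implementation; the function names and the example FSC curve are made up for this sketch.

```python
import random


def split_half_sets(particle_ids, seed=0):
    """Randomly assign particles to two disjoint half-sets of near-equal size."""
    ids = list(particle_ids)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)
    mid = len(ids) // 2
    return ids[:mid], ids[mid:]


def fsc_resolution(freqs, fsc, threshold=0.143):
    """Resolution (Å) where the FSC curve first drops below the threshold.

    freqs are spatial frequencies in 1/Å, in increasing order.
    Uses linear interpolation between the two bracketing points.
    Returns None if no downward crossing is found.
    """
    for i in range(1, len(fsc)):
        if fsc[i] < threshold <= fsc[i - 1]:
            t = (fsc[i - 1] - threshold) / (fsc[i - 1] - fsc[i])
            f_cross = freqs[i - 1] + t * (freqs[i] - freqs[i - 1])
            return 1.0 / f_cross
    return None


# The two halves never share a particle, so the half-maps stay independent.
half_a, half_b = split_half_sets(range(101))

# Made-up FSC curve for illustration only (not data from any real dataset).
freqs = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30]
fsc = [0.99, 0.95, 0.80, 0.50, 0.20, 0.05]
resolution = fsc_resolution(freqs, fsc)
print(f"estimated resolution: {resolution:.2f} Å")
```

If the two halves are allowed to influence each other at any stage, noise can correlate between the half-maps and inflate the FSC, which is why keeping the split intact from the ab initio stage onward matters.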

This job streamlines the workflow for challenging datasets where CryoSPARC’s stochastic gradient descent algorithm outperforms the traditional expectation-maximization algorithm used in refinements, and it makes resolution estimation and validation more straightforward.
By extending gold-standard workflows into the small-particle regime, CryoSPARC v5 makes it worthwhile to revisit existing data, which may reveal structures, mechanisms, and opportunities that were previously missed.