Tutorial: CTF Refinement

Last Updated: November 29, 2019

This tutorial details the background, implementation, and use of CTF refinement in cryoSPARC v2.12+.

CTF Refinement includes two major components: local (per-particle) CTF refinement and global (per-group) CTF refinement. Local CTF Refinement adjusts each particle's defocus value to estimate the z-position of the particle in the sample/ice. Global CTF Refinement adjusts the higher-order CTF terms (beam-tilt, trefoil, spherical aberration, tetrafoil) across an entire group of images to find the optimum values, accounting for misalignment or aberrations in the microscope itself.

In cryoSPARC, both local and global CTF refinement can be performed standalone (using aligned particles and a reference volume as input) or they can be performed on-the-fly during a 3D refinement, so that the values are iteratively optimized along with particle alignments.


Local CTF refinement (per-particle defocus)

This is a relatively straightforward optimization process of finding the optimal per-particle defocus for each particle in a dataset. Per-particle defocus refinement has been previously proposed and implemented in many other software packages for single particle EM (cisTEM, RELION, Thunder, etc.).

Local CTF Refinement in cryoSPARC requires aligned particle images and a 3D reference (two half-maps), ideally already at a high resolution. Experimental particle images are compared against the 3D reference from their half-set, from the best known pose, at various defocus levels, and the best defocus is selected. The optimal defocus ideally corresponds to the height of the particle in the sample/ice.
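The search described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not cryoSPARC's implementation: it uses a simplified 1D radial CTF (no amplitude contrast or envelope), a flat reference profile, and a sum of squared residuals as a stand-in for the likelihood. The names `ctf_1d` and `best_defocus`, and all constants, are hypothetical.

```python
import math
import random

WAVELENGTH_A = 0.01969  # approximate electron wavelength at 300 kV, in Angstroms
CS_A = 2.7e7            # assumed spherical aberration of 2.7 mm, in Angstroms

def ctf_1d(k, defocus_A):
    """Simplified radial CTF at spatial frequency k (1/A); illustrative only."""
    chi = (math.pi * WAVELENGTH_A * defocus_A * k ** 2
           - 0.5 * math.pi * CS_A * WAVELENGTH_A ** 3 * k ** 4)
    return math.sin(chi)

def best_defocus(observed, reference, freqs, start_defocus_A, search_range_A, step_A):
    """Test defocus values around the starting value and keep the best-scoring one."""
    n_steps = int(round(search_range_A / step_A))
    best = None
    for i in range(-n_steps, n_steps + 1):
        candidate = start_defocus_A + i * step_A
        # Squared error between the observed signal and the CTF-modulated reference
        err = sum((o - ctf_1d(k, candidate) * r) ** 2
                  for o, r, k in zip(observed, reference, freqs))
        if best is None or err < best[1]:
            best = (candidate, err)
    return best[0]

# Toy data: frequencies from 10 A to 3 A, flat reference, small Gaussian noise
random.seed(0)
freqs = [0.1 + i * (1.0 / 3.0 - 0.1) / 199 for i in range(200)]
reference = [1.0 for _ in freqs]
true_defocus = 15200.0
observed = [ctf_1d(k, true_defocus) * r + random.gauss(0.0, 0.05)
            for k, r in zip(freqs, reference)]

# Start from an input defocus that is off by 200 A and search +/- 500 A
estimate = best_defocus(observed, reference, freqs, 15000.0, 500.0, 25.0)
print(estimate)
```

In this toy setting the search is restricted to a window around the input defocus, mirroring the role of the Defocus search range parameter described later in this tutorial.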

Since each particle can be at a different height, and ice thickness can be 10 times larger than the particle diameter in many cases, per-particle defocus refinement can often make a large difference in the accuracy of CTF correction for each particle. However, it generally works best for larger, highly rigid, high-quality samples that already reach relatively high resolutions (better than 4 Å). In general, it is a good idea to try local CTF refinement on every dataset, and to use a homogeneous (gold-standard) refinement to check whether the overall resolution increased or decreased.
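To get a feel for why a defocus offset matters more at high resolution, the sketch below evaluates the defocus term of the CTF phase, π·λ·Δz·k², for an assumed offset of 500 Å at 300 kV. The function names and the 500 Å offset are illustrative assumptions, and sign conventions for the CTF phase vary between packages.

```python
import math

def electron_wavelength_A(voltage_kV):
    """Relativistic electron wavelength in Angstroms for a given voltage in kV."""
    v = voltage_kV * 1e3
    return 12.2643 / math.sqrt(v + 0.97845e-6 * v * v)

def defocus_phase_error_rad(defocus_error_A, resolution_A, voltage_kV=300.0):
    """Phase error pi * lambda * dz * k^2 at spatial frequency k = 1 / resolution."""
    k = 1.0 / resolution_A
    return math.pi * electron_wavelength_A(voltage_kV) * defocus_error_A * k * k

# Phase error caused by an assumed 500 A defocus offset at several resolutions
for res in (8.0, 4.0, 3.0, 2.0):
    print("%.1f A: %.2f rad" % (res, defocus_phase_error_rad(500.0, res)))
```

Because the phase error grows with the square of spatial frequency, the same offset that is nearly harmless at 8 Å amounts to roughly 3.4 rad at 3 Å in this example, which is enough to flip the sign of the CTF.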

How to run: Create a Local CTF Refinement job using the job builder and connect particles from a previously-run refinement (the particles must have alignments3D defined). Also connect the refined volume from the same refinement job. You can optionally connect a separate mask input, otherwise the mask_refine that is included in the volume input from the previous refinement will be used by default.

Local CTF Refinement job builder with inputs and parameters specified

The most important parameters to adjust are:

  • Minimum fit res (A): controls the minimum resolution used for fitting. Generally CTF refinement should be done only with medium to high resolution signal, as low resolution signal can throw off CTF fits. For smaller particles, change this to a higher resolution.
  • Maximum fit res (A): controls the maximum resolution used for fitting. Higher resolution signal is better for CTF refinement, until there is too much noise present in the half-maps. Leave this blank to have the maximum resolution automatically determined via FSC between the two input half-maps.
  • Defocus search range: controls how far above and below the current defocus to search for the optimal defocus of each particle. If you used Patch CTF Estimation previously in cryoSPARC, this value can be made relatively small, about the same as the thickness of ice you expect to have in the sample, since the input defocus values will already be fairly accurate.
Local CTF Refinement job card
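As a rough illustration of what the Minimum fit res and Maximum fit res parameters select, the sketch below converts a resolution range in Angstroms into a range of Fourier shell radii using the generic relation shell = box size × pixel size / resolution. The function names and the example box and pixel size are assumptions; how cryoSPARC applies these limits internally is not described in this tutorial.

```python
import math

def resolution_to_shell(resolution_A, box_size_px, pixel_size_A):
    """Fourier shell radius (in Fourier pixels) corresponding to a resolution in A."""
    return box_size_px * pixel_size_A / resolution_A

def fit_shell_range(min_fit_res_A, max_fit_res_A, box_size_px, pixel_size_A):
    """Inclusive range of whole shells between the two limits, capped at Nyquist."""
    nyquist_shell = box_size_px // 2
    lo = int(math.ceil(resolution_to_shell(min_fit_res_A, box_size_px, pixel_size_A)))
    hi = int(math.floor(resolution_to_shell(max_fit_res_A, box_size_px, pixel_size_A)))
    return lo, min(hi, nyquist_shell)

# Assumed example: 256 px box at 1.0 A/px, fitting between 10 A and 3 A
shells = fit_shell_range(10.0, 3.0, 256, 1.0)
print(shells)
```

Note that a "higher" minimum resolution (a smaller number in Angstroms) raises the lower shell limit, so fewer low-resolution shells contribute to the fit.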

Once the job is run, several diagnostic plots will be created that show the progress of CTF refinement.

Local CTF Refinement event log output

Plots of per-particle defocus error landscapes show the change in log-likelihood across the range of tested defocus values. The curves should look like those above, showing a clear minimum near 0 change in defocus. The X-axis is in units of Angstroms. The Y-axis is in natural-log units, so each change of 1 unit corresponds to a change in probability by a factor of 1/e ≈ 0.368. Therefore, plots with a minimum that is hundreds of units deep indicate that we are highly confident about the optimal defocus value. On the other hand, plots with very shallow minima (tens of units) indicate uncertainty in the optimal defocus.
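The relationship between the depth of a minimum and relative probability can be checked with a few lines of Python. This is only an arithmetic sketch; `relative_probability` is a hypothetical helper name.

```python
import math

def relative_probability(depth_in_log_units):
    """Probability of a defocus value relative to the optimum, given how many
    natural-log units it sits above the minimum of the error landscape."""
    return math.exp(-depth_in_log_units)

# One log unit corresponds to a factor of 1/e; deeper minima fall off very quickly
for depth in (1, 10, 100):
    print("%d units: relative probability %.3g" % (depth, relative_probability(depth)))
```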

Local CTF Refinement event log output

Histograms showing the change in per-particle defocus across all the particles in the half-set indicate how far CTF refinement moved the defocus values away from the input defocus parameters. The histogram should generally be sharply peaked near zero and should not have heavy tails. Heavy tails, or many particles with optimal defocus values at the ends of the search range, indicate that defocus refinement was not very confident or accurate.
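If you have the per-particle defocus changes available as a list of numbers, the "ends of the search range" check can be made quantitative. The sketch below is a hypothetical helper operating on made-up values; it does not read cryoSPARC outputs, and the 1 Å tolerance is an arbitrary assumption.

```python
def fraction_at_search_edge(defocus_changes_A, search_range_A, tolerance_A=1.0):
    """Fraction of particles whose defocus change lies at either end of the search range."""
    at_edge = [d for d in defocus_changes_A
               if abs(d) >= search_range_A - tolerance_A]
    return float(len(at_edge)) / len(defocus_changes_A)

# Made-up defocus changes (in Angstroms) for eight particles, search range +/- 2000 A
changes = [-30.0, 10.0, 0.0, 2000.0, -2000.0, 45.0, -12.0, 5.0]
edge_fraction = fraction_at_search_edge(changes, 2000.0)
print(edge_fraction)
```

A large fraction here would suggest either a search range that is too narrow or particles whose defocus cannot be determined reliably.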

Global CTF Refinement (per-exposure-group beam-tilt, trefoil, spherical aberration, tetrafoil)

Ultra-high-resolution cryo-EM structures require correcting for electron-optical aberrations and microscope misalignments that result in subtle "high-order" terms in the Contrast Transfer Function (CTF). These higher-order terms (corresponding to beam tilt, trefoil, spherical aberration, and tetrafoil) can only be detected at very high resolution, and cannot easily be estimated by straightforward measurements on the microscope. Therefore, the strength of each of these aberrations must be estimated from the single particle data itself, by refining the corresponding CTF parameters against a high-resolution reference map. This process of high-order aberration estimation and correction was pioneered by (Zivanov et al. 2019) in RELION 3.1.

CryoSPARC v2.12 contains a GPU-accelerated implementation of high-order aberration estimation and correction. Estimation is done by directly maximizing the likelihood of observing the experimental images given a 3D reference map, using the L-BFGS optimizer.

Images collected on a given microscope generally will have related CTF parameters for higher-order aberrations. The images that are related (same grid, same image shift position, etc.) can be grouped into "exposure groups" so that they can all be refined at once, with more signal. Creation and management of exposure groups is explained in the next section.

Like local CTF refinement, Global CTF Refinement generally works best with larger, more rigid particles. However, Global CTF Refinement does use signal from all the particles in an exposure group, and so can detect beam tilt and other aberrations even with smaller/flexible structures.

How to run: Create a Global CTF Refinement job using the job builder and connect particles from a previously-run refinement (the particles must have alignments3D defined). Also connect the refined volume from the same refinement job. You can optionally connect a separate mask input, otherwise the mask_refine that is included in the volume input from the previous refinement will be used by default.

Global CTF Refinement job builder with inputs and parameters specified

The most important parameters to adjust are:

  • Minimum fit res (A): controls the minimum resolution used for fitting. Generally global CTF refinement should be done only with medium to high resolution signal, as low resolution signal can be unreliable. For smaller particles, change this to a higher resolution.
  • Maximum fit res (A): controls the maximum resolution used for fitting. Higher resolution signal is better for CTF refinement, until there is too much noise present in the half-maps. Leave this blank to have the maximum resolution automatically determined via FSC between the two input half-maps.
  • Fit Tilt/Trefoil/Spherical Aberration/Tetrafoil: selects which higher-order aberrations should be refined. Tilt and Trefoil are 3rd order and require less high-resolution signal to detect accurately, compared to spherical aberration and tetrafoil, which are 4th order. In some cases, optimizing the 4th order terms can be detrimental, especially if per-particle defocus or the 3rd order terms are not yet correctly refined.
Global CTF Refinement job card

Once the job is run, several diagnostic plots will indicate the phase delay and fit diagnostics of each type of aberration.

Global CTF Refinement event log output

For each order of aberration (odd and even), three plots are made. The first shows the phase error data that is measured from all the particles in aggregate. On the left is the full phase error; on the right are the masked-out terms that will be used for fitting. For the odd terms, the aberrations appear as anti-symmetric patterns of delay (blue) and advancement (red) of the diffracted beam. The second plot shows the phase delay predicted by the fit, after refining CTF parameters. The third plot shows the residual phase error between the data and the fit, which should contain only noise if the fit is good.

Similar plots are made for the even terms in the CTF. Note that odd terms are optimized from zero each time the Global CTF Refinement job is run, meaning that the plots will always show aberrations in the measured data (first plot). Even terms, on the other hand, are optimized starting from their current input values. Therefore if Global CTF Refinement is run twice, the second time, the even terms will show nearly zero aberration in the measured data (since the input CTF parameters are already nearly correct).
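The anti-symmetric appearance of the odd terms, and the symmetric appearance of the even terms, follows from the polynomial order of each aberration in spatial frequency. The sketch below uses generic 3rd and 4th order polynomial shapes to illustrate this. These are assumed illustrative forms with unit coefficients, not cryoSPARC's internal parameterization, and the function names are hypothetical.

```python
def tilt_like(kx, ky, tx=1.0, ty=0.0):
    """3rd-order term of the form (t . k) * |k|^2, with an assumed tilt direction t."""
    return (tx * kx + ty * ky) * (kx ** 2 + ky ** 2)

def trefoil_like(kx, ky):
    """3rd-order, three-fold pattern: real part of (kx + i*ky)^3."""
    return kx ** 3 - 3.0 * kx * ky ** 2

def spherical_like(kx, ky):
    """4th-order, rotationally symmetric term |k|^4."""
    return (kx ** 2 + ky ** 2) ** 2

def tetrafoil_like(kx, ky):
    """4th-order, four-fold pattern: real part of (kx + i*ky)^4."""
    return kx ** 4 - 6.0 * kx ** 2 * ky ** 2 + ky ** 4

# Odd terms change sign under k -> -k (delay on one side, advancement on the other);
# even terms are unchanged.
kx, ky = 0.12, -0.05
for name, f in (("tilt", tilt_like), ("trefoil", trefoil_like),
                ("spherical", spherical_like), ("tetrafoil", tetrafoil_like)):
    print("%s: f(k) = %.3e, f(-k) = %.3e" % (name, f(kx, ky), f(-kx, -ky)))
```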

Note that in the output log of Global CTF refinement the units of each aberration parameter are printed. Beam-tilt is internally parameterized in Angstroms rather than radians, as converting to the latter requires a non-zero spherical aberration coefficient. Values in milli-radians are printed in cases where the spherical aberration is non-zero.

On-the-fly CTF refinement in homogeneous refinement

In cryoSPARC v2.12, both Local CTF Refinement and Global CTF Refinement can be run as standalone jobs. However, since the refinement of these parameters is very fast, they can also be run on-the-fly during iterations of Homogeneous Refinement. In the new Homogeneous Refinement job in v2.12, there are new parameters to enable local and/or global CTF refinement. CTF refinement is carried out iteratively with refinement of 3D poses and the 3D map, starting once the initial refinement is converged.

The new Homogeneous Refinement job in v2.12 will create plots similar to the standalone CTF refinement jobs, and the final CTF parameters after refinement will be output along with the 3D alignments of particles.

Nonuniform refinement with high-order CTF correction

In cryoSPARC v2.12, the Non-uniform Refinement job has been updated to use the new GPU code that supports higher-order CTF correction, but this is NOT enabled by default. You must turn on the Enable higher-order CTF parameter in Non-uniform refinement.

On-the-fly CTF refinement cannot be done during a Non-uniform Refinement, so particles should be processed through the standalone Local CTF Refinement then Global CTF Refinement jobs first.

Exposure Groups

In cryoSPARC v2.12+, particles, movies, and micrographs are organized into "Exposure Groups", which allow images with the same microscope configuration (beam tilt, image shift, etc) to have their CTF refinement done independently in a streamlined manner. This section describes the tools in cryoSPARC to create and manage exposure groups.

At Import time: When you import a dataset (movies, micrographs or particles) in cryoSPARC v2.12+, the set of imported data is automatically assigned a new "exposure group ID". This ID is unique within a project (the group ID increments with each import job, starting from zero) unless overridden using the Override Exposure Group ID parameter. Using this method, you can import your datasets separately based on their beam tilt groups, or any other grouping you would like to use, and the grouping of imports will be retained even if the datasets are merged later on in processing.

Using the Exposure Groups utilities:

If you have a dataset that was imported prior to v2.12, or a dataset that contains multiple exposure groups and you would like to separate each of the groups in the dataset, you can use the Exposure Group Utilities Job. This job allows you to view, split, and combine datasets into one or more exposure groups.

To split a dataset into exposure groups, you can select which file path attribute of the dataset will be used to identify unique groups. For example, in EPU, when capturing multiple images per hole, each shift position should be treated as a separate group. The groups can be identified by the first section of numbers right after the word "Data" in the file path, as outlined below:

FoilHole_21256428_Data_21254194_21254195_20190622_0517_Fractions.mrc

Knowing this, we can separate our exposure (or particle) dataset into unique exposure groups. Input your data into the Exposure Group Utilities job, and select the split mode. Use the parameters Field to use to split Dataset, Start Slice Index, and Number of characters to Consider (more information here) to create unique tokens out of the file paths available. The job automatically creates and sets exposure groups for these tokens:

Exposure group utilities

You can choose to output each exposure group separately by using the Split Outputs by Exposure Group parameter.
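The idea behind splitting on a slice of the file name can be sketched in plain Python. This is only an analogy for the Start Slice Index and Number of characters to Consider parameters, under the assumption that they behave like a simple string slice on the file name; the exact indexing convention used by the job is not described here. The function names are hypothetical, and the second and third file names are made up for illustration.

```python
import os

def group_token(path, start_index, num_chars):
    """Slice a fixed-width token out of the file name."""
    name = os.path.basename(path)
    return name[start_index:start_index + num_chars]

def assign_exposure_groups(paths, start_index, num_chars):
    """Give every distinct token its own integer group ID, in order of first appearance."""
    token_to_id = {}
    groups = {}
    for p in paths:
        token = group_token(p, start_index, num_chars)
        if token not in token_to_id:
            token_to_id[token] = len(token_to_id)
        groups[p] = token_to_id[token]
    return groups

paths = [
    "FoilHole_21256428_Data_21254194_21254195_20190622_0517_Fractions.mrc",
    "FoilHole_21256429_Data_21254194_21254195_20190622_0518_Fractions.mrc",  # made up
    "FoilHole_21256428_Data_21254200_21254201_20190622_0517_Fractions.mrc",  # made up
]

# In these names, the first number after "Data_" starts at character 23 and is 8 characters long
token = group_token(paths[0], 23, 8)
groups = assign_exposure_groups(paths, 23, 8)
print(token, sorted(groups.values()))
```

Here the first two files share the same token and therefore the same group, while the third file falls into a second group.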

You can also combine multiple exposure or particle datasets by connecting them all into the respective input slot in the Exposure Groups Utilities job. Using the combine&set mode and Set Exposure Group Value parameter, you can combine all connected datasets and set their exposure group to the same value. Note that when this happens, the job will check that the CTF values across the exposure group are consistent. You can decide what the job will do if it finds inconsistent values using the Combine Strategy parameter.

For advanced users: Another way to modify the exposure group ID for a dataset is to Export the job that created the dataset (or create a .csg file manually) (view data management tutorial) and modify the .cs file directly. You will have to modify the field ctf/exp_group_id (and mscope_params/exp_group_id for movie/micrograph datasets or location/exp_group_id for particle datasets) for all items inside the dataset. You can set these columns with the desired group identifiers, which do not need to be sequential but do need to be unique.

If your dataset does not have this result slot (which may be the case for jobs not processed by Patch CTF Estimation or imported prior to v2.12), you will first have to add the field, then modify it. See the Python (2.7) example below.

# import the modules
from cryosparc2_compute import dataset
from cryosparc2_compute import common

# path to the particle dataset; replace the placeholder with your own path
path_to_particle_dataset = '<path_to_particle_dataset>'

# load the dataset
particle_dataset = dataset.Dataset().from_file(path_to_particle_dataset)

# add missing fields (this example is for particle datasets)
particle_dataset = common.create_missing_fields_in_dataset(particle_dataset, 'ctf', 'particle.ctf')

# set the exposure group id of every particle in the dataset to 2
particle_dataset.data['ctf/exp_group_id'][:] = 2

You can then re-import this dataset using the Import Result Group job.
