Tutorial: CTF Refinement

This tutorial details the background, implementation, and use of CTF refinement in CryoSPARC v2.12+.

Overview

CTF Refinement includes two major components: local (per-particle) CTF refinement and global (per-group) CTF refinement. Local CTF Refinement adjusts each particle's defocus value to estimate the z-position of the particle in the sample/ice. Global CTF Refinement adjusts the higher-order CTF terms (beam-tilt, trefoil, spherical aberration, tetrafoil) across an entire group of images to find the optimum values, accounting for misalignment or aberrations in the microscope itself.

In CryoSPARC, both local and global CTF refinement can be performed standalone (using aligned particles and a reference volume as input) or they can be performed on-the-fly during a 3D refinement, so that the values are iteratively optimized along with particle alignments.

New: As of CryoSPARC v3.3+, Global CTF Refinement now supports the estimation and correction of anisotropic magnification present in the particle images.

Local CTF refinement (per-particle defocus)

This is a relatively straightforward optimization process of finding the optimal per-particle defocus for each particle in a dataset. Per-particle defocus refinement has been previously proposed and implemented in many other software packages for single particle EM (cisTEM, RELION, Thunder, etc.).

Local CTF Refinement in CryoSPARC requires aligned particle images and a 3D reference (two half-maps), ideally already at a high resolution. Experimental particle images are compared against the 3D reference from their half-set, from the best known pose, at various defocus levels, and the best defocus is selected. The optimal defocus ideally corresponds to the height of the particle in the sample/ice.
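The search described above can be pictured with a small toy sketch. This is not CryoSPARC's implementation: it assumes a 1D, defocus-only CTF (no spherical aberration or amplitude contrast), a flat reference spectrum, and a squared-error score standing in for the likelihood. All function names and default values are made up for illustration.

```python
import math

def ctf_1d(k, defocus_A, wavelength_A=0.0197):
    """Toy radial CTF with a defocus term only (wavelength is ~300 kV)."""
    return math.sin(math.pi * wavelength_A * defocus_A * k * k)

def score(observed, reference, freqs, defocus_A):
    """Negative squared error between the data and the CTF-modulated reference."""
    return -sum((o - ctf_1d(k, defocus_A) * r) ** 2
                for o, r, k in zip(observed, reference, freqs))

def refine_defocus(observed, reference, freqs, start_defocus_A,
                   search_range_A=2000.0, step_A=50.0):
    """Test defocus values around the starting value and keep the best one."""
    n_steps = int(search_range_A / step_A)
    candidates = [start_defocus_A + i * step_A
                  for i in range(-n_steps, n_steps + 1)]
    return max(candidates,
               key=lambda dz: score(observed, reference, freqs, dz))

# Synthetic particle: frequencies from 20 A to about 4 A resolution.
freqs = [0.05 + 0.002 * i for i in range(100)]
reference = [1.0] * len(freqs)
true_defocus = 15300.0
observed = [ctf_1d(k, true_defocus) * r for k, r in zip(freqs, reference)]

best = refine_defocus(observed, reference, freqs, start_defocus_A=15000.0)
print(best)
```

Restricting `freqs` to a band mirrors the idea of fitting only within a chosen resolution range; the real job works on 2D images and noisy data, so its score landscape is far less clean than this one.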

Since each particle can be at a different height and ice thicknesses can be 10 times larger than the particle diameter in many cases, per-particle defocus refinement can often make a large difference in the accuracy of CTF correction for each particle. However, it generally works best for larger, highly rigid, high-quality samples that already reach relatively high resolutions (better than 4 Å). In general, it is a good idea to try local CTF refinement on every dataset, and to use a homogeneous (gold-standard) refinement to check whether the overall resolution increased or decreased.

Run Local CTF Refinement

Create a Local CTF Refinement job using the job builder and connect particles from a previously-run refinement (the particles must have alignments3D defined). Also connect the refined volume from the same refinement job. You can optionally connect a separate mask input; otherwise, the mask_refine that is included in the volume input from the previous refinement will be used by default.

The most important parameters to adjust are:

  • Minimum fit res (A): controls the minimum resolution used for fitting. Generally CTF refinement should be done only with medium to high resolution signal, as low resolution signal can throw off CTF fits. For smaller particles, change this to a higher resolution.

  • Maximum fit res (A): controls the maximum resolution used for fitting. Higher resolution signal is better for CTF refinement, until there is too much noise present in the half-maps. Leave this blank to have the maximum resolution automatically determined via FSC between the two input half-maps.

  • Defocus search range: controls how far above and below the current defocus to search for the optimal defocus of each particle. If you used Patch CTF Estimation previously in cryoSPARC, this value can be made relatively small, about the same as the thickness of ice you expect to have in the sample, since the input defocus values will already be fairly accurate.

Once the job is run, several diagnostic plots will be created that show the progress of CTF refinement.

Plots of per-particle defocus error landscapes show the change in log-likelihood across the range of tested defocus values. The curves should look like those above, showing a clear minimum near 0 change in defocus. The X-axis is in units of Angstroms. The Y-axis is in natural-log units, so each change of 1 unit corresponds to a factor of 1/e ≈ 0.368 in probability. Therefore, plots with a minimum that is hundreds of units deep indicate that we are highly confident about the optimal defocus value. On the other hand, plots with very shallow minima (tens of units) indicate uncertainty in the optimal defocus.
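To make the Y-axis scale concrete, the short sketch below converts a depth in log units into a relative probability. The function name is made up for illustration, and it assumes the plotted values are natural-log likelihood differences, as the 1/e factor above implies.

```python
import math

def relative_probability(delta_log_likelihood):
    """Probability of a defocus value relative to the best one,
    given how many log units worse it scores."""
    return math.exp(-delta_log_likelihood)

for depth in (1, 10, 100):
    print(depth, relative_probability(depth))
```

A defocus value that scores 1 unit worse is about 0.368 times as probable as the best value, and the ratio shrinks exponentially as the minimum gets deeper.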

Histograms showing the change in per-particle defocus across all the particles in the half-set indicate the total amount of deviation from the input defocus parameters that was achieved by CTF refinement. The histogram should generally be sharply peaked near zero and should not have heavy tails. Heavy tails, or many particles with optimal defocus values at the ends of the search range, indicate that defocus refinement was not very confident or accurate.
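If you have the per-particle defocus changes available as a list of numbers, a quick check along these lines can flag the situation described above. This is only a sketch: the function name, the tolerance, and the example values are assumptions and not part of CryoSPARC.

```python
def fraction_at_search_limits(defocus_changes_A, search_range_A,
                              tolerance_A=1.0):
    """Fraction of particles whose defocus change sits at the edge
    of the search range (within the given tolerance)."""
    if not defocus_changes_A:
        return 0.0
    at_limit = sum(1 for d in defocus_changes_A
                   if abs(d) >= search_range_A - tolerance_A)
    return at_limit / float(len(defocus_changes_A))

# Made-up changes in Angstroms for eight particles, search range +/- 2000 A.
changes = [-30.0, 10.0, 0.0, 2000.0, -2000.0, 45.0, -12.0, 5.0]
frac = fraction_at_search_limits(changes, search_range_A=2000.0)
print(frac)
```

A large fraction suggests that the search range is too small or that the per-particle fits are not well determined.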

Global CTF Refinement (per-exposure-group beam-tilt, trefoil, spherical aberration, tetrafoil, and anisotropic magnification)

Ultra high resolution cryo-EM structures require correcting for electron-optical aberrations and microscope misalignments that result in nuanced "high-order" terms in the Contrast Transfer Function (CTF). These higher order terms (corresponding with beam tilt, trefoil, spherical aberration, tetrafoil) can only be detected at very high resolution, and cannot easily be estimated by straightforward measurements on the microscope. Therefore, the strength of each of these aberrations must be estimated from single particle data itself, by refining the corresponding CTF parameters against a high-resolution reference map. This process of high-order aberration estimation and correction was pioneered by (Zivanov et al. 2019) in RELION 3.1.

In addition to the misalignments that introduce higher-order terms into the CTF, microscopes occasionally show magnification anisotropy. The result of this anisotropy is that micrographs are slightly distorted by a linear transformation (or "stretch") in the image plane. Unlike higher-order aberrations, anisotropic magnification cannot be corrected by better microscope alignment, and must be estimated either from the diffraction pattern of known crystalline samples or by projection-matching against a high-quality reference map. As of CryoSPARC v3.3, the latter method of anisotropic magnification estimation and correction is supported, closely following the developments made by Zivanov et al. High-resolution signal is typically required to estimate any anisotropy, and unless the anisotropy is extreme, correcting for it will typically only improve maps that have already reached a fairly high resolution. Furthermore, errors in defocus and astigmatism due to magnification anisotropy are also corrected when fitting the magnification matrix.
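The "stretch" can be illustrated with a 2x2 matrix acting on image coordinates measured from the image center. The sketch below is purely illustrative; the function names and the 1% value are assumptions, and it says nothing about how CryoSPARC parameterizes the matrix internally.

```python
def apply_magnification(matrix, x, y):
    """Apply a 2x2 magnification matrix to a coordinate (x, y)."""
    (a, b), (c, d) = matrix
    return a * x + b * y, c * x + d * y

def displacement(matrix, x, y):
    """How far the point (x, y) moves under the distortion."""
    new_x, new_y = apply_magnification(matrix, x, y)
    return new_x - x, new_y - y

# Hypothetical anisotropy: 1% stretch along x, none along y.
stretch = ((1.01, 0.0), (0.0, 1.0))

print(displacement(stretch, 100.0, 50.0))   # moves about 1 pixel in x
print(displacement(stretch, 200.0, 50.0))   # moves about 2 pixels in x
```

Because the distortion is linear, the displacement grows in proportion to the distance from the center, which is why the effect matters most at the edges of an image and at high resolution.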

CryoSPARC v2.12+ contains a GPU accelerated implementation of high-order aberration estimation and correction. In all cases, estimation is done by directly maximizing the likelihood of observing the experimental images given a 3D reference map, using LBFGS.

Images collected on a given microscope generally will have related CTF parameters for higher-order aberrations and anisotropic magnification. The images that are related (same grid, same image shift position, etc.) can be grouped into "exposure groups" so that they can all be refined at once, with more signal. Creation and management of exposure groups is explained in the next section.

Like local CTF refinement, Global CTF Refinement generally works best with larger, more rigid particles. However, Global CTF Refinement does use signal from all the particles in an exposure group, and so can detect beam tilt and other aberrations even with smaller/flexible structures.

Run Global CTF Refinement

Create a Global CTF Refinement job using the job builder and connect particles from a previously-run refinement (the particles must have alignments3D defined). Also connect the refined volume from the same refinement job. You can optionally connect a separate mask input; otherwise, the mask_refine that is included in the volume input from the previous refinement will be used by default.

The most important parameters to adjust are:

  • Number of iterations: controls the number of iterations of CTF refinement that are done. It is important that the number of iterations is at least 2 when anisotropic magnification is being estimated together with the other aberrations. This is to allow the aberrations to be fit to the data while accounting for anisotropic magnification. If only aberrations are being estimated, 1 iteration is usually sufficient.

  • Minimum fit res (A): controls the minimum resolution used for fitting. Generally global CTF refinement should be done only with medium to high resolution signal, as low resolution signal can be unreliable. For smaller particles, change this to a higher resolution.

  • Maximum fit res (A): controls the maximum resolution used for fitting. Higher resolution signal is better for CTF refinement, until there is too much noise present in the half-maps. Leave this blank to have the maximum resolution automatically determined via FSC between the two input half-maps.

  • Fit Tilt/Trefoil/Spherical Aberration/Tetrafoil: selects which higher-order aberrations should be refined. Tilt and Trefoil are 3rd order and require less high-resolution signal to detect accurately, compared to Spherical Aberration and Tetrafoil, which are 4th order. In some cases, optimizing the 4th order terms can be detrimental, especially if per-particle defocus or the 3rd order terms are not yet correctly refined. Note: as of v4.0, only the third-order aberrations (Tilt and Trefoil) are fit by default; the fourth-order aberrations (Spherical Aberration and Tetrafoil) are not.

  • Fit Anisotropic Magnification: Activate to enable the estimation of anisotropic magnification. Note that this is inactive by default, since significant anisotropic magnification is a relatively rare phenomenon compared to beam tilt and other aberrations.

Once the job is run, several diagnostic plots will indicate the phase delay and fit diagnostics of each type of aberration.

For each order of aberration (odd and even), three plots are made. The first shows the phase error data that is measured from all the particles in aggregate. On the left is the full phase error; on the right are the masked-out terms that will be used for fitting. For the odd terms, the aberrations appear as anti-symmetric patterns of delay (blue) and advancement (red) of the diffracted beam. The second plot shows the fitted (predicted) values of the phase delay after refining CTF parameters. The third plot shows the residual phase error between the data and the fit; for a good fit, it should contain only noise.

Similar plots are made for the even terms in the CTF. Note that odd terms are optimized from zero each time the Global CTF Refinement job is run, meaning that the plots will always show aberrations in the measured data (first plot). Even terms, on the other hand, are optimized starting from their current input values. Therefore if Global CTF Refinement is run twice, the second time, the even terms will show nearly zero aberration in the measured data (since the input CTF parameters are already nearly correct).
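The odd/even distinction refers to how a phase term behaves when the spatial frequency vector is flipped from k to -k: odd terms change sign (anti-symmetric), even terms do not. The toy sketch below separates the two parts of a made-up phase function. The coefficients and names are arbitrary and are not CryoSPARC's parameterization; trefoil and tetrafoil are left out for brevity.

```python
def split_even_odd(phase, kx, ky):
    """Return the symmetric and anti-symmetric parts of phase at (kx, ky)."""
    plus = phase(kx, ky)
    minus = phase(-kx, -ky)
    return 0.5 * (plus + minus), 0.5 * (plus - minus)

def toy_phase(kx, ky):
    """Made-up phase error with one odd and two even contributions."""
    k2 = kx * kx + ky * ky
    defocus_like = 3.0 * k2           # 2nd order in k: even
    spherical_like = 0.5 * k2 * k2    # 4th order in k: even
    tilt_like = 2.0 * kx * k2         # 3rd order in k: odd
    return defocus_like + spherical_like + tilt_like

even_part, odd_part = split_even_odd(toy_phase, 0.1, 0.2)
print(even_part, odd_part)
```

Only the tilt-like term survives in `odd_part`, which is why odd and even aberrations can be measured and plotted separately.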

Note that in the output log of Global CTF refinement the units of each aberration parameter are printed. Beam-tilt is internally parameterized in Angstroms rather than radians, as converting to the latter requires a non-zero spherical aberration coefficient. Values in milli-radians are printed in cases where the spherical aberration is non-zero.

Anisotropic Magnification

If magnification correction is enabled, three plots are also made, akin to the aberration plots. The first plot shows the predicted displacement per pixel based on an unconstrained least-squares fit, which should typically show a linear trend. The absence of a trend indicates that the anisotropy is not significant, and a non-linear trend could indicate that the anisotropy is severe (so that multiple iterations are necessary to converge), or that there are other systematic effects in the data. The second plot shows the fitted displacement values based on the current estimate of the magnification matrix, and the third plot shows the residual (i.e. the unconstrained displacements minus the linear fit). Note that two sets of plots are made, showing the displacements in the x direction and the y direction separately.
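The relationship between the three plots (measured displacements, linear fit, residual) can be sketched with an ordinary least-squares fit of one displacement component. This is a simplified illustration under stated assumptions: the function name and synthetic data are made up, the fit has no offset term, and CryoSPARC's actual fitting procedure may differ.

```python
def fit_linear_displacement(points, displacements):
    """Least-squares fit of d = a*x + b*y (no offset). Returns (a, b)."""
    sxx = sum(x * x for x, y in points)
    sxy = sum(x * y for x, y in points)
    syy = sum(y * y for x, y in points)
    sxd = sum(x * d for (x, y), d in zip(points, displacements))
    syd = sum(y * d for (x, y), d in zip(points, displacements))
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("points do not span two directions")
    a = (sxd * syy - syd * sxy) / det
    b = (syd * sxx - sxd * sxy) / det
    return a, b

# Synthetic x-displacements on a 5x5 grid centered on the origin.
points = [(float(x), float(y)) for x in range(-2, 3) for y in range(-2, 3)]
dx = [0.01 * x + 0.002 * y for x, y in points]

a, b = fit_linear_displacement(points, dx)
residuals = [d - (a * x + b * y) for (x, y), d in zip(points, dx)]
print(a, b, max(abs(r) for r in residuals))
```

With purely linear input, the residuals vanish. In real data, a residual that still shows structure corresponds to the non-linear trend or other systematic effects mentioned above.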

Similar to the even aberrations, anisotropic magnification is optimized from its current values at the start of the iteration. This is done because unlike the refinement of odd and even aberrations, the refinement of anisotropic magnification involves approximations to the log likelihood objective function, and this approximation improves as the magnification matrix converges. As well, all high-order CTF parameters are fit to the current values of the magnification matrix. For these reasons, it is recommended to perform at least 2 iterations of CTF refinement when fitting aberrations together with anisotropic magnification. After two iterations, the residual anisotropy should typically be very small.

Above is an example of the anisotropic magnification plot from the first iteration, for EMPIAR-10395. On the left are the displacement plots for the Y coordinate from the first iteration, indicating that there is a significant linear trend in the predicted displacements at each voxel. On the right are the plots from the final iteration, showing a residual with no linear trend, hence no fit. The absence of a fit in the final iteration plot indicates that the anisotropic magnification matrix has converged. This job was run with three iterations.

Ewald Sphere Correction

Both Local and Global CTF Refinement may be run with Ewald Sphere correction enabled. This means that estimation of per-particle defocus and high-order aberration parameters can be done while accounting for the curvature of the Ewald Sphere. Generally, this does not significantly impact the outcome unless previous reconstructions have shown that Ewald Sphere correction results in a measurable resolution increase.

To use this feature in either Local or Global CTF Refinement jobs, activate the Account for EWS curvature parameter and make sure to set the EWS curvature sign to the correct value of curvature determined from previous reconstructions with Ewald Sphere enabled. For more information on how to obtain these reconstructions and the curvature sign, please refer to the Ewald Sphere Correction tutorial for a detailed walkthrough.

On-the-fly CTF refinement in homogeneous refinement

In CryoSPARC v2.12+, both Local CTF Refinement and Global CTF Refinement can be run as standalone jobs. However, since the refinement of these parameters is very fast, they can also be run on-the-fly during iterations of Homogeneous Refinement. In the new Homogeneous Refinement job in v2.12+, there are new parameters to enable local and/or global CTF refinement. CTF refinement is carried out iteratively with refinement of 3D poses and the 3D map, starting once the initial refinement is converged.

The new Homogeneous Refinement job in v2.12+ will create plots similar to the standalone CTF refinement jobs, and the final CTF parameters after refinement will be outputted along with the 3D alignments of particles.

Non-uniform refinement with high-order CTF correction

In CryoSPARC v2.12+, the Non-uniform Refinement job has been updated to use the new GPU code that supports higher-order CTF correction, but this is NOT enabled by default. You must turn on the Enable higher-order CTF parameter in Non-uniform refinement. Please also note that legacy refinement jobs will not support the correction of high-order CTF aberrations or anisotropic magnification.

On-the-fly CTF refinement cannot be done during a Non-uniform Refinement, so particles should be processed through the standalone Local CTF Refinement then Global CTF Refinement jobs first.

Exposure Groups

In CryoSPARC v2.12+, particles, movies, and micrographs are organized into "Exposure Groups", which allow images that share a microscope configuration (beam tilt, image shift, etc.) to be CTF-refined together, independently of other groups, in a streamlined manner. This section describes the tools in CryoSPARC to create and manage exposure groups.

At Import time

When you import a dataset (movies, micrographs or particles) in CryoSPARC v2.12+, the set of imported data is automatically assigned a new "exposure group ID". This ID is unique within a project (the group ID increments with each import job, starting from zero) unless overridden using the Override Exposure Group ID parameter. Using this method, you can import your datasets separately based on their beam tilt groups, or any other grouping you would like to use, and the grouping of imports will be retained even if the datasets are merged later on in processing.

Using the Exposure Groups utilities

If you have a dataset that was imported prior to v2.12, or a dataset that contains multiple exposure groups and you would like to separate each of the groups in the dataset, you can use the Exposure Group Utilities Job. This job allows you to view, split, and combine datasets into one or more exposure groups.

To split a dataset into exposure groups, you can select which file path attribute of the dataset will be used to identify unique groups. For example, in EPU, when capturing multiple images per hole, each shift position should be treated as a separate group. The groups can be identified by the first section of numbers right after the word "Data" in the file path, as outlined below:

FoilHole_21256428_Data_21254194_21254195_20190622_0517_Fractions.mrc

Knowing this, we can separate our exposure (or particle) dataset into unique exposure groups. Input your data into the Exposure Group Utilities job, and select the split mode. Use the parameters Field to use to split Dataset, Start Slice Index, and Number of characters to Consider (more information here) to create unique tokens out of the file paths available. The job automatically creates and sets exposure groups for these tokens:

You can choose to output each exposure group separately by using the Split Outputs by Exposure Group parameter.
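The token-based splitting can be pictured with plain string slicing. The sketch below is an assumption about the general idea, not the job's actual code: the function names are made up, it slices the file name rather than the full path, and the second and third file names are invented for illustration.

```python
import os

def group_token(path, start_index, num_chars):
    """Slice a fixed-width token out of the file name."""
    name = os.path.basename(path)
    return name[start_index:start_index + num_chars]

def assign_group_ids(paths, start_index, num_chars):
    """Give each distinct token an integer ID, in order of first appearance."""
    token_to_id = {}
    group_ids = []
    for path in paths:
        token = group_token(path, start_index, num_chars)
        if token not in token_to_id:
            token_to_id[token] = len(token_to_id)
        group_ids.append(token_to_id[token])
    return group_ids, token_to_id

example = "FoilHole_21256428_Data_21254194_21254195_20190622_0517_Fractions.mrc"
start = example.index("Data_") + len("Data_")  # 23 for this naming scheme

# The second and third names are made up for illustration.
paths = [
    example,
    "FoilHole_21256428_Data_21254201_21254202_20190622_0518_Fractions.mrc",
    "FoilHole_21256430_Data_21254194_21254195_20190622_0519_Fractions.mrc",
]
group_ids, token_to_id = assign_group_ids(paths, start, 8)
print(group_token(example, start, 8), group_ids)
```

Here the first and third files share the token after "Data" and therefore land in the same group, even though they come from different foil holes.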

You can also combine multiple exposure or particle datasets by connecting them all into the respective input slot in the Exposure Groups Utilities job. Using the combine&set mode and Set Exposure Group Value parameter, you can combine all connected datasets and set their exposure group to the same value. Note that when this happens, the job will check that the CTF values across the exposure group are consistent; you can decide what the job will do if it finds inconsistent values using the Combine Strategy parameter.
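A consistency check of this kind can be sketched as follows. This is an illustration only: the function name is made up, the rows are plain dictionaries rather than a CryoSPARC dataset, and the field names ctf/cs_mm and ctf/accel_kv and all values are assumptions chosen for the example. Which values the job actually compares is not specified here.

```python
def inconsistent_fields(rows, fields):
    """Return the fields that take more than one value across the rows."""
    return [f for f in fields
            if len(set(row[f] for row in rows)) > 1]

# Made-up rows from three datasets that are about to be combined.
rows = [
    {"ctf/exp_group_id": 0, "ctf/cs_mm": 2.7, "ctf/accel_kv": 300.0},
    {"ctf/exp_group_id": 1, "ctf/cs_mm": 2.7, "ctf/accel_kv": 300.0},
    {"ctf/exp_group_id": 2, "ctf/cs_mm": 0.01, "ctf/accel_kv": 300.0},
]

bad = inconsistent_fields(rows, ["ctf/cs_mm", "ctf/accel_kv"])
print(bad)

if not bad:
    # Only assign a common group when the checked values agree.
    for row in rows:
        row["ctf/exp_group_id"] = 5
```

In this example the spherical aberration values disagree, so the rows are left in their original groups.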

For advanced users

Another way to modify the exposure group ID for a dataset is to Export the job that created the dataset (or create a .csg file manually) (view data management tutorial) and modify the .cs file directly. You will have to modify the field ctf/exp_group_id (and mscope_params/exp_group_id for movie/micrograph datasets or location/exp_group_id for particle datasets) for all items inside the dataset. You can set these columns with the desired group identifiers, which do not need to be sequential but do need to be unique.

If your dataset does not have this result slot (which may be the case for jobs not processed by Patch CTF Estimation or imported prior to v2.12), you will have to first add the field, then modify the fields. See the python (2.7) example below.

# Import the CryoSPARC dataset modules
from cryosparc_compute import dataset
from cryosparc_compute import common

# Path to the exported particle .cs file (replace with your own path)
path_to_particle_dataset = '<path_to_particle_dataset>'

# Load the dataset
particle_dataset = dataset.Dataset.load(path_to_particle_dataset)

# Add missing fields (this example is for particle datasets)
particle_dataset = common.create_missing_fields_in_dataset(particle_dataset, 'ctf', 'particle.ctf')

# Set the exposure group ID for every item in the dataset
particle_dataset['ctf/exp_group_id'][:] = 2

You can then re-import this dataset using the Import Result Group job.
