Microscopy Kinds, Techniques and Relative Performances

My own take on the different kinds of optical microscopy techniques is that they almost all fall into the following categories:

  1. Intensity imaging methods which suffer from a divergent noise contribution from out-of-focus information, in exactly the same way as the night sky should be uniformly bright given an infinite universe, as described by Olbers' Paradox. These methods thus lack depth sectioning capability and can only be used with thinly sliced sections to forestall noise buildup from out-of-focus scatterers. This category is basically bright and dark field microscopy and can image a whole field of view at once (rather than by scanning, as in category 2);
  2. Intensity imaging methods which either structure the illumination light or make use of nonlinear processes to control the noise buildup that happens in category 1. This category includes confocal and multiphoton imaging and all such methods must build images up through scanning, one pixel at a time;
  3. Interferometric methods where the phase delay through a thinly sliced specimen is converted to intensity variations by interferometry: this is the classic Zernike Phase Contrast Imaging technique;
  4. The sensationally named “superresolution” methods, which combine optical information with further information about the sample, such that the further information allows one to locate a point source more finely than the diffraction limit of light would otherwise allow. These methods are often wrongly described as “beating” or “outdoing” or “overcoming” the diffraction limit; they do nothing of the kind and must always use further information for the final location of a point source. The most developed of these methods are STED, STORM and structured illumination (SIM). SIM uses periodically modulated illumination to downconvert high spatial frequency information in images to the spatial frequency band where this information can be imaged by a diffraction limited microscope. This is exactly the same mechanism that begets Moiré fringe patterns.

Category 1: Brightfield, Darkfield, Microscope Slides, Pathology Labs and Olbers' Paradox

My drawing below tries to explain the signal-to-noise calculation for microscopy with uniform sample lighting. I first put this question to myself many years ago as: “Why can’t I put my hand under the objective of a powerful microscope, light my hand strongly (with, say, a fibre bundle lightsource) and see the cells in my hand by bringing the microscope into focus on my skin?”

Figure 1: Signal to Noise Calculation for a Thick Sample with Brightfield Microscopy

I’ve drawn the illumination light field in blue and the light scattered from an object in green (most often I design systems for fluorescence), but the argument works just as well in reflexion mode. The main point is that we have our collector lens system imaging the object onto, say, a CCD array and the aberration free field is such that the photon coupling amplitude from scatterer to CCD pixel varies as $\exp(i\,k\,R)/(1+ (i R/R_0))$, where $R$ is the distance of the object from the focus of the collector system. Here I’ve drawn the object we want to see at the focus. But there are also out-of-focus scatterers lit just as brightly as the object, and the formula $\exp(i\,k\,R)/(1+ (i R/R_0))$ means that, in the far field ($R \gg R_0$), each scatterer contributes a power proportional to $1/R^2$ to the pixel. So far so good: it seems the sensitivity of the instrument to out-of-focus information drops off very swiftly with distance from focus. But not swiftly enough, for we must sum up the noise contribution from the whole out-of-focus volume. If we assume scatterers (noise objects) are roughly equally responsive and are uniformly distributed in the large, then we can do this summation in spherical shells as shown. Each scatterer in the spherical shell a distance $R$ from the focus contributes noise power proportional to $1/R^2$, but each spherical shell's volume varies like $R^2\,\delta R$, so that each spherical shell of thickness $\delta R$, no matter how far from focus, contributes a roughly equal noise power. Therefore, if the sample is infinitely thick, the noise power diverges. This is exactly the reason why one would expect a uniformly bright night sky in an infinitely long-lived, infinitely wide universe, as in Olbers' Paradox.
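
The shell summation above can be checked numerically. The following is only a minimal sketch under stated assumptions: scatterers are taken as uniformly distributed, all constants are dropped, and the per-scatterer power is taken as $1/(1+(R/R_0)^2)$, the squared modulus of the coupling amplitude. The function name `brightfield_shell_noise` and the default `r0 = 1` are illustrative choices, not part of the argument.

```python
import math

def brightfield_shell_noise(r_max, r0=1.0, steps=100_000):
    """Sum the noise power from spherical shells out to radius r_max.

    Assumptions (illustrative only): uniformly distributed scatterers, each
    coupling to the pixel with power 1 / (1 + (R/r0)^2), constants dropped.
    """
    dr = r_max / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr                      # midpoint of this shell
        coupling = 1.0 / (1.0 + (r / r0) ** 2)  # per-scatterer power, ~1/R^2 far out
        total += r * r * coupling * dr          # shell volume scales as R^2 dR
    return total

# Total noise for ever thicker samples (thickness in units of r0).
for thickness in (10.0, 100.0, 1000.0):
    print(thickness, brightfield_shell_noise(thickness))
```

The printed totals grow roughly in proportion to the sample thickness, with no sign of levelling off, which is the divergence described above.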

One cannot see the cells in one's own hand with a brightfield microscope because the out-of-focus noise levels are divergent.

This is precisely the reason we must prepare microscope slides in a pathology laboratory: the divergent noise figure is controlled by physically “gating” the out-of-focus information: we simply slash it off with a knife!

Category 2: Confocal and Multiphoton Imaging: Controlling the Out-Of-Focus Noise

My drawing below repeats this calculation for confocal microscopy.


Figure 2: Signal to Noise Calculation for Confocal Microscopy

Here we structure the lighting by focusing it one pixel at a time on the point we want to image. A system I have worked with lights the sample with the image of the tip of a single mode optical fibre: this begets a convergent lightfield in the sample as shown in my drawing by the blue lightfield. The fluorescence (or reflected light) is also gathered by the single mode of the same optical fibre. The fibre’s mode works as a sending and receiving antenna, with its directivity and gain reciprocally related as for any other antenna. The upshot of all this is that, not only is there a $\exp(i\,k\,R)/(1+ (i R/R_0))$ coupling amplitude for light returning to the optical fibre’s single mode, there is another $\exp(i\,k\,R)/(1+ (i R/R_0))$ excitation amplitude as well because the illumination field is focussed. Therefore, the probability that a noise scatterer is raised to a fluorescing state varies like $1/R^2$ and, given that this happens, the probability that the photon couples back into the fibre also varies like $1/R^2$. This means that the noise power contributed by a spherical shell of radius $R$ centred on the focus and of thickness $\delta R$ is proportional to $R^2 \times R^{-2}\times R^{-2} \delta R = R^{-2} \delta R$. Therefore, the noise gathered from an infinite medium is now a finite number and the object tissue does not have to be sliced to be imaged by a confocal microscope.
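
A minimal numerical sketch of this confocal shell sum, under the same illustrative assumptions as the text (uniform scatterers, constants dropped, and both the excitation and the collection probabilities taken as $1/(1+(R/R_0)^2)$). The name `confocal_shell_noise` and the default `r0 = 1` are hypothetical.

```python
import math

def confocal_shell_noise(r_max, r0=1.0, steps=100_000):
    """Sum confocal noise power from spherical shells out to radius r_max.

    Assumptions (illustrative only): uniformly distributed scatterers, with
    excitation and collection probabilities each 1 / (1 + (R/r0)^2).
    """
    dr = r_max / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr                       # midpoint of this shell
        one_pass = 1.0 / (1.0 + (r / r0) ** 2)   # ~1/R^2 far from focus
        total += r * r * one_pass * one_pass * dr  # R^2 * R^-2 * R^-2 dR
    return total

# The total settles towards a finite limit as the sample gets thicker.
for thickness in (10.0, 100.0, 1000.0):
    print(thickness, confocal_shell_noise(thickness))
print("limit for r0 = 1:", math.pi / 4)
```

With these assumptions the integral has the closed-form limit $\pi R_0^3/4$, so thickening the sample from 100 to 1000 focal radii adds under one percent to the noise.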


Figure 3: Signal to Noise Calculation for Two-Photon Microscopy

Another way to achieve the same noise probabilities is through multiphoton excitation. Here, a fluorophore is raised into an extremely short-lived virtual state and then is raised again to the first excited state by a second excitation photon. So the probability for this to happen is proportional to the square of the intensity, not simply the intensity as for one-photon fluorescence. So now the probability that a noise object will be raised to fluorescence varies like $1/R^4$ and, even if we gather all the fluorescence and take no steps to localize it (by focussing it through a pinhole or single mode fibre tip), the total noise converges. Often in two-photon imaging one does just that: the return path does not have to be an imaging system at all to form the image. All we need to do is register the fact of fluorescence, and we know where it is coming from since we know where the illumination system is focussed. If we work in this so-called non-descanned mode, then each out-of-focus spherical shell contributes noise proportional to $R^{-2} \delta R$ as for the confocal microscope; if we image the fluorescence as we did for the confocal microscope, we get even better noise rejection, for the noise from the same shell now varies like $R^{-4} \delta R$. For $N$-photon fluorescence with imaged detection, we can likewise achieve $R^{-(2\,N)} \delta R$ noise performance.
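
The far-field bookkeeping for all of these cases can be gathered into one small sketch. The function names and the three switches below are hypothetical, introduced only to tabulate the power-law exponents quoted in the text; the convergence test is the usual one for $\int^\infty R^p\,dR$.

```python
def shell_noise_exponent(n_photons=1, focused_illumination=True, imaged_detection=True):
    """Exponent p such that a far-field shell contributes noise ~ R^p dR."""
    exponent = 2                      # shell volume ~ R^2 dR
    if focused_illumination:
        exponent -= 2 * n_photons     # N-photon excitation probability ~ R^(-2N)
    if imaged_detection:
        exponent -= 2                 # collection through pinhole/fibre ~ R^(-2)
    return exponent

def noise_converges(exponent):
    """The integral of R^p dR out to infinity is finite only if p < -1."""
    return exponent < -1

cases = {
    "brightfield": shell_noise_exponent(1, focused_illumination=False),
    "confocal": shell_noise_exponent(1),
    "two-photon, non-descanned": shell_noise_exponent(2, imaged_detection=False),
    "two-photon, imaged": shell_noise_exponent(2),
}
for name, p in cases.items():
    print(f"{name}: R^{p} dR, converges: {noise_converges(p)}")
```

Only the brightfield case has exponent 0 and so fails to converge; every focused-illumination case listed has exponent $-2$ or lower.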

Below is a comparison between the “tightness of localization” achieved by one and two photon fluorescence. The top trail shows the $1/(R_0^2+R^2)$ fluorescence probability for one photon fluorescence: the bottom dot shown by the arrow shows the $1/(R_0^2+R^2)^2$ two-photon fluorescence probability.
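
As a rough numerical companion to this comparison, the sketch below evaluates the two profiles, normalised to 1 at the focus (so a constant factor $R_0^2$ or $R_0^4$ is absorbed). The function names and the choice `r0 = 1` are illustrative assumptions.

```python
import math

def excitation_probability(r, r0=1.0, n_photons=1):
    """Focus-normalised N-photon excitation probability, (r0^2 / (r0^2 + r^2))^N."""
    return (r0 * r0 / (r0 * r0 + r * r)) ** n_photons

def half_max_radius(r0=1.0, n_photons=1):
    """Radius at which the excitation probability drops to half its focal value."""
    return r0 * math.sqrt(2.0 ** (1.0 / n_photons) - 1.0)

for n in (1, 2):
    print(
        f"{n}-photon: half-maximum radius = {half_max_radius(n_photons=n):.4f} r0, "
        f"probability at 10 r0 = {excitation_probability(10.0, n_photons=n):.2e}"
    )
```

The half-maximum radius shrinks only modestly for two-photon excitation; the much larger difference is in the tails, where the two-photon probability is the square of the one-photon value, and that is what turns the long trail into a speck.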


Figure 4: Comparison of Confinement of One-Photon and Two-Photon Fluorescence. (Top) One-photon fluorescence shows long trail through fluorophore; (Bottom) Two-photon fluorescence is tightly confined to a speck region marked by the arrow.

Category 4: Superresolution

See the Wikipedia Page “Super Resolution Microscopy”

Superresolution is typified by techniques such as stimulated emission depletion microscopy, ground state depletion microscopy and like ideas pioneered for example by Stefan Hell. This is often sensationally described as “breaking the diffraction limit of light” but this is misleading. The essential idea is that some further information is used to localise where a detected photon has come from and this further information is what gets one below the diffraction limit. For example, STED microscopy uses a high powered “fore-pulse” focussed as a first order (one with a null at its centre and whose intensity varies like $r^2\,e^{-\alpha\,r^2}$) Gaussian mode on excited fluorophores to relax all those fluorophores aside from those at the very centre and within a distance that is much below the diffraction limit. Then the system reads the fluorescence arising from a second pulse following very soon (nanoseconds) after the first and with the same focus, but this time in the zeroth order Gaussian (the one that varies like $e^{-\alpha\,r^2}$) mode. Now only fluorophores well within a radius that is well below the diffraction limit from the central focal point can contribute to this second reading, and this knowledge lets one infer to within tens of nanometres where the light has come from.
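
To see how the two mode profiles combine, here is a toy sketch. It assumes, purely for illustration, that the fraction of fluorophores surviving the first pulse is $\exp(-s\,r^2 e^{-\alpha r^2})$, where the dimensionless depletion strength `s` is a hypothetical parameter and the exponential depletion law is an assumption not stated in the text. The contributing region is then the zeroth order profile multiplied by this survival fraction.

```python
import math

def effective_profile(r, alpha=1.0, s=0.0):
    """Relative chance that a fluorophore at radius r contributes to the reading."""
    readout = math.exp(-alpha * r * r)         # zeroth order Gaussian mode
    donut = r * r * math.exp(-alpha * r * r)   # first order mode, null at r = 0
    survival = math.exp(-s * donut)            # ASSUMED exponential depletion law
    return readout * survival

def fwhm(alpha=1.0, s=0.0, dr=1e-4, r_max=10.0):
    """Full width at half maximum, from the first half-maximum crossing outward from r = 0."""
    r = 0.0
    while r < r_max:
        if effective_profile(r, alpha, s) < 0.5:
            return 2.0 * r
        r += dr
    raise ValueError("profile never fell below half maximum")

for strength in (0.0, 10.0, 100.0):
    print(f"s = {strength}: FWHM = {fwhm(s=strength):.4f}")
```

Under this assumed model the contributing region narrows roughly like $1/\sqrt{1+s/\alpha}$, while the optics forming each beam remain diffraction limited. The narrowing comes from knowing which fluorophores were switched off.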

Another technique called STORM (also on the “Super Resolution Microscopy” Wikipedia Page) uses advanced fluorophore chemistry and nonlinear interaction to switch fluorophores on and off quickly so that only the fluorophores near the focus are excited at once. With so few fluorophores lit, the out-of-focus noise is almost nonexistent and so the signal is clean enough that the point spread function can be deconvolved by Fourier techniques from the image. Depending on the signal received, the deconvolution achieves a resolution length proportional to $1/\sqrt{N}$, where $N$ is the number of photons gathered. One can get down to tens of nanometres resolution in optimal conditions with this technique; however, the sample has to be meticulously prepared and sectioned.
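
The $1/\sqrt{N}$ scaling can be illustrated with a small Monte Carlo sketch. Note the simplifications, which are assumptions of this example only: the point spread function is taken as a one-dimensional Gaussian of width `psf_sigma`, there is no background, and a plain centroid estimate stands in for the deconvolution step described above.

```python
import random
import statistics

def localisation_spread(n_photons, psf_sigma=1.0, trials=2000, seed=0):
    """Spread (standard deviation) of the estimated emitter position.

    Each trial draws n_photons arrival positions from a Gaussian PSF centred
    on the true emitter position (0) and estimates that position as the mean.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        photons = [rng.gauss(0.0, psf_sigma) for _ in range(n_photons)]
        estimates.append(sum(photons) / n_photons)
    return statistics.pstdev(estimates)

for n in (25, 100, 400):
    print(f"N = {n}: spread = {localisation_spread(n):.4f} (PSF widths)")
```

Quadrupling the photon count should roughly halve the spread, so a single emitter can be located far more finely than the width of the point spread function, provided one knows it is a single emitter.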

Other techniques include modulation of the optical image to achieve the spatial analogue of synchronous downconversion (the same phenomenon that yields Moiré patterns) of the high spatial frequency content of an image down to the spatial frequency range where it can be imaged by conventional, diffraction-limited light microscopy.
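
A one-dimensional sketch of this downconversion follows. All the numbers are hypothetical: a fluorophore density rippling at 130 cycles per unit length, an illumination fringe at 90, and an imaginary diffraction cutoff at 100. The emitted light is modelled simply as density times illumination, and the amplitude at the difference frequency of 40 is read off by direct Fourier projection.

```python
import math

F_SAMPLE = 130   # sample ripple, above the assumed cutoff (hypothetical value)
F_ILLUM = 90     # illumination fringe, below the assumed cutoff (hypothetical value)
CUTOFF = 100     # assumed diffraction cutoff of the imaging system

def emission(x, f_sample, f_illum=None):
    """Emitted intensity: fluorophore density times illumination intensity."""
    density = 1.0 + math.cos(2 * math.pi * f_sample * x)
    illumination = 1.0 if f_illum is None else 1.0 + math.cos(2 * math.pi * f_illum * x)
    return density * illumination

def amplitude_at(freq, f_sample, f_illum=None, n=4096):
    """Amplitude of the component at `freq`, by projection over one unit length."""
    re = im = 0.0
    for i in range(n):
        x = i / n
        e = emission(x, f_sample, f_illum)
        re += e * math.cos(2 * math.pi * freq * x)
        im += e * math.sin(2 * math.pi * freq * x)
    return 2.0 * math.hypot(re, im) / n

beat = F_SAMPLE - F_ILLUM   # Moiré (difference) frequency, inside the passband
assert beat < CUTOFF < F_SAMPLE
print("uniform lighting, amplitude at beat:   ", amplitude_at(beat, F_SAMPLE))
print("structured lighting, amplitude at beat:", amplitude_at(beat, F_SAMPLE, F_ILLUM))
```

With uniform lighting nothing appears at the beat frequency, so the 130-cycle ripple is simply lost beyond the cutoff. With the fringe illumination a component appears at 40 cycles, inside the passband, and since the fringe frequency is known the original ripple frequency can be inferred from it.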