Sometimes pithy photographic explanations, although valid and meaningful for the cognoscenti, can need some unpacking for most folk. Consider that Ansel Adams and others have used whole books, with charts, tables, graphs, and examples, just to say “expose for the shadows, develop for the highlights.” Thus it is — possibly, anyway — with photon noise.
This post is about the root cause of photon noise in digital captures. It ignores Pixel Response Non-Uniformity (PRNU), and read noise, pattern and otherwise. In many photographic situations, photon noise is the main source of image noise.
Most of you probably know everything I’m about to say, but some of you may not. I’m going to try to say it with a minimum of math and technobabble, but I’ve included enough for the experts to see some details. If you don’t understand something, just keep reading; you probably don’t have to understand it to get my point.
Imagine that you have a 24 by 36 mm monochromatic sensor. Let’s say the fill factor is 100%, and there are no microlenses. The quantum efficiency of the sensor for D55 light is 50%. Let’s mount that sensor in a camera, and put a perfect lens on the camera. This lens is so perfect that there is no diffraction. Let’s put perfect electronics in our camera that allow us to count every photoelectron with zero read noise. Now let’s focus our perfect lens to infinity and aim it at a perfect point source of D55 light placed one focal length away from the lens. Thus the light landing on the sensor is perfectly collimated. Let’s set the lens to f/8. Now let’s say that the pixel pitch of our sensor is 10 µm. That means that we have a 2400 by 3600 pixel sensor, or 8.64 megapixels. Then let’s adjust the intensity of our light source so that 691.2 billion photons per second fall on the sensor. We’re going to leave the light source at that level for the rest of this thought experiment. That works out to 80,000 photons per pixel per second, so, because of the 50% quantum efficiency of the sensor, each pixel in our camera counts 40,000 electrons per second on average.
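The arithmetic in this setup is easy to check. Here is a minimal Python sketch that uses only the numbers given above; the variable names are mine.

```python
# Sanity check of the thought-experiment numbers. All values come from the
# text; the variable names are illustrative only.
sensor_height_mm, sensor_width_mm = 24, 36
pitch_um = 10
photons_per_second = 691.2e9      # photons falling on the whole sensor
quantum_efficiency = 0.5          # photoelectrons per incident photon

rows = sensor_height_mm * 1000 // pitch_um
cols = sensor_width_mm * 1000 // pitch_um
pixels = rows * cols

photons_per_pixel = photons_per_second / pixels
electrons_per_pixel = photons_per_pixel * quantum_efficiency

print(f"{rows} x {cols} pixels = {pixels / 1e6:.2f} MP")
print(f"{photons_per_pixel:.0f} photons/pixel/s, "
      f"{electrons_per_pixel:.0f} electrons/pixel/s")
```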
Image A: Let’s set the shutter in our camera to one second, and take a picture. The average electron count in our picture is 40,000, and the standard deviation is the square root of that, or 200. The signal-to-noise ratio (SNR) is 40,000 over 200, or 200. The spectrum of the noise is white: all frequencies are equally represented.
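If you would like to see those statistics come out of a simulation, here is a rough NumPy sketch. It assumes each pixel is an independent Poisson draw with a mean of 40,000 electrons, and it uses a smaller frame than 2400×3600 to keep the run short. The seed and frame size are arbitrary choices of mine. The neighbor-correlation figure is only a quick proxy for whiteness, not a full spectral analysis.

```python
import numpy as np

rng = np.random.default_rng(seed=0)       # arbitrary seed, for repeatability
mean_electrons = 40_000                   # from the text: Image A, 1 s exposure

# Reduced frame size (assumption): the per-pixel statistics do not depend on it.
image_a = rng.poisson(lam=mean_electrons, size=(600, 900))

mean = image_a.mean()
std = image_a.std()
snr = mean / std

# Correlation between horizontally adjacent pixels. White noise has none,
# so this should be close to zero.
neighbor_corr = np.corrcoef(image_a[:, :-1].ravel(),
                            image_a[:, 1:].ravel())[0, 1]

print(f"mean={mean:.1f}  std={std:.1f}  SNR={snr:.1f}  "
      f"neighbor correlation={neighbor_corr:+.4f}")
```

The mean should land close to 40,000, the standard deviation close to 200, and the SNR close to 200, with small run-to-run differences if you change the seed.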
Image B: Now let us mentally reconfigure our sensor so that it has the same resolution in pixels (2400 by 3600), but a pixel pitch of 5 µm. The physical size of our sensor is now 12 by 18 mm, and, because it’s smaller, the number of photons falling on our sensor in one second is ¼ the number that fell on our larger sensor, or 172.8 billion per second. Thus, on average, only 10,000 photoelectrons are produced in each pixel, and the standard deviation is the square root of that, or 100. The signal-to-noise ratio is 10,000 divided by 100, or 100. The spectrum of the noise is white.
Image C: Let’s make a four-second exposure with our small physical size sensor. The average electron count in our picture is 40,000, and the standard deviation is the square root of that, or 200. The signal-to-noise ratio is 40,000 over 200, or 200. The spectrum of the noise is white: all frequencies are equally represented. There is no way to tell this image from Image A, the one made with a one-second exposure and a physically larger sensor, by looking at its statistics or its spatial frequency content. On average, the same number of photons fell on each pixel of both sensors, and that’s all that matters.
Image D: Now let’s reconfigure the 24 by 36 mm sensor so that the pitch is 5 µm (it’s now a 4800×7200 pixel sensor) and make a one-second exposure. On average, 10,000 photoelectrons are produced in each pixel, and the standard deviation is the square root of that, or 100. The signal-to-noise ratio is 10,000 divided by 100, or 100. The spectrum of the noise is white. The image looks just like four of Image B set side by side.
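The four captures so far differ only in pixel area and exposure time, so their numbers can all be generated from one small function. This is a sketch under the assumptions already stated in the thought experiment (uniform illumination, 50% quantum efficiency, pure Poisson statistics). The function name `photon_stats` and the flux-per-area constant are my own framing, derived from the 691.2 billion photons per second falling on 24 by 36 mm.

```python
import math

# Photons per square millimeter per second, derived from the text's
# 691.2e9 photons/s over a 24 x 36 mm sensor.
FLUX_PER_MM2_S = 691.2e9 / (24 * 36)
QE = 0.5  # quantum efficiency from the text


def photon_stats(pitch_um, exposure_s):
    """Return (mean electrons, standard deviation, SNR) for one pixel."""
    pixel_area_mm2 = (pitch_um / 1000) ** 2
    mean_e = FLUX_PER_MM2_S * pixel_area_mm2 * exposure_s * QE
    std_e = math.sqrt(mean_e)          # Poisson: variance equals the mean
    return mean_e, std_e, mean_e / std_e


# (pixel pitch in µm, exposure in seconds) for each image described so far
captures = {"A": (10, 1), "B": (5, 1), "C": (5, 4), "D": (5, 1)}

for name, (pitch_um, exposure_s) in captures.items():
    mean_e, std_e, snr = photon_stats(pitch_um, exposure_s)
    print(f"Image {name}: mean={mean_e:.0f} e-, std={std_e:.0f} e-, SNR={snr:.0f}")
```

Sensor dimensions never appear in `photon_stats`. Only the photons collected per pixel matter, which is why A and C come out the same, and so do B and D.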
Image E: Let’s take Image D, and downsample it to 2400×3600 by adding the electron count of each pixel in an odd-numbered row and column (assuming the indices start with one) to the values of the pixels to its immediate right, directly below it, and diagonally below and to the right of it. For those skilled in the art of image processing, this amounts to convolving the image with a 2×2 box filter, resampling using nearest neighbor, and trimming the result. The average electron count in our picture is 40,000, and the standard deviation is the square root of that, or 200. The signal-to-noise ratio is 40,000 over 200, or 200. The spectrum of the noise is white: all frequencies are equally represented. There is no way to tell this image from Image A by looking at its statistics or its spatial frequency content. On average, the same number of photons were used to create each pixel of Image A and Image E, and that’s all that matters.
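For those who would rather see the binning than read about it, here is a minimal NumPy sketch. It assumes Image D can be modeled as independent Poisson draws with a mean of 10,000 electrons per pixel, and it uses a smaller frame than 4800×7200 to keep the run short. The function name `bin_2x2_sum` is mine.

```python
import numpy as np


def bin_2x2_sum(img):
    """Sum each non-overlapping 2x2 block of pixels into one output pixel."""
    rows, cols = img.shape
    # Split each axis into (block index, position within block), then sum
    # over the two within-block axes.
    return img.reshape(rows // 2, 2, cols // 2, 2).sum(axis=(1, 3))


rng = np.random.default_rng(seed=1)       # arbitrary seed
# Stand-in for Image D at reduced size (assumption): 10,000 e- mean per pixel.
image_d = rng.poisson(lam=10_000, size=(1200, 1800))
image_e = bin_2x2_sum(image_d)

for name, img in (("D", image_d), ("E", image_e)):
    mean, std = img.mean(), img.std()
    print(f"Image {name}: shape={img.shape}  mean={mean:.1f}  "
          f"std={std:.1f}  SNR={mean / std:.1f}")
```

The binned image should show a mean near 40,000 and a standard deviation near 200, the same statistics as Image A, because the sum of four independent Poisson counts is itself a Poisson count with four times the mean.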
Image F: Let’s downsample Image D to 2400×3600 by nearest neighbor. The image is indistinguishable from Image B, both in statistics and spatial frequency content. Each pixel saw on average one-quarter as many photons as those in Image A, so its SNR is half that of Image A.
Image G: Let’s downsample Image D to 2400×3600 by some other method: bilinear interpolation, bicubic interpolation, Lanczos, or something else. The standard deviation of the resultant image, and thus the SNR, will depend on the algorithm used. The spatial frequency content of the resultant image will also depend on the algorithm used.
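To make the dependence on the algorithm concrete, here is a NumPy sketch comparing three ways of halving the resolution of a simulated Image D: nearest neighbor, a 2×2 average, and a separable [¼, ½, ¼] tent filter followed by decimation. The tent filter is my own stand-in for "some other method". It is not meant to reproduce the bilinear, bicubic, or Lanczos resizer of any particular program. The frame size, seed, and function names are also my assumptions.

```python
import numpy as np


def nearest_downsample(img):
    """Keep every second pixel in each direction."""
    return img[::2, ::2]


def box_average_downsample(img):
    """Average each non-overlapping 2x2 block."""
    rows, cols = img.shape
    return img.reshape(rows // 2, 2, cols // 2, 2).mean(axis=(1, 3))


def tent_downsample(img):
    """Blur with a separable [1/4, 1/2, 1/4] kernel, then keep every
    second pixel. Border pixels without full support are trimmed."""
    img = img.astype(np.float64)
    h = 0.25 * img[:, :-2] + 0.5 * img[:, 1:-1] + 0.25 * img[:, 2:]
    v = 0.25 * h[:-2, :] + 0.5 * h[1:-1, :] + 0.25 * h[2:, :]
    return v[::2, ::2]


def snr_and_neighbor_corr(img):
    """Return (mean/std, correlation between horizontal neighbors)."""
    corr = np.corrcoef(img[:, :-1].ravel(), img[:, 1:].ravel())[0, 1]
    return img.mean() / img.std(), corr


rng = np.random.default_rng(seed=2)       # arbitrary seed
image_d = rng.poisson(lam=10_000, size=(1200, 1800))  # reduced-size stand-in

results = {}
for name, fn in (("nearest", nearest_downsample),
                 ("2x2 average", box_average_downsample),
                 ("tent filter", tent_downsample)):
    results[name] = snr_and_neighbor_corr(fn(image_d))
    snr, corr = results[name]
    print(f"{name:12s} SNR={snr:6.1f}  neighbor correlation={corr:+.3f}")
```

Under these assumptions, nearest neighbor should give an SNR near 100 and the 2×2 average an SNR near 200, both with essentially uncorrelated neighbors. The tent filter should come out higher still, around 267, because each output pixel draws on nine input pixels. Its neighbor correlation should be near 1/6, because adjacent output pixels share input pixels, so the noise is no longer white.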
The constant throughout all this is that the number of photons counted determines the noise and the SNR. Downsizing from a similarly sized and illuminated sensor with a finer pitch can replicate images captured at larger pitches only if a particular downsizing algorithm is used: the pixel summing of Image E, which is not usually used in photography.
Eric Fossum, inventor of the CMOS image sensor, summed it up succinctly: “The only way to increase SNR in counting things that are described by Poisson statistics is to increase the number of things that are counted. Increasing area at constant flux or increasing time at constant flux are two ways to do that. Increasing the flux for constant time and area also works.”