Today we have a guest poster, Jack Hogan. Over on the DPR forum, a question has been asked, and argued endlessly: when faced with a 16-stop intra-scene dynamic range, what’s the dynamic range of an image captured with a 14-bit camera? Jack responded with a little Chautauqua on how a camera works that I thought deserved some more web ink. I have edited Jack’s words for clarity. Any errors are likely mine.
Take it away, Jack:
1. We are interested in scene DR as it is projected onto the sensing plane of our camera, where arriving photons are collected within a rectangular area, typically (but not necessarily) divided into smaller, squarish portions (pixels), and converted to photoelectrons in order to be counted and recorded in a file. The total number of e- is the same regardless of the number of pixels within the sensing area, so clearly the more pixels, the fewer e- per pixel. We call the number of photoelectrons so collected the 'signal'. It is an 'analog' signal, independent of bit depth.
2. Photons arrive, and are converted to photoelectrons, with random timing, so the signal is never perfectly clean; it is always somewhat noisy. The inherent SNR of the signal is well defined and equal to the square root of its count in photoelectrons. We call this randomness shot noise.
3. The field of view, the size of the sensor, and other sensor characteristics (pixel size and shape) are all needed to define what we normally call scene DR.
4. Pixels are typically square and of arbitrary dimensions: we could make them 1 nm^2, 1 mm^2, the size of the entire sensing area, or whatever.
5. However, pixels do have finite physical characteristics and therefore cannot record an infinite DR (nor can they practically be made as small or as large as we want). Current pixels can collect at most about 3,000 photoelectrons per micron^2. For a current 24MP full frame camera with a 6 micron pixel pitch, this translates into about 75,000 e-, after which pixels 'fill up' and top out (in photography we say they 'clip' or saturate). For a pixel of twice the area, the saturation count would approximately double to 150k e-; of course, in that case the camera's 'resolution' would be halved to 12MP.
6. Designers choose camera characteristics (including pixel size) and photographers choose Exposure so that the brightest desirable highlights found in typical photographic situations do not exceed the saturation count.
7. Therefore, by the definition of eDR*, the largest scene dynamic range that the 24MP FF camera in 5) can record is 16.2 stops = log2(75,000:1). This number depends on pixel size and is independent of bit depth (see the sketch after this list).
8. Is the figure in 7) the dynamic range of the natural scene? No; that is potentially much larger. It is the scene dynamic range as referenced to that specific camera, with its pixel size and sensor characteristics. Scene dynamic range as viewed through the sensor of another camera, with other pixel sizes and characteristics, would be different. For instance, the 12MP camera in 5) could record the same scene with an eDR* of up to 17.2 stops = log2(150,000:1).
9. While collecting and digitizing the converted photoelectrons, the camera electronics add some noise to the signal, which we typically model as if it were all added at the moment the collected photoelectrons roll off the pixels. We call this read noise. For modern DSCs it tends to be in the 2 to 50 e- range.
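To make the arithmetic in items 2, 7, and 8 concrete, here is a minimal Python sketch using only the photoelectron counts quoted above (the 10,000 e- signal in the first example is an arbitrary illustrative value, not a figure from the argument):

```python
import math

# A quick numeric check of items 2, 7, and 8 above.

def shot_noise_snr(signal_e):
    """Item 2: the inherent SNR of a photoelectron count is its square root."""
    return math.sqrt(signal_e)

print(shot_noise_snr(10_000))  # 100.0 -- a 10,000 e- signal has SNR 100

def edr_stops(full_well_e):
    """Item 7: eDR in stops, with the shot-noise-limited floor of 1 e-."""
    return math.log2(full_well_e)

print(edr_stops(75_000))   # ~16.2 stops: the 24MP camera with 6 micron pixels
print(edr_stops(150_000))  # ~17.2 stops: the 12MP camera with twice the pixel area
```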
Note that so far we have spoken only about the undigitized 'analog' signal: we have not yet decided the bit depth at which to digitize it.
Now say you have captured that 16-stop canyon scene. What procedure shall we use to prepare it for printing at 8×12 with maximum dynamic range and no visible loss of detail? We need to know the minimum number of pixels required for the average human to resolve all of the detail in the 8×12 print when viewed at standard distance. Let's say that it is 8MP.
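For what it's worth, that figure is consistent with the common rule of thumb of roughly 300 ppi for a print viewed at standard distance; the 300 ppi value is an assumption of this sketch, not a number from the argument above:

```python
# Rough check of the 8MP figure, assuming ~300 ppi (an assumed rule of
# thumb for prints viewed at standard distance).
ppi = 300
width_in, height_in = 12, 8
pixels = (width_in * ppi) * (height_in * ppi)
print(pixels / 1e6)  # ~8.6 MP, consistent with "let's say 8MP"
```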
What would the scene eDR of your canyon be as seen through the pixel size of a camera of the same format as yours but with 8MP resolution?
Say you captured the canyon with the 8MP camera. If you had used a 16MP camera of the same format instead, pixel area would be halved and you would have too many pixels for your 8×12. But is the scene DR information captured by both cameras roughly the same for your purposes? Yes, because both sensors sampled the same overall area. Sure, the 16MP camera will have recorded more spatial resolution, but as far as the number of e- counted and their inherent noise are concerned, the two are virtually indistinguishable when viewed at the same size.
Here is an example, with signal and shot-noise SNR (see 2 above), of how it would work at the same exposure with everything equivalent other than pixel size. The half-sized pixels would clearly see only half the photons arrive (a quick simulation after the list makes the same point):
- Information from the average pixel in the 8MP recording: average signal 100 e-, SNR 10
- Information from the average pixel in the 16MP recording: average signal 50 e-, SNR sqrt(50) ≈ 7
- Information from the average pixel of the 16MP recording binned 2:1 down to 8MP: average signal 50+50 = 100 e-, SNR sqrt(7^2+7^2) ≈ 10. Same as if the pixels were twice as big.
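If you'd like to verify the binning arithmetic numerically, here is a small Monte Carlo sketch, using the counts from the example above and modeling shot noise as Poisson:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000  # number of pixel samples per camera

# 8MP camera: big pixels averaging 100 e- each; shot noise is Poisson.
big = rng.poisson(100, n)

# 16MP camera: half-area pixels averaging 50 e- each, then binned 2:1.
binned = rng.poisson(50, (n, 2)).sum(axis=1)

for name, x in [("8MP pixels", big), ("binned 16MP pixels", binned)]:
    print(f"{name}: mean {x.mean():.1f} e-, SNR {x.mean() / x.std():.1f}")
# Both report a mean of ~100 e- and an SNR of ~10.
```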
Does it make any difference whether we use the data from the 8MP recording or from the binned 2:1 16MP recording as far as the eDR of the print in the specified viewing conditions is concerned? Not really; the recorded scene DR information is effectively the same under those viewing conditions. Does it make a difference to the observer viewing the 8×12 print? Not really.
Note that we have not decided on bit depth yet. [If you are interested in how a camera can capture information whose amplitude is below the least-significant bit of the analog-to-digital converter, take a look here.]
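The gist of that link, as a rough sketch: noise present before quantization acts as dither, so a signal smaller than one LSB still leaves its trace in the digitized data and can be recovered by averaging. The 0.3 LSB signal and 1 LSB noise below are illustrative values, not figures from the post:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

signal = 0.3       # a steady signal of 0.3 LSB, below one ADC step
noise_sigma = 1.0  # ~1 LSB of noise acting as natural dither

# Without noise, quantization destroys the sub-LSB signal entirely:
print(np.round(np.full(n, signal)).mean())  # 0.0

# With noise added before quantization, averaging recovers it:
print(np.round(signal + rng.normal(0, noise_sigma, n)).mean())  # ~0.3
```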
* This is how DxO defines engineering DR for their purposes (it's a fairly well-accepted definition):
"Dynamic range is defined as the ratio between the highest and lowest gray luminance a sensor can capture. However, the lowest gray luminance makes sense only if it is not drowned by noise, thus this lower boundary is defined as the gray luminance for which the SNR is larger than 1."
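Under that definition, the noise floor is the signal at which shot noise and read noise together bring the SNR down to 1. Here is a sketch of the computation under that definition; the 5 e- read noise in the last line is an illustrative value, not a figure from the text:

```python
import math

def edr_stops(full_well_e, read_noise_e):
    """eDR per the definition above: full well over the signal where SNR = 1."""
    # SNR = s / sqrt(s + r^2); setting SNR = 1 gives s^2 - s - r^2 = 0,
    # whose positive root is the noise floor in e-:
    floor_e = (1 + math.sqrt(1 + 4 * read_noise_e**2)) / 2
    return math.log2(full_well_e / floor_e)

print(edr_stops(75_000, 0))  # ~16.2 stops: the shot-noise-limited case of item 7
print(edr_stops(75_000, 5))  # ~13.7 stops with a hypothetical 5 e- read noise
```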