About a year ago I wrote a post on how read and quantizing noise interact in a digital camera. I concluded that, when the standard deviation of Gaussian read noise exceeded one-half the least-significant bit (LSB), the read noise provided sufficient dither that further increases in ADC precision would offer no increase in average digitizing accuracy.
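That conclusion is straightforward to check numerically. Here is a minimal Matlab sketch (an illustration, not the original script), assuming an ideal mid-tread quantizer with a 1-LSB step and no clipping:

```matlab
% Estimate the bias of the mean of a noisy, quantized signal as the read
% noise grows.  Assumes an ideal mid-tread quantizer, 1-LSB step, no clipping.
trueLevel = 10.3;                    % arbitrary non-integer level, in LSBs
sigmas    = 0:0.05:1;                % read-noise sigma, in LSBs
bias      = zeros(size(sigmas));
for k = 1:numel(sigmas)
    x = trueLevel + sigmas(k) * randn(1e6, 1);  % add Gaussian read noise
    q = round(x);                               % quantize to the nearest LSB
    bias(k) = mean(q) - trueLevel;              % residual error of the average
end
plot(sigmas, abs(bias));
xlabel('read noise, LSB'); ylabel('|error of the mean|, LSB');
```

The residual error of the average should fall to essentially zero by the time the noise reaches about half an LSB, which is the behavior behind that conclusion.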
In the ensuing time, some have proposed that greater amounts of read noise are necessary to avoid deleterious visual effects; one common statement is that 1.3 LSBs of noise are needed. Some have offered reasons why my simple simulation gave over-optimistic results:
- The human eye-brain system averages noise over only a few image values. Put another way, the circle of confusion is smaller than the essentially infinite one that I assumed.
- The operation of demosaicing generates chroma noise that remains significant even with more than half an LSB of dither.
- Converting to a gamma of 2.2 or so emphasizes behavior near zero, and exposes visual errors that would otherwise go unnoticed.
I decided to test all those assertions. I wrote a Matlab script to make an image from a simulation in which the average level increases from top to bottom and the noise from left to right. Here is the result, for a demosaiced image with an output gamma of 2.2, 2 bits of precision, and noise progressing linearly from 0 LSBs to 1 LSB from left to right.
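In outline, that kind of test image can be generated along these lines (a simplified, monochrome sketch with assumed parameters, not the actual script, which also simulates a Bayer CFA and demosaicing):

```matlab
% Mean level rises from top to bottom, read-noise sigma rises from 0 to
% 1 LSB from left to right; quantize to 2 bits, clip, gamma-2.2 encode.
rows = 512; cols = 512;
bits = 2; fullScale = 2^bits - 1;
level = linspace(0.25, 0.75, rows)' * fullScale;   % assumed gray ramp, in LSBs
sigma = linspace(0, 1, cols);                      % noise sigma, in LSBs
img   = repmat(level, 1, cols) + repmat(sigma, rows, 1) .* randn(rows, cols);
img   = min(max(round(img), 0), fullScale);        % quantize and clip
imshow((img / fullScale) .^ (1 / 2.2));            % display with gamma 2.2
```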
You can see the effects of the dither in reducing posterization. 0.5 LSB is in the center of the image, and it looks like the posterization is just about completely gone by then.
What if we look at read noise levels from zero to 2 LSBs:
We can see that the posterization is gone about a quarter of the way into the image, but that the average levels continue to shift as more noise is added. That’s because adding more noise increases the number of clipped pixels, since values below zero are represented as zero, and values above full scale are represented as full scale.
To minimize this effect, I changed the precision to 3 bits, lifted the lower level of the average gray to 3 LSBs, and dropped the upper level of the average gray to 5 LSBs. With a maximum added noise of 1.5 LSBs, that means clipping occurs about a tenth of the time at the maximum-noise (right-hand) side of the image.
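As a quick check on that clipping estimate (assuming Gaussian noise and a 3-bit full scale of 7 LSBs):

```matlab
% Clipping rates at the noisiest column, sigma = 1.5 LSB.
sigma = 1.5; fullScale = 7;
pLow  = 0.5 * erfc((3 - 0) / (sigma * sqrt(2)));          % mean of 3 LSB clipping at zero
pHigh = 0.5 * erfc((fullScale - 5) / (sigma * sqrt(2)));  % mean of 5 LSB clipping at full scale
% pLow comes out near 0.02 and pHigh near 0.09, i.e. about a tenth of the
% pixels clip at the bright end of the right-hand edge.
```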
Here’s the result, with the read noise varying from zero to 1.5 LSBs:
It’s a little hard to see what’s going on. Dropping the gamma to one helps:
Now we can see that 0.5 LSB, or about one third of the way from left to right, seems to be adequate to reduce posterization. If you look hard at the lightest tones, you could convince yourself that it takes almost three-quarters of an LSB of noise to completely smooth them out.
As Jack Hogan pointed out when he saw these images, in a real camera there would be photon noise in addition to the read noise. That would reduce the amount of read noise you need for adequate dithering in the brighter tones.
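To put a rough number on that, read noise and shot noise add in quadrature; the sketch below assumes, purely for illustration, a gain of 1 LSB per electron, so the shot noise in LSBs is the square root of the signal in LSBs:

```matlab
% Total dither available = read noise plus photon (shot) noise, in quadrature.
readNoise  = 0.3;                      % read noise, LSB
signal     = 0:0.5:20;                 % mean signal, LSB (= electrons at unity gain)
shotNoise  = sqrt(signal);             % Poisson shot noise, LSB
totalNoise = sqrt(readNoise^2 + shotNoise.^2);
plot(signal, totalNoise);
xlabel('signal, LSB'); ylabel('total noise, LSB');
% Total noise exceeds 0.5 LSB after only a fraction of an LSB of exposure,
% so the brighter tones are largely self-dithered.
```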
CarVac says
The real tough part of posterization is when a gradient is colored and you get hue shifts as the color channels step across bins individually.
Can you redo this test with a color gradient, like maybe 0.6 0.8 1.0 relative values to the channels?
You may need the vertical axis to cross more quantization boundaries so you can easily see both the intended average color and the local hue, though.
CarVac says
I did the above experiment myself with the colored gradient in Octave and came up with the same result as the monochromatic test in the article: when the standard deviation of the noise is equal to half the quantization step, that’s good enough to eliminate banding.
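For readers who want to try this themselves, a colored-gradient test along the lines CarVac describes might look something like this in Matlab/Octave (an illustrative sketch, not CarVac's actual script):

```matlab
% Colored gradient with per-channel relative levels 0.6, 0.8, 1.0; vertical
% ramp, noise sigma growing from 0 to 1 LSB left to right, 3-bit quantization.
rows = 512; cols = 512; bits = 3; fullScale = 2^bits - 1;
relLevels = [0.6 0.8 1.0];                        % R, G, B relative values
ramp  = linspace(0.3, 0.7, rows)' * fullScale;    % assumed vertical ramp, LSB
sigma = repmat(linspace(0, 1, cols), rows, 1);    % noise sigma, LSB
out = zeros(rows, cols, 3);
for c = 1:3
    chan = repmat(ramp * relLevels(c), 1, cols) + sigma .* randn(rows, cols);
    chan = min(max(round(chan), 0), fullScale);   % quantize and clip
    out(:, :, c) = (chan / fullScale) .^ (1/2.2); % gamma-2.2 encoding
end
imshow(out);
```

Hue shifts appear where one channel has crossed a quantization boundary and the others have not; with the noise standard deviation near half the quantization step they dither away along with the luminance banding, consistent with what CarVac reports above.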
Jim says
I did it, too, and got the same answer. The images are interesting, though. I’ll post them.
Jim
Toh says
Hi Jim – This is the best demonstration of noise dither I’ve seen with modern camera relevance, thank you!
I’m trying to get intuition on the tradeoffs here between spatial resolution and dynamic range and whether it depends on the situation at hand. Am I correct in believing that:
– If one has a sufficiently high resolution sensor with no optical limits, it could have just 1 stop of DR (i.e. on/off), and with enough noise dither one could reconstruct a regular-DR image from it with some digital low-pass filtering (though not very efficiently, as it’d take a thousand pixels to produce 10 stops). In this example, Gaussian noise with 1 stdev = 1/2 max value would reasonably provide coverage. Would the ‘ideal’ noise function+amount be one where a signal value of X translates into a probability of being quantized as ‘1’ of X?
– However, there must be times where we value spatial resolution more than DR. For example, if we know that there’s a clear 0.25->0.75 border, noise dithering creates uncertainty around where the border is?
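On the first of those points: with dither that is uniform over one LSB, a constant level X (as a fraction of full scale) is indeed turned into a ‘1’ with probability exactly X, so a plain average recovers it; with Gaussian dither the mapping is an error function rather than exactly linear. A small illustrative sketch (not from the original post):

```matlab
% 1-bit "sensor" with uniform dither: a constant level x in [0, 1] maps to
% a '1' with probability exactly x, so averaging many pixels recovers x.
x = 0.37;                                   % true level, fraction of full scale
n = 10000;                                  % number of pixels averaged
hits = (x + (rand(n, 1) - 0.5)) > 0.5;      % add uniform dither, 1-bit threshold
estimate = mean(hits);                      % crude low-pass filter: plain average
% estimate lands near 0.37, with a standard error of about sqrt(x*(1-x)/n).
```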
spider-mario says
Regarding your first point: from what I understand, in the audio world, DSD essentially works like that (it’s a 2.82 MHz signal at one bit per sample, as opposed to the 44.1 kHz @ 16 bits used by CDs).
https://en.wikipedia.org/wiki/Direct_Stream_Digital
JimK says
These DSD-like schemes are a form of delta modulation, which has been around in one form or another since the 1940s. Their closest counterpart in cameras, to my knowledge, is photon-counting regimes.
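For readers unfamiliar with the term, a bare-bones delta modulator looks something like the sketch below (a toy illustration; actual DSD uses noise-shaped sigma-delta modulation, which is considerably more sophisticated):

```matlab
% Plain delta modulation: the encoder sends one bit per sample saying only
% whether the input is above or below its running estimate; the decoder
% integrates those bits back into an approximation of the waveform.
fs = 44100; t = (0:2999) / fs;
x  = 0.8 * sin(2 * pi * 440 * t);          % test tone
step = 0.08; est = 0;
bits = zeros(size(x)); recon = zeros(size(x));
for k = 1:numel(x)
    bits(k) = 2 * (x(k) >= est) - 1;       % 1-bit decision: +1 or -1
    est = est + step * bits(k);            % encoder tracks the input
    recon(k) = est;                        % decoder performs the same integration
end
plot(t, x, t, recon);
```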