Convolution filtering and read noise

A reader commented on the previous post, and posted this link to a web page where he analyzed the spatial aspects of the read noise on a Sony NEX-6. He contends, with excellent justification, that the construction of the sensor on that camera, and sensors with similar column-parallel ADCs, creates more low-frequency read noise in the row direction than the column direction (when the camera is in landscape orientation).

I thought I would adapt his convolution-based analysis techniques to my Nikon D810 ISO 12800 dark-field image, and see if that proved more illuminating than the frequency-domain approach of the last post.

I constructed three families of convolution kernels: nxn (square), 1xn (row-filtering) and nx1 (column-filtering). I passed them in various sizes over the green channel of the raw D810 test image, then measured the standard deviation of the resultant images:

d910RNISO12800
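
Here’s a minimal Matlab sketch of that measurement; the kernel sizes, variable names, and plot labels are mine, not the code that produced the figure above, and ‘green’ is assumed to hold one raw green channel as doubles.

```matlab
% Filter the green channel with square, row, and column averaging kernels
% of increasing size, and record the standard deviation of each result.
kernelSizes = [2 4 8 16 32 64 128 256];
sdSquare = zeros(size(kernelSizes));
sdRow    = zeros(size(kernelSizes));
sdCol    = zeros(size(kernelSizes));
for i = 1:numel(kernelSizes)
    n = kernelSizes(i);
    sdSquare(i) = std2(imfilter(green, ones(n)   / n^2, 'symmetric')); % n x n
    sdRow(i)    = std2(imfilter(green, ones(1,n) / n,   'symmetric')); % 1 x n
    sdCol(i)    = std2(imfilter(green, ones(n,1) / n,   'symmetric')); % n x 1
end
loglog(kernelSizes, [sdSquare; sdRow; sdCol]);
legend('Square', '1xn', 'nx1');
```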

You can see that there’s different behavior in the images filtered by the two one-dimensional kernels: that’s called anisotropy. The results are similar to what Ilya discovered with the NEX-6, but with two exceptions. First, the line labeled “Vertical” has a slight upward curve to it at high kernel sizes, indicating that the noise isn’t white in that direction. Second, the Vertical and Horizontal lines diverge more slowly than in the NEX-6 results (which, to be fair, used a somewhat different methodology), indicating lower DAC noise with the D810 than the NEX-6.

Since I was doing all this filtering anyway, I thought I’d have the program write out the square-kernel-filtered images. What I saw surprised me. Here are some samples, all normalized so each maximum is full scale, and all encoded with a gamma of 2.2.

At a kernel size of 2, random hot pixels keep us from seeing much of anything:

D810ISO12800lp2

With a kernel of 11×11, the columns show much more variation than the rows, making DAC noise look like it’s the culprit:

D810ISO12800lp11

With a kernel size of 36, a mottling is added to the mix, and it’s not all in one direction:

D810ISO12800lp36

With a kernel size of 152, there is the crosshatching we saw in the previous post, and some areas of high noise are apparent across the bottom, on the left and, to a lesser extent, on the right side:

D810ISO12800lp152

With a 434×434 kernel, the low-frequency systemic noise features are quite visible:

D810ISO12800lp434

More of the same at 1233×1233. By the way, the apparent motion of the features is due to the mirroring used near the edges of the image when the kernel needs more data than the image can provide.

D810ISO12800lp1233

What does all this mean to my quest for finding a metric for the visual effects of read noise? It means it’s a harder problem than I thought. It also means that there may be an opportunity to filter out some of the low-frequency read noise components by subtracting a low-pass filtered image, or, equivalently, by varying the black point across the sensor field.

In search of a read noise ugliness metric

Beauty is in the mind of the beholder, so ’tis said. I guess ugliness must be as well. Maybe I’m on a fool’s mission, but I’d like to figure out a way to mathematically calculate the visual effect of read noise.

If you look at the histogram of dark-field noise for a digital camera, providing the firmware that creates the raw image hasn’t cut off the bottom, you’ll see something that looks approximately Gaussian, with a bit of over-representation in the upper regions.

However, it is widely believed that read noise is more damaging to image quality than a similar amount of Gaussian noise, because read noise usually forms a pattern with some regularity. The viewer detects that pattern, and it is more distracting than a similar amount of Gaussian noise.

That’s the theory, anyway. I thought I’d explore some of the details, and maybe come up with some kind of numerical description of how visually distracting read noise patterning is. Foreshadowing: I haven’t gotten there yet. The journey is the destination, though; join me on the journey.

Here’s the histogram of the central 512×512 pixels in one of the green channels of a Nikon D810 dark-field image made at 1/8000 second, ISO 12800, with high ISO noise reduction set to high:

d810df histo

The black point on this camera is 600, and the histogram is truncated a couple of hundred counts to the left of that. There is at least one pixel that is about 3000.

Here’s what the central 512×512 section looks like, with the black point subtracted out and scaled up in amplitude by a factor of 1000:

darfieldnoisx1000

It looks like there is some kind of pattern, doesn’t it? The eye is really good at picking out patterns. Maybe too good; it can make them up as well. Here’s a similar image of Gaussian noise — I haven’t matched the statistics:

rand-noise1000

Can you see mountains and plains? I can.

Many patterns are periodic. One way to find periodicities in images is to look at the Fourier transform. Here’s the magnitude of the Fourier transform of the 512×512 crop from the Nikon D810 dark field image:

fftwhole
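
For reference, the transform itself takes only a couple of lines of Matlab; ‘crop’ is my placeholder for the 512×512 green-channel crop with the black point subtracted, and the log scaling for display is an assumption, not necessarily how the figure above was scaled.

```matlab
% Magnitude of the 2-D FFT of the 512x512 dark-field crop, shifted so the
% origin (dc) is at the center, with log scaling so the small terms show up.
spectrum = fftshift(abs(fft2(crop)));
imshow(log(1 + spectrum), []);
```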

There’s a vertical line. Let’s see what the origin looks like:

fftzoome

Except for the vertical line, I can’t see much. Let’s see some of the numbers:

fftzoomnumbers

Not real illuminating. What if we average the FFT image using a 3×3 kernel, then take a close look at the origin?

fft3x3avgs

It looks like there are peaks on the vertical frequency axis just away from the origin.

If we look at the frequency distribution of the image spectrum averaged in each direction with 8-pixel-wide buckets, we get this:

spectrumreadnoise

Except for the dc (zero frequency) component of the horizontal spectrum (that’s our white line), it looks like the dark field is white noise.

Let’s go back to the image in the space domain, and get rid of most of the high frequency information by passing a 25×25 pixel averaging kernel over it, then applying gain of 1000:

darfieldnoisxlp-25-1000
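
In Matlab, that step might look something like this; ‘dark’ and ‘blackPoint’ are placeholder names, and the display scaling is an assumption.

```matlab
% 25x25 box filter to remove most of the high-frequency content,
% then a gain of 1000 and a gamma of 2.2 for display.
h  = ones(25) / 25^2;
lp = max(imfilter(double(dark) - blackPoint, h, 'symmetric'), 0);
imshow((1000 * lp / 65535) .^ (1 / 2.2));   % assumes data scaled to 16 bits
```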

Now we can see some lumpiness and crosshatch features. What happens if we look at the spectrum of that image?

spectrumreadnoiselp25

Are we looking at a pattern, or just the lowpass filter response of the 25×25 averaging? Here’s the spectrum of similarly-filtered Gaussian noise:

spectrumrandnoiselp25

Hmm… We can’t see much of a pattern in the unfiltered read noise, though there are hints. We can see a pattern if we lowpass filter the dark field image. And so far, I don’t have a way to assign a number to the patterning.

Stay tuned.

Color space conversion accuracy — summary

Unless something comes up, I’m done with the color space conversion accuracy work. I’ll use this post to summarize what I’ve found over the last couple of weeks and link to the posts with the details.

The first conclusion is that, using 16-bit precision, for all colors within the gamut of both the source and destination color space (two restrictions that apply throughout this post), conversion among model-based RGB color spaces can be performed with errors far smaller than the quantization errors involved in converting to 15-bit or 16-bit unsigned integer representation after the conversion. This is also true of conversions between RGB color spaces and CIELab and CIELuv.

The second finding is that the ACE color engine that ships with Photoshop and other Adobe image editing programs, while not as exact as doing the color conversion calculations in double precision floating point, is sufficiently accurate to make conversions among RGB color spaces, and from those spaces to and from Lab color (CIELuv is not supported by Photoshop), safe for photographers wishing to take advantage of some particular property of any of these color spaces.

The round trip from, say, Adobe RGB or ProPhoto RGB to Lab and back is occasionally attractive to gain access to moves that are difficult or impossible in RGB, but convenient in Lab. If you’ve been scared to do this because of vague fears about damaging your image, set those fears aside.

The third discovery is that the Microsoft ICM color engine is not as accurate for these conversions as ACE, and should be avoided.

One caveat, which applies when not all colors in the image are within the gamut of the target space, is that the ACE engine as accessed from Photoshop doesn’t offer a perceptual rendering intent, just relative colorimetric, which maps out-of-gamut colors to the gamut envelope. Photoshop does allow you to select absolute colorimetric, but when ACE is doing the work, it silently substitutes relative colorimetric for that choice.

On dynamic range — a guest post

Today we have a guest poster, Jack Hogan. Over on the DPR forum, a question has been asked, and argued endlessly: when faced with a 16-stop intra-scene dynamic range, what’s the dynamic range of an image captured with a 14-bit camera? Jack responded with a little Chautauqua on how a camera works that I thought deserved some more web ink. I have edited Jack’s words for clarity. Any errors are likely mine.

Take it away, Jack:

  1. We are interested in scene DR as it is projected onto the sensing plane of our camera, where arriving photons are collected within a rectangular area, typically but not necessarily divided into smaller squarish portions (pixels), and converted to photoelectrons in order to be counted and recorded in a file. The total number of e- is the same independent of the number of pixels within the sensing area, so clearly the more pixels there are, the fewer e- per pixel. We call the number of photoelectrons so collected the ‘signal’. It is an ‘analog’ signal, independent of bit depth.
  2. Photons and photoelectrons arrive and are converted with random timing, so the signal is never perfectly clean but it is always somewhat noisy. The inherent SNR of the signal is well defined and equal to the square root of its count in photoelectrons – we call this shot noise.
  3. The field of view, the size of the sensor and other sensor characteristics (pixel size and shape) are all needed to define what we normally call scene DR.
  4. Pixels are typically square and of arbitrary dimensions: i.e. we could make them 1 nm^2, 1mm^2, the size of the sensing area or whatever.
  5. However pixels do have finite physical characteristics and therefore cannot record an infinite DR (nor can they practically be made as small or as large as we want). Current pixels can collect at most about 3,000 photoelectrons per micron^2. For a current 24MP full frame camera with a 6 micron pixel pitch this translates into about 75,000 e-, after which pixels ‘fill up’ and top out (in photography we say they ‘clip’ or saturate). For a pixel of twice the area the saturation count would approximately double to 150k e- – of course in that case the camera’s ‘resolution’ would be halved to 12MP.
  6. Designers choose camera characteristics (including pixel size) and photographers choose Exposure so that the brightest desirable highlights found in typical photographic situations do not exceed the saturation count.
  7. Therefore, by the definition of eDR*, the largest scene dynamic range that the 24MP FF camera in 5) can record is 16.2 stops = log2(75,000:1). This number depends on pixel size and is independent of bit depth. (The arithmetic is spelled out in the sketch just after this list.)
  8. Is 7) the dynamic range of the natural scene? No, that’s potentially much larger. It is the scene dynamic range as referenced to that specific camera with its pixel sizes and sensor characteristics. Scene dynamic range as viewed through the sensor of another camera with other pixel sizes and characteristics would be different. For instance the 12MP camera in 5) could record the same scene with an eDR* up to 17.2 stops = log2(150,000:1).
  9. While collecting and digitizing the converted photoelectrons the camera electronics adds some noise to the signal, which we typically model as if it were all added at the same time that the collected photoelectrons roll off the pixels. We call this read noise. For modern DSCs this tends to be in the 2 to 50 e- range.
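
For concreteness, here is the arithmetic from items 7) and 8) written out in Matlab; the saturation counts are the ones assumed above, and the denominator of 1 e- follows the eDR definition in the footnote.

```matlab
% eDR check for the two hypothetical cameras in items 5) through 8).
satur24 = 75000;              % assumed saturation count, 24MP FF, ~6 micron pitch
satur12 = 150000;             % twice the pixel area, roughly twice the count
eDR24 = log2(satur24 / 1);    % ~16.2 stops
eDR12 = log2(satur12 / 1);    % ~17.2 stops
fprintf('24MP: %.1f stops, 12MP: %.1f stops\n', eDR24, eDR12);
```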

Note that so far we have always spoken about the undigitized ‘analog’ signal only: we have not decided the bit depth at which to digitize it yet.

What procedure shall we use to prepare the captured image for printing at 8×12 with maximum dynamic range and no visible loss of detail? We need to know the minimum number of pixels required for the average human to resolve all of the detail in the 8×12 print when viewed at standard distance. Let’s say that it is 8MP.

What would the scene eDR of your canyon be as seen through the pixel size of a camera of the same format as yours but with 8MP resolution?

Say you captured the canyon with the 8MP camera. If you used a 16MP camera of the same format instead, pixel area would be halved and you would have too many pixels for your 8×12. But is the scene DR information captured by both cameras roughly the same for your purposes? Yes, because both sensors sampled the same overall area. Sure, the 16MP camera will have captured more spatial resolution, but as far as the number of e- counted and their inherent noise are concerned, the two recordings are virtually indistinguishable when viewed at the same size.

Here is an example, using signal and shot-noise SNR (see 2 above), of how it would work at the same exposure with everything equivalent other than pixel size; the half-sized pixels would clearly see only half the photons arrive (the arithmetic is restated in the sketch after this list):

  • Information from average pixel in 8MP recording: average signal 100 e-, SNR 10
  • Information from average pixel in 16MP recording: average signal 50 e-, SNR 7
  • Information from average pixel in 16MP recording 2:1 into 8MP: average signal 50+50=100 e-, SNR sqrt(7^2+7^2)=10. Same as if the pixels were twice as big.
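
Here is the same bookkeeping as the bullets above, written out in Matlab; nothing here is new, it’s just the shot-noise arithmetic.

```matlab
% Shot-noise bookkeeping for the 8MP vs binned-16MP comparison above.
signal8  = 100;                         % e- in an average 8MP pixel
snr8     = sqrt(signal8);               % shot-noise SNR = 10
signal16 = 50;                          % half-area pixel, half the e-
snr16    = sqrt(signal16);              % ~7
signalBin = signal16 + signal16;        % bin 2:1 -> 100 e- again
snrBin    = sqrt(snr16^2 + snr16^2);    % ~10, same as the larger pixel
```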

Does it make any difference whether we use the data from the 8MP recording or from the binned 2:1 16MP recording as far as the eDR of the print in the specified viewing conditions is concerned? Not really; the recorded scene DR information is effectively the same at these viewing conditions. Does it make a difference to the observer when viewing the 8×12 print? Not really.

Note that we have not decided on bit depth yet. [If you are interested in how a camera can capture information whose amplitude is below the least-significant bit of the analog to digital converter, take a look here.]

* This is how DxO defines engineering DR for their purposes (it’s a fairly well accepted definition):

Dynamic range is defined as the ratio between the highest and lowest gray luminance a sensor can capture. However, the lowest gray luminance makes sense only if it is not drowned by noise, thus this lower boundary is defined as the gray luminance for which the SNR is larger than 1

ICM vs ACE

In Photoshop, as installed under Windows, you can choose between Adobe’s ACE color engine and Microsoft’s ICM engine when performing color space conversions. I have been using ACE up to now. I wondered if ICM could produce more accurate results.

In a word, no.

I took the 256 million color sRGB noise image that I’d created earlier and made some one-way transforms to Adobe (1998) RGB and ProPhoto RGB, then measured the accuracy using Matlab. The first thing I noticed was that the ICM transforms take a bit longer. That gave me hope that they might be more accurate. The second was that ICM, unlike ACE, respects your choice of Absolute rendering intent when converting among RGB working spaces.

Here are the error stats, in CIELab DeltaE:

onewayICMstats

The ACE worst case on the sRGB>Adobe conversion is slightly worse. In all other respects, ACE (labeled Photoshop above) is better.

For round-trip conversions, with rel meaning relative colorimetric rendering intent:

rtICMstats1

ACE is much better. In fact, as far as I’m concerned, the high worst-case round trip errors for ICM conversions make it unsuitable for things like a quick conversion to Lab and back to do some tricky color editing.

For the record, I’m using Photoshop 2014.1.0 CC x64, running under Windows 7 SP1 x64.

And I had such high hopes…

Comparing Photoshop and algorithmic color space conversion errors

I took my 256 megapixel image that’s been filled with random 16-bit entries with uniform probability density function, brought it into Photoshop, assigned it the sRGB profile, converted it to Adobe (1998) RGB and wrote it out. I went back to the sRGB image, converted it to ProPhoto RGB, and wrote that out.

Then I took the sRGB image on round trip color space conversions to and from Adobe RGB, ProPhoto RGB and CIELab, writing out the sRGB results.

In Matlab, I compared the one-way and roundtrip images with ones that I created by performing the color space conversions using double precision floating point. I quantized to 16-bit integer precision after each conversion.
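
As a sketch only, here is one way such a comparison could be scripted in Matlab for the sRGB to Adobe RGB leg; the file names are hypothetical, only the sRGB and Adobe RGB cases are shown (ProPhoto RGB would need its own matrices or profile), and this is not necessarily the script I used.

```matlab
% Score Photoshop's sRGB -> Adobe RGB conversion against a double-precision
% reference computed through Lab. File names are hypothetical.
srgb     = im2double(imread('noise_srgb.tif'));        % random-color test image
psAdobe  = im2double(imread('noise_adobe_ps.tif'));    % Photoshop's converted output
refAdobe = lab2rgb(rgb2lab(srgb), 'ColorSpace', 'adobe-rgb-1998');
dE = sqrt(sum((rgb2lab(psAdobe,  'ColorSpace', 'adobe-rgb-1998') - ...
               rgb2lab(refAdobe, 'ColorSpace', 'adobe-rgb-1998')).^2, 3));
fprintf('mean %.3g, worst case %.3g DeltaE\n', mean(dE(:)), max(dE(:)));
```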

Here are the results, with the vertical axis units being CIELab DeltaE:

psvsml

You can see that the conversions that I did myself are more accurate. In the case of the worst-case round trip conversions, they are about one order of magnitude more accurate. In the case of the one-way conversions, they are relatively more accurate. The striking Photoshop worst-case error with the one-way conversion to Adobe RGB makes me think that either Adobe or I have a small error in the nonlinearity of that space.

Photoshop did its conversions faster than I did mine. I suspect that they’re not using double precision floating point. In fact, from the amount of the errors, I’d be surprised to find that they’re using single precision floating point.

Except for the one-way conversion to Adobe RGB, even the Photoshop worst-case errors are not bad enough to scare me off from doing working color space conversions whenever I think a different color space would help me out. Still, it would be nice if Adobe offered a high-accuracy mode for color space conversions, and let the user decide when she wants speed and when she wants accuracy.

Chained color space conversion errors with many RGB color spaces

[Note: this post has been extensively rewritten to correct erroneous results that arose from not performing adequate gamut-mapping operations to make sure the test image was representable within the gamut of all the tested working spaces.]

I’ve been trying to come up with a really tough test of color space conversion accuracy, one that, if passed at some editing bit depth, would give us confidence that we could freely perform color space conversions to and from just about any RGB color space without worrying about loss of accuracy. I think I’ve found such a test.

I picked 14 RGB color spaces that, in the past, some have recommended as working spaces, although several of them are obsolete as such:

  1. IEC 61966-2-1:1999 sRGB
  2. Adobe (1998) RGB
  3. ProPhoto RGB
  4. Joe Holmes’ Ektaspace PS5
  5. SMPTE-C RGB
  6. ColorMatch RGB
  7. Don-4 RGB
  8. Wide Gamut RGB
  9. PAL/SECAM RGB
  10. CIE RGB
  11. Bruce RGB
  12. Beta RGB
  13. ECI RGB v2
  14. NTSC RGB

If you’re curious about the details of any of these, go to Bruce Lindbloom’s RGB color space page and get filled in.

I wrote a Matlab script that reads in an image, assigns the sRGB profile to it, then computes from it an image that lies within all of the above color spaces. It does that with this little bit of code:

colorspac clipping
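
I can’t reproduce that code here, but a plausible shape for such a clipping step is to walk through each working space, clip anything outside [0, 1], and carry the result forward; the helper function, the matrices, and the ‘spaces’ struct array below are assumptions, not my actual script.

```matlab
% Hypothetical gamut-clipping sketch; not the script shown in the image above.
rgbLin = srgbToLinear(img);                 % assumed helper: undo the sRGB tone curve
xyz = reshape(reshape(rgbLin, [], 3) * M_srgb2xyz', size(rgbLin));
for k = 1:numel(spaces)                     % 'spaces' holds each space's matrices
    rgb = reshape(reshape(xyz, [], 3) * spaces(k).xyz2rgb', size(xyz));
    rgb = min(max(rgb, 0), 1);              % clip to that space's gamut
    xyz = reshape(reshape(rgb, [], 3) * spaces(k).rgb2xyz', size(rgb));
end
```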

This script didn’t pull the gamut in far enough: the buildup of double precision floating point round-off errors still produced colors that were out of the gamut of some of the color spaces. I added another gamut-shrinking step:

gamutSmoosh

This code shrinks the gamut somewhat in CIELab. I could probably get away with less shrinkage, but I got tired of watching the program go through many iterations before it finally threw a color out of gamut, forcing me to start all over again.
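
A sketch of that sort of step, under the assumption that it simply pulls a* and b* toward the neutral axis by a fixed factor; the factor shown is a placeholder, not the one I used.

```matlab
% Hypothetical chroma-shrinking step in CIELab.
lab = rgb2lab(img);                      % image assumed to carry the sRGB profile
lab(:, :, 2:3) = 0.9 * lab(:, :, 2:3);   % pull a* and b* toward neutral
img = lab2rgb(lab);
```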

Here’s the sRGB image before the gamut-constraining process:

origScaled

And here it is afterwards:

gamutSmooshImage

Here’s the difference between the two in CIELab DeltaE, normalized to the worst-case error, which is about 45 DeltaE, with a gamma of 2.2 applied:

gamutSmooshErrorImage

After the gamut-constraining, the program picks a color space at random, converts the image to that color space algorithmically (no tables) in double precision floating point, quantizes it to whatever precision is specified, measures the CIELab and CIELuv DeltaE from the original image, then does the whole thing again and again until either the computer gets exhausted or the operator gets bored.
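
The loop has roughly this shape; ‘convertTo’, ‘deltaELab’, ‘spaces’, and ‘nIterations’ are hypothetical stand-ins for the real code.

```matlab
% Random-walk conversion test; helper functions are hypothetical stand-ins.
labRef = rgb2lab(startImage);                       % the gamut-constrained original
img = startImage;
currentSpace = 1;                                   % start in sRGB
worstCase = zeros(1, nIterations);
meanErr   = zeros(1, nIterations);
for iter = 1:nIterations
    nextSpace = randi(numel(spaces));               % pick a destination at random
    img = convertTo(img, currentSpace, nextSpace);  % double-precision conversion
    img = round(img * 65535) / 65535;               % 16-bit quantization, if enabled
    currentSpace = nextSpace;
    dE = deltaELab(img, currentSpace, labRef);      % CIELab DeltaE vs the original
    worstCase(iter) = max(dE(:));
    meanErr(iter)   = mean(dE(:));
end
```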

Here’s what happens when you leave the converted images in double precision floating point:

rand14DPFP

The worst of the worst is around 5 trillionths of a DeltaE.

If we quantize to 16 bit integers after every conversion:

rand14-16bit

The worst case error is less than a tenth of a DeltaE, and the mean error is a little over 1/100th of a DeltaE.

With 15-bit quantization, here is the situation:

rand14-15bit

More or less the same as with 16-bit quantization, but the errors are twice as bad. The worst-case error doesn’t get over one DeltaE until about 40 conversions, though.

With 8-bit quantization, we see a different story, as the quantization errors dominate the conversion errors and become obvious quickly:

rand14-8bit

Sequential color space conversions at varying precision

I am now spreading my color space net to include a wide variety of possible RGB working color spaces. I picked 13 for testing:

  1. IEC 61966-2-1:1999 sRGB
  2. Adobe (1998) RGB
  3. Beta RGB
  4. Bruce RGB
  5. CIE RGB
  6. ColorMatch RGB
  7. Don-4 RGB
  8. ECI RGB v2
  9. Joe Holmes’ Ektaspace PS5
  10. PAL/SECAM RGB
  11. ProPhoto RGB
  12. SMPTE-C RGB
  13. Wide Gamut RGB

If you’re curious about the details of any of these, go to Bruce Lindbloom’s RGB color space page and get filled in.

I wrote a program to take an image of all 16+ million possible colors in sRGB and map it into an sRGB image that is within the gamut of all of the above color spaces. More on how I did that in a subsequent post.

Then I wrote a program to convert the image to sRGB, the first color space on the above list, and convert that image to all the other color spaces on the above list in list order. Then it moved on to the next color space on the list, and did the same thing again. And again, and again, until it reached the bottom of the list. It skipped the conversions to the source color space. That gave me 13*12, or 156 conversions. I set up the program so that it would either leave the image in double-precision floating point after the conversions, or quantize it to integers whose precision I could choose. Then I computed some stats.
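
The nesting looks roughly like this, with the same hypothetical helpers (‘convertTo’, ‘deltaELab’, ‘quantize16’) as in the earlier sketch:

```matlab
% Sketch of the 13 x 12 sequential test; helpers are hypothetical stand-ins.
labRef = rgb2lab(startImage);                              % gamut-constrained original
for src = 1:numel(spaces)
    srcImg = quantize16(convertTo(startImage, 1, src));    % sRGB test image into the source space
    for dst = 1:numel(spaces)
        if dst == src, continue; end                       % skip conversions to the source space
        dstImg = quantize16(convertTo(srcImg, src, dst));  % one of the 156 conversions
        dE = deltaELab(dstImg, dst, labRef);               % error vs the original image
        fprintf('%d -> %d: mean %.3g, worst %.3g\n', src, dst, mean(dE(:)), max(dE(:)));
    end
end
```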

Here’s the result of leaving the converted images in 64-bit floating point:

16MseqFP

Some conversions produce greater errors than others, but all the errors are very small, being much less than one-trillionth of a DeltaE.

If we convert each image to 16-bit unsigned integer representation after each conversion, we get this:

16Mseq16bit

The errors are all under one-thousandth of a CIELab DeltaE.

With conversion to 8-bit unsigned integer representation after each conversion:

16Mseq8bit

Now we have mean errors of 1/3 of a DeltaE, and worst-case errors of about 2 DeltaE. You definitely want to be careful when converting color spaces if you’re working with 8-bit images.

I expected that some RGB color space conversions would be more prone to error than others, and that turns out to be the case. What surprises me is how small the differences are — one binary order of magnitude covers them all.

Color space conversion errors with ProPhoto RGB — 15 bits

Now let’s look at ProPhoto RGB conversion errors with 15-bit quantization. I took all 16+ million colors in sRGB and converted them to PPRGB and back, quantizing to 15 bits per color plane after each conversion. The conversions themselves were performed algorithmically (no tables involved) with double precision floating point, and have errors on the order of 10^-14 CIELab DeltaE. I looked at the CIELab DeltaE errors after each iteration:

s2pp2s15sigma s2pp2s15wc s2pp2s15mean
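
For reference, the iteration loop might look something like this; ‘srgb2pp’ and ‘pp2srgb’ stand in for the algorithmic ProPhoto RGB conversions, and the 15-bit quantizer is the only part specific to this post.

```matlab
% Iterated sRGB <-> ProPhoto RGB round trip with 15-bit quantization.
q15 = @(x) round(x * 32767) / 32767;                % quantize each color plane to 15 bits
labRef = rgb2lab(srgbStart);
img = srgbStart;
worstCase = zeros(1, nIterations);
meanErr   = zeros(1, nIterations);
for iter = 1:nIterations
    img = q15(pp2srgb(q15(srgb2pp(img))));          % out to ProPhoto RGB and back
    dE  = sqrt(sum((rgb2lab(img) - labRef).^2, 3)); % CIELab DeltaE vs the start image
    worstCase(iter) = max(dE(:));
    meanErr(iter)   = mean(dE(:));
end
```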

If we take all the colors available in an 8-bit representation of PPRGB, convert them to 15 bits per color plane, thence to Lab and back, quantizing to 15 bits after each conversion, this is what we see:

pp2labp2pp15sigma pp2labp2pp15wc pp2labp2pp15mean

In the past tests, 16-bit results have had half the error of 15-bit ones. Assuming that continues, in a 15- or 16-bit editing environment, it doesn’t look like there’s anything to be worried about in performing repeated conversions between sRGB and PPRGB, as long as the conversion algorithms themselves are written so that the internal calculations are accurate.

Color space conversion errors with ProPhoto RGB — 8 bits

ProPhoto RGB is not recommended as an 8-bit color space, but I thought I’d start there. I took all 16+ million colors in sRGB and converted them to PPRGB and back, quantizing to 8-bits per color plane after each conversion. The conversions themselves were performed algorithmically (no tables involved) with double precision floating point, and have errors on the order of 10^-14 CIELab DeltaE.  I looked at the CIELab DeltaE errors after each iteration, computed some stats, and watched how they changed.

s2pp2ssigmalab s2pp2swclab s2pp2smeanlab

The worst-case results are truly awful, but I was surprised by how little damage there was to the mean and standard deviation.

Here’s the original image:

RGB16Million

Here’s the error image after 50 iterations, normalized to worst-case equals full scale, and gamma-corrected with a gamma of 2.2:

8bitrtppdiff

Here’s the error image multiplied by 3 to show the smaller errors better:

8bitrtppdiffx3

Here’s a closeup of the upper left corner:

8bitrtppdiffulx3

And the lower right corner:

8bitrtppdifflrx3

The patterns are pretty, but that’s about all I get out of the error images.

If we take all 16 million colors representable in 8-bit PPRGB to Lab and back over and over, quantizing to 8-bits after each conversion, we get these curves:

pp2lab2ppsogma pp2lab2ppwc pp2lab2ppmean

I’m a little surprised that the entire PPRGB gamut appears to lie within the Lab gamut. There are RGB triplets in PPRGB that don’t correspond to visible colors, and there are triplets in Lab that don’t correspond to visible colors, but it’s surprising that all the PPRGB ones are representable in Lab. I did use a D65 spectrum sampled by the 2 degree standard observer for the Lab white point, if it makes a difference.