Speculating on Sony’s raw compression

Hans van Driest has posted a possible explanation of why Sony uses the tone compression algorithm that it does. It’s speculation, to be sure, but, if true, might explain a lot. Here is his post, lightly edited for clarity.

Sony uses column conversion, meaning they use a lot of ADCs in parallel. This, in combination with some other tricks, seems to get rid of most of the pattern noise and allows low read noise, resulting in the great dynamic range of Sony sensors. With so many ADCs, they have to be simple, and simple they are. Sony uses the most basic analog-to-digital converter, the slope ADC, which compares a voltage ramp with the voltage out of the sensor and stops a counter when the two are equal. This is not the fastest way to make a conversion. For 14 bits, such an ADC needs 2^14 clock cycles for each conversion. With, say, a 400 MHz clock, this means 41 µs per conversion. That sounds fast, but the ADCs must perform over 6,000 such conversions for each image, stretching the conversion time to a bit more than 0.25 sec. This is a bit slow for high frame rates, and also for live view.
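The arithmetic in that paragraph can be checked with a few lines of Python. This is only a sketch of the back-of-the-envelope estimate: the helper name `slope_adc_timing` is made up for illustration, and the 400 MHz clock and 6,000 conversions per image are the post's example figures, not measured values.

```python
def slope_adc_timing(bits, clock_hz, conversions_per_image):
    """Worst-case timing for a linear slope ADC (hypothetical helper)."""
    cycles = 2 ** bits                       # one clock cycle per counter step
    per_conversion_s = cycles / clock_hz     # time for one full ramp
    per_image_s = per_conversion_s * conversions_per_image
    return per_conversion_s, per_image_s

# Example figures taken from the post: 14 bits, 400 MHz, 6,000 conversions.
per_conversion_s, per_image_s = slope_adc_timing(14, 400e6, 6000)
print(f"{per_conversion_s * 1e6:.2f} us per conversion")
print(f"{per_image_s:.3f} s per image")
```

With exactly 6,000 conversions this comes to 40.96 µs per conversion and about 0.246 s per image, so "a bit more than 0.25 sec" corresponds to somewhat more than 6,000 conversions.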

The shape of the voltage ramp going into the comparator does not have to be linear. Sony uses a variation of an exponential slope (thus the compression). This cuts the conversion time down by a factor of eight (2^11 instead of 2^14). Now the total conversion time is slightly over 0.03 sec. Great for live view.
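A toy model may make the mechanism more concrete. The sketch below simulates a slope ADC as a counter that stops when the ramp reaches the input, then compares a linear 2^14-step ramp with a curved 2^11-step ramp. The quadratic ramp shape is an arbitrary stand-in chosen for illustration and is not Sony's actual curve; the function and variable names are likewise hypothetical.

```python
def slope_adc(v_in, ramp):
    """Count clock cycles until the ramp reaches the input voltage."""
    for count, v_ramp in enumerate(ramp):
        if v_ramp >= v_in:          # comparator trips, counter stops
            return count
    return len(ramp) - 1            # input above full scale: clip

FULL_SCALE = 1.0

# Linear ramp: 2^14 equal steps from 0 to full scale.
linear_ramp = [FULL_SCALE * k / (2**14 - 1) for k in range(2**14)]

# Curved ramp (assumed shape): 2^11 steps, small steps in the shadows
# and progressively larger steps toward full scale.
curved_ramp = [FULL_SCALE * (k / (2**11 - 1)) ** 2 for k in range(2**11)]

for v in (0.0001, 0.01, 0.25, 1.0):
    print(v, slope_adc(v, linear_ramp), slope_adc(v, curved_ramp))

# The ramp length sets the worst-case conversion time.
worst_case_speedup = len(linear_ramp) / len(curved_ramp)
print("worst-case speedup:", worst_case_speedup)
```

Whatever the exact shape, the ramp length is what bounds the conversion time: 2^11 steps at the 400 MHz example clock take 5.12 µs, and 6,000 of them take about 0.031 s.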

It might very well be that this explains why Nikon live view is as poor as it is (line skipping to reduce conversion time), compared to that of Sony.

The elegant thing is that this compression is not really costing much, if anything. 14 bits are needed for the dynamic range. But signal-to-noise ratio, when light is hitting the sensor, is determined not only by the ADC and read noise, but also by a property of the light itself: shot noise. Say the a7R sensor has a full well capacity of 60,000 electrons (optimistic). Such a full well capacity means that at maximum illumination, the SNR is the square root of 60,000, or approximately 245, which can easily be resolved with an 8-bit ADC. So with 11 bits, there is room to spare. Shot noise goes up with the signal, not as fast as the signal does, but it goes up. So the 14 bits are needed for the deep shadows, but once there is enough light on a pixel, you do not need them anymore.
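The shot-noise argument is easy to reproduce numerically. The sketch below assumes pure Poisson shot noise (read noise ignored) and follows the post's rule of thumb that an SNR of N needs roughly log2(N) bits; the 60,000-electron full well is the post's example figure, and the function names are made up for illustration.

```python
import math

def shot_noise_snr(electrons):
    """SNR of a Poisson-limited signal: signal / sqrt(signal)."""
    return math.sqrt(electrons)

def bits_for_snr(snr):
    """Rule-of-thumb bit count needed to resolve a given SNR."""
    return math.ceil(math.log2(snr))

full_well = 60_000                      # electrons, figure from the post
snr_at_full_well = shot_noise_snr(full_well)
bits_at_full_well = bits_for_snr(snr_at_full_well)

# SNR and bit count at a few signal levels up to full well.
for electrons in (10, 100, 1_000, 10_000, full_well):
    snr = shot_noise_snr(electrons)
    print(f"{electrons:>6} e-  SNR {snr:7.1f}  bits {bits_for_snr(snr)}")
```

At full well the SNR comes out near 245 and the rule of thumb gives 8 bits, in line with the figures in the paragraph.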

There’s something that bothers me about the above. The tone curve chosen by Sony, with its linear sections of slopes increasing by a binary order of magnitude each time, seems chosen for digital-to-digital implementation. If I were an engineer with the ability to have any monotonic tone curve I wanted just by controlling the shape of the ramp, I’d go all the way and use a LUT-driven DAC to generate the ramp and encode directly with a gamma of 2.2 (or 1.8, if I were feeling Apple-ish).
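To illustrate what a LUT-driven ramp with a gamma of 2.2 might look like, here is a minimal sketch. Everything in it is an assumption made for illustration: the 2^11 output codes are borrowed from the figure quoted earlier, ramp levels are expressed as fractions of full scale rather than as DAC codes, and the names `ramp_lut`, `encode`, and `decode` are hypothetical.

```python
import bisect

CODES = 2 ** 11      # number of output codes (assumed, from the 11-bit figure)
GAMMA = 2.2

# Hypothetical LUT: ramp level (fraction of full scale) at each counter step.
# Raising the normalized step to the power GAMMA means the stored code is
# proportional to level ** (1 / GAMMA).
ramp_lut = [(k / (CODES - 1)) ** GAMMA for k in range(CODES)]

def encode(level):
    """Counter value at which the ramp first reaches the input level."""
    return min(bisect.bisect_left(ramp_lut, level), CODES - 1)

def decode(code):
    """Linear level for a code: the ramp level at that counter step."""
    return ramp_lut[code]

mid_gray = encode(0.18)
print(mid_gray, decode(mid_gray))
```

Changing `GAMMA` to 1.8, or swapping in an L*-style table, only changes how `ramp_lut` is filled; the encode and decode steps stay the same.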

3 thoughts on “Speculating on Sony’s raw compression”

  1. > If I were an engineer with the ability to have any monotonic tone curve I wanted just by controlling the shape of the ramp, I’d go all the way and use a LUT-driven DAC to generate the ramp and encode directly with a gamma of …

    L* might be a good choice. Broken line characteristics are not typical for ADCs. Not to mention I do not see any artifacts in the regions of kinks.

    1. Iliah, when I was working for IBM in the early 90s, I did some research on whether discontinuities in slope were easily visible the way that discontinuities in value are. My conclusion, never published because the sample size was too small, was that slope discontinuities are hard to spot, unlike value discontinuities. The human vision system seems to be tuned to discover luminance value discontinuities, and chroma discontinuities to a lesser degree.

      Jim
