It’s pretty clear to me that the biggest aliasing problems today are caused by the Bayer pattern and similar methods that construct a color sensor by detecting different spectra at different places on the chip. One way to make a big improvement would be to get all the RGB photosensitive regions that make up a pixel looking at the same place on the lens-created image.
There is currently a sensor that does exactly that: the Foveon sensor, available in some Sigma cameras. It works by stacking the red, green, and blue photodetectors on top of each other. It has some disadvantages in color accuracy compared with a Bayer array, and the chips appear to be noisier than the best conventional CMOS chips. There is some controversy over how to compare pixel counts on Foveon versus conventional sensors. I have previously argued here that the pixel count on Bayer-array sensors should be divided by two to get real RGB pixels. The Foveon marketers have been caught up in the same silly numbers race as the Bayer guys, arguing that their pixel count should be multiplied by three. I’d ignore that and just count the real pixels. Viewed that way, the current highest-resolution Foveon chip, at about 5 megapixels, is close in image information to the Nikon D3 and D3s, whose pixel counts correct to 6 megapixels. There is no full frame Foveon chip, nor is there a medium format one.
Other ways to accomplish the same goal with today’s technology:
- Three monochromatic chips with prisms and filters to separate the colors, as in many video cameras. Not likely for high-end still photography because of cost and size impacts.
- One monochromatic chip with a filter wheel. This was a commercial alternative in the late 80s and early 90s, but seems to have fallen from favor. Only works with motionless subjects.
- Micro-movement of the sensor: several separate exposures are made, with the sensor shifted between them so that each point on the image is sampled, on average, by every color in the Bayer constellation. This was done by Imacon (now Hasselblad) years ago. It, too, works only on non-moving subjects.
- You can’t buy it yet, but for four years, Nikon has had a patent on a photodetector system that collects three color photosites under one microlens.
In the future, who knows? My money is on the technology that made digital audio work so well: oversampling. The basic idea is to sample the input data at a higher frequency than you’ll use for the output, then use digital filtering to remove unwanted high-frequency signals before converting the samples to the lower output rate. It provides an improvement because the digital filters can be made to do things impractical in the analog world. The analog antialiasing filters used in audio are a lot more precise and controllable than the ones used in photography, so oversampling should provide even bigger gains.
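The audio version of the idea can be sketched in a few lines. This is a toy illustration assuming NumPy, with made-up signal frequencies and a deliberately crude moving-average filter standing in for the sharp FIR filters real converters use: sample at 4× the output rate, low-pass filter digitally, then decimate.

```python
import numpy as np

# Toy oversampling sketch: sample at 4x the output rate, apply a digital
# low-pass filter, then decimate. All numbers here are illustrative.
rate_factor = 4                       # oversampling ratio
n_out = 64                            # output samples
t = np.arange(n_out * rate_factor) / (n_out * rate_factor)

# Input: an in-band low-frequency signal plus an out-of-band tone that
# would alias if we decimated naively.
signal = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 90 * t)

# Crude digital low-pass: a moving average one output period wide.
# Real designs use much sharper filters; this only shows the idea.
kernel = np.ones(rate_factor) / rate_factor
filtered = np.convolve(signal, kernel, mode="same")

# Decimate to the output rate: keep every 4th sample.
output = filtered[::rate_factor]
print(output.shape)  # (64,)
```

Decimating `signal` directly would fold the 90-cycle tone down into the output band; filtering first knocks most of it out, which is the whole point of doing the rejection digitally.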
How might this work? Since we’re dreaming, let’s imagine a sensor that has a 4×4 array of photosites – half green, one-quarter red, and one-quarter blue – for each output pixel. We’ll put the centers of the 4×4 arrays 8 micrometers apart, so that we could make a full frame 35mm-sized chip with 13.5 million real, three-color pixels (a little better resolution than the D2x), or a 4.5×6cm sensor with over 42 megapixels. Since the actual photosites are on a pitch of two micrometers (three times the wavelength of red light!), even with a 1.4 to 2 multiplier for the Bayer pattern, we can count on the lens itself being an adequate antialiasing filter: it would have to resolve better than somewhere between 250 and 360 lp/mm to create an alias. If you just averaged all the green photosites to one value, and did the same for the reds and the blues, you’d have the equivalent of a huge Foveon chip. But you could do better than that. By applying some smart filtering algorithms, you could trade off low noise (which you get by using lots of photosites in the calculation) against high resolution (which you get by using fewer). It’s just a dream at this point; such a chip would have more than 200 million photosites in 35mm form, and more than 600 million in the medium format size.
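The simple-averaging case is easy to picture in code. This is a sketch of one hypothetical 4×4 block collapsing to a single RGB output pixel; the Bayer-like layout and the raw readings are invented for illustration, assuming NumPy.

```python
import numpy as np

# One hypothetical 4x4 photosite block: 8 green, 4 red, 4 blue,
# in a Bayer-like arrangement. Layout and data are made up.
colors = np.array([
    ['G', 'R', 'G', 'R'],
    ['B', 'G', 'B', 'G'],
    ['G', 'R', 'G', 'R'],
    ['B', 'G', 'B', 'G'],
])

# Simulated raw readings from the 16 photosites.
rng = np.random.default_rng(0)
raw = rng.uniform(0.4, 0.6, size=(4, 4))

# One output pixel: the straight mean of each color's photosites.
pixel = {c: float(raw[colors == c].mean()) for c in ('R', 'G', 'B')}
print(pixel)
```

This is the "huge Foveon chip" case; the smarter filters mentioned above would weight the sixteen sites differently instead of averaging them flat.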
Oversampling could offer an advantage in chip yield as well. Dead photosites could be identified as such during testing, and the chip programmed to ignore them. Assuming one or fewer failures per 4×4 array, there are at least three other good samples per color per output point. All that would be added would be a bit of noise, and all that would be lost would be a bit of resolution.
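The bookkeeping for this is just a mask applied before the average. A sketch, again with an invented 4×4 layout and a dead-site map that in reality would come from factory testing:

```python
import numpy as np

# Averaging around a dead photosite mapped at test time. The 4x4
# layout and the dead-site mask are invented for illustration.
colors = np.array([
    ['G', 'R', 'G', 'R'],
    ['B', 'G', 'B', 'G'],
    ['G', 'R', 'G', 'R'],
    ['B', 'G', 'B', 'G'],
])
raw = np.full((4, 4), 0.5)        # pretend every live site reads 0.5
alive = np.ones((4, 4), dtype=bool)
alive[0, 1] = False               # one dead red photosite, found in testing

# Per-color mean over live sites only; 3 of the 4 reds still contribute.
pixel = {c: float(raw[(colors == c) & alive].mean()) for c in ('R', 'G', 'B')}
print(pixel)  # {'R': 0.5, 'G': 0.5, 'B': 0.5}
```

The output pixel is unchanged; the only cost is that the red estimate is averaged over three samples instead of four, i.e. slightly more noise.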
Can we get photosites that small? The Canon S70 already has a pixel pitch of 2.3 micrometers, so that part ought to work. The tricky part would be keeping the fill factor high as the pixels get smaller. If that’s not possible, we’d lose low-light performance and dynamic range.
Since the sensor can now detect whether there’s high-frequency information in the image (assuming it can sort out the signal from the noise), it could do tricky things: in the absence of high-frequency detail, it could recruit adjacent 4×4 arrays, or parts of them, and average in their contents to lower noise. That boils down to adaptive pixel size: big where there’s not much detail, and small where there is. This matches the eye really well, since we see noise easily in smooth areas, but not so easily where there’s detail.
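One crude way to decide when to recruit neighbors is a local-variation test. This one-dimensional sketch uses the standard deviation of a three-block neighborhood as the detail detector; the threshold and the data are made up, and a real implementation would work in two dimensions with a proper signal/noise model.

```python
import numpy as np

# Adaptive pixel size, 1-D toy version: where a block's neighborhood
# shows little variation, average the neighbors in to cut noise; where
# it shows detail, leave the block alone. Threshold and data invented.
rng = np.random.default_rng(1)
blocks = np.full(6, 0.5) + rng.normal(0, 0.01, 6)  # flat, noisy region
blocks[3] = 0.9                                    # one detailed block

threshold = 0.05
out = blocks.copy()
for i in range(1, len(blocks) - 1):
    neighborhood = blocks[i - 1:i + 2]
    if neighborhood.std() < threshold:   # smooth area: recruit neighbors
        out[i] = neighborhood.mean()
    # detailed area: keep the block's own (high-resolution) value
print(out)
```

Blocks in the flat region come out denoised; the detailed block, and its immediate neighbors whose neighborhoods straddle the edge, keep their full-resolution values.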
It’s a fair question whether it makes sense to do all this processing on the chip, or simply to upload a 400MB raw file from the 35mm-size sensor, or a 1.2GB file from the medium format one. The answer depends on how fast the processing can be done on the chip versus the time spent transmitting those big files.
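For the record, those file sizes follow from the photosite counts above, assuming each raw sample is stored in 2 bytes (a 12–16 bit reading padded to 16 bits) and no compression:

```python
# Back-of-envelope raw sizes for the dream sensor: output pixels x 16
# photosites x 2 bytes per sample. The 2-byte figure is an assumption.
bytes_per_site = 2
full_frame_sites = 13.5e6 * 16       # 216 million photosites, 35mm
medium_format_sites = 42e6 * 16      # 672 million photosites, 4.5x6cm

print(full_frame_sites * bytes_per_site / 1e6)     # 432.0 (MB)
print(medium_format_sites * bytes_per_site / 1e6)  # 1344.0 (MB)
```

That lands in the ballpark of the 400MB and 1.2GB figures above; packing the samples tighter than 16 bits would close the gap.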
Producing this series of posts has been instructive for me, since collecting and writing down my thoughts has refined them. It’s also made me think about another topic: antialiasing in reconstruction. More on that later.