Color photography without demosaicing in the real world

Today I continue exploring a technique for producing half-sized images from sensors with Bayer color filter arrays (CFAs). The technique was described here, and the way I’m implementing it was explained here.

I started with a raw file of this scene, photographed with the Sony a7R and the Sony/Zeiss 55mm f/1.8 Sonnar FE (aka the Zony 55) lens:

2611

I demosaiced the image with bilinear interpolation and with adaptive homogeneity-directed demosaicing (AHD), then downsampled those images to half size in Photoshop using bicubic sharper. I also produced a half-sized image with the four-pixels-into-one technique. I enlarged a tight crop of each of the three images by 400% using nearest neighbor.

This is the image demosaiced with bilinear interpolation:

2611 bilinear

This is the image demosaiced with AHD:

2611 AHD

This is the image downsampled using the four-pixels-into-one method:

2611 2x2

The AHD and the bilinear interpolation images are virtually identical. The four-pixels-into-one crop is clearly sharper, but it also suffers from false-color artifacts, which are almost completely absent from the bilinear interpolation and AHD images. The side of the chimney is a good place to see the four-pixels-into-one false-color artifacts (which are red there), but you can also see many false colors in the tree leaves, although it wouldn’t be so obvious that they are artifacts if you didn’t have the other images to compare them with. Notice that the small branches and foliage in the bottom part of the crop are larger in the bilinear interpolation and AHD images, a symptom of lost sharpness.

This is typical of the differences throughout the scene. My conclusion is that the four-pixels-into-one technique is not, in general, worth the trouble. If the objective were a monochrome image, maybe things would be different.

It is also possible that some specialized filtering operation before the four-pixels-into-one operation would help. However, I have no plans to explore this.

Color photography without demosaicing in practice

Yesterday’s post trotted out an idea for producing moderate-resolution files from high-resolution cameras without demosaicing, and speculated that such files might show higher quality and fewer artifacts than similar-sized files down-res’d from demosaiced raw files.

I looked around for a tool to perform an experiment. I didn’t have to look far. DCRAW has an option, -h, that produces a half-sized image using the idea that I proposed in yesterday’s post, right down to averaging the two green channels. I took a raw file of an ISO 12233 chart made with the Sony a7R and the Zeiss 135mm f/2 Apo Sonnar ZF.2, and “developed” it in DCRAW three ways: with the four-pixels-into-one half-sized option, demosaiced using bilinear interpolation, and demosaiced using adaptive homogeneity-directed demosaicing. I brought all the images into Photoshop, down-res’d the demosaiced images by 50% using bilinear and bicubic sharper, and stacked them up in five layers: on the bottom, the four-pixels-into-one image; above it, the bilinear-interpolation-demosaiced images down-res’d both ways; and above that, the adaptive homogeneity-directed demosaiced images down-res’d both ways. I put a curves layer on top of everything to set the black and the white point.

The two images that were down-res’d with bilinear interpolation were both softer than their bicubic sharper counterparts, and didn’t have significantly fewer artifacts, so I ignored them. The differences between the images demosaiced with the two algorithms were minimal, so I ignored the bilinear-interpolation-demosaiced one.

Overall, the image processed without demosaicing had greater microcontrast, but more false-color artifacts and more evidence of luminance aliasing artifacts. This was a surprise to me.

Take a look at this section of the target, enlarged 300% using nearest neighbor.

First with conventional demosaicing:

demosaiced

And with the four-pixels-into-one technique:

4to1

Take a look at the vertical bar on the left, which has a slanted, but mostly horizontal, grating of increasing spatial frequency running from top to bottom. Notice that in both images, the apparent frequency actually drops from above the “5” to above the “6”, drops further above the “7”, then starts to increase again. This is classic aliasing, in which spatial frequencies above half the sampling frequency are reproduced as frequencies below half the sampling frequency; a frequency f between half the sampling frequency fs and fs itself shows up at fs − f, so the apparent frequency falls as the true frequency rises. It occurs in this section of the image at much higher contrast in the four-pixels-into-one case than in the demosaiced case.

To see what happens with spatial frequencies below the Nyquist frequency (half the sampling frequency), look at the variable-pitch horizontal grating to the left of the zone plate at the top of the image. The lines at about 3.5 are at the Nyquist frequency. You can see that the contrast at that point in the demosaiced image is close to zero, which is appropriate. So there’s antialiasing filtering in the combination of the demosaicing and the down-sampling. It’s not a brick-wall filter, though; the contrast of the area above the “2” is adversely affected compared to the four-pixels-into-one image.

If you look below the “3” at the slanted, but mostly vertical grating at the bottom center, you’ll see what happens to spatial frequencies just below the Nyquist frequency. They are reproduced with greater contrast in the four-pixels-into-one case, but there’s more false color.

The four-pixels-into-one technique doesn’t produce the anticipated freedom from false color with test charts. It appears that the sophistication of the current demosaicing techniques is enough to overcome any advantage that might accrue to the four-pixels-into-one approach.

As we’ve seen with cameras that omit the anti-aliasing filter, sometimes approaches that don’t work well for test charts are good for some real-world subjects. I’ll look at that next.

Color photography without demosaicing

In the beginning, digital photography sensors were monochromatic. If you wanted color, you made three successive exposures through different filters, which were usually mounted on a wheel for rapid sequencing. A variant of this approach was to use a series of prisms and half-silvered mirrors to split the imaging light into three beams which were filtered independently and simultaneously imaged on three monochromatic sensors. This approach is still used in some video cameras.

In the early 1970s, before digital photography even got off the ground, Bryce Bayer invented his color filter array (CFA) pattern, which exposes different sensels through different color filters, all in a single exposure.

Here’s what a Bayer CFA looks like:

Bayer pattern

As an aside, I’ve always thought that having some techie thing named after you was neat. To that end, last year I proposed the Kasson pattern, which was — and still is — available for license at no charge, providing my name is displayed prominently on any camera using it.

Kasson pattern

Amazingly, no one has taken me up on my generous offer.

OK, back to Bayer. His array has been very successful, and the vast majority of contemporary digital cameras use it. There have been variants with different patterns and different colors in the filters, but they haven’t caught on. There is a totally different approach, the Foveon sensor developed at Carver Mead’s Foveon, which uses the fact that different wavelengths of light penetrate silicon to different depths and stacks the sensels for the various wavelengths vertically (their sensitivities aren’t red, green, and blue, but more like white, yellow, and red). It remains a niche technology.

So the Bayer array is the big kahuna of photographic capture, making possible the marvelous cameras that we have today. It’s not all good, though. Images captured using a Bayer CFA are missing some information. In the production of a standard RGB image from a raw file from a Bayer CFA camera, half the green pixels, and three-quarters of the red and blue pixels, have to be created in software. The process is called demosaicing, and there are many algorithms. Not surprisingly, with two thirds of the data in the demosaiced image produced by sophisticated guesswork, there are problems in some demosaiced images. They include false colors and weird patterns. It’s amazing that it works as well as it does.

It occurs to me, with cameras like the Pentax 645Z, that there is now a practical alternative to demosaicing Bayer-mosaiced files. If the raw developer took each block of two green, one red, and one blue raw pixels and used it to compute one output pixel, say by averaging the greens, passing on the reds and blues, and then doing the usual color space conversion, we’d have an image untouched by the hand of demosaicing. With 50 megapixels in the 645Z sensor, we’d have a 12.5 megapixel output file, which is plenty for many purposes. Because two green raw pixels are averaged to produce one processed pixel, we’d have about 0.7 times as much photon luminance noise, which sounds like an advantage until you realize that, using conventional demosaicing and res’ing the result down to 12.5 MP, we’d get half the noise.
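
For concreteness, here’s a minimal Matlab sketch of that four-pixels-into-one operation, assuming an RGGB mosaic stored as a single plane of linear raw values with even dimensions. The function and variable names are mine, and white balance and the usual color space conversion would still have to follow:

    function rgb = fourIntoOne(raw)
        % raw is an m x n single-plane Bayer mosaic, RGGB, with m and n even
        r  = raw(1:2:end, 1:2:end);          % red sensels
        g1 = raw(1:2:end, 2:2:end);          % first green sensel in each 2x2 block
        g2 = raw(2:2:end, 1:2:end);          % second green sensel
        b  = raw(2:2:end, 2:2:end);          % blue sensels
        rgb = cat(3, r, (g1 + g2) / 2, b);   % pass R and B through, average the greens
    end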

Is this a practical approach? Only experimentation can answer that. At best, it will not replace demosaicing in all circumstances, but it may prove useful for some purposes.

PS. If you sometimes look at the photo boards and want a chuckle, take a look here: http://www.lensrentals.com/blog/2012/03/hammerforum-com

The Pentax 645Z

There have been rumblings for months about a new Pentax 645 based on the Sony 50MP 33x44mm chip that’s in the Phase One IQ250 and the Hasselblad H5D-50c. The buzz was that it would be cheap (for a medium format camera) and that Ricoh would put some marketing, service, and distribution wood behind the camera in the US, unlike with the 645D.

The announcement came yesterday – actually, most everything was on the web a day or two earlier – and it was everything the rumor mill said it would be and more. The price wasn’t $10K, but $8500. It will take all the Pentax 645 lenses, and Ricoh has already ramped up production of them. It shakes the sensor to clean it, like a 135-style DSLR. There are 27 autofocus points, unfortunately concentrated in the center.

There’s a reflective LCD on the top of the camera, like a 1DX, a D4, a D800, or an H-series ‘blad. That’s great; you won’t have any trouble seeing what you’re doing outdoors, as you do with cameras whose only displays are the main panel and the EVF (Sony and Leica, I’m talking to you). I wish it would display the histogram after every shot like the Hassy, but you can’t have everything.

You’re going to use mirror lockup a lot on a camera like this. The 645Z has a separate control for it, which I like. It’s a knob, not a button, and I’m not so sure about that. I like mirror lockup to work the way it does on the H-series ‘blads, with a toggle-mode button. The Hassy’s button is hard to reach (but it’s reassignable) and the Pentax knob looks to be easy to get to.

It looks solid mechanically: magnesium body, lots of weather sealing. It’s got two tripod mounts, so you can shoot verticals without the instability of an L-bracket in that orientation.

I’m excited about the articulating LCD panel. Using the similar panel on the a7R has spoiled me, and if I bought an MF camera without one, like the IQ250 or the H5D-50c, I’d have to rig up tethering to a tablet to get the same viewing angle.

Communication with the outside world is via USB3, like the IQ250. I hate that the only way you can talk to the HxD is with the nearly-obsolete FireWire; it’s one of the big reasons why I haven’t upgraded my H2D-39 in seven years.

A camera like this needs great focusing, and these days that means great live view. Initial reports say that the 645’s implementation is good. The H5D-50c requires tethering for good live view. The IQ250 has respectable live view on its touch screen.

Whether a 33x44mm sensor rates being called medium format is a legitimate question. I leave that to others. I’ll call it that for now, just for convenience. The sensor has 5/3 the area of a 24x36mm sensor. If you’re making a print with a classical 4:5 aspect ratio, you’ve got 90% more area to work with in the 645Z because of its native 4:3 aspect ratio. For full-frame prints, the photon noise would be 23% lower than with a 50 MP 135-format sensor (if one existed); for 8x10s, the photon noise would drop by 27%. The differences are a bit smaller than those between APS-C and full frame 135. A nice improvement, but not dramatic.
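
For those who want to check my arithmetic, here’s the back-of-the-envelope calculation, assuming photon noise in the print scales with the square root of the sensor area used:

    a645 = 33 * 44;                      % mm^2, the 645Z sensor
    a135 = 24 * 36;                      % mm^2, a 24x36mm sensor
    fullRatio = a645 / a135              % about 1.68, roughly 5/3
    noiseFull = 1 - 1 / sqrt(fullRatio)  % about 0.23, i.e. 23% less photon noise

    % Crop both sensors to a 4:5 print aspect ratio (an 8x10, for example)
    a645crop = 33 * (33 / 0.8);          % 33 x 41.25 mm
    a135crop = 24 * (24 / 0.8);          % 24 x 30 mm
    cropRatio = a645crop / a135crop      % about 1.89, i.e. 90% more area
    noiseCrop = 1 - 1 / sqrt(cropRatio)  % about 0.27, i.e. 27% less photon noise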

What’s not to like about the 645Z? Those of you who followed my struggles with the Sony a7R’s shutter shock know that it doesn’t take much shake to cut into the effective resolution of a high-pixel-count sensor. There are two ways to drive shutter-induced vibration to very low levels. The tried-and-true method is to use leaf shutters. With interchangeable lenses, that means a shutter in each lens, with the associated costs (Yeah, I know that there are ways to put a big leaf shutter behind the lens like an Argus C3, but really…). The cost of the shutters is a big deal with inexpensive lenses for 35mm-sized cameras, but not so much for MF lenses. However, Pentax has always had the shutter in the camera, and it doesn’t look like they’re going to change now.

That leaves me looking at third-party leaf-shutter lenses.

The flange distance of the Pentax 645 lens mount is 70.67mm. The Hasselblad H-system flange distance is 61.63mm, so H-system lenses won’t work on the Pentax. Mamiya 645 lenses won’t work either. Hasselblad V-series lenses, with their 74.9mm flange distance, should work with an adapter, but many, if not most, of them have marginal performance with a five-micron sensor. As far as I know, a convenient way to wind and trigger those shutters on the Pentax doesn’t yet exist. So the leaf shutter option doesn’t look too promising.

The next way to reduce shutter shock is with electronic first-curtain shutter (EFCS). I have seen how effective this can be in my testing of the Sony a7. Unfortunately, the 645Z doesn’t offer that feature.

So, in the new Pentax, our weapons in the war against camera-generated vibration dwindle down to what we’ve had with SLRs for years: mirror lockup. That eliminates a big vibration generator, but leaves the first shutter curtain – for some reason that I’ve never been able to figure out, the launching of the second curtain does not seem to contribute much to blur. Whether that’s enough remains to be seen.

It doesn’t look like Ricoh is offering variable delay between mirror up and shutter trip, like Hasselblad does in the H3D and following cameras. I don’t consider that a big deal, since I don’t envision using the camera handheld, but those who plan to use it that way should think about doing some vibration testing to see if it’s a problem with the lenses and shutter speeds that are important to them.

Personally, I view this camera as a special-purpose device, for use where resolution and clean tonality are paramount. Resolution is not the most important thing in a camera for most purposes. The camera that gets the most exposures from me is the Nikon D4, which is the lowest-resolution big-boy camera I own. Fast AF, reasonable (but admittedly not small) size and weight, rapid exposure ability, a big buffer (60 raw images to the 645Z’s 10), and a feeling that you could drive nails with the camera when you’re not taking pictures trump pixel count for most purposes.

If you’re a working commercial photographer, you might come to a different conclusion and use the 645 as your go-to camera. At the price, you could buy a backup body or two and still come in under the price of the Hasselblad H5D-50c, which is the lowest-cost alternative (did I really say that Hasselblad was the lowest-cost anything?).

Then there’s stepping up to buying lenses for the camera. The good news is that the lenses don’t feature Leica- or Hasselblad-level pricing. But there’s even better news. If, like me, you’ve decided that this is a special-purpose camera, you’ll only buy lenses as needed for particular projects. That means that one or two will probably do you for a while.

The 645Z is not a perfect vehicle for the Sony 50MP chip. It doesn’t have leaf shutter lenses. Its flange distance means you can’t use many third-party lenses. But it’s so good that I found my mouse finger twitching over the pre-order button on the B&H website. In the end I decided to wait in hopes that some camera manufacturer – are you listening, Sony? – will put this sensor in a short-flange-distance, mirrorless body. The best being the enemy of the good, by the time that happens, I may be holding out for a full-frame 645 sensor.

How to expose the moon?

Last night’s lunar eclipse occasioned a flurry of web traffic about how to set your camera to expose it correctly. I got to thinking – not always a good thing – about the problem, and the more I thought about it the harder it seemed.

Let’s assume that you’re making an image and you know the moon is the brightest thing in the field. Let’s make the further assumption that the moon is not large in the framed image; it’s only a component of the overall scene.

If you like to use your camera’s exposure meter – I don’t – you could set it to spot mode, meter the moon, and place it on Zone VII or (if you’re feeling lucky) VIII by opening up two or three stops from your meter reading. There’s a problem with this approach. Do you know that your camera’s spotmeter is taking its reading entirely from the moon, and not averaging in parts of the sky? If it’s averaging in some sky, it will think the moon is dimmer than it is, and you’re likely to have blown highlights in the final image. It’s actually worse than that; the light from the edges of the moon is dimmer than the light from the center, since the sunlight hits the edges at an angle, so maybe you should only open up a stop or two.

If you’re a fundamentalist photographer, you’ll note that the moon is a gray rock lit by the sun, and therefore, to place it on Zone V, or turn it into a middle gray, you’ll use the “Sunny 16” rule and set the f-stop to f/16 and the shutter speed to one over the ISO setting. Shooting digital, you don’t want a gray moon, you want an ETTR moon, so open up two or, if you’re still feeling lucky, three stops. This is a pretty conservative way to go, since the moon’s reflectivity, at 12%, is less than that of an 18% gray card. This is the calculation that Ansel Adams famously muffed in exposing the negative of Moonrise, Hernandez, New Mexico. He did make a nice save, though.
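
As a hypothetical worked example of that starting point, assuming ISO 100 and ignoring the reflectivity difference:

    iso = 100;
    baseShutter = 1 / iso;                     % Sunny 16: 1/100 s at f/16 puts a sunlit gray rock near middle gray
    ettrStops = 2;                             % open up two stops toward ETTR
    ettrShutter = baseShutter * 2^ettrStops    % 1/25 s at f/16...
    ettrAperture = 16 / sqrt(2)^ettrStops      % ...or stay at 1/100 s and open up to f/8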

The fundamentalist approach is useless during an eclipse, since you won’t know how bright the light falling on the moon is.

If, like me, you like to use the in-camera histogram, you could just make an exposure and look. If you’ve calibrated your camera’s settings using some variant of UniWB, the in-camera histo is a pretty good stand-in for the real raw histogram, and if you haven’t, you won’t blow the highlights, but you will probably not get a real ETTR exposure. However, there’s a fly in the ointment; the in-camera histogram is derived from the JPEG preview image, which is subsampled from the full-resolution sensor image. Unless the moon is reasonably large in the image, the subsampled JPEG is likely to omit the brightest pixel in the raw file. Even if it’s there, can you see one blown pixel on your camera’s histogram?

As far as I know, there’s no easy in-camera solution to getting a perfectly ETTR’d capture under the circumstances I’ve outlined here. You’ve got two choices: back off the estimated ETTR setting (probably the best move if you’re not fanatical about ETTR), or make a test image and look at the raw file (shooting tethered is a special case of this). A lunar eclipse takes long enough that that’s a viable option.

Or maybe you could slap on a long lens, make a test image, look at the in-camera histogram, and put your shorter lens back on the camera. Assuming similar T-factors in the lenses, that should work fine.

This may be a good example of analysis paralysis.

How much image quality is enough?

When we photographers capture images, how much quality should we strive for? A lot depends on how much we know about the eventual use of the image.

Why not just strive for the highest possible quality? Once you say that’s your goal, you’ve signed up for very expensive equipment, the use of a tripod almost all the time, a camera bag that’s too heavy to carry for any distance, and probably a big collection of lights, stands, reflectors, soft boxes, gobos, and the like. And maybe an assistant or two.

Not many of us want to go there. So we compromise. How much we should compromise depends on our objectives for the images.

If we’re shooting for the web, a very small sensor is all we need, if the image can tolerate the deep depth of field that goes along with that decision. If the light’s bright, we may even be able to get away with a cell phone.

If we’re making small portfolios – say 6×8-inch images – a micro four-thirds camera will probably do the job. We may or may not need a tripod. We might need lights, though.

The magazine market isn’t what it used to be. Neither is the book world. But let’s consider them anyway. Now we need to consider the intent of our images. Does the image need sharpness, smooth tonality, elegant shading, and lighting that pops? We’re probably talking full frame 135-style cameras, and maybe medium format. If we’re doing fashion or product work, bring on the lights, diffusers, and assistants.

If we’re selling prints, how much capture quality we need depends on the size of the print. I don’t buy the theory that people back up as the print gets bigger, so resolution doesn’t matter. I think the bigger the print, the more variation in viewing distance you get. Time and time again, I’ve seen people back way up so they can get the gestalt, and then bore right in so they can see the details. When you see somebody doing that to your work, you don’t want the whole thing to fall apart if the viewer is a foot away. Big prints need big sensors. Big sensors need big lenses. We probably need heavy tripods, too.

Thus, when we trip the shutter, we should have a pretty good idea of how big a print we’ll ever want to make from that capture. That’s a tall order. Maybe it’s impossible. Who knows the future?

Here’s an example of what I’m talking about. I just received an order for this image (click here if you want to learn more about it):

Betterlight_00165-Edit

The client wants a 60×60 inch print. When I made the image, I was thinking of large prints — maybe 30×30 — as a possibility, but I had not contemplated one that large. The image is a 6000×6000 pixel squeeze from a 64000×6000 capture. Hence, there’s plenty of information in the vertical, but only enough for 100 ppi in the horizontal direction, or 1/3.6 of what is ideal and about half of what I’ll usually tolerate in a large print. Fortunately, the image doesn’t rely on ultimate crispness to make its point, so I went back to the original capture, and resampled it to 21600×21600. At least that way I’ll get to take advantage of all the vertical pixels, even if I’ll have to make up some horizontal ones.
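
For anyone checking the numbers, here’s the arithmetic behind them (my own back-of-the-envelope check, taking 360 ppi as the ideal):

    printSide = 60;                        % inches on a side
    nativePpi = 6000 / printSide           % 100 ppi in the horizontal direction
    idealPpi = 360;
    nativePpi / idealPpi                   % about 1/3.6 of ideal
    resampledPpi = 21600 / printSide       % 360 ppi after resampling to 21600x21600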

However, the available quality could have easily not worked at that size with an image that needed to scream crisp to get its point across. That’s an example of a larger point. The attributes of image quality that you should strive to… to what? Not to maximize; that’s what this whole post is about. To get to acceptable levels? That sounds so engineering-driven and heartless. Anyway, the attributes of image quality that your work needs to fulfill its mission are the aspects you need to concentrate on.

Trying to build too much quality into our images can lead to far fewer of them, as the cost and hassle of making pictures gradually overcomes our will to make art. Also, having more quality in the files than we ever use in the print is useless. Walking around with nothing better than an iPhone means small images, limited photographic options, and — unless an iPhone happens to be your thing — a restricted ability to communicate as an artist.

The title of today’s post is a question that’s easy to ask, but hard to answer.

A new gallery

I’ve made some changes to the gallery section of the main web site. Actually, Robin Ward, who writes all the web site code and does all the heavy lifting, made the changes, and I am thankful to her. Anyway, the slit scans that had been in the New Work gallery are now in a gallery called Timescapes. The New Work gallery has been given over to some stitched panos made in Maine and Quebec with a handheld M240 and the 50 ‘Lux, synthetic slit scans of NYC subway cars and soccer players, a few autohalftoned fire house images, and one lonely B&W semi-abstract.

You may notice that the subway images are dated 2011, and wonder how they get to be called new work. I date my images with the moment the exposure was made. These images were originally assembled manually using the visual language of the Staccato series. Last year, I reworked them with computer-driven techniques.

Here’s what I have to say about Timescapes in the artist’s statement:

For the last 25 years, from Alone in a Crowd, with its subject motion, through This Green, Growing Land and Nighthawks, which used camera motion, to Staccato, which stitched together little movies, most of my photography has been about movement in one way or another. Timescapes is explicitly so. In a normal photograph, the three-dimensional world is forced into a two-dimensional representation, with both dimensions representing space. In Timescapes, space is constrained even further, to only one dimension, and time becomes the second. Finish line cameras at racetracks work the same way. In this series, I examine what happens in a one-pixel wide line over a period of a minute or two to several hours.

Since the readers of this blog generally have a more technical bent than the viewers of my general web site, I’ll give you some technical details about how I did the work.

I started with a Betterlight scanning back on a Linhof or Ebony view camera. The back has a 3×6000 pixel sensor array that is moved across the image plane with a stepper motor. There’s a panoramic mode built into the Betterlight software. In that mode, the software expects that the camera is installed on a motorized rotary platform. It instructs the stepper motor in the back to position the line sensor in the center position, and leaves it there while it sends instructions to the motor in the platform to slowly spin the camera.

So how do I turn this back into a slit scan camera?

I lie to the software.

I tell it that the camera is on a rotary platform, when in fact it is stationary. Any changes to the image that the line sensor sees are the results of changes in the scene. From this simple beginning stem many interesting images.

Cleaning up sidecar files

My autohalftoning workflow has evolved to something like the following.

  • Write some code.
  • Parameterize it.
  • Find some parameters that produce interesting results.
  • Set up the software to do some ring-arounds.
  • Import the ring-arounds into Lightroom.
  • Delete all but the good ones.
  • Manually remove the orphaned sidecar files.

The last step is not a lot of fun. I wrote some code to automate it:

sidecarcleanup
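
In case the screenshot above is hard to read, here’s a minimal sketch of that kind of cleanup in Matlab. It’s not necessarily my code verbatim; the folder path is hypothetical, and it assumes the processed images are TIFFs and the sidecars are .xls files with matching base names:

    folder = 'C:\autohalftone\output';                    % hypothetical output folder
    sidecars = dir(fullfile(folder, '*.xls'));
    for i = 1:numel(sidecars)
        [~, base, ~] = fileparts(sidecars(i).name);
        imageFile = fullfile(folder, [base '.tif']);
        if ~exist(imageFile, 'file')                      % no matching image survived the cull
            delete(fullfile(folder, sidecars(i).name));   % remove the orphaned sidecar
        end
    end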

Making sidecar files

One of the issues I’m having to confront in the autohalftoning work has come up before in my image-processing programming, but I’ve always sidestepped it. When I write an image-manipulation program, I try to parameterize all of the options, rather than change the code to invoke them. It makes it a lot easier to go back to a set of procedures that’s worked well, and use that as a starting point for further explorations. The problem has been keeping track of what parameters are associated with any particular processed image.

Up to now, I’ve dealt with the issue by manually assigning file names that indicate the processing. There are several problems with that approach. First, I don’t always remember to include all the parameters, or think that a few will be obvious to me when I look at the file later. Also, the parameter descriptions get pretty cryptic because I’m trying to keep the file names short. And the file names get awkwardly long anyway.

Inspired by the way that Adobe Camera Raw keeps track of the processing the user has picked for a particular raw file, I came up with a better approach than arcane, manually-created file names: sidecar files. ACR doesn’t store the output images, and uses sidecar files so that it can perform the earlier processing in their absence. That’s not quite my problem. I am perfectly happy to store the output files, I just want to be able to look at a summary of the processing steps.

I wrote a method to write sidecar files with the same names as the processed images, but in Excel format, so that they have an .xls extension instead of a .tif one. Here’s the code:

sidecar code
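
For those who’d rather not squint at the screenshot, here’s a minimal sketch of the idea, not necessarily my code verbatim. It assumes the parameters live in a Matlab struct, and it writes one parameter per row next to the processed image (xlswrite wants Excel on Windows; writetable is an alternative on newer releases):

    function writeSidecar(imageFileName, params)
        % params is a struct of processing parameters, e.g. (hypothetical fields)
        % params = struct('kernelSize', 13, 'threshold', 0.05, 'centerMultiplier', 1.05);
        [pathstr, base, ~] = fileparts(imageFileName);
        sidecarName = fullfile(pathstr, [base '.xls']);
        names = fieldnames(params);              % parameter names, one per row
        values = struct2cell(params);            % matching values
        xlswrite(sidecarName, [names, values]);  % same base name as the image, .xls extension
    end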

There’s another advantage of this method. I can set up the autohalftoning code to do ring-arounds on any parameter or parameters I want. Then I can go through the images, find the ones I like, and look at the sidecars to see what the process was.

I recommend this approach to anyone rolling their own image processing code.

Adding a dc component to autohalftoning kernels

In addition to kernel size and construction, you can also get useful effects by making the kernel sum to numbers slightly larger than one. This means that the kernel is not strictly a highpass filter, but will preserve some low frequency information. You don’t want much; multiplying the center element of a fence kernel by 1.01 to 1.10 seems to be the sweet spot, but you can have a great deal of control over the look of your image this way.

If you’re into weirdness, you can multiply the center entry by slightly less than one instead, which gives an effect reminiscent of a photographic negative.

I wrote a little Matlab code to tweak the kernel center value.

adjustCenter
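
The screenshot shows the real thing; a minimal sketch of the same idea, assuming a square kernel with odd dimensions, looks like this:

    function kernel = adjustCenter(kernel, multiplier)
        % Scale the center element of an odd-sized convolution kernel.
        [rows, cols] = size(kernel);
        centerRow = (rows + 1) / 2;
        centerCol = (cols + 1) / 2;
        kernel(centerRow, centerCol) = kernel(centerRow, centerCol) * multiplier;
    end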

And here’s what it looks like applied in varying amounts to an image:

fs-1fenceBottom13clipoffsetp05thrp0kc1ClipHighmpy108

fs-1fenceBottom13clipoffsetp05thrp0kc1ClipHighmpy102

fs-1fenceBottom13clipoffsetp05thrp0kc1ClipHighmpy105