A few days ago, I made a post with a handwaving defense of the use of slanted edge MTF metrics for analyzing handheld image sharpness. Today, I’d like to take another crack at it, this time with more rigor.
The upside? A clearer basis for the capabilities and the limitations of the technique.
The downside? This post is harder for me to write, and will be harder for you to read, than the previous one. In addition, it requires the reader to have at least a rudimentary understanding of the mathematics and engineering of digital image processing. I’m not going to use equations except by reference, but this post is not for everyone. If you feel no affinity with Joseph Fourier, feel free to move on.
Is anyone still with me? Great. Both of you are welcome to join me on this little journey. First, a homework assignment. Download and skim this paper. Or, if you wish, just download it and keep it open in a window where you can refer to it as you read this post.
The first thing to notice about the paper is who wrote it. Well, given their positions in the author list, and the rudimentary nature of the content, maybe they didn’t do much of the actual writing. Another clue that they were a bit hands-off is the somewhat tortured English of the text. But still, cast your eyes on the names Robert Ulichney and Jan Allebach.
These people are giants in image processing. When I was a color scientist for IBM in the early 1990s, and going to SPIE conferences, Allebach routinely impressed everyone with his own contributions, and also with those of his students. He’s a nice guy, too. Ulichney wrote the book – literally – on digital halftoning. His inventiveness did not extend to the title, which is just plain Digital Halftoning, but it’s a great book, and one that I got a lot out of. I’m not the only one; the book came out in 1987, and I don’t think it’s a coincidence that soon afterwards error diffusion with blue noise dithering became the fashionable halftoning algorithm. The presence of these names on the paper is, at least for me, a quality control indication, and I have confidence in the accuracy of the contents.
OK, that’s enough motivation. Now let’s look at the paper.
I direct your attention to equation (1). Although the paper doesn’t say so explicitly, this equation embodies the assumption that the imaging system is linear. It’s not an unusual assumption. It has the advantage that it makes the math much more tractable. Also, many imaging systems are indeed approximately linear, or can be made so by unwinding deliberate non-linearities such as those associated with common RGB color spaces, Sony’s lossy raw compression technique, and the like. Some raw conversion processing is, however, non-linear, and could cause problems in an analysis that assumes linearity. Much sharpening is done linearly, so that’s not likely to be a problem, although it could be; the algorithms of many raw converters are proprietary, and without inside information we’d never know. Lightroom’s Exposure control is nonlinear near the white point. In my earlier MTF testing, I stayed well away from there.
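To make the linearity assumption concrete: a linear system obeys superposition – its response to the sum of two inputs is the sum of its responses to each. Here is a minimal sketch; the box blur and the 2.2 gamma curve are my own stand-ins for a linear imaging chain and a deliberate non-linearity, not anything from the paper:

```python
import numpy as np

# Hypothetical "system": a box blur, which is linear, optionally
# followed by a gamma curve, which is not.
def box_blur(x, width=5):
    kernel = np.ones(width) / width
    return np.convolve(x, kernel, mode="same")

def gamma_encode(x):
    return np.clip(x, 0, None) ** (1 / 2.2)

rng = np.random.default_rng(0)
a, b = rng.random(256), rng.random(256)

# Superposition: f(a + b) == f(a) + f(b) for a linear system.
lin_err = np.max(np.abs(box_blur(a + b) - (box_blur(a) + box_blur(b))))
nonlin_err = np.max(np.abs(gamma_encode(box_blur(a + b))
                           - (gamma_encode(box_blur(a)) + gamma_encode(box_blur(b)))))
print(lin_err)     # machine-precision zero: blur alone is linear
print(nonlin_err)  # large: the gamma step breaks superposition
```

This is why “unwinding” the gamma of an RGB color space before analysis matters: with the tone curve in place, superposition – and with it the whole MTF framework – fails.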
The paper goes on to say that, to get the system’s modulation transfer function (MTF), we perform a Fourier transform on the system’s impulse response, or point-spread function (PSF), and take the magnitude, discarding the phase. The authors go on to develop two other representations of an imaging system’s transfer function besides the PSF: the line-spread function (LSF) and the edge-spread function (ESF). You can see where this is going; they’re looking for things that will have a decent signal-to-noise ratio when made into a target. They don’t say it this way, but the LSF is the response of the system to a one-dimensional delta function, and the ESF is the response to the integral of a one-dimensional delta function in a direction perpendicular to the line. Looking at it the other way, since the imaging system is linear, we can compute the LSF by differentiating the ESF: the response to a line target is the derivative of the response to a step target. So if we know the ESF, we can get to the LSF, and thus to the PSF in the direction perpendicular to the edge.
If you’re an engineer, this will all be pretty familiar to you by analogy to one-dimensional systems. In the language of electrical engineering, you can get the step response of a linear system by integrating the impulse response with respect to time, and you can get the impulse response by differentiating the step response with respect to time.
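The one-dimensional analogy can be checked numerically, too. A sketch using the textbook first-order system with impulse response e^(-t) (my choice of example, purely for illustration):

```python
import numpy as np

# First-order linear system with impulse response h(t) = exp(-t), t >= 0.
t = np.linspace(0, 10, 10001)
dt = t[1] - t[0]
h = np.exp(-t)                   # impulse response
step_analytic = 1 - np.exp(-t)   # known step response

# Step response = running integral of the impulse response.
step_numeric = np.cumsum(h) * dt
print(np.max(np.abs(step_numeric - step_analytic)))  # small

# Impulse response = derivative of the step response.
h_numeric = np.gradient(step_analytic, dt)
print(np.max(np.abs(h_numeric - h)))  # small
```

Swap time for distance across the edge, and this is exactly the ESF/LSF relationship above.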
With the preliminaries out of the way, they get to the meat of the three MTF methods that they consider. We’re only concerned about the slanted edge method in this post. The paper authors describe that so clearly that I’m just going to quote them:
“The basic idea for [the slanted-edge] method is that after getting the LSF by derivative of the ESF, compute the Fourier transform of the LSF. Normalize the Fourier transform value [they mean values] in order to get the spatial frequency response (SFR), denoted as the MTF.”
Pretty straightforward, huh? Hold on, we’re not quite done yet.
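The quoted recipe can be sketched in a few lines. This assumes a synthetic Gaussian edge profile rather than a real slanted-edge image, and NumPy in place of the authors’ Matlab code:

```python
import numpy as np

# Synthetic blurred edge: the ESF is the running integral of a
# Gaussian LSF (an idealized stand-in for a real edge crop).
x = np.linspace(-4, 4, 1024)
dx = x[1] - x[0]
sigma = 0.5
lsf_true = np.exp(-x**2 / (2 * sigma**2))
esf = np.cumsum(lsf_true) * dx       # synthetic edge profile

lsf = np.gradient(esf, dx)           # 1. differentiate the ESF
otf = np.fft.rfft(lsf)               # 2. Fourier transform the LSF
mtf = np.abs(otf) / np.abs(otf[0])   # 3. magnitude, normalized to 1 at DC

freqs = np.fft.rfftfreq(len(lsf), d=dx)  # frequency axis, cycles per unit of x
print(mtf[0])                        # 1.0 by construction
```

For a Gaussian LSF the MTF is itself a Gaussian in frequency, so `mtf` falls smoothly from 1 toward zero – a handy sanity check. The real slanted-edge codes add steps this sketch omits: locating the edge, projecting pixels along it to build a supersampled ESF, and windowing before the transform.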
The authors elide the details of getting the SFR from a real image by referencing a program called SFREdge, which is available in several forms as Matlab source code. I tracked it down. You can download it here or here. This program was the starting point for the Imatest SFR code, which is, perhaps not coincidentally, written in Matlab.
The paper then compares slanted edge analysis on two imaging systems to two other methods, and concludes [my summary, not theirs] that in optical systems where anisotropy exists, slanted edge analysis should be performed on perpendicular edges. Not a surprise, really.
OK, what have we learned about the limitations of the slanted edge method for computing MTF? The linearity limitation, though not explicitly called out in the paper, is the one that sticks in my mind. I can imagine that there are camera systems for which a better signal-to-noise ratio could be obtained by performing SFR computation on more than two edge angles.
Where does this leave us with respect to running SFRedge or sfrmat2, or any code derived from it, on images that contain a blur component? Let’s assume that the imaging system is sufficiently linear for tripod-mounted SFR analysis. Without that, no Imatest SFR analysis makes sense.
If we can demonstrate that motion blur can be modeled as the summation of a series of images without motion blur, then, since addition is a linear operation, that would be sufficient to establish the applicability of SFR results to images with motion blur.
So consider the following thought experiment. Imagine that the path of the camera during a handheld time exposure is known exactly. Now imagine picking n equally-spaced instants during that exposure, and taking a single tripod-mounted photograph from the camera position at each instant. Add the photographs together, and they form the output of a linear system. Let n approach infinity, and the result is still linear. Then consider that summing photon counts over time is exactly what the camera’s sensor does. QED.
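The thought experiment is easy to simulate. A minimal sketch in NumPy, using a one-dimensional synthetic scene and a uniform horizontal drift as the assumed camera path (both are my illustrative choices):

```python
import numpy as np

# Model a handheld exposure as n static exposures taken along a known
# camera path (here, a uniform one-pixel-per-instant drift), then summed.
rng = np.random.default_rng(1)
scene = rng.random(512)              # a 1-D "scene" for simplicity

def static_frame(img, k):
    return np.roll(img, k)           # tripod shot from camera offset k

n = 8
frames = [static_frame(scene, k) for k in range(n)]
blurred = sum(frames) / n            # the summed (motion-blurred) image

# Because summation is linear, the same blur is a single circular
# convolution with a box kernel along the path -- a linear operation.
kernel = np.ones(n) / n
via_conv = np.real(np.fft.ifft(np.fft.fft(scene) * np.fft.fft(kernel, len(scene))))
print(np.max(np.abs(blurred - via_conv)))  # agree to machine precision
```

The sum of static frames and the convolution along the path are the same image, which is the heart of the argument: the blurred system is still linear, so slanted-edge SFR analysis remains applicable to it.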
There is one potential fly in the ointment. What if the scene changes during the exposure? Not likely with a target, but a limitation that rigor compels me to mention.