Friday, July 10, 2009

A Python Implementation of The Log-Polar Transform

Update: I have written a new article on the log-polar transform, implementing a simpler algorithm, and also taking advantage of NumPy's features to improve performance. Check it out!

The log-polar transform is a popular pre-processing step used in Computer Vision systems. It has several properties that make it attractive for image processing: overall image resolution is lowered (reducing processing requirements) while a small region of interest is kept intact; scalings and rotations in the original image are turned to translations across the transform's axes, facilitating matching tasks; and its conformity ensures that results obtained from the transform can be cast back into the original image.

On the Computation of the Discrete Log-Polar Transform explains the underlying theory, discusses how to find the correct transform parameters, and includes a complete set of algorithms. Even though at times I have deviated from its advice, in general it was an invaluable resource while developing my own Python implementation of the log-polar transform, which you can download here.

The logpolar module depends on the NumPy package; the PIL package is also needed to run the demonstration code, but otherwise is not a dependency. Its most important public symbol is the transform() function, which is largely a direct implementation of the Algorithm 2 presented in the paper's 6th chapter. The modification's I introduced are:
  • The paper isn't much clear about what the transform output's dimensions should be, so I implemented a small function to automatically calculate convenient defaults;
  • Originally, the transform's center influenced the range of the output's rho dimension. This was not interesting for my purposes, so I added a normalizing factor to the radial distance r, allowing rho's range to be set solely in terms of the input image's dimensions and the transform's parameters;
  • The algorithm was rather optimistic in its use of the arc tangent function, in that it is expected to generate the whole range of values [0, 2π) from the y to x ratio. Most programming languages won't provide anything so sophisticated out-of-the-box; Python is no exception, so I have implemented an enhanced arc tangent function over the standard library's math.atan();
  • The log-polar transform's perifery is a many-to-one map: a number of input pixels are averaged to provide the value for each output pixel. The original algorithm accumulated the pixel values in a sum matrix A and the numbers in a count matrix B, and then iterated over the matrixes to calculate the averages. I decided to employ an iterative mean formula instead, trading a little precision for better performance;
  • Instead of defining the fovea explicitly as a radius r0 from the transform's center, I found it simpler to define it implicitly as the first k lines of the transform's output for which the number of mapped pixels are less than the row length. A simple nearest neighbour algorithm is then used to expand those pixels over the empty area.
Additionally, a convenience class Snapshot is provided, which allows PIL images to be treated as NumPy matrixes, and thus directly fed to the logpolar.transform() function.