Home Page of Helio Perroni Filho

Update: I have written a new article on the log-polar transform, implementing a simpler algorithm, and also taking advantage of NumPy's features to improve performance. Check it out!

The log-polar transform is a popular pre-processing step used in Computer Vision systems. It has several properties that make it attractive for image processing: overall image resolution is lowered (reducing processing requirements) while a small region of interest is kept intact; scalings and rotations in the original image are turned to translations across the transform's axes, facilitating matching tasks; and its conformity ensures that results obtained from the transform can be cast back into the original image.

On the Computation of the Discrete Log-Polar Transform explains the underlying theory, discusses how to find the correct transform parameters, and includes a complete set of algorithms. Even though at times I have deviated from its advice, in general it was an invaluable resource while developing my own Python implementation of the log-polar transform, which you can download here.

The logpolar module depends on the NumPy package; the PIL package is also needed to run the demonstration code, but otherwise is not a dependency. Its most important public symbol is the transform() function, which is largely a direct implementation of the Algorithm 2 presented in the paper's 6th chapter. The modification's I introduced are:

The paper isn't much clear about what the transform output's dimensions should be, so I implemented a small function to automatically calculate convenient defaults;
Originally, the transform's center influenced the range of the output's rho dimension. This was not interesting for my purposes, so I added a normalizing factor to the radial distance r, allowing rho's range to be set solely in terms of the input image's dimensions and the transform's parameters;
The algorithm was rather optimistic in its use of the arc tangent function, in that it is expected to generate the whole range of values [0, 2π) from the y to x ratio. Most programming languages won't provide anything so sophisticated out-of-the-box; Python is no exception, so I have implemented an enhanced arc tangent function over the standard library's math.atan();
The log-polar transform's perifery is a many-to-one map: a number of input pixels are averaged to provide the value for each output pixel. The original algorithm accumulated the pixel values in a sum matrix A and the numbers in a count matrix B, and then iterated over the matrixes to calculate the averages. I decided to employ an iterative mean formula instead, trading a little precision for better performance;
Instead of defining the fovea explicitly as a radius r0 from the transform's center, I found it simpler to define it implicitly as the first k lines of the transform's output for which the number of mapped pixels are less than the row length. A simple nearest neighbour algorithm is then used to expand those pixels over the empty area.

Additionally, a convenience class Snapshot is provided, which allows PIL images to be treated as NumPy matrixes, and thus directly fed to the logpolar.transform() function.

Home Page of
Helio Perroni Filho

Friday, July 10, 2009

A Python Implementation of The Log-Polar Transform

Pages

Archives

Home Page ofHelio Perroni Filho

Friday, July 10, 2009

A Python Implementation of The Log-Polar Transform

Pages

Archives

Home Page of
Helio Perroni Filho