\documentclass{InsightArticle} \usepackage[dvips, bookmarks, bookmarksopen, backref, colorlinks,linkcolor={blue},citecolor={blue},urlcolor={blue}, ]{hyperref} % to be able to use options in graphics \usepackage{graphicx} % for pseudo code \usepackage{listings} % subfigures \usepackage{subfigure} \usepackage{amsfonts} \title{Phase Correlation Method for ITK} \release{1.1} \author{Jakub Bican$^{1}$} \authoraddress{$^{1}$jakub.bican@matfyz.cz, Department of Image Processing, Institute of Information Theory and Automation, Academy of Sciences of the Czech Republic} \begin{document} \maketitle \ifhtml \chapter*{Front Matter\label{front}} \fi \begin{abstract} \noindent Phase Correlation Method (PCM or SPOMF - Symmetric Phase-Only Matched Filter) is a well known image registration method, that exploits Fourier Shift Theorem property of Fourier Transform. Even though it is limited to estimate only shifts between two images, it is very useful as it is robust to frequency-dependent noise and initial image overlap area. Furthermore, it evaluates very fast by way of two forward FFTs, one complex image division and one inverse FFT. This document describes the implementation of Phase Correlation Method for the Insight Toolkit ITK \url{www.itk.org}. The implementation splits the algorithm in several classes to enable further improvements to the method to be implemented (such as different sub-pixel algorithms). The paper is accompanied with the source code, tesing and example code and input data for purposes of testing of this implementation. \end{abstract} \section{Theoretical background} Phase correlation method \cite{Kuglin1975} takes advantage of \emph{Fourier shift theorem} that relates the phase information of an image and its shifted copy. If $f_M$ is shifted copy of $f_F$: \begin{displaymath} f_M(\vec x) = f_F(\vec{x}+\Delta\vec{x}), \end{displaymath} then according to the shift theorem, the Fourier transforms of the images are related as follows: \begin{displaymath} F_M(\vec\omega) = F_F(\vec\omega) e^{i(\omega_x\Delta{}x_x+\omega_y\Delta{}x_y+\omega_z\Delta{}x_z)}, \end{displaymath} and \begin{equation} \label{eq:shifttheoremdiv} \frac{F_M(\vec\omega)}{F_F(\vec\omega)} = e^{i(\omega_x\Delta{}x_x+\omega_y\Delta{}x_y+\omega_z\Delta{}x_z)}. \end{equation} Inverse Fourier transform of equation \ref{eq:shifttheoremdiv} gives us a \emph{correlation surface} $Corr(\vec x)$ which has a form of Dirac delta function located at $\Delta\vec{x}$: \begin{equation} \label{eq:PCM} Corr(\vec x) = \mathfrak{F}^{-1}\left(\frac{F_M(\vec\omega)}{F_F(\vec\omega)}\right) = \mathfrak{F}^{-1}\left(e^{i(\omega_x\Delta{}x_x+\omega_y\Delta{}x_y+\omega_z\Delta{}x_z)}) = \delta(\vec{x}+\Delta\vec{x}\right). \end{equation} Thus, locating a peak in a correlation surface results in offset $\Delta\vec{x}$ that can be used to align the two images at pixel-level. Quotient of spectrums $F_M$ and $F_F$ in equations \ref{eq:shifttheoremdiv} and \ref{eq:PCM} is in practise (when $f_M$ is not exactly the shifted copy of $f_F$) usually computed as \begin{displaymath} \widehat{Corr}(\vec\omega) = \frac{F_M F_F^*}{\left| F_M \right| \left| F_F \right|}, \end{displaymath} so that PCM computes the correlation of whittened images (images with $\left| F \right| := 1$). $\widehat{Corr}(\vec\omega)$ is a Fourier transform of correlation surface $Corr(\vec x)$ and is usually called \emph{cross-power spectrum}. \section{Principles of PCM implementation} The implementation is based on basic PCM principle described above, but is dealing with arbitrarily sized and spaced images. It also enables implementation of various enhancements published later. The basic PCM algorithm has 4 (5) steps: \begin{enumerate} \item[0.] Resample Fixed and Moving image $f_F$ and $f_M$ to same physical size and resolution: images $f_F^{rs},f_M^{rs}$. \item Compute forward FFT of the images: spectra $F_F^{rs},F_M^{rs}$. \item Compute the ratio of the two spectra and get complex (frequency domain) cross-power spectrum $\widehat{Corr}\left(\vec\omega\right)$. \item Compute inverse FFT of the complex cross-power spectrum and get real correlation surface $Corr(\vec{x})$. \item Find the maximum peak in the correlation surface and determine the shift $\Delta\vec{x}$. \end{enumerate} The 0th step is not necessarily the part of the algorithm, but it is a prerequisite for further operations. The images must have same size and should have same real size (i.e. spacing) to reach meaningful results. First, the image with smaller real size in certain dimension is padded by zeros to reach the size of the other image in that dimension. This is done for all dimensions in one step. The next step should be resampling of one image to have the same spacing as the second image. This is actually done after forward FFTs by cropping high frequencies in some dimension from an image which has higher resolution in that dimension. This is done implicitly for all dimensions by simply omitting those high frequencies from computation during 2nd step. The 2nd step is implemented in separate class called \emph{operator} that is plugged in the main class during run time. It is the first point of possible extensions. It enables for example frequency-dependent filtering and other techniques. The 3rd and 4th step are called \emph{optimization} as they have to determine the shift from the cross-power spectrum. Optimization is carried by \emph{optimizer}, which is another pluggable component of the method and the second point of possible extensions. As some optimizers work on complex cross-power spectrum (for example \cite{hoge2003medimg}) and some others on real correlation surface (as the maximum-peak method introduced by original paper \cite{Kuglin1975} and implemented here), the method incorporates the inverse FFT filter according to the input image type of the optimizer. \section{Usage} The method itself is implemented in class \code{PhaseCorrelationImageRegistrationMethod} and it is in fact a mini-pipeline that consists of two \doxygen{PaddImageFilter}s that padd the images and cast data of integral types to real types, two \doxygen{FFTRealToComplexConjugateImageFilter}s, one \code{PhaseCorrelationOperator} (or descendant), possibly one \doxygen{FFTComplexConjugateToRealImageFilter} and one descendant of \code{PhaseCorrelationOptimizer} (for example \code{MaxPhaseCorrelationOptimizer}). Before execution, it is necessary to set fixed image, moving image, operator and optimizer to the \code{PhaseCorrelationImageRegistrationMethod}'s instance. The input images may be of any dimension and any scalar data type that is convertible to some real type and may have different size, spacing and real size, since they are automatically resampled to the same proportions as described above. The method is invoked normally by \code{Update()} or by updating some downstream filter. The result can be obtained in form of \doxygen{TranslationTransform} as filter output, that is passable down the pipeline, or in form of translation parameters that describe the shift in every dimension. The transformation can be directly used to transform the Moving image to match the Fixed image. Users familiar with \doxygen{ImageRegistrationMethod} will notice similar interface of the main class \code{PhaseCorrelationImageRegistrationMethod}. \section{Example} This section describes the example code attached with this contribution in \code{PhaseCorrelationImageRegistrationMethodExample.cxx}. With this example, you can use \code{BrainProtonDensitySliceBorder20.png} and \code{BrainProtonDensitySliceShifted13x17y.png} images from ITK's example data set. These images are attached with this source and testing code as well. First, make the standard includes amd check the command line arguments. \small \begin{verbatim} #include "itkPhaseCorrelationImageRegistrationMethod.h" #include "itkPhaseCorrelationOperator.h" #include "itkMaxPhaseCorrelationOptimizer.h" #include "itkImageFileReader.h" #include "itkResampleImageFilter.h" #include "itkCastImageFilter.h" #include "itkImageFileWriter.h" int main( int argc, char *argv[] ) { if (argc<4) { std::cout << "Usage: " << std::endl; std::cout << argv[0] << " FixedImageFile MovingImageFile" << " OutputImageFile" << std::endl; return EXIT_FAILURE; } \end{verbatim} \normalsize The second standard step is to declare the types and read the images. \small \begin{verbatim} const unsigned int Dimension = 2; typedef float PixelType; typedef itk::Image< PixelType, Dimension > ImageType; typedef itk::ImageFileReader ReaderType; ReaderType::Pointer fixedReader = ReaderType::New(); ReaderType::Pointer movingReader = ReaderType::New(); fixedReader->SetFileName( argv[1] ); movingReader->SetFileName( argv[2] ); try { fixedReader->Update(); movingReader->Update(); } catch (itk::ExceptionObject & e) { std::cerr << "Unable to read input images!" << std::endl; std::cerr << e << std::endl; return EXIT_FAILURE; } \end{verbatim} \normalsize Now init the registration method. Declare the type of the registration method, operator and optimizer and instantiate these components. \small \begin{verbatim} typedef itk::PhaseCorrelationImageRegistrationMethod RegistrationType; typedef itk::PhaseCorrelationOperator OperatorType; typedef itk::MaxPhaseCorrelationOptimizer OptimizerType; RegistrationType::Pointer pcmRegistration = RegistrationType::New(); OperatorType::Pointer pcmOperator = OperatorType::New(); OptimizerType::Pointer pcmOptimizer = OptimizerType::New(); \end{verbatim} \normalsize Now connect the optimizer and operator to the registration method and assign the inputs (fixed image and moving image). \small \begin{verbatim} pcmRegistration->SetOperator( pcmOperator ); pcmRegistration->SetOptimizer( pcmOptimizer ); pcmRegistration->SetFixedImage( fixedReader->GetOutput() ); pcmRegistration->SetMovingImage( movingReader->GetOutput() ); \end{verbatim} \normalsize When everything is prepared, execute the registration. \small \begin{verbatim} try { pcmRegistration->Update(); } catch ( itk::ExceptionObject & e ) { std::cout << "Some error during registration:" << std::endl; std::cout << e << std::endl; return EXIT_FAILURE; } \end{verbatim} \normalsize The registration result can be retrieved as translation parameters or as a translation transform from registration method's output. \small \begin{verbatim} RegistrationType::ParametersType parameters = pcmRegistration->GetTransformParameters(); RegistrationType::TransformType::ConstPointer transform = pcmRegistration->GetOutput()->Get(); std::cout << "Translation found: " << parameters << std::endl; \end{verbatim} \normalsize Finally, resample the moving image by found transformation and write it. First, declare the types for resampler, caster and writer and instantiate these classes. \small \begin{verbatim} typedef itk::ResampleImageFilter ResamplerType; typedef itk::Image CharImageType; typedef itk::CastImageFilter CasterType; typedef itk::ImageFileWriter WriterType; ResamplerType::Pointer resampler = ResamplerType::New(); CasterType::Pointer caster = CasterType::New(); WriterType::Pointer writer = WriterType::New(); \end{verbatim} \normalsize Connect the components and set the resultant transform to the resampler. \small \begin{verbatim} resampler->SetOutputParametersFromImage( fixedReader->GetOutput() ); resampler->SetTransform( transform ); resampler->SetInput( movingReader->GetOutput() ); caster->SetInput( resampler->GetOutput() ); writer->SetInput( caster->GetOutput() ); writer->SetFileName( argv[3] ); \end{verbatim} \normalsize Execute the pipeline to resample, write the output image and then exit. \small \begin{verbatim} try { writer->Update(); } catch(itk::ExceptionObject & e) { std::cerr << "Unable to generate or write output image!" << std::endl; std::cerr << e << std::endl; return EXIT_FAILURE; } return EXIT_SUCCESS; } \end{verbatim} \normalsize \section{Literature survey} This section aims to give a short overview of principal papers focusing phase correlation. The relation of PCM to other image registration methods is described in \cite{zitova2003}. This overview should be considered as introductory, just to give some references to the methods and techniques related to PCM. The method was originally introduced by Kuglin and Hines \cite{Kuglin1975} in 1975. Their approach designated to register only translated images is a basis for implementation in this contribution. Then, the method was used to register translated and rotated images \cite{castro1987} and finally to register translated, rotated and scaled images \cite{reddy1996}. The last approach transforms rotation and scaling to translation by log-polar mapping of shift-invariant magnitudes of image spectrums. Such operation can be easily implemented by combination of ITK filters with the classes from this contribution, but the technique is limited to register 2-D images only (as it is not possible to transform 3-D rotations to translations). The idea was further developed in \cite{zokai2005}. Authors of papers \cite{keller2005ip,keller2005pami} deal with disadvantages of polar mapping by employing pseudo-polar Fourier transform\cite{averbuch2006}. Other techniques were proposed in \cite{lucchese2000,lucchese2001}. There is a group of papers that aim to enhance PCM to register images with subpixel precision. The basic approach of interpolating the images or correlation surface (e.g. \cite{abdou1998}) is outperformed by estimating the polyphase decomposition of cross power spectrum in \cite{shekarforoush1996, foroosh2002}. In \cite{stone2001}, Stone et al. estimate the subpixel shift by first eliminating the effect of aliasing and then using least-squares fit to 2-D phase difference data. As this is a difficult task, authors of \cite{hoge2003medimg, hoge2003icip2, hoge2005} separate the shift estimation in every dimension by SVD or high-order SVD of cross-power spectrum. An improvement of robustness of this approach is given in \cite{keller2004}. Foroosh and Hoge give in \cite{ForooshHoge2003} a detailed study of some subpixel PCM techniques both in spatial and frequency domain. They also discuss the performance of PCM in case of images acquired by different sensor modalities. Several methods for registration of 3-D images using frequency-domain techniques were proposed in \cite{lucchese1998,lucchese2002} and \cite{keller2005cvpr,keller2006}. The second group of papers employs 3-D pseudopolar Fourier transform to estimate the rotation from shift-invariant frequency magnitudes. \section{Further improvements} This contribution provides implementation of basic PCM algorithm. It can be further improved or extended for example by: \begin{itemize} \item filtering of input data to overcome the boundary effect, \item filtering of spectral data to overcome the effect of aliasing, \item implementing various subpixel techniques both in spatial and frequency domain. \end{itemize} Most of the techniques that exist in the literature can be implemented by creating a new operator or optimizer. \section{Software Requirements} You need to have the following software installed: \begin{itemize} \item Insight Toolkit 3.0 \item CMake 2.4 \end{itemize} PCM implementation uses a new customization of \code{New()} method of \doxygen{FFTRealToComplexConjugateImageFilter} and \doxygen{FFTComplexConjugateToRealImageFilter} introduced in ITK 3.0. This customization automatically selects the available FFT implementation between Vnl (default) or FFTW (if available for current data type). This behavior can be overriden by using a factory mechanism and so force PCM implementation to use desired FFT implementation. \appendix %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % % Insert the bibliography using BibTeX % %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \bibliographystyle{plain} \bibliography{pcm} \nocite{ITKSoftwareGuideSecondEdition} \end{document}