3D nonrigid medical image registration using a new information theoretic measure. Bicao Li, Guanyu Yang, Jean Louis Coatrieux, Baosheng Li, Huazhong Shu To cite this version: Bicao Li, Guanyu Yang, Jean Louis Coatrieux, Baosheng Li, Huazhong Shu. 3D nonrigid medical image registration using a new information theoretic measure.. Physics in Medicine and Biology, IOP Publishing, 2015, 60 (22), pp.8767-90. <10.1088/0031-9155/60/22/8767 >. <hal-01253712> HAL Id: hal-01253712 https://hal.archives-ouvertes.fr/hal-01253712 Submitted on 11 Jan 2016 HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers. L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés. Three-dimensional Nonrigid Medical Image Registration Using a New Information Theoretic Measure Bicao Li1, 2, 3, Guanyu Yang1, 2, 3, Jean Louis Coatrieux1, 4, 5, 6, Baosheng Li1, 7 and Huazhong Shu1, 2, 3 1 Laboratory of Image Science and Technology, School of Computer Science and Engineering, Southeast University, 210096 Nanjing, China 2 Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education, 210096 Nanjing, China 3 Centre de Recherche en Information Médicale Sino-français (CRIBs), Nanjing, 210096, China 4 INSERM, U1099, Rennes, F-35000, France 5 LTSI, Université de Rennes 1, Rennes, F-35042, France 6 Centre de Recherche en Information Médicale Sino-français (CRIBs), Rennes, F-35000, France 7 Department of Radiation Oncology, Shandong Cancer Hospital, 250117 Jinan, China E-mail: [email protected] Abstract This work presents a novel method for nonrigid registration of medical images based on the Arimoto entropy, a generalization of the Shannon entropy. The proposed method employed the Jensen-Arimoto (JA) divergence measure as a similarity metric to measure the statistical dependence between the medical images. Free-form deformations were adopted as the transformation model and the Parzen window estimation was applied to compute the probability distributions. A penalty term is incorporated into the objective function to smooth the nonrigid transformation. The goal of registration is to optimize an objective function consisting of a dissimilarity term and a penalty term, which would be minimal when two deformed images are perfectly aligned using the limited memory BFGS optimization method, and thus to get the optimal geometric transformation. To validate the performance of the proposed method, experiments on both simulated 3D brain MR images and real 3D thoracic CT data sets were designed and performed on the open source elastix package. For the simulated experiments, the registration errors of 3D brain MR images with various magnitudes of known deformations and different levels of noise were measured. For the real data tests, four data sets of 4D thoracic CT from four patients were selected to assess the registration performance of the method, including ten 3D CT images for each 4D CT data covering an entire respiration cycle. These results were compared with the normalized cross correlation and the mutual information methods and show a slight but true improvement in registration accuracy. Keywords: Arimoto entropy, divergence measure, free-form deformations, nonrigid registration. 1 1. Introduction Image registration is a very common and useful technique in numerous fields including medical imaging, computer vision, remote sensing, etc. It plays a major role in medicine for multimodal diagnosis and computer-aided surgery (Maintz et al 1998, Hill et al 2001). Due to the motion of tissue and organ of the patients, nonrigid deformations must be taken into account. This problem can be addressed by using nonrigid geometric registration methods aimed at finding the optimal elastic transformation between the two images to be aligned. Feature-based and intensity-based alignment methods (Khader et al 2011) are classically considered for such purpose. For the feature-based registration methods, their accuracy largely depends on the choice and the extraction of features, which usually include salient image elements such as corner points, edges, ridges, surfaces (Maintz et al 1996), and others. Let us mention, among many other schemes, SUSAN (Smith et al 1997) based on the minimization of a local image region and SIFT (Lowe 2004) that are invariant to scale and orientation. Szeliski and Coughlan introduced a fast registration algorithm for 3D anatomical surfaces (Szeliski and Coughlan 1997). In their approach, a distance criterion was minimized to search the optimal mapping between two surfaces. Another feature-based approach consists of using the moment invariants. Flusser and Suk described a blurring geometric moment invariant (Flusser and Suk 1998) and deduced a combined blurring and rotation invariant of geometric moment which were applied to feature extraction in image registration. On the basis of Flusser’s work, Bentoutou et al (2002) utilized the combined invariants for image registration in digital subtraction angiography (Bentoutou et al 2002). A robust discriminative clustering method based on mutual information of supervoxels is presented to extract the features of brain MRI (Kong et al 2015). However, the registration accuracy of feature-based approaches is directly determined by the feature extraction quality and completeness. This preprocessing step is not trivial in particular in medical imaging where low contrast and noise are observed. Another kind of registration methods accounts for intensity-based approaches, in which the spatial mapping of the images to be registered relies on the image intensity values, without any feature extraction or prior segmentation. Hoh et al (1993) used the sum of absolute differences (SAD) and the stochastic sign change (SSC) similarity measures for rigid registration of cardiac PET images (Hoh et al 1993). Similarly, the sum of squared differences (SSD) was also applied to intra-modality image registration. They all assume that the corresponding intensity values of the two images to be aligned are similar. To improve the robustness of image registration, the correlation coefficient was proposed by Lemieux et al (1994). However, it was pointed out that the correlation coefficient is sensitive to outliers (Huber 1981). Even a few outliers may greatly degrade the registration quality as it was shown later (Kim and Fessler 2002). To overcome this problem, a robust correlation coefficient (Kim and Fessler 2004), serving as registration metrics, was described. However, the correlation methods, if of interest for monomodal registration, are not suitable for the registration of multisensor (i.e. multimodal) images due to their totally different physics used (Maurer et al 1993). Another popular framework for intensity-based registration methods is based on the concept of information theory. The maximization of mutual information (MI) for multimodal image registration was independently introduced by Collignon et al (1995), Wells et al (1996), Maes et al (1997), Viola and Wells (1997). Collignon et al (1995) applied the MI registration algorithm to human head images and achieved a sub-voxel registration accuracy. In the paper by Maes et al (1997), a partial volume interpolation was proposed to estimate the marginal and joint image intensity distributions, and so, smoothing the objective function and 2 reducing the number of local maxima. Sub-voxel accuracy, high robustness and good immunity to noise and geometric distortion were shown. A new way to compute the image entropy and mutual information was reported by Viola and Wells (1997). The MI based methods do not require any preprocessing and they have found a large variety of applications. However, in some applications, mutual information may fail in registering images, especially when the image misalignment is due to a change in the field of view (FOV). To address this issue, a method called normalized mutual information (NMI) (Studholme et al 1999), robust to a partial overlap of the registered images, was proposed. NMI was defined as a ratio of the sum of the marginal entropies and the joint entropy. This approach was validated on MRI-CT images with different FOVs. Pluim et al (2000) presented a new information measure by combining the MI with gradient information. The matching results of MR-T1 and PET images indicated that this new method yielded a better registration than MI and NMI. A generalized method using the so-called f-information measure was later reported by the same group (Pluim et al 2001) and further extended to a series of other similarity measures, including V-information, I!-information, M!-information, "!-information and Rényi measure (Pluim et al 2004). Another registration criterion, based on cumulative probability distributions, called cumulative residual entropy (CRE) (Wang et al 2003) was proposed whose theoretical relations (Rao et al 2004) with Shannon entropy were further analyzed and compared with MI-based methods (Wang et al 2007). The CRE measure, being based on the cumulative distribution functions (CDF) rather than the probability distribution functions (PDF), was shown more robust to noise. An alternative information-based approach was introduced by Studholme et al (2006). They defined a nonrigid viscous fluid registration scheme, which uses a new similarity measure calculated over a set of overlapping sub-regions of the image. A third channel, representing a spatial label to extend the intensity joint histogram, was estimated by B-spline Parzen windows (Thévenaz et al 2000). A similarity measure called conditional mutual information (cMI) (Loeckx et al 2010) was presented. Similarly, cMI was calculated by extending the joint histogram to a third dimension, representing the spatial distribution of the joint intensities. The Shannon entropy possesses the additivity property, which means that the joint entropy of a pair of independent random variables is the sum of the individual entropies. However, as correctly indicated by Antolin et al (2010), this property does not take the correlation between the variables into account. To overcome this problem, they introduced the pseudo-additive Tsallis entropy. Inspired by Antolin’s work, we have presented a novel divergence measure called the Jensen-Arimoto (JA) divergence as the registration metric for 2D-2D rigid medical registration (Li et al 2014). It was shown that the classic mutual information is a particular case of JA divergence as ! = 1, where ! is a parameter appeared in JA divergence measuring the departure degree from pseudo-additivity. In this paper, we further investigate the concavity of Arimoto entropy and boundedness property of JA divergence and apply it for 3D nonrigid image registration. We introduce a 3D nonrigid registration technique with JA as the similar measure term and bending energy selected as the regularization term. Free-form deformations (FFDs) are employed as the transformation model and the limited memory Broyden-Fletcher-Goldfarb -Shanno (L-BFGS) optimization method is used to optimize the JA divergence. In our optimization scheme, we estimate the continuous probability distributions using the B-splines Parzen window, so that the analytical gradient of JA divergence can be obtained. The rest of this paper is organized as follows. In section 2, we introduce the new divergence measure based on Arimoto entropy. The subsequent nonrigid registration method 3 is detailed in section 3. Section 4 provides our experimental results on simulated 3D MR brain images and real 3D thoracic CT data with a comparison to the standard nonrigid methods based on the mutual information. Concluding remarks and perspectives are sketched in section 5. 2. Theory In this section, the Arimoto entropy is introduced and its properties are examined. Then, a new divergence called the Jensen-Arimoto divergence is presented and some basic properties are derived. 2.1. Arimoto Entropy Given a random variable X, the Shannon entropy of X is a measure of the amount of uncertainty included in the random variable (Cover et al 2006), H ( X ) = "# x!X p( x) log p( x). Consider another random variable Y, the conditional entropy of X given Y is defined as the amount of uncertainty left in X when knowing Y (Maes et al 1997). The mutual information is related to the reduction of the entropy of X by the conditional entropy of X given Y, and accounts for the degree of disparity between the two random variables. That is, when two random variables are independent from each other, the entropy of X is equal to the conditional entropy of X given Y so that the value of MI equals to zero, while MI achieves its maximum if Y is completely dependent on X. MI has been widely used in statistical applications. The Arimoto entropy (Arimoto 1971), a generalization of Shannon entropy, was introduced by Arimoto and further developed by other researchers (Boekee et al 1980, Liese et al 2006). Its definition is given by 1 M ! " ! ! # (1) A! ( X ) = &1 $ (* pi ) ' ! > 0,! % 1 ! $1 ( i =1 ) where P = (p1, p2, …, pM) denotes the M probability distributions on X. Some significant properties of the Arimoto entropy were deduced (Boekee et al 1980). Here, we only consider three of them. Non-negativity: (2) A! ( X ) " 0 ! > 0,! # 1 Pseudo-additivity: A! ( X , Y ) = A! ( X ) + A! (Y ) " Concavity: ! "1 A! ( X ) A! (Y ) ! (3) (4) A! (tX 1 + (1 " t ) X 2 ) # tA! ( X 1 ) + (1 " t ) A! ( X 2 )t $ (0,1) The detailed proof of these properties can be found in Liese et al (2006). In (3), X and Y are two independent random variables, A!(X, Y) is the joint Arimoto entropy between X and Y, for !!(0, 1) " (1, +#). Unlike the pseudo-additivity of the Arimoto entropy, the joint Shannon entropy has the property of additivity when two random variables are independent, i.e., H(X, Y) = H(X) + H(Y). Pseudo-addivity means that the Arimoto entropy is a nonextensive entropy and the nonextensive property signifies that the correlations between X and Y are considered. To demonstrate the utility of this property in image registration problem, an illustration will be given in section 4.2. As mentioned in section 1, the parameter ! in (3) weights the degree of nonextensivity. The nonextensivity gradually becomes weak as the parameter is tending to 1, and the Arimoto 4 entropy also approximates to Shannon entropy that accounts for the complete extensive entropy. Figure 1 illustrates the Arimoto entropy of a Bernoulli distribution, P = (p, 1–p), for various ! values. It was shown that Arimoto entropy is concave if !!(0, 1) or !!(1, +#). Moreover, the limit of the Arimoto entropy is equal to the Shannon entropy when !$1. The curve for ! = 1 in figure 1 represents the Shannon entropy, computed using the natural logarithm. As shown in figure 1, the measure of uncertainty by Arimoto entropy increases as ! decreases and is not less than Shannon entropy for 0 < ! < 1, and decreases as ! increases and is equal or less than Shannon entropy for ! > 1. Figure 1. Arimoto entropy A!(p) of a Bernoulli distribution p=(p, 1–p) for various ! values. 2.2. The Jensen-Arimoto Divergence Following the derivation reported in Lin et al (1991), we define a new divergence measure based on the Arimoto entropy called the Jensen-Arimoto (JA) divergence and study its properties in the sequel. The Jensen-Shannon (JS) divergence was used to measure the distance of two random variables and, as we will see, the JA divergence has the same nature as JS divergence. Definition 1: Let X(x1, x2, …, xM) be a random variable, and P( p1 , p2 , !!!, pM ) be M probability distributions on X. The Jensen-Arimoto divergence is defined as #M $ M (5) JA! ( p1 , p2 , %%%, pM ) = A! ' +"i pi ( & +"i A! ( pi ) ) i =1 * i =1 where A! (") is the Arimoto entropy for ! > 0, ! % 1 and "i are weight factors such that " M i =1 !i = 1 and "i & 0. When the nonextensive parameter ! tends to 1, Arimoto entropy converges to the standard Shannon entropy and the limit of JA divergence in (5) is identical to the traditional mutual information using L’Hopital’s rule. In the following, we study the properties of the JA divergence. Proposition 1: The JA divergence is nonnegative, symmetric and equal to zero if and only if all probability distributions are identical to each other for ! !(0,1) " (1,+#). Proof: For a real concave function ! , with numbers x1, x2, …, xM in its domain, and positive weights !i, the Jensen inequality can be stated as ! ( $"i xi ) # $"i! ( xi ) . 5 (6) Due to the concavity of Arimoto entropy, we replace the function " in (6) by the Arimoto entropy function and we can easily verify the non-negativity of JA divergence over the interval !!(0, 1) " (1, +#). In addition, the JA value will not be changed when we exchange any two probabilities of equation (5), which leads to the symmetry of JA divergence. And it vanishes (with the identity axiom as a distance) if and only if all probability distributions are equal to each other. Here, we can set every pi equal to 1/M, and substitute them into (5), we obtain JA! ( p1 , p2 , """, pM ) = 0. Four requirements must be fulfilled for a distance (Li et al 2004): non-negativity, identity axiom, symmetry axiom and triangle inequality. The JA divergence is not a true distance since it does not satisfy the triangle inequality. Nevertheless, this does not degrade the performance of JA as a measure of the disparity among probability distributions. JA has similar properties as the mutual information, which measures the amount of information that one random variable includes in another random variable. Thus, we can use the JA divergence to evaluate the similarity between the two images to be registered. Proposition 2: For ! ! (1, 2], the JA divergence is a convex function of p1, p2,…, pM. A detailed proof can be found in Li et al (2014). In the optimization scheme of JA divergence, its derivative can be analytically calculated. To ensure the correct image alignment during the optimization and to guarantee the feasibility of the optimization method, ! is restricted to the interval (1, 2] in the rest of this paper. Proposition 3: The maximum of JA divergence is obtained when p1, p2, …, pM are degenerated distributions, that is, pi = #ij, #ij denoting the Kronecker symbol, and this maximum equals to A$(!). Since JA divergence is a convex function of p1, p2, …, pM, The proof can be easily verified (He et al 2003). Combining propositions 1 and 3, we can see that the JA divergence is bounded, 0 # JA! ( p1 , p2 , $$$, pM ) # A! (" ) . 3. Description of the proposed method We detail here our nonrigid registration method. Firstly, we need to choose an appropriate transformation model. Then, in this transformation space, the JA divergence measure used as the registration criterion is derived together with the L-BFGS optimization method. (a) (b) (c) (d) Figure 2. Two corresponding slices in MR T1 and MR T2 volumes and the deformation between them. (a) The reference image; (b) the deformed float image; (c) a known smooth deformation field to generate (b); (d) the corresponding deformation vector. 3.1. Registration Framework Given two misaligned images to be registered, one is selected as the float image F, and the other considered as the reference image R. As an example, figure 2 depicts the corresponding slices in two volumes to be registered: there are, from left to right, the T1-weighted MR and 6 T2-weighted MR images, the true deformation field and the vector field between the two images. The image registration process consists to search the optimal transformation between F and R. Let x and x' be an arbitrary point in R and its corresponding point in F, respectively. The spatial mapping function between the corresponding points in R and F can be formulated by x! = Tµ (x) , (7) where Tµ is the transformation function (the global function for rigid or affine registration and the local function in case of deformation) with parameters µ. For the 3D case, x = (x, y, z)T and x' = (x', y', z')T, the registration of F to R can be simply described as an optimization problem T * = arg min D( FoTµ , R ) µ (8) = arg min D( F (Tµ (x)), R(x)), µ where D denotes a suitable dissimilarity measure that achieves its minimum when R(x) and F(Tµ(x)) are completely aligned. The symbol “ o ” stands for an operation such that the float image is transformed by the space mapping Tµ. However, the nonrigid deformation should be generally characterized by a smooth transformation. For that, a penalty term is introduced for regularizing the local deformation. Taking the regularization term into account, the objective function E is written as (9) E = D( F (Tµ (x)), R(x))+! S Tµ (x) , ( ) where # is the weight parameter balancing the tradeoff between D and the smoothness term S. To align the deformed float image F(Tµ(x)) with the reference image R(x), the optimal parameter set µ minimizing the objective function E must be searched. So, in our registration framework, three crucial issues are concerned: the transformation model, the registration metrics (i.e. objective function) and the optimization scheme. In the next sections, these factors are examined in detail. 3.2. Transformation Model Several popular transformation models including rigid, affine, perspective and curve (elastic) mapping models have been reported (Pluim et al 2003). In this paper, we address a nonrigid transformation dealing with organ or tissue motion. The local deformations though parameterized transformations can be preferably described by the free form deformation (FFD) model. In addition, a number of desirable properties of B-splines for modeling the deformation, including the inherent control of smoothness and separability in the multidimensional case, have been introduced (Wang et al 2007). Therefore, to simulate this elastic deformation, the FFD (Mattes et al 2003) based on cubic B-splines as the transformation model, is employed. For 3D images, let ( be a nx)ny)nz mesh along the x, y and z directions of control points ["i, "j, "k]T with a uniform spacing $. Note that to improve the registration efficiency and save calculation time, a rigid or affine registration is often applied as a rough initial step before implementing nonrigid registration (Klein et al 2010), especially in the presence of a large global displacement (translation or rotation) between two images to be registered. In our work, we have chosen two phases of one respiration cycle as two registered images, and the large global displacement between them does not exist. So a rigid registration is not necessary prior to the nonrigid registration. The mesh points serve as parameters of the free form transformation based on B-splines and the choice of mesh points determine the amount of nonrigid deformation. Usually, a large 7 spacing of mesh points allows for global nonrigid deformation, while a small spacing accounts for local nonrigid deformation. Therefore, we need to choose the mesh point set according to the degree of nonrigid deformation between the two test images. Then, the 3D transformation at point x = [x, y, z]T can be stated using a linear combination of cubic B-spline kernels as % x $ !i & (3) % y $ ! j & (3) % z $ !k & (10) Tµ (x) = + µijk " (3) ' ( " ' # ( " ' # (, ) # * ) * ijk ) * where µijk denotes the vector of deformation coefficients related to the mesh and the control points (for its computation, the reader can refer to (Mattes et al 2003)). In the following, we adopt the transformation model defined by (10). 3.3. Registration Criteria In the image registration process, the value of the similarity measure for two images is expected to be maximum when they are perfectly aligned. However, as stated in section 3.1, our purpose is searching for the minimum of a dissimilarity measure. Therefore, in accordance with section 3.1, a negative sign is assigned to the JA divergence measure in (5) and a new dissimilarity measure D is produced. (11) D( F (Tµ (x)), R(x)) = " JA! ( p1 , p2 , ###, pM ) Combining (11) and (9), the optimal spatial mapping between F(Tµ(x)) and R(x) can be formulated as (12) µ * = arg min E F (Tµ (x)), R(x) = arg min # JA! ( p1 , p2 , $$$, pM ) + " S (Tµ ( x)) ( ) { µ } where µ* is the estimated parameters set, pi = pi(F(Tµ(x))|R(x)), 1 * i * M, represents the conditional probability distribution of the transformed float image given the reference image. Let f = (f1, f2, …, fM) and r = (r1, r2, …, rM) be two sets of the intensity values in F(Tµ(x)) and R(x) respectively and M the number of the chosen intensity bins to estimate the histogram. Then, the conditional probability can be rewritten as pi = p(F=fj|R=ri) = p(fj|ri). The dissimilarity measure D shown in (11) represents the opposite number of the JA divergence with different weights and ! values. In (5), we use the marginal probability distribution of the reference image as the weight (a prior), that is, "i = p(R(x)) = p(R=ri) = p(ri). Substituting "i and pi into (11), the new dissimilarity measure D used in our method is given by D( p1 , p2 , """, pM ) = # JA! ( p1 , p2 , """, pM ) 1 1 $ % ! ! M M M ! & ' &M ' ! ( & ' ( ! = + ) 1 1 p(ri ) p( f j | ri ) * * #1 p(ri ) )1 p( f j | ri ) * , ! # 1 ( -) j =1 )- i =1 . .* i =1 - j =1 . ( / 0 $ 1 1 (13) % M & M ! ! '! ! (& M !' ( = + ) 1 p ( f j ) * # 1 ) 1 p(ri , f j ) * , ! # 1 ( - j =1 i =1 - j =1 . . ( 0 / S(Tµ(x)) in (12) is the penalty term aimed at smoothing the transformation and its expression is 8 2 2 2 1 X Y Z !" $ 2T # " $ 2T # " $ 2T # S (Tµ ( x)) = . . . %& 2 ' + & 2 ' + & 2 ' V 0 0 0 %( $x ) ( $y ) ( $z ) * (14) 2 2 2 " $ 2T # " $ 2T # " $ 2T # + +2 & ' + 2& ' + 2& ' , dxdydz, ( $x$y ) ( $x$z ) ( $y$z ) ,where V denotes the volume of image domain in which the deformation T defined in (10) is estimated. X, Y and Z denote the dimensions of the image domain along the three directions, respectively. Generally, the histogram method is used to calculate the marginal probability p(ri) and the conditional probability p(fj|ri) involved in (13). Then, the new registration criterion defined in (12) is minimized using an optimization method. 3.4. Optimization The search algorithm mainly influences the accuracy and calculation time of a registration method. Hence, selecting an appropriate optimization method plays a crucial role in a registration framework. Compared to the gradient descent methods, the use of second-order information in Newton Raphson algorithm can provide better theoretical convergence properties (Klein et al 2007). However, the computation of the Hessian matrix and its inverse is time-consuming. As for the quasi-Newton methods, the inverse of Hessian matrix is approximated and the second order derivatives of the objective function are not calculated. In this paper, we adopt the L-BFGS (Nocedal et al 1980), a popular variant of the quasi-Newton approach, to minimize the objective function shown in (12). A second-order Taylor approximation (Press et al 2007) of the objective function E at µ is derived as follows 1 (15) E ( µ + !µ ) " E ( µ ) + !µ T # $E ( µ )+ !µ T # $ 2 E ( µ ) # !µ , 2 where E(µ) denotes the objective function in (12) for simplicity, #µ represents the increment of the parameter vector µ and superscript T is the transpose operator. The second term on the right hand side of (15) is the directional derivative of E at µ in the direction #µ. For the L-BFGS optimization algorithm, the parameter vector µ is updated at iteration number k+1 using the previous iteration number k as (16) µ ( k +1) = µ ( k ) ! (H ( k ) )!1 "#E(µ ( k ) ), (k) -1 where the superscript k stands for the iteration number, (H ) is the inverse of Hessian matrix for every iteration, and $ denotes the gradient operator. Here, note that the elements of (H(k))-1 are computed using an approximation instead of the actual Hessian matrix at each iteration. In the sequel, we need to compute the derivative of the objective function E with respect to µ. An important prerequisite of this method is the calculation of gradients of the objective function. The derivative of E is given by #E ! #E #E #E " (17) , , $$$, =% &, #µ ' #µ1 #µ2 #µn ( where n denotes the number of parameters (degree of freedom, DoF) in the transformation model. In our scheme, for the non-rigid registration, the amount of DoFs depends on the size of the chosen mesh in FFD model. For example, if the mesh is selected with the size 10 ) 10 ) 10, the number of DoFs can reach about 3000 for 3D images as reported by Rueckert et al (1999). The derivative of the penalty term S in the objective function has been already reported in Rohlfing et al (2003). Therefore, we only provide in this work the derivative calculation of the dissimilarity measure. Note that the L-BFGS optimization scheme is stopped if either the difference of objective 9 function value between two consecutive iterations is less than 0.01 or the number of iterations reaches a prefixed value NMAX. In our work, NMAX was set to 100. 3.4.1. Probability estimation To avoid the discretization errors and to carry out a robust optimization scheme, the analytical gradient of the new dissimilarity measure D needs to be calculated. As the form of the probability distributions in (13) is not continuous and differentiable, we adopted a Parzen-window to estimate the continuous probability distribution function from the image data. Many kernel functions can be used as far as they meet the requirements to be smooth, symmetric, to have zero mean and integrate to one (Khader et al 2011). Some popular kernel functions involve Gaussian (Tustison et al 2011) and B-spline functions (Thévenaz et al 2000). In this study, the B-spline functions are used to estimate the marginal and conditional probabilities. Let %(0) be the zero-order B-Spline function and %(3) the cubic B-Spline function. Due to the independence of the reference image to the transformation parameters µ, a zero-order B-spline was used and the marginal probability distributions of R(x) is expressed by $ 1 R(x) & R0 % (18) p!(ri ) = , ! (0) ' ri & (. )bR V x"# * + For the transformed float image, the cubic B-spline function was used to estimate the probability distribution as previously proposed (Wang et al 2005, Khader et al 2011). In this case, the estimated probability of F(Tµ(x)) is given by $ F (Tµ ( x)) & F 0 % 1 (3) (19) p!( f j ) = , ! ' f j & ((, ' )bF V x"# * + and the joint probability of F(Tµ(x)) and R(x) by F (Tµ (x)) & F 0 % $ 1 R(x) & R 0 % (3) $ (20) ' & p!(ri , f j ; µ ) = - ! (0) ( ri & f , ! ) (( j )) *bR , * V x"# b F + + , where + denotes the image domain in which the marginal and joint probability density functions were estimated. V is the total number of voxels in +. In (18), (19) and (20), intensity values of the two images are normalized by the minimal intensity values, R0, F0 and the range of the chosen bins width ,bR, ,bF of the reference and float image, respectively. 3.4.2. Derivative of the proposed dissimilarity measure The derivatives of the marginal probability p!( f j ) and of the joint one p!(ri , f j ; µ ) with respect to µ are $ &p!( f j ) F (Tµ (x)) ' F 0 % $ &F (t ) 1 % & (Tµ (x)) (3 ) (21) =' ' ' f |t =Tµ (x ) * ( ! ) **( ) . j ) &µ +bF &µ V ( +bF x"# , - , &t and $p!(ri , f j ; µ ) % 1 R (x) ' R 0 & =' ! (0) ( ri ' ) / $µ V * +bF x"# +bR , (22) 0 % F (Tµ (x)) ' F & % $F (t ) $ (Tµ (x)) & (3) |t =Tµ (x ) ) * *!. ( fj ' ))* ( ( + b $ t $µ , F , where %'(3) is the derivative of the cubic B-Spline function, !F (t ) !t denotes the simple gradient of the transformed float image F(Tµ(x)), ! (Tµ (x)) !µ is calculated by the selected 10 3D FFD transformation model. Referring to the expression of dissimilarity measure in (13) and combining it with (21) and (22), we can easily show that the derivative of D is 1 "1 # ! $p!( f j ) & ! '% $D ! ! "1 = () 1 p!( f j ) * 1 p!( f j ) $µ ! " 1 '+ j $µ j , (23) 1 "1 . $p!(ri , f j ) ' % &! "1 ) 1 p!(ri , f j )! * 1 p!(ri , f j )! "1 / $µ ' i + j j , 0 where !p!( f j ) !µ and !p!(ri , f j ) !µ represent respectively the derivatives of the estimated marginal probability of the transformed float image in (21) and the joint probability stated in (22). 4. Experiments and results To evaluate the nonrigid registration method, two groups of experiments were conducted on simulated 3D brain MRI images and real 3D thoracic CT data, respectively. In section 4.1, these datasets are described. The simple registration experiments are carried out on several model images in section 4.2. The performance of our method in terms of registration accuracy and noise immunity is exemplified through simulations in section 4.3 and compared with MI technique. Its interest in solving the elastic registration of real CT data is shown in Section 4.4. Finally, the overall results are analyzed. Our nonrigid registration method based on JA divergence was implemented in the registration package elastix (Klein et al 2010) available at http://elastix.isi.uu.nl. This package is based on the Insight Toolkit (ITK) (Ibáñez et al 2005). In this work, all experiments were performed on a 2.33 GHz and 6 GB memory PC. 11 Figure 3. Three orthogonal central planes in the simulated MR brain volumes. Columns: axial, sagittal and coronal slices of the MR volumes. Rows 1 to 3 denote MR T1, MR T2 and MR PD images, respectively. 4.1. Test data 4.1.1 Simulated brain MRI images The simulated data were obtained from the Brainweb database of the Montreal Neurological Institute (Cocosco et al 1997). They contain 3D brain MRI volumes simulated through three different protocols: T1-weighted, T2-weighted and Proton Density-weighted (PD-weighted) with various slice thicknesses, noise levels and intensity non-uniformities. The size of the volumes (with voxel coding on 8 bits) is 181 ) 217 ) 181 voxels with a voxel size 1mm ) 1mm ) 1mm. All corresponding slices in these volumes were aligned with each other. In the simulated experiments, the corresponding volumes from MR T1 & MR PD pairs were chosen for evaluation. MR T1 volume served as the reference image and MR PD volume as the float image to be aligned using a known 3D nonrigid transformation (with the true value T0). Figure 3 depicts three orthogonal central slices (i.e. axial, sagittal and coronal) used in our protocol. 4.1.2 Real 3D thoracic CT data Experiments were also conducted on forty 3D thoracic CT images acquired from four 4D CT sequences of four patients. These data were obtained on a Brilliance Big Bore 16-slice CT scanner (Philips Medical Systems, Cleveland, OH) and were part of a radiotherapy planning process at the Léon Bérard Cancer Center in Lyon, France. The 4D datasets were reconstructed to yield forty phases of 3D CT images (Vandemeulebroucke et al 2011, Vandemeulebroucke et al 2012). The first three datasets have 512 ) 512 pixels in plane. The number of slices varies from 141 to 170, with an in-plane resolution ranging from 0.8789 ) 0.8789 mm2 to 0.9766 ) 0.9766 mm2 and 2mm interslice. The fourth dataset has the size of 482 ) 360 ) 141 with the voxel dimension of 0.9766 ) 0.9766 ) 2mm3. For each phase image of the data, a number of landmarks features were labeled manually by an expert. The image information and landmarks for each case are shown in Table 1. More detailed descriptions of the data can be found in (Vandemeulebroucke et al 2011, Vandemeulebroucke et al 2012). Figure 4 displays three orthogonal planes of the maximal exhale and inhale phases of 4D CT image of the forth dataset. 12 Figure 4. Two frames of 3D thoracic CT images extracted from a 4D CT scan. The top row represents the maximum exhalation frame and the bottom row, the maximum inhalation frame. Columns 1 to 3 in each row display the axis, coronal and sagittal slices of two phases of the CT volumes, respectively. 4.2. Illustration To illustrate the effect of the pseudo-addivity property in image registration, we applied our method as well as the mutual information method to register the MR T1 images shown in figure 5, with different degrees of deformations (various levels of correlation) between two images to be registered. Here, we chose figure 5 (a) as the reference image and the others as float images. Note that when warping indexes are increased, the correlation coefficients between them are gradually decreased. Figure 6 exhibits the registration results using mutual information (MI) based on the standard Shannon entropy and our similarity measure based on the Arimoto nonextensive entropy. Only slight differences between our measure and MI are observed when registering figure 5 (a) and (b), (c), (d) with relative small deformations. However, due to the non-extensibility of the Arimoto entropy, our measure (figure 6 (d), (e) and (f)) gives a much more accurate alignment in the presence of large deformations than MI (figure 6 (g), (k) and (l)). This is particularly visible at the level of the cerebral ventricles, a major structure of the brain (see red arrows). (a) (d) (b) (e) (c) (f) (g) Figure 5. The MR T1 images. (a) Reference image, (b)-(g) float images, with different deformations with respect to (a), along with the correlation coefficients between them being 0.9055, 0.8492, 0.8059, 0.7745, 0.7543 and 0.7399. 13 (a) (b) (c) (d) (e) (f) (g) (h) (i) (j) (k) (l) Figure 6. Registration results of the MR T1 images displayed in figure 5. (a)-(f) The transformed float images superimposed with the edges (Canny’s operator) of the reference image (a) when registering figure 5 (a) and (b)-(g) using JA measure, (g)-(l) the corresponding results when employing MI method. 4.3. Simulated Data Experiments The performance of our JA registration method is first assessed using the simulated 3D brain data through a three-level hierarchical multiresolution scheme. The role of the several parameters involved in our method is examined and a comparison with mutual information (MI) is also reported. Table 1. CT images information and landmarks of four datasets. CT properties Dataset 1 Dataset 2 Dataset 3 Dataset 4 Volume sizes Voxel dimensions(mm) Landmarks 512 ) 512 ) 141 482 ) 360 ) 141 512 ) 512 ) 169 512 ) 512 ) 170 0.9766 ) 0.9766 ) 2 0.9766 ) 0.9766 ) 2 0.9766 ) 0.9766 ) 2 0.8789 ) 0.8789 ) 2 100 41 100 100 4.3.1. Parameters setting The registration parameters were chosen by means of trial-and-error on the simulated data. To demonstrate the effect of these parameters, the experiment process was performed as follows. Using the pair of MR T1 and MR PD volumes mentioned in Section 4.1.1, the former volume was chosen as reference image while the latter volume was elastically deformed and considered as the float image. This deformation is based on the following warping function being analogous as that of (Lu et al 2008) " "!x# "! y # " ! z ## (24) x$ = x + m % sin % & + sin % & +sin % && ' 48 ( ' 48 ( ( ' ' 48 ( " "!x# "! y # " ! z ## y$ = y + m % sin % & + sin % & + sin % & & ' 48 ( ' 48 ( ( ' ' 48 ( 14 (25) " "!x# "! y # " ! z ## z $ = z + m % sin % & + sin % & + sin % && ' 48 ( ' 48 ( ( ' ' 48 ( (26) where (x, y, z) and ( x! , y ! , z ! ) denote respectively the original coordinates before and after deformation. The parameter m accounts for the warping index, and also determines the magnitude of the deformation field. To evaluate quantitatively the registration results, the difference between the true values and the estimated values (magnitude of the displacement vector) was calculated as the registration error. In all experiments of this section, sixty warping indexes were selected randomly from a wide-range interval [1, 7], along with sixty pairs of test images being produced consequently. We set the weighting parameter # = 0.005 that provides a good tradeoff between the dissimilarity measure D and the regularization term S in the objective function E. These studied parameters include the nonextensive parameter !, the number of bins M, the spacing of mesh points $ and the amount of random samples N. Figure 7 displays the mean and standard deviation of the registration errors when varying these four parameters. As illustrated in figure 7 (a), the lowest mean of registration error is obtained when ! = 1.50 (here, M = 32, $ = 16)16)16, N = 2000). The difference in mean running time (about 4min) is not so large when varying ! values, hence, we set ! = 1.50 for the remaining experiments. The average calculation times when varying the other three parameters are tabulated in Table 2. It can be observed from figure 7 (b) and Table 2 that the errors and the computation times gradually increase with a higher number of bins M. Although a larger value of M leads to a better estimation of the probability density functions, a compromise has to be set. The value M = 16 has been retained here. The mesh spacing with value of 16)16)16 provides the better registration results in figure 7 (c) and has been selected. Figure 7 (d) shows that the lowest registration error is obtained when N = 5000. However, the computation times increase rapidly with a higher number of random samples (Table 2), thus the value N = 2000 was preferred. (a) (b) 15 (c) (d) Figure 7. The non-rigid registration results of the simulated 3D brain MR T1 and MR PD images when choosing various values of four parameters, !, M, $ and N. The deformations parameters were generated randomly from the range [1, 7], with the unit of deformation and errors being also millimeters. (a) Registration errors for several various ! values, ! ! {1.01, 1.10, 1.25, 1.50, 1.75, 2.00} and M = 32, $ = 16)16)16, N = 2000; (b) The results of different bins size, M ! {8, 16, 32, 64, 128} with ! = 1.50, $ = 16)16)16, N = 2000; (c) The effect of mesh points with $ ! {5)5)5, 8)8)8, 12)12)12, 16)16)16, 20)20)20},! = 1.50, M = 32, N = 2000; (d) Statistics of registration errors for the number of random samples, N ! {2000, 3000, 5000, 8000, 10000} and ! = 1.50, M = 32, $ = 16)16)16. Table 2. Average calculation times of the registration shown in Figure.5. Here, cases 1 to 5 denote {8, 16, 32, 64, 128} for the number of bins M, {5)5)5, 8)8)8, 12)12)12, 16)16)16, 20)20)20} for the spacing of mesh points $ and {2000, 3000, 5000, 8000, 10000} of random samples N. Parameters Case 1 Case 2 Case 3 Case 4 Case 5 Bins M Spacing $ Samples N 3min20s 13min 4min 3min30s 6min 6min 4min 4min30s 8min 5min 4min 14min 9min 3min40s 17min 4.3.2. Accuracy To illustrate the registration accuracy of our method, MI is compared to the JA divergence on three groups of 3D brain data sets, MR T1 & MR T2, MR T1 & MR PD and MR T2 & MR PD. In each group, the former volume is used as reference image. We selected sixty warping indexes generated randomly from the same range [1, 7] as in section 4.3.1, resulting in sixty pairs of test images produced for each group of data and 360 nonrigid registrations of these three groups of data sets for the two methods in total. 16 Figure 8. The registration results of the simulated 3D brain MR T1 & MR T2, MR T1 & MR PD and MR T2 & MR PD volumes using two algorithms. Experimental parameters are set by ! = 1.50, M = 16, $ = 16)16)16, N = 2000. (a) (b) (c) (d) (e) (f) (g) (h) Figure 9. The checkboard of non-rigid registration results using JA method (! =1.50). (a) central slice of the reference MR T1 volume; (b) central slice of the deformed float MR PD volume; (c) checkboard of middle axial slice in reference and float volumes before registration; (d) checkboard of middle axial slice in reference and float volumes after registration; (e)-(f) and (g)-(h) shows the checkboard of the sagittal and coronal planes, respectively. Figure 8 displays the resulting box-and-whisker plots of the registration errors for all 360 trials. For each box, the central red mark is the median of the registration errors of the ten tests, with two edges of each box denoting the 25% and 75%, the upper and below whisker being the maximum and minimum without considering outliers. In addition, these outliers are represented by the red color crosses for each box. As it can be observed, the registration errors for MR T1 & MR PD are relative high when compared to those of other two groups while the lowest errors are obtained for MR PD to MR T2. The JA method leads to a slight improvement in registration accuracy. Figure 9 depicts the checkboard of one non-rigid registration example resulting from our method for a MR T1 & MR PD registration case when the warping index m =5 (! = 1.50). The shape of the registered float image appears visually close to the reference image. 17 4.3.3. Noise immunity In this section, we assess the performance of our method and MI-based algorithms in presence of different levels of noise. The three groups of data sets are those already exploited in section 4.3.2. The warping index m has been set to 5. Sixty levels of Gaussian noise with zero-mean were randomly generated along with their standard deviations chosen from the range [0, 60]. These different levels of Gaussian noise were added to the deformed float images, and sixty pairs of test images were obtained for each group of data (MR T1 & MR T2, MR T1 & MR PD and MR T2 & MR PD). Altogether, 360 registrations were performed. In order to accelerate the optimization process, we adopted a three-level multi-resolution strategy. For each resolution level, the L-BFGS optimization method was used to minimize the objective function with the registration parameters specified in section 4.3.1. The calculation times for registration are almost the same: 3min 20s for our method and 3min 30s for the MI method. Figure 10 shows the box-and-whisker plots of the resulting registration errors. As it can be seen, the average registration errors obtained using the two methods are less than one millimeter, i.e. a subvoxel accuracy. The difference between MI and our method is higher than the one reported figure 10 which means that the JA approach is more robust to noise effect. Figure 10. The registration errors of the simulated 3D brain MR T1 & MR T2, MR T1 & MR PD and MR T2 & MR PD volumes using MI and our approach in the presence of different levels of noise. The parameters have the same values than before (! = 1.50, M = 16, $ = 16)16)16, N = 2000). 4.4. Real Data Experiments In this section, a series of experiments on real 3D thoracic CT data were carried out. Forty phases of four 4D CT from four patients were selected as test images. For each phase image, a certain number of landmark features (Table 1) were labeled manually by an expert, and these landmarks correspond to each other during the forty phase images. A three-level hierarchical method was used to achieve the best compromise of registration accuracy and calculation time. The ten frames for each 4D CT, designated as T00-T90, cover the time sequence between the maximum inhalation (marked as T10) and the maximal exhalation (indicated by T60). To evaluate the performance of our method, the experiments were designed as follows: for each 4D CT set, the T60 phase served as the reference image and the float image was selected from the residual nine phases. Thus, 36 pairs of nonrigid registrations were performed for the 18 total four data sets, resulting in 108 registrations using our approach and the normalized cross correlation (NCC) and MI methods. The deformation registration accuracy was evaluated using the target registration error (TRE) (Vandemeulebroucke et al 2011, Vandemeulebroucke et al 2012) calculated as the 3D Euclidean distance between the manually marked landmarks in the maximum exhalation image and the ones estimated by exploiting the registration approach to the corresponding location in the inhale image (Castillo et al 2010). Figure 11 illustrates the box-and-whisker plots of TREs for 108 registrations. For visual inspection, figure 12 displays an example of registration using our method for T10 and T60 images of the dataset 1. The mean and standard deviation of 3D TREs before and after registration are exhibited in figure 13. To evaluate the significance of differences in registration errors of JA, MI and NCC, a statistical test based on the Wilcoxon signed-ranks was conducted on the TREs of four real CT datasets. At the significance level of 0.05, a p value less than 0.05 indicates a rejection of null hypothesis, and a p value larger than 0.05 means an acceptation of null hypothesis. The results of Wilcoxon signed-ranks tests between JA and the other two methods on four datasets are reported in Table 3. They show that the registration errors of four groups of data when adopting JA are significantly lower than NCC (p < 0.05). From Table 3, a significant difference between JA and MI on TREs of datasets 1 and 3 is found (p < 0.05); no significance for datasets 2 and 4 (p > 0.1). However, for datasets 2 and 4, if we only consider those cases in which the deformations between the two 3D CT images to be registered are relatively large, a significant difference of JA with MI can be observed (p < 0.05). Table 3. p-values of Wilcoxon signed-ranks tests of JA with MI and NCC on four sets of CT data. The significance level is 0.05. A p-value less than 0.05 means that the difference between two methods is significant, otherwise no significance is found. Methods Dataset 1 Dataset 2 Dataset 3 Dataset 4 JA and MI JA and NCC 0.0195 0.0477 0.1641 0.0273 0.0442 0.0391 0.2031 0.0441 Figure 11. The target registration errors (TRE) obtained when employing JA algorithm, MI and NCC method for 3D thoracic CT images from four 4D CT data sets. The limited memory BFGS scheme was applied as in section 4.3 with a three-level multiresolution strategy, the maximum number of iterations being 100 per level. Besides, 19 3000 samples were selected randomly by the image sampler at each iteration. The runtimes of our method, MI and NCC were approximately 6, 5.5 and 6.5 minutes, respectively. Figure 12. Difference images in orthogonal central planes before registration (row 1) and after registration (row 2) between T10 and T60 phase images of data set 1 using the JA algorithm (! = 1.50). From left to right in each row, the transverse, coronal and sagittal slices. Figure 13. The mean and standard deviation of target registration errors before registration and after registration exploiting the JA, MI and NCC methods for the four groups of 3D thoracic CT images. 4.5. Analysis of Results The objective of the nonrigid registrations performed on simulated data was to evaluate the effect of the parameters on the performance of the JA method. These parameters include ! values, the number of bins, spacing of mesh points and random samples. Several experiments on three pairs of simulated MR volumes, MR T1 & MR T2, MR T1 & MR PD and MR T2 & MR PD were conducted and the results were compared with the classic information based measure (i.e. MI method). The test datasets were initially aligned with each other, that is to say, the true transformations (the “gold standard”) were known. In our case, we utilized in each image pair, the former MR T1 or MR T2 volume and the latter MR T2 or MR PD volume as the reference and float images, respectively. To compare the registration accuracy of our method, sixty trials were carried out based on deformations using distortion parameters randomly generated within a large range. Smaller errors and a better behavior with respect to convergence were observed for the JA method. To assess the immunity of noise, sixty levels 20 of noise were added to the float images and 60 trials were performed. For these sixty tests, in which we fixed the magnitude of elastic deformations, a better robustness to noise was observed, showing thus the effectiveness of our approach. The goal of the second set of experiments was to assess its performance on real data. 3D thoracic CT data sets corresponding to an entire breath cycle were used. We considered the geometric transformation between two phases of 3D CT images. In such case, the so-called “ground truth” is unknown. To estimate the registration accuracy of our method, a number of landmarks were first defined in four groups of 4D CT data. This allowed exploiting the 3D target registration errors (TRE) between the corresponding landmark locations as a measure of performance. In all ten frames of the breath cycle of one 4D data, T60 was chosen as the reference image, the remainder frames being the float images. Nine groups of experiments were generated for each 4D CT data and 36 pairs of test images in total were considered. The experimental results show a better behavior of our JA algorithm with low TREs when compared to MI and NCC methods, achieving a sub-voxel registration accuracy in almost all cases. 5. Conclusion In this work, a novel similarity measure based on the Arimoto entropy, called the Jensen-Arimoto divergence (JA) has been proposed for nonrigid registration of medical images. This new similarity measure is a generalization of the well-known Jensen-Shannon divergence measure. Based on the properties introduced in section 2.2 of the JA divergence, a new non-rigid registration technique has been presented. We employed free-form deformations as the parameter space along with the negative JA divergence as the dissimilarity measure and a penalty term to smooth the deformation constituting the entire registration criteria (objective function). To search the minima of this objective function, the L-BFGS optimization approach was used. We applied the Parzen window estimation algorithm with zero order and cubic order B-spline function as the kernel function to estimate the marginal probability distribution of the two volumes to be registered. The joint probability distribution between them was obtained in the same way. Experiments were carried out on two groups of data sets, the former using simulated brain data (MR T1 & MR T2, MR T1 and MR PD and MR T2 & MR PD volumes), the latter including four time sequences of real 3D thoracic CT data highlighting an entire respiratory cycle. The results obtained on simulated sets have shown that our alignment technique leads to lower registration errors in the presence of various magnitudes of deformation and levels of noise when compared to the standard MI method. It has been assessed through the tests made on real 3D data and using the 3D target registration error as quality criterion that the JA method provides sub-voxel registration accuracy. As mentioned above, the parameter ! in Arimoto entropy weights the nonextensive degree. The nonextensivity gradually becomes weak as ! tending to 1, and the JA divergence also approximates to mutual information that accounts for the complete extensivity property. While in this paper ! value is a self-defined constant by real experiments, its choice determines in some extent the quality of the registrations. So ! value is a sensitive factor and how to select a suitable ! value is an existing difficulty when choosing JA as registration criteria. In the next future, we will extend our applications to other multimodal medical images such as ultrasound and MR images and also to other organs. 21 Acknowledgments This work was supported by the National Basic Research Program of China under Grant 2011CB707904, by the National Natural Science Foundation of China under Grants 61201344, 61271312, 61401085, 81101104 and 61073138, by the Ministry of Education of China under Grants 20110092110023 and 20120092120036, the Project-sponsored by SRF for ROCS, SEM, and by Natural Science Foundation of Jiangsu Province under Grant BK 2012329, BK2012743, DZXX-031, BY2014127-11, by the ‘333’ project under Grant BRA2015288 and by the Qing Lan Project. References Antolín J, López-Rosa S, Angulo J C and Esquivel R O 2010 Jensen–Tsallis divergence and atomic dissimilarity for position and momentum space electron densities J. Chem. Phys. 132 044105. Arimoto S 1971 Information-theoretical considerations on estimation problems Information and Control 19 181-194 Bentoutou Y, Taleb N, El Mezouar M C, Taleb M and Jetto L 2002 An invariant approach for image registration in digital subtraction angiography Pattern Recognit. 35 2853-2865 Boekee D E and Van Der Lubbe J C A 1980 The R-norm information measure Information and Control 45 136–155 Castillo E, Castillo R, Martinez J, Shenoy M and Guerrero T 2010 Four dimensional deformable image registration using trajectory modeling Phys. Med. Biol. 55 305–327 Cocosco C A, Kollokian V, Kwan R K S, Pike G B and Evans A C 1997 Brainweb: Online interface to a 3D MRI simulated brain database NeuroImage 5 Collignon A, Maes F, Delaere D, Vandermeulen D, Suetens P and Marchal G 1995 Automated multi-modality image registration based on information theory 14th Int. Conf. on IPMI, Ile De Berder, France 263–274 Cover T M and Thomas J A 2006 Elements of Information Theory 2nd ed., New York: Wiley Flusser J and Suk T 1998 Degraded image analysis: an invariant approach IEEE Trans. Pattern Anal. Mach. Intell. 20 590–603 He Y, Ben Hamza A and Krim H A 2003 Generalized Divergence Measure for Robust Image Registration IEEE Trans. Signal Process. 51 1211–1220 Hill D L G, Batchelor P G, Holden M and Hawkes D J 2001 Medical image registration Phys. Med. Biol. 46 R1–R45 Hoh C K, Dahlbom M, Harris G, Choi Y, Hawkins R A, Philps M E and Maddahi J 1993 Automated iterative three-dimensional registration of positron emission tomography images J. Nucl. Med. 34 2009–2018 Huber P J 1981 Robust Statistics. New York: Wiley Ibáñez L, Schroeder W, Ng L and Cates J 2005 The ITK Software Guide Kitware ISBN 1–930934-15–7 Khader M and Hamza A B 2011 Nonrigid image registration using an entropic similarity IEEE Trans. Inf. Technol. Biomed. 15 681-690 Kim J and Fessler J 2002 A Image registration using robust correlation in Proc. IEEE Int. Symp. Biomedical Imaging 353–356 Kim J and Fessler J A 2004 Intensity-based image registration using robust correlation coef-cients IEEE Trans. Med. Imaging 23 Klein S, Staring M and Pluim J P W 2007 Evaluation of Optimization Methods for Nonrigid Medical Image Registration Using Mutual Information and B-Splines IEEE Trans. Image Process. 16 2879–2890 Klein S, Staring M, Murphy K, Viergever M A and Pluim J P W 2010 elastix: A Toolbox for Intensity-Based Medical Image Registration IEEE Trans. Med. Imaging 29 196-205 Kong Y Y, Deng Y and Dai Q H 2015 Discriminative Clustering and Feature Selection for Brain MRI Segmentation IEEE Signal Process. Lett. 22 573-577 Lemieux L, Jagoe R, Fish D R, Kitchen N D and Thomas D G T 1994 A patient-to-computed-tomography image registration method based on digitally reconstructed radiographs Med. Phys. 21 1749–1760 Li B C, Yang G Y, Shu H Z and Coatrieux J L 2014 A New Divergence Measure Based on Arimoto Entropy for Medical Image Registration 22nd ICPR Stockholm Sweden Aug. 24-28 3197-3202 Li M, Chen X, Li X, Ma B and Vitányi P M B 2004 The Similarity Metric IEEE Trans. Inf. Theory 50 3250-3264 Liese F and Vajda I 2006 On Divergences and Informations in Statistics and Information Theory IEEE Trans. Inf. Theory 52 4394–4412 Lin J H 1991 Divergence Measures Based on the Shannon Entropy IEEE Trans. Inf. Theory 37 145–151 Loeckx D, Slagmolen P, Maes F, Vandermeulen D and Suetens P 2010 Nonrigid Image Registration Using Conditional Mutual Information IEEE Trans. Med. Imaging 29 19–29 Lou Y F, Irimia A, Vela P A, Chambers M C, Van Horn J D, Vespa P M and Tannenbaum A R 2013 Multimodal Deformable Registration of Traumatic Brain Injury MR Volumes via the Bhattacharyya Distance IEEE Trans. Biomed. Eng. 60 2511-2520 Lowe D 2004 Distinctive Image Features from Scale-Invariant Keypoints Int. J. Comput. Vis. 60 91-110 Lu X S, Zhang S, Su H and Chen Y Z 2008 Mutual information-based multimodal image registration using a novel joint histogram estimation Comput. Med. Imaging Graph. 32 202-209 Maes F, Collignon A, Vandermeulen D, Marchal G and Suetens P 1997 Multimodality image registration by maximization of mutual information IEEE Trans. Med. Imaging 16 187–198 Maintz J B A, van den Elsen P A and Viergever M A 1996 Comparison of edge-based and ridge-based registration of CT and MR brain images Med. Image Anal. 1 Maintz J B A and Viergever M A 1998 A survey of medical image registration Med. Image Anal. 2 1–36 22 Mattes D, Haynor D R, Vesselle H, Lewellen T K and Eubank W 2003 PET-CT image registration in the chest using free-form deformations IEEE Trans. Med. Imaging 22 120–128 Maurer C and Fizpatrick J M 1993 A review of medical image registration Interactive Image-Guided Neurosurg. Park Ridge, IL: American Association of Neurological Surgeons 17–44 Nocedal J 1980 Updating quasi-Newton matrices with limited storage Math. Comput. 35 773–782 Pluim J P W, Maintz J B A and Viergever M A 2000 Image registration by maximization of combined mutual information and gradient information IEEE Trans. Med. Imaging 19 809–814 Pluim J P W, Maintz J B A and Viergever M A 2001 f-Information measures in medical image registration Proc. SPIE Medical Imaging: Image Processing M. Sonka and K. M. Hanson, Eds., San Diego, CA 2 579–587. Pluim J P W, Maintz J B A and Viergever M A 2003 Mutual-Information Based Registration of Medical Images: A Survey IEEE Trans. Med. Imaging 22 986–1004 Pluim J P W, Maintz J B A and Viergever M A 2004 f-Information Measures in medical image registration IEEE Trans. Med. Imaging 23 1508–1516 Press W H, Flannery B P, Teukolsky S A and Vetterling W T 2007 Numerical Recipes in C 3rd ed. Cambridge, U. K.: Cambridge Univ. Press, , ch. 10 521–526 Rao M, Chen Y, Vemuri B C and Wang F 2004 Cumulative residual entropy: A new measure of information IEEE Trans. Inf. Theory 50 1220–1228 Rohlfing T, Maurer C R, Bluemke D A and Jacobs M A 2003 Volume-Preserving Nonrigid Registration of MR Breast Images Using Free-Form Deformation With an Incompressibility Constraint IEEE Trans. Med. Imaging 22 730-741 Rueckert D, Sonoda L I, Hayes C, Hill D L G, Leach M O and Hawkes D J 1999 Nonrigid Registration Using Free-Form Deformations: Application to Breast MR Images IEEE Trans. Med. Imaging 18 712–721 Smith S M and Brady J M 1997 SUSAN - a new approach to low level image processing Int. J. Comput. Vis. 23 45-78 Studholme C, Hill D L G and Hawkes D J 1999 An overlap invariant entropy measure of 3D medical image alignment Pattern Recognit. 32 71–86 Studholme C, Drapaca C, Iordanova B and Cardenas V 2006 Deformation- based mapping of volume change from serial brain MRI in the presence of local tissue contrast change IEEE Trans. Med. Imaging 25 626–639 Szeliski R and Coughlan J 1997 Spline-based image registration Int. J. Comput. Vis. 22 199–218 Thévenaz P and Unser M 2000 Optimization of mutual information for multiresolution image registration IEEE Trans. Image Process. 9 2083–2099 Tustison N J, Awate S P, Song G, Cook T S and Gee J C 2011 Point set registration using Havrda-Charvat-Tsallis entropy measures IEEE Trans. Med. Imaging 30 451-460 Vandemeulebroucke J, Rit S, Kybic J, Clarysse P and Sarrut D 2011 Spatiotemporal motion estimation for respiratory-correlated imaging of the lungs Med. Phys. 38 166-178 Vandemeulebroucke J, Bernard O, Rit S, Kybic J, Clarysse P and Sarrut D 2012 Automated segmentation of a motion mask to preserve sliding motion in deformable registration of thoracic CT Med. Phys. 39 1006-1015 Viola P and Wells III W M 1997 Alignment by maximization of mutual information Int. J. Comput. Vis. 24 137-154 Wang F, Vemuri B C, Rao M and Chen Y 2003 Cumulative residual entropy, a new measure of information & its application to image alignment Proc. of 9th IEEE ICCV, Nice, France, Oct. 13-16, 548–553 Wang F and Vemuri B C 2007 Non-rigid multi-modal image registration using cross-cumulative residual entropy Int. J. Comput. Vis. 74 201–215 Wells W M, Viola P, Atsumi H, Nakajim S and Kiknis R 1996 Multi-modal volume registration by maximization of mutual information Med. Image Anal. 1 35–51 23
© Copyright 2026 Paperzz