Title: Scalable High-Performance Image Registration Framework by Unsupervised Deep Feature Representations Learning [Paper Link]

Authors: Guorong Wu, Minjeong Kim, Qian Wang, Brent C. Munsell, Dinggang Shen

Affiliation: The University of North Carolina at Chapel Hill

Submission: 2016

…………………………………………………………………………………………………………………

>>> Background of Deep Learning in Medical Image Analysis

[NEED A Reading Notes]

>>> A Review: Robot-Assisted Endovascular Catheterization Technologies

[NEED A Reading Notes]

…………………………………………………………………………………………………………………

>>> Image Registration Basic Knowledge

[Basic Knowledge]

>>> Image Registration Literature Review

[Literature Review]

>>> Slice-To-Volume Medical Image Registration Background

[NEED A Reading Notes]

…………………………………………………………………………………………………………………

Contributions

  1. learn hierarchical feature representations (discover compact and highly discriminative features from the observed imaging data) directly from the observed medical images using an unsupervised deep learning paradigm.

  2. introduce a stacked autoencoder (SAE) with a convolutional network architecture to identify intrinsic deep feature representations in image patches (see the sketch after this list).

  3. in order to accurately recognize complex morphological patterns in 3-D medical image patches, a deep learning feature selection method is proposed.

1) it does not require manually labeled ground-truth data (labeling is typically a laborious, subjective, and error-prone process), so it does not suffer from the same limitations as the supervised methods, and

2) it offers a hierarchical learning paradigm that learns not only low-level but also high-level features, which are more flexible than conventional handcrafted features, or even the best features found by existing supervised learning-based feature selection methods.

  1. Feature representations can be directly learned from the observed imaging data in a very short amount of time. (The proposed image registration framework can therefore be quickly deployed to perform deformable image registration on new image modalities or new imaging applications with little to no human intervention.)

  2. difficulty in finding the optimal alignment parameters

  3. lack of global regularization and complicated parameter tuning
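
The paper's SAE configuration (patch size, number of layers, hidden widths, layer-wise pre-training) is not reproduced in these notes; the following is a minimal PyTorch sketch of unsupervised patch feature learning with a two-layer stacked autoencoder, where the patch size and hidden dimensions are placeholder assumptions.

```python
import torch
import torch.nn as nn

class StackedAutoencoder(nn.Module):
    """Two-layer stacked autoencoder for flattened 3-D image patches.
    Patch size and hidden widths are illustrative, not from the paper."""
    def __init__(self, patch_dim=11 * 11 * 11, hidden1=512, hidden2=128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(patch_dim, hidden1), nn.ReLU(),
            nn.Linear(hidden1, hidden2), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.Linear(hidden2, hidden1), nn.ReLU(),
            nn.Linear(hidden1, patch_dim),
        )

    def forward(self, x):
        code = self.encoder(x)      # compact feature representation of the patch
        recon = self.decoder(code)  # reconstruction used for unsupervised training
        return code, recon

# Unsupervised training: minimize the reconstruction error on patches sampled
# from the observed images (no manually labeled ground truth required).
model = StackedAutoencoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
patches = torch.rand(256, 11 * 11 * 11)  # dummy batch of flattened patches
for _ in range(10):
    _, recon = model(patches)
    loss = nn.functional.mse_loss(recon, patches)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```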

Motivation

Recent advances in deep neural networks:

  1. Convolutional neural networks and their variants have shown potential by largely outperforming conventional computer vision algorithms

  2. Spatial Transformer Networks (Jaderberg et al.) use a differentiable network module inside a CNN to overcome the drawbacks of CNNs (i.e., the lack of scale and rotation invariance).

  3. a novel deep network model specifically designed for ssEM image registration.

  4. a novel combination of an STN and a convolutional autoencoder that generates a deformation map (i.e., a vector map) for the alignment of the entire image via backpropagation through the network.

  5. propose a feature-based image similarity measure

  6. can be easily extended to various applications by employing different feature encoding networks

Method

Method 1. Feature Generation Using a Convolutional Autoencoder

[GOAL] compute similarities between adjacent EM sections

[Architecture] a convolutional encoder comprised of convolutional layers with ReLU activations and a deconvolutional decoder comprised of deconvolutional layers with ReLU activations

[Output] similarities between adjacent EM sections

[Results] Figure 1 shows that the autoencoder feature-based registration generates more accurate results than the conventional pixel intensity-based registration, i.e., (d) shows a smaller normalized cross-correlation (NCC) error between the aligned images than (c).
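
A minimal sketch of the encoder/decoder described above, plus a feature-based similarity between two adjacent sections (assuming PyTorch); the layer counts, channel widths, kernel sizes, and the L2 feature distance are assumptions, not the paper's exact settings.

```python
import torch
import torch.nn as nn

class ConvAutoencoder(nn.Module):
    """Convolutional encoder (conv + ReLU) and deconvolutional decoder
    (transposed conv + ReLU); depth and channel widths are illustrative."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, kernel_size=4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, kernel_size=4, stride=2, padding=1),
        )

    def forward(self, x):
        features = self.encoder(x)      # feature maps used for the similarity measure
        recon = self.decoder(features)  # reconstruction for unsupervised training
        return features, recon

def feature_similarity(model, section_a, section_b):
    """Registration error between adjacent EM sections measured in the
    autoencoder feature space (plain L2 distance here)."""
    with torch.no_grad():
        fa, _ = model(section_a)
        fb, _ = model(section_b)
    return torch.mean((fa - fb) ** 2)
```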

Method 2. Deformable Image Registration Using an STN

[GOAL] find the proper deformation of the input image via a spatial transformer by minimizing the registration error measured with the pre-trained autoencoder.

[Architecture]

However, the resolution of the vector map v is usually coarser than that of the input image.

A smooth interpolation of the coarse vector map is needed to obtain a per-pixel moving vector for the actual deformation of the moving image (a minimal sketch follows the list below).

smooth deformable transform:

  1. thin plate spline (TPS)
  2. bilinear (with better results compared to the TPS) [Reading Note]
  3. bicubic
  4. B-spline
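
A minimal sketch (assuming PyTorch) of the coarse-to-fine deformation step: the coarse vector map is bilinearly upsampled to per-pixel resolution and the moving image is resampled with `grid_sample`; the coarse grid size and normalized-displacement convention are assumptions.

```python
import torch
import torch.nn.functional as F

def warp_with_coarse_field(moving, v_coarse):
    """moving: (B, 1, H, W) image; v_coarse: (B, 2, h, w) coarse vector map with
    displacements in normalized [-1, 1] coordinates. Bilinearly interpolate the
    coarse map to one vector per pixel, then resample the moving image; the whole
    step is differentiable, so the coarse vectors can be optimized by backprop."""
    B, _, H, W = moving.shape
    # Smooth interpolation of the coarse vector map to per-pixel resolution.
    v_full = F.interpolate(v_coarse, size=(H, W), mode='bilinear', align_corners=True)
    # Identity sampling grid in normalized [-1, 1] coordinates (x, y order).
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, H), torch.linspace(-1, 1, W), indexing='ij')
    identity = torch.stack((xs, ys), dim=-1).unsqueeze(0).expand(B, H, W, 2)
    # Add the displacement field and resample the moving image.
    grid = identity + v_full.permute(0, 2, 3, 1)
    return F.grid_sample(moving, grid, mode='bilinear', align_corners=True)

# Example: an 8x8 coarse vector map deforming a 64x64 moving image.
moving = torch.rand(1, 1, 64, 64)
v = torch.zeros(1, 2, 8, 8, requires_grad=True)  # optimized during registration
warped = warp_with_coarse_field(moving, v)
```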

To increase the robustness of the alignment, the objective function (Eq. 4) is extended to leverage multiple neighboring sections. Let the moving image be I0, its n neighboring reference images be I1, ..., In, and their corresponding weights be w1, ..., wn.

(Eq. 5) combines the registration errors across the neighboring images, which lessens the impact of strong registration errors from images with artifacts and avoids large deformations. Here M(T(I0)) denotes the autoencoder feature map of the deformed moving image.
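
Eq. 4 and Eq. 5 themselves are not reproduced in these notes; a plausible form of the extended multi-neighbor objective, assuming an L2 distance between autoencoder feature maps M(·) and a deformation T_v parameterized by the vector map v, is:

```latex
E(v) \;=\; \sum_{i=1}^{n} w_i \,\bigl\| M\bigl(T_v(I_0)\bigr) - M(I_i) \bigr\|_2^2
```

Eq. 4 would then correspond to the single-reference case n = 1.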

Method 3. Empty Space Mask

[GOAL] accumulate the registration error only within the valid image region after deformation

After image deformation, we collect the pixels outside the valid image region and make a binary mask image. We resize this mask image to match the size of the autoencoder feature map using a bilinear interpolation.
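
A minimal sketch (assuming PyTorch and the warp helper above) of masking the feature-space error; the way the mask is built from an all-ones image and the exact normalization are assumptions.

```python
import torch
import torch.nn.functional as F

def masked_feature_error(feat_warped, feat_ref, valid_mask, feat_size):
    """feat_warped / feat_ref: (B, C, h, w) autoencoder feature maps.
    valid_mask: (B, 1, H, W) binary mask that is 1 inside the deformed image
    and 0 in the empty space introduced by the deformation."""
    # Resize the mask to the feature-map resolution with bilinear interpolation.
    mask = F.interpolate(valid_mask, size=feat_size, mode='bilinear', align_corners=True)
    err = (feat_warped - feat_ref) ** 2
    # Accumulate the error only inside the valid image region.
    return (err * mask).sum() / mask.sum().clamp(min=1e-6)

# One way to obtain the mask (an assumption): warp an all-ones image with the
# same deformation; pixels pulled from outside the image sample zeros.
# ones = torch.ones_like(moving)
# valid_mask = warp_with_coarse_field(ones, v)
```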

Method 4. Loss Drop

[GOAL] prevent local minima and obtain smoother registration results

First drop the top 50% of high-error features, and then reduce the dropping rate by half every iteration (see the sketch below).
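
A minimal sketch of the loss-drop schedule (assuming PyTorch and a flattened per-feature error tensor); whether the error is dropped per feature or per pixel, and the exact schedule, follow the notes above rather than the paper's exact settings.

```python
import torch

def dropped_loss(per_feature_error, iteration, initial_drop=0.5):
    """per_feature_error: 1-D tensor of per-feature registration errors.
    Drop the highest-error entries; the drop rate halves every iteration."""
    drop_rate = initial_drop / (2 ** iteration)  # 0.5, 0.25, 0.125, ...
    n = per_feature_error.numel()
    keep = n - int(n * drop_rate)
    # Keep only the `keep` smallest errors; large outlier errors are ignored,
    # which helps avoid local minima and overly large deformations.
    kept, _ = torch.topk(per_feature_error, keep, largest=False)
    return kept.mean()

# Example usage inside the registration loop:
# loss = dropped_loss(err.flatten(), iteration)
```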

Performance