Naoaki

Anterior-Posterior Depth Estimation from a Single X-ray Image

Focus

This project addresses the recovery of three-dimensional structural information from a single frontal X-ray image, with a specific focus on the vertebral column. A single anteroposterior radiograph provides rich two-dimensional information about the size, shape, and alignment of the vertebral bodies, but it lacks the depth dimension: the extent of each structure along the anterior-posterior (AP) axis, i.e., the axis perpendicular to the imaging plane. This missing dimension matters for a complete understanding of spinal anatomy, particularly when assessing pathological conditions such as vertebral fractures, spondylolisthesis, or degenerative disc disease, where the three-dimensional geometry of the spine plays a central role in both diagnosis and treatment planning.

The proposed method, referred to as dual-face depth estimation, is designed to recover this missing dimension by simultaneously predicting two depth images from a single X-ray: one corresponding to the front surface (anterior face) of each vertebral body and one corresponding to the back surface (posterior face). These two complementary depth maps together define the full anterior-posterior extent of each vertebra along each projection ray, effectively encoding both surface geometry and local thickness. This representation is physically motivated and directly compatible with X-ray imaging geometry, as the front and back depth values correspond to the points where each ray enters and exits the bone.
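
The relationship between the two depth maps and local thickness can be illustrated with a small NumPy sketch. It assumes a convention that the project description does not state: depth is measured along each ray from the X-ray source, so the front value should not exceed the back value. The helper name `thickness_from_dual_depth` and the use of NaN for pixels outside the bone mask are likewise illustrative choices.

```python
import numpy as np

def thickness_from_dual_depth(front, back, mask):
    """Per-pixel AP thickness from front/back depth maps given in the same unit.

    Assumption (not from the source): depth is distance from the X-ray
    source along the ray, so a ray enters the bone (front) before it exits (back).
    """
    front = np.asarray(front, dtype=float)
    back = np.asarray(back, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    # Predicted maps may slightly violate back >= front; clamp negatives to 0.
    thickness = np.clip(back - front, 0.0, None)
    # Pixels outside the vertebra mask carry no thickness value.
    return np.where(mask, thickness, np.nan)

# Toy 2x2 example: the top row is bone, the bottom row is background.
front = [[10.0, 12.0], [0.0, 0.0]]
back = [[40.0, 45.0], [0.0, 0.0]]
mask = [[True, True], [False, False]]
thickness = thickness_from_dual_depth(front, back, mask)
```

Storing the two faces separately, rather than a single thickness map, keeps the absolute position of each surface along the ray, which is what a later 3D reconstruction needs.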

The prediction pipeline begins with a segmentation step that identifies individual vertebral bodies in the X-ray image. A segmentation network produces labeled bone masks, separating each vertebra from the surrounding structures and providing a geometric prior that constrains the subsequent depth estimation. The segmentation output is then used alongside the original X-ray image as input to the main depth estimation network, which predicts the front and back depth maps jointly. The network architecture is designed to leverage both the global context provided by the full X-ray image and the precise spatial boundaries defined by the segmentation.
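
One plausible way to present the X-ray and the segmentation to a depth network together is to stack them as channels. The sketch below is an assumption for illustration only: the description does not say how the two inputs are combined, and the function name `build_network_input`, the min-max normalization, and the one-hot encoding of vertebra labels are all hypothetical choices.

```python
import numpy as np

def build_network_input(xray, labels, num_vertebrae):
    """Stack a normalized X-ray with one-hot vertebra masks, channel-first.

    xray:   (H, W) intensity image
    labels: (H, W) integer map, 0 = background, 1..num_vertebrae = vertebra id
    Returns a float32 array of shape (1 + num_vertebrae, H, W).
    """
    xray = np.asarray(xray, dtype=np.float32)
    labels = np.asarray(labels)
    lo, hi = xray.min(), xray.max()
    # Min-max normalize to [0, 1]; a constant image maps to all zeros.
    xray_norm = (xray - lo) / (hi - lo) if hi > lo else np.zeros_like(xray)
    # One binary channel per vertebra label.
    ids = np.arange(1, num_vertebrae + 1)[:, None, None]
    one_hot = (labels[None, :, :] == ids).astype(np.float32)
    return np.concatenate([xray_norm[None], one_hot], axis=0)

# Toy 2x2 image containing two labeled vertebrae.
x = build_network_input([[0, 50], [100, 200]], [[0, 1], [2, 2]], num_vertebrae=2)
```

Under this layout, a network would map the `(1 + K, H, W)` input to a two-channel output holding the front and back depth maps.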

Once the dual-face depth maps are predicted, three-dimensional reconstruction is performed using the known X-ray imaging geometry. The X-ray source position and detector plane define a set of projection rays, one per pixel. The predicted front and back depth values for each pixel define two points along the corresponding ray. These points are back-projected into three-dimensional space, yielding a dense point cloud representation of each vertebral surface. This point cloud can then be used for 2D-3D registration with reference CT models, enabling the alignment of a statistical or patient-specific 3D shape model to the X-ray data and thereby recovering a complete and anatomically coherent 3D reconstruction.
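
The back-projection step can be sketched as follows. The parameterization is an assumption made for illustration: the detector is described by a corner point `det_origin`, two unit vectors `det_u` and `det_v` along its columns and rows, and an isotropic pixel `spacing`, and each depth value is taken to be the distance from the source along the ray. None of these names or conventions come from the project description.

```python
import numpy as np

def backproject(depth, mask, source, det_origin, det_u, det_v, spacing):
    """Turn a depth map into 3D points along source-to-pixel rays.

    Assumption: depth[r, c] is the distance from the X-ray source along the
    ray through pixel (r, c), in the same unit as the geometry (e.g. mm).
    Returns an (N, 3) array, one point per masked pixel.
    """
    depth = np.asarray(depth, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    source = np.asarray(source, dtype=float)
    det_origin = np.asarray(det_origin, dtype=float)
    det_u = np.asarray(det_u, dtype=float)
    det_v = np.asarray(det_v, dtype=float)

    rows, cols = np.nonzero(mask)
    # 3D position of each selected pixel on the detector plane.
    pix = (det_origin
           + cols[:, None] * spacing * det_u
           + rows[:, None] * spacing * det_v)
    # Unit direction of the ray from the source through each pixel.
    dirs = pix - source
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    # Walk the given distance along each ray.
    return source + depth[rows, cols][:, None] * dirs

# Toy geometry: source at the origin, detector plane at z = 1000.
geom = dict(source=[0, 0, 0], det_origin=[0, 0, 1000],
            det_u=[1, 0, 0], det_v=[0, 1, 0], spacing=1.0)
mask = np.array([[True, True]])
front_pts = backproject([[500.0, 250.0]], mask, **geom)
back_pts = backproject([[530.0, 275.0]], mask, **geom)
cloud = np.vstack([front_pts, back_pts])  # combined surface point cloud
```

Calling the function once per face and stacking the results gives the dense point cloud that a subsequent registration step would consume.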

The training data are generated from CT volumes by a pipeline that pairs simulated X-ray projections with ground-truth depth maps computed from the CT segmentations. This allows the model to be trained in a fully supervised manner without requiring actual depth measurements at inference time. The method is evaluated both qualitatively, by visually comparing predicted and ground-truth depth maps, and quantitatively, by measuring the accuracy of the resulting 3D reconstructions against CT reference data. The approach is a promising direction for enabling three-dimensional vertebral analysis from routine, low-cost radiographic examinations.
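
To make the idea of ground-truth depth maps from a CT segmentation concrete, the sketch below uses a deliberately simplified parallel-beam model: rays run straight along one axis of a binary bone volume, and the first and last bone voxels on each ray give the front and back depths. This is an assumption for illustration; a pipeline matched to real radiographs would trace diverging rays from the source instead. The function name `dual_face_depth_parallel` is hypothetical.

```python
import numpy as np

def dual_face_depth_parallel(seg, voxel_mm, axis=0):
    """Front/back depth maps from a binary bone volume, parallel rays.

    seg:      3D boolean array, True where the voxel belongs to the vertebra
    voxel_mm: voxel size along the ray axis
    Returns (front, back, hit); depths are NaN where a ray misses the bone.
    """
    seg = np.moveaxis(np.asarray(seg, dtype=bool), axis, 0)
    n = seg.shape[0]
    hit = seg.any(axis=0)
    # argmax on a boolean array returns the index of the first True value.
    first = seg.argmax(axis=0)
    last = n - 1 - seg[::-1].argmax(axis=0)
    # Entry face is the near side of the first voxel, exit face the far
    # side of the last voxel.
    front = np.where(hit, first * voxel_mm, np.nan)
    back = np.where(hit, (last + 1) * voxel_mm, np.nan)
    return front, back, hit

# Toy volume: 5 voxels along the ray, a 1x2 image; only the first ray hits bone.
seg = np.zeros((5, 1, 2), dtype=bool)
seg[1:4, 0, 0] = True
front, back, hit = dual_face_depth_parallel(seg, voxel_mm=2.0)
```

Only the first entry and the last exit are recorded, so any gap inside the bone along a ray is ignored, which matches the two-surface representation described above.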
