Next-generation Surgical Navigation: Marker-less Multi-view 6DoF Pose Estimation of Surgical Instruments

1 Research in Orthopedic Computer Science, Balgrist University Hospital, University of Zurich, Switzerland
2 Computer Vision and Geometry Group, ETH Zurich, Switzerland
3 Balgrist University Hospital, University of Zurich, Switzerland
4 OR-X Translational Center for Surgery, Balgrist University Hospital, University of Zurich, Switzerland
5 Computer Aided Medical Procedures, Technical University Munich, Germany

Graphical Abstract

A multi-view RGB-D video dataset of ex-vivo spine surgeries.

Millimeter-accurate marker-less 6DoF pose estimation of surgical instruments.

Abstract

State-of-the-art research in traditional computer vision is increasingly leveraged in the surgical domain. A particular focus in computer-assisted surgery is to replace marker-based tracking systems for instrument localization with purely image-based 6DoF pose estimation using deep-learning methods. However, state-of-the-art single-view pose estimation methods do not yet meet the accuracy required for surgical navigation. In this context, we investigate the benefits of multi-view setups for highly accurate and occlusion-robust 6DoF pose estimation of surgical instruments and derive recommendations for an ideal camera system that addresses the challenges in the operating room.

Our contributions are threefold. First, we present a multi-view RGB-D video dataset of ex-vivo spine surgeries, captured with static and head-mounted cameras and including rich annotations for surgeon, instruments, and patient anatomy. Second, we perform an extensive evaluation of three state-of-the-art single-view and multi-view pose estimation methods, analyzing the impact of camera quantities and positioning, limited real-world data, and static, hybrid, or fully mobile camera setups on the pose accuracy, occlusion robustness, and generalizability. Third, we design a multi-camera system for marker-less surgical instrument tracking, achieving an average position error of 1.01 mm and orientation error of 0.89° for a surgical drill, and 2.79 mm and 3.33° for a screwdriver under optimal conditions. Our results demonstrate that marker-less tracking of surgical instruments is becoming a feasible alternative to existing marker-based systems.
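
The position and orientation errors quoted above can be illustrated with a minimal sketch. It assumes the common definitions (Euclidean distance between translations, and the angle of the relative rotation between estimated and ground-truth orientation); the exact evaluation protocol is not stated on this page, and the function names `pose_errors` and `rot_z` are hypothetical helpers, not part of the dataset tooling.

```python
import math

def pose_errors(R_est, t_est, R_gt, t_gt):
    """Return (position error, orientation error in degrees).

    R_* are 3x3 rotation matrices as nested lists, t_* are 3-vectors.
    The position error has the same unit as the translations (mm here).
    Assumed metric definitions; not taken from the paper's code.
    """
    # Position error: Euclidean distance between the two translations.
    pos_err = math.dist(t_est, t_gt)
    # trace(R_est @ R_gt^T) equals the element-wise product sum of both matrices.
    trace = sum(R_est[i][j] * R_gt[i][j] for i in range(3) for j in range(3))
    # Angle of the relative rotation; clamp to guard against rounding errors.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return pos_err, math.degrees(math.acos(cos_angle))

def rot_z(deg):
    """Rotation about the z-axis by `deg` degrees (illustrative helper)."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

# Toy example: estimate is off by 1 mm along x and 2 degrees about z.
R_gt, t_gt = rot_z(0.0), (0.0, 0.0, 0.0)
R_est, t_est = rot_z(2.0), (1.0, 0.0, 0.0)
pos_err, rot_err = pose_errors(R_est, t_est, R_gt, t_gt)
print(f"position error: {pos_err:.2f} mm, orientation error: {rot_err:.2f} deg")
```

Averaging these two quantities over all test frames would give summary numbers of the kind reported in the abstract, under the assumptions stated above.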

Dataset

We provide download and visualization scripts, as well as a Python wrapper for our dataset, on GitHub: https://github.com/jonashein/mvpsp_dataset

Video

[Video grid. Columns: OR-X Bright Test Set, OR-X Dark Test Set. Rows: Synthetic Training, Synth-Real Training.]

Qualitative Baselines Comparison

*Yellow triangles indicate frames that were not part of the input to the pose estimation method.

OR-X Bright Test Set

Qualitative comparison on the bright OR-X subset.

OR-X Dark Test Set

Qualitative comparison on the dark OR-X subset.

BibTeX

@misc{hein2023nextgeneration,
      title={Next-generation Surgical Navigation: Marker-less Multi-view 6DoF Pose Estimation of Surgical Instruments},
      author={Jonas Hein and Nicola Cavalcanti and Daniel Suter and Lukas Zingg and Fabio Carrillo and Lilian Calvet and Mazda Farshad and Marc Pollefeys and Nassir Navab and Philipp Fürnstahl},
      year={2023},
      eprint={2305.03535},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}