Weakly Semi-supervised Tool Detection in Minimally Invasive Surgery Videos

Keio University
ICASSP 2024

Refine pseudo-label categories using image-level labels to reduce misclassification.
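
One way to picture this refinement step is the minimal sketch below. It is an illustration under stated assumptions, not the authors' implementation: it assumes the detector outputs a score per class for each predicted box and that the image-level label is a binary vector marking which tool classes appear in the frame. The function name `refine_pseudo_labels` and its arguments are hypothetical.

```python
import numpy as np

def refine_pseudo_labels(class_scores, image_labels):
    """Reassign each box to its best-scoring class among those present in the image.

    class_scores: (num_boxes, num_classes) detector scores (assumed input format).
    image_labels: (num_classes,) binary vector, 1 if the tool class is in the image.
    Returns an array of refined class indices, one per box.
    """
    class_scores = np.asarray(class_scores, dtype=float)
    present = np.asarray(image_labels, dtype=bool)
    if not present.any():
        # No tool is labeled in this image, so no box can be kept.
        return np.empty(0, dtype=int)
    # Classes absent from the image-level label can never win the argmax.
    masked = np.where(present, class_scores, -np.inf)
    return masked.argmax(axis=1)

# Box 0 is initially scored highest for class 0, which the image-level label rules out.
scores = [[0.6, 0.3, 0.1],
          [0.2, 0.5, 0.3]]
refined = refine_pseudo_labels(scores, image_labels=[0, 1, 1])
```

In this toy case the first box would be moved from class 0 to class 1, the highest-scoring class that the image-level label allows.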

One-Sentence Summary

By leveraging image-level labels and the tendency of tools to co-occur in surgical video, we build a surgical tool detector while reducing annotation cost by 55%.

Abstract

Surgical tool detection is essential for analyzing and evaluating minimally invasive surgery videos. Current approaches are mostly supervised methods that require large datasets with full instance-level labels (i.e., bounding boxes). However, such datasets are scarce because of the annotation burden. Surgical tool detection from image-level labels is therefore important, since image-level annotations are considerably more time-efficient to produce than instance-level annotations. In this work, we aim to strike a balance between the extremely costly annotation burden and detection performance. We further propose a co-occurrence loss, which leverages image-level labels by exploiting the fact that some tool pairs often co-occur in an image. Encapsulating this co-occurrence knowledge in the loss helps to overcome the classification difficulty that arises because some tools have similar shapes and textures. Extensive experiments on the Endovis2018 dataset in various data settings show the effectiveness of our method.
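
The co-occurrence tendency mentioned above can be estimated directly from image-level labels. The sketch below is only an illustration of that statistic, not the co-occurrence loss itself, whose exact form is not given here. It assumes the labels are available as a multi-hot matrix with one row per image, and the function name `cooccurrence_matrix` is hypothetical.

```python
import numpy as np

def cooccurrence_matrix(labels):
    """Estimate P(class j present | class i present) from image-level labels.

    labels: (num_images, num_classes) multi-hot matrix (assumed input format).
    Returns a (num_classes, num_classes) matrix; rows for classes that never
    appear are left at zero.
    """
    labels = np.asarray(labels, dtype=float)
    # joint[i, j] counts the images that contain both class i and class j.
    joint = labels.T @ labels
    # The diagonal holds the number of images containing each class.
    counts = np.diag(joint).copy()[:, None]
    cond = np.zeros_like(joint)
    np.divide(joint, counts, out=cond, where=counts > 0)
    return cond

# Toy labels: classes 0 and 1 usually appear together, class 2 rarely joins them.
labels = [[1, 1, 0],
          [1, 1, 0],
          [1, 0, 1],
          [0, 0, 1]]
cond = cooccurrence_matrix(labels)
```

A matrix like this makes the intuition concrete: when two visually similar classes are candidates for a box, the class that frequently co-occurs with the other tools known to be in the image is the more plausible one.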

BibTeX

@inproceedings{fujii2024weakly,
  author    = {Fujii, Ryo and Hachiuma, Ryo and Saito, Hideo},
  title     = {Weakly Semi-Supervised Tool Detection in Minimally Invasive Surgery Videos},
  booktitle = {ICASSP},
  year      = {2024},
}