Published on Mon Feb 13 2017

Unsupervised temporal context learning using convolutional neural networks for laparoscopic workflow analysis

Sebastian Bodenstedt, Martin Wagner, Darko Katić, Patrick Mietkowski, Benjamin Mayer, Hannes Kenngott, Beat Müller-Stich, Rüdiger Dillmann, Stefanie Speidel


Abstract

Computer-assisted surgery (CAS) aims to provide the surgeon with the right type of assistance at the right moment. Such assistance systems are especially relevant in laparoscopic surgery, where CAS can alleviate some of the drawbacks that surgeons incur. For many assistance functions, e.g. displaying the location of a tumor at the appropriate time or suggesting which instruments to prepare next, analyzing the surgical workflow is a prerequisite. Since laparoscopic interventions are performed via endoscope, the video signal is an obvious sensor modality to rely on for workflow analysis. Image-based workflow analysis tasks in laparoscopy, such as phase recognition, skill assessment, video indexing or automatic annotation, require a temporal distinction between video frames. Generally, computer-vision-based methods that generalize from previously seen data are used. Training such methods requires large amounts of annotated data. Since annotating surgical data requires expert knowledge, collecting a sufficient amount of data is difficult, time-consuming and not always feasible. In this paper, we address this problem by presenting an unsupervised method for training a convolutional neural network (CNN) to differentiate between laparoscopic video frames on a temporal basis. We extract video frames at regular intervals from 324 unlabeled laparoscopic interventions, resulting in a dataset of approximately 2.2 million images. From this dataset, we extract image pairs from the same video and train a CNN to determine their temporal order. To solve this task, the CNN has to extract features that are relevant for comprehending laparoscopic workflow. Furthermore, we demonstrate that such a CNN can be adapted for surgical workflow segmentation. We performed image-based workflow segmentation on a publicly available dataset of 7 cholecystectomies and 9 colorectal interventions.
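The pretext task described in the abstract, sampling two frames from the same video and training a network to predict their temporal order, can be sketched as follows. This is a minimal illustration of the label-generation step only; the function and parameter names are hypothetical and not taken from the paper:

```python
import random

def make_order_pairs(frame_indices, n_pairs, rng=None):
    """Sample pairs of frame indices from one video and label their temporal order.

    Hypothetical helper illustrating the unsupervised pretext task: no manual
    annotation is needed, since the label (1 if the first frame comes earlier
    in the video, else 0) follows directly from the frame positions.
    """
    rng = rng or random.Random(0)
    pairs = []
    for _ in range(n_pairs):
        i, j = rng.sample(frame_indices, 2)  # two distinct frames of the same video
        pairs.append(((i, j), 1 if i < j else 0))
    return pairs
```

A CNN trained on such pairs must learn features that capture workflow progression in order to solve the ordering task, which is what makes the learned representation reusable for phase segmentation.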

Mon Jun 18 2018
Computer Vision
Temporal coherence-based self-supervised learning for laparoscopic workflow analysis
Convolutional networks provide the best performance for video-based analysis. For training such networks, large amounts of annotated data are necessary. We were able to achieve an increase of the F1 score of up to 10 points when compared to a non-pretrained neural network.
Tue Feb 09 2016
Computer Vision
EndoNet: A Deep Architecture for Recognition Tasks on Laparoscopic Videos
Phase recognition has been studied in the context of several kinds of surgeries, such as cataract, neurological, and laparoscopic surgeries. Two types of features are typically used to perform this task: visual features and tool usage signals. We present a novel CNN architecture, called EndoNet.
Tue Sep 01 2020
Artificial Intelligence
Aggregating Long-Term Context for Learning Laparoscopic and Robot-Assisted Surgical Workflows
Analyzing surgical workflow is crucial for surgical assistance robots to understand surgeries. We propose a new temporal network structure that leverages task-specific network representation. We demonstrate superior results over existing and novel state-of-the-art segmentation techniques.
Wed May 22 2019
Machine Learning
LapTool-Net: A Contextual Detector of Surgical Tools in Laparoscopic Videos Based on Recurrent Convolutional Neural Networks
LapTool-Net is a multilabel classifier based on a Recurrent Convolutional Neural Network (RCNN) architecture that simultaneously extracts spatio-temporal features. LapTool-Net was trained using a publicly available dataset of laparoscopic cholecystectomy.
Thu Nov 08 2018
Computer Vision
Prediction of laparoscopic procedure duration using unlabeled, multimodal sensor data
The course of surgical procedures is often unpredictable, making it difficult to estimate the duration of procedures beforehand. A context-aware method that analyzes the workflow of an intervention online and automatically predicts the remaining duration would alleviate these problems.
Tue Dec 04 2018
Machine Learning
Weakly Supervised Convolutional LSTM Approach for Tool Tracking in Laparoscopic Videos
Real-time surgical tool tracking is a core component of the future of the operating room. Current methods for surgical tool tracking in videos need to be trained on data in which the spatial positions of surgical tools are manually annotated. We propose to use solely binary presence annotations to train a tool tracker for laparoscopic videos.