PhD opportunity on "Exploiting multi-task learning for endoscopic vision in robotic surgery"

Project overview:

  • Title: Exploiting multi-task learning for endoscopic vision in robotic surgery
  • First supervisor: Miaojing Shi
  • Second supervisor: Tom Vercauteren
  • Clinical supervisor: Asit Arora
  • Start date: October 2022
Overview of the project objective. Laparoscopic image courtesy of [ROBUST-MIS](https://robustmis2019.grand-challenge.org/).

Project summary

Multi-task learning is common in deep learning, where clear evidence shows that jointly learning correlated tasks can improve on the performance of each task learned individually. In practice, however, many tasks are still processed independently, for several reasons:

  • many tasks are not strongly correlated, so joint learning may benefit only one of the tasks, or none;
  • multi-task learning scales poorly with the number of tasks, in terms of both network optimization and practical implementation.

A scalable and robust multi-task learning strategy is nevertheless valuable and has substantial potential in many real applications, e.g. endoscopic image processing. This project studies multi-task learning in endoscopic vision for robotic surgery, with a particular focus on depth and optical flow estimation, surgical instrument detection and anatomy recognition, as well as surgical action recognition. The aim is to design effective multi-task learning strategies that improve performance on all tasks.

Project description

Multi-task learning is common in deep learning. For similar tasks such as detection and segmentation, or detection and counting, it has already been achieved by using the supervision of one task to help the other. There is clear evidence that adding a side task helps improve the main task, yet it is unclear how much benefit both tasks gain from these combinations, especially if they are not strongly correlated. For this reason, multiple tasks are currently processed independently in most cases. Another main obstacle lies in the scalability of learning multiple tasks together, in terms of both network optimization and practical implementation. Tackling this requires careful design of how multiple tasks are combined; novel learning paradigms are also expected.

This project is placed in the endoscopic image processing domain. We aim to develop a machine learning model with general visual intelligence capacity in robotic surgery, which includes depth and optical flow estimation, surgical instrument detection and anatomy recognition, as well as surgical action recognition. Depth and optical flow estimation, as well as anatomy recognition, are key requirements for developing autonomous robotic control schemes that are cognizant of the surgical scene.

Automatic detection and tracking of surgical instruments in laparoscopic surgery videos further plays an important role in providing advanced surgical assistance to the clinical team, given the uncertainties associated with surgical robots' kinematic chains and the potential presence of tools not directly manipulated by the robot. Knowing how many instruments are present and where they are has applications such as: placing informative overlays on the screen; performing augmented reality without occluding instruments; visual servoing; surgical task automation; etc. Surgical action recognition is also critical for advancing autonomous robotic assistance during the procedure and for automated auditing purposes.

More information about the PhD project and how to apply is available on the website of the EPSRC Centre for Doctoral Training in Smart Medical Imaging.

Tom Vercauteren
Professor of Interventional Image Computing

Tom’s research interests include machine learning and computer-assisted interventions.