Abstract
Deep neural networks have proven to be promising approaches for medical image analysis. However, their training is most effective when they learn robust data representations from large-scale annotated datasets, which are tedious to acquire in clinical practice. As medical annotations are often limited, there has been increasing interest in making data representations robust when data are scarce. In particular, a spate of research focuses on constraining the learned representations to be interpretable and able to separate out, or disentangle, the data explanatory factors. This chapter discusses recent disentanglement frameworks, with a special focus on the image segmentation task. We build on a recent approach for disentangling cardiac medical images into disjoint patient-anatomy and imaging-modality representations. We incorporate into the model a purposely designed architecture (which we term the "temporal transformer") which, given an image and a time gap, estimates the anatomical representation of that image at a future time-point within the cardiac cycle of a cine MRI. The transformer's role is to introduce a self-supervised objective that encourages the emergence of temporally coherent data representations. We show that such regularization improves the quality of the disentangled representations, ultimately increasing semi-supervised segmentation performance when annotations are scarce. Finally, we show that predicting future representations can potentially be used for image synthesis tasks.
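To illustrate the self-supervised temporal-consistency objective sketched in the abstract, the toy example below pairs a stand-in anatomy encoder with a hypothetical affine predictor playing the role of the temporal transformer: given the current anatomy factor and a time gap, it predicts the future factor, and the objective penalizes the distance to the actually encoded future frame. All names here (`encode_anatomy`, `temporal_predictor`, the 8-dimensional factor) are illustrative assumptions, not the chapter's actual architecture.

```python
import numpy as np

def encode_anatomy(image):
    # Hypothetical stand-in for the anatomy encoder: a fixed linear
    # projection of the flattened image to an 8-dim anatomy factor.
    flat = image.reshape(-1)
    W = np.linspace(-1.0, 1.0, flat.size * 8).reshape(8, flat.size)
    return W @ flat

def temporal_predictor(z_t, dt, A, b):
    # Toy "temporal transformer": predicts the anatomy factor dt steps
    # ahead as an affine function of the current factor and the time gap.
    return A @ z_t + dt * b

def temporal_consistency_loss(z_pred, z_future):
    # Self-supervised objective: mean squared error between the predicted
    # and the actually encoded future anatomy factor.
    return float(np.mean((z_pred - z_future) ** 2))

# Two frames of a (synthetic) cine sequence, two time steps apart.
rng = np.random.default_rng(0)
frame_t, frame_future = rng.random((4, 4)), rng.random((4, 4))
A, b = np.eye(8), np.zeros(8)  # untrained predictor parameters
z_t = encode_anatomy(frame_t)
loss = temporal_consistency_loss(
    temporal_predictor(z_t, 2.0, A, b), encode_anatomy(frame_future)
)
```

In the full model this loss would be minimized jointly with the reconstruction and segmentation objectives, so that the anatomy factors of nearby frames become predictable from one another.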
Original language | English |
---|---|
Title of host publication | Biomedical Image Synthesis and Simulation |
Subtitle of host publication | Methods and Applications |
Publisher | Elsevier |
Pages | 325-346 |
Number of pages | 22 |
ISBN (Electronic) | 9780128243497 |
ISBN (Print) | 9780128243503 |
DOIs | |
Publication status | Published - 1 Jul 2022 |
Keywords
- Disentangled representations
- Semi-supervised segmentation
- Temporal consistency