Developing novel data synthesis methods for diagnostic data using state-of-the-art generative models.

PI: Claes Lundström
Ongoing: Yes
Start: 2024-11-21

A widespread problem in AI research for medical applications is the lack of data for training and validating algorithms. The challenge is particularly severe for rare diseases. To address this, the project aims to develop novel data synthesis methods for diagnostic data. We will build on state-of-the-art image synthesis methods such as generative adversarial networks and diffusion models.

A key idea we will explore is to use related data sets as templates for the synthesis. We will begin with within-discipline data sets covering different organs and diseases, for example using histopathology imaging from common cancers and normal tissue to synthesize rare cancer training data. We will also explore cross-discipline “related data templates”, such as using radiology (primarily high-resolution CT) for histopathology synthesis and introducing connections to genomics data. Throughout method development, we aim to include uncertainty estimation through, for instance, ensemble approaches.
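
As an illustration of the ensemble approach to uncertainty estimation mentioned above, the following is a minimal sketch and not the project's actual method. It assumes several independently trained generators have each produced an output for the same input, and it treats the per-pixel spread across those outputs as an uncertainty estimate. The function name `ensemble_uncertainty` and the toy 4-pixel outputs are hypothetical and chosen only for this example.

```python
from statistics import mean, pstdev


def ensemble_uncertainty(predictions):
    """Combine per-pixel outputs from several ensemble members.

    predictions: list of equally sized flat lists, one per ensemble member
    (hypothetical stand-ins for synthesized pixel intensities).
    Returns (mean_image, std_image); a larger standard deviation means
    the ensemble members disagree more at that pixel.
    """
    if len(predictions) < 2:
        raise ValueError("an ensemble needs at least two members")
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("all members must produce outputs of the same size")

    # Regroup the data so that each tuple holds one pixel's values
    # across all ensemble members.
    per_pixel = list(zip(*predictions))
    mean_image = [mean(values) for values in per_pixel]
    std_image = [pstdev(values) for values in per_pixel]
    return mean_image, std_image


# Three hypothetical ensemble members, each producing a 4-pixel output.
members = [
    [0.10, 0.50, 0.90, 0.30],
    [0.10, 0.55, 0.70, 0.30],
    [0.10, 0.45, 0.80, 0.30],
]
mean_image, std_image = ensemble_uncertainty(members)
```

In this toy case the members agree exactly on the first and last pixels, so the estimated uncertainty there is zero, while the third pixel, where the outputs differ most, receives the largest value. In practice the same computation would typically be applied to full image arrays.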

Project Plan

Phase 1: Domain-Specific Fine-Tuning

Phase 2: ControlNet Integration

Phase 3: Scaling to Larger Images

Phase 4: Evaluation Framework

Phase 5: Refinement and Specialization

Phase 6: Documentation and Publication