ZMIC Journal Club

1️⃣ DiffAtlas: GenAI-fying Atlas Segmentation via Image-Mask Diffusion

2️⃣ Anomaly Detection by Clustering DINO Embeddings using a Dirichlet Process Mixture

Presenter: 张杨
School of Data Science, Fudan University
2025-11-14

Their Posters

Thanks to @高一博

Introduction

  • Deep models like U-Net: depend heavily on large(?) datasets and struggle with domain adaptation.
  • Atlas-based methods: struggle with fine-grained structures and require labor-intensive, domain-specific atlases.

DiffAtlas leverages modern generative AI techniques to "GenAI-fy" the registration + propagation process of atlas-based segmentation.

DiffAtlas

From Discriminative to Generative

  • Discriminative models (e.g. U-Net) are parameterized estimators of the mapping from image to mask.

  • Generative models (e.g. diffusion models) estimate the sample distribution.

  • Conditional generative models (e.g. mask diffusion) estimate the conditional distribution of the mask given the image.
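
In symbols, the three model families might be written as follows. The notation ($x$ for the image, $m$ for the mask, $\theta$ for the model parameters) is assumed here for illustration and is not taken from the slides.

```latex
\begin{align*}
\text{Discriminative:} \quad & \hat{m} = f_\theta(x) \\
\text{Generative:} \quad & p_\theta(x) \approx p_{\text{data}}(x) \\
\text{Conditional generative:} \quad & p_\theta(m \mid x) \approx p_{\text{data}}(m \mid x)
\end{align*}
```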

Conditional Diffusion

Mask diffusion decomposes the generation of the segmentation mask into a step-by-step denoising process:

  • forward

  • backward

use

to estimate
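
For reference, the standard DDPM formulation of these two processes, applied to a mask $m$ conditioned on an image $x$, can be sketched as below. The symbols ($\beta_t$ for the noise schedule, $\bar\alpha_t = \prod_{s \le t} (1 - \beta_s)$, $\epsilon_\theta$ for the denoising network) follow common DDPM usage and are assumptions, not a transcription of the slide.

```latex
\begin{align*}
\text{Forward:} \quad & q(m_t \mid m_{t-1}) = \mathcal{N}\!\left(m_t;\ \sqrt{1-\beta_t}\, m_{t-1},\ \beta_t I\right),
\qquad m_t = \sqrt{\bar\alpha_t}\, m_0 + \sqrt{1-\bar\alpha_t}\, \epsilon,\quad \epsilon \sim \mathcal{N}(0, I) \\
\text{Backward:} \quad & p_\theta(m_{t-1} \mid m_t, x) = \mathcal{N}\!\left(m_{t-1};\ \mu_\theta(m_t, x, t),\ \sigma_t^2 I\right) \\
\text{Training:} \quad & \min_\theta\ \mathbb{E}_{m_0,\, \epsilon,\, t} \left\| \epsilon - \epsilon_\theta(m_t, x, t) \right\|^2
\end{align*}
```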

DiffAtlas: training

DiffAtlas models the input image and segmentation mask as a pair within a generative atlas space:
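
A minimal NumPy sketch of what noising an image-mask pair jointly could look like. The function names, the linear beta schedule, and the use of one shared timestep with independent noise per modality are assumptions made for illustration; they are not confirmed details of the DiffAtlas implementation.

```python
import numpy as np


def make_alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def q_sample_pair(image, mask, t, alpha_bar, rng):
    """Noise image and mask at the same timestep t with independent noise.

    image: (C_img, D, H, W), mask: (C_mask, D, H, W).
    Returns the channel-wise concatenated noisy pair and both noise tensors.
    """
    noise_x = rng.standard_normal(image.shape)
    noise_m = rng.standard_normal(mask.shape)
    signal = np.sqrt(alpha_bar[t])
    sigma = np.sqrt(1.0 - alpha_bar[t])
    x_noisy = signal * image + sigma * noise_x
    m_noisy = signal * mask + sigma * noise_m
    # The pair is stacked along the channel axis so one network sees both.
    pair = np.concatenate([x_noisy, m_noisy], axis=0)
    return pair, noise_x, noise_m


rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar(1000)
image = rng.standard_normal((1, 8, 8, 8))  # toy single-channel volume
mask = rng.standard_normal((2, 8, 8, 8))   # toy two-channel mask representation
pair, noise_x, noise_m = q_sample_pair(image, mask, t=500, alpha_bar=alpha_bar, rng=rng)
print(pair.shape)  # (3, 8, 8, 8)
```

The denoising network would then be trained to predict both noise tensors from the stacked pair.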

Details

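import torch
import torch.nn.functional as F

# Excerpt from the training loss; self.denoise_fn is a UNet3D.
# The noisy image and the noisy mask are concatenated along the channel
# axis and denoised jointly.
recon = self.denoise_fn(
    x=torch.cat((x_noisy, m_noisy), dim=1),
    time=t,
    cond=cond,
    **kwargs,
)
# Channel 0 is the predicted noise for the image; the remaining channels
# are the predicted noise for the mask.
x_recon = recon[:, 0]
m_recon = recon[:, 1:]
if self.loss_type == 'l1':
    loss = F.l1_loss(noise_x, x_recon) + F.l1_loss(noise_m, m_recon)
elif self.loss_type == 'l2':
    loss = F.mse_loss(noise_x, x_recon) + F.mse_loss(noise_m, m_recon)
else:
    raise NotImplementedError(f"unknown loss type: {self.loss_type}")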

DiffAtlas represents the mask S using a signed distance function (SDF) to better capture fine anatomical details and ensure smooth transitions between regions. This representation is based on FlowSDF.
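
To make the SDF representation concrete, here is a brute-force NumPy sketch for a small 2D binary mask. The sign convention (negative inside, positive outside), the pixel-to-pixel distance definition, and the function name `signed_distance` are assumptions for illustration; the slides do not specify how DiffAtlas or FlowSDF compute the SDF.

```python
import numpy as np


def signed_distance(mask):
    """Brute-force SDF of a small 2D binary mask.

    Inside pixels get minus the distance to the nearest outside pixel,
    outside pixels get plus the distance to the nearest inside pixel.
    Assumes the mask contains both inside and outside pixels.
    """
    mask = np.asarray(mask, dtype=bool)
    flat = mask.ravel()
    coords = np.argwhere(np.ones_like(mask))  # every pixel coordinate, row-major
    inside = coords[flat]
    outside = coords[~flat]
    # Distance from every pixel to its nearest inside / outside pixel.
    d_inside = np.linalg.norm(coords[:, None, :] - inside[None, :, :], axis=-1).min(axis=1)
    d_outside = np.linalg.norm(coords[:, None, :] - outside[None, :, :], axis=-1).min(axis=1)
    sdf = np.where(flat, -d_outside, d_inside)
    return sdf.reshape(mask.shape)


mask = np.zeros((7, 7), dtype=bool)
mask[2:5, 2:5] = True            # a 3x3 square in the middle
sdf = signed_distance(mask)
recovered = sdf < 0              # thresholding at zero recovers the mask
print(sdf[3, 3])                 # -2.0 (center is two pixels from the background)
print(np.array_equal(recovered, mask))  # True
```

Unlike a binary mask, the SDF varies smoothly across the boundary, which is the property the slide refers to.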

DiffAtlas: inference

  1. Start from a randomly initialized noisy image-mask pair.
  2. Refine it iteratively through the reverse diffusion process over T timesteps.
    • At each timestep t, the noisy image is replaced with a noised version of the input image.
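
The two inference steps above can be sketched with a standard DDPM ancestral sampler in which only the mask channels are sampled, while the image channels are overwritten at every step. Everything here is a hypothetical illustration: `sample_mask`, the stub `denoise_fn`, the beta schedule, and the choice of variance $\beta_t$ are assumptions, not the authors' code.

```python
import numpy as np


def sample_mask(image, denoise_fn, betas, mask_channels, rng):
    """Reverse diffusion on the mask while pinning the image to the input.

    image: (C_img, D, H, W). denoise_fn(pair, t) must return predicted
    noise with the same shape as the stacked image-mask pair.
    """
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)
    n_img = image.shape[0]
    # Step 1: start the mask from pure Gaussian noise.
    m_t = rng.standard_normal((mask_channels,) + image.shape[1:])
    # Step 2: iterate the reverse process from t = T-1 down to 0.
    for t in reversed(range(len(betas))):
        # Replace the image channels with the input image noised to level t.
        x_t = (np.sqrt(alpha_bar[t]) * image
               + np.sqrt(1.0 - alpha_bar[t]) * rng.standard_normal(image.shape))
        pair = np.concatenate([x_t, m_t], axis=0)
        eps_m = denoise_fn(pair, t)[n_img:]  # keep only the mask noise estimate
        mean = (m_t - betas[t] / np.sqrt(1.0 - alpha_bar[t]) * eps_m) / np.sqrt(alphas[t])
        noise = rng.standard_normal(m_t.shape) if t > 0 else 0.0
        m_t = mean + np.sqrt(betas[t]) * noise
    return m_t


def dummy_denoiser(pair, t):
    """Stand-in for a trained network: predicts zero noise everywhere."""
    return np.zeros_like(pair)


rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 50)
image = rng.standard_normal((1, 4, 4, 4))
mask_pred = sample_mask(image, dummy_denoiser, betas, mask_channels=2, rng=rng)
print(mask_pred.shape)  # (2, 4, 4, 4)
```

With a trained network in place of `dummy_denoiser`, the returned tensor would be the generated mask representation for the given image.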

Visualization

Method Comparison:

  • ICF (Image-Conditional Feedforward): nnU-Net
  • RBA (Registration-Based Atlas): CMMAS
  • ICMD (Mask Diffusion): MedSegDiffv2
  • DA (Ours): DiffAtlas

Experiment 1: full training setting

Datasets: TotalSegmentator (746 CT) and MM-WHS (20 CT + 20 MR), with a train:test split of 8:2.

Experiment 2: few shot

  • 2-shot (2 training samples) and 4-shot (4 training samples).
  • Test samples remain the same.

Experiment 3: cross modality

Introduction

There are two lines of work on unsupervised anomaly detection (UAD):

  • Reconstruction-based approaches, where a generative model is trained to reconstruct normal images.
  • Approaches that model the distribution of the features extracted from normal samples.

DINOv2

  • Left: DINO framework
  • Right: DINOv2, more powerful and robust for many tasks.

AnomalyDINO

AnomalyDINO-DPMM

Intuition

My personal insight

What AnomalyDINO-DPMM does is similar to vector database indexing (i.e. a form of compression or clustering aimed at accelerating vector comparison).

DPMM

The Dirichlet process mixture model (DPMM) used here is a Gaussian mixture model without a fixed number of components. It instead infers the necessary number of components from the data, allowing for a more flexible, data-driven approach.
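
One common way to see why the number of components is not fixed is the truncated stick-breaking construction of Dirichlet process weights, sketched below in NumPy. This is a generic illustration of the prior, not the inference procedure from the paper; the function name, the concentration values, and the truncation level are assumptions.

```python
import numpy as np


def stick_breaking_weights(alpha, truncation, rng):
    """Draw mixture weights from a truncated stick-breaking process.

    Each step breaks off a Beta(1, alpha) fraction of the remaining stick.
    Small alpha concentrates mass on few components, large alpha spreads it.
    """
    fractions = rng.beta(1.0, alpha, size=truncation)
    # Stick length left before each break: 1, (1-v1), (1-v1)(1-v2), ...
    remaining = np.concatenate([[1.0], np.cumprod(1.0 - fractions[:-1])])
    return fractions * remaining


rng = np.random.default_rng(0)
weights = stick_breaking_weights(alpha=1.0, truncation=20, rng=rng)
weights_spread = stick_breaking_weights(alpha=10.0, truncation=20, rng=rng)
print((weights > 0.01).sum(), (weights_spread > 0.01).sum())
```

Components whose weight stays negligible after fitting are effectively unused, which is how the model ends up with a data-dependent number of clusters. scikit-learn's `BayesianGaussianMixture` with `weight_concentration_prior_type="dirichlet_process"` is one off-the-shelf variational implementation; the slides do not say which implementation the authors used.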

EM iteration for DPMM

Anomaly Scoring

  • likelihood:

  • Euclidean distance:

  • Cosine similarity:

To suppress all components with vanishing weights, we only consider these k:
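
The three scoring options can be sketched against a fitted mixture as follows. The formulas on the slide were not preserved, so the details here are assumptions: spherical covariances, a hypothetical weight threshold `min_weight` for selecting active components, and taking the nearest (or most similar) component mean as the score.

```python
import numpy as np


def anomaly_scores(z, means, variances, weights, min_weight=1e-3):
    """Score one embedding z (D,) against a Gaussian mixture.

    means: (K, D), variances: (K,) spherical, weights: (K,).
    Higher scores mean more anomalous for all three variants.
    """
    active = weights > min_weight  # drop components with vanishing weight
    mu, var, w = means[active], variances[active], weights[active]
    w = w / w.sum()                # renormalize over the active components
    dim = z.shape[0]
    sq_dist = ((z - mu) ** 2).sum(axis=1)
    # Likelihood: negative log density of the mixture at z.
    log_comp = np.log(w) - 0.5 * (dim * np.log(2 * np.pi * var) + sq_dist / var)
    nll = -np.logaddexp.reduce(log_comp)
    # Euclidean distance to the nearest active component mean.
    euclidean = np.sqrt(sq_dist).min()
    # One minus cosine similarity to the most similar active component mean.
    cos = (mu @ z) / (np.linalg.norm(mu, axis=1) * np.linalg.norm(z))
    cosine = 1.0 - cos.max()
    return {"nll": nll, "euclidean": euclidean, "cosine": cosine}


means = np.array([[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
variances = np.ones(3)
weights = np.array([0.6, 0.4, 1e-6])  # third component has vanishing weight
normal = anomaly_scores(np.array([1.0, 0.1]), means, variances, weights)
anomalous = anomaly_scores(np.array([5.0, 5.0]), means, variances, weights)
print(normal)
print(anomalous)
```

In this toy setup the query at (5, 5) sits exactly on the suppressed third component, so it is scored only against the two active components and comes out as more anomalous under all three measures.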

Experiments

Efficiency-Performance Trade-off

THANKS