Mask diffusion decomposes the generation of the segmentation mask into a step-by-step denoising process: a learned network is used to estimate each denoising step.
DiffAtlas models the input image and segmentation mask as a pair and denoises them jointly:

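import torch
import torch.nn.functional as F

# Fragment from inside the model's loss computation.
# self.denoise_fn is a UNet3D; the noisy image and noisy mask are
# concatenated along the channel dimension and denoised together.
recon = self.denoise_fn(
    x=torch.cat((x_noisy, m_noisy), dim=1),
    time=t,
    cond=cond,
    **kwargs,
)
# Channel 0 is the image prediction; the remaining channels are the mask prediction.
x_recon = recon[:, 0, :, :, :]
m_recon = recon[:, 1:, :, :, :]
if self.loss_type == 'l1':
    loss = F.l1_loss(noise_x, x_recon) + F.l1_loss(noise_m, m_recon)
elif self.loss_type == 'l2':
    loss = F.mse_loss(noise_x, x_recon) + F.mse_loss(noise_m, m_recon)
else:
    raise NotImplementedError(f"unknown loss_type: {self.loss_type}")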
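
Reading the objective off the code above, it can be written as follows. This is a sketch under assumptions: the symbols are introduced here only for illustration, with \epsilon_x and \epsilon_m standing for noise_x and noise_m (assumed to be the sampled Gaussian noise), x_t and m_t for x_noisy and m_noisy, c for cond, and \epsilon_\theta for self.denoise_fn. The "mean" reflects the default element-wise averaging of F.l1_loss and F.mse_loss.

```latex
\begin{aligned}
(\hat{\epsilon}_x,\ \hat{\epsilon}_m) &= \epsilon_\theta\big([x_t, m_t],\ t,\ c\big), \\
\mathcal{L}_{\ell_1} &= \operatorname{mean}\,\lvert \epsilon_x - \hat{\epsilon}_x \rvert
                      + \operatorname{mean}\,\lvert \epsilon_m - \hat{\epsilon}_m \rvert, \\
\mathcal{L}_{\ell_2} &= \operatorname{mean}\,(\epsilon_x - \hat{\epsilon}_x)^2
                      + \operatorname{mean}\,(\epsilon_m - \hat{\epsilon}_m)^2 .
\end{aligned}
```

The image term and the mask term are weighted equally, and both share the same timestep t.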
DiffAtlas represents the mask S as a signed distance function (SDF), following FlowSDF, to better capture fine anatomical details and ensure smooth transitions between regions.
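
To make the SDF representation concrete, here is a minimal brute-force sketch for a tiny 2D binary mask. It is not the DiffAtlas or FlowSDF implementation. The function name signed_distance and the sign convention (negative inside, positive outside) are assumptions made for illustration, and a real 3D pipeline would use an optimized distance transform instead of this quadratic loop.

```python
import math

def signed_distance(mask):
    """Brute-force signed distance for a small 2D binary mask (lists of 0/1).

    Assumed convention: negative inside the object, positive outside.
    The mask must contain both foreground and background pixels.
    """
    h, w = len(mask), len(mask[0])
    fg = [(i, j) for i in range(h) for j in range(w) if mask[i][j]]
    bg = [(i, j) for i in range(h) for j in range(w) if not mask[i][j]]
    sdf = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            # Distance to the nearest pixel of the opposite class.
            others = bg if mask[i][j] else fg
            d = min(math.hypot(i - a, j - b) for a, b in others)
            sdf[i][j] = -d if mask[i][j] else d
    return sdf

mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
sdf = signed_distance(mask)

# Thresholding the SDF at zero recovers the binary mask.
recovered = [[1 if v < 0 else 0 for v in row] for row in sdf]
```

Unlike a hard 0/1 mask, the SDF changes gradually across the boundary, which is where the smooth transitions between regions come from.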


Method comparison:
Datasets: TotalSegmentator (746 CT) and MM-WHS (20 CT + 20 MR), each split train:test = 8:2.
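
As a quick check of what an 8:2 split means for these dataset sizes, here is a small sketch. The helper split_cases, the seed, and the rounding rule are assumptions for illustration; the notes do not say how the actual split was drawn or whether MM-WHS was split per modality.

```python
import random

def split_cases(case_ids, train_fraction=0.8, seed=0):
    """Shuffle case IDs and split them into train and test lists."""
    ids = list(case_ids)
    random.Random(seed).shuffle(ids)
    n_train = round(train_fraction * len(ids))
    return ids[:n_train], ids[n_train:]

# TotalSegmentator: 746 CT volumes.
ts_train, ts_test = split_cases(range(746))

# MM-WHS: 20 CT volumes (the 20 MR volumes would be handled the same way).
mmwhs_ct_train, mmwhs_ct_test = split_cases(range(20))
```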




There are two lines of work that tackle unsupervised anomaly detection (UAD) problems.




My personal insight
What AnomalyDINO-DPMM does is similar to vector database indexing (i.e., some kind of compression or clustering aimed at accelerating vector comparison):

The DPMM (Dirichlet process mixture model) used here is a Gaussian mixture model without a fixed number of components: it determines the necessary number of components from the data, allowing a more flexible, data-driven approach.
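
One standard way to see why the number of components need not be fixed is the truncated stick-breaking construction of Dirichlet process weights, sketched below. Whether AnomalyDINO-DPMM uses exactly this construction is not stated in these notes; the function name, the concentration value alpha, and the truncation level are all illustrative assumptions.

```python
import random

def stick_breaking_weights(alpha, max_components, seed=0):
    """Sample mixture weights from a truncated stick-breaking prior.

    Each step breaks off a Beta(1, alpha) fraction of the remaining stick,
    so later components tend to receive ever smaller weights.
    """
    rng = random.Random(seed)
    weights, remaining = [], 1.0
    for _ in range(max_components - 1):
        beta = rng.betavariate(1.0, alpha)
        weights.append(remaining * beta)
        remaining *= 1.0 - beta
    weights.append(remaining)  # last component takes the rest, so weights sum to 1
    return weights

weights = stick_breaking_weights(alpha=1.0, max_components=20)
for k, w in enumerate(weights):
    print(f"component {k:2d}: weight {w:.6f}")
```

With a generous upper bound on the number of components, most weights end up close to zero, so the effective number of components is much smaller than the bound. In a fitted model, the data decide which components keep non-negligible weight.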



Likelihood:
Euclidean distance:
Cosine similarity:
To suppress all components with vanishing weights, we only consider these k:
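
The three scoring options and the weight-based suppression can be sketched together as below. The original formulas are not reproduced in these notes, so everything here is an assumption made for illustration: the negative log-likelihood, the minimum distance, and one minus the maximum similarity as anomaly scores, isotropic covariances, the threshold min_weight, and all function names. Weights are not renormalized after suppression.

```python
import math

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def log_gaussian(x, mean, var):
    """Log density of an isotropic Gaussian N(mean, var * I)."""
    sq = sum((a - b) ** 2 for a, b in zip(x, mean))
    return -0.5 * (len(x) * math.log(2 * math.pi * var) + sq / var)

def active_components(components, min_weight=1e-3):
    """Keep only components whose mixture weight is not vanishing."""
    return [c for c in components if c["weight"] > min_weight]

def anomaly_scores(x, components):
    """Higher score = more anomalous, for each of the three criteria."""
    logs = [math.log(c["weight"]) + log_gaussian(x, c["mean"], c["var"])
            for c in components]
    m = max(logs)  # log-sum-exp for numerical stability
    log_lik = m + math.log(sum(math.exp(v - m) for v in logs))
    return {
        "neg_log_likelihood": -log_lik,
        "euclidean": min(euclidean(x, c["mean"]) for c in components),
        "cosine": 1.0 - max(cosine_similarity(x, c["mean"]) for c in components),
    }

components = [
    {"weight": 0.6, "mean": [1.0, 0.0], "var": 0.1},
    {"weight": 0.4 - 1e-6, "mean": [0.0, 1.0], "var": 0.1},
    {"weight": 1e-6, "mean": [5.0, 5.0], "var": 0.1},  # vanishing weight
]
kept = active_components(components)
normal = anomaly_scores([0.9, 0.1], kept)
anomalous = anomaly_scores([5.0, 5.0], kept)
```

The query [5.0, 5.0] sits exactly on the mean of the suppressed component. Without suppression, the distance-based scores would treat it as perfectly normal, which is why components with vanishing weights are excluded. Comparing a query against a handful of component means instead of every stored feature is also where the analogy to vector database indexing comes from.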



THANKS