Model Details

Model Description

MorphEm is a self-supervised model trained with the DINO Bag of Channels (BoC) recipe on the entire CHAMMI-75 dataset. It serves as a performance benchmark for self-supervised models.

  • Developed by: Vidit Agrawal, John Peters, Juan Caicedo
  • Shared by: Caicedo Lab
  • Model type: Vision Transformer Small
  • License: MIT License

Model Sources

[More Information Needed]

Uses

Direct Use

[More Information Needed]

Out-of-Scope Use

[More Information Needed]

How to Get Started with the Model

Use the code below to get started with the model.

from transformers import AutoModel
import torch
import torch.nn as nn
from torchvision.transforms import v2
import numpy as np

# Noise Injector transformation
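# Replaces saturated pixels (value 255) in the first slice x[0] with
# uniform noise drawn from [low, high); all other pixels pass through
# unchanged. Note that only the first slice of the input is modified.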
class SaturationNoiseInjector(nn.Module):
    def __init__(self, low=200, high=255):
        super().__init__()
        self.low = low
        self.high = high

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        channel = x[0].clone()
        noise = torch.empty_like(channel).uniform_(self.low, self.high)
        mask = (channel == 255).float()
        noise_masked = noise * mask
        channel[channel == 255] = 0
        channel = channel + noise_masked
        x[0] = channel
        return x


# Self Normalize transformation
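# Standardizes each channel of each image independently to zero mean and
# unit variance over its spatial dimensions (per-image normalization).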
class PerImageNormalize(nn.Module):
    def __init__(self, eps=1e-7):
        super().__init__()
        self.eps = eps
        self.instance_norm = nn.InstanceNorm2d(
            num_features=1,
            affine=False,
            track_running_stats=False,
            eps=self.eps,
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if x.dim() == 3:
            x = x.unsqueeze(0)
        x = self.instance_norm(x)
        if x.shape[0] == 1:
            x = x.squeeze(0)
        return x


# Load the model (fall back to CPU if CUDA is unavailable)
device = "cuda" if torch.cuda.is_available() else "cpu"
model = AutoModel.from_pretrained("CaicedoLab/MorphEm", trust_remote_code=True)
model.to(device).eval()

# Define transforms
transform = v2.Compose([
    SaturationNoiseInjector(),
    PerImageNormalize(),
    v2.Resize(size=(224, 224), antialias=True),
])

# Generate random batch (N, C, H, W)
batch_size = 2
num_channels = 3
images = torch.randint(0, 256, (batch_size, num_channels, 512, 512), dtype=torch.float32)

print(f"Input shape: {images.shape} (N={batch_size}, C={num_channels}, H=512, W=512)")
print()

# Bag of Channels (BoC) - process each channel independently
with torch.no_grad():
    batch_feat = []
    images = images.to(device)
    
    for c in range(images.shape[1]):
        # Extract a single channel: (N, C, H, W) -> (N, H, W)
        single_channel = images[:, c]

        # Apply the transforms, then restore the channel dim: (N, 1, H, W)
        single_channel = transform(single_channel).unsqueeze(1)
        
        # Extract features: take the normalized CLS token from the backbone
        output = model.forward_features(single_channel)
        feat_temp = output["x_norm_clstoken"].detach().cpu().numpy()
        batch_feat.append(feat_temp)

# Concatenate features from all channels
features = np.concatenate(batch_feat, axis=1)

print(f"Output shape: {features.shape}")
print(f"  - Batch size (N): {features.shape[0]}")
print(f"  - Feature dimension (C * feature_dim): {features.shape[1]}")

Training Details

Training Data

MorphEm was pre-trained on the full CHAMMI-75 pre-training set, which consists of 75 heterogeneous studies and 2.8 million multi-channel images.

Training Procedure

We used the DINO self-supervised learning framework to pre-train a model that takes one channel as input at a time. For evaluation, the per-channel features are concatenated to form the final image representation, as sketched below.
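
The sketch below illustrates the Bag of Channels idea under stated assumptions: DummyEncoder is a hypothetical stand-in for the pre-trained ViT-S backbone, which maps a single-channel image to a 384-dimensional feature vector.

import torch
import torch.nn as nn

# Hypothetical stand-in for the pre-trained ViT-S backbone: maps a
# single-channel image (N, 1, H, W) to a 384-d feature vector (N, 384)
class DummyEncoder(nn.Module):
    def __init__(self, dim=384):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.proj = nn.Linear(1, dim)

    def forward(self, x):
        return self.proj(self.pool(x).flatten(1))

encoder = DummyEncoder()
images = torch.randn(2, 5, 224, 224)  # a batch of two 5-channel images

# Embed each channel independently, then concatenate per-channel features
feats = [encoder(images[:, c:c + 1]) for c in range(images.shape[1])]
image_feat = torch.cat(feats, dim=1)  # (N, C * 384) = (2, 1920)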

Preprocessing

We used three transforms for preprocessing, applied in order: SaturationNoiseInjector(), PerImageNormalize(), and Resize((224, 224)). The implementations of SaturationNoiseInjector and PerImageNormalize are listed in the getting-started example above; a short demonstration of the noise injector follows.

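As a quick check of the noise injector's behavior, the toy example below (assuming the SaturationNoiseInjector class from the getting-started section is in scope) replaces saturated pixels with uniform noise:

import torch

# A (1, 2, 2) image slice where two pixels are saturated (value 255)
x = torch.tensor([[[255.0, 10.0],
                   [255.0, 42.0]]])

inj = SaturationNoiseInjector(low=200, high=255)
y = inj(x.clone())
print(y)  # the 255 entries now hold random values in [200, 255); 10 and 42 are unchanged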

Evaluation

We evaluated this model on six different benchmarks, and it is highly competitive on most of them. The benchmarks are listed below:

  1. CHAMMI
  2. HPAv23
  3. Jump-CP
  4. IDR0017
  5. CELLPHIE
  6. RBC-MC

More details can be found in the paper.

Environmental Impact

  • Hardware Type: Nvidia RTX A6000
  • Hours used: 2352
  • Cloud Provider: Private Infrastructure
  • Compute Region: Private Infrastructure
  • Carbon Emitted: 304 kg CO2
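
The reported figure is consistent with the standard energy × grid-intensity estimate; the 300 W board power and 0.432 kg CO2eq/kWh carbon intensity below are illustrative assumptions, not reported values.

# Back-of-the-envelope reproduction of the carbon estimate
gpu_hours = 2352        # reported A6000 GPU hours
power_kw = 0.300        # assumed RTX A6000 board power (300 W)
intensity = 0.432       # assumed grid carbon intensity, kg CO2eq per kWh
energy_kwh = gpu_hours * power_kw   # 705.6 kWh
co2_kg = energy_kwh * intensity     # ~304.8 kg CO2eq
print(f"{co2_kg:.1f} kg CO2eq")     # close to the reported 304 kg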

Technical Specifications

The model is a ViT-Small (43.8M parameters, FP32 weights) trained for roughly 2,350 Nvidia A6000 GPU hours on a multi-node system with 2 nodes, each containing 7 GPUs.

Citation

This model can be cited as follows:

[More Information Needed]

Model Card Authors

Vidit Agrawal, John Peters, Juan C. Caicedo

Model Card Contact

[email protected], [email protected], [email protected]
