LymphAware: Domain-Aware Bias Disruption for Reliable Lymphoma Cancer AI Diagnosis
This work introduces LymphAware, a domain-aware bias disruption framework designed to improve the robustness and clinical reliability of lymphoma histopathology AI systems. Modern classifiers often rely on shortcut signals such as scanner-specific color distributions, staining variability, and slide-preparation artifacts, which artificially inflate in-domain accuracy but collapse under cross-acquisition domain shift. LymphAware explicitly disentangles disease-relevant morphological features from acquisition-sensitive nuisance factors through three complementary mechanisms: morphology-centric feature isolation, adversarial and orthogonality-based shortcut suppression, and cross-domain stability regularization. To further expose hidden shortcut dependencies, the framework incorporates artifact-shift perturbations that simulate realistic staining and scanner variability while enforcing counterfactual consistency during training. Extensive evaluation on a heterogeneous multi-source lymphoma benchmark demonstrates improved cross-domain generalization, stable behavior under hyperparameter variation, and attribution maps better aligned with pathology-relevant regions. While these explanations reflect associative alignment rather than formal causal inference, the findings underscore the necessity of representation-level shortcut disruption for building clinically trustworthy lymphoma diagnostic AI systems.
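The abstract does not specify how the orthogonality-based suppression is computed. As a minimal sketch under stated assumptions, one common form of such a penalty is the squared Frobenius norm of the batch-averaged cross-correlation between two feature sets. The function name `orthogonality_penalty` and the split into `morph` and `nuisance` feature batches are hypothetical and are not taken from the LymphAware paper.

```python
def orthogonality_penalty(morph, nuisance):
    """Squared Frobenius norm of the batch-averaged cross-correlation
    between two feature batches (lists of equal-length float vectors).

    Assumption: this is a generic orthogonality penalty, not the exact
    loss used by LymphAware. It is zero when every morphology dimension
    is uncorrelated (in the raw inner-product sense) with every
    nuisance dimension across the batch.
    """
    n = len(morph)
    if n == 0 or n != len(nuisance):
        raise ValueError("batches must be non-empty and of equal size")
    d_m, d_n = len(morph[0]), len(nuisance[0])
    total = 0.0
    for i in range(d_m):
        for j in range(d_n):
            # Average product of morphology dim i and nuisance dim j.
            c = sum(morph[b][i] * nuisance[b][j] for b in range(n)) / n
            total += c * c
    return total


# Disentangled case: the two feature sets vary along unrelated directions.
morph = [[1.0, 0.0], [-1.0, 0.0]]
nuisance = [[0.0, 1.0], [0.0, 1.0]]
penalty_disentangled = orthogonality_penalty(morph, nuisance)

# Entangled case: the nuisance branch copies the morphology features.
penalty_entangled = orthogonality_penalty(morph, morph)
```

In a training loop, a term like this would typically be added to the classification loss with a weighting coefficient, so that minimizing it pushes the two representations apart.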
SatDiff: A Stable Diffusion Framework for Inpainting Very High-Resolution Satellite Imagery
Satellite image inpainting is a critical task in remote sensing, requiring accurate restoration of missing or occluded regions for reliable image analysis. In this paper, we present SatDiff, an advanced inpainting framework based on diffusion models, specifically designed to tackle the challenges posed by very high-resolution (VHR) satellite datasets such as DeepGlobe and the Massachusetts Roads Dataset. Building on insights from our previous work, SatInPaint, we enhance the approach to achieve even higher recall and overall performance. SatDiff introduces a novel Latent Space Conditioning technique that leverages a compact latent space for efficient and precise inpainting. Additionally, we integrate Explicit Propagation into the diffusion process, enabling forward-backward fusion for improved stability and accuracy. Inspired by encoder-decoder architectures like the Segment Anything Model (SAM), SatDiff is seamlessly adaptable to diverse satellite imagery scenarios. By balancing the efficiency of preconditioned models with the flexibility of postconditioned approaches, SatDiff establishes a new benchmark on VHR satellite datasets, offering a scalable and high-performance solution for satellite image restoration. The code for SatDiff is publicly available at https://github.com/kaopanboonyuen/SatDiff.
DOTA: Deformable Optimized Transformer Architecture for End-to-End Text Recognition with Retrieval-Augmented Generation
In this paper, we present a novel end-to-end framework that integrates ResNet and Vision Transformer (ViT) backbones with cutting-edge techniques such as Deformable Convolutions, Retrieval-Augmented Generation, and Conditional Random Fields (CRF). These innovations work together to significantly improve feature representation and Optical Character Recognition (OCR) performance. By replacing the standard convolution layers in the third and fourth blocks with Deformable Convolutions, the framework adapts more flexibly to complex text layouts, while adaptive dropout helps prevent overfitting and enhance generalization. Moreover, incorporating CRFs refines the sequence modeling for more accurate text recognition. Extensive experiments on six benchmark datasets—IC13, IC15, SVT, IIIT5K, SVTP, and CUTE80—demonstrate the framework’s exceptional performance. Our method represents a significant leap forward in OCR technology, addressing challenges in recognizing text with various distortions, fonts, and orientations. The framework has proven not only effective in controlled conditions but also adaptable to more complex, real-world scenarios. The code for this framework is available at https://github.com/kaopanboonyuen/DOTA.
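To illustrate how a CRF layer can refine sequence predictions, the sketch below shows Viterbi decoding for a generic linear-chain CRF. It is a minimal example under stated assumptions and is not the DOTA implementation: the function name `viterbi_decode`, the score layout, and the toy numbers are all hypothetical.

```python
def viterbi_decode(emissions, transitions):
    """Return the highest-scoring label path for a linear-chain CRF.

    emissions[t][k]   : score of label k at time step t (T x K list)
    transitions[i][j] : score of moving from label i to label j (K x K)

    Assumption: a plain linear-chain CRF without start/stop scores.
    """
    if not emissions:
        return []
    num_labels = len(emissions[0])
    score = list(emissions[0])   # best score of any path ending in each label
    backptr = []                 # backptr[t-1][j]: best previous label for j
    for t in range(1, len(emissions)):
        new_score, ptrs = [], []
        for j in range(num_labels):
            best_i = max(range(num_labels),
                         key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j]
                             + emissions[t][j])
            ptrs.append(best_i)
        score = new_score
        backptr.append(ptrs)
    # Trace back from the best final label.
    last = max(range(num_labels), key=lambda k: score[k])
    path = [last]
    for ptrs in reversed(backptr):
        last = ptrs[last]
        path.append(last)
    path.reverse()
    return path


# Toy example: two labels, three time steps.
emissions = [[2.0, 1.0], [1.0, 1.2], [2.0, 1.0]]
# Transitions that reward staying on the same label.
sticky = [[0.5, -0.5], [-0.5, 0.5]]
# Transitions that express no preference, so decoding is per-step argmax.
neutral = [[0.0, 0.0], [0.0, 0.0]]

path_with_crf = viterbi_decode(emissions, sticky)
path_without_crf = viterbi_decode(emissions, neutral)
```

With the neutral transitions the middle step follows its slightly higher emission score for label 1, while the sticky transitions override that weak preference and keep the path on label 0. This is the kind of sequence-level correction a CRF contributes on top of per-character scores.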