MeViT is a Vision Transformer (ViT) model tailored for semantic segmentation of medium-resolution Landsat satellite imagery covering Thai agricultural regions. It labels crops such as para rubber, corn, and pineapple, using a revised MixCFN block that balances its depth-wise convolution paths for multi-scale feature extraction.
At AGL (Advancing Geoscience Laboratory), Chulalongkorn University, we develop state-of-the-art AI models for satellite imagery and remote sensing applications. Our research spans Vision Transformers, Stable Diffusion, and weakly supervised learning, applied to semantic segmentation, inpainting, and temporal forecasting.
- Precision: 92.22%
- Recall: 94.69%
- F1 Score: 93.44%
- Mean IoU: 83.63%
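
For readers unfamiliar with how the metrics listed above are usually derived, the following is a minimal sketch that computes per-class precision, recall, F1, and IoU from a pixel-level confusion matrix, then macro-averages them. The confusion matrix values, the function names `segmentation_metrics` and `f1_score`, and the choice of macro averaging are all illustrative assumptions; this is not the MeViT evaluation code, and the averaging scheme behind the reported numbers is not stated here.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def segmentation_metrics(confusion):
    """Per-class and macro-averaged metrics from a confusion matrix.

    confusion[i][j] is the number of pixels whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    per_class = []
    for c in range(n):
        tp = confusion[c][c]
        fp = sum(confusion[r][c] for r in range(n)) - tp  # predicted c, truly other
        fn = sum(confusion[c]) - tp                       # truly c, predicted other
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
        per_class.append({
            "precision": precision,
            "recall": recall,
            "f1": f1_score(precision, recall),
            "iou": iou,
        })
    # Macro average: unweighted mean over classes (an assumption, see lead-in).
    macro = {
        key: sum(m[key] for m in per_class) / n
        for key in ("precision", "recall", "f1", "iou")
    }
    return per_class, macro


# Hypothetical 3-class confusion matrix (toy values, not MeViT results).
toy_confusion = [
    [50, 2, 3],
    [4, 40, 1],
    [1, 2, 47],
]
per_class, macro = segmentation_metrics(toy_confusion)
print({name: round(value, 4) for name, value in macro.items()})
```

As a quick sanity check, `f1_score(0.9222, 0.9469)` evaluates to roughly 0.9344, which matches the F1 score listed above when it is taken as the harmonic mean of the listed precision and recall. Mean IoU cannot be recovered from those two numbers alone, since it depends on the per-class counts.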