Remote Sensing

KAO: Kernel-Adaptive Optimization in Diffusion for Satellite Image

Satellite image inpainting is a crucial task in remote sensing, where accurately restoring missing or occluded regions is essential for robust image analysis. In this paper, we propose KAO, a novel framework that utilizes Kernel-Adaptive Optimization within diffusion models for satellite image inpainting. KAO is specifically designed to address the challenges posed by very high-resolution (VHR) satellite datasets, such as DeepGlobe and the Massachusetts Roads Dataset. Unlike existing methods that rely on preconditioned models requiring extensive retraining or postconditioned models with significant computational overhead, KAO introduces a Latent Space Conditioning approach, optimizing a compact latent space to achieve efficient and accurate inpainting. Furthermore, we incorporate Explicit Propagation into the diffusion process, facilitating forward-backward fusion, which improves the stability and precision of the method. Experimental results demonstrate that KAO sets a new benchmark for VHR satellite image restoration, providing a scalable, high-performance solution that balances the efficiency of preconditioned models with the flexibility of postconditioned models.

Teerapong Panboonyuen

2025 In IEEE Transactions on Geoscience and Remote Sensing (TGRS, Impact Factor 8.6)

KAO: Kernel-Adaptive Optimization in Diffusion for Satellite Image

GuidedBox: A segmentation-guided box teacher-student approach for weakly supervised road segmentation

Road segmentation in remote sensing is crucial for applications like urban planning, traffic monitoring, and autonomous driving. Labeling objects via pixel-wise segmentation is challenging compared to bounding boxes. Existing weakly supervised segmentation methods often rely on heuristic bounding box priors, but we propose that box-supervised techniques can yield better results. Introducing GuidedBox, an end-to-end framework for weakly supervised instance segmentation. GuidedBox uses a teacher model to generate high-quality pseudo-masks and employs a confidence scoring mechanism to filter out noisy masks. We also introduce a noise-aware pixel loss and affinity loss to optimize the student model with pseudo-masks. Our extensive experiments show that GuidedBox outperforms state-of-the-art methods like SOLOv2, CondInst, and Mask R-CNN on the Massachusetts Roads Dataset, achieving an AP50 score of 0.9231. It also shows strong performance on SpaceNet and DeepGlobe datasets, proving its versatility in remote sensing applications. Code has been made available at https://github.com/kaopanboonyuen/GuidedBox.

Teerapong Panboonyuen, C. Charoenphon, C. Satirapod

2025 In European Journal of Remote Sensing (Taylor & Francis)

GuidedBox: A segmentation-guided box teacher-student approach for weakly supervised road segmentation

Investigating the use of deep learning-derived weighted mean temperature for GPS-PWVs estimation

GNSS data offers a reliable alternative for estimating Precipitable Water Vapor (PWV), but accurate GPS-PWV determination in tropical climates requires weighted mean temperature (Tm). With traditional measurement methods often unavailable in Thailand, and existing empirical models showing low accuracy, we propose a deep learning approach. Our Bidirectional Learning with Attention (BLA) model incorporates GRUs and an attention mechanism for Tm modeling. Trained on ERA5 data (2017-2021) and evaluated on 2022 data, BLA-Tm achieved 76% improvement over conventional models, reducing biases significantly. Validation with 280 GNSS stations confirmed BLA-Tm’s superior accuracy in GPS-PWV estimation.

C. Charoenphon, Teerapong Panboonyuen, B. Zhang, C. Satirapod

2025 In Journal of Spatial Science (Taylor & Francis)

Investigating the use of deep learning-derived weighted mean temperature for GPS-PWVs estimation

MeViT: A Medium-Resolution Vision Transformer for Semantic Segmentation on Landsat Satellite Imagery for Agriculture in Thailand

In this paper, we present MeViT (Medium-Resolution Vision Transformer), designed for semantic segmentation of Landsat satellite imagery, focusing on key economic crops in Thailand para rubber, corn, and pineapple. MeViT enhances Vision Transformers (ViTs) by integrating medium-resolution multi-branch architectures and revising mixed-scale convolutional feedforward networks (MixCFN) to extract multi-scale local information. Extensive experiments on a public Thailand dataset demonstrate that MeViT outperforms state-of-the-art deep learning methods, achieving a precision of 92.22%, recall of 94.69%, F1 score of 93.44%, and mean IoU of 83.63%. These results highlight MeViT’s effectiveness in accurately segmenting Thai Landsat-8 data.

Teerapong Panboonyuen, C. Charoenphon, C. Satirapod

2023 In Remote Sensing (Supported by Ratchadapisek Somphot Postdoctoral Fund, 2022–2024)

MeViT: A Medium-Resolution Vision Transformer for Semantic Segmentation on Landsat Satellite Imagery for Agriculture in Thailand

A Performance Comparison between GIS-based and Neuron Network Methods for Flood Susceptibility Assessment in Ayutthaya Province

Flooding poses a significant challenge in Thailand due to its complex geography, traditionally addressed through GIS methods like the Flood Risk Assessment Model (FRAM) combined with the Analytical Hierarchy Process (AHP). This study assesses the efficacy of Artificial Neural Networks (ANN) in flood susceptibility mapping, using data from Ayutthaya Province and incorporating 5-fold cross-validation and Stochastic Gradient Descent (SGD) for training. ANN achieved superior performance with precision of 79.90%, recall of 79.04%, F1-score of 79.08%, and accuracy of 79.31%, outperforming the traditional FRAM approach. Notably, ANN identified that only three factors—flow accumulation, elevation, and soil types—were crucial for predicting flood-prone areas. This highlights the potential for ANN to simplify and enhance flood risk assessments. Moreover, the integration of advanced machine learning techniques underscores the evolving capability of AI in addressing complex environmental challenges.

T. Vajeethaveesin, Teerapong Panboonyuen

2022 In Trends in Sciences (Trends Sci. or TiS)

A Performance Comparison between GIS-based and Neuron Network Methods for Flood Susceptibility Assessment in Ayutthaya Province

Enhanced Feature Pyramid Vision Transformer for Semantic Segmentation on Thailand Landsat-8 Corpus

Semantic segmentation on Landsat-8 data is crucial in the integration of diverse data, allowing researchers to achieve more productivity and lower expenses. This research aimed to improve the versatile backbone for dense prediction without convolutions—namely, using the pyramid vision transformer (PRM-VS-TM) to incorporate attention mechanisms across various feature maps. Furthermore, the PRM-VS-TM constructs an end-to-end object detection system without convolutions and uses handcrafted components, such as dense anchors and non-maximum suspension (NMS). The present study was conducted on a private dataset, i.e., the Thailand Landsat-8 challenge. There are three baselines, DeepLab, Swin Transformer (Swin TF), and PRM-VS-TM. Results indicate that the proposed model significantly outperforms all current baselines on the Thailand Landsat-8 corpus, providing F1-scores greater than 80% in almost all categories. Finally, we demonstrate that our model, without utilizing pre-trained settings or any further post-processing, can outperform current state-of-the-art (SOTA) methods for both agriculture and forest classes.

Teerapong Panboonyuen, P. Rakwatin, K. Intarat

2020 In Information

Enhanced Feature Pyramid Vision Transformer for Semantic Segmentation on Thailand Landsat-8 Corpus

Semantic Segmentation on Remotely Sensed Images Using Deep Convolutional Encoder-Decoder Neural Network

My PhD thesis focuses on improving semantic segmentation of aerial and satellite images, a crucial task for applications like agriculture planning, map updates, route optimization, and navigation. Current models like the Deep Convolutional Encoder-Decoder (DCED) have limitations in accuracy due to their inability to recover low-level features and the scarcity of training data. To address these issues, I propose a new architecture with five key enhancements, a Global Convolutional Network (GCN) for improved feature extraction, channel attention for selecting discriminative features, domain-specific transfer learning to address data scarcity, Feature Fusion (FF) for capturing low-level details, and Depthwise Atrous Convolution (DA) for refining features. Experiments on Landsat-8 datasets and the ISPRS Vaihingen benchmark showed that my proposed architecture significantly outperforms the baseline models in remote sensing imagery.

Teerapong Panboonyuen

2020 In Chulalongkorn University Thesis Evaluation - Very Good Score (Outstanding Achievement)

Semantic Segmentation on Remotely Sensed Images Using Deep Convolutional Encoder-Decoder Neural Network

Semantic Labeling in Remote Sensing Corpora Using Feature Fusion-Based Enhanced Global Convolutional Network with High-Resolution Representations and Depthwise Atrous Convolution

This paper addresses improving semantic segmentation in remote sensing for aerial and satellite images, which is crucial for agriculture, map updates, route optimization, and navigation. We propose enhancements to the state-of-the-art Enhanced Global Convolutional Network (GCN152-TL-A) by introducing a High-Resolution Representation (HR) backbone for better feature extraction, Feature Fusion (FF) to capture low-level details, and Depthwise Atrous Convolution (DA) for refined multi-resolution features. Experiments on Landsat-8 and ISPRS Vaihingen datasets demonstrate our model’s superior performance, achieving over 90% accuracy in F1 scores and outperforming baseline models.

Teerapong Panboonyuen, P. Vateekul, S. Lawawirojwong

2019 In Remote Sensing

Semantic Labeling in Remote Sensing Corpora Using Feature Fusion-Based Enhanced Global Convolutional Network with High-Resolution Representations and Depthwise Atrous Convolution

Semantic Segmentation on Remotely Sensed Images Using an Enhanced Global Convolutional Network with Channel Attention and Domain Specific Transfer Learning

In the remote sensing domain, it is crucial to complete semantic segmentation on the raster images, e.g., river, building, forest, etc., on raster images. A deep convolutional encoder–decoder (DCED) network is the state-of-the-art semantic segmentation method for remotely sensed images. However, the accuracy is still limited, since the network is not designed for remotely sensed images and the training data in this domain is deficient. In this paper, we aim to propose a novel CNN for semantic segmentation particularly for remote sensing corpora with three main contributions. First, we propose applying a recent CNN called a global convolutional network (GCN), since it can capture different resolutions by extracting multi-scale features from different stages of the network. Additionally, we further enhance the network by improving its backbone using larger numbers of layers, which is suitable for medium resolution remotely sensed images. Second, “channel attention” is presented in our network in order to select the most discriminative filters (features). Third, “domain-specific transfer learning” is introduced to alleviate the scarcity issue by utilizing other remotely sensed corpora with different resolutions as pre-trained data. The experiment was then conducted on two given datasets (i) medium resolution data collected from Landsat-8 satellite and (ii) very high resolution data called the ISPRS Vaihingen Challenge Dataset. The results show that our networks outperformed DCED in terms of 𝐹1 for 17.48% and 2.49% on medium and very high resolution corpora, respectively.

Teerapong Panboonyuen, P. Vateekul, S. Lawawirojwong

2019 In Remote Sesning

Semantic Segmentation on Remotely Sensed Images Using an Enhanced Global Convolutional Network with Channel Attention and Domain Specific Transfer Learning

Semantic Segmentation On Medium-Resolution Satellite Images Using Deep Convolutional Networks With Remote Sensing Derived Indices

Semantic Segmentation is a fundamental task in computer vision and remote sensing imagery. Many applications, such as urban planning, change detection, and environmental monitoring, require the accurate segmentation; hence, most segmentation tasks are performed by humans. Currently, with the growth of Deep Convolutional Neural Network (DCNN), there are many works aiming to find the best network architecture fitting for this task. However, all of the studies are based on very-high resolution satellite images, and surprisingly; none of them are implemented on medium resolution satellite images. Moreover, no research has applied geoinformatics knowledge. Therefore, we purpose to compare the semantic segmentation models, which are FCN, SegNet, and GSN using medium resolution images from Landsat-8 satellite. In addition, we propose a modified SegNet model that can be used with remote sensing derived indices. The results show that the model that achieves the highest accuracy RGB bands of medium resolution aerial imagery is SegNet. The overall accuracy of the model increases when includes Near Infrared (NIR) and Short-Wave Infrared (SWIR) band. The results showed that our proposed method (our modified SegNet model, named RGB-IR-IDX-MSN method) outperforms all of the baselines in terms of mean F1 scores.

S. Chantharaj, K. Pornratthanapong, P. Chitsinpchayakun, Teerapong Panboonyuen

2018 In *15th International Joint Conference on Computer Science and Software Engineering * JCSSE 2018

Semantic Segmentation On Medium-Resolution Satellite Images Using Deep Convolutional Networks With Remote Sensing Derived Indices