Road segmentation in remote sensing data underpins critical applications such as urban planning, traffic monitoring, and autonomous driving systems. Labeling objects with pixel-wise segmentation masks is labor-intensive, especially compared with drawing bounding boxes. Many existing weakly supervised instance segmentation methods rely on heuristic losses derived from bounding box priors. We hypothesize, however, that box-supervised techniques can already yield some high-quality segmentation masks, which prompts us to investigate whether detectors can learn from these masks while disregarding those of low quality. To address this question, we introduce GuidedBox, an end-to-end training framework for robust, weakly supervised instance segmentation. GuidedBox employs a teacher model to generate precise masks as pseudo-labels. Because noisy masks can harm training, we propose a mask-aware confidence scoring mechanism to assess the quality of each pseudo-mask. In addition, we introduce a noise-aware pixel loss and a noise-reduced affinity loss to optimize the student model dynamically with the pseudo-masks. Extensive experiments demonstrate the efficacy of GuidedBox across multiple datasets. Notably, GuidedBox outperforms existing state-of-the-art methods such as SOLOv2, CondInst, and Mask R-CNN on the challenging Massachusetts Roads Dataset, achieving an AP50 score of 0.9231. Furthermore, GuidedBox shows competitive performance on the SpaceNet and DeepGlobe datasets, highlighting its robustness and adaptability across different remote sensing scenarios. Code is available at https://github.com/kaopanboonyuen/GuidedBox.
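
As a rough illustration of the idea of scoring a teacher's pseudo-mask and letting that score scale the student's pixel loss, the following minimal Python sketch may help. It is not the GuidedBox implementation: the function names (`mask_confidence`, `binarize`, `weighted_pixel_loss`), the choice of mean foreground probability as the confidence score, and the 0.5 threshold are all assumptions made for this example, and the affinity loss is not covered.

```python
import math

def mask_confidence(teacher_probs, threshold=0.5):
    """Hypothetical confidence score: mean teacher probability over the
    pixels the teacher predicts as foreground (0.0 if there are none)."""
    fg = [p for row in teacher_probs for p in row if p >= threshold]
    return sum(fg) / len(fg) if fg else 0.0

def binarize(teacher_probs, threshold=0.5):
    """Turn teacher probabilities into a 0/1 pseudo-mask."""
    return [[1 if p >= threshold else 0 for p in row] for row in teacher_probs]

def weighted_pixel_loss(student_probs, pseudo_mask, confidence, eps=1e-7):
    """Mean per-pixel binary cross-entropy against the pseudo-mask,
    scaled by the mask's confidence so noisy masks contribute less."""
    total, count = 0.0, 0
    for s_row, m_row in zip(student_probs, pseudo_mask):
        for s, m in zip(s_row, m_row):
            s = min(max(s, eps), 1.0 - eps)  # clamp to avoid log(0)
            total += -(m * math.log(s) + (1 - m) * math.log(1.0 - s))
            count += 1
    return confidence * total / count

# Toy 2x2 example (made-up numbers, for illustration only).
teacher = [[0.9, 0.1], [0.8, 0.2]]
student = [[0.5, 0.5], [0.5, 0.5]]
conf = mask_confidence(teacher)
loss = weighted_pixel_loss(student, binarize(teacher), conf)
print(conf, loss)
```

In this sketch a pseudo-mask with no confident foreground receives a score of zero and therefore contributes nothing to the loss, which is one simple way to disregard low-quality masks during training.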