One of the fundamental tasks in remote sensing is the semantic segmentation of aerial and satellite images. It plays a vital role in applications such as agricultural planning, map updating, route optimization, and navigation. The state-of-the-art model is the Deep Convolutional Encoder-Decoder (DCED). However, its accuracy is still limited, because the architecture is not designed to recover low-level features (e.g., rivers and low vegetation) in remotely sensed images and because training data in this domain are scarce. In this dissertation, we propose a semantic segmentation architecture designed explicitly for remote sensing, with contributions in five aspects. First, we apply a modern Convolutional Neural Network (CNN) called the Global Convolutional Network (GCN). Second, “channel attention” is used to select the most discriminative filters (features). Third, “domain-specific transfer learning” is introduced to alleviate the scarcity of training data. Fourth, “Feature Fusion (FF)” is added to the network to capture low-level features. Finally, “Depthwise Atrous Convolution (DA)” is introduced to refine the extracted features. Experiments were conducted on three data sets: two private corpora from the Landsat-8 satellite and one public benchmark from the “ISPRS Vaihingen” challenge. The results show that the proposed architectures outperform the baseline model on all of the remote sensing imagery evaluated.
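To make the idea of “channel attention” concrete, the following is a minimal sketch, not the dissertation's implementation. It assumes a squeeze-and-excitation-style gate: each channel is averaged globally, passed through a small bottleneck, and rescaled by a sigmoid weight. The function name `channel_attention`, the weight matrices `w1` and `w2`, and the toy input are hypothetical and chosen only for illustration.

```python
import math


def channel_attention(features, w1, w2):
    """Rescale each channel by a learned gate in (0, 1).

    features: list of C channels, each an H x W list of lists.
    w1: bottleneck weights, shape (C_hidden, C)  (hypothetical).
    w2: expansion weights, shape (C, C_hidden)   (hypothetical).
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in features
    ]
    # Excite, step 1: bottleneck layer followed by ReLU.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(row, squeezed)))
        for row in w1
    ]
    # Excite, step 2: expand back to C values and apply a sigmoid gate.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Rescale: multiply every pixel of a channel by that channel's gate.
    attended = [
        [[g * v for v in row] for row in ch]
        for g, ch in zip(gates, features)
    ]
    return attended, gates


# Toy input: two 2x2 channels and a one-unit bottleneck (assumed values).
features = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[0.0, 0.0], [0.0, 0.0]],
]
w1 = [[1.0, 1.0]]
w2 = [[2.0], [-2.0]]
attended, gates = channel_attention(features, w1, w2)
```

In this toy setting the first channel receives the larger gate, so its features are preserved more strongly than those of the second channel. In a trained network the weights would be learned rather than fixed by hand.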