I'm a remote sensing master's student at Hohai University, working on deep learning-based multi-source change detection. I'm feeling a bit lost now and uncertain about my future employment prospects.

Should I switch to web GIS development or keep working on change detection algorithms? I'd appreciate any advice from those with experience.

Improving employability

Worry less and act more. I think multi-source change detection is a good direction after graduation. Use the time to build a strong model, which is enough for graduation; this field also yields results relatively easily, so producing a solid result could help you compete for scholarships. As for job hunting: ask senior alumni for advice, study on your own every day, and accumulate internship experience, and you should have no major trouble finding a good job.

Employment Prospects for Remote Sensing Intelligent Interpretation

Let’s start with a conclusion that may discourage you.

If you’re just choosing a field of study for graduation, it doesn’t really matter which direction you choose, because getting a master’s degree is not difficult.

But if you're considering your future job prospects, then you definitely should not choose remote sensing intelligent interpretation. You probably don't know how profitable companies in this field are right now. If you had asked me in 2019, I would have recommended remote sensing interpretation, since national policy was favorable then. But in recent years, local governments have already completed many interpretation projects, and many remote sensing companies are actually losing money in this area, with no intention of investing further. Government projects keep shrinking in scale. A few years ago there were many projects, each worth millions. Alibaba's DAMO Academy used to compete with us for projects, but it has since withdrawn. SenseTime is still holding on, and its business finally started picking up in 2023, but customers are no longer interested, because ultimately the government does not place much emphasis on this field. So, by the time you graduate… For these two years the government has been focusing on 3D; if you're interested, you might as well consider 3D reconstruction.

Therefore, overall, choosing development as your career path will provide stable job opportunities. However, if you have a strong interest in remote sensing interpretation, you can still pursue it, but be aware that it is a challenging field.

A New Approach to Change Detection Based on Semantic Guidance and Spatial Localization

Nowadays, development has become a difficult path with the emergence of large models. Given the rapid progress and potential profitability of large models, algorithm work seems to have better prospects. Compared to the oversaturated field of general computer vision, the intersection of remote sensing and deep learning looks more promising. On the one hand, top remote sensing journals such as TGRS are recognized by employers, and the general computer vision community also pays attention to cross-disciplinary papers on remote sensing and deep learning. Several big labs in Shanghai, such as Shanghai AI Lab and Alibaba DAMO Academy, have research directions in remote sensing, so publishing papers in this field should not be a problem. On the other hand, although change detection in remote sensing has been extensively studied in general scenes, it is still relatively unexplored in special scenes; publishing several papers on special-scene change detection can secure smooth job prospects.

Therefore, the key now is to publish more papers in the field of change detection. Currently, there is limited research on change detection in special scenes, making it a blue ocean for publishing papers. To work in the field of change detection, you can use the architecture proposed below as a foundation. This architecture is specifically designed for change detection in special scenes and addresses the problems encountered in these scenes. Based on this architecture, you can quickly carry out work in change detection in special scenes.


Open a change detection paper and it often starts the same way: some team builds some model with certain modules and tricks and achieves higher performance on some dataset. Many people working on change detection will relate to this, because the task looks much like semantic segmentation: apart from chasing better numbers, there seems to be little else to do.

However, with the emergence of large models, squeezing out better performance by tweaking models has become even harder. With sufficient data, large models always perform better, and they have become the default route to stronger results. But large models also bring high code complexity and resource consumption, and small teams can hardly compete with large ones in adopting them. Blindly opting for large models may not be the best choice.

Therefore, our work should not be limited to simply improving results. But in what direction should we expand? A good choice is the scene. In the field of remote sensing, there are naturally many complex scenes. Currently, large models are mainly used for change detection in buildings. It is difficult to obtain sufficient data for other scenes. Therefore, choosing one or several specific change detection scenes can greatly benefit our work. In specific scenes, we can design special modules based on the characteristics of the scenes to improve the performance of the model. In contrast to other works that only introduce common techniques such as multi-scale and attention mechanisms, our contribution is fundamentally solving scientific problems in specific scenes. Our work has practical significance, making it easier to publish papers.

So, what change detection scenes should we focus on? I recommend intra-class change detection scenes and multi-view change detection scenes, as shown in the figure below. In the intra-class change detection scene, multiple types of objects have changed between different time points of remote sensing images, but the change label only indicates where the changes occur without specifying the types of changes. For example, the changed objects in the image include roads, buildings, and bare land, but the change labels do not specify the types of changes. In the multi-view change detection scene, the imaging angles of the satellite change between different times, resulting in different perspectives of the same building. For example, the first image captures the side view of a building, while the second image only captures the top view of the same building.

Special Change Detection Scenes

These two scenes are emerging topics in change detection research, and their complexity often leads to unsatisfactory model results. Constructing models and improving their performance in these scenes is therefore very promising. Moreover, publicly available datasets exist for both (SYSU and NJDS), making it easy to build models and obtain results.

The article being introduced proposes a new basic architecture based on the characteristics of these two scenes, addressing the shortcomings of previous change detection architectures in these scenes. With this basic architecture as a foundation, there is a lot of work that can be done. Whether it is combining various existing modules or designing custom modules and combining them, they are all worth exploring. The code for the paper has also been open-sourced and is ready for use.


The core contribution of this research is to propose a novel basic framework for change detection that solves the structural problems of previous frameworks. It achieves the best results in multiple change detection scenes and can serve as the foundation for future change detection research.

The paper was published by the High-Resolution Remote Sensing Laboratory at Nanjing University in the top remote sensing journal “IEEE Transactions on Geoscience and Remote Sensing” on October 26, 2023. The first author is Sijie Zhao.




In recent years, a large number of deep learning methods have been applied to change detection tasks in remote sensing images. These methods are mainly based on two basic architectures: multiple encoders and single decoder (MESD) and dual encoder-decoder (DED). They achieve good results in a single scene, but their generality in multiple scenes is still limited.

In terms of network structure, as shown in Figure 1a, MESD consists of multiple encoders with shared weights and a single decoder. Temporal features are extracted from the two temporal remote sensing images through the multiple encoders, and then fused in the single decoder to detect changed objects.

In terms of core ideas, as shown in Figure 2a, MESD uses the fused semantic features of the two temporal remote sensing images for change detection. The fusion in MESD happens at the feature level, which means the feature of a changed object in one temporal image can suffer interference from the background feature at the same spatial position in the other temporal image, making it difficult for the model to accurately identify the changed object.
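As a concrete (and deliberately tiny) illustration, the MESD pattern can be sketched in PyTorch as follows. The layer sizes and the concatenation-based fusion are placeholder assumptions for illustration, not the layers of any specific published model.

```python
import torch
import torch.nn as nn

class MESD(nn.Module):
    """Minimal sketch of a multi-encoder single-decoder (MESD) change
    detector: a weight-shared encoder extracts features from each temporal
    image, the features are fused at the feature level (here, by channel
    concatenation), and a single decoder predicts the change map."""

    def __init__(self, in_ch=3, feat_ch=16):
        super().__init__()
        # weight-shared encoder, applied to both temporal images
        self.encoder = nn.Sequential(
            nn.Conv2d(in_ch, feat_ch, 3, padding=1), nn.ReLU())
        # single decoder over the fused (concatenated) temporal features
        self.decoder = nn.Sequential(
            nn.Conv2d(feat_ch * 2, 1, 3, padding=1), nn.Sigmoid())

    def forward(self, t1, t2):
        f1, f2 = self.encoder(t1), self.encoder(t2)
        # feature-level fusion: a changed object's feature in one image
        # sits next to the other image's background feature at the same
        # pixel, which is exactly the interference described above
        fused = torch.cat([f1, f2], dim=1)
        return self.decoder(fused)

t1 = torch.randn(1, 3, 64, 64)  # temporal image 1
t2 = torch.randn(1, 3, 64, 64)  # temporal image 2
change_map = MESD()(t1, t2)
print(change_map.shape)  # torch.Size([1, 1, 64, 64])
```

Real MESD-style networks use deep backbones and richer fusion (difference, attention, etc.); the point here is only where the fusion happens.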

In terms of network structure, as shown in Figure 1b, DED consists of two encoder-decoders with shared weights and a single decoder. It segments target-object features from the remote sensing images through the dual encoder-decoder and performs decision-level fusion in the single decoder, solving the temporal-feature interference problem of MESD.

In terms of core ideas, as shown in Figure 2b, DED first segments the target object features from the two temporal remote sensing images and then compares the features to obtain the change detection result. It solves the problem in MESD by performing decision-level fusion on the temporal features. However, in the intra-class change detection scene where only change labels but not change type labels are available, DED cannot accurately segment the target object features from the two temporal remote sensing images. In the multi-view change detection scene, DED cannot distinguish between changed and unchanged objects due to pseudo-changes caused by different viewing angles.
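For contrast with MESD, here is an equally minimal PyTorch sketch of the DED pattern; again the layer choices are illustrative assumptions, not the paper's actual network.

```python
import torch
import torch.nn as nn

class DED(nn.Module):
    """Minimal sketch of a dual encoder-decoder (DED) change detector:
    a weight-shared encoder-decoder first segments target-object features
    from each temporal image separately, then a single decoder fuses the
    two segmentation features at the decision level."""

    def __init__(self, in_ch=3, feat_ch=16):
        super().__init__()
        # weight-shared encoder-decoder, applied to each image on its own
        self.encoder = nn.Sequential(
            nn.Conv2d(in_ch, feat_ch, 3, padding=1), nn.ReLU())
        self.decoder = nn.Conv2d(feat_ch, feat_ch, 3, padding=1)
        # single decoder: decision-level fusion of per-image segmentations
        self.fuse = nn.Sequential(
            nn.Conv2d(feat_ch * 2, 1, 3, padding=1), nn.Sigmoid())

    def forward(self, t1, t2):
        # each image is segmented independently, so no cross-temporal
        # interference occurs before the decision-level fusion
        s1 = self.decoder(self.encoder(t1))
        s2 = self.decoder(self.encoder(t2))
        return self.fuse(torch.cat([s1, s2], dim=1))

out = DED()(torch.randn(1, 3, 32, 32), torch.randn(1, 3, 32, 32))
```

The structural difference from MESD is that each branch finishes its own segmentation before fusion, which is why DED needs per-class labels to work well in the intra-class scene.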

Figure 1. Network structure diagrams of different basic architectures

Figure 2. Core idea diagrams of different basic architectures

To address the aforementioned problems, we propose a new approach to change detection based on semantic guidance and spatial localization, which solves the shortcomings of previous basic architectures and achieves superior performance in multiple change detection scenes (intra-class change detection, single-view change detection, multi-view change detection). The proposed approach provides a general deep learning architecture for change detection tasks.

As illustrated by the simple structure in Figure 1c and the detailed structure in Figure 3, we construct an exchanging dual encoder-decoder (EDED) basic architecture.

In terms of network structure, the simple form (Figure 1c) resembles DED: EDED consists of two encoder-decoders with shared weights and a single decoder. The difference lies in the detailed structure, shown in Figure 3. After the dual encoder extracts temporal features from the two images, the model performs channel-wise swapping of the encoder features, so that each temporal feature contains rich semantic information from both images. The encoder can therefore use the shared semantics of both temporal images to roughly determine the changed areas, which guides the subsequent precise localization of changed objects. Next, the decoder features are fused with the encoder features through skip connections, and the rich spatial detail in each temporal branch accurately locates the changed objects in the corresponding temporal image. Finally, since each branch's decoder features locate the changed objects in its own image, the model fuses the decoder features of the two branches in the single decoder to locate all changed objects.
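The channel-wise swapping step can be illustrated with a small PyTorch function. The even/odd channel split used here is an illustrative assumption, not necessarily the exact exchange mask used in the paper.

```python
import torch

def channel_exchange(f1, f2):
    """Swap a subset of channels between two temporal feature maps
    (shape [N, C, H, W]) so that each branch afterwards carries semantic
    information from both images."""
    mask = torch.zeros(f1.shape[1], dtype=torch.bool)
    mask[::2] = True  # illustrative choice: exchange every other channel
    e1, e2 = f1.clone(), f2.clone()
    # the right-hand side reads from the originals, so the swap is symmetric
    e1[:, mask], e2[:, mask] = f2[:, mask], f1[:, mask]
    return e1, e2

f1 = torch.arange(8.0).reshape(1, 2, 2, 2)   # temporal-1 features
f2 = -torch.arange(8.0).reshape(1, 2, 2, 2)  # temporal-2 features
e1, e2 = channel_exchange(f1, f2)
# after the exchange, e1 holds f2's channel 0 and f1's channel 1
```

Because the exchange is parameter-free, it mixes cross-temporal semantics without adding any learnable cost to the encoder.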

In terms of core ideas, as shown in Figure 2c, the proposed approach uses the semantic-guided and spatially localized features of the two temporal remote sensing images for change detection. The model first uses the encoder branches to extract shared semantic features from the two temporal remote sensing images to roughly determine the changed areas, providing guidance for precise localization of changed objects. Then, it uses the spatial features of each temporal remote sensing image to accurately locate the changed objects in that image. Finally, by fusing the features of changed objects from both temporal images, the model accurately locates all changed objects.

EDED performs decision-level fusion of the temporal features of changed objects, thus solving the problem of interference between temporal features in MESD. Compared to DED, EDED can use change features instead of single temporal features to determine the changed objects in the two temporal remote sensing images in the intra-class change detection scene, enabling accurate identification of the changed objects. In the multi-view change detection scene, EDED can differentiate between real changes and pseudo-changes caused by viewing angle differences using the semantic features of the two temporal images, thus accurately identifying the real changed objects. Therefore, EDED solves the limitations of existing architectures and can be applied to multiple change detection scenes.

Figure 3. Network structure of the EDED basic architecture

Based on this basic architecture, as shown in Figure 4, we construct a Semantic Guidance and Spatial Localization Network (SGSLN) for change detection.

Figure 4. Network structure of SGSLN

We conducted extensive experiments on intra-class change detection datasets (CDD and SYSU), single-view change detection datasets (WHU, LEVIR CD, and LEVIR-CD+), and multi-view change detection dataset (NJDS). As shown in Figure 5, the results demonstrate that EDED outperforms MESD and DED in multiple scenes. The improvement of F1-Score achieved by EDED is 1.24%, 0.44%, and 3.5% in the intra-class change detection scene (SYSU), single-view change detection scene (LEVIR-CD), and multi-view change detection scene (NJDS), respectively.
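For reference, the F1-Score quoted in these comparisons is the harmonic mean of precision and recall over the binary change mask. A minimal NumPy computation, using made-up toy masks, looks like this:

```python
import numpy as np

def change_f1(pred, gt):
    """F1-score between a predicted binary change mask and the ground
    truth: harmonic mean of pixel-wise precision and recall on the
    'changed' class."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()      # correctly detected change pixels
    precision = tp / max(pred.sum(), 1)      # of predicted change, how much is real
    recall = tp / max(gt.sum(), 1)           # of real change, how much was found
    return 2 * precision * recall / max(precision + recall, 1e-12)

pred = np.array([[1, 1, 0], [0, 1, 0]])  # toy predicted change mask
gt   = np.array([[1, 0, 0], [0, 1, 1]])  # toy ground-truth change mask
print(round(float(change_f1(pred, gt)), 3))  # 0.667
```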

Figure 5. Comparative experiments on multiple scenes with the EDED basic architecture

Comparative experiments of SGSLN against other change detection methods show significant performance improvements across multiple scenes (Figure 6). The F1-Score of the model increased by 0.60% and 2.04% in the intra-class change detection scenes (CDD and SYSU; Figures 6a-b); by 1.12%, 0.08%, and 5.10% in the single-view change detection scenes (WHU, LEVIR-CD, and LEVIR-CD+; Figures 6c-e); and by 4.82% in the multi-view change detection scene (NJDS; Figure 6f).

Figure 6. Comparative experiments of SGSLN in multiple scenes

EDED solves the problems of existing architectures MESD and DED and is a highly promising new architecture for deep learning-based change detection. SGSLN, based on EDED, demonstrates excellent performance in multiple scenes, but there is still untapped potential. We hope that future research can build upon EDED to develop even better and more versatile change detection models across multiple scenes.


The code released with this research is also suitable as a framework for change detection. It is user-friendly and can be easily modified for different models. For more details, please refer to the article.

If you find the code framework useful, please consider giving it a star. If you have any questions regarding this article, feel free to comment or send a private message. We welcome the use and citation of this general change detection framework and the reference to our paper.

Remote Sensing and Computer Vision

Actually, the two fields are quite similar. Change detection is well accepted right now: you can certainly submit to the remote sensing conference IGARSS, and top computer vision and AI conferences also accept relevant papers every year. It still has application potential in real-world scenarios. I don't know much about webGIS, but I feel that if you want to go into development, you should start at a company with strong technical expertise, so you can get involved smoothly ahead of time.

Remote Sensing and Deep Learning

Remote sensing combined with deep learning is a powerful technique.

If you’re not into computer vision or natural language processing, and don’t have any top-tier publications, it can be difficult to find a job in algorithms.

In comparison, transitioning to development might be a better option.


It's surprising to see answers praising IGARSS and TGRS: one is a venue that's easy to publish in, and the other, while well known within a specific area, doesn't demonstrate core competitiveness.

Algorithm route: Learn A, contribute to open-source projects, pursue a Master’s degree (preferably in a top research group), and gain internship experience.

Overall, the development path has a higher difficulty ceiling and a better cost-performance ratio.

Development Recommendations

When developing, it is recommended to use algorithms that have fewer pitfalls and are more efficient.

Doctorate to Computer Vision or Paper to Development

Either pursue a doctorate in computer vision, or take a so-so paper record and turn it into solid development work.

Pursue a Ph.D. or Finish Writing Thesis and Find an Internship in Development

If you want to pursue a Ph.D., then focus on doing well. There is a lot of room for exploring change detection, and using large models would also be beneficial. If you don’t want to pursue a Ph.D., then finish writing your thesis and find an internship in development.

Studying Literature and GitHub

There are quite a few people working in this field, right?

Just work hard~ Read more literature, explore more on GitHub.

Internship is Key

Stop hesitating and go get an internship as soon as you can. What is there left to weigh?