Band Selection Combining Hypergraph Autolearning with an Optimal Clustering Framework


NIAN Mixue, NIE Ping, WANG Guoqiang

(School of Electronic Engineering, Heilongjiang University, Harbin 150080, China)

Hyperspectral image (HSI) processing has attracted considerable attention in recent years. An HSI provides rich band information across different wavelengths and is widely used in various research fields, such as biological analysis and medical image processing. An HSI records the reflectance of electromagnetic waves at different wavelengths, and the reflectance at each wavelength is stored in a 2-D image[1-3]. Hence, an HSI is a data cube containing hundreds of 2-D images. Although significant successes have been achieved in HSI applications, handling such high-dimensional data remains a challenging problem, because the high correlation and dependence among bands cause huge computational complexity as well as the "Hughes phenomenon"[4-5]. In view of this, dimensionality reduction of HSI is a very important task.

According to the involvement of labeled and unlabeled samples, band selection can be divided into supervised, semisupervised, and unsupervised methods[6-8]. Supervised and semisupervised methods use labeled samples to guide the selection process. However, acquiring labeled samples is a difficult task, and such methods are sometimes impractical in real applications.

With the development of imaging techniques, hyperspectral sensors can characterize various objects in depth with hundreds of contiguous bands. For classification, however, a wealth of spectral bands not only increases the computational and storage burden of training a classifier but may also degrade the classification accuracy. For instance, due to the lack of labeled pixels, the generalization capability of a classifier is limited when high-dimensional bands are fed into it; this problem is known as the curse of dimensionality. In addition, many adjacent bands are heavily redundant and fail to provide additional discriminative information. Reducing the number of bands, that is, dimensionality reduction, is an effective strategy to address these challenges.

In the field of HSI, three such techniques are in use: feature extraction (FE)[9], unmixing, and band selection (BS)[10]. Band selection has three advantages over the other two. First, it only selects a subset of the original bands and does not generate new features, thereby preserving the physical information of the selected bands. Second, FE and unmixing typically need all test samples to extract new features, endmembers, and corresponding abundances during the test phase[11]; in contrast, band selection only stores the information related to a few selected bands, which greatly reduces the storage and computational burden. Third, band selection can be combined with feature extraction and unmixing to improve their efficiency and performance.

For HSI, selecting discriminative bands is a challenging task due to the lack of labeled samples and complex noise. To tackle these issues, we present a band selection method that combines hypergraph autolearning with an optimal clustering framework.

1 Proposed method

1.1 Hypergraph autolearning

We obtain subspaces by randomly dividing the band space as follows. First, the dimension of the $v$-th subspace is determined by

$d_v = \lfloor [(1-\sigma_1)\tau_{\min} + \sigma_1\tau_{\max}]B \rfloor$

(1)

where $d_v$ represents the number of bands available to the $v$-th subspace, $B$ is the total number of bands, $\tau_{\min}$ and $\tau_{\max}$ are the minimum and maximum ratios of the subspace dimension to $B$, and $\sigma_1 \in [0,1]$ is a uniform random variable. Second, the bands associated with this subspace are selected one by one, the index of each being determined by

$\mathrm{ind} = \lfloor 1 + \sigma_2 B \rfloor$

(2)

where ind represents the index of the selected band, and $\sigma_2 \in [0,1]$ is a uniform random variable. This step is repeated until $d_v$ bands have been selected for the $v$-th subspace. The above process of generating subspaces is repeated until every band appears in at least one subspace. In this way, we obtain a large number of labeled low-dimensional samples.
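Since Eqs. (1) and (2) fully specify the sampling procedure, the subspace generation is straightforward to script. The following minimal Python sketch illustrates it; the default values of tau_min and tau_max and the guard against an out-of-range index are assumptions, since the text does not fix them.

    import numpy as np

    def random_subspaces(B, tau_min=0.1, tau_max=0.5, rng=None):
        # Randomly divide the band space into subspaces per Eqs. (1) and (2).
        # tau_min/tau_max bound the subspace dimension ratio (assumed values).
        rng = np.random.default_rng() if rng is None else rng
        subspaces, covered = [], set()
        while len(covered) < B:                  # repeat until every band appears
            sigma1 = rng.uniform(0.0, 1.0)
            d_v = max(1, int(np.floor(((1 - sigma1) * tau_min
                                       + sigma1 * tau_max) * B)))   # Eq. (1)
            bands = set()
            while len(bands) < d_v:              # pick the bands one by one
                sigma2 = rng.uniform(0.0, 1.0)
                ind = int(np.floor(1 + sigma2 * B))                 # Eq. (2)
                bands.add(min(ind, B))           # clamp the 1-based index to B
            subspaces.append(sorted(bands))
            covered |= bands
        return subspaces

    # Example: partitioning the 103-band Pavia University cube
    print(len(random_subspaces(103)))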

3) Hypergraph-based information sharing: First, subspaces reflect different representations of the given training samples; in other words, representations from different subspaces share the same structural distribution. For example, if two representations belong to the same class in one subspace, they belong to the same class in all other subspaces. Second, the correlation between representations is view dependent. For example, the representations of two training samples may be highly correlated in one subspace, yet this may not hold in other subspaces. Although graph-based methods have been proposed to share information between views from the perspective of preserving local manifold structure, they force the representations from different views to share both the same structural distribution and the same correlation. This not only reduces the flexibility of information sharing but also makes them susceptible to unfriendly views and liable to convey unreliable information.

4) We propose a novel hypergraph-based information-sharing model that solves these problems by dividing the information carried by subspaces into structure information and view-dependent information. The structure information, such as the label distribution, can be shared to convey reliable information. The view-dependent information, such as the difference between representations in the spectral dimension, can be used to preserve the specificity of subspaces. Considering that all subspaces have the same label distribution and share the common label matrix $Y$, we use the label information to construct hyperedges so that representations from the same class lie on the same hyperedge. Hence, different subspaces share the same set of hyperedges $\varepsilon = \{\varepsilon_1, \ldots, \varepsilon_C\}$ and the same incidence matrix $H \in \mathbb{R}^{n \times C}$. The degree $e_c$ of the hyperedge $\varepsilon_c$ ($1 \le c \le C$) equals the number of samples belonging to class $c$, while the bands of representations from different subspaces differ. That is, although representations from the same class belong to the same hyperedge, the compactness between them varies across subspaces and is view dependent, which is exploited to preserve the specificity of subspaces. This is achieved by automatically assigning different weights to the same hyperedge in different subspaces. Hence, from the perspective of preserving manifold structure, the hypergraph autolearning-based information-sharing model can be written as

(3)

(4)
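The bodies of Eqs. (3) and (4) are not reproduced here, but the hyperedge construction they rely on is fully described above: one hyperedge per class, shared by all subspaces and built from the common label matrix $Y$. A minimal sketch under that reading follows; representing labels as zero-based integers is an assumption.

    import numpy as np

    def incidence_matrix(labels, C):
        # Shared incidence matrix H in R^{n x C}: sample i lies on hyperedge c
        # iff it belongs to class c, so the hyperedge degree e_c equals the
        # number of samples of class c.
        labels = np.asarray(labels, dtype=int)   # assumed in {0, ..., C-1}
        n = labels.shape[0]
        H = np.zeros((n, C))
        H[np.arange(n), labels] = 1.0
        e = H.sum(axis=0)                        # hyperedge degrees e_c
        return H, e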

1.2 Optimal clustering framework (OCF)

(5)

Without loss of generality, we assume that the function $D_s$ is to be maximized, so our optimization problem becomes

(6)

After the optimization problem is clarified, the solution is given in two steps, named problem decomposition and subproblem combination, respectively. The mapping $f$ here is still in a general form, which means the solution is available for an arbitrary definition of $f$.

(7)

Then, by enumerating all the possible values of $s_{k-1}$, (7) can be rewritten as

(8)

By substituting $k=1$ into (7), we have

(9)

(10)

It is easy to see that

(11)

For more details about the framework, refer to the pseudocode shown in Algorithm 1.

Algorithm 1 OCF ($D_s$ is maximized)
Input: set of bands $X_1^L$, mapping $f$, and cluster number $K$.
1: for l ← 1 to L do
2:     M_l^1 ← f(X_1^l)
3:     Q_l^1 ← 0
4: end for
5: for k ← 2 to K do
6:     for l ← k to L do
7:         M_l^k ← −∞
8:         p* ← 0
9:         for p ← k−1 to l−1 do
10:            if M_l^k < M_p^{k−1} + f(X_{p+1}^l) then
11:                M_l^k ← M_p^{k−1} + f(X_{p+1}^l)
12:                p* ← p
13:            end if
14:        end for
15:        Q_l^k ← p*
16:    end for
17: end for
18: s*_K ← L
19: for k ← K−1 to 1 do
20:     s*_k ← Q_{s*_{k+1}}^{k+1}
21: end for
Output: CBIV corresponding to M_L^K: s* = (s*_1, s*_2, …, s*_{K−1})^T
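As a companion to the pseudocode, the following Python transcription of the dynamic program is offered as an illustrative sketch; treating the mapping $f$ as a score over a 1-based index range (a, b) of contiguous bands is an interface assumption made for self-containedness.

    import numpy as np

    def ocf_maximize(L, K, f):
        # Dynamic program of Algorithm 1 (D_s maximized).
        # L: number of bands; K: number of clusters;
        # f(a, b): score of the contiguous band group X_a^b (1-based, inclusive).
        M = np.full((K + 1, L + 1), -np.inf)  # M[k][l]: best score of first l bands in k groups
        Q = np.zeros((K + 1, L + 1), dtype=int)
        for l in range(1, L + 1):             # lines 1-4: one-group initialization
            M[1][l] = f(1, l)
        for k in range(2, K + 1):             # lines 5-17: fill the DP table
            for l in range(k, L + 1):
                for p in range(k - 1, l):     # enumerate the last break point
                    score = M[k - 1][p] + f(p + 1, l)
                    if score > M[k][l]:
                        M[k][l] = score
                        Q[k][l] = p
        s = [0] * (K + 1)                     # lines 18-21: backtrack
        s[K] = L
        for k in range(K - 1, 0, -1):
            s[k] = Q[k + 1][s[k + 1]]
        return s[1:K], M[K][L]

    # Toy usage with an illustrative f that rewards balanced groups
    breaks, best = ocf_maximize(L=10, K=3, f=lambda a, b: -abs((b - a + 1) - 10 / 3))
    print(breaks, best)

The triple loop makes the cost O(K·L²) evaluations of $f$, which matches the structure of the pseudocode.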

2 Experiments and analysis

To verify its feasibility and effectiveness, the proposed method is compared with scalable one-pass self-representation learning for hyperspectral band selection (SOP-SRL)[12], local-view-assisted discriminative band selection with hypergraph autolearning (LvaHAI)[13], and a fast neighborhood grouping method for hyperspectral band selection (FNGBS)[14].

2.1 Experimental data sets

The experimental environment is a 10th-generation Intel six-core processor with a clock frequency of 2.60 GHz and 16 GB of memory. All methods are implemented in MATLAB R2016b. Three public hyperspectral image data sets are used: Salinas Valley, Pavia University, and Pavia Center.

1) The Pavia University image, acquired with the Reflective Optics System Imaging Spectrometer (ROSIS) sensor, has a spatial resolution of 1.3 m. This data set consists of 610×340 pixels in 9 classes, where each pixel has 115 spectral bands ranging from 0.43 to 0.86 μm. After removing 12 noisy bands, the remaining 103 bands are used for BS and classification. Table 1 shows the number of training samples and test samples on PaviaU.

2) The Pavia Center image was also obtained by the ROSIS sensor over Pavia, northern Italy; hence, it has the same spatial and spectral resolutions as the first data set. This data set contains 1 096×715 pixels from nine classes. After noisy spectra are removed, 102 bands are available for the experiments. Table 2 shows the number of training samples and test samples on Pavia Center.

Table 1 Number of training samples and test samples on PaviaU

Table 2 Number of training samples and test samples on Pavia Center

3) The Salinas Valley image covers an area located in Salinas Valley, CA, USA. It was obtained by the Airborne Visible/Infrared Imaging Spectrometer with a spatial resolution of 3.7 m and consists of 512×217 pixels in 16 classes. After the 20 noisy bands (108-112, 154-167, and 224) affected by water absorption are removed, 204 bands are retained for experimental analysis. The number of training samples and test samples on Salinas is shown in Table 3.

Table 3 Number of training samples and test samples on Salinas

2.2 Experimental setup

K-nearest neighbor (KNN) classification is adopted for the experiments. KNN is the simplest classifier in machine learning; it determines a sample's category according to the categories of its K most similar training samples. The optimal K value is selected through cross-validation, and K = 5 is finally adopted. Additionally, since the classifier is supervised, 10% of the samples from each class, based on the selected bands, are randomly chosen as the training set, and the remaining 90% are used for testing. Moreover, to reduce the influence of the random 10% selection, each algorithm is run ten times and the average results are reported. Because the desired number of selected bands is unknown, experiments are conducted over the range of 5 to 30 bands to show the influence of the number of bands on classification accuracy. Overall accuracy (OA), average accuracy (AA), and the Kappa coefficient are used as evaluation indexes for hyperspectral image classification.
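For reference, the evaluation protocol described above can be sketched as follows. The function is illustrative: scikit-learn supplies the KNN classifier and the Kappa coefficient, and the stratified per-class 10% split is an assumed reading of the sampling procedure.

    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.metrics import cohen_kappa_score

    def evaluate(X, y, band_idx, runs=10, train_ratio=0.1, k=5, seed=0):
        # Average OA, AA and Kappa over `runs` random 10%/90% splits,
        # using only the selected bands `band_idx`.
        rng = np.random.default_rng(seed)
        Xs = X[:, band_idx]
        oas, aas, kappas = [], [], []
        for _ in range(runs):
            train, test = [], []
            for c in np.unique(y):            # stratified 10% per class
                idx = rng.permutation(np.where(y == c)[0])
                n_tr = max(1, int(train_ratio * len(idx)))
                train += idx[:n_tr].tolist()
                test += idx[n_tr:].tolist()
            clf = KNeighborsClassifier(n_neighbors=k).fit(Xs[train], y[train])
            pred = clf.predict(Xs[test])
            y_te = y[test]
            oas.append(np.mean(pred == y_te))                          # OA
            aas.append(np.mean([np.mean(pred[y_te == c] == c)
                                for c in np.unique(y_te)]))            # AA
            kappas.append(cohen_kappa_score(y_te, pred))               # Kappa
        return tuple(np.mean(v) for v in (oas, aas, kappas))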

2.3 Analysis of experimental results

The whole band space is first randomly divided into several subspaces of different dimensions. Then, for each subspace, a robust hinge loss function for isolated pixels, regularized by row sparsity, is adopted to measure the importance of the corresponding bands. A hypergraph model that automatically learns the hyperedge weights preserves the local manifold structure of these projections, ensuring that samples of the same class have small distances, and a consensus matrix integrates the band importances from different subspaces, so that the expected bands are selected optimally from a global perspective. Finally, a simple and effective clustering strategy is proposed to select bands, which are then fed into a classifier for classification.

Classification performance for different numbers of bands on the three data sets is shown in Fig.1~Fig.3. It can be seen that the number of selected bands affects the classification results. The proposed method achieves satisfactory results on OA, AA, and Kappa. When the number of selected bands is small, the accuracy of the algorithm is unstable; when the number of bands exceeds 25, the accuracy tends to be stable.

Fig.1 Relationship between the number of bands and Kappa coefficient

Fig.2 Relationship between the number of bands and AA

Fig.3 Relationship between the number of bands and OA

To further verify the effectiveness and superiority of the proposed method, KNN is used as the classifier and the method is compared with LvaHAI, SOP-SRL, and FNGBS, three recent algorithms. The experimental results are shown in Fig.4, Fig.5, and Fig.6. As can be seen from Fig.4, for the Pavia Center data set, the OA of the proposed algorithm with the KNN classifier is always higher than that of the other algorithms. Across different numbers of selected bands, the algorithm shows excellent classification performance when the number of bands is small: with 10 selected bands, its OA on the Pavia Center data set is 84.69%, exceeding LvaHAI, SOP-SRL, and FNGBS. However, as the number of bands increases beyond 15, the performance does not improve significantly, which may be because the subspaces contain fewer and fewer bands, so the current band cannot be judged and updated with more favorable information; this indicates that the method is more effective in low dimensions. As can be seen from Fig.5, for the Pavia University data set, the OA of the proposed algorithm is always higher than that of the other algorithms, further illustrating its superiority; compared with LvaHAI and FNGBS, it is also more stable. For the Salinas data set, when the number of bands is small, the proposed algorithm performs better than FNGBS. In summary, the overall performance of the algorithm is better than that of the other algorithms, with better robustness, and it performs well even with small sample sizes.

Fig.4 OA metrics of the PaviaC dataset

Fig.5 OA metrics of the PaviaU dataset

Fig.6 OA metrics of the Salinas dataset

To verify the effectiveness and superiority of the algorithm, 15 selected bands are taken as an example to classify the ground objects in the three data sets; the classification results are shown in Table 4. As can be seen from Table 4, the OA of the proposed algorithm is higher than that of the other algorithms on the Pavia Center and Pavia University data sets, with increases of 1.04% and 1.05%, respectively, over the LvaHAI algorithm. On the Pavia University data set, the Kappa of the proposed algorithm is 4.73% higher than that of SOP-SRL. For the Salinas and Pavia University data sets, compared with SOP-SRL and LvaHAI, the AA and Kappa of the proposed method show certain advantages.

Table 4 Classification results of different methods on three data sets

3 Conclusion

A band selection method combining hypergraph autolearning with an optimal clustering framework is proposed. The whole band space is randomly divided into several subspaces of different dimensions, where each subspace denotes a set of low-dimensional representations of the training samples consisting of the bands associated with it. A hypergraph model that automatically learns the hyperedge weights preserves the local manifold structure of these projections, ensuring that samples of the same class have small distances, and a consensus matrix integrates the band importances from different subspaces. Finally, a simple and effective clustering strategy is proposed to select bands. Experimental comparison and analysis on three public hyperspectral image data sets show that the proposed method performs well in terms of OA, AA, and Kappa, verifying the feasibility and effectiveness of the proposed band selection method.

