WHU-OHS Hyperspectral Dataset


J. Li, X. Huang, and L. Tu, “WHU-OHS: A benchmark dataset for large-scale hyperspectral image classification,” Int. J. Appl. Earth Obs. Geoinf., vol. 113, no. September, p. 103022, 2022, doi: 10.1016/j.jag.2022.103022.


Dataset download:

Training set[Link]

Validation set[Link]

Test set[Link]


Zenodo Download link[Link]

Dataset introduction:

The WHU-OHS dataset consists of 42 OHS satellite images acquired from more than 40 different locations in China (Fig. 1). The imagery has a spatial resolution of 10 m (nadir) and a swath width of 60 km (nadir). There are 32 spectral channels spanning the visible to near-infrared range, with an average spectral resolution of 15 nm. We cropped each image into sub-images of 512 × 512 pixels with a stride of 32. The training, validation, and test sets contain 4822, 513, and 2460 sub-images, respectively.
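The cropping step can be illustrated with a minimal sliding-window sketch. It is not the authors' preprocessing code: the function name `crop_patches`, the (height, width, channels) array layout, and the choice to drop incomplete windows at the image border are all assumptions made for illustration.

```python
import numpy as np


def crop_patches(image, patch=512, stride=32):
    """Yield (row, col, window) for every full patch-sized window.

    Assumes `image` is a (height, width, channels) array. Windows that
    would extend past the border are skipped (an assumption; the README
    does not say how borders were handled).
    """
    height, width = image.shape[:2]
    for row in range(0, height - patch + 1, stride):
        for col in range(0, width - patch + 1, stride):
            yield row, col, image[row:row + patch, col:col + patch]


# Small synthetic array with 32 channels, standing in for a real scene.
demo = np.zeros((576, 544, 32), dtype=np.uint16)
patches = list(crop_patches(demo))
print(len(patches), patches[0][2].shape)
```

Each yielded window is a view into the original array, so no pixel data is copied until a patch is written out or modified.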


Fig. 1. Left: The geographical locations of the 42 images in the WHU-OHS dataset. Right: Examples of local OHS parcels (true-color compositions with R: 670 nm; G: 566 nm; B: 480 nm) and their corresponding reference labels.

The dataset is organized in the format shown in Fig. 2.


Fig. 2. Data organization of the WHU-OHS dataset.

The correspondence of label IDs and categories:


For the transferability test, we chose eight pairs of OHS images. Each pair contains one source image (S) and one target image (T):

S1: Changchun

T1: Jilin

S2: Wuxi

T2: Shanghai

S3: Guangzhou

T3: Zhongshan

S4: Xining

T4: Lanzhou

S5: Hetian

T5: Kelamayi

S6: Anyi

T6: Nanchang

S7: Changde

T7: Changsha

S8: Tianjin

T8: Tangshan
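For scripting transfer experiments, the eight pairs above can be written as a lookup table. This is only a convenience sketch: the name `TRANSFER_PAIRS` is hypothetical, the location names are copied as spelled in the list, and how they map to file names on disk is not stated here.

```python
# Pair index -> (source location, target location), copied from the list above.
TRANSFER_PAIRS = {
    1: ("Changchun", "Jilin"),
    2: ("Wuxi", "Shanghai"),
    3: ("Guangzhou", "Zhongshan"),
    4: ("Xining", "Lanzhou"),
    5: ("Hetian", "Kelamayi"),
    6: ("Anyi", "Nanchang"),
    7: ("Changde", "Changsha"),
    8: ("Tianjin", "Tangshan"),
}

for index, (source, target) in TRANSFER_PAIRS.items():
    print(f"S{index}: {source} -> T{index}: {target}")
```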

The remaining 26 OHS images, which are not part of the eight pairs:

O1: Baoding

O2: Chongqing

O3: Fujin

O4: Huainan

O5: Huhehaote

O6: Jinzhong

O7: Luliang

O8: Manasi_1

O9: Manasi_2

O10: Nanmulin

O11: Neimenggu

O12: Qingdao

O13: Qinghuangdao

O14: Shawan

O15: Shenyang

O16: Shuozhou

O17: Songpan

O18: Taian

O19: Tongjiang_1

O20: Tongjiang_2

O21: Wuzhong

O22: Xundian

O23: Xuzhou

O24: Yidu

O25: Zangzu

O26: Zhongshan

The image patches have been normalized and then scaled by 10000 to reduce storage cost. Divide the pixel values by 10000 before using the patches.
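A minimal sketch of the rescaling step, assuming a patch has already been loaded into a NumPy array. The function name `rescale_patch` and the `uint16` storage type in the example are assumptions; reading the files is left out because the file format is not described here.

```python
import numpy as np

SCALE = 10000.0  # scaling factor stated in the README


def rescale_patch(patch):
    """Convert stored integer values back to normalized floats."""
    return patch.astype(np.float32) / SCALE


# Synthetic stored values standing in for a real patch.
stored = np.array([[0, 2500], [5000, 10000]], dtype=np.uint16)
restored = rescale_patch(stored)
print(restored)
```

Casting to `float32` before dividing avoids integer division and keeps memory use lower than `float64`.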