GLanCE Global Land Cover Training Dataset
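The GLanCE training dataset is publicly available and is designed for regional-to-global land cover and land cover change analysis. It has a medium spatial resolution of 30 m, spans 1984 to 2020, and is geographically and spectrally representative of all global ecoregions. Each training unit carries up to 23 land cover attributes, which yields a harmonized, standardized, and comprehensive database that includes information on abrupt and gradual land cover change processes, with records covering up to 36 years in selected regions. The dataset is adaptable: users can subsample and customize it for their study region, classification algorithm, and desired classification legend, which makes it a versatile resource for in-depth land cover investigations.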
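Advanced cloud computing platforms such as Google Earth Engine (GEE) make it possible to map land cover and land cover change at regional to global scales using machine learning algorithms. However, collecting the high-quality training data needed for accurate land cover mapping remains costly and labor-intensive. To address this need, we created a global database of nearly 2 million training units spanning 1984 to 2020 and covering seven Level 1 and nine Level 2 land cover classes. Our training data collection approach used GEE and machine learning algorithms to ensure data quality and biogeographic representation. We sampled the spectral-temporal feature space of Landsat imagery to allocate training data efficiently across global ecoregions, and we incorporated publicly available and collaborator-provided datasets into the database. To reflect the underlying regional class distribution and post-disturbance landscapes, we strategically augmented the database. We used a machine-learning-based cross-validation procedure to remove training units that were potentially mislabeled. The training database is suitable for a wide range of studies, including land cover change, agriculture, forestry, hydrology, and urban development. Paper link: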
Data attributes
Column Name | Description |
---|---|
Lat | Latitude |
Lon | Longitude |
Start_Year | Start year of segment, ranging from 1984 to 2020 (integer) |
End_Year | End year of segment, ranging from 1984 to 2020 (integer) |
Glance_Class_ID_level1 | Level 1 land cover value (integer): 1 (Water), 2 (Ice/snow), 3 (Developed), 4 (Barren/sparsely vegetated), 5 (Trees), 6 (Shrub), and 7 (Herbaceous) |
Glance_Class_ID_level2 | Level 2 land cover value (integer): 1 (Water), 2 (Ice/snow), 3 (Developed), 4 (Soil), 5 (Rock), 6 (Beach/sand), 7 (Deciduous), 8 (Evergreen), 9 (Mixed), 10 (Shrub), 11 (Grassland), 12 (Agriculture), and 13 (Moss/lichen). NaN values present. |
Leaf_Type | Tree leaf type: broadleaf (1), needleleaf (2), and mixed (3). NaN values present. |
Impervious_Percent | Impervious percent for developed samples: low 0%-30% (1), medium 30%-60% (2), and high 60%-100% (3). NaN values present. |
Tree_Location | Binary integer indicating whether trees are on the interior (0) or edge (1) of a forest. NaN values present. |
Veg_Density | Vegetation density for trees and shrubs: sparse 0%-30% (1), open 30%-60% (2), and closed 60%-100% (3). NaN values present. |
Veg_Modifier | Vegetation modifiers, which can include one or more of the following: Cropland, Plantation, Wetland, Riparian/Flood, Mangrove, Greenhouse, and Trees/Shrub Present. NaN values present. |
Segment_Type | Indicates whether a segment is stable (0) or transitional (1). See Section 1 for a detailed description. Land cover for transitional segments is recorded at both the beginning and end of the time segment - typically the first and last three years. NaN values present. |
Change | Indicates presence (1) or absence (0) of land cover change for Level 1 land cover labels. Includes both abrupt change and gradual change (transitional segments (1) from the Segment_Type attribute) if it happened at any time for that training unit. |
LC_Confidence | Interpreter confidence in the Level 1 land cover label from 1 (lowest) to 3 (highest). NaN values present. |
Level1_Ecoregion | Ecoregion Level 1 number based on World Wildlife Fund definitions. For North America we used ecoregions based on the Environmental Protection Agency’s Ecoregions of North America product. |
Level2_Ecoregion | Ecoregion Level 2 number based on the Environmental Protection Agency’s Ecoregions of North America product. This field is available only for North America and is assigned a value of 0 for all other continents. |
Continent_Code | Assigned continent number: North America (1), South America (2), Africa (3), Europe (4), Asia (5), and Oceania (6). |
Dataset_Code | Assigned dataset number: 1, 2, 3, 4, 5, 902, 999, 700, 701, 702, 703, 704, 705, 706, and 707. Numbers correspond to each Dataset as follows: STEP, CLUSTERING, LCMAP, ABoVE, MapBiomas, Feedback, Training_augment, MODIS_algo, GeoWiki, RadEarth, Collaborator_data, BU_team_collected, GLC30, LUCAS, ASB_crop. For details see Scientific Data publication. |
Glance_ID | Unique ID for each sample. |
ID | ID for each unique combination of latitude and longitude. Change units have the same ID but different Glance_ID. |
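As a rough illustration of how the attributes above might be used to subsample the database, the sketch below reads a few rows and filters them by Level 1 class, year, and interpreter confidence. It uses only the Python standard library. The inline CSV rows are made up for illustration, and the helper names `to_number`, `load_units`, and `select_units` are hypothetical; only the column names come from the table.

```python
import csv
import io
import math

# Made-up sample rows that reuse the documented column names (illustrative only).
SAMPLE_CSV = """Lat,Lon,Start_Year,End_Year,Glance_Class_ID_level1,LC_Confidence,Continent_Code,Change,Glance_ID,ID
45.1,-93.2,1984,2020,5,3,1,0,1,1
-3.4,-60.0,1990,2005,5,2,2,1,2,2
-3.4,-60.0,2006,2020,7,NaN,2,1,3,2
48.8,2.3,2000,2020,3,1,4,0,4,3
"""


def to_number(text):
    """Parse a CSV cell as float; 'NaN' and empty cells become float('nan')."""
    try:
        return float(text)
    except ValueError:
        return math.nan


def load_units(handle):
    """Read training units into a list of dicts with numeric values."""
    return [{k: to_number(v) for k, v in row.items()} for row in csv.DictReader(handle)]


def select_units(units, level1=None, year=None, min_confidence=None, continent=None):
    """Keep the units that match every criterion that is not None."""
    selected = []
    for u in units:
        if level1 is not None and u["Glance_Class_ID_level1"] != level1:
            continue
        if year is not None and not (u["Start_Year"] <= year <= u["End_Year"]):
            continue
        # A NaN confidence fails this comparison, so such units are dropped.
        if min_confidence is not None and not (u["LC_Confidence"] >= min_confidence):
            continue
        if continent is not None and u["Continent_Code"] != continent:
            continue
        selected.append(u)
    return selected


units = load_units(io.StringIO(SAMPLE_CSV))
trees_2000 = select_units(units, level1=5, year=2000, min_confidence=2)
print([int(u["Glance_ID"]) for u in trees_2000])  # [1, 2]
```

In this sample, the two rows sharing `ID` 2 are one location with two segments, which mirrors how change units share an `ID` but have different `Glance_ID` values.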
Classification scheme
Level 1 | Level 2 | Description |
---|---|---|
Water (1) | Water (1) | Areas covered with water throughout the year: streams, canals, lakes, reservoirs, oceans. |
Ice/snow (2) | Ice/snow (2) | Land areas with snow and ice cover greater than 50% throughout the year. |
Developed (3) | Developed (3) | Areas of intensive use; land covered with structures, including any land functionally related to developed/built-up activity. |
Barren/sparsely vegetated (4) | | Land comprised of natural occurrences of soils, sand, or rocks where less than 10% of the area is vegetated. |
Barren/sparsely vegetated (4) | Soil (4) | Land covered with less than 10% vegetation and dominated by soil. |
Barren/sparsely vegetated (4) | Rock (5) | Land covered with less than 10% vegetation and dominated by rocks. |
Barren/sparsely vegetated (4) | Beach/sand (6) | Land covered with less than 10% vegetation and dominated by beach/sand. |
Trees (5) | | Land where tree cover is greater than 30%. Note that cleared trees (i.e., clear-cuts) are mapped according to current cover (e.g., barren/sparsely vegetated, shrubs, or herbaceous). |
Trees (5) | Deciduous (7) | Land with tree cover greater than 30% and all trees present are deciduous. |
Trees (5) | Evergreen (8) | Land with tree cover greater than 30% and all trees present are evergreen. |
Trees (5) | Mixed (9) | Land with tree cover greater than 30% and neither deciduous nor evergreen trees dominate. |
Shrub (6) | Shrub (10) | Land with less than 30% tree cover, where total vegetation cover exceeds 10% and shrub cover is greater than 10%. |
Herbaceous (7) | | Land covered by herbaceous plants. Total vegetation cover exceeds 10%, tree cover is less than 30%, and shrubs comprise less than 10% of the area. |
Herbaceous (7) | Grassland (11) | Herbaceous land covered with grass. |
Herbaceous (7) | Agriculture (12) | Herbaceous land covered with cultivated cropland. |
Herbaceous (7) | Moss/lichen (13) | Herbaceous land covered with lichen and/or moss. |
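The two-level legend above can be encoded as lookup tables, which is one way to decode labels, check that a Level 2 code belongs to its Level 1 class, or collapse the legend into a custom one. The sketch below is a minimal example: the codes and names are taken from the tables in this post, while `is_consistent`, `describe`, and the forest/non-forest legend are hypothetical helpers added for illustration.

```python
# Level 1 and Level 2 codes as listed in the attribute and classification tables.
LEVEL1_NAMES = {
    1: "Water", 2: "Ice/snow", 3: "Developed", 4: "Barren/sparsely vegetated",
    5: "Trees", 6: "Shrub", 7: "Herbaceous",
}
LEVEL2_NAMES = {
    1: "Water", 2: "Ice/snow", 3: "Developed", 4: "Soil", 5: "Rock",
    6: "Beach/sand", 7: "Deciduous", 8: "Evergreen", 9: "Mixed", 10: "Shrub",
    11: "Grassland", 12: "Agriculture", 13: "Moss/lichen",
}
# Parent Level 1 class of each Level 2 code, following the classification table.
LEVEL2_TO_LEVEL1 = {
    1: 1, 2: 2, 3: 3, 4: 4, 5: 4, 6: 4, 7: 5, 8: 5, 9: 5, 10: 6,
    11: 7, 12: 7, 13: 7,
}


def is_consistent(level1, level2):
    """True when the Level 2 code belongs to the Level 1 class (or is missing)."""
    if level2 is None:
        return True
    return LEVEL2_TO_LEVEL1.get(level2) == level1


def describe(level1, level2=None):
    """Human-readable label such as 'Trees / Evergreen'."""
    name = LEVEL1_NAMES[level1]
    if level2 is not None:
        name += " / " + LEVEL2_NAMES[level2]
    return name


# Hypothetical custom legend: collapse Level 1 into forest vs. non-forest.
FOREST_LEGEND = {code: ("forest" if code == 5 else "non-forest") for code in LEVEL1_NAMES}

print(describe(5, 8))        # Trees / Evergreen
print(is_consistent(7, 12))  # True
print(is_consistent(4, 12))  # False
print(FOREST_LEGEND[6])      # non-forest
```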
Data citation
Dataset citation
Code
Code link
License
The dataset is provided under a Creative Commons Attribution 4.0 International Public License, unless otherwise noted.
Created by: Stanimirova et al, Boston University
Curated in GEE by: Samapriya Roy
Keywords: Glance, LULC, training dataset, Medium resolution, global dataset, land use, land cover
Last updated in GEE: 2024-01-02
Copyright © 2020 - 2024 Samapriya Roy
Training data collection
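GLanCE training data were collected by a team of trained image analysts at Boston University using the land cover key and a suite of online tools built on the Google Earth Engine API. Image analysts interpreted land cover on-screen using a combination of high-resolution Google Earth imagery, Landsat imagery, time series of spectral reflectance, vegetation indices, and Landsat-derived Tasseled Cap transformations. In addition, analysts used Google Earth photos and Street View (where available) to aid their interpretation. Hereafter, we refer to each entry in the database, which represents a single Landsat pixel, as a training unit. Each training unit corresponds to one image analyst's interpretation of a Continuous Change Detection and Classification (CCDC) time segment (explained below) (Figure 1). Each unit was assessed for quality and could be flagged for review by a second image analyst. If two analysts disagreed on the land cover label for a given training unit, a third team member reviewed it and, where necessary, reinterpreted or removed the unit. If no high-resolution imagery was available and team members could not determine the land cover with high confidence, the unit was removed.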
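Image analysts interpreted training units from three sources: (1) the System for Terrestrial Ecosystem Parameterization (STEP) training database [18]; (2) samples generated by unsupervised clustering of Landsat spectral-temporal features; and (3) feedback training units, which were generated to improve accuracy for land cover classes or regions that were persistently misclassified. The STEP database was created to provide representative land cover samples from all global ecoregions, in support of the MODIS Collection 6 Land Cover Type product (MCD12Q1) [18]. To adapt the STEP database for GLanCE, we randomly selected 10 Landsat pixels within each MCD12Q1 500 m pixel, visually interpreted them using the procedure described above, and labeled them using the GLanCE land cover key (Figure 2, Table 2). Only training units representing homogeneous land cover were retained as training data; units containing mixed land cover were removed.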
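The STEP adaptation step (randomly picking 10 Landsat pixels inside one 500 m MODIS pixel) can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes the MODIS pixel is an axis-aligned square that holds a whole-number grid of 30 m Landsat cells and ignores projection differences between the two products. The function name `sample_landsat_pixels` and its parameters are hypothetical.

```python
import random


def sample_landsat_pixels(n=10, modis_size_m=500, landsat_size_m=30, seed=None):
    """Pick n distinct (row, col) Landsat cells inside one MODIS pixel.

    Assumes a regular grid of whole Landsat pixels; 500 // 30 gives 16 per side.
    """
    per_side = modis_size_m // landsat_size_m
    cells = [(row, col) for row in range(per_side) for col in range(per_side)]
    rng = random.Random(seed)  # a seeded generator keeps the draw reproducible
    return rng.sample(cells, n)  # sampling without replacement


picked = sample_landsat_pixels(seed=42)
print(len(picked), len(set(picked)))  # 10 10
```

Sampling without replacement ensures that the same Landsat pixel is never interpreted twice within one MODIS pixel.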
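Global spatial distribution of GLanCE training units for Level 1 (a) and Level 2 (b) land cover. Note that the overall density of training units is lower in Africa and Asia, so the "Developed" class in (a) appears overrepresented in this map.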
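To ensure transparency and reproducibility, we used open-source tools, including R (4.3.0), Python 3.6.7, and Google Earth Engine. The time series tool used to collect training data is available on GitHub, as is the repository used to filter the training data. The custom continent definitions are also available in a repository. Continuous Change Detection and Classification (CCDC) tools and applications are available for Google Earth Engine and Python.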
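The general idea of using cross-validation to flag potentially mislabeled training units can be illustrated with a small standard-library sketch. It is not the procedure from the filtering repository: the nearest-centroid classifier is a deliberately simple stand-in for whatever machine learning model the authors used, and the two-feature toy data, the function names, and the choice of `k` are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def centroids(features, labels):
    """Mean feature vector per class."""
    sums, counts = {}, defaultdict(int)
    for x, y in zip(features, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict(model, x):
    """Class whose centroid is closest to x (squared Euclidean distance)."""
    return min(model, key=lambda y: sum((a - b) ** 2 for a, b in zip(model[y], x)))


def flag_suspect_units(features, labels, k=5, seed=0):
    """Indices whose held-out prediction disagrees with the recorded label."""
    order = list(range(len(labels)))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]
    suspects = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in order if i not in held_out]
        model = centroids([features[i] for i in train], [labels[i] for i in train])
        suspects += [i for i in fold if predict(model, features[i]) != labels[i]]
    return sorted(suspects)


# Toy data: two made-up reflectance-like features per unit.
trees = [(0.05 + 0.001 * i, 0.40 + 0.002 * i) for i in range(10)]
water = [(0.02 + 0.001 * i, 0.01 + 0.001 * i) for i in range(10)]
features = trees + water
labels = [5] * 10 + [1] * 10  # Level 1 codes: 5 = Trees, 1 = Water
labels[12] = 5                # deliberately mislabel one water unit as Trees

print(flag_suspect_units(features, labels))  # [12]
```

Each unit is scored only by a model that never saw it during fitting, so a label that disagrees with its neighbors in feature space stands out. Flagged units would normally be reviewed before removal rather than dropped automatically.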