GLanCE Global Land Cover Training Dataset

Posted by 此星光明 on 2024/02/19 17:41:25


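The GLanCE training dataset is publicly available and designed for regional-to-global land cover and land cover change analysis. It has a medium spatial resolution of 30 m, spans 1984 to 2020, and is geographically and spectrally representative of all global ecoregions. Each training unit provides up to 23 land cover characteristics, yielding a harmonized, standardized, and comprehensive database that includes information on abrupt and gradual land cover change processes, with records spanning up to 36 years in select regions. The dataset is adaptable: users can subsample and customize it for their study region, classification algorithm, and desired classification legend, which makes it a versatile resource for in-depth land cover investigations.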

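Advanced cloud computing platforms such as Google Earth Engine (GEE) make it possible to map land cover and land cover change from regional to global scales using machine learning algorithms. However, collecting the high-quality training data needed for accurate land cover mapping remains costly and labor-intensive. To address this need, we created a global database of nearly 2 million training units spanning 1984 to 2020 and covering seven primary and nine secondary land cover classes. Our training data collection approach leveraged GEE and machine learning algorithms to ensure data quality and biogeographic representation. We sampled the spectral-temporal feature space of Landsat imagery to allocate training data efficiently across global ecoregions, and we incorporated publicly available and collaborator-provided datasets into our database. To reflect the underlying regional class distribution and post-disturbance landscapes, we strategically augmented the database. We used a machine learning-based cross-validation procedure to remove training units that were potentially mislabeled. The training database is applicable to a wide range of studies, including land cover change, agriculture, forestry, hydrology, and urban development.

Paper: A global land cover training dataset from 1984 to 2020 | Scientific Data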

Data attributes

| Column Name | Description |
| --- | --- |
| Lat | Latitude |
| Lon | Longitude |
| Start_Year | Start year of segment, ranging from 1984 to 2020 (integer) |
| End_Year | End year of segment, ranging from 1984 to 2020 (integer) |
| Glance_Class_ID_level1 | Level 1 land cover value (integer): 1 (Water), 2 (Ice/snow), 3 (Developed), 4 (Barren/sparsely vegetated), 5 (Trees), 6 (Shrub), and 7 (Herbaceous) |
| Glance_Class_ID_level2 | Level 2 land cover value (integer): 1 (Water), 2 (Ice/snow), 3 (Developed), 4 (Soil), 5 (Rock), 6 (Beach/sand), 7 (Deciduous), 8 (Evergreen), 9 (Mixed), 10 (Shrub), 11 (Grassland), 12 (Agriculture), and 13 (Moss/lichen). NaN values present. |
| Leaf_Type | Tree leaf type: broadleaf (1), needleleaf (2), and mixed (3). NaN values present. |
| Impervious_Percent | Impervious percent for developed samples: low 0%-30% (1), medium 30%-60% (2), and high 60%-100% (3). NaN values present. |
| Tree_Location | Binary integer indicating whether trees are on the interior (0) or edge (1) of a forest. NaN values present. |
| Veg_Density | Vegetation density for trees and shrubs: sparse 0%-30% (1), open 30%-60% (2), and closed 60%-100% (3). NaN values present. |
| Veg_Modifier | Vegetation modifiers, which can include one or more of the following: Cropland, Plantation, Wetland, Riparian/Flood, Mangrove, Greenhouse, and Trees/Shrub Present. NaN values present. |
| Segment_Type | Indicates whether a segment is stable (0) or transitional (1). See Section 1 for a detailed description. Land cover for transitional segments is recorded at both the beginning and end of the time segment - typically the first and last three years. NaN values present. |
| Change | Indicates presence (1) or absence (0) of land cover change for Level 1 land cover labels. Includes both abrupt change and gradual change (transitional segments (1) from the Segment_Type attribute) if it happened at any time for that training unit. |
| LC_Confidence | Interpreter confidence in the Level 1 land cover label from 1 (lowest) to 3 (highest). NaN values present. |
| Level1_Ecoregion | Ecoregion Level 1 number based on World Wildlife Fund definitions. For North America we used ecoregions based on the Environmental Protection Agency’s Ecoregions of North America product. |
| Level2_Ecoregion | Ecoregion Level 2 number based on the Environmental Protection Agency’s Ecoregions of North America product. This field is available only for North America and is assigned a value of 0 for all other continents. |
| Continent_Code | Assigned continent number: North America (1), South America (2), Africa (3), Europe (4), Asia (5), and Oceania (6). |
| Dataset_Code | Assigned dataset number: 1, 2, 3, 4, 5, 902, 999, 700, 701, 702, 703, 704, 705, 706, and 707. Numbers correspond to each Dataset as follows: STEP, CLUSTERING, LCMAP, ABoVE, MapBiomas, Feedback, Training_augment, MODIS_algo, GeoWiki, RadEarth, Collaborator_data, BU_team_collected, GLC30, LUCAS, ASB_crop. For details see Scientific Data publication. |
| Glance_ID | Unique ID for each sample. |
| ID | ID for each unique combination of latitude and longitude. Change units have the same ID but different Glance_ID. |
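
The coded attributes are easier to work with once they are decoded. The sketch below pairs the Dataset_Code numbers with the dataset names in the order the table lists them, and checks whether a unit's segment covers a given year. The helper names `datasetName` and `segmentCoversYear` are hypothetical and not part of the dataset or of any library; the units are assumed to be plain objects keyed by the column names above.

```javascript
// Dataset_Code values and names, paired in the order given in the attribute table.
var DATASET_CODES = [1, 2, 3, 4, 5, 902, 999, 700, 701, 702, 703, 704, 705, 706, 707];
var DATASET_NAMES = [
  'STEP', 'CLUSTERING', 'LCMAP', 'ABoVE', 'MapBiomas', 'Feedback',
  'Training_augment', 'MODIS_algo', 'GeoWiki', 'RadEarth', 'Collaborator_data',
  'BU_team_collected', 'GLC30', 'LUCAS', 'ASB_crop'
];

// Build a code -> name lookup object.
var DATASET_BY_CODE = {};
DATASET_CODES.forEach(function (code, i) {
  DATASET_BY_CODE[code] = DATASET_NAMES[i];
});

// Hypothetical helper: source dataset name for a code, or null if the code is unknown.
function datasetName(code) {
  return DATASET_BY_CODE.hasOwnProperty(code) ? DATASET_BY_CODE[code] : null;
}

// Hypothetical helper: true when the segment [Start_Year, End_Year] includes `year`.
function segmentCoversYear(unit, year) {
  return unit.Start_Year <= year && year <= unit.End_Year;
}
```

For example, `datasetName(902)` resolves to the Feedback dataset, and a unit with Start_Year 1990 and End_Year 2005 covers the year 2000.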

Land cover classes

| Level 1 | Level 2 | Description |
| --- | --- | --- |
| Water (1) | Water (1) | Areas covered with water throughout the year: streams, canals, lakes, reservoirs, oceans. |
| Ice/snow (2) | Ice/snow (2) | Land areas with snow and ice cover greater than 50% throughout the year. |
| Developed (3) | Developed (3) | Areas of intensive use; land covered with structures, including any land functionally related to developed/built-up activity. |
| Barren/sparsely vegetated (4) | | Land comprised of natural occurrences of soils, sand, or rocks where less than 10% of the area is vegetated. |
| | Soil (4) | Land covered with less than 10% vegetation and dominated by soil. |
| | Rock (5) | Land covered with less than 10% vegetation and dominated by rocks. |
| | Beach/sand (6) | Land covered with less than 10% vegetation and dominated by beach/sand. |
| Trees (5) | | Land where tree cover is greater than 30%. Note that cleared trees (i.e., clear-cuts) are mapped according to current cover (e.g., barren/sparsely vegetated, shrubs, or herbaceous). |
| | Deciduous (7) | Land with tree cover greater than 30% and all trees present are deciduous. |
| | Evergreen (8) | Land with tree cover greater than 30% and all trees present are evergreen. |
| | Mixed (9) | Land with tree cover greater than 30% and neither deciduous nor evergreen trees dominate. |
| Shrub (6) | Shrub (10) | Land with less than 30% tree cover, where total vegetation cover exceeds 10% and shrub cover is greater than 10%. |
| Herbaceous (7) | | Land covered by herbaceous plants. Total vegetation cover exceeds 10%, tree cover is less than 30%, and shrubs comprise less than 10% of the area. |
| | Grassland (11) | Herbaceous land covered with grass. |
| | Agriculture (12) | Herbaceous land covered with cultivated cropland. |
| | Moss/lichen (13) | Herbaceous land covered with lichen and/or moss. |
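
Because the Level 2 classes nest inside the Level 1 classes, the hierarchy can be written as a lookup and reused when collapsing the legend. The following is a minimal sketch: the two lookup objects are transcribed from the class table, while `level1FromLevel2`, `CUSTOM_LEGEND`, and `customLabel` are hypothetical names chosen for illustration, and the three-group legend is only an example of a user-defined legend.

```javascript
// Level 1 class names keyed by Glance_Class_ID_level1.
var LEVEL1_NAMES = {
  1: 'Water', 2: 'Ice/snow', 3: 'Developed', 4: 'Barren/sparsely vegetated',
  5: 'Trees', 6: 'Shrub', 7: 'Herbaceous'
};

// Parent Level 1 ID for each Glance_Class_ID_level2 value.
var LEVEL2_TO_LEVEL1 = {
  1: 1, 2: 2, 3: 3,       // Water, Ice/snow, Developed
  4: 4, 5: 4, 6: 4,       // Soil, Rock, Beach/sand -> Barren/sparsely vegetated
  7: 5, 8: 5, 9: 5,       // Deciduous, Evergreen, Mixed -> Trees
  10: 6,                  // Shrub
  11: 7, 12: 7, 13: 7     // Grassland, Agriculture, Moss/lichen -> Herbaceous
};

// Hypothetical helper: Level 1 ID implied by a Level 2 ID, or null when the
// Level 2 value is missing (NaN) or not part of the legend.
function level1FromLevel2(level2) {
  return LEVEL2_TO_LEVEL1.hasOwnProperty(level2) ? LEVEL2_TO_LEVEL1[level2] : null;
}

// Hypothetical custom legend: keep water and forest, group everything else.
var CUSTOM_LEGEND = {1: 'water', 5: 'forest'};

function customLabel(level1) {
  return CUSTOM_LEGEND.hasOwnProperty(level1) ? CUSTOM_LEGEND[level1] : 'other';
}
```

Returning null for a missing Level 2 value keeps the NaN entries noted in the attribute table from being silently assigned to a class.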

Paper citation

Stanimirova R., Tarrio K., Turlej K., McAvoy K., Stonebrook S., Hu K-T., Arévalo P., Bullock E.L., Zhang Y., Woodcock C.E., Olofsson P., Zhu Z.,
Barber C.P., Souza C., Chen S., Wang J.A., Mensah F., Calderón-Loor M., Hadjikakou M., Bryan B.A., Graesser J., Beyene D.L., Mutasha B., Siame S.,
Siampale A., and M.A. Friedl (2023) A global land cover training dataset from 1984 to 2020. Sci Data 10, 879
https://doi.org/10.1038/s41597-023-02798-5

Dataset citation

Stanimirova R., Tarrio K., Turlej K., McAvoy K., Stonebrook S., Hu K-T., Arévalo P., Bullock E.L., Zhang Y., Woodcock C.E., Olofsson P., Zhu Z.,
Barber C.P., Souza C., Chen S., Wang J.A., Mensah F., Calderón-Loor M., Hadjikakou M., Bryan B.A., Graesser J., Beyene D.L., Mutasha B., Siame S.,
Siampale A., and M.A. Friedl (2023) "A Global Land Cover Training Dataset from 1984 to 2020", Version 1.0, Radiant MLHub. [Date Accessed]
https://doi.org/10.34911/rdnt.x4xfh3

Code

// Load the GLanCE training data as an Earth Engine FeatureCollection.
var glance_training = ee.FeatureCollection("projects/sat-io/open-datasets/GLANCE/GLANCE_TRAINING_DATA_V1");
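
Once the collection is loaded, it can be subset by attribute with standard Earth Engine filters. The sketch below selects Trees units (Level 1 value 5) with no recorded land cover change whose segment covers a target year. It assumes the asset's property names match the attribute table exactly, which should be confirmed by inspecting a feature first. The function name `selectUnchangedTrees` is hypothetical; `ee` is passed in as an argument only so the same function works in the Code Editor (where `ee` is global) and with the Node.js client.

```javascript
// Sketch: filter a GLanCE FeatureCollection to unchanged Trees units
// whose time segment includes `year`.
function selectUnchangedTrees(ee, collection, year) {
  var filter = ee.Filter.and(
    ee.Filter.eq('Glance_Class_ID_level1', 5),  // 5 = Trees
    ee.Filter.eq('Change', 0),                  // 0 = no Level 1 change recorded
    ee.Filter.lte('Start_Year', year),          // segment starts on or before `year`
    ee.Filter.gte('End_Year', year)             // segment ends on or after `year`
  );
  return collection.filter(filter);
}

// Code Editor usage, with the glance_training collection loaded above:
// var trees2010 = selectUnchangedTrees(ee, glance_training, 2010);
// print(trees2010.size());
```

The filters are only evaluated server-side when a result is requested, for example by `print` or an export.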

Code link

https://code.earthengine.google.com/?scriptPath=users/sat-io/awesome-gee-catalog-examples:global-landuse-landcover/GLANCE-TRAINING

License

The dataset is provided under a Creative Commons Attribution 4.0 International Public License, unless otherwise noted.

Created by: Stanimirova et al, Boston University

Curated in GEE by: Samapriya Roy

Keywords: Glance, LULC, training dataset, Medium resolution, global dataset, land use, land cover

Last updated in GEE: 2024-01-02


Copyright © 2020 - 2024 Samapriya Roy

Training data collection


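GLanCE training data were collected by a team of trained image analysts at Boston University using the land cover key and a suite of online tools (https://github.com/parevalo/measures_collector, which uses the Google Earth Engine API). Image analysts interpreted land cover on-screen using a combination of high-resolution Google Earth imagery, Landsat imagery, time series of spectral reflectance, vegetation indices, and Landsat-derived Tasseled Cap transformations. In addition, image analysts used Google Earth photos and Street View (where available) to aid their interpretations. Hereafter we refer to each entry in the database, which represents an individual Landsat pixel, as a training unit. Each training unit corresponds to one image analyst's interpretation of a Continuous Change Detection and Classification (CCDC) (https://developers.google.com/earth-engine/datasets/catalog/GOOGLE_GLOBAL_CCDC_V1) time segment (explained below) (Fig. 1). Each unit was assessed for quality and potentially flagged for review by a second image analyst. If the image analysts disagreed on the land cover label for a given training unit, it was reviewed by a third team member and then re-interpreted or removed as necessary. If no high-resolution imagery was available and team members could not determine the land cover with high confidence, the unit was removed.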

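Image analysts interpreted training units from three sources: (1) the System for Terrestrial Ecosystem Parameterization (STEP) training database [18]; (2) a sample generated by unsupervised clustering of Landsat spectral-temporal features; and (3) a sample of feedback training units, generated to improve the accuracy of land cover classes or regions that were persistently misclassified. The STEP database was designed to provide a representative sample of land cover from all global ecoregions and was created to support the MODIS Collection 6 Land Cover Type Product (MCD12Q1) [18]. To adapt the STEP database for GLanCE, we randomly selected 10 Landsat pixels within each MCD12Q1 500 m pixel, visually interpreted them using the procedure described above, and labeled them using the GLanCE land cover key (Fig. 2, Table 2). Only training units representing homogeneous land cover were collected as training data; units containing mixed land cover were removed.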
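
The STEP adaptation step, drawing 10 Landsat pixels at random inside one coarser pixel, can be illustrated with sampling without replacement. This is only a sketch and not the authors' implementation: it treats the 500 m pixel as a square grid of 30 m cells, and the grid size of 16 cells per side used in the usage note is an assumption (500 / 30 rounded down) that ignores the actual MODIS and Landsat grid alignment. The function name `samplePixelOffsets` is hypothetical.

```javascript
// Sketch: pick `count` distinct cells from a pixelsPerSide x pixelsPerSide grid
// using a partial Fisher-Yates shuffle. `rng` is optional and defaults to Math.random.
function samplePixelOffsets(pixelsPerSide, count, rng) {
  rng = rng || Math.random;
  var total = pixelsPerSide * pixelsPerSide;
  if (count > total) {
    throw new Error('count exceeds the number of cells in the grid');
  }

  // Flat list of cell indices 0 .. total - 1.
  var indices = [];
  for (var i = 0; i < total; i++) {
    indices.push(i);
  }

  // Shuffle only the first `count` positions; each swap draws from the
  // cells that have not been chosen yet, so no cell is picked twice.
  for (var k = 0; k < count; k++) {
    var j = k + Math.floor(rng() * (total - k));
    var tmp = indices[k];
    indices[k] = indices[j];
    indices[j] = tmp;
  }

  // Convert flat indices to row/column offsets within the coarse pixel.
  return indices.slice(0, count).map(function (idx) {
    return {row: Math.floor(idx / pixelsPerSide), col: idx % pixelsPerSide};
  });
}
```

Calling `samplePixelOffsets(16, 10)` returns 10 distinct row/column offsets. In the workflow described above, each selected pixel would then be visually interpreted and kept only if it represents homogeneous land cover.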

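Figure caption: Global spatial distribution of GLanCE training units for Level 1 (a) and Level 2 (b) land cover. Note that the overall density of training units is lower in Africa and Asia, so the "Developed" class in (a) is overrepresented on this map.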

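To ensure transparency and reproducibility, we used open-source tools, including R (4.3.0), Python 3.6.7, and Google Earth Engine. The time series tools used to collect training data are available on GitHub (https://github.com/parevalo/measures_collector), as is the repository used to filter the training data (https://github.com/ma-friedl/GlanceFiltering). Custom continent definitions are available in this repository: https://measures-glance.github.io/glance-grids/params. Continuous Change Detection and Classification (CCDC) tools and applications are available on Google Earth Engine (https://glance.earthengine.app/view/fulltstools) and in Python (https://github.com/repository-preservation/lcmap-pyccd).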
