HBase hbck Explained

Posted by FI小粉丝 on 2022/05/17 16:52:07

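HBaseFsck (hbck) is a command-line tool that checks for region consistency and table integrity problems and repairs the damage it finds.

There are currently two versions of the tool, HBCK1 and HBCK2. Their designs differ substantially, and so does the way they are used. Each version works only with its matching kernel version; the two cannot be mixed.

hbck1 (version 6.5.1 and earlier)

hbck1 is mainly used to check or repair HBase 1.x.

Commonly used options: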

-fixAssignments

-fixMeta

-fixHdfsHoles

-fixHdfsOrphans

-fixTableOrphans

-fixHdfsOverlaps

-sidelineBigOverlaps

-maxOverlapsToSideline <N>

-fixReferenceFiles

-repair

-help

Shows the hbck help. Running it prints hbck's repair options together with their descriptions.

Command: hbase hbck -help

Usage: fsck [opts] {only tables}

 where [opts] are:

   -help Display help options (this)

   -details Display full report of all regions.

   -timelag <timeInSeconds>  Process only regions that  have not experienced any metadata updates in the last  <timeInSeconds> seconds.

   -sleepBeforeRerun <timeInSeconds> Sleep this many seconds before checking if the fix worked if run with -fix

   -summary Print only summary of the tables and status.

   -metaonly Only check the state of the hbase:meta table.

   -sidelineDir <hdfs://> HDFS path to backup existing meta.

   -boundaries Verify that regions boundaries are the same between META and store files.

   -exclusive Abort if another hbck is exclusive or fixing.

   -disableBalancer Disable the load balancer.

   -showOfflineRegions Show table regions which are in OFFLINE state in Assignment Manager's memory.


  Metadata Repair options: (expert features, use with caution!)

   -fix              Try to fix region assignments.  This is for backwards compatiblity

   -fixAssignments   Try to fix region assignments.  Replaces the old -fix

   -fixRITAssignment Try to fix region assignments which are in transition from longer duration

   -fixMeta          Try to fix meta problems.  This assumes HDFS region info is good.

   -noHdfsChecking   Don't load/check region info from HDFS. Assumes hbase:meta region info is good. Won't check/fix any HDFS issue, e.g. hole, orphan, or overlap

   -fixHdfsHoles     Try to fix region holes in hdfs.

   -fixHdfsOrphans   Try to fix region dirs with no .regioninfo file in hdfs

   -fixTableOrphans  Try to fix table dirs with no .tableinfo file in hdfs (online mode only)

   -fixHdfsOverlaps  Try to fix region overlaps in hdfs.

   -fixVersionFile   Try to fix missing hbase.version file in hdfs.

   -maxMerge <n>     When fixing region overlaps, allow at most <n> regions to merge. (n=5 by default)

   -sidelineBigOverlaps  When fixing region overlaps, allow to sideline big overlaps

   -maxOverlapsToSideline <n>  When fixing region overlaps, allow at most <n> regions to sideline per group. (n=2 by default)

   -fixSplitParents  Try to force offline split parents to be online.

   -removeParents    Try to offline and sideline lingering parents and keep daughter regions.

   -ignorePreCheckPermission  ignore filesystem permission pre-check

   -fixReferenceFiles  Try to offline lingering reference store files

   -fixEmptyMetaCells  Try to fix hbase:meta entries not referencing any region (empty REGIONINFO_QUALIFIER rows)


  Datafile Repair options: (expert features, use with caution!)

   -checkCorruptHFiles     Check all Hfiles by opening them to make sure they are valid

   -sidelineCorruptHFiles  Quarantine corrupted HFiles.  implies -checkCorruptHFiles


  Metadata Repair shortcuts

   -repair           Shortcut for -fixAssignments -fixMeta -fixHdfsHoles -fixHdfsOrphans -fixHdfsOverlaps -fixVersionFile -sidelineBigOverlaps -fixReferenceFiles -fixTableLocks -fixOrphanedTableZnodes

   -repairHoles      Shortcut for -fixAssignments -fixMeta -fixHdfsHoles


  Table lock options

   -fixTableLocks    Deletes table locks held for a long time (hbase.table.lock.expire.ms, 10min by default)


  Table Znode options

   -fixOrphanedTableZnodes    Set table state in ZNode to disabled if table does not exists


 Replication options

   -fixReplication   Deletes replication queues for removed peers


 Clean Table options

   -cleanUpTables     Deletes the corrupted or normal tables. This command should have atleast one table name as argument and it shouldn't be combined with other commands.

-fixAssignments

When to use:

A region is not online.

hbck error:

ERROR: Region { meta => null, hdfs => hdfs://hacluster/hbase/data/default/xxxxxxxx/xxxxxxxxxxxxx, deployed => , replicaId => 0 } on HDFS, but not listed in hbase:meta or deployed on any region server

Notes:

         Before repairing, check whether the table's regions are already contiguous. If they are, running this repair may introduce overlaps.


-fixMeta

When to use:

         The data in the meta table is abnormal.

hbck error:

         ERROR: Region { meta => XXXXXXXXXXX,4150,1634403123676.xxxxxxxxxxxxxxxxxxx., hdfs => null, deployed => , replicaId => 0 } found in META, but not in HDFS or deployed on any region server.

This option is usually combined with -fixAssignments: hbase hbck -fixMeta -fixAssignments tableName
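The two sample errors above share a `Region { meta => ..., hdfs => ..., deployed => ..., replicaId => N }` prefix, and which field is `null` tells you which side is missing the region. The following is a minimal sketch of pulling those fields out of a line. The regular expression is an assumption derived only from the two samples in this article, and the names `parse_region_error` and `missing_side` are hypothetical helpers, not part of hbck.

```python
import re

# Assumption: the prefix looks exactly like the two sample errors in this
# article; other hbck builds may format it differently.
REGION_RE = re.compile(
    r"Region \{ meta => (?P<meta>.*?), hdfs => (?P<hdfs>.*?), "
    r"deployed => (?P<deployed>.*?), replicaId => (?P<replica>\d+) \}"
)


def parse_region_error(line):
    """Return a dict of the fields in an hbck Region error, or None if no match."""
    m = REGION_RE.search(line)
    if m is None:
        return None

    def clean(value):
        # The samples show the literal "null" or an empty string for a missing field.
        value = value.strip()
        return None if value in ("", "null") else value

    return {
        "meta": clean(m.group("meta")),
        "hdfs": clean(m.group("hdfs")),
        "deployed": clean(m.group("deployed")),
        "replica_id": int(m.group("replica")),
    }


def missing_side(fields):
    """Say which side lacks the region: 'meta', 'hdfs', or None if unclear."""
    if fields["meta"] is None and fields["hdfs"] is not None:
        return "meta"
    if fields["hdfs"] is None and fields["meta"] is not None:
        return "hdfs"
    return None
```

Feeding it the -fixMeta sample line gives a result whose `hdfs` field is `None`, so `missing_side` reports `"hdfs"`; the -fixAssignments sample reports `"meta"`.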


-fixHdfsHoles

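When to use:

         The region chain is not contiguous and has a hole in it (put simply: in the sequence 1, 2, 3, 4, 5, the 3 is suddenly missing).

hbck error:

         ERROR: There is a hole in the region chain between 5980 and 6000.  You need to create a new .regioninfo and region dir in hdfs to plug the hole.

Notes:

         Before repairing, check whether any regions are offline. If so, fix them first with -fixAssignments.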
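To make the idea of a hole in the region chain concrete, here is a minimal sketch that scans a list of (start key, end key) pairs and reports gaps and overlaps. It is an illustration only, not hbck's own algorithm. It assumes keys are byte strings compared byte-wise and that an empty key marks the beginning or end of the table; the function name `find_chain_problems` is hypothetical.

```python
def find_chain_problems(regions):
    """Report holes and overlaps in a list of (start_key, end_key) byte pairs.

    Assumptions for this sketch: an empty start key means "beginning of table",
    an empty end key means "end of table", and keys compare as raw bytes.
    Each problem is a tuple (kind, from_key, to_key).
    """
    problems = []
    covered_end = None  # furthest end key seen so far; b"" means end of table
    for start, end in sorted(regions, key=lambda r: r[0]):
        if covered_end is None:
            if start != b"":
                # The first region does not begin at the start of the table.
                problems.append(("hole", b"", start))
        elif covered_end == b"" or start < covered_end:
            problems.append(("overlap", start, covered_end))
        elif start > covered_end:
            problems.append(("hole", covered_end, start))
        # Extend the covered range if this region reaches further.
        if covered_end is None or (
            covered_end != b"" and (end == b"" or end > covered_end)
        ):
            covered_end = end
    if covered_end is not None and covered_end != b"":
        # The last region does not reach the end of the table.
        problems.append(("hole", covered_end, b""))
    return problems
```

With two regions `(b"", b"5980")` and `(b"6000", b"")`, the function returns a single `("hole", b"5980", b"6000")` entry, which mirrors the sample error above.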

-fixHdfsOrphans

When to use:

         The .regioninfo file is missing.

hbck error:

         ERROR: Orphan region in HDFS: Unable to load .regioninfo from table XXXXXX in hdfs dir hdfs://hacluster/hbase/data/default/XXXXXX/xxxxxxxxxxxxxxx!  It may be an invalid format or version file.  Treating as an orphaned regiondir.

Notes:

         Lost blocks or manual deletion can both cause this problem. The repair may need to be run several times.

-fixTableOrphans

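When to use:

         The .tableinfo file is missing.

hbck error:

         TableInfoMissingException: No table descriptor file under hdfs://hacluster/hbase/data/default/XXXXXXX

Notes:

         Lost blocks or manual deletion can both cause this problem. The repair usually fails. In that case, create a table with the same name on a cluster running the same version and copy its .tableinfo file over to recover.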

-fixHdfsOverlaps

When to use:

         Regions overlap one another.

hbck error:

         ERROR: (region XXXXXX,994,1599460846542.xxxxxxxxxx.) Multiple regions have the same startkey: 994

         ERROR: (region XXXXXX,994,1599543035805.xxxxxxxxxx.) Multiple regions have the same startkey: 994

         ERROR: (regions XXXXXX,994,1599543035805.xxxxxxxxxx. and XXXXXX,998,1571500798247.xxxxxxxxxx.) There is an overlap in the region chain.

Notes:

         A single run usually does not fix everything; expect to run the repair several times.

         The repair may affect the service: during the repair, the overlapping regions are taken offline and merged into one region.
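Because the overlap repair often needs several runs, the "run, check, run again" routine can be scripted. The sketch below assumes that the hbck summary contains a line of the form `N inconsistencies detected.`; verify this against the output of your own version before relying on it. The helper names (`count_inconsistencies`, `run_hbck`, `repair_until_clean`) and the default of three attempts are illustrative choices, not part of hbck.

```python
import re
import subprocess

# Assumption: the hbck summary contains a line like "3 inconsistencies detected."
INCONSISTENCY_RE = re.compile(r"^\s*(\d+) inconsistencies detected\.", re.MULTILINE)


def count_inconsistencies(output):
    """Return the inconsistency count from hbck output, or None if absent."""
    m = INCONSISTENCY_RE.search(output)
    return int(m.group(1)) if m else None


def run_hbck(args):
    """Run `hbase hbck` with the given arguments and return its stdout."""
    result = subprocess.run(
        ["hbase", "hbck", *args], capture_output=True, text=True
    )
    return result.stdout


def repair_until_clean(table, max_attempts=3, runner=run_hbck):
    """Re-run -fixHdfsOverlaps until hbck reports zero inconsistencies.

    Returns the number of attempts used, or None if the table is still
    inconsistent (or no summary line was found) after max_attempts.
    """
    for attempt in range(1, max_attempts + 1):
        output = runner(["-fixHdfsOverlaps", table])
        if count_inconsistencies(output) == 0:
            return attempt
    return None
```

The `runner` parameter exists so the loop can be tried against canned output before pointing it at a real cluster. Since each run may take regions offline, keep `max_attempts` small and review the hbck report between runs.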

-sidelineBigOverlaps -maxOverlapsToSideline <N>

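When to use:

         -sidelineBigOverlaps: while fixing overlaps, allows the regions that overlap with the most other regions to be sidelined, that is, left out of the merge. After the repair, the sidelined data can be loaded into the appropriate regions with bulk load.

         -maxOverlapsToSideline: while fixing overlaps, the maximum number of regions per overlap group that may be sidelined.

Command:

         hbase hbck -repair -sidelineBigOverlaps -maxOverlapsToSideline 10 tableName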

        

-fixReferenceFiles

When to use:

         Repairs lingering reference files.

hbck error:

         ERROR: Found lingering reference file hdfs://hacluster/hbase4/data/default/XXXXXX/xxxxxxxxxx/F/3f7f9c436845499eaea025965d6528e3.75b81cdf8bd8e218f581dc274423be1c

-repair

         This option is a shortcut that combines several options:

         -fixAssignments -fixMeta -fixHdfsHoles -fixHdfsOrphans -fixHdfsOverlaps -fixVersionFile -sidelineBigOverlaps -fixReferenceFiles -fixTableLocks -fixOrphanedTableZnodes

         For typical overlap problems, you can take the blunt approach and run -repair directly.
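The pairings between error messages and repair options described in this article can be summarized as a small lookup. The sketch below matches fragments of the sample error messages quoted above; it is not an exhaustive list of hbck errors, and `ERROR_TO_OPTION` and `suggest_options` are hypothetical names introduced for illustration.

```python
# Pairings taken from the sections of this article. Each fragment is part of
# a sample error message quoted there; real output may contain other errors.
ERROR_TO_OPTION = [
    ("on HDFS, but not listed in hbase:meta", "-fixAssignments"),
    ("found in META, but not in HDFS", "-fixMeta"),
    ("There is a hole in the region chain", "-fixHdfsHoles"),
    ("Orphan region in HDFS", "-fixHdfsOrphans"),
    ("No table descriptor file", "-fixTableOrphans"),
    ("Multiple regions have the same startkey", "-fixHdfsOverlaps"),
    ("There is an overlap in the region chain", "-fixHdfsOverlaps"),
    ("Found lingering reference file", "-fixReferenceFiles"),
]


def suggest_options(hbck_output):
    """Return the options this article pairs with the errors found in the output.

    Options appear in first-seen order and are not repeated.
    """
    suggested = []
    for line in hbck_output.splitlines():
        for fragment, option in ERROR_TO_OPTION:
            if fragment in line and option not in suggested:
                suggested.append(option)
    return suggested
```

The result is only a starting point: as the notes in each section say, some repairs should be preceded by others (for example, fixing offline regions before plugging holes).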
