hudi 0.9.0适配hbase 2.2.6

举报
从大数据到人工智能 发表于 2022/03/28 00:29:00 2022/03/28
【摘要】 总览在hudi中,hbase可以作为索引数据的存储,hudi默认使用的hbase版本为1.2.3。在hbase从1.x升级到2.x之后,其api发生了较大的变化,直接修改hudi中hbase的版本是不合适的,即会发生编译错误。本文对部分源码进行修改以使hbase 2.2.6适配hudi 0.9.0 编译报错如果我们直接修改hbase的版本为2.2.6的话,会出现如下编译错误:[ERROR]...

总览

在hudi中,hbase可以作为索引数据的存储,hudi默认使用的hbase版本为1.2.3。

在hbase从1.x升级到2.x之后,其api发生了较大的变化,直接修改hudi中hbase的版本是不合适的,即会发生编译错误。

本文对部分源码进行修改以使hbase 2.2.6适配hudi 0.9.0

编译报错

如果我们直接修改hbase的版本为2.2.6的话,会出现如下编译错误:

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.8.0:compile (default-compile) on project hudi-common: Compilation failure: Compilation failure: 
[ERROR] /root/hudi-0.9.0/hudi-common/src/main/java/org/apache/hudi/common/bootstrap/index/HFileBootstrapIndex.java:[181,34] no suitable method found for createReader(org.apache.hadoop.fs.FileSystem,org.apache.hudi.common.bootstrap.index.HFileBootstrapIndex.HFilePathForReader,org.apache.hadoop.hbase.io.hfile.CacheConfig,org.apache.hadoop.conf.Configuration)
[ERROR]     method org.apache.hadoop.hbase.io.hfile.HFile.createReader(org.apache.hadoop.fs.FileSystem,org.apache.hadoop.fs.Path,org.apache.hadoop.hbase.io.FSDataInputStreamWrapper,long,org.apache.hadoop.hbase.io.hfile.CacheConfig,boolean,org.apache.hadoop.conf.Configuration) is not applicable
[ERROR]       (actual and formal argument lists differ in length)
[ERROR]     method org.apache.hadoop.hbase.io.hfile.HFile.createReader(org.apache.hadoop.fs.FileSystem,org.apache.hadoop.fs.Path,org.apache.hadoop.conf.Configuration) is not applicable
[ERROR]       (actual and formal argument lists differ in length)
[ERROR]     method org.apache.hadoop.hbase.io.hfile.HFile.createReader(org.apache.hadoop.fs.FileSystem,org.apache.hadoop.fs.Path,org.apache.hadoop.hbase.io.hfile.CacheConfig,boolean,org.apache.hadoop.conf.Configuration) is not applicable
[ERROR]       (actual and formal argument lists differ in length)
[ERROR] /root/hudi-0.9.0/hudi-common/src/main/java/org/apache/hudi/common/bootstrap/index/HFileBootstrapIndex.java:[309,93] cannot find symbol
[ERROR]   symbol:   method getKeyValue()
[ERROR]   location: variable scanner of type org.apache.hadoop.hbase.io.hfile.HFileScanner
[ERROR] /root/hudi-0.9.0/hudi-common/src/main/java/org/apache/hudi/common/bootstrap/index/HFileBootstrapIndex.java:[534,51] incompatible types: org.apache.hudi.common.bootstrap.index.HFileBootstrapIndex.HoodieKVComparator cannot be converted to org.apache.hadoop.hbase.CellComparator
[ERROR] /root/hudi-0.9.0/hudi-common/src/main/java/org/apache/hudi/common/bootstrap/index/HFileBootstrapIndex.java:[537,51] incompatible types: org.apache.hudi.common.bootstrap.index.HFileBootstrapIndex.HoodieKVComparator cannot be converted to org.apache.hadoop.hbase.CellComparator
[ERROR] /root/hudi-0.9.0/hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieHFileReader.java:[72,24] no suitable method found for createReader(org.apache.hadoop.fs.FileSystem,org.apache.hadoop.fs.Path,org.apache.hadoop.hbase.io.hfile.CacheConfig,org.apache.hadoop.conf.Configuration)
[ERROR]     method org.apache.hadoop.hbase.io.hfile.HFile.createReader(org.apache.hadoop.fs.FileSystem,org.apache.hadoop.fs.Path,org.apache.hadoop.hbase.io.FSDataInputStreamWrapper,long,org.apache.hadoop.hbase.io.hfile.CacheConfig,boolean,org.apache.hadoop.conf.Configuration) is not applicable
[ERROR]       (actual and formal argument lists differ in length)
[ERROR]     method org.apache.hadoop.hbase.io.hfile.HFile.createReader(org.apache.hadoop.fs.FileSystem,org.apache.hadoop.fs.Path,org.apache.hadoop.conf.Configuration) is not applicable
[ERROR]       (actual and formal argument lists differ in length)
[ERROR]     method org.apache.hadoop.hbase.io.hfile.HFile.createReader(org.apache.hadoop.fs.FileSystem,org.apache.hadoop.fs.Path,org.apache.hadoop.hbase.io.hfile.CacheConfig,boolean,org.apache.hadoop.conf.Configuration) is not applicable
[ERROR]       (actual and formal argument lists differ in length)
[ERROR] /root/hudi-0.9.0/hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieHFileReader.java:[80,24] no suitable method found for createReader(org.apache.hadoop.fs.FileSystem,org.apache.hadoop.fs.Path,org.apache.hadoop.hbase.io.FSDataInputStreamWrapper,int,org.apache.hadoop.hbase.io.hfile.CacheConfig,org.apache.hadoop.conf.Configuration)
[ERROR]     method org.apache.hadoop.hbase.io.hfile.HFile.createReader(org.apache.hadoop.fs.FileSystem,org.apache.hadoop.fs.Path,org.apache.hadoop.hbase.io.FSDataInputStreamWrapper,long,org.apache.hadoop.hbase.io.hfile.CacheConfig,boolean,org.apache.hadoop.conf.Configuration) is not applicable
[ERROR]       (actual and formal argument lists differ in length)
[ERROR]     method org.apache.hadoop.hbase.io.hfile.HFile.createReader(org.apache.hadoop.fs.FileSystem,org.apache.hadoop.fs.Path,org.apache.hadoop.conf.Configuration) is not applicable
[ERROR]       (actual and formal argument lists differ in length)
[ERROR]     method org.apache.hadoop.hbase.io.hfile.HFile.createReader(org.apache.hadoop.fs.FileSystem,org.apache.hadoop.fs.Path,org.apache.hadoop.hbase.io.hfile.CacheConfig,boolean,org.apache.hadoop.conf.Configuration) is not applicable
[ERROR]       (actual and formal argument lists differ in length)
[ERROR] /root/hudi-0.9.0/hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieHFileReader.java:[114,56] incompatible types: org.apache.hadoop.hbase.io.hfile.HFileBlock cannot be converted to java.nio.ByteBuffer
[ERROR] /root/hudi-0.9.0/hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieHFileReader.java:[149,27] cannot find symbol
[ERROR]   symbol:   method getKeyValue()
[ERROR]   location: variable scanner of type org.apache.hadoop.hbase.io.hfile.HFileScanner
[ERROR] /root/hudi-0.9.0/hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieHFileReader.java:[180,54] cannot find symbol
[ERROR]   symbol:   method getKeyValue()
[ERROR]   location: variable scanner of type org.apache.hadoop.hbase.io.hfile.HFileScanner
[ERROR] /root/hudi-0.9.0/hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieHFileReader.java:[200,50] cannot find symbol
[ERROR]   symbol:   method getKeyValue()
[ERROR]   location: variable scanner of type org.apache.hadoop.hbase.io.hfile.HFileScanner
[ERROR] /root/hudi-0.9.0/hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieHFileReader.java:[224,28] cannot find symbol
[ERROR]   symbol:   method getKeyValue()
[ERROR]   location: variable keyScanner of type org.apache.hadoop.hbase.io.hfile.HFileScanner
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :hudi-common

针对上述问题,我们发现主要的兼容性问题有4个:

  1. HFileBootstrapIndex#createReader方法参数问题
  2. HFileScanner#getKeyValue方法在hbase 2.2.6中已经不存在了
  3. HFile#withComparator方法的入参为CellComparator类型,而源码中的则是HoodieKVComparator
  4. HFileReaderImpl#getMetaBlock方法返回参数的变化由ByteBuffer变为HFileBlock

hudi源码修改

针对上述问题,我们进行如下修改:

  1. 查看Hbase源码,我们发现HFileBootstrapIndex#createReader增加了一个类型为boolean的参数primaryReplicaReader,该参数说明如下:
  /**
  * Creates reader with cache configuration disabled
  * @param fs filesystem
  * @param path Path to file to read
  * @return an active Reader instance
  * @throws IOException Will throw a CorruptHFileException
  * (DoNotRetryIOException subtype) if hfile is corrupt/invalid.
  */
  public static Reader createReader(FileSystem fs, Path path, Configuration conf)
      throws IOException {
    // The primaryReplicaReader is mainly used for constructing block cache key, so if we do not use
    // block cache then it is OK to set it as any value. We use true here.
    return createReader(fs, path, CacheConfig.DISABLED, true, conf);
  }

所以我们可以对HFileBootstrapIndex类的182-183行进行如下修改从而解决这个问题(增加了一个true参数):

      HFile.Reader reader = HFile.createReader(fileSystem, new HFilePathForReader(hFilePath),
          new CacheConfig(conf), true, conf);

(在编译中可能会遇到其他相同的createReader参数问题,用上述方法进行修改即可)

  1. HFileScanner#getKeyValue方法在hbase升级之后已经替换为HFileScanner#getCell,具体可参考HFile的提交历史.

所以我们可以对HFileBootstrapIndex类的309行修改为:

          keys.add(converter.apply(getUserKeyFromCellKey(CellUtil.getCellKeyAsString(scanner.getCell()))));

(在编译中可能会遇到其他相同的getKeyValue方法问题,用上述方法进行修改即可)

  1. 在hbase升级之后,我们可以看到HFile#withComparator需要的参数为CellComparator:

所以我们可以通过修改HFileBootstrapIndex的585-586行,使HoodieKVComparator继承CellComparatorImpl

  public static class HoodieKVComparator extends CellComparatorImpl {
  }

(在编译中可能会遇到其他相同问题,用上述方法进行修改即可)

  1. HFileReaderImpl#getMetaBlock方法返回参数具体问题如下:

我们再来看一下这个方法的变更历史:

由此我们知道可以通过修改HoodieHFileReader的115行,改为如下:

      ByteBuff serializedFilter = reader.getMetaBlock(KEY_BLOOM_FILTER_META_BLOCK, false).getBufferWithoutHeader();

(在编译中可能会遇到相同问题,用上述方法进行修改即可)

其他问题

由于hudi 0.9.0的jetty版本和hbase 2.2.6的jetty版本存在冲突,所以我们需要排除掉hbase的jetty,使用hudi要求的jetty版本。
这个问题可能在编译阶段不会报错,但是在运行阶段是会报错的。

那么解决了上述问题之后,就可以让hudi使用hbase 2.2.6啦!

【版权声明】本文为华为云社区用户原创内容,转载时必须标注文章的来源(华为云社区)、文章链接、文章作者等基本信息, 否则作者和本社区有权追究责任。如果您发现本社区中有涉嫌抄袭的内容,欢迎发送邮件进行举报,并提供相关证据,一经查实,本社区将立刻删除涉嫌侵权内容,举报邮箱: cloudbbs@huaweicloud.com
  • 点赞
  • 收藏
  • 关注作者

评论(0

0/1000
抱歉,系统识别当前为高风险访问,暂不支持该操作

全部回复

上滑加载中

设置昵称

在此一键设置昵称,即可参与社区互动!

*长度不超过10个汉字或20个英文字符,设置后3个月内不可修改。

*长度不超过10个汉字或20个英文字符,设置后3个月内不可修改。