华为云下安装部署Hive环境
在安装hive之前,hadoop 2.8,MySql 8.0已部署完毕,hadoop 安装目录为/opt/hadoop2.8.0
一、下载解压
到apache hive 官网 下载http://hive.apache.org/
本次安装下载是Hive-2.3.3版本到/opt目录下,解压:
tar xzvf apache-hive-2.3.3-bin.tar.gz.
mv apache-hive-2.3.3-bin
二、 配置环境变量
在/etc/profile文件中加入HIVE_HOME变量,
export HIVE_HOME=/opt/hive-2.3.3
在PATH中加入$HIVE_HOME/bin
source /etc/profile
三、配置Hive
cd /opt/hive-2.3.3/conf
cp hive-env.sh.template hive-env.sh
加入HADOOP_HOME变量
HADOOP_HOME=/opt/hadoop-2.8.0
拷贝hive-site配置文件
cp hive-default.xml.template hive-site.xml
下面分别介绍hive的三种安装模式
3.1 单用户模式
即元数据使用Derby存储的模式
直接在命令行下输入hive,出现以下错误信息:
Exception in thread "main"java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative pathin absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D
在文件hive-site.xml中配置以下两个参数的路径解决此问题:
hive.exec.local.scratchdir
hive.downloaded.resources.dir
这两个参数是Hive的缓存目录,如果不配置,使用Hive时将报错,配置之后即单用户模式
mkdir /opt/hive-2.3.3/iotmp
配置两个参数路径为:/opt/hive-2.3.3/iotmp
... ...
<property>
<name>hive.exec.local.scratchdir</name>
<value>/opt/hive-2.3.3/iotmp</value>
<description>Local scratch space for Hive jobs</description>
</property>
<property>
<name>hive.downloaded.resources.dir</name>
<value>/opt/hive-2.3.3/iotmp</value>
... ...
输入hive,成功进入提示符hive>,输入简单cli命令 show tables
hive>SHOW TABLES;
FAILED: SemanticException org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
原因是没有初始化derby database
[root@cdm3 bin]# schematool -initSchema -dbType derby
.....
Initialization script hive-schema-2.3.0.derby.sql
Error: FUNCTION 'NUCLEUS_ASCII' already exists. (state=X0Y68,code=30000)
org.apache.hadoop.hive.metastore.HiveMetaException: Schema initialization FAILED! Metastore state would be inconsistent !!
Underlying cause: java.io.IOException : Schema script failed, errorcode 2
......
后来在hive官网查到解决办法:
1、首次运行hive,可以用下面命令初始化数据库;
schematool -initSchema -dbType derby
2、如果已经运行过了hive,试图使用 initSchema命令初始化将会报错 ,需要将当前目录下的metastore_db更名或删除;
mv metastore_db metastore_db.tmp
3、重新运行
schematool -initSchema -dbType derby
这时可以看到,在/opt/hive-2.3.3/bin/下又新建了一个目录,metastore_db
再次执行
hive> show tables;
OK
Time taken: 7.539 seconds
3.2、多用户模式
即元数据使用非Derby,而使用其他的关系型数据库存储(例如MySQL、Oracle等),本次使用Mysql
Hive的元数据:表信息,表属性,分区,列等等信息,Owner
Hive的实际数据:在HDFS上
多用户模式的Hive-site.xml配置,添加4个配置项即可
1. javax.jdo.option.ConnectionURL(jdbc:mysql://centos:3306/hive?createDatabaseIfNotExist=true),指定的是Hive连接数据库的连接字符串;
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://172.16.248.8:3306/hive?createDatabaseIfNotExist=true</value>
<description>
JDBC connect string for a JDBC metastore.
To use SSL to encrypt/authenticate the connection, provide database-specific SSL flag in the connection URL.
For example, jdbc:postgresql://myhost/db?ssl=true for postgres database.
</description>
</property>
2. javax.jdo.option.ConnectionDriver(com.mysql.jdbc.Driver),指定的是MySQL驱动类的入口;
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
3. javax.jdo.option.ConnectionUserName(hive),指定了数据库的用户名;
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
<description>Username to use against metastore database</description>
</property>
4. javax.jdo.option.ConnectionPassword,指定了数据库的密码;
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>H*******</value>
<description>password to use against metastore database</description>
</property>
运行hive,出现如下错误信息:
Caused by:org.datanucleus.exceptions.NucleusException: Attempt to invoke the"BONECP" plugin to create a ConnectionPool gave an error : Thespecified datastore driver ("com.mysql.jdbc.Driver") was not found inthe CLASSPATH. Please check your CLASSPATH specification, and the name of thedriver.
此问题是因为缺少数据库对应的驱动包造成的,将MySQL数据库的驱动包拷贝到$HIVE_HOME/lib/下
cp mysql-connector….jar $HIVE_HOME/lib/
cp mysql-connector-java-8.0.12.jar $HIVE_HOME/lib/
可能遇到的ERROR3:
Caused by:org.apache.hadoop.hive.ql.metadata.HiveException:org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Hivemetastore database is not initialized. Please use schematool (e.g. ./schematool-initSchema -dbType ...) to create the schema. If needed, don't forget toinclude the option to auto-create the underlying database in your JDBCconnection string (e.g. ?createDatabaseIfNotExist=true for mysql))
此问题是因为没有初始化元数据库
schematool -initSchema –dbType mysql
错误:Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: Could not crea
te connection to database server.
解决办法:
原来使用的是驱动 mysql-connector-java-5.1.32.jar,mysql 是8.0版本,需要下载
mysql-connector-java-8.0.12.jar
可能遇到的ERROR4:
MySQL远程访问受限,需要授权
grant all privileges on *.* to hive@"%" identified by "Hive@8848" with grant option;
flush privileges;
CREATE USER 'user'@'localhost' IDENTIFIED BY 'password';
GRANT ALL PRIVILEGES ON database.table TO 'user'@'localhost';
在mysql 创建用户hive,密码Hive@8848,给hive用户授权,
mysql 8.0 无法grant 任意主机给hive 用户,直接操作数据库,具体如下:
mysql> use mysql;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> select host,user,authentication_string,plugin from user;
+-----------+------------------+---------------------------------------------------------------
---------+-----------------------+
| host | user | authentication_string
| plugin |
+-----------+------------------+---------------------------------------------------------------
---------+-----------------------+
| % | root | $A$005$qDrc COOsm=P{ MDx{MzxbNEYao2mxJH3E/PLvjJKz2RwAHA8wzCf
AvQ5pvy2 | caching_sha2_password |
| localhost | hive | $A$005$[f;FT`hpMI]6%%>CYxJoojdS7w57bx5ohpWqqXU5dt3qrbLkOuy4onK
3tj2 | caching_sha2_password |
| localhost | mysql.infoschema | $A$005$THISISACOMBINATIONOFINVALIDSALTANDPASSWORDTHATMUSTNEVER
BRBEUSED | caching_sha2_password |
| localhost | mysql.session | $A$005$THISISACOMBINATIONOFINVALIDSALTANDPASSWORDTHATMUSTNEVER
BRBEUSED | caching_sha2_password |
| localhost | mysql.sys | $A$005$THISISACOMBINATIONOFINVALIDSALTANDPASSWORDTHATMUSTNEVER
BRBEUSED | caching_sha2_password |
+-----------+------------------+---------------------------------------------------------------
---------+-----------------------+
5 rows in set (0.00 sec)
mysql> update user set host='%' where user='hive';
schematool -initSchema –dbType mysql
第一次出错
。。。no dbType supplied
。。。
mysql -uhive -p
schematool -initSchema -dbType mysql --verbose
正常
mysql> use hive
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> show tables;
+---------------------------+
| Tables_in_hive |
+---------------------------+
| AUX_TABLE |
| BUCKETING_COLS |
| CDS |
| COLUMNS_V2 |
| COMPACTION_QUEUE |
| COMPLETED_COMPACTIONS |
| COMPLETED_TXN_COMPONENTS |
| DATABASE_PARAMS |
| DBS |
| DB_PRIVS |
| DELEGATION_TOKENS |
| FUNCS |
| FUNC_RU |
| GLOBAL_PRIVS |
| HIVE_LOCKS |
| IDXS |
| INDEX_PARAMS |
| KEY_CONSTRAINTS |
| MASTER_KEYS |
| NEXT_COMPACTION_QUEUE_ID |
| NEXT_LOCK_ID |
| NEXT_TXN_ID |
| NOTIFICATION_LOG |
| NOTIFICATION_SEQUENCE |
| NUCLEUS_TABLES |
| PARTITIONS |
| PARTITION_EVENTS |
| PARTITION_KEYS |
| PARTITION_KEY_VALS |
| PARTITION_PARAMS |
| PART_COL_PRIVS |
| PART_COL_STATS |
| PART_PRIVS |
| ROLES |
| ROLE_MAP |
| SDS |
| SD_PARAMS |
| SEQUENCE_TABLE |
| SERDES |
| SERDE_PARAMS |
| SKEWED_COL_NAMES |
| SKEWED_COL_VALUE_LOC_MAP |
| SKEWED_STRING_LIST |
| SKEWED_STRING_LIST_VALUES |
| SKEWED_VALUES |
| SORT_COLS |
| TABLE_PARAMS |
| TAB_COL_STATS |
| TBLS |
| TBL_COL_PRIVS |
| TBL_PRIVS |
| TXNS |
| TXN_COMPONENTS |
| TYPES |
| TYPE_FIELDS |
| VERSION |
| WRITE_SET |
+---------------------------+
57 rows in set (0.00 sec)
3.3 远程模式
启动metastore server
Hive –service metastore &
- 点赞
- 收藏
- 关注作者
评论(0)