Installing Hive and Configuring Its Integration with Spark

Source: repost


This post walks through the basics of running Spark SQL with Hive as the metastore, covering:


1. Installing Hive


2. Integrating Spark with Hive


3. Spark SQL operations


Note: Hadoop and Spark must already be installed before following along.


For Hadoop installation, see: https://my.oschina.net/u/729917/blog/1556872


For Spark installation, see: https://my.oschina.net/u/729917/blog/1556871


1. Installing Hive


a) Install the MySQL database; instructions for your platform are widely available online.
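For reference, a minimal sketch of this step on Ubuntu (the apt commands and the root account here are assumptions; adapt to your distribution):

sudo apt-get update
sudo apt-get install -y mysql-server      # install the MySQL server package
mysql -u root -p -e "SELECT VERSION();"   # confirm the server is up and reachable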


b) Download Hive from a mirror (http://mirror.bit.edu.cn/apache/hive/); in this walkthrough the tarball is saved to /home/hadoop/tools/apache-hive-2.2.0-bin.tar.gz.


c) Move the tarball to the install directory and extract it. Here it is extracted under /usr/local/, the same directory where Hadoop and Spark are installed.


sudo mv /home/hadoop/tools/apache-hive-2.2.0-bin.tar.gz /usr/local/
cd /usr/local
sudo tar -zxvf apache-hive-2.2.0-bin.tar.gz

d) Configure environment variables


vim ~/.bashrc
export HIVE_HOME=/usr/local/apache-hive-2.2.0-bin
export PATH=$PATH:${HIVE_HOME}/bin

Make the environment variables take effect:


source ~/.bashrc
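A quick sanity check that the variables are in effect:

echo $HIVE_HOME   # expect /usr/local/apache-hive-2.2.0-bin
which hive        # expect /usr/local/apache-hive-2.2.0-bin/bin/hive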

e) Create a hive-site.xml under the conf directory to configure Hive to store its metadata in MySQL.


hadoop@Master:/usr/local/apache-hive-2.2.0-bin/conf$ touch hive-site.xml

The contents of hive-site.xml:


<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>0000</value>
  </property>
  <property>
    <name>hive.metastore.schema.verification</name>
    <value>false</value>
    <description>
      Enforce metastore schema version consistency.
      True: Verify that version information stored in metastore matches with one from Hive jars. Also disable automatic
      schema migration attempt. Users are required to manually migrate schema after Hive upgrade which ensures
      proper metastore schema migration. (Default)
      False: Warn if the version information stored in metastore doesn't match with one from in Hive jars.
    </description>
  </property>
</configuration>
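Two additional steps are usually needed before the first start with a MySQL-backed metastore: put the MySQL JDBC driver on Hive's classpath, and initialize the metastore schema with schematool. The connector jar path and version below are examples; use the jar matching your MySQL install.

sudo cp /home/hadoop/tools/mysql-connector-java-5.1.44.jar /usr/local/apache-hive-2.2.0-bin/lib/   # example jar name/path
schematool -dbType mysql -initSchema   # creates the metastore tables in the "hive" database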

f) Start Hive by simply running the hive command.


A successful start looks like this:


hadoop@Master:~$ hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/apache-hive-2.2.0-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Logging initialized using configuration in jar:file:/usr/local/apache-hive-2.2.0-bin/lib/hive-common-2.2.0.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive>
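As a quick smoke test at the hive> prompt (the table name is arbitrary):

hive> CREATE TABLE test_hive (id INT, name STRING);
hive> SHOW TABLES;
hive> DROP TABLE test_hive;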

2. Integrating Spark with Hive


a) Create a hive-site.xml file under Spark's conf directory


hadoop@Master:/usr/local/spark-2.2.0-bin-hadoop2.7/conf$ touch hive-site.xml


<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://Master:9083</value>
  </property>
</configuration>

b) Start Hadoop and Spark, as shown below.
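For example, with the install paths used in this post (assuming Spark runs in standalone mode):

/usr/local/hadoop-2.7.3/sbin/start-dfs.sh                # HDFS
/usr/local/hadoop-2.7.3/sbin/start-yarn.sh               # YARN
/usr/local/spark-2.2.0-bin-hadoop2.7/sbin/start-all.sh   # Spark master and workers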


c) Start the Hive metastore service


hadoop@Master:/usr/local/spark-2.2.0-bin-hadoop2.7/bin$ hive --service metastore &
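The metastore listens on port 9083 by default, matching the thrift://Master:9083 URI configured above. One way to verify it is up:

netstat -tlnp 2>/dev/null | grep 9083   # should show a LISTEN entry for the metastore JVM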

d) Launch spark-sql to test the integration


hadoop@Master:/usr/local/spark-2.2.0-bin-hadoop2.7/bin$ ./spark-sql

Partial output after a successful start:


17/11/19 21:50:37 INFO SessionState: Created HDFS directory: /tmp/hive/hadoop/ce95f463-74ca-42de-ac85-3a283aa1520a
17/11/19 21:50:37 INFO SessionState: Created local directory: /tmp/hadoop/ce95f463-74ca-42de-ac85-3a283aa1520a
17/11/19 21:50:37 INFO SessionState: Created HDFS directory: /tmp/hive/hadoop/ce95f463-74ca-42de-ac85-3a283aa1520a/_tmp_space.db
17/11/19 21:50:37 INFO HiveClientImpl: Warehouse location for Hive client (version 1.2.1) is file:/usr/local/spark-2.2.0-bin-hadoop2.7/bin/spark-warehouse
17/11/19 21:50:37 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
17/11/19 21:50:38 INFO SessionState: Created local directory: /tmp/2110b645-b83e-4b65-87a8-5e9f1482699e_resources
17/11/19 21:50:38 INFO SessionState: Created HDFS directory: /tmp/hive/hadoop/2110b645-b83e-4b65-87a8-5e9f1482699e
17/11/19 21:50:38 INFO SessionState: Created local directory: /tmp/hadoop/2110b645-b83e-4b65-87a8-5e9f1482699e
17/11/19 21:50:38 INFO SessionState: Created HDFS directory: /tmp/hive/hadoop/2110b645-b83e-4b65-87a8-5e9f1482699e/_tmp_space.db
17/11/19 21:50:38 INFO HiveClientImpl: Warehouse location for Hive client (version 1.2.1) is file:/usr/local/spark-2.2.0-bin-hadoop2.7/bin/spark-warehouse
spark-sql>
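Because Hive and spark-sql now share one metastore, tables created in either are visible to the other. A few example statements at the spark-sql> prompt (database and table names are arbitrary):

spark-sql> SHOW DATABASES;
spark-sql> CREATE TABLE src (key INT, value STRING);
spark-sql> INSERT INTO src VALUES (1, 'one');
spark-sql> SELECT * FROM src;
spark-sql> DROP TABLE src;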
