Hadoop - Installing Hive

category DevOps/Hadoop 2021. 12. 24. 15:08

Installing MySQL

1. Install the MySQL package

hadoop@hadoop-001:~$ sudo apt-get update
...
 * Management:     https://landscape.canonical.com
 * Support:        https://ubuntu.com/advantage
84 updates can be applied immediately.

hadoop@hadoop-001:~$ sudo apt-get install mysql-server -y
...
Created symlink /etc/systemd/system/multi-user.target.wants/mysql.service → /lib/systemd/system/mysql.service.
Setting up mysql-server (8.0.27-0ubuntu0.20.04.1) ...
Processing triggers for systemd (245.4-4ubuntu3.11) ...
Processing triggers for man-db (2.9.1-1) ...
Processing triggers for libc-bin (2.31-0ubuntu9.2) ...

2. Reset the MySQL root password

# 1. Stop the mysqld daemon, then start it through mysqld_safe with grant tables skipped
root@hadoop-001:~# sudo systemctl stop mysql
root@hadoop-001:~# sudo mkdir -p /var/run/mysqld && sudo chown -R mysql:mysql /var/run/mysqld && sudo /usr/bin/mysqld_safe --skip-grant-tables &

# 2. Connect to MySQL
root@hadoop-001:~# mysql

# 3. Set the root password to empty
mysql> UPDATE mysql.user SET authentication_string=null WHERE user='root';
Query OK, 1 row affected (0.02 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> FLUSH PRIVILEGES;
Query OK, 0 rows affected (0.01 sec)

# 4. Restart the mysqld daemon (if the mysqld_safe instance is still running, shut it down first)
root@hadoop-001:~# sudo systemctl start mysql

# 5. Change the root password
root@hadoop-001:~# mysql -uroot -p
# Just press Enter
Enter password: 

mysql> ALTER USER 'root'@'localhost' IDENTIFIED BY '1111';
Query OK, 0 rows affected (0.10 sec)

mysql> FLUSH PRIVILEGES;
Query OK, 0 rows affected (0.09 sec)

# 6. Verify the root password
root@hadoop-001:~# mysql -uroot -p
Enter password: 1111

3. Create the MySQL hive user

hadoop@hadoop-001:~$ mysql -uroot -p
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 306
Server version: 8.0.27-0ubuntu0.20.04.1 (Ubuntu)

Copyright (c) 2000, 2021, Oracle and/or its affiliates.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> CREATE USER 'hive'@'%' IDENTIFIED BY '1234';
Query OK, 0 rows affected (0.01 sec)

mysql> GRANT ALL PRIVILEGES ON metastore.* TO 'hive'@'%';
Query OK, 0 rows affected (0.10 sec)

mysql> 
mysql> FLUSH PRIVILEGES;
Query OK, 0 rows affected (0.10 sec)

mysql> exit
Bye
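
The account setup above can also be kept in a script so that it is repeatable. The following is only a sketch: the temporary file and the IF NOT EXISTS guard are additions that do not appear in the session above, and the password is the same sample value. FLUSH PRIVILEGES is left out because GRANT takes effect without it.

```shell
# Write the statements from the session above into a temporary SQL file.
# The file name and the IF NOT EXISTS guard are assumptions for this sketch.
sql_file="$(mktemp)"
cat > "$sql_file" <<'SQL'
CREATE USER IF NOT EXISTS 'hive'@'%' IDENTIFIED BY '1234';
GRANT ALL PRIVILEGES ON metastore.* TO 'hive'@'%';
SHOW GRANTS FOR 'hive'@'%';
SQL

# Apply it as root (prompts for the root password):
#   mysql -uroot -p < "$sql_file"
```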

Installing Hive

1. Download and extract

hadoop@hadoop-001:~$ sudo curl -o /usr/local/apache-hive-3.1.2-bin.tar.gz https://dlcdn.apache.org/hive/hive-3.1.2/apache-hive-3.1.2-bin.tar.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  265M  100  265M    0     0  8415k      0  0:00:32  0:00:32 --:--:-- 8583k

hadoop@hadoop-001:~$ sudo mkdir -p /usr/local/hive && sudo tar -xvzf /usr/local/apache-hive-3.1.2-bin.tar.gz -C /usr/local/hive --strip-components 1
...
apache-hive-3.1.2-bin/hcatalog/share/webhcat/svr/lib/commons-exec-1.1.jar
apache-hive-3.1.2-bin/hcatalog/share/webhcat/svr/lib/jul-to-slf4j-1.7.10.jar
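
Before extracting, it may be worth checking the archive against the checksum that Apache publishes for the release. The helper below is a hypothetical sketch that assumes sha256sum is available (it is part of coreutils on Ubuntu). The expected digest is deliberately left as a placeholder and must be taken from the official download page.

```shell
# verify_sha256 FILE EXPECTED_HEX
# Hypothetical helper: succeeds only if FILE's SHA-256 digest equals EXPECTED_HEX.
verify_sha256() {
  actual="$(sha256sum "$1" | awk '{print $1}')"
  [ "$actual" = "$2" ]
}

# Example call; replace the placeholder with the digest published by Apache:
#   verify_sha256 /usr/local/apache-hive-3.1.2-bin.tar.gz "<published digest>" \
#     && echo "checksum OK"
```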

2. Set environment variables

hadoop@hadoop-001:~$ sudo vi /etc/profile
# hadoop cluster
JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
HADOOP_HOME=/usr/local/hadoop
HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
HIVE_HOME=/usr/local/hive

export JAVA_HOME HADOOP_HOME HADOOP_CONF_DIR HIVE_HOME
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HADOOP_HOME/lib/native
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HIVE_HOME/bin

hadoop@hadoop-001:~$ vi ~/.bashrc
# Apply the same settings as in /etc/profile
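
Because the same lines go into both /etc/profile and ~/.bashrc, a shell that reads both files may end up with the directories listed twice in PATH. A minimal sketch of a guard follows; the add_to_path function is a hypothetical helper, not something from the original setup.

```shell
# add_to_path DIR: append DIR to PATH unless it is already listed.
add_to_path() {
  case ":$PATH:" in
    *":$1:"*) ;;                 # already present, do nothing
    *) PATH="$PATH:$1" ;;        # not present, append once
  esac
}

HIVE_HOME=/usr/local/hive
add_to_path "$HIVE_HOME/bin"
add_to_path "$HIVE_HOME/bin"     # second call is a no-op
export HIVE_HOME PATH
```

The same function could be called for $JAVA_HOME/bin, $HADOOP_HOME/bin and $HADOOP_HOME/sbin.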

3. Create HDFS directories and assign permissions

hadoop@hadoop-001:~$ hdfs dfs -mkdir -p /bigdata/tmp
hadoop@hadoop-001:~$ hdfs dfs -mkdir -p /bigdata/hive/warehouse
hadoop@hadoop-001:~$ hdfs dfs -chmod g+w /bigdata/tmp
hadoop@hadoop-001:~$ hdfs dfs -chmod g+w /bigdata/hive/warehouse
hadoop@hadoop-001:~$ hdfs dfs -ls /bigdata
Found 2 items
drwxr-xr-x   - hadoop supergroup          0 2021-12-24 11:32 /bigdata/hive
drwxrwxr-x   - hadoop supergroup          0 2021-12-24 11:33 /bigdata/tmp
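
The hive-site.xml used later in this post does not set a warehouse or scratch directory, so Hive falls back to its defaults (/user/hive/warehouse and /tmp/hive) unless told otherwise. Assuming the directories created above are meant for those two roles, one way to point a session at them is the --hiveconf option. This is a sketch; the variable names are made up for illustration.

```shell
# Assumption: the HDFS paths created above are meant as warehouse and scratch dirs.
WAREHOUSE_DIR=/bigdata/hive/warehouse
SCRATCH_DIR=/bigdata/tmp

# Build per-session overrides using Hive's --hiveconf option.
HIVE_DIR_OPTS="--hiveconf hive.metastore.warehouse.dir=$WAREHOUSE_DIR"
HIVE_DIR_OPTS="$HIVE_DIR_OPTS --hiveconf hive.exec.scratchdir=$SCRATCH_DIR"
echo "hive $HIVE_DIR_OPTS"

# To start the CLI with these overrides (left unquoted so it splits into words):
#   hive $HIVE_DIR_OPTS
```

Setting the same two properties in hive-site.xml would make the choice permanent.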

4. Configure Hive

hadoop@hadoop-001:~$ vi $HIVE_HOME/conf/hive-site.xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration> 
    <property> 
        <name>javax.jdo.option.ConnectionURL</name> 
        <value>jdbc:mysql://localhost:3306/metastore?createDatabaseIfNotExist=true&amp;serverTimezone=Asia/Seoul</value> 
        <description>metadata is stored in a MySQL server</description> 
    </property> 
    <property> 
        <name>javax.jdo.option.ConnectionDriverName</name> 
        <value>com.mysql.cj.jdbc.Driver</value> 
        <description>MySQL JDBC driver class</description> 
    </property> 
    <property> 
        <name>javax.jdo.option.ConnectionUserName</name> 
        <value>hive</value> 
        <description>user name for connecting to mysql server</description> 
    </property> 
    <property> 
        <name>javax.jdo.option.ConnectionPassword</name> 
        <value>1234</value> 
        <description>password for connecting to mysql server</description>
    </property>    
</configuration>
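
Inside XML, the & that separates JDBC URL parameters has to be written as &amp;, as the ConnectionURL value above does. A bare & makes the file unparseable. The function below is a rough, hypothetical check for that mistake. It is a sketch, not a full XML validator.

```shell
# has_raw_ampersand FILE
# Hypothetical check: succeeds if FILE still contains a bare '&' after all
# well-formed entity references (&amp;, &#38;, &#x26;, ...) are stripped.
has_raw_ampersand() {
  sed -E 's/&(#[0-9]+|#x[0-9A-Fa-f]+|[A-Za-z][A-Za-z0-9]*);//g' "$1" | grep -q '&'
}

# Example:
#   has_raw_ampersand "$HIVE_HOME/conf/hive-site.xml" && echo "unescaped & found"
```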

5. Download and copy the JDBC jar file

hadoop@hadoop-001:~$ curl -o $HIVE_HOME/lib/mysql-connector-java-8.0.22.jar https://repo1.maven.org/maven2/mysql/mysql-connector-java/8.0.22/mysql-connector-java-8.0.22.jar
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 2333k  100 2333k    0     0  1175k      0  0:00:01  0:00:01 --:--:-- 1174k

hadoop@hadoop-001:~$ ls -al $HIVE_HOME/lib/mysql-connector-java-8.0.22.jar
-rw-rw-r-- 1 hadoop hadoop 2389216 Dec 24 12:38 /usr/local/hive/lib/mysql-connector-java-8.0.22.jar
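
To confirm that the downloaded jar really contains the driver class named in hive-site.xml, the archive listing can be searched. This is a sketch with two hypothetical helpers; it assumes the unzip utility is installed.

```shell
# class_to_entry NAME: convert a Java class name to its path inside a jar.
class_to_entry() {
  printf '%s.class' "$(printf '%s' "$1" | tr '.' '/')"
}

# jar_has_class JAR NAME: succeed if the jar lists that class (needs unzip).
jar_has_class() {
  unzip -l "$1" | grep -qF "$(class_to_entry "$2")"
}

# Example:
#   jar_has_class "$HIVE_HOME/lib/mysql-connector-java-8.0.22.jar" \
#     com.mysql.cj.jdbc.Driver && echo "driver class present"
```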

6. Replace the guava-xx.jar file

When Tez is integrated with Hadoop 3 and Hive 3, mismatched library versions cause a NoSuchMethodError.

The cause is that Hadoop 3 bundles guava 27 while Hive 3 bundles guava 19.

Align the two versions as follows.

# Error raised
Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
hadoop@hadoop-001:~$ ls -al $HIVE_HOME/lib/gu*
-rw-r--r-- 1 hadoop hadoop 2308517 Sep 27  2018 /usr/local/hive/lib/guava-19.0.jar
hadoop@hadoop-001:~$ ls -al $HADOOP_HOME/share/hadoop/common/lib/gu*
-rw-r--r-- 1 hadoop hadoop 2747878 Jul  7  2020 /usr/local/hadoop/share/hadoop/common/lib/guava-27.0-jre.jar

hadoop@hadoop-001:~$ sudo rm -rf $HIVE_HOME/lib/guava-19.0.jar
hadoop@hadoop-001:~$ cp $HADOOP_HOME/share/hadoop/common/lib/guava-27.0-jre.jar $HIVE_HOME/lib/

hadoop@hadoop-001:~$ ls -al $HIVE_HOME/lib/gu*
-rw-r--r-- 1 hadoop hadoop 2747878 Dec 24 11:49 /usr/local/hive/lib/guava-27.0-jre.jar
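
The swap above can be written without hard-coding the version numbers, and with the old jar moved aside instead of deleted. The sync_guava function below is a hypothetical helper and only a sketch; the backup directory is an addition that is not part of the original steps.

```shell
# sync_guava HIVE_LIB HADOOP_LIB BACKUP_DIR
# Hypothetical helper: moves Hive's guava jar(s) into BACKUP_DIR, then copies
# Hadoop's guava jar into HIVE_LIB.
sync_guava() {
  mkdir -p "$3" || return 1
  for jar in "$1"/guava-*.jar; do
    if [ -e "$jar" ]; then
      mv "$jar" "$3"/ || return 1
    fi
  done
  cp "$2"/guava-*.jar "$1"/
}

# Example (the backup path is an arbitrary choice outside $HIVE_HOME/lib):
#   sync_guava "$HIVE_HOME/lib" "$HADOOP_HOME/share/hadoop/common/lib" "$HOME/guava-backup"
```

Keeping the backup outside the lib directory means the old jar cannot end up on Hive's classpath again.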

7. Initialize the Metastore database

hadoop@hadoop-001:~$ hive --service schemaTool -dbType mysql -initSchema
Initialization script completed
schemaTool completed

hadoop@hadoop-001:~$ hive --service schemaTool -dbType mysql -info
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL:	 jdbc:mysql://localhost/metastore?createDatabaseIfNotExist=true
Metastore Connection Driver :	 com.mysql.cj.jdbc.Driver
Metastore connection User:	 hive
Hive distribution version:	 3.1.0
Metastore schema version:	 3.1.0
schemaTool completed
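
A script can compare the two version lines of the -info report instead of reading them by eye. The info_field function below is a hypothetical helper that extracts one labelled value from that output. It is a sketch based on the output format shown above.

```shell
# info_field LABEL: read `schemaTool -info` output on stdin, print LABEL's value.
info_field() {
  awk -F':' -v label="$1" '$1 == label { gsub(/[ \t]/, "", $2); print $2 }'
}

# Example:
#   out="$(hive --service schemaTool -dbType mysql -info)"
#   dist="$(printf '%s\n' "$out" | info_field 'Hive distribution version')"
#   schema="$(printf '%s\n' "$out" | info_field 'Metastore schema version')"
#   [ "$dist" = "$schema" ] && echo "schema version matches"
```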

Starting Hive

1. Start Hive

hadoop@hadoop-001:~$ hive 
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Hive Session ID = 65f75d13-7c0a-42ea-9dc6-588a61fc5fd4

Logging initialized using configuration in jar:file:/usr/local/hive/lib/hive-common-3.1.2.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.

2. Try creating a table in Hive

Hive Session ID = 5b23cef4-cb72-4e66-a9f3-968effa5883a
hive> show databases;
OK
default
Time taken: 0.865 seconds, Fetched: 1 row(s)
hive> create table hive_test(hive int, mysql int);
OK
Time taken: 0.949 seconds
hive>
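
The same statements can be run non-interactively by putting them in a file and passing it to hive -f. This is a sketch; the temporary file, the IF NOT EXISTS guard, and the DESCRIBE statement are additions for illustration.

```shell
# Write the HiveQL from the session above into a temporary script file.
hql_file="$(mktemp)"
cat > "$hql_file" <<'HQL'
SHOW DATABASES;
CREATE TABLE IF NOT EXISTS hive_test (hive INT, mysql INT);
DESCRIBE hive_test;
HQL

# Run it with the Hive CLI:
#   hive -f "$hql_file"
```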

3. Confirm in MySQL that the table was registered in the metastore database

hadoop@hadoop-001:~$ mysql -uhive -p
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 127
Server version: 8.0.27-0ubuntu0.20.04.1 (Ubuntu)

Copyright (c) 2000, 2021, Oracle and/or its affiliates.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> use metastore;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed

mysql> SELECT * FROM TBLS;
+--------+-------------+-------+------------------+--------+------------+-----------+-------+-----------+---------------+--------------------+--------------------+----------------------------------------+
| TBL_ID | CREATE_TIME | DB_ID | LAST_ACCESS_TIME | OWNER  | OWNER_TYPE | RETENTION | SD_ID | TBL_NAME  | TBL_TYPE      | VIEW_EXPANDED_TEXT | VIEW_ORIGINAL_TEXT | IS_REWRITE_ENABLED                     |
+--------+-------------+-------+------------------+--------+------------+-----------+-------+-----------+---------------+--------------------+--------------------+----------------------------------------+
|      1 |  1640320480 |     1 |                0 | hadoop | USER       |         0 |     1 | hive_test | MANAGED_TABLE | NULL               | NULL               | 0x00                                   |
+--------+-------------+-------+------------------+--------+------------+-----------+-------+-----------+---------------+--------------------+--------------------+----------------------------------------+
1 row in set (0.00 sec)
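
The CREATE_TIME column holds Unix epoch seconds. To read it as a date, the value from the row above can be converted on the command line. This is a sketch that assumes GNU date, which is what Ubuntu ships; epoch_to_utc is a hypothetical helper name.

```shell
# epoch_to_utc SECONDS: print the UTC timestamp (assumes GNU date's -d @N syntax).
epoch_to_utc() {
  date -u -d "@$1" '+%Y-%m-%d %H:%M:%S'
}

# CREATE_TIME of hive_test, taken from the query result above.
epoch_to_utc 1640320480
```

The result falls on 2021-12-24, which matches the day the table was created in this walkthrough.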