Hadoop Cluster Installation (Basic)

category DevOps/Hadoop 2021. 12. 17. 15:57

Installation Plan

  • Build the cluster from one master node and two data nodes
  • Install Hadoop 3.3.0
hadoop-001         hadoop-002          hadoop-003
NameNode           SecondaryNameNode   -
-                  DataNode            DataNode
ResourceManager    NodeManager         NodeManager
JobHistoryServer   -                   -

 

Preliminary Setup

1. Add hosts entries

Add the master node and data nodes to the hosts file on every server.

On physical servers with working DNS this step is unnecessary. I tested with VirtualBox.

# hadoop
20.200.0.110       hadoop-001
20.200.0.102       hadoop-002
20.200.0.103       hadoop-003
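
The same entries can be appended from the command line. The sketch below writes to a scratch file so it is safe to run anywhere; on a real node, point it at /etc/hosts and run as root instead, then confirm resolution with `getent hosts hadoop-002`.

```shell
# Sketch: append the cluster entries (scratch file here; /etc/hosts on a real node).
HOSTS_FILE=$(mktemp)
cat >> "$HOSTS_FILE" <<'EOF'
# hadoop
20.200.0.110       hadoop-001
20.200.0.102       hadoop-002
20.200.0.103       hadoop-003
EOF
# Count the cluster entries just added (expect 3).
grep -c 'hadoop-00' "$HOSTS_FILE"
```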

2. Generate an SSH key pair and set up key-based access

Add the public key to ~/.ssh/authorized_keys on every data node server as well, not only on the master node.

hadoop@hadoop-001:~$ mkdir ~/.ssh/
hadoop@hadoop-001:~$ sudo ssh-keygen -t rsa -f ~/.ssh/id_rsa
Generating public/private rsa key pair.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/hadoop/.ssh/id_rsa
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub
The key fingerprint is:
SHA256:yCGThyuzJOWWysUBF5g+UYmEOrjhk4Gq/kunRTXrA/o hadoop@hadoop-001
The key's randomart image is:
+---[RSA 3072]----+
|oo=+o            |
|.=o. o           |
|= o.= oo         |
|**...*.oo        |
|++@o.oo.S        |
|+O.+o o          |
|o.oo o o         |
|. . =   .        |
|...+.E           |
+----[SHA256]-----+
hadoop@hadoop-001:~$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
hadoop@hadoop-001:~$ chmod 0600 ~/.ssh/authorized_keys
hadoop@hadoop-001:~$ ls -al ~/.ssh/authorized_keys
-rw------- 1 hadoop hadoop 571 Dec 17 14:23 /home/hadoop/.ssh/authorized_keys
hadoop@hadoop-001:~$ cat ~/.ssh/authorized_keys
ssh-rsa AAAAB3NzaC1yc...<omitted>
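
One way to get the key onto the workers is ssh-copy-id, which appends the public key to the remote user's ~/.ssh/authorized_keys. Below is a dry-run sketch: with DRY_RUN set it only prints the commands; unset it on a real cluster where the hadoop-00x names resolve.

```shell
# Distribute the master's public key to every worker in the plan.
DRY_RUN=1
cmds=""
for host in hadoop-002 hadoop-003; do
  c="ssh-copy-id -i $HOME/.ssh/id_rsa.pub hadoop@${host}"
  cmds="${cmds}${c}
"
  # With DRY_RUN set, print the command instead of executing it.
  if [ -n "$DRY_RUN" ]; then echo "$c"; else $c; fi
done
```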

3. Change ownership and permissions of the SSH key files

Because ssh-keygen was run with sudo above, the key files can end up owned by root; reset ownership to the hadoop user and tighten the modes (sshd ignores keys whose files are too permissive).

hadoop@hadoop-001:~$ sudo chown -R hadoop:hadoop ~/.ssh
hadoop@hadoop-001:~$ sudo chmod 600 ~/.ssh/id_rsa
hadoop@hadoop-001:~$ sudo chmod 644 ~/.ssh/id_rsa.pub
hadoop@hadoop-001:~$ sudo chmod 644 ~/.ssh/authorized_keys
hadoop@hadoop-001:~$ sudo chmod 644 ~/.ssh/known_hosts

 

Installation Steps

1. Install OpenJDK 8

hadoop@hadoop-001:~$ sudo apt-get install openjdk-8-jdk -y
[sudo] password for hadoop:
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
  ca-certificates-java fonts-dejavu-extra java-common libatk-wrapper-java libatk-wrapper-java-jni libice-dev libpthread-stubs0-dev libsm-dev libx11-dev libxau-dev libxcb1-dev libxdmcp-dev libxt-dev openjdk-8-jdk-headless openjdk-8-jre
  openjdk-8-jre-headless x11proto-core-dev x11proto-dev xorg-sgml-doctools xtrans-dev
Suggested packages:
  default-jre libice-doc libsm-doc libx11-doc libxcb-doc libxt-doc openjdk-8-demo openjdk-8-source visualvm icedtea-8-plugin fonts-ipafont-gothic fonts-ipafont-mincho fonts-wqy-microhei fonts-wqy-zenhei
The following NEW packages will be installed:
  ca-certificates-java fonts-dejavu-extra java-common libatk-wrapper-java libatk-wrapper-java-jni libice-dev libpthread-stubs0-dev libsm-dev libx11-dev libxau-dev libxcb1-dev libxdmcp-dev libxt-dev openjdk-8-jdk openjdk-8-jdk-headless
  openjdk-8-jre openjdk-8-jre-headless x11proto-core-dev x11proto-dev xorg-sgml-doctools xtrans-dev
0 upgraded, 21 newly installed, 0 to remove and 178 not upgraded.
Need to get 43.4 MB of archives.
After this operation, 162 MB of additional disk space will be used.
Get:1 http://kr.archive.ubuntu.com/ubuntu focal/main amd64 java-common all 0.72 [6,816 B]
Get:2 http://kr.archive.ubuntu.com/ubuntu focal-updates/universe amd64 openjdk-8-jre-headless amd64 8u292-b10-0ubuntu1~20.04 [28.2 MB]
...
Setting up libxt-dev:amd64 (1:1.1.5-1) ...

hadoop@hadoop-001:~$ java -version
openjdk version "1.8.0_292"
OpenJDK Runtime Environment (build 1.8.0_292-8u292-b10-0ubuntu1~20.04-b10)
OpenJDK 64-Bit Server VM (build 25.292-b10, mixed mode)

2. Download Hadoop

hadoop@hadoop-001:~$ sudo curl -o /usr/local/hadoop-3.3.0.tar.gz https://downloads.apache.org/hadoop/common/hadoop-3.3.0/hadoop-3.3.0.tar.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  477M  100  477M    0     0  2796k      0  0:02:54  0:02:54 --:--:-- 4997k

hadoop@hadoop-001:~$ sudo mkdir -p /usr/local/hadoop && sudo tar -xvzf /usr/local/hadoop-3.3.0.tar.gz -C /usr/local/hadoop --strip-components 1
...
hadoop-3.3.0/lib/
hadoop-3.3.0/lib/native/
hadoop-3.3.0/lib/native/libhadoop.a
...


hadoop@hadoop-001:~$ sudo rm -rf /usr/local/hadoop-3.3.0.tar.gz
hadoop@hadoop-001:~$ sudo chown -R $USER:$USER /usr/local/hadoop
hadoop@hadoop-001:~$ ls -al /usr/local/hadoop/
total 116
drwxr-xr-x 10 hadoop hadoop  4096 Dec 17 11:38 .
drwxr-xr-x 11 root   root    4096 Dec 17 11:43 ..
-rw-rw-r--  1 hadoop hadoop 22976 Jul  5  2020 LICENSE-binary
-rw-rw-r--  1 hadoop hadoop 15697 Mar 25  2020 LICENSE.txt
-rw-rw-r--  1 hadoop hadoop 27570 Mar 25  2020 NOTICE-binary
-rw-rw-r--  1 hadoop hadoop  1541 Mar 25  2020 NOTICE.txt
-rw-rw-r--  1 hadoop hadoop   175 Mar 25  2020 README.txt
drwxr-xr-x  2 hadoop hadoop  4096 Jul  7  2020 bin
drwxr-xr-x  3 hadoop hadoop  4096 Jul  7  2020 etc
drwxr-xr-x  2 hadoop hadoop  4096 Jul  7  2020 include
drwxr-xr-x  3 hadoop hadoop  4096 Jul  7  2020 lib
drwxr-xr-x  4 hadoop hadoop  4096 Jul  7  2020 libexec
drwxr-xr-x  2 hadoop hadoop  4096 Jul  7  2020 licenses-binary
drwxr-xr-x  3 hadoop hadoop  4096 Jul  7  2020 sbin
drwxr-xr-x  4 hadoop hadoop  4096 Jul  7  2020 share

3. Register environment variables in /etc/profile

Append the following to /etc/profile; the variables take effect in a new login shell (or immediately after `source /etc/profile`).

hadoop@hadoop-001:~$ sudo vi /etc/profile

# hadoop cluster
JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
HADOOP_HOME=/usr/local/hadoop
HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop

export JAVA_HOME HADOOP_HOME HADOOP_CONF_DIR
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HADOOP_HOME/lib/native
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
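
A quick sanity check that the paths line up as intended (the same assignments as above, repeated so the snippet is self-contained):

```shell
# Same values as registered in /etc/profile above.
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
# The config dir should resolve under the Hadoop install.
echo "$HADOOP_CONF_DIR"
```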

4. Hadoop configuration files

4-1. core-site.xml

$HADOOP_HOME/etc/hadoop/core-site.xml

hadoop@hadoop-001:~$ vi $HADOOP_HOME/etc/hadoop/core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop-001:8020</value>
    </property>
</configuration>
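
fs.defaultFS tells every daemon and client where the NameNode RPC endpoint lives (8020 is the conventional port). A hedged way to read a value back out of such a file without starting Hadoop is a naive sed extraction — adequate for these simple one-property-per-block files. Shown here against a heredoc copy so it runs anywhere; on a real node, point FILE at $HADOOP_HOME/etc/hadoop/core-site.xml, and once the daemons are up, `hdfs getconf -confKey fs.defaultFS` gives the authoritative answer.

```shell
# Extract the <value> of fs.defaultFS from a core-site.xml-shaped file.
# Naive line-based parse: find the <name> line, read the next line, strip tags.
FILE=$(mktemp)
cat > "$FILE" <<'EOF'
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop-001:8020</value>
    </property>
</configuration>
EOF
sed -n '/fs.defaultFS/{n;s:.*<value>\(.*\)</value>.*:\1:p;}' "$FILE"
```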

4-2. hdfs-site.xml

$HADOOP_HOME/etc/hadoop/hdfs-site.xml

hadoop@hadoop-001:~$ vi $HADOOP_HOME/etc/hadoop/hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>dfs.replication</name>
        <!-- must not exceed the number of DataNodes (2 in this plan) -->
        <value>2</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/usr/local/hadoop/data/namenode</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/usr/local/hadoop/data/datanode</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>hadoop-002:50090</value>
    </property>
</configuration>

4-3. yarn-site.xml

$HADOOP_HOME/etc/hadoop/yarn-site.xml

hadoop@hadoop-001:~$ vi $HADOOP_HOME/etc/hadoop/yarn-site.xml
<?xml version="1.0"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hadoop-001</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
</configuration>

4-4. mapred-site.xml

$HADOOP_HOME/etc/hadoop/mapred-site.xml

hadoop@hadoop-001:~$ vi $HADOOP_HOME/etc/hadoop/mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
      <name>yarn.app.mapreduce.am.env</name>
      <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
    </property>
    <property>
      <name>mapreduce.map.env</name>
      <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
    </property>
    <property>
        <name>mapreduce.reduce.env</name>
        <value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
    </property>
</configuration>

4-5. workers

$HADOOP_HOME/etc/hadoop/workers

hadoop@hadoop-001:~$ vi $HADOOP_HOME/etc/hadoop/workers
hadoop-002
hadoop-003

4-6. hadoop-env.sh

$HADOOP_HOME/etc/hadoop/hadoop-env.sh

hadoop@hadoop-001:~$ vi $HADOOP_HOME/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64

HADOOP_HOME=${HADOOP_HOME:-/usr/local/hadoop}
HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-$HADOOP_HOME/etc/hadoop}

export HADOOP_PID_DIR=${HADOOP_HOME}/pids 
export HADOOP_SECURE_DN_PID_DIR=${HADOOP_PID_DIR}
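
Every node must see the same configuration, and nothing in Hadoop distributes it for you. After editing the files on hadoop-001, copy the whole etc/hadoop directory to the workers — a sketch using scp over the key-based SSH set up earlier (DRY_RUN=1 only prints the commands; unset it on a real cluster):

```shell
# Push the edited configuration directory from the master to each worker.
DRY_RUN=1
cmds=""
for host in hadoop-002 hadoop-003; do
  c="scp -r /usr/local/hadoop/etc/hadoop/ hadoop@${host}:/usr/local/hadoop/etc/"
  cmds="${cmds}${c}
"
  # With DRY_RUN set, print the command instead of executing it.
  if [ -n "$DRY_RUN" ]; then echo "$c"; else $c; fi
done
```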

 

Start / Stop

1. Start

Format the NameNode once, before the first start (-force skips the confirmation prompt and will erase any existing metadata, so use it only on a fresh cluster).

hadoop@hadoop-001:~$ $HADOOP_HOME/bin/hdfs namenode -format -force
WARNING: /usr/local/hadoop/logs does not exist. Creating.
2021-12-17 13:46:54,239 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = hadoop-001/20.200.0.110
STARTUP_MSG:   args = [-format, -force]
STARTUP_MSG:   version = 3.3.0
...

hadoop@hadoop-001:~$ $HADOOP_HOME/sbin/start-dfs.sh
Starting namenodes on [hadoop-001]
Starting datanodes
Starting secondary namenodes [hadoop-002]

hadoop@hadoop-001:~$ $HADOOP_HOME/sbin/start-yarn.sh
Starting resourcemanager
Starting nodemanagers

hadoop@hadoop-001:~$ $HADOOP_HOME/bin/mapred --daemon start historyserver
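
After start-up, running `jps` on each node should show the daemons from the plan table at the top. The block below just prints that expected layout as a reference; it does not query the cluster.

```shell
# Expected `jps` daemons per node after a successful start (reference only;
# actual jps output also includes the Jps process itself and PIDs).
expected="hadoop-001: NameNode ResourceManager JobHistoryServer
hadoop-002: SecondaryNameNode DataNode NodeManager
hadoop-003: DataNode NodeManager"
echo "$expected"
```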

2. Stop

hadoop@hadoop-001:~$ $HADOOP_HOME/sbin/stop-dfs.sh
Stopping namenodes on [hadoop-001]
Stopping datanodes
Stopping secondary namenodes [hadoop-002]

hadoop@hadoop-001:~$ $HADOOP_HOME/sbin/stop-yarn.sh
Stopping nodemanagers
Stopping resourcemanager

hadoop@hadoop-001:~$ $HADOOP_HOME/bin/mapred --daemon stop historyserver

To reset HDFS for a fresh format (warning: this permanently deletes all stored data):

hadoop@hadoop-001:~$ rm -rf $HADOOP_HOME/data/namenode/*

# Run on every data node
hadoop@hadoop-002:~$ rm -rf $HADOOP_HOME/data/datanode/*
hadoop@hadoop-003:~$ rm -rf $HADOOP_HOME/data/datanode/*

 

Web UIs

NameNode

http://hadoop-001:9870

ResourceManager

http://hadoop-001:8088

MapReduce JobHistory

http://hadoop-001:19888
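
A quick reachability check for the three UIs, runnable from any machine that resolves the hadoop-* names (DRY_RUN=1 prints the curl commands instead of executing them):

```shell
# Probe each web UI endpoint; with DRY_RUN set, just print what would run.
DRY_RUN=1
cmds=""
for url in http://hadoop-001:9870 http://hadoop-001:8088 http://hadoop-001:19888; do
  c="curl -fsS -o /dev/null -w '%{http_code}\n' $url"
  cmds="${cmds}${c}
"
  if [ -n "$DRY_RUN" ]; then echo "$c"; else eval "$c"; fi
done
```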
