Cloud Computing

Installing Hadoop on Ubuntu 20

Install the JDK

1. Download from the official website

Official download link

2. Extract

tar -zxvf jdk-8u321-linux-x64.tar.gz

3. Edit the configuration

sudo vim ~/.bashrc

Append the following lines:

export JAVA_HOME=/home/trick/jdk1.8.0_321
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib/dt.jar:${JAVA_HOME}/lib/tools.jar
export PATH=${JAVA_HOME}/bin:$PATH

4. Apply the configuration

source ~/.bashrc

5. Verify

java -version
java version "1.8.0_321"
Java(TM) SE Runtime Environment (build 1.8.0_321-b07)
Java HotSpot(TM) 64-Bit Server VM (build 25.321-b07, mixed mode)
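
If java -version still reports a different JDK, it is worth checking what the shell actually resolves; a quick sanity check, assuming the paths above:

echo $JAVA_HOME     # should print /home/trick/jdk1.8.0_321
which java          # should print /home/trick/jdk1.8.0_321/bin/java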

Reference

Install and configure passwordless SSH login

1. Install SSH

sudo apt-get install openssh-server

If you get an error like this:

The following packages have unmet dependencies:
 openssh-server : Depends: openssh-client (= 1:7.2p2-4ubuntu2.10)
                  Depends: openssh-sftp-server but it is not going to be installed
                  Recommends: ssh-import-id but it is not going to be installed
E: Unable to correct problems, you have held broken packages.

Remove the packages first, then reinstall:

sudo apt-get autoremove openssh-client openssh-server
sudo apt-get install openssh-client openssh-server

Error reference
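
Before trying to connect, it may also help to confirm that the SSH service is actually running; on Ubuntu 20 the service is named ssh:

sudo systemctl status ssh    # should report "active (running)"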

2. Connect and enter the password

ssh localhost

3. Configure passwordless login

1. Generate a public/private key pair in ~/.ssh

ssh-keygen -t rsa

Just press Enter at every prompt.

2. Import the public key into the authorized-keys file and fix the permissions

  • Import on the local machine

    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
  • Import on a remote server (not needed if you only configured the local machine)

    Copy the public key to the server:

    scp ~/.ssh/id_rsa.pub xxx@host:/home/xxx/id_rsa.pub

    Append the public key to the authorized-keys file (run this step on the server):

    cat ~/id_rsa.pub >> ~/.ssh/authorized_keys

    Fix the permissions on the server:

    chmod 700 ~/.ssh
    chmod 600 ~/.ssh/authorized_keys

Test: run ssh localhost. The first time you will be asked to type yes (and enter the password); after that, no password is needed.
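
As a side note, ssh-copy-id can do the append-and-chmod steps above in a single command; a minimal sketch, assuming the key pair from step 1 already exists (xxx@host as in the scp example):

ssh-copy-id localhost    # local machine
ssh-copy-id xxx@host     # remote server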

Install Hadoop

1. Download from the official website

Official download link

2. Extract

tar -zvxf hadoop-3.3.2.tar.gz

3. Create the hadoop group, add your user to it, and grant it sudo privileges

sudo addgroup hadoop
sudo usermod -a -G hadoop xxx   # add the current user (xxx) to the hadoop group
sudo vim /etc/sudoers           # add the hadoop group to sudoers

Below the line root ALL=(ALL:ALL) ALL, add an entry for the hadoop group (the leading % marks a group rather than a user):

# User privilege specification
root    ALL=(ALL:ALL) ALL
%hadoop ALL=(ALL:ALL) ALL
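
Editing /etc/sudoers directly is risky, since a syntax error can lock you out of sudo; visudo opens the same file but validates it before saving. A safer way to make the change above:

sudo visudo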

4. Change ownership and permissions

sudo chmod -R 755 /home/trick/hadoop-3.3.2
sudo chown -R trick:hadoop /home/trick/hadoop-3.3.2
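
To confirm the ownership change took effect, a quick check (same path as above):

ls -ld /home/trick/hadoop-3.3.2   # owner and group should now be trick hadoop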

5. Edit the configuration file

sudo vim ~/.bashrc

Append the following lines:

export HADOOP_HOME=/home/trick/hadoop-3.3.2
export PATH=.:${JAVA_HOME}/bin:${HADOOP_HOME}/bin:$PATH

Apply the configuration:

source ~/.bashrc

6. Check that it took effect

$ hadoop version
Hadoop 3.3.2
Source code repository git@github.com:apache/hadoop.git -r 0bcb014209e219273cb6fd4152df7df713cbac61
Compiled by chao on 2022-02-21T18:39Z
Compiled with protoc 3.7.1
From source with checksum 4b40fff8bb27201ba07b6fa5651217fb
This command was run using /home/trick/hadoop-3.3.2/share/hadoop/common/hadoop-common-3.3.2.jar

Hadoop pseudo-distributed configuration

1. Modify the core-site.xml file

In the ~/hadoop-3.3.2/etc/hadoop directory, edit the core-site.xml file:

sudo vim core-site.xml

Set its contents to:
<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/home/trick/hadoop-3.3.2/tmp</value>
        <description>A base for other temporary directories.</description>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>

2. Modify the hdfs-site.xml file

sudo vim hdfs-site.xml

Set its contents to:
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/home/trick/hadoop-3.3.2/tmp/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/home/trick/hadoop-3.3.2/tmp/dfs/data</value>
    </property>
</configuration>
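
A stray character in either file will make the daemons fail to start, so it can be worth validating the XML before moving on; a small sketch using xmllint (from the libxml2-utils package, which may need to be installed first):

cd ~/hadoop-3.3.2/etc/hadoop
xmllint --noout core-site.xml hdfs-site.xml   # prints nothing if both files are well-formed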

3. Format the NameNode

In the /home/trick/hadoop-3.3.2 directory:

./bin/hdfs namenode -format

(Screenshot: the NameNode is formatted successfully.)

4. Start HDFS

In the hadoop-3.3.2 directory:

./sbin/start-dfs.sh

If you get this error:

localhost: ERROR: JAVA_HOME is not set and could not be found.

To fix it, modify the hadoop-env.sh file in the /home/trick/hadoop-3.3.2/etc/hadoop directory:

sudo vim etc/hadoop/hadoop-env.sh

The setting is around line 54; the final result looks like this:

# The java implementation to use. By default, this environment
# variable is REQUIRED on ALL platforms except OS X!
# export JAVA_HOME=

export JAVA_HOME=/home/trick/jdk1.8.0_321

Run it again and it succeeds.

(Screenshot: start-dfs.sh starts successfully.)
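
Besides the screenshot, the running daemons can be checked with jps, which ships with the JDK; for this pseudo-distributed setup it should list a NameNode, a DataNode and a SecondaryNameNode. The NameNode web UI of Hadoop 3.x is also reachable at http://localhost:9870.

jps
# example output (process IDs will differ):
# 12345 NameNode
# 12346 DataNode
# 12347 SecondaryNameNode
# 12348 Jps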

HDFS basic operations

Create directories

hdfs dfs -mkdir /test
hdfs dfs -mkdir /2022_3_21_Trick

Create a new empty file

hdfs dfs -touchz /aa.txt

Listing commands

hdfs dfs -ls /      lists all files and directories under /

hdfs dfs -ls -R /   recursively lists all files under /

(Screenshot: output of the listing commands.)
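
Beyond mkdir and touchz, the usual round trip is uploading a local file, reading it back, and downloading it again; a short sketch (test.txt is just an example file name):

echo "hello hdfs" > test.txt
hdfs dfs -put test.txt /test/           # upload into the /test directory created above
hdfs dfs -cat /test/test.txt            # print the file contents
hdfs dfs -get /test/test.txt copy.txt   # download it back as copy.txt
hdfs dfs -rm /test/test.txt             # delete it from HDFS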

Hadoop reference

HDFS basic commands

Java programming

Download IDEA and extract it

tar -zvxf ideaIU-2021.3.3.tar.gz
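
IDEA is launched from the bin/idea.sh script inside the extracted directory. For compiling and running a Hadoop client class outside the IDE, the hadoop classpath command prints the jars that need to be on the classpath; a minimal sketch, where HdfsTest.java is a hypothetical class of your own and the extracted directory name depends on the exact build:

./idea-IU-*/bin/idea.sh                        # launch IDEA
javac -cp "$(hadoop classpath)" HdfsTest.java  # compile against the Hadoop jars
java -cp "$(hadoop classpath):." HdfsTest      # run with the same classpath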