Choosing and Downloading a ZooKeeper Version

The HBase documentation contains this remark:

(screenshot of the HBase documentation, not reproduced here)

And the ZooKeeper download mirror page notes:

(screenshot of the download mirror page, not reproduced here)

To play it safe, and following labmates' earlier experience installing 3.4.14, I downloaded the zookeeper-3.4.14.tar.gz package, uploaded it to the gateway with MobaXterm's SFTP, and then pushed it to the master server with scp:

scp zookeeper-3.4.14.tar.gz lpj@cpu-node0:/home/lpj/
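
As an aside, if the local ssh client is OpenSSH 7.3 or newer, the two hops can be combined into one with a jump host; a sketch, where the hostname gateway is a placeholder for the actual gateway and cpu-node0 must be resolvable from it:

scp -o ProxyJump=lpj@gateway zookeeper-3.4.14.tar.gz lpj@cpu-node0:/home/lpj/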

Unpack and rename it:

tar -zxf zookeeper-3.4.14.tar.gz -C /home/lpj/
mv zookeeper-3.4.14 zookeeper3.4

ZooKeeper Configuration

First, create two directories inside the ZooKeeper folder:

cd /home/lpj/zookeeper3.4
mkdir zkdata    # data directory (snapshots)
mkdir zkdatalog # transaction log directory

Then go into ZooKeeper's conf directory and copy zoo_sample.cfg to zoo.cfg, which will serve as the configuration file:

cd /home/lpj/zookeeper3.4/conf/
cp zoo_sample.cfg zoo.cfg

Edit zoo.cfg as follows:

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/home/lpj/zookeeper3.4/zkdata
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
dataLogDir=/home/lpj/zookeeper3.4/zkdatalog
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.1=cpu-node0:2888:3888
server.2=cpu-node3:2888:3888
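
For reference, each server.X line has the form host:peerPort:electionPort; a brief annotation of what the two extra ports are used for (semantics per the ZooKeeper admin guide):

# server.<id>=<hostname>:<peer-port>:<election-port>
#   peer port     (e.g. 2888): followers connect to the leader here for data synchronization
#   election port (e.g. 3888): servers contact each other here during leader election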

Then create a file named myid in the zkdata folder, containing this node's server ID from zoo.cfg. On cpu-node0 (server.1) the file holds just:

1

Then send the whole folder to the slave node (the -r flag copies the directory recursively, so there is no need to tar and untar again):

scp -r zookeeper3.4 lpj@cpu-node3:/home/lpj/

Then, on cpu-node3, edit zkdata/myid and change the 1 to 2.
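
Equivalently, both myid files can be written with a one-liner (paths follow the layout above):

echo 1 > /home/lpj/zookeeper3.4/zkdata/myid   # on cpu-node0 (server.1)
echo 2 > /home/lpj/zookeeper3.4/zkdata/myid   # on cpu-node3 (server.2), after the scp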

Finally, configure the environment variables on both servers: edit .bashrc and append ZooKeeper's bin directory to the PATH set up earlier (this must be done on both nodes):

export PATH=$PATH:/home/lpj/hadoop2.7/bin:/home/lpj/hadoop2.7/sbin:/home/lpj/hbase1.4/bin:/home/lpj/zookeeper3.4/bin

Then run source ~/.bashrc to apply the change.
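
A quick sanity check that the new PATH is in effect (expected output assumes the layout above):

which zkServer.sh
# /home/lpj/zookeeper3.4/bin/zkServer.sh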

Starting and Checking ZooKeeper

Since the environment variable is set, the server can be started directly with the script:

zkServer.sh start

Then check its status with:

zkServer.sh status

This reported an error:

[lpj@cpu-node0 ~]$ zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /home/lpj/zookeeper3.4/bin/../conf/zoo.cfg
Error contacting service. It is probably not running.

Checking zookeeper.out for the cause:

2020-09-21 16:54:42,266 [myid:1] - WARN  [WorkerSender[myid=1]:QuorumCnxManager@584] - Cannot open channel to 2 at election address cpu-node3/192.168.232.103:3898
java.net.ConnectException: Connection refused (Connection refused)
at ...

Since this server already hosts a running Hadoop cluster, I suspected a port conflict. Following the port scheme from my earlier note "zookeeper错误记录一", I changed the following entries in zoo.cfg:

clientPort=3384

server.1=cpu-node0:2878:3878
server.2=cpu-node3:2898:3898
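
Before restarting, it is worth confirming that the new ports are actually free on both nodes; a minimal check with ss (substitute netstat -tlnp if ss is unavailable):

ss -tlnp | grep -E ':(3384|2878|3878|2898|3898)\b'
# no output means none of the ports is taken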

Both nodes must be changed identically. Then run zkServer.sh start on each node again.

Note: after starting only one node, status still reports the error above, because with a two-server ensemble a quorum (a majority of the ensemble, which here means both servers) must be up before a leader can be elected. Only check the status once all nodes have been started:

[lpj@cpu-node3 ~]$ zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /home/lpj/zookeeper3.4/bin/../conf/zoo.cfg
Mode: leader

[lpj@cpu-node0 ~]$ zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /home/lpj/zookeeper3.4/bin/../conf/zoo.cfg
Mode: follower

Because of the server IDs (with fresh data and equal zxids, the election is decided by the higher ID), cpu-node3 became the leader and cpu-node0 the follower. Note that leader and follower are assigned by the election protocol; do not assume the Hadoop master is the ZooKeeper leader.
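
Each server's role can also be checked remotely with ZooKeeper's four-letter-word commands (enabled by default in 3.4.x; requires nc or telnet):

echo stat | nc cpu-node0 3384   # prints version, client connections, and "Mode: follower"
echo ruok | nc cpu-node0 3384   # prints "imok" if the server is up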

Running jps shows a new QuorumPeerMain process:

[lpj@cpu-node0 ~]$ jps
12946 Jps
45704 JobHistoryServer
45002 SecondaryNameNode
45243 ResourceManager
44655 NameNode
9023 QuorumPeerMain

[lpj@cpu-node3 ~]$ jps
23609 Jps
9100 DataNode
23292 QuorumPeerMain
9246 NodeManager

Installation successful.
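
For day-to-day operation, the same zkServer.sh script also handles shutdown and restart (subcommands available in 3.4.x):

zkServer.sh stop     # stop the local server
zkServer.sh restart  # stop, then start again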

Starting and Trying Out the zkCli.sh Client

Connect from cpu-node3 to cpu-node0:

zkCli.sh -server cpu-node0:3384

The connection is then established:

2020-09-21 17:22:09,608 [myid:] - INFO  [main-SendThread(cpu-node0:3384):ClientCnxn$SendThread@879] - Socket connection established to cpu-node0/192.168.232.100:3384, initiating session
[zk: cpu-node0:3384(CONNECTING) 0] 2020-09-21 17:22:09,637 [myid:] - INFO [main-SendThread(cpu-node0:3384):ClientCnxn$SendThread@1299] - Session establishment complete on server cpu-node0/192.168.232.100:3384, sessionid = 0x101409bf13a0001, negotiated timeout = 30000

WATCHER::

WatchedEvent state:SyncConnected type:None path:null

Press Enter to drop into the shell prompt:

[zk: cpu-node0:3384(CONNECTED) 0] ls / # list the child znodes
[zookeeper]
[zk: cpu-node0:3384(CONNECTED) 1] h # list all commands
ZooKeeper -server host:port cmd args
stat path [watch]
set path data [version]
ls path [watch]
delquota [-n|-b] path
ls2 path [watch]
setAcl path acl
setquota -n|-b val path
history
redo cmdno
printwatches on|off
delete path [version]
sync path
listquota path
rmr path
get path [watch]
create [-s] [-e] path data acl
addauth scheme auth
quit
getAcl path
close
connect host:port

There is no exit command; use quit to leave the client.
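
As a quick trial of the commands listed above, here is a minimal create/read/update/delete cycle (the znode name /test is arbitrary):

create /test "hello"   # create a znode holding "hello"
get /test              # read the data back (3.4 also prints the znode's stat)
set /test "world"      # overwrite the data
delete /test           # remove the znode
quit                   # leave the client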