Apache Kafka: Install a ZooKeeper ensemble on AWS EC2

In this article we will see how to set up a three-node ZooKeeper ensemble on AWS EC2 instances.

Let's assume we already have three EC2 instances up and running with the following public IPs (treat these as placeholders and substitute your own instances' addresses):

15.232.8.517
15.232.46.302
15.0.185.130

Make sure ports 2181, 2888 and 3888 are open on each machine (for example, in the instances' security group). Clients connect on 2181, the servers talk to one another on 2888, and leader election runs on 3888.

Install OpenJDK 17

The first step is to install OpenJDK 17 on each server as shown below:

sudo apt update && sudo apt upgrade -y
apt-cache search openjdk
sudo apt-get install openjdk-17-jdk
java --version

openjdk 17.0.4 2022-07-19
OpenJDK Runtime Environment (build 17.0.4+8-Ubuntu-122.04)
OpenJDK 64-Bit Server VM (build 17.0.4+8-Ubuntu-122.04, mixed mode, sharing)

Install ZooKeeper

Download ZooKeeper from the release page, extract it, and move it to the "/usr/local/" directory as shown below.

wget https://dlcdn.apache.org/zookeeper/zookeeper-3.8.0/apache-zookeeper-3.8.0-bin.tar.gz
tar -zxf apache-zookeeper-3.8.0-bin.tar.gz
sudo mv apache-zookeeper-3.8.0-bin /usr/local/zookeeper

Also create a "zookeeper" folder under "/var/lib/". This will act as the data directory for ZooKeeper.

sudo mkdir -p /var/lib/zookeeper

Each node must have a common configuration that lists all servers, and each server needs a "myid" file in the data directory that specifies the ID number of the server.

sudo touch /var/lib/zookeeper/myid
sudo vi /var/lib/zookeeper/myid

The "myid" file must contain a single line with the ID number of the server (for example, 1 on the first node). That number must match the X of the node's "server.X" entry in the configuration file.
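As a minimal sketch, the file can also be written without an editor. The helper name write_myid is hypothetical (it is not part of ZooKeeper), and the 1 to 255 range check reflects the range the ZooKeeper administrator's guide gives for server IDs.

```shell
#!/bin/sh
# Write a ZooKeeper "myid" file: one line holding this server's numeric ID.
# write_myid is a hypothetical helper, not a ZooKeeper command.
write_myid() {
    id="$1"        # must equal X in this node's server.X line
    data_dir="$2"  # the dataDir configured in zoo.cfg

    # Reject empty or non-numeric input.
    case "$id" in
        ''|*[!0-9]*) echo "myid must be an integer" >&2; return 1 ;;
    esac
    # Assumed valid range for server IDs: 1 to 255.
    if [ "$id" -lt 1 ] || [ "$id" -gt 255 ]; then
        echo "myid must be between 1 and 255" >&2
        return 1
    fi
    printf '%s\n' "$id" > "$data_dir/myid"
}

# Example on the first node (needs write access to the data directory):
# write_myid 1 /var/lib/zookeeper
```

Because "/var/lib/zookeeper" was created with sudo, the example call would have to run as root or after changing the directory's ownership.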

If the IPs of the servers in the ensemble are 15.232.8.517, 15.232.46.302, and 15.0.185.130, the configuration file "zoo.cfg" under "/usr/local/zookeeper/conf/" should look like this:

touch /usr/local/zookeeper/conf/zoo.cfg
vi /usr/local/zookeeper/conf/zoo.cfg

1) Node 15.232.8.517

tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
initLimit=20
syncLimit=5
server.1=0.0.0.0:2888:3888
server.2=15.232.46.302:2888:3888
server.3=15.0.185.130:2888:3888

2) Node 15.232.46.302

tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
initLimit=20
syncLimit=5
server.1=15.232.8.517:2888:3888
server.2=0.0.0.0:2888:3888
server.3=15.0.185.130:2888:3888

3) Node 15.0.185.130

tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
initLimit=20
syncLimit=5
server.1=15.232.8.517:2888:3888
server.2=15.232.46.302:2888:3888
server.3=0.0.0.0:2888:3888

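Note: On each node, that node's own "server.X" entry must use 0.0.0.0 instead of its public IP. An EC2 instance's public IP is not assigned to its network interface, so ZooKeeper cannot bind to it.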
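The three files above differ only in which "server.X" line carries 0.0.0.0, so they can be generated from one list of addresses. The following is a sketch under that assumption; gen_zoo_cfg is a hypothetical helper name, and the fixed settings are copied from the listings above.

```shell
#!/bin/sh
# Print a zoo.cfg for one node of the ensemble.
# gen_zoo_cfg is a hypothetical helper, not a ZooKeeper tool.
# Usage: gen_zoo_cfg MY_ID IP1 IP2 IP3 ...
gen_zoo_cfg() {
    my_id="$1"
    shift

    # Settings shared by every node.
    cat <<EOF
tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
initLimit=20
syncLimit=5
EOF

    # One server.X line per address; the local node's entry uses 0.0.0.0.
    i=1
    for ip in "$@"; do
        if [ "$i" -eq "$my_id" ]; then
            host=0.0.0.0
        else
            host="$ip"
        fi
        printf 'server.%s=%s:2888:3888\n' "$i" "$host"
        i=$((i + 1))
    done
}

# Example for the second node:
# gen_zoo_cfg 2 15.232.8.517 15.232.46.302 15.0.185.130 > /usr/local/zookeeper/conf/zoo.cfg
```

The order of the addresses must be the same on every node, since the position in the list determines each server's ID.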

Once these steps are complete, start each server with the command below. The nodes should then find one another and form an ensemble.

sudo ZK_SERVER_HEAP=128 /usr/local/zookeeper/bin/zkServer.sh start

/usr/bin/java
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... 
STARTED

To test whether the ensemble is running correctly, connect to the client port and send the "srvr" command, which returns details about the local ZooKeeper node.

# telnet localhost 2181
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
srvr

Zookeeper version: 3.8.0-5a02a05eddb59aee6ac762f7ea82e92a68eb9c0f, built on 2022-02-25 08:49 UTC
Latency min/avg/max: 0/0.0/0
Received: 1
Sent: 0
Connections: 1
Outstanding: 0
Zxid: 0x100000000
Mode: follower
Node count: 5
Connection closed by foreign host.
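The line that matters most in this output is "Mode:". In a healthy three-node ensemble one node should report "leader" and the other two "follower". A small sketch for pulling that value out of the output follows; zk_mode is a hypothetical helper name, and the commented example assumes netcat (nc) is installed.

```shell
#!/bin/sh
# Read srvr output on stdin and print only the value of the "Mode:" line.
# zk_mode is a hypothetical helper name.
zk_mode() {
    sed -n 's/^Mode: *//p'
}

# Example, assuming nc (netcat) is available on the node:
# echo srvr | nc localhost 2181 | zk_mode
```

Running the example on each of the three nodes gives a quick view of which one was elected leader.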

With that, the ZooKeeper cluster setup is complete.


Recommended number of ZooKeeper nodes in an ensemble

A majority of ensemble members (a quorum) must be working in order for ZooKeeper to respond to requests.

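It is generally recommended to have an odd number of ZooKeeper servers in your ensemble, because an extra even-numbered server adds no fault tolerance: a four-node ensemble needs three nodes for a majority, so it tolerates only one failure, the same as a three-node ensemble.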

This means that in a three-node ensemble, you can run with one node missing.

With a five-node ensemble, you can run with two nodes missing.

In general, five is a good number if you have a fair number of servers.

More servers mean lower write performance, because every write must be acknowledged by a majority, but slightly better read performance.

Five is good because it allows you to take one server down for upgrading while the cluster can still tolerate one more unplanned failure.
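The majority rule behind these numbers is simple integer arithmetic: a quorum is more than half of the ensemble, and the tolerated failures are whatever is left over. A minimal sketch, with hypothetical helper names:

```shell
#!/bin/sh
# Smallest majority of an ensemble of N servers: floor(N / 2) + 1.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# Servers that can be down while a quorum still exists.
tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

# Compare a few ensemble sizes.
for n in 3 4 5; do
    echo "$n nodes: quorum $(quorum_size "$n"), tolerates $(tolerated_failures "$n")"
done
```

For three, four and five nodes this reports tolerated failures of 1, 1 and 2, which is why the fourth server buys nothing over the third.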

Configuration

Below is a quick overview of useful Zookeeper configurations and settings.

Config | Meaning
tickTime | The basic time unit, in milliseconds. Both "initLimit" and "syncLimit" are expressed as a number of tickTime units, so with tickTime=2000 an initLimit of 20 equals 20 × 2,000 ms, or 40 seconds.
initLimit | The amount of time to allow followers to connect with a leader.
syncLimit | Determines how long followers can be out of sync with the leader.
clientPort | The port on which each server listens for client connections. Clients only need to be able to connect to the ensemble over the clientPort.
server.X=hostname:peerPort:leaderPort | "X" is the integer ID number of the server, "hostname" is the hostname or IP address of the server, "peerPort" is the TCP port over which servers communicate with one another, and "leaderPort" is the TCP port over which leader election is performed.
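To make the tick arithmetic concrete, the sketch below converts the limits used in this article into milliseconds. ticks_to_ms is a hypothetical helper name; the values are the ones from the zoo.cfg shown earlier.

```shell
#!/bin/sh
# Convert a limit given in ticks to milliseconds: ticks * tickTime.
# ticks_to_ms is a hypothetical helper name.
ticks_to_ms() {
    echo $(( $1 * $2 ))
}

tick_time=2000
echo "initLimit: $(ticks_to_ms 20 "$tick_time") ms"   # 20 ticks
echo "syncLimit: $(ticks_to_ms 5 "$tick_time") ms"    # 5 ticks
```

With these settings the initLimit works out to 40,000 ms (40 seconds) and the syncLimit to 10,000 ms (10 seconds).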

The nodes of the ensemble must be able to communicate with one another over all three ports (clientPort, peerPort and leaderPort).
