
Mastering Apache Kafka: A Hands-On Beginner's Guide


Introduction to Hands-On Learning with Kafka

“Practical experience outweighs theoretical knowledge” — Vishnudevananda Saraswati.

To effectively grasp a new technology, engaging in hands-on practice is incredibly valuable. By actively participating, learners can bridge the gap between theory and practice. Many accomplished engineers favor this experiential approach, and we will adopt a similar methodology while exploring Apache Kafka.

Prior discussions have covered the essential fundamentals of Apache Kafka necessary for practical application. If you're unfamiliar with key terms like Producers, Consumers, Events, and Topics, it would be beneficial to review introductory material before proceeding.

Before we start our hands-on journey, there’s one crucial concept we need to cover: ZooKeeper.

Understanding Apache ZooKeeper

Overview of Apache ZooKeeper

ZooKeeper plays a vital role in the Kafka ecosystem. It maintains essential metadata, including cluster configuration and consumer details; it tracks and manages brokers and partitions, and notifies the Kafka server about events such as broker failures or newly created topics. In the classic deployment mode used in this guide, a Kafka server cannot operate without a ZooKeeper instance (newer Kafka releases can also run without ZooKeeper in KRaft mode).

With this understanding, we can now set up our own Kafka cluster on local systems. Let's begin by downloading the necessary software.

Setting Up Apache Kafka

First, create a new directory for our Kafka installation. You can choose any name; I will use "kafka."

$ mkdir ~/kafka

$ cd ~/kafka

Once inside the directory, we'll download the latest version of Apache Kafka. We'll use wget for this, but you can also download it manually from the Apache website. Don't forget to move the file to the ~/kafka directory.
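Assuming the 3.2.0 release (built for Scala 2.13) used throughout this guide, the download command would look like the following; the URL follows Apache's standard archive layout:

$ wget https://archive.apache.org/dist/kafka/3.2.0/kafka_2.13-3.2.0.tgz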

Next, we need to extract the downloaded file to access its contents:

$ tar -xvzf kafka_2.13-3.2.0.tgz

After running this command, a new directory should appear. You can verify this with the following command:

$ ls

kafka_2.13-3.2.0 kafka_2.13-3.2.0.tgz

Now that we've completed the download, change into the newly created folder and prepare for the next steps.

$ cd kafka_2.13-3.2.0

Starting ZooKeeper

The initial step in launching our Kafka cluster is to activate ZooKeeper. Apache Kafka includes a bash script for this purpose located at bin/zookeeper-server-start.sh. Let’s execute it using the following command:

$ bin/zookeeper-server-start.sh

USAGE: bin/zookeeper-server-start.sh [-daemon] zookeeper.properties

As the usage message indicates, the script requires a properties file. Kafka provides a default ZooKeeper configuration at config/zookeeper.properties, which includes properties such as the port ZooKeeper listens on and the directory where it keeps its data.
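For reference, the stock config/zookeeper.properties in this release contains defaults along these lines (your copy may differ slightly):

dataDir=/tmp/zookeeper

clientPort=2181

maxClientCnxns=0

The clientPort value (2181) is the port the Kafka broker will later use to reach ZooKeeper.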

For now, we can use this default configuration without modification. Run the following command to start ZooKeeper:

$ bin/zookeeper-server-start.sh config/zookeeper.properties

Keep this shell session open as we will need it throughout this tutorial.

Starting the Kafka Server

Next, in a new terminal session (leaving ZooKeeper running in the first one), we will bring up the Kafka server. Similarly, a shell script is available for this task at bin/kafka-server-start.sh. We will execute it as follows:

$ bin/kafka-server-start.sh

USAGE: bin/kafka-server-start.sh [-daemon] server.properties [--override property=value]*

Again, a properties file is needed. Kafka supplies a default properties file located at config/server.properties. Let's pass this file to the script:

$ bin/kafka-server-start.sh config/server.properties

At this point, the Kafka server will begin listening on port 9092.
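For reference, a few defaults from the stock config/server.properties are worth noting (your copy may differ slightly):

broker.id=0

log.dirs=/tmp/kafka-logs

zookeeper.connect=localhost:2181

The 9092 port comes from the listeners property, which defaults to PLAINTEXT://:9092 when left commented out, and zookeeper.connect points at the ZooKeeper instance we started on port 2181.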

Now that our setup is complete, we can create a new topic, publish messages to it, and consume them.

Creating Your First Topic

In this section, we will establish our very first topic. Kafka provides a script to manage topics, found at bin/kafka-topics.sh. To create a topic named "my-first-topic," run the following command:

$ bin/kafka-topics.sh --create --topic my-first-topic --bootstrap-server localhost:9092

The --create flag indicates our intent to create a new topic, while the --bootstrap-server parameter specifies where the Kafka server is running.
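Since we omitted the optional --partitions and --replication-factor flags, the topic is created with the broker defaults (one partition and a replication factor of 1 in the stock configuration). You can inspect the result with the same script:

$ bin/kafka-topics.sh --describe --topic my-first-topic --bootstrap-server localhost:9092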

Producing Messages to the New Topic

Now it's time to send some messages to our newly created topic. We will use another script located at bin/kafka-console-producer.sh. In a new terminal session, start the console producer as follows:

$ bin/kafka-console-producer.sh --topic my-first-topic --bootstrap-server localhost:9092

After executing this command, a prompt (>) will appear in your terminal. You can start typing messages and hit enter to publish them:

> First Event in my topic

> Second Event in my topic

You can continue publishing messages or exit by pressing Ctrl + C.

Consuming Messages from the New Topic

Having published messages, the next step is to consume them. Kafka provides a consumer script located at bin/kafka-console-consumer.sh. The --from-beginning flag tells the consumer to read the topic from its earliest offset rather than only new messages. Run it in a new terminal with the following command:

$ bin/kafka-console-consumer.sh --topic my-first-topic --from-beginning --bootstrap-server localhost:9092

Upon executing this command, the consumer will read and display messages from the topic:

First Event in my topic

Second Event in my topic

Feel free to publish more messages, and they will appear in the consumer output.

Congratulations! You have now completed an end-to-end example of running Kafka: starting ZooKeeper and the Kafka server, creating a topic, and publishing and consuming messages. Many additional configurations can be explored, such as multiple producers and consumers interacting with the same topic, and readers are encouraged to experiment further.
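For example, you can start a second console consumer in the same consumer group (using the --group flag) and observe how Kafka assigns the topic's partitions among the group's members; with our single-partition topic, only one member of the group will receive messages at a time:

$ bin/kafka-console-consumer.sh --topic my-first-topic --group my-group --bootstrap-server localhost:9092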

In conclusion, I hope you found this guide helpful. To continue your learning journey, check out additional articles, and don’t forget to follow me for updates. You can also connect with me on Twitter.

Beginner's Guide to Apache Kafka with Practical Projects

This video offers a hands-on introduction to Apache Kafka, guiding you through essential concepts and practical implementations.

Exploring Kafka in 10 Minutes: Your First Application

This quick tutorial helps you build your first Kafka application in under 10 minutes, ensuring a smooth entry into the world of Kafka.

