# Confluent Apache Kafka
This project expands on the original work by Chris Matta @ Confluent (https://github.com/cjmatta). It encompasses Terraform and Ansible automation to set up Kafka and monitoring tools.
- Python 3
- Ansible 2.9
- Terraform v0.13
- CentOS / RHEL 7.6 or above
This is the Linux system used to invoke Terraform and Ansible in order to create the test environment.
## Orchestration System Setup
```shell
# CentOS 8
sudo yum -y install python3 python3-pip git wget unzip libselinux-python3
```
Install Terraform according to the official instructions, or manually as follows.
```shell
# Download the Terraform package
# Note: the scripts use features only available in version 0.13+
TERRAFORM_VERSION=0.13.3

# Fetch the release archive (braces are required so the shell does not
# try to expand a variable named TERRAFORM_VERSION_linux_amd64)
wget https://releases.hashicorp.com/terraform/${TERRAFORM_VERSION}/terraform_${TERRAFORM_VERSION}_linux_amd64.zip

# Extract the binary
unzip terraform_${TERRAFORM_VERSION}_linux_amd64.zip

# Install
sudo mv ./terraform /usr/bin/
```
Clone the git repository:

```shell
cd ~
git clone https://github.com/cleeistaken/automation-kafka.git
cd automation-kafka
```
Create a Python virtual environment.

```shell
# Create the virtual environment
python3 -m venv $HOME/.python3-venv

# Activate the virtual environment
source $HOME/.python3-venv/bin/activate

# (optional) Add the activation to the login script
echo "source $HOME/.python3-venv/bin/activate" >> $HOME/.bashrc
```
Install the required Python packages.

```shell
pip install --upgrade pip
pip install --upgrade setuptools
pip install -r python-requirements.txt
```
We create a VM template to address the following requirements and limitations:
- Ansible needs a user account and SSH key on each target VM.
## VM Template Setup
Create a Linux VM with a supported distribution and version.

Install the packages required for the Terraform customization:
```shell
sudo yum install open-vm-tools perl
```
From the orchestration system, create and upload an SSH key to the template VM.
```shell
# Check for an existing key and create one if none exists
if [ ! -f ~/.ssh/id_rsa ]; then
  ssh-keygen -b 2048 -t rsa -f ~/.ssh/id_rsa -q -N ""
fi

# Copy the key to the template VM
ssh-copy-id confluent@<ip of the vm>
```
In vSphere, convert the VM to a template to prevent any further changes.
The terraform folder contains a few files:

- `main.tf`: the main definition of the environment
- `variables.tf`: the variable definitions
- `terraform.tfvars`: environment-specific settings
Edit `terraform.tfvars` to include specific information about the vSphere environment and the Confluent Platform environment you'd like to build.
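A filled-in `terraform.tfvars` might look roughly like the sketch below. This is only an illustration: the actual variable names are defined in this repository's `variables.tf` and may differ, and all values here are placeholders.

```hcl
# Illustrative sketch only; check variables.tf for the real variable names.
vsphere_server     = "vcenter.example.com"
vsphere_user       = "administrator@vsphere.local"
vsphere_password   = "changeme"
vsphere_datacenter = "dc-01"
vsphere_datastore  = "vsan-datastore"
vm_template        = "centos-7-template"
kafka_broker_count = 3
zookeeper_count    = 3
```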
Once everything is set, run `terraform init`, then run `terraform plan` to ensure everything looks good. If everything looks right, run `terraform apply` to apply the configuration.
After everything is built, Terraform outputs the IP addresses of the various components. Note these for the Ansible section.
Edit `inventory.yml` and set the Ansible user and private key file for access to the target machines; be sure to include the IP addresses for each service output by `terraform apply`.
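A minimal sketch of what such an inventory could look like, assuming a YAML inventory with connection variables at the `all` level; the actual group names and host layout come from the repository's `inventory.yml` and may differ, and the IP addresses are placeholders.

```yaml
# Illustrative inventory sketch only; group names are assumptions.
all:
  vars:
    ansible_user: confluent
    ansible_ssh_private_key_file: ~/.ssh/id_rsa
zookeeper:
  hosts:
    172.20.10.10:
kafka_broker:
  hosts:
    172.20.10.11:
```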
Ensure that the machine you run Ansible from can SSH to each of the hosts. The easiest way to test this is with the Ansible `ping` module:

```shell
ansible -i settings.yml -i inventory.yml -m ping all
```
There's a playbook called `preflight-playbook.yml` which does the following:
- opens up the SELinux ports to allow traffic between services
- formats the disks and then mounts them (make sure that the devices specified in the volumes, `/dev/sdb1` etc., are correct for the VMs)
Edit the `preflight-playbook.yml` file and make sure that the Broker and Zookeeper drives are correct (`/dev/sdb`, `/dev/sdc`, etc.).
Run it like this:
```shell
ansible-playbook -i settings.yml -i inventory.yml preflight-playbook.yml
```
Make sure that the Kafka brokers section has the correct properties set for the environment, in particular that the `log.dirs` property matches the mounted data volumes:

```yaml
172.20.10.11:
  kafka_broker:
    properties:
      broker.rack: isvlab
      default.replication.factor: 3
      log.dirs: /var/lib/kafka/data0,/var/lib/kafka/data1
```
To install the core Kafka, Zookeeper, Connect and Control Center services run the all.yml playbook like this:
```shell
ansible-playbook -i settings.yml -i inventory.yml all.yml
```
The `tools-provisioning.yml` playbook installs the following services:
- Installs Prometheus on the tools host
- Installs Prometheus node exporter on all hosts
- Installs core kafka commands needed for performance tests on tools host
- Installs Grafana on the tools host
- Installs Filebeat on all hosts to collect service logs
- Installs Kibana, Elasticsearch, and Logstash on the tools host
Install the Ansible Galaxy roles:

```shell
ansible-galaxy install -r ansible-requirements.yml
```

Then run the playbook:

```shell
ansible-playbook -i settings.yml -i inventory.yml tools-provisioning.yml
```
After the `tools-provisioning.yml` playbook runs, a Grafana instance will be available on port 3000 of the tools host:
user/pass - confluent/confluent
### Add Prometheus Data Source

Add a data source in Grafana under the Configuration -> Data Sources menu. Set the URL to `http://<tools host>:9090` and set it as the default.
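As an alternative to clicking through the UI, Grafana can also provision the data source from a file. A minimal sketch, assuming the default provisioning directory (`<tools host>` is a placeholder, and this repo's playbooks may already handle provisioning):

```yaml
# /etc/grafana/provisioning/datasources/prometheus.yml
# Sketch only; substitute the real tools host address.
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://<tools host>:9090
    isDefault: true
```

Grafana reads this directory at startup, so restart the `grafana-server` service after adding the file.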
### Import Kafka and Host Dashboards

[Import the JSON dashboards](https://grafana.com/docs/grafana/latest/reference/export_import/#importing-a-dashboard) from the `grafana-dashboards` directory in this repository.