Pac-Man with Confluent Cloud

Pac-Man with Confluent Cloud is the most fun application you will ever see for applying principles of stream processing using Apache Kafka. Built around the famous Pac-Man game, this application allows you to capture and store events from the game into Kafka topics, as well as process them in near real-time using KSQL. To keep you focused on the fun and interesting part, the application is based on clusters running on Confluent Cloud, a fully managed, serverless Apache Kafka service.

pacman game

To apply principles of stream processing in the game, you are going to build a scoreboard using KSQL. The scoreboard will be based on a table that holds aggregated metrics for each player, such as their highest score, the highest level achieved, and the number of times they have lost (that is, their game-overs). As events keep coming from the game, the scoreboard is updated in near real-time by continuous queries that process those events as they happen. A program written in Go can be used to display this table in near real-time.

scoreboard

This program subscribes to a Kafka topic that acts as the sink of the table and therefore contains all the data produced by the continuous queries. As new data arrives, the table automatically reorders the players based on their performance. Because of this behavior, you can use this program to display the performance of the players as they play the game, and have all sorts of fun doing it!
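The reordering described above can be sketched in Go as follows. This is only an illustration: the `Entry` type, its field names, and the tie-breaking order (score, then level, then fewer losses) are assumptions made for the example, not the actual logic of the scoreboard program.

```go
package main

import (
	"fmt"
	"sort"
)

// Entry is a hypothetical scoreboard row. The field names are assumptions
// based on the metrics described above, not the program's actual types.
type Entry struct {
	User         string
	HighestScore int
	HighestLevel int
	TotalLosses  int
}

// rank returns a sorted copy of the entries: higher score first, then higher
// level, then fewer losses, and finally user name so the order is deterministic.
func rank(entries []Entry) []Entry {
	ranked := make([]Entry, len(entries))
	copy(ranked, entries)
	sort.SliceStable(ranked, func(i, j int) bool {
		a, b := ranked[i], ranked[j]
		if a.HighestScore != b.HighestScore {
			return a.HighestScore > b.HighestScore
		}
		if a.HighestLevel != b.HighestLevel {
			return a.HighestLevel > b.HighestLevel
		}
		if a.TotalLosses != b.TotalLosses {
			return a.TotalLosses < b.TotalLosses
		}
		return a.User < b.User
	})
	return ranked
}

func main() {
	board := rank([]Entry{
		{"alice", 1200, 3, 2},
		{"bob", 4500, 5, 1},
		{"carol", 1200, 4, 0},
	})
	for i, e := range board {
		fmt.Printf("%d. %-6s score=%d level=%d losses=%d\n",
			i+1, e.User, e.HighestScore, e.HighestLevel, e.TotalLosses)
	}
}
```

Re-running `rank` each time a new record arrives is enough to keep a small display ordered; a real program may well choose a different ordering rule.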

What You Are Going to Need

  • Confluent Cloud - You need to have an active account with Confluent Cloud to be able to spin up environments with the services required for this application. At a minimum, you will need a Kafka cluster where your topics will be created and a managed Schema Registry. Optionally, you may want to create KSQL applications to implement the scoreboard pipeline.

  • Terraform - The application is automatically created using Terraform. The cloud providers supported are AWS, GCP, and Azure. Besides having Terraform installed locally, you will need to provide your cloud provider credentials so Terraform can create and manage the resources for you.

  • Go Compiler - The program that displays the performance of the players as they play is written in Go, so you will need to have the compiler installed to build a native executable.

  • Confluent Cloud CLI (Optional) - During the creation of the pipeline, if you choose to implement it using a KSQL application, then you will need to have the Confluent Cloud CLI installed locally to set up access permissions to the topics. You can find instructions about how to install it here.

  • Confluent Platform (Optional) - During the creation of the pipeline, if you choose to implement it using a KSQL Server, then you will need to have the Confluent Platform installed locally to be able to spin up your own KSQL Server or use the KSQL CLI. You can find instructions about how to install it here.

1) Setting Up Confluent Cloud

As mentioned before, the application is based on clusters running on Confluent Cloud. Thus, the very first thing you need to do is create a cluster in Confluent Cloud. You are also going to need access to the Schema Registry service that is available for each environment created in Confluent Cloud.

2) Deploying the Application

The application is essentially a set of HTML/CSS/JS files that form a microsite that can be hosted statically anywhere. But for strategic reasons, we deploy this microsite in a storage service from the chosen cloud provider. A bucket will be created and the microsite will be copied there. This bucket will be created in the very same region selected for the Confluent Cloud cluster, to ensure that the application is co-located with it. Along with the bucket and the microsite, some compute instances will be spun up to deploy REST Proxy, also in the same region selected for the Confluent Cloud cluster.

application

The REST Proxy instances will sit behind the load balancer and handle the incoming traffic from the microsite (notably all the events generated by the game), persisting those events in the respective Kafka topics. Note that during bootstrap, the REST Proxy instances will take care of creating the required Kafka topics.
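To make the event flow more concrete, here is a minimal Go sketch of how a client could build a produce request for the Confluent REST Proxy v2 API (`POST /topics/<topic>` with a `records` array). The game itself is written in JavaScript, so this is not its actual code; the `UserGame` field names and the `http://localhost:8082` base URL are assumptions made for illustration.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// record and produceRequest mirror the REST Proxy v2 produce payload:
// {"records":[{"key":...,"value":...}]}
type record struct {
	Key   string      `json:"key,omitempty"`
	Value interface{} `json:"value"`
}

type produceRequest struct {
	Records []record `json:"records"`
}

// UserGame is a hypothetical event shape; the real payload may differ.
type UserGame struct {
	User  string `json:"user"`
	Score int    `json:"score"`
	Level int    `json:"level"`
	Lives int    `json:"lives"`
}

// encodeProduceBody wraps a single keyed value in the "records" envelope.
func encodeProduceBody(key string, value interface{}) ([]byte, error) {
	return json.Marshal(produceRequest{Records: []record{{Key: key, Value: value}}})
}

// newProduceRequest builds (but does not send) the HTTP request that would
// publish one JSON record to the given topic through REST Proxy.
func newProduceRequest(baseURL, topic, key string, value interface{}) (*http.Request, error) {
	body, err := encodeProduceBody(key, value)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/topics/"+topic, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/vnd.kafka.json.v2+json")
	req.Header.Set("Accept", "application/vnd.kafka.v2+json")
	return req, nil
}

func main() {
	event := UserGame{User: "alice", Score: 1200, Level: 3, Lives: 2}
	req, err := newProduceRequest("http://localhost:8082", "USER_GAME", event.User, event)
	if err != nil {
		panic(err)
	}
	body, _ := encodeProduceBody(event.User, event)
	fmt.Println(req.Method, req.URL.String())
	fmt.Println(string(body))
}
```

Passing the request to `http.DefaultClient.Do` would send it; in the deployed application the base URL would be the load balancer in front of the REST Proxy instances.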

Option: Deploying the Application on AWS

  1. Enter the folder that contains the AWS code

    cd terraform/aws
  2. Create a variables file for Confluent Cloud

    mv ccloud.auto.tfvars.example ccloud.auto.tfvars
  3. Provide the data on the 'ccloud.auto.tfvars' file

    bootstrap_server = "<CCLOUD_BOOTSTRAP_SERVER>"
    cluster_api_key = "<CCLOUD_API_KEY>"
    cluster_api_secret = "<CCLOUD_API_SECRET>"
    
    schema_registry_url = "<SCHEMA_REGISTRY_URL>"
    schema_registry_basic_auth = "<SCHEMA_REGISTRY_API_KEY>:<SCHEMA_REGISTRY_SECRET>"
  4. Create a variables file for AWS

    mv cloud.auto.tfvars.example cloud.auto.tfvars
  5. Provide the credentials on the 'cloud.auto.tfvars' file

    aws_access_key = "<AWS_ACCESS_KEY>"
    aws_secret_key = "<AWS_SECRET_KEY>"
  6. Initialize the Terraform plugins

    terraform init
  7. Start the application deployment

    terraform apply -auto-approve
  8. Output with endpoints will be shown

    Outputs:
    
    KSQL_Server = http://pacman00000-ksql-000000.region.elb.amazonaws.com
    Pacman = http://pacman000000000000000.s3-website-region.amazonaws.com

Note: When you are done with the application, you can automatically destroy all the resources created by Terraform using the command below:

terraform destroy -auto-approve

Option: Deploying the Application on GCP

  1. Enter the folder that contains the GCP code

    cd terraform/gcp
  2. Create a variables file for Confluent Cloud

    mv ccloud.auto.tfvars.example ccloud.auto.tfvars
  3. Provide the data on the 'ccloud.auto.tfvars' file

    bootstrap_server = "<CCLOUD_BOOTSTRAP_SERVER>"
    cluster_api_key = "<CCLOUD_API_KEY>"
    cluster_api_secret = "<CCLOUD_API_SECRET>"
    
    schema_registry_url = "<SCHEMA_REGISTRY_URL>"
    schema_registry_basic_auth = "<SCHEMA_REGISTRY_API_KEY>:<SCHEMA_REGISTRY_SECRET>"
  4. Create a variables file for GCP

    mv cloud.auto.tfvars.example cloud.auto.tfvars
  5. Specify the GCP project name on the 'cloud.auto.tfvars' file

    gcp_credentials = "credentials.json"
    gcp_project = "<YOUR_GCP_PROJECT>"
  6. Create a service account key

    https://cloud.google.com/community/tutorials/getting-started-on-gcp-with-terraform
  7. Copy your service account key

    cp <source>/credentials.json .
  8. Initialize the Terraform plugins

    terraform init
  9. Start the application deployment

    terraform apply -auto-approve
  10. Output with endpoints will be shown

    Outputs:
    
    KSQL_Server = http://0.0.0.0
    Pacman = http://0.0.0.0

Note: When you are done with the application, you can automatically destroy all the resources created by Terraform using the command below:

terraform destroy -auto-approve

Option: Deploying the Application on Azure

  1. Enter the folder that contains the Azure code

    cd terraform/azr
  2. Create a variables file for Confluent Cloud

    mv ccloud.auto.tfvars.example ccloud.auto.tfvars
  3. Provide the data on the 'ccloud.auto.tfvars' file

    bootstrap_server = "<CCLOUD_BOOTSTRAP_SERVER>"
    cluster_api_key = "<CCLOUD_API_KEY>"
    cluster_api_secret = "<CCLOUD_API_SECRET>"
    
    schema_registry_url = "<SCHEMA_REGISTRY_URL>"
    schema_registry_basic_auth = "<SCHEMA_REGISTRY_API_KEY>:<SCHEMA_REGISTRY_SECRET>"
  4. Create a variables file for Azure

    mv cloud.auto.tfvars.example cloud.auto.tfvars
  5. Provide the credentials on the 'cloud.auto.tfvars' file

    azure_subscription_id = "<AZURE_SUBSCRIPTION_ID>"
    azure_client_id = "<AZURE_CLIENT_ID>"
    azure_client_secret = "<AZURE_CLIENT_SECRET>"
    azure_tenant_id = "<AZURE_TENANT_ID>"
  6. Initialize the Terraform plugins

    terraform init
  7. Start the application deployment

    terraform apply -auto-approve
  8. Output with endpoints will be shown

    Outputs:
    
    KSQL_Server = http://pacman0000000-ksql.region.cloudapp.azure.com
    Pacman = http://pacman0000000000000000000.z5.web.core.windows.net

Note: When you are done with the application, you can automatically destroy all the resources created by Terraform using the command below:

terraform destroy -auto-approve

3) Creating the Pipeline

When users play the Pac-Man game, two types of events are generated. The first one is called User Game and contains the data about the user’s current game, such as their score, current level, and the number of lives. The second one is called User Losses and, as the name implies, contains data about the number of times the user loses the game. To build a scoreboard out of this, a stream processing pipeline needs to be implemented to perform a series of computations on these two events and derive a table that contains statistics about each user’s game.

pipeline
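The aggregation that the pipeline performs can be pictured with a small in-memory Go sketch. The actual pipeline is written in KSQL and runs continuously over the Kafka topics; the code below only illustrates the same idea (maximum score, maximum level, count of losses per user). The event types, their fields, and the assumption that one User Losses event is emitted per game-over are all hypothetical.

```go
package main

import "fmt"

// UserGame is a hypothetical event shape based on the description above;
// the real payload may use different fields.
type UserGame struct {
	User  string
	Score int
	Level int
	Lives int
}

// UserLosses is assumed here to be emitted once per game-over.
type UserLosses struct {
	User string
}

// Stats holds the aggregated metrics for one user.
type Stats struct {
	HighestScore int
	HighestLevel int
	TotalLosses  int
}

// Scoreboard maps a user name to that user's running aggregates.
type Scoreboard map[string]*Stats

// statsFor returns the entry for a user, creating it on first use.
func (s Scoreboard) statsFor(user string) *Stats {
	st, ok := s[user]
	if !ok {
		st = &Stats{}
		s[user] = st
	}
	return st
}

// ApplyGame keeps the maximum score and level seen so far.
func (s Scoreboard) ApplyGame(e UserGame) {
	st := s.statsFor(e.User)
	if e.Score > st.HighestScore {
		st.HighestScore = e.Score
	}
	if e.Level > st.HighestLevel {
		st.HighestLevel = e.Level
	}
}

// ApplyLoss counts one more game-over for the user.
func (s Scoreboard) ApplyLoss(e UserLosses) {
	s.statsFor(e.User).TotalLosses++
}

func main() {
	board := Scoreboard{}
	board.ApplyGame(UserGame{User: "alice", Score: 300, Level: 1, Lives: 3})
	board.ApplyGame(UserGame{User: "alice", Score: 900, Level: 2, Lives: 1})
	board.ApplyLoss(UserLosses{User: "alice"})
	board.ApplyGame(UserGame{User: "alice", Score: 150, Level: 1, Lives: 3})
	fmt.Printf("%+v\n", *board["alice"])
}
```

Note that a new game starting at a lower score does not reduce the stored maximum, which is the behavior a "highest score" column needs.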

To implement the pipeline you will be using KSQL. The code for this pipeline has been written for you, and the only thing you need to do is execute it on a full-fledged KSQL Server. Therefore, you need to decide which KSQL Server you are going to use. There are three options:

  1. Using the KSQL Server created by Terraform

  2. Using your own KSQL Server running locally

  3. Using Confluent Cloud KSQL (Managed Service)

Whichever option you pick, the KSQL Server will point to the Kafka cluster running on Confluent Cloud. You can even mix and match options to showcase the fact that all of them handle data coming from a single source of truth, which is Apache Kafka.

Option: KSQL Server created by Terraform

  1. Enter the folder that contains the AWS/GCP/Azure code

    cd terraform/<provider>
  2. Execute the command to print the outputs

    terraform output
  3. Select and copy the KSQL Server endpoint

  4. Enter the folder that contains the KSQL code

    cd ../../pipeline
  5. Start a new session of the KSQL CLI:

    ksql <ENDPOINT_COPIED_ON_STEP_THREE>
  6. Run the queries in the KSQL CLI session:

    RUN SCRIPT 'queries.sql';
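If you prefer to extract the endpoint programmatically rather than copying it by hand, the sketch below parses the JSON produced by `terraform output -json` and returns one string output. The output name `KSQL_Server` comes from the deployment output shown earlier; the sample JSON and the helper name `stringOutput` are made up for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// tfOutput matches one entry of `terraform output -json`, where every output
// is an object with (at least) "sensitive" and "value" fields.
type tfOutput struct {
	Sensitive bool            `json:"sensitive"`
	Value     json.RawMessage `json:"value"`
}

// stringOutput returns the named output, provided it exists and is a string.
func stringOutput(raw []byte, name string) (string, error) {
	var outputs map[string]tfOutput
	if err := json.Unmarshal(raw, &outputs); err != nil {
		return "", fmt.Errorf("parsing terraform output: %w", err)
	}
	out, ok := outputs[name]
	if !ok {
		return "", fmt.Errorf("output %q not found", name)
	}
	var value string
	if err := json.Unmarshal(out.Value, &value); err != nil {
		return "", fmt.Errorf("output %q is not a string: %w", name, err)
	}
	return value, nil
}

func main() {
	// Sample data only; real values come from your own deployment.
	sample := []byte(`{
	  "KSQL_Server": {"sensitive": false, "type": "string", "value": "http://ksql.example.com"},
	  "Pacman":      {"sensitive": false, "type": "string", "value": "http://pacman.example.com"}
	}`)
	endpoint, err := stringOutput(sample, "KSQL_Server")
	if err != nil {
		panic(err)
	}
	fmt.Println(endpoint)
}
```

In practice the input could be read from a file created with `terraform output -json > outputs.json` inside the provider folder.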

Option: Own KSQL Server running locally

  1. Enter the folder that contains the KSQL code

    cd pipeline
  2. Start a new KSQL Server instance

    ksql-server-start ksql-server.properties
  3. Start a new session of the KSQL CLI:

    ksql http://localhost:8088
  4. Run the queries in the KSQL CLI session:

    RUN SCRIPT 'queries.sql';

Note: The file 'ksql-server.properties' is generated by Terraform during deployment.
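Besides the CLI, a KSQL Server also accepts statements over its REST API (`POST /ksql`). The Go sketch below builds such a request as a way to check that the server is reachable, for example by listing tables after running the script. It is an optional aside, not part of the original steps; the `http://localhost:8088` address matches the local server above, and the helper names are hypothetical.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// ksqlRequest is the JSON body expected by the /ksql endpoint.
type ksqlRequest struct {
	KSQL              string            `json:"ksql"`
	StreamsProperties map[string]string `json:"streamsProperties"`
}

// encodeStatement wraps one or more KSQL statements in the request envelope.
func encodeStatement(statement string) ([]byte, error) {
	return json.Marshal(ksqlRequest{
		KSQL:              statement,
		StreamsProperties: map[string]string{},
	})
}

// newStatementRequest builds (but does not send) a request to the /ksql endpoint.
func newStatementRequest(serverURL, statement string) (*http.Request, error) {
	body, err := encodeStatement(statement)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, serverURL+"/ksql", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/vnd.ksql.v1+json; charset=utf-8")
	req.Header.Set("Accept", "application/vnd.ksql.v1+json")
	return req, nil
}

func main() {
	req, err := newStatementRequest("http://localhost:8088", "SHOW TABLES;")
	if err != nil {
		panic(err)
	}
	body, _ := encodeStatement("SHOW TABLES;")
	fmt.Println(req.Method, req.URL.String())
	fmt.Println(string(body))
}
```

Sending the request with `http.DefaultClient.Do(req)` should return a JSON array describing the result of each statement.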

Option: Confluent Cloud KSQL

  1. Access the Kafka cluster on Confluent Cloud

    select cluster
  2. Select the 'KSQL' tab and click on 'Add Application'

    new ksql app
  3. Name the KSQL application and click on 'Continue'

    name ksql app
  4. Confirm the terms and then click on 'Launch cluster'

  5. Log in to Confluent Cloud using the CCloud CLI

    ccloud login
  6. Within your environment, list your Kafka clusters

    ccloud kafka cluster list
  7. Select and copy the cluster id from the list

  8. Make sure your Kafka cluster is selected

    ccloud kafka cluster use <CLUSTER_ID_COPIED_ON_STEP_SEVEN>
  9. Find your KSQL application 'Id' using the CCloud CLI

    ccloud ksql app list
  10. Select and copy the KSQL application id from the list

  11. Set up read/write permissions to the Kafka topics

    ccloud ksql app configure-acls <KSQL_APP_ID_COPIED_ON_STEP_TEN> USER_GAME USER_LOSSES
  12. Within the KSQL application, copy the entire pipeline code in the editor

    create pipeline
  13. Click on 'Run' to create the pipeline

4) Executing the Scoreboard Program

To verify that the pipeline is working as expected, you can execute a program written in Go that displays the content of the scoreboard. Because tables in KSQL are ultimately backed by topics, this program subscribes to the 'SCOREBOARD' topic and updates the display as new records arrive. Moreover, this program sorts the data based on each user’s game to simulate a real game scoreboard.

  1. Enter the folder that contains the code

    cd scoreboard
  2. Create a native executable for the program

    go build -o scoreboard scoreboard.go
  3. Execute the program to display the data

    ./scoreboard

Note: This program can only be executed after the application is deployed in the cloud provider. The reason is that, to connect to Confluent Cloud, this program relies on a file called 'ccloud.properties' that is generated by Terraform during deployment.
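As an illustration of how a Go program might load such a file, the sketch below parses `key=value` lines into a map. It assumes 'ccloud.properties' follows the usual Java-style properties layout; the keys in the sample are standard Kafka client settings used only as an example, and the actual contents of the generated file may differ.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseProperties reads simple key=value lines. Blank lines and lines that
// start with '#' or '!' are skipped. Only the first '=' separates key from
// value, so values that contain '=' themselves are preserved.
func parseProperties(content string) (map[string]string, error) {
	props := make(map[string]string)
	scanner := bufio.NewScanner(strings.NewReader(content))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" || strings.HasPrefix(line, "#") || strings.HasPrefix(line, "!") {
			continue
		}
		parts := strings.SplitN(line, "=", 2)
		if len(parts) != 2 {
			continue // not a key=value line; ignored in this sketch
		}
		props[strings.TrimSpace(parts[0])] = strings.TrimSpace(parts[1])
	}
	return props, scanner.Err()
}

func main() {
	// Sample content only; the placeholders are not real credentials.
	sample := `# Kafka client settings (example)
bootstrap.servers=<CCLOUD_BOOTSTRAP_SERVER>
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="<KEY>" password="<SECRET>";
`
	props, err := parseProperties(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(props["security.protocol"], props["sasl.mechanism"])
}
```

To read the real file, pass the result of `os.ReadFile("ccloud.properties")` converted to a string.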