Ingest from Apache Pulsar

How to ingest data from Apache Pulsar

Apache Pulsar is a cloud-native, distributed messaging and streaming platform originally created at Yahoo!. In this guide we'll learn how to ingest data from Pulsar into Pinot.

Pinot Version: 1.1.0
Code: startreedata/pinot-recipes/pulsar

Prerequisites

To follow the code examples in this guide, you must install Docker locally and download the recipes.

Clone this repository and navigate to this recipe:

git clone git@github.com:startreedata/pinot-recipes.git
cd pinot-recipes/recipes/pulsar

Makefile

make recipe
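
The recipe's Makefile is not reproduced in this guide. As a rough sketch of the kind of settings involved, a Pinot real-time table that reads from Pulsar declares a stream configuration along the lines below. The topic name matches the producer code in this guide; the broker address, decoder choice, and offset setting are assumptions for illustration and may differ from what the recipe actually uses.

```python
import json

# Hypothetical stream settings for a Pinot real-time table reading from Pulsar.
# Key names follow Pinot's "stream.<type>.<property>" convention; the values
# are assumptions for illustration, not copied from the recipe.
stream_configs = {
    "streamType": "pulsar",
    "stream.pulsar.topic.name": "events",
    "stream.pulsar.bootstrap.servers": "pulsar://localhost:6650",
    "stream.pulsar.consumer.type": "lowlevel",
    "stream.pulsar.consumer.prop.auto.offset.reset": "smallest",
    "stream.pulsar.consumer.factory.class.name":
        "org.apache.pinot.plugin.stream.pulsar.PulsarConsumerFactory",
    "stream.pulsar.decoder.class.name":
        "org.apache.pinot.plugin.inputformat.json.JSONMessageDecoder",
}

# Print the fragment as it would appear inside a table config's JSON.
print(json.dumps({"streamConfigs": stream_configs}, indent=2))
```

Inside Docker, the broker address would normally use the Pulsar container's hostname rather than localhost; check the recipe's table config for the real values.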

To produce data into Pulsar, run the Python code below. It requires the Pulsar Python client, which you can install with pip install pulsar-client.

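import json
import random
import time
import uuid

import pulsar

# Connect to the Pulsar broker on localhost and create a producer for the 'events' topic
client = pulsar.Client('pulsar://localhost:6650')
producer = client.create_producer('events')

# Build one event: millisecond timestamp, dash-free UUID, random count
message = {
    "ts": int(time.time() * 1000.0),
    "uuid": str(uuid.uuid4()).replace("-", ""),
    "count": random.randint(0, 1000)
}

# Serialize the event as UTF-8 encoded JSON and send it
payload = json.dumps(message, ensure_ascii=False).encode('utf-8')
producer.send(payload)

# Close the client to release the connection
client.close()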
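
The script above sends a single event. To keep a steady stream flowing while you watch the table fill up, the same calls can be wrapped in a loop. The sketch below is one way to do that; the helper names build_event and produce_events, the default event count, and the delay are illustrative choices, not part of the recipe.

```python
import json
import random
import time
import uuid


def build_event():
    """Build one event with the same fields as the single-message script."""
    return {
        "ts": int(time.time() * 1000),
        "uuid": uuid.uuid4().hex,  # 32 hex characters, no dashes
        "count": random.randint(0, 1000),
    }


def produce_events(n=100, delay_seconds=0.1,
                   service_url="pulsar://localhost:6650", topic="events"):
    """Send n JSON events to the topic, pausing between sends."""
    # Imported here so build_event stays usable without pulsar-client installed.
    import pulsar

    client = pulsar.Client(service_url)
    try:
        producer = client.create_producer(topic)
        for _ in range(n):
            payload = json.dumps(build_event()).encode("utf-8")
            producer.send(payload)
            time.sleep(delay_seconds)
    finally:
        # Always release the connection, even if a send fails.
        client.close()
```

Call produce_events() once the recipe's containers are running; with the defaults it sends 100 events over roughly ten seconds.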
 

See the data

Navigate to localhost:9000/#/query to see the data in Apache Pinot.
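
The same data can be queried from a script instead of the browser. The sketch below posts SQL to the controller's /sql endpoint, which is assumed to be reachable at the same localhost:9000 address as the query console, and turns the resultTable in the response into a list of dictionaries. The helper names are illustrative, and the table name events used in the usage note is a guess based on the topic name; substitute the table the recipe actually creates.

```python
import json
import urllib.request


def build_request(sql, base_url="http://localhost:9000"):
    """Build a POST request carrying the SQL as a JSON body."""
    body = json.dumps({"sql": sql}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/sql",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def rows_as_dicts(response):
    """Pair each row in a Pinot query response with its column names."""
    # Raises KeyError if the response has no resultTable (for example, on a query error).
    table = response["resultTable"]
    columns = table["dataSchema"]["columnNames"]
    return [dict(zip(columns, row)) for row in table["rows"]]


def run_query(sql):
    """Send the query and return the rows (needs the recipe to be running)."""
    with urllib.request.urlopen(build_request(sql)) as resp:
        return rows_as_dicts(json.load(resp))
```

For example, run_query("select * from events limit 10") would return up to ten ingested events, assuming the table is named events.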

Clean up

make clean

Troubleshooting

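If leftover Docker resources from earlier runs are interfering with this recipe, run the following command. Note that it removes all stopped containers, unused networks, dangling images, and unused build cache on your machine, not only those created by this recipe: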

docker system prune