The StarTree Data Manager can ingest messages from your own Kafka cluster or hosted Kafka services like Confluent Cloud or Amazon MSK.
Select data source
Click on the Kafka button under the Which data do you want to use? heading:
Enter Kafka credentials
Enter the URL of your Kafka cluster, along with your username and password. You can also specify a schema registry. Click TEST CONNECTION to check that StarTree Cloud can access the Kafka cluster.
You will see a success message if the connection has been configured correctly.
Click NEXT to go to the next screen.
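The fields on the credentials screen correspond to standard Kafka client settings. The following is a minimal sketch of what those settings might look like, assuming SASL/PLAIN authentication over TLS (the mode Confluent Cloud uses, where the username and password are an API key and secret). Other services may use a different mechanism. The helper name, broker address, and credentials are hypothetical placeholders, not values that Data Manager requires:

```python
def kafka_client_properties(bootstrap_servers, username, password):
    """Build Kafka client properties for SASL/PLAIN over TLS (an assumed auth mode)."""
    # JAAS entry that carries the username and password for the PLAIN mechanism
    jaas = (
        "org.apache.kafka.common.security.plain.PlainLoginModule required "
        f'username="{username}" password="{password}";'
    )
    return {
        "bootstrap.servers": bootstrap_servers,
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "PLAIN",
        "sasl.jaas.config": jaas,
    }


# Hypothetical placeholder values, for illustration only
props = kafka_client_properties("broker.example.com:9092", "my-key", "my-secret")
for key, value in props.items():
    print(f"{key}={value}")
```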
Next we need to select the topic that we'd like to connect to Pinot and its format, as shown in the screenshot below:
We'll select the events topic.
The messages that we publish to the events topic have the following structure:
The Data Manager will then make an educated guess at the field type and data type of each field in the messages on the topic.
Columns and field/data types
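To get a feel for how data types might be inferred from a message, here is a minimal sketch that maps the JSON values of one sample message to Apache Pinot data types. It is not Data Manager's actual inference logic, and the sample message is hypothetical: only the ts field name comes from this guide, while the uuid and count fields and all values are made up for illustration:

```python
import json


def infer_pinot_type(value):
    """Map a decoded JSON value to a plausible Pinot data type."""
    # bool must be checked before int, because bool is a subclass of int in Python
    if isinstance(value, bool):
        return "BOOLEAN"
    if isinstance(value, int):
        return "LONG"
    if isinstance(value, float):
        return "DOUBLE"
    if isinstance(value, (dict, list)):
        return "JSON"
    # Strings and nulls fall back to STRING in this sketch
    return "STRING"


def infer_schema(message):
    """Infer a data type for each top-level field of one JSON message."""
    record = json.loads(message)
    return {name: infer_pinot_type(value) for name, value in record.items()}


# Hypothetical sample message
sample = '{"ts": "2023-01-01 10:00:00", "uuid": "a1b2", "count": 3}'
print(infer_schema(sample))
```

In this sketch a timestamp that arrives as text is inferred as a plain string, which is one reason an inferred type may need adjusting by hand.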
We'll change the field type of the ts field.
The updated data transformation is shown below:
Updated Columns and field/data types
Once you're happy with the data transformations, scroll down and click the NEXT button.
On this screen you'll be able to configure indexes, tenants, ingestion scheduling, and data retention for this data source.
Configure indexes, tenants, ingestion scheduling, and data retention
For more information on the different types of indexes and when to use them, see the Apache Pinot Indexing Documentation.
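Data Manager creates the table configuration for us, but it can help to know roughly where these settings live in an Apache Pinot table config. The following is a minimal sketch, assuming a real-time table named events with ts as its time column. The uuid column, the 30-day retention, the choice of indexes, and the DefaultTenant tenant names are hypothetical values chosen for illustration, not what Data Manager generates:

```python
import json

# Fragment of an Apache Pinot table config (illustrative values only)
table_config = {
    "tableName": "events",
    "tableType": "REALTIME",
    "segmentsConfig": {
        "timeColumnName": "ts",
        # Data retention: segments older than this are removed
        "retentionTimeUnit": "DAYS",
        "retentionTimeValue": "30",
    },
    # Tenants decide which brokers and servers handle this table
    "tenants": {"broker": "DefaultTenant", "server": "DefaultTenant"},
    "tableIndexConfig": {
        # Inverted index for exact-match filters (hypothetical column)
        "invertedIndexColumns": ["uuid"],
        # Range index for filters over a range of values
        "rangeIndexColumns": ["ts"],
    },
}

print(json.dumps(table_config, indent=2))
```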
Once you're happy with the configuration, scroll down and click the NEXT button.
You'll now see the review and submit screen, where you can review everything that we've configured in the previous steps.
Review Data Source
If anything doesn't look right, click on the PREV button to go back to the previous screen.
Once you're ready to create the data source, click on the FINISH button. You'll then see the following screen:
Data Source Created
Query Data Source
To have a look at the data that we've imported, click on the Query Console link, which will open the Pinot Data Explorer. Click on the events table and then click RUN QUERY to run a basic query against the data source:
Query events Data Source
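The same kind of query can also be sent to a Pinot broker over HTTP. The following is a minimal sketch using only the Python standard library and the broker's /query/sql endpoint. The broker URL is a hypothetical placeholder for a local Pinot broker, a StarTree Cloud deployment will have its own URL and will likely require an authentication header that is not shown here, and the SQL is an assumed example of a basic query rather than the exact one the console runs:

```python
import json
import urllib.request


def build_query_request(broker_url, sql):
    """Build a POST request for the Pinot broker's SQL query endpoint."""
    body = json.dumps({"sql": sql}).encode("utf-8")
    return urllib.request.Request(
        f"{broker_url}/query/sql",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(broker_url, sql):
    """Send the query and return the decoded JSON response (needs a reachable broker)."""
    with urllib.request.urlopen(build_query_request(broker_url, sql)) as response:
        return json.load(response)


# Hypothetical broker address; only the request is built here, nothing is sent
request = build_query_request("http://localhost:8099", "select * from events limit 10")
print(request.get_method(), request.full_url)
```

Calling run_query with the same arguments would send the request and return the broker's JSON response.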