StarTree Data Manager
The StarTree Data Manager makes it easy to get data into a StarTree Cloud cluster.
What is the StarTree Data Manager?
The StarTree Data Manager is a no-code interface for creating data pipelines and ingesting data from a variety of sources. You can ingest data into Apache Pinot with a few clicks in a graphical user interface (GUI), and apply transformations and enrichments along the way, including adding and removing fields.
Why is the StarTree Data Manager important?
The StarTree Data Manager makes it easy to ingest data into StarTree through a no-code interface, which removes complexity and minimizes potential errors. Its visual interface lets you model data, get feedback on transformations, and iterate, so you can settle on the right data model while data is ingested as a table and shorten time to value.
The StarTree Data Manager also helps you catch issues such as data format incompatibilities, data quality problems, and connectivity failures, which can otherwise be time-consuming to track down.
Use Data Manager to ingest data
To create an integration from any supported source, do the following:
Select or create a connection to your data source.
Select the data source: In StarTree Data Manager, the data source and connection are separated to facilitate reusability of connections. Once a connection to a source (like S3 or Confluent Cloud) is established, you need to select the exact data source from that connection, like a directory in S3 or a topic in Confluent Cloud, which will then be mapped to a specific table in Apache Pinot.
Perform data modeling: Iteratively transform and enrich data, and preview the transformations. Update the schema by adding new fields, removing fields from the source, or changing the fields that already exist in the source. Alternatively, provide the schema in JSON format, and preview the changes to the data model.
Configure indexes and other advanced configurations: Choose from the available index types for one or more fields in the schema. Configure the star-tree index as needed, and apply advanced configurations (like upsert, data retention, and batch schedules). Alternatively, provide the configuration as JSON to apply the indexes and other settings.
Preview and create the table: Verify the choices you've made so far. To make changes, update the schema and table configurations as needed, and then complete the table creation workflow.
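The modeling and configuration steps above both accept JSON as an alternative to the GUI. The following is a minimal sketch of what such JSON might look like, built in Python. It assumes the standard Apache Pinot schema and table-config field names; the exact JSON that Data Manager accepts may differ. The `orders` table, its columns, and the helper functions `build_schema` and `build_table_config` are hypothetical and exist only for illustration.

```python
import json


def build_schema(name, dimensions, metrics, time_column):
    """Build an Apache Pinot style schema dict (field names assumed from Pinot)."""
    return {
        "schemaName": name,
        "dimensionFieldSpecs": [{"name": n, "dataType": t} for n, t in dimensions],
        "metricFieldSpecs": [{"name": n, "dataType": t} for n, t in metrics],
        "dateTimeFieldSpecs": [
            {
                "name": time_column,
                "dataType": "LONG",
                "format": "1:MILLISECONDS:EPOCH",
                "granularity": "1:MILLISECONDS",
            }
        ],
    }


def build_table_config(schema, inverted_index_columns, star_tree_dimensions, retention_days):
    """Build a table config dict with indexes, a star-tree index, and retention."""
    known = {
        spec["name"]
        for key in ("dimensionFieldSpecs", "metricFieldSpecs", "dateTimeFieldSpecs")
        for spec in schema[key]
    }
    # Reject index columns that the schema does not define.
    unknown = (set(inverted_index_columns) | set(star_tree_dimensions)) - known
    if unknown:
        raise ValueError(f"columns not in schema: {sorted(unknown)}")

    metric_names = [spec["name"] for spec in schema["metricFieldSpecs"]]
    return {
        "tableName": schema["schemaName"],
        "tableType": "OFFLINE",
        "segmentsConfig": {
            "timeColumnName": schema["dateTimeFieldSpecs"][0]["name"],
            "retentionTimeUnit": "DAYS",
            "retentionTimeValue": str(retention_days),
            "replication": "1",
        },
        "tableIndexConfig": {
            "invertedIndexColumns": list(inverted_index_columns),
            "starTreeIndexConfigs": [
                {
                    "dimensionsSplitOrder": list(star_tree_dimensions),
                    # Pre-aggregate SUM for every metric column.
                    "functionColumnPairs": [f"SUM__{m}" for m in metric_names],
                    "maxLeafRecords": 10000,
                }
            ],
        },
        "tenants": {},
        "metadata": {},
    }


# Hypothetical example table; none of these names come from StarTree.
schema = build_schema(
    "orders",
    dimensions=[("country", "STRING"), ("status", "STRING")],
    metrics=[("amount", "DOUBLE")],
    time_column="created_at",
)
table_config = build_table_config(
    schema,
    inverted_index_columns=["status"],
    star_tree_dimensions=["country", "status"],
    retention_days=30,
)

print(json.dumps(schema, indent=2))
print(json.dumps(table_config, indent=2))
```

Validating index columns against the schema before producing the JSON mirrors, in a small way, the feedback the preview step gives you in the GUI.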
Create a table using StarTree Data Manager
For detailed step-by-step instructions on ingesting data with StarTree Data Manager, see Ingest Data.
What connectors are available?
The following connectors are supported in StarTree Data Manager:
- Apache Kafka
- AWS Kinesis
- AWS S3
- Confluent Cloud
- Delta Lake
- Google BigQuery
- Snowflake
- File Upload