Create and update an Apache Pinot table configuration
In Apache Pinot, you create a table by writing a JSON file, generally referred to as your table config, and uploading it to the cluster. To change the table later, update, add, or delete parameters in the table config as needed, and then upload the modified file.
Create a Pinot table configuration
Before you create a Pinot table config, you must first have a running Pinot cluster with broker and server tenants. In StarTree Cloud, this is taken care of for you.
- Create a plaintext file locally using settings from the available properties for your use case.
- Use the Pinot API to upload your table config file:
POST @fileName.json URL:9000/tables
You may find it useful to download an example from the Pinot GitHub and then modify it. One such example is included at the end of this page in Example Pinot table config file.
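The upload step above can be sketched in Python with only the standard library. This is a minimal sketch, not the only way to call the API: the controller address http://localhost:9000 and the helper name build_create_table_request are assumptions for illustration, and the function only builds the request so you can inspect it before sending.

```python
import json
import urllib.request

# Assumption: the Pinot controller is reachable at this address; replace it with your own URL.
CONTROLLER_URL = "http://localhost:9000"


def build_create_table_request(table_config, controller_url=CONTROLLER_URL):
    """Build (but do not send) a POST /tables request with the table config as the JSON body."""
    body = json.dumps(table_config).encode("utf-8")
    return urllib.request.Request(
        url=f"{controller_url}/tables",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To send the request, load your table config file with json.load, pass the result to build_create_table_request, and hand the returned object to urllib.request.urlopen.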
Update a Pinot table configuration
To modify your Pinot table configuration, use the Pinot UI or the API.
Any time you make a change to your table config, you may need to do one or more of the following, depending on the change.
Simple changes only require updating and saving your modified table config file. These include:
- Changing the data or segment retention time
- Changing the real-time consumption rate limiter settings
To update existing data and segments, after you update and save the change(s) to the table config file, do the following as applicable:
When you add or modify indexes or the table schema, perform a segment reload. To reload all segments:
- In the Pinot UI, from the table page, click Reload All Segments.
- Using the Pinot API, send POST /segments/{tableName}/reload.
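As an illustration of the API option, the following Python sketch builds the reload call. The controller address and the helper name build_reload_request are assumptions; the endpoint path is the one given above.

```python
import urllib.parse
import urllib.request


def build_reload_request(table_name, controller_url="http://localhost:9000"):
    """Build (but do not send) POST /segments/{tableName}/reload for the given table.

    The default controller_url is an assumed local address.
    """
    # Percent-encode the table name so it is safe to place in the URL path.
    quoted_name = urllib.parse.quote(table_name, safe="")
    return urllib.request.Request(
        url=f"{controller_url}/segments/{quoted_name}/reload",
        method="POST",
    )
```

Passing the returned request to urllib.request.urlopen sends it to the controller.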
When you re-partition data, perform a segment refresh. To refresh, replace an existing segment with a new one by uploading a segment that reuses the existing filename.
- Using the Pinot API, send POST /segments?tableName={yourTableName}.
- Automate this action by including SegmentRefreshTask in your table config to make Pinot refresh segments if they are not consistent with the table config. See the SegmentRefreshTask documentation for limitations to using this.
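One way to add the task entry programmatically is sketched below. The placement under task and taskTypeConfigsMap follows the usual location for Pinot minion task settings; the helper name and the empty settings map are assumptions, because the options SegmentRefreshTask accepts are not listed on this page. Check the SegmentRefreshTask documentation before relying on it.

```python
import copy


def with_segment_refresh_task(table_config, task_settings=None):
    """Return a copy of table_config with a SegmentRefreshTask entry added.

    task_settings is a hypothetical map of task options; it defaults to empty.
    """
    updated = copy.deepcopy(table_config)  # leave the caller's config untouched
    task_map = updated.setdefault("task", {}).setdefault("taskTypeConfigsMap", {})
    task_map["SegmentRefreshTask"] = dict(task_settings or {})
    return updated
```

Upload the returned config like any other table config change.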
When you change the transform function used to populate a derived field or increase the number of partitions in an upsert-enabled table, perform a table re-bootstrap. One way to do this is to delete and recreate the table:
- Using the Pinot API, first send DELETE /tables/{tableName} followed by POST /tables with the new table config.
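The delete-and-recreate sequence might look like the following Python sketch. The controller address and the helper name are assumptions. The function returns the two requests in the order they should be sent and sends nothing itself, since deleting a table is destructive.

```python
import json
import urllib.parse
import urllib.request


def build_rebootstrap_requests(table_name, new_table_config,
                               controller_url="http://localhost:9000"):
    """Build (but do not send) DELETE /tables/{tableName} and then POST /tables.

    The default controller_url is an assumed local address.
    """
    quoted_name = urllib.parse.quote(table_name, safe="")
    delete_request = urllib.request.Request(
        url=f"{controller_url}/tables/{quoted_name}",
        method="DELETE",
    )
    create_request = urllib.request.Request(
        url=f"{controller_url}/tables",
        data=json.dumps(new_table_config).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Send the delete first and confirm it succeeded before sending the create.
    return delete_request, create_request
```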
When you change the stream topic or change the Kafka cluster containing the Kafka topic you want to consume from, perform a real-time ingestion pause and resume. To pause and resume real-time ingestion:
- Using the Pinot API, first send POST /tables/{tableName}/pauseConsumption followed by POST /tables/{tableName}/resumeConsumption.
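A minimal Python sketch of the pause and resume calls follows. The controller address and the helper name are assumptions, and the sketch does not wait for consumption to actually stop between the two calls, which you may want to verify in practice.

```python
import urllib.parse
import urllib.request


def build_pause_resume_requests(table_name, controller_url="http://localhost:9000"):
    """Build (but do not send) the pauseConsumption and resumeConsumption requests.

    The default controller_url is an assumed local address.
    """
    quoted_name = urllib.parse.quote(table_name, safe="")
    base = f"{controller_url}/tables/{quoted_name}"
    pause_request = urllib.request.Request(url=f"{base}/pauseConsumption", method="POST")
    resume_request = urllib.request.Request(url=f"{base}/resumeConsumption", method="POST")
    # Send pause first, apply the table config change, then send resume.
    return pause_request, resume_request
```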
Update a Pinot table in the UI
To update a table configuration in the Pinot UI, do the following:
- In the Cluster Manager, click the Tenant Name of the tenant that hosts the table you want to modify.
- Click the Table Name in the list of tables in the tenant.
- Click the Edit Table button. This opens a pop-up window containing the table config. Edit the contents in this window. Click Save when you are done.
Update a Pinot table using the API
To update a table configuration using the Pinot API, do the following:
- Get the current table configuration with GET /tables/{tableName}.
- Modify the file locally.
- Upload the edited file with PUT /tables/{tableName} fileName.json.
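The get, modify, and upload steps can be sketched in Python as follows. This sketch assumes the GET response has the same shape as the example at the end of this page, with the config nested under its table type. The controller address and all helper names are assumptions. The retention keys retentionTimeUnit and retentionTimeValue are used as a sample simple change.

```python
import copy
import json
import urllib.parse
import urllib.request


def extract_table_config(get_response, table_type="OFFLINE"):
    """Pull the table config out of a response keyed by table type (assumed shape)."""
    return get_response[table_type]


def with_retention(table_config, unit, value):
    """Return a copy of the config with the segment retention settings changed."""
    updated = copy.deepcopy(table_config)
    segments_config = updated.setdefault("segmentsConfig", {})
    segments_config["retentionTimeUnit"] = unit
    segments_config["retentionTimeValue"] = str(value)
    return updated


def build_update_request(table_name, table_config,
                         controller_url="http://localhost:9000"):
    """Build (but do not send) PUT /tables/{tableName} with the edited config."""
    quoted_name = urllib.parse.quote(table_name, safe="")
    return urllib.request.Request(
        url=f"{controller_url}/tables/{quoted_name}",
        data=json.dumps(table_config).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
```

Building the request separately from sending it lets you review the edited JSON before it reaches the controller.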
Example Pinot table config file
This example comes from the Apache Pinot Quickstart Examples. This table config defines a table called airlineStats_OFFLINE, which you can interact with by running the example.
{
  "OFFLINE": {
    "tableName": "airlineStats_OFFLINE",
    "tableType": "OFFLINE",
    "segmentsConfig": {
      "timeType": "DAYS",
      "replication": "1",
      "segmentAssignmentStrategy": "BalanceNumSegmentAssignmentStrategy",
      "timeColumnName": "DaysSinceEpoch",
      "segmentPushType": "APPEND",
      "minimizeDataMovement": false
    },
    "tenants": {
      "broker": "DefaultTenant",
      "server": "DefaultTenant"
    },
    "tableIndexConfig": {
      "rangeIndexVersion": 2,
      "autoGeneratedInvertedIndex": false,
      "createInvertedIndexDuringSegmentGeneration": false,
      "loadMode": "MMAP",
      "enableDefaultStarTree": false,
      "starTreeIndexConfigs": [
        {
          "dimensionsSplitOrder": [
            "AirlineID",
            "Origin",
            "Dest"
          ],
          "skipStarNodeCreationForDimensions": [],
          "functionColumnPairs": [
            "COUNT__*",
            "MAX__ArrDelay"
          ],
          "maxLeafRecords": 10
        },
        {
          "dimensionsSplitOrder": [
            "Carrier",
            "CancellationCode",
            "Origin",
            "Dest"
          ],
          "skipStarNodeCreationForDimensions": [],
          "functionColumnPairs": [
            "MAX__CarrierDelay",
            "AVG__CarrierDelay"
          ],
          "maxLeafRecords": 10
        }
      ],
      "enableDynamicStarTreeCreation": true,
      "aggregateMetrics": false,
      "nullHandlingEnabled": false,
      "optimizeDictionary": false,
      "optimizeDictionaryForMetrics": false,
      "noDictionarySizeRatioThreshold": 0
    },
    "metadata": {
      "customConfigs": {}
    },
    "fieldConfigList": [
      {
        "name": "ts",
        "encodingType": "DICTIONARY",
        "indexType": "TIMESTAMP",
        "indexTypes": [
          "TIMESTAMP"
        ],
        "timestampConfig": {
          "granularities": [
            "DAY",
            "WEEK",
            "MONTH"
          ]
        }
      }
    ],
    "ingestionConfig": {
      "transformConfigs": [
        {
          "columnName": "ts",
          "transformFunction": "fromEpochDays(DaysSinceEpoch)"
        },
        {
          "columnName": "tsRaw",
          "transformFunction": "fromEpochDays(DaysSinceEpoch)"
        }
      ],
      "continueOnError": false,
      "rowTimeValueCheck": false,
      "segmentTimeValueCheck": true
    },
    "tierConfigs": [
      {
        "name": "hotTier",
        "segmentSelectorType": "time",
        "segmentAge": "3130d",
        "storageType": "pinot_server",
        "serverTag": "DefaultTenant_OFFLINE"
      },
      {
        "name": "coldTier",
        "segmentSelectorType": "time",
        "segmentAge": "3140d",
        "storageType": "pinot_server",
        "serverTag": "DefaultTenant_OFFLINE"
      }
    ],
    "isDimTable": false
  }
}