Recipes
These recipes (how-to guides) show how to solve common problems with Apache Pinot.
Batch Ingestion
- Importing CSV files with columns containing spaces
- Import data files from different directories
- Ingest CSV files from an S3 bucket
- Ingest JSON files
- Ingest Parquet Files from an S3 Bucket into Pinot Using Spark
- Backfill offline segment
Streaming Ingestion
- Ingest simple JSON data from Kafka
- Ingest data from Kafka configured with SASL authentication
- Ingest data from Kafka configured with SSL and SASL authentication
- Ingest GitHub API Events using Kinesis
- Ingest data from Pulsar
- Configuring segment threshold
- Ingest Avro messages with Confluent Schema Registry
Transformation Functions
- Groovy Transformation Functions
- JSON Transformation Functions
- Chaining Transformation Functions
- Filtering Functions
- DateTime Strings to Timestamps
- Combine source fields
Deep Storage
Upserts
Real-Time to Offline Job
- Manually scheduling real-time to offline job
- Automatically scheduling real-time to offline job
- Upserts and the real-time to offline job
JSON Documents
- Unnest arrays in JSON documents
- Rename fields when unnesting arrays in JSON documents
- Flattening nested objects
- Index JSON columns
- Update JSON index