Release Version 0.7.1: November 2023

Significant Apache Pinot updates since last StarTree release

For complete details on Pinot changes, see Releases.

Breaking changes

Dependencies

StarTree extensions for Apache Pinot

The following updates are available only in StarTree Cloud.

  • Improvements to file ingestion task:
    • Enhancements to batch ingestion using minion to improve atomic ingestion and backfill operations
    • Control size-based segment creation with desiredSegmentSize to improve performance
  • Automatically tune segment size for the segment refresh task without configuring maxNumRecordsPerTask and maxNumRecordsPerSegment. Size-based tuning produces predictable segment sizes and helps avoid memory- or size-related exceptions
  • Validation is stricter when sync mode is used in conjunction with other tasks. You can no longer schedule the segment refresh task at the same time as sync mode.
  • Separate the RocksDB log from server logs to improve the debugging experience and allow you to set different retention and rollover policies
  • Improve Kafka logs by setting the following classes to error-level logging:
    • KafkaConsumer
    • AppInfoParser
    • ConsumerConfig
  • Enhancements to upsert tables:
    • Correctly track primary key count and add corresponding metrics
    • Improve stability during deletion
  • Improve performance and navigation in broker and server Grafana dashboards
  • Move to Google Trust Services Certificate Authority to improve certificate management
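
As a rough illustration of the size-based segment tuning mentioned above, the sketch below estimates how many records fit in a segment of a target size and how many segments a dataset would need. It is not StarTree's implementation. The function names, the size-string format (for example "500M"), and the use of a single average record size are all assumptions made for illustration.

```python
import math


def parse_size(text: str) -> int:
    """Convert a size string such as '500M' or '1G' to bytes.

    The suffix format is an assumption for this sketch, not a documented
    format for desiredSegmentSize.
    """
    units = {"K": 1024, "M": 1024**2, "G": 1024**3}
    text = text.strip().upper()
    if text[-1] in units:
        return int(float(text[:-1]) * units[text[-1]])
    return int(text)


def plan_segments(total_records: int, avg_record_bytes: int, desired_segment_size: str):
    """Return (records_per_segment, num_segments) for a target segment size.

    Hypothetical helper: it derives a record count from a size target, so
    no explicit per-segment record limit has to be configured by hand.
    """
    target_bytes = parse_size(desired_segment_size)
    # At least one record per segment, even if a record exceeds the target.
    records_per_segment = max(1, target_bytes // avg_record_bytes)
    num_segments = math.ceil(total_records / records_per_segment)
    return records_per_segment, num_segments


if __name__ == "__main__":
    per_segment, count = plan_segments(10_000_000, 200, "500M")
    print(f"{per_segment} records per segment, {count} segments")
```

Deriving the record count from a size target, rather than fixing it up front, keeps segment sizes predictable when record widths differ between tables.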

Data Manager

  • Improve data sampling from Kafka topics with large numbers of partitions by preventing the "no data" error in preview
  • Automate Google Cloud Platform (GCP) credentials in Data Manager so you can ingest data without contacting StarTree support
  • Improve error messages to aid troubleshooting

ThirdEye

  • Improve loading time for multi-dimension alerts and dashboard statistics
  • Simplify alert creation with advanced anomaly detection and tuning options, reducing the complexity of handling data patterns and seasonality