StarTree Cloud

StarTree Cloud provides managed hosting for Apache Pinot on all major cloud platforms, including AWS, GCP and Azure. StarTree Cloud enables developers to provision Apache Pinot clusters of different sizes, ingest data from real-time and batch data sources, and run analytics workloads with ultra-low latency.

StarTree manages the underlying infrastructure for you, so you can get timely insights from a diverse set of data and make informed business decisions.

StarTree Cloud also includes tools that improve the developer experience, such as StarTree Dataset Manager (for data ingestion) and StarTree ThirdEye (for anomaly detection).

What are the use cases for StarTree Cloud?

  • User-facing analytics
  • Business metrics
  • Anomaly detection
  • Root cause analysis
  • Dashboards
  • Log analytics
  • Cohort analytics
  • Ad-hoc exploration

Key features of StarTree Cloud

Managed Apache Pinot

  • Query Latency and Speed: Computes results on the fly and stays fast thanks to its indexing strategies, partitioning and data layout, and bloom filters. It also supports partially pre-aggregated values, which gives very low latency for real-time analytics.
    • Millisecond-level latencies for most OLAP queries
    • Lets you achieve a hard upper bound on query latency for a given use case
    • Low latency with high throughput
  • Data Mutability: Designed to answer OLAP queries with low latency and low cost to serve on both immutable and mutable data (upsert support).
  • Indexing: The following indexing techniques are supported:
    • Inverted Index
    • Sorted Index
    • Range Index
    • JSON Index
    • Text Index
    • GeoSpatial Index
    • StarTree Index
  • Throughput: Purpose-built to support very high throughput for analytical workloads. Can support 10,000+ QPS in a single cluster.
  • Cost to serve: Low. Columnar storage provides excellent compression, leading to a smaller storage and in-memory footprint.
  • Operational/Production Readiness: Built to be multi-tenant. Provides an easy way to scale a cluster up / down, replace nodes, and reshuffle data.
  • Advanced query features (joins): Limited support. Fact-dimension joins are supported, and an early version of fact-fact distributed shuffle join is also available.
  • Integration with existing data ecosystems: Integrates well with the rest of the data ecosystem. Has excellent support for backfills.
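
As a rough illustration of how some of the index types listed above are typically declared, here is a minimal sketch that builds a fragment of a Pinot-style table config in Python. The table and column names (orders, customer_id, country, and so on) are hypothetical, and the key names follow the Apache Pinot table config format as commonly documented. Check them against the Pinot version your cluster runs. Text and geospatial indexes are configured per field rather than in this section, so they are left out, and the fragment is not a complete table config.

```python
import json


def build_table_index_config(table_name: str = "orders") -> dict:
    """Build a partial, illustrative Pinot-style table config.

    All table and column names are hypothetical. Required sections of a
    real table config (for example segmentsConfig and tenants) are omitted.
    """
    return {
        "tableName": table_name,
        "tableType": "OFFLINE",
        "tableIndexConfig": {
            # Sorted index: rows in each segment are ordered by this column.
            "sortedColumn": ["customer_id"],
            # Inverted index: speeds up equality filters on the column.
            "invertedIndexColumns": ["country"],
            # Range index: speeds up range filters such as order_total > 100.
            "rangeIndexColumns": ["order_total"],
            # JSON index: speeds up filters on keys inside a JSON column.
            "jsonIndexColumns": ["attributes"],
            # Bloom filter: lets segments without a value be skipped.
            "bloomFilterColumns": ["order_id"],
            # Star-tree index: stores pre-aggregated values for the
            # listed dimensions and aggregation functions.
            "starTreeIndexConfigs": [
                {
                    "dimensionsSplitOrder": ["country", "customer_id"],
                    "functionColumnPairs": ["SUM__order_total", "COUNT__*"],
                    "maxLeafRecords": 10000,
                }
            ],
        },
    }


if __name__ == "__main__":
    # Print the fragment as JSON, the format table configs are written in.
    print(json.dumps(build_table_index_config(), indent=2))
```

Which indexes to enable depends on the query patterns of the use case. Each one trades extra storage and ingestion work for faster filtering or aggregation.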

StarTree Dataset Manager

The StarTree Dataset Manager is a UI that makes it easy to onboard data into StarTree Cloud.

You can access the Dataset Manager by navigating to the StarTree Dataset Manager from the environment page. To learn more, see the Dataset Manager documentation.

Learn more

To learn more, see the following developer guides: