Connect Trino to StarTree Cloud
There are a few ways to install and use Trino. You can:
- Install Trino
- Use Trino through the Trino CLI
- Use Trino through the JDBC driver
In a StarTree deployment, Pinot components are hosted within a Kubernetes cluster. Pinot server endpoints are not exposed outside the Kubernetes cluster in this setup. While this is fine for most use cases, it creates a problem for external services like Trino.
Pinot Proxy is our solution for letting Trino connect to a Pinot cluster that runs inside Kubernetes. Without Pinot Proxy, we would have to create a dedicated Kubernetes route for every Pinot Server host, and that approach does not scale.
At a very high level, Pinot Proxy behaves similarly to a reverse proxy like Nginx. It proxies RPC requests into Pinot clusters inside Kubernetes.
A special feature of Pinot Proxy is that it can forward Trino gRPC requests to specific Pinot Servers rather than leaving the choice of server to a load balancer.
This diagram illustrates how Pinot Proxy communicates with the Trino Pinot driver.
We have detailed slides explaining the technical aspects of this design and why we need a Pinot proxy.
The main benefit is that with Pinot Proxy, Trino can finally run queries in streaming mode when connecting to Pinot inside a Kubernetes cluster.
For more information on Pinot connectors, see the Apache Pinot Connector documentation.
NOTE: Trino has to be version 400 or higher to properly connect to Pinot in Kubernetes.
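The version requirement in the note can be checked before wiring up the catalog. The following is a minimal sketch, not part of any Trino or StarTree tooling: the helper name `trino_version_ok` is hypothetical, and it assumes the version string begins with the integer release number (for example `400`), however you obtain it from your deployment.

```python
import re

MIN_TRINO_VERSION = 400  # minimum release stated in the note above


def trino_version_ok(version: str, minimum: int = MIN_TRINO_VERSION) -> bool:
    """Return True if a Trino version string meets the minimum release.

    Assumption: the string starts with the integer release number,
    optionally followed by a suffix (e.g. "435-e.1").
    """
    match = re.match(r"\s*(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized Trino version: {version!r}")
    return int(match.group(1)) >= minimum
```

For example, `trino_version_ok("399")` is false, while `trino_version_ok("400")` is true.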
To connect via Pinot Proxy, the Trino connection URLs must point to the Pinot Proxy host names. Suppose your Pinot cluster is named pug, the environment name is prod, and the StarTree Cloud domain is awesome-company.startree.cloud. The following settings are needed in the pinot.properties configuration in Trino:
# Pinot controller URI, in the format of <scheme>://<hostname>:<port>, scheme is required.
# Replace the pug.prod.awesome-company.startree.cloud with the link to your pinot cluster
pinot.controller-urls=https://proxy.broker.pug.prod.awesome-company.startree.cloud:443

# Pinot Server gRPC port, Trino default is 8090, StarTree Cloud default is 8096
pinot.grpc.port=8096

# Enable Pinot Rest Proxy
pinot.proxy.enabled=true

# Pinot Rest Proxy gRPC URI, in the format of <hostname>:<port>
# Replace the pug.prod.awesome-company.startree.cloud with the link to your pinot cluster
# For the clusters without TLS enabled, the port number will be 80
pinot.grpc.proxy-uri=proxy-grpc.broker.pug.prod.awesome-company.startree.cloud:443
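The host names above follow a visible pattern built from the cluster name, environment name, and StarTree Cloud domain. As an illustration, the sketch below renders the same four properties for other clusters. The function name `build_pinot_proxy_properties` and its parameters are hypothetical, and the host name pattern is inferred only from the example above, so verify the result against your own cluster's links.

```python
def build_pinot_proxy_properties(
    cluster: str,
    env: str,
    domain: str,
    grpc_proxy_port: int = 443,
) -> str:
    """Render the Pinot Proxy settings for Trino's pinot.properties.

    Assumption: host names follow the pattern shown in the example,
    i.e. proxy.broker.<cluster>.<env>.<domain> for the controller URL and
    proxy-grpc.broker.<cluster>.<env>.<domain> for the gRPC proxy URI.
    """
    base = f"{cluster}.{env}.{domain}"
    lines = [
        # Controller URI; the scheme is required.
        f"pinot.controller-urls=https://proxy.broker.{base}:443",
        # StarTree Cloud default gRPC port (Trino's own default is 8090).
        "pinot.grpc.port=8096",
        "pinot.proxy.enabled=true",
        # Per the comments above, use port 80 for clusters without TLS.
        f"pinot.grpc.proxy-uri=proxy-grpc.broker.{base}:{grpc_proxy_port}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_pinot_proxy_properties("pug", "prod", "awesome-company.startree.cloud"))
```

Only the gRPC proxy port is parameterized, because the text specifies port 80 for clusters without TLS only for that URI.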
If the cluster has TLS enabled, add the following extra properties to the configuration file.
Follow the doc to generate an API token and credential for your Pinot auth token, then extract the username and password from it.
# Extra Pinot gRPC configs
# Enable gRPC TLS
pinot.grpc.use-plain-text=false

# Authentication configs
pinot.controller.authentication.type=PASSWORD
pinot.controller.authentication.user=<startree-cloud-pinot-user>
pinot.controller.authentication.password=<startree-cloud-pinot-password>
pinot.broker.authentication.type=PASSWORD
pinot.broker.authentication.user=<startree-cloud-pinot-user>
pinot.broker.authentication.password=<startree-cloud-pinot-password>
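Extracting the username and password from the auth token can be sketched as follows. This assumes the token has the standard HTTP Basic form, `Basic <base64(user:password)>`. That format is an assumption here, not something this page confirms, so check the token your cluster actually issues. The helper name `split_basic_token` is hypothetical.

```python
import base64


def split_basic_token(token: str) -> tuple[str, str]:
    """Split an HTTP Basic style token into (user, password).

    Assumption: the token is "Basic <base64(user:password)>" or just the
    base64 part; other token formats are not handled.
    """
    encoded = token.strip()
    if encoded.lower().startswith("basic "):
        encoded = encoded[len("basic "):].strip()
    decoded = base64.b64decode(encoded, validate=True).decode("utf-8")
    # Split on the first colon only, so passwords may contain colons.
    user, sep, password = decoded.partition(":")
    if not sep:
        raise ValueError("token does not decode to user:password")
    return user, password
```

The two returned values would go into the `authentication.user` and `authentication.password` properties for both the controller and the broker.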