Cohort Recommender

API Reference

The cohort recommender API is accessible at /api/rca/metrics/cohorts. Example:

POST https://thirdeye.yournamespace.domain.com/api/rca/metrics/cohorts
accept: application/json

This endpoint accepts a JSON payload and can be used in several ways.
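As a rough illustration of the call described above, the sketch below prepares the POST request with Python's standard library. The host is the placeholder from the example, the `Content-Type` header and the helper name `build_cohort_request` are assumptions made for this sketch, and any authentication your deployment requires is not shown.

```python
import json
import urllib.request

# Placeholder host taken from the example above; replace with your own deployment.
BASE_URL = "https://thirdeye.yournamespace.domain.com"


def build_cohort_request(payload: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Prepare (but do not send) a POST request to the cohort recommender endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/rca/metrics/cohorts",
        data=body,
        method="POST",
        headers={
            "Accept": "application/json",
            # Assumption: the JSON body is declared with a Content-Type header.
            "Content-Type": "application/json",
        },
    )


# start and end are passed as strings, as in the payload examples on this page.
request = build_cohort_request(
    {
        "metric": {"id": 12746},
        "percentage": 10,
        "start": "1623110400000",
        "end": "1623283200000",
    }
)
print(request.get_method(), request.full_url)
```

To actually send the request, pass it to `urllib.request.urlopen(request)` and decode the JSON response body.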

Parameters

| Name | Description |
| --- | --- |
| start | The start time in epoch milliseconds. Example: 1623110400000 means Tue Jun 08 2021 00:00:00 UTC |
| end | The end time in epoch milliseconds. Example: 1623283200000 means Thu Jun 10 2021 00:00:00 UTC |
| threshold | An absolute metric value. Cohorts that contribute more than this value are included. |
| percentage | Used when threshold is not provided. Example: setting percentage to 10 means that threshold = 10% of the overall aggregate. |
| generateEnumerationItems (beta) | If set to true, ThirdEye also tries to generate the list of enumeration items in addition to cohorts. |
| where | An additional WHERE clause, written as a SQL expression, that is added to the query. Example: "where": "country LIKE 'US%' AND \"device\" = 'phone'" |
| having | Similar to where: a HAVING clause, written as a SQL expression, that is added to the query. Example: "COUNT(*) > 1000000" |
| maxDepth | The maximum number of dimensions the recommendation engine drills into. Example: maxDepth = 3 means every reported cohort has at most 3 dimensions, such as country = 'US', device = 'phone', version = '0.3.0'. The default value is 10. |
| dimensions | If set, the list of dimensions the cohort recommender iterates through. Example: if the dataset has 5 dimensions and "dimensions": ["country", "device"], ThirdEye iterates only on these 2 dimensions. |
| limit | If set, this behaves like top N: results are sorted in descending order by contribution and trimmed to limit. The default value is 100. |
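The two numeric conventions in the parameter list (epoch milliseconds for start and end, and percentage as a fallback for threshold) can be sketched as follows. Both helper names are hypothetical, and `effective_threshold` only restates the documented rule on the client side; it is not ThirdEye's internal implementation.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(moment: datetime) -> int:
    """Convert a timezone-aware datetime to epoch milliseconds."""
    if moment.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime")
    # Integer division of timedeltas avoids floating-point rounding.
    return (moment - EPOCH) // timedelta(milliseconds=1)


def effective_threshold(overall_aggregate, threshold=None, percentage=None):
    """Restate the documented rule: threshold wins, otherwise percentage of the aggregate."""
    if threshold is not None:
        return threshold
    if percentage is not None:
        return overall_aggregate * percentage / 100
    # The page does not say what happens when neither is set.
    return None


start = to_epoch_millis(datetime(2021, 6, 8, tzinfo=timezone.utc))   # 1623110400000
end = to_epoch_millis(datetime(2021, 6, 10, tzinfo=timezone.utc))    # 1623283200000

# Hypothetical overall aggregate of 2,000,000 with percentage = 10.
print(effective_threshold(2_000_000, percentage=10))  # 200000.0
```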

Metric

A metric can be specified in several ways in the API.

Using ID

{
  "metric": {
    "id": 12746
  }
}

Using name

{
  "metric": {
    "name": "pageviews"
  }
}

Specifying the column name, aggregation function and table name

{
  "metric": {
    "aggregationColumn": "views",
    "aggregationFunction": "SUM",
    "dataset": {
      "name": "pageviews"
    }
  }
}
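The three ways of identifying a metric can be produced from small helpers like the ones below. This is only a convenience sketch: the function names are hypothetical, and the values are the ones used in the examples above.

```python
import json


def metric_by_id(metric_id: int) -> dict:
    """Reference a metric by its ID."""
    return {"id": metric_id}


def metric_by_name(name: str) -> dict:
    """Reference a metric by its name."""
    return {"name": name}


def metric_by_definition(column: str, function: str, dataset: str) -> dict:
    """Describe a metric by aggregation column, aggregation function and table name."""
    return {
        "aggregationColumn": column,
        "aggregationFunction": function,
        "dataset": {"name": dataset},
    }


# Each result goes under the "metric" key of the request payload.
for metric in (
    metric_by_id(12746),
    metric_by_name("pageviews"),
    metric_by_definition("views", "SUM", "pageviews"),
):
    print(json.dumps({"metric": metric}))
```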

Payload Examples

Show me cohorts that contribute more than 10% of the overall metric and have more than 1,000,000 rows in a given time frame.

{
  "metric": {
    "aggregationColumn": "views",
    "aggregationFunction": "SUM",
    "dataset": {
      "name": "pageviews"
    }
  },
  "percentage": 10,
  "having": "COUNT(*) > 1000000",
  "start": "1623110400000",
  "end": "1623283200000"
}

Give me cohorts that are above 400k pageviews, at most 3 dimensions deep, showing only the top 5 for every dimension combination.

{
  "metric": {
    "id": 12746
  },
  "threshold": 400000,
  "maxDepth": 3,
  "limit": 5,
  "start": "1623110400000",
  "end": "1623283200000"
}
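Payloads like the ones above can also be assembled programmatically. The sketch below is a hypothetical client-side helper, not part of ThirdEye: it accepts only the parameter names listed on this page, and the `start < end` check is an added sanity check that the page itself does not mention.

```python
import json

# Optional parameter names as listed in the Parameters section.
OPTIONAL_FIELDS = {
    "threshold", "percentage", "generateEnumerationItems",
    "where", "having", "maxDepth", "dimensions", "limit",
}


def build_payload(metric: dict, start: int, end: int, **options) -> dict:
    """Assemble a cohort recommender payload, rejecting unknown parameter names."""
    unknown = set(options) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    if start >= end:
        # Assumption: a client-side sanity check, not documented server behavior.
        raise ValueError("start must be earlier than end")
    payload = {"metric": metric}
    # Leave out options that were not set.
    payload.update({key: value for key, value in options.items() if value is not None})
    # The examples on this page pass start and end as strings.
    payload["start"] = str(start)
    payload["end"] = str(end)
    return payload


second_example = build_payload(
    {"id": 12746}, 1623110400000, 1623283200000,
    threshold=400000, maxDepth=3, limit=5,
)
print(json.dumps(second_example, indent=2))
```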