startree-mean-variance-dx

Description

Detects an anomaly when the metric falls outside mean ± n*std. The mean and std (standard deviation) are estimated from historical data; the amount of historical data to use is set with the lookback property. Supports aggregation functions with one operand: SUM, MAX, etc. Use the enumerationItems property to configure the different dimensions to explore.
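
The rule above can be sketched in a few lines of Python. This is only an illustration, not ThirdEye's implementation: the function name `is_anomaly`, the parameter `n`, and the use of the sample standard deviation are assumptions.

```python
from statistics import mean, stdev

def is_anomaly(history, current, n=3.0):
    """Return True when `current` falls outside mean ± n*std of `history`."""
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation; needs at least 2 points
    lower, upper = mu - n * sigma, mu + n * sigma
    return not (lower <= current <= upper)

# Hypothetical lookback window: mean is 100 and std is 2, so the band is [94, 106].
history = [100, 102, 98, 101, 99, 100, 103, 97]
print(is_anomaly(history, 101))  # inside the band
print(is_anomaly(history, 130))  # outside the band
```

A longer lookback gives more points to estimate the mean and std from, at the cost of reacting more slowly to level changes.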

Parameters

DATA

name	description	default value
dataSource	The Pinot datasource to use.	-
dataset	The dataset to query.	-
aggregationColumn	The column to aggregate. Can be a derived metric.	-
aggregationFunction	The aggregation function to apply on the aggregationColumn. Example: `AVG`.	-
monitoringGranularity	The period of aggregation of the timeseries. In ISO-8601 format. Example: `PT1H`.	-
timezone	Timezone used to group by time. In TZ-identifier format. For instance, `UTC` or `US/Pacific`.	UTC
timeColumn	TimeColumn used to group by time. If set to AUTO (the default value), the Pinot primary time column is used.	AUTO
timeColumnFormat	Required if timeColumn is not AUTO.
completenessDelay	The time for your data to be considered complete and ready for anomaly detection. In ISO-8601 format. Example: `PT2H`.	P0D
queryFilters	Filters to apply when fetching data. Prefix with `AND`. Example: `AND country='US'`	${queryFilters}
queryLimit	Maximum number of timeseries points to fetch.	100000001

DETECTION

name	description	default value
lookback	Historical time period to use to train the model. In ISO-8601 format. Example: `P21D`.	-
sensitivity	The model will detect fewer anomalies with lower sensitivity and more with higher sensitivity.	-
pattern	Whether to detect an anomaly when it is a drop, a spike, or either of the two.	UP_OR_DOWN
seasonalityPeriod	Seasonality to consider when computing mean and variance. Possible values are `P7D` (weekly and smaller periods), `P1D` (daily and smaller periods), `PT0S` (no seasonality). Example: with `P7D`, a Monday 12 AM value will be estimated from the mean and variance of the previous Monday 12 AM values.	PT0S
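
To make the effect of seasonalityPeriod concrete, here is a minimal sketch that groups evenly spaced points by their position within the period before computing mean and std. The helper `seasonal_baseline` and its `points_per_period` argument are hypothetical, and mapping `PT0S` to a single pooled group is an assumption about what "no seasonality" means.

```python
from collections import defaultdict
from statistics import mean, stdev

def seasonal_baseline(values, points_per_period):
    """Map each slot within the period to (mean, std) of the values seen at that slot."""
    buckets = defaultdict(list)
    for index, value in enumerate(values):
        buckets[index % points_per_period].append(value)
    return {slot: (mean(vs), stdev(vs)) for slot, vs in buckets.items()}

# Three weeks of hypothetical daily values, Monday first; weekends are much lower.
values = [
    100, 110, 120, 130, 140, 50, 60,
    102, 112, 122, 132, 142, 52, 62,
    98, 108, 118, 128, 138, 48, 58,
]
weekly = seasonal_baseline(values, points_per_period=7)  # like P7D on daily data
pooled = seasonal_baseline(values, points_per_period=1)  # like PT0S: one group
print(weekly[5])  # Saturday baseline, estimated from Saturdays only
print(pooled[0])  # single baseline over all points, with a much wider std
```

With the weekly grouping, a Saturday value is compared against previous Saturdays only, so the regular weekend dip does not inflate the std or trigger anomalies by itself.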

FILTER

Time of week

name	description	default value
daysOfWeek	Used to ignore anomalies that happen at specific time periods. A list of days. Anomalies happening on these days are ignored if timeOfWeekIgnore is true. Example: `["MONDAY", "SUNDAY"]`.	[]
hoursOfDay	Used to ignore anomalies that happen at specific time periods. A list of hours. Anomalies happening on these hours are ignored. Example: `[0,1,2,23]`	[]
dayHoursOfWeek	Used to ignore anomalies that happen at specific time periods. A mapping of `{DAY: [hours]}`. Anomalies happening on these timeframes are ignored if timeOfWeekIgnore is true. Example: `{"FRIDAY": [22, 23], "SATURDAY": [0, 1, 2]}`	{}
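
The three time-of-week properties can be read as a single predicate on the anomaly timestamp. The sketch below is an assumption about how they combine (any match ignores the anomaly); it treats timeOfWeekIgnore as true and leaves out timezone handling. The function name `is_ignored` is hypothetical.

```python
from datetime import datetime

DAY_NAMES = ["MONDAY", "TUESDAY", "WEDNESDAY", "THURSDAY",
             "FRIDAY", "SATURDAY", "SUNDAY"]

def is_ignored(ts, days_of_week=(), hours_of_day=(), day_hours_of_week=None):
    """Return True when the timestamp matches any configured time-of-week rule."""
    day = DAY_NAMES[ts.weekday()]  # datetime.weekday(): Monday is 0
    if day in days_of_week:
        return True
    if ts.hour in hours_of_day:
        return True
    return ts.hour in (day_hours_of_week or {}).get(day, [])

late_friday = datetime(2024, 1, 5, 22)  # 2024-01-05 is a Friday
rules = {"FRIDAY": [22, 23], "SATURDAY": [0, 1, 2]}
print(is_ignored(late_friday, day_hours_of_week=rules))
print(is_ignored(datetime(2024, 1, 5, 21), day_hours_of_week=rules))
```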

Threshold

name	description	default value
thresholdFilterMin	Used to ignore anomalies that don't meet the thresholdFilter min and max. Example: set `thresholdFilterMin = 10` to ignore anomalies when the metric is smaller than 10. Can help ignore anomalies happening in low data regimes. Filter threshold minimum. If `-1`, no minimum threshold is applied.	-1
thresholdFilterMax	Used to ignore anomalies that don't meet the thresholdFilter min and max. Example: set `thresholdFilterMin = 10` to ignore anomalies when the metric is smaller than 10. Can help ignore anomalies happening in low data regimes. Filter threshold maximum. If `-1`, no maximum threshold is applied.	-1

Guardrail metric

name	description	default value
guardrailMetricMin	Used to ignore anomalies that don't meet the guardrail threshold. Minimum threshold of the guardrail metric. If `-1`, no minimum threshold is applied.	-1
guardrailMetricMax	Used to ignore anomalies that don't meet the guardrail threshold. Maximum threshold of guardrailMetric. If `-1`, no maximum threshold is applied.	-1
guardrailMetric	Used to ignore anomalies that don't meet the guardrail threshold. Metric to use as a threshold guardrail. Example: use `COUNT(*)` and set `guardrailMetricMin = 100` to ignore anomalies detected when there are fewer than 100 observations in the period.	COUNT(*)
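
The threshold and guardrail filters follow the same convention: a bound of `-1` is disabled, and an anomaly is ignored when the checked value falls outside an enabled bound. A minimal sketch, assuming inclusive bounds and hypothetical function names:

```python
def within_bounds(value, minimum=-1, maximum=-1):
    """True when `value` respects the configured bounds; -1 disables a bound."""
    if minimum != -1 and value < minimum:
        return False
    if maximum != -1 and value > maximum:
        return False
    return True

def keep_anomaly(metric_value, guardrail_value,
                 threshold_min=-1, threshold_max=-1,
                 guardrail_min=-1, guardrail_max=-1):
    """Keep the anomaly only if both the metric and the guardrail metric pass."""
    return (within_bounds(metric_value, threshold_min, threshold_max)
            and within_bounds(guardrail_value, guardrail_min, guardrail_max))

# Metric below thresholdFilterMin = 10: ignored.
print(keep_anomaly(5, 500, threshold_min=10))
# Only 80 observations with guardrailMetricMin = 100: ignored.
print(keep_anomaly(50, 80, guardrail_min=100))
# Both checks pass: kept.
print(keep_anomaly(50, 500, threshold_min=10, guardrail_min=100))
```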

Special events

name	description	default value
eventFilterSqlFilter	Used to ignore anomalies that happen during events. SQL filter to apply on the events.
eventFilterLookaround	Used to ignore anomalies that happen during events. Offset to apply on startTime and endTime to look around the timeframe. In ISO-8601 format. Example: `P1D`.	P2D
eventFilterTypes	Used to ignore anomalies that happen during events. List of event types to fetch by. Example: `["HOLIDAY", "DEPLOYMENT"]`. `[]` fetches all events. Use `["__NO_EVENTS"]` to disable.	['__NO_EVENTS']
eventFilterBeforeEventMargin	Used to ignore anomalies that happen during events. A period in ISO-8601 format that is also considered impacted by the event, before it starts. Example: if beforeEventMargin is `P1D` and the event happens on `[Dec 24 0:00, Dec 25 0:00[`, the label will be applied to anomalies happening on `[Dec 23 0:00, Dec 25 0:00[`	P0D
eventFilterAfterEventMargin	Used to ignore anomalies that happen during events. Same as eventFilterBeforeEventMargin, but applied after the end of the event.	P0D
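
The before and after margins simply widen the event interval. The sketch below reproduces the Dec 24 example from the table; the year, the function names, and the half-open overlap test are assumptions made for illustration.

```python
from datetime import datetime, timedelta

def impacted_window(event_start, event_end,
                    before_margin=timedelta(0), after_margin=timedelta(0)):
    """Widen the half-open event interval [start, end[ by the two margins."""
    return event_start - before_margin, event_end + after_margin

def overlaps(anomaly_start, anomaly_end, window):
    """True when the half-open anomaly interval intersects the window."""
    start, end = window
    return anomaly_start < end and start < anomaly_end

# Event on [Dec 24 0:00, Dec 25 0:00[ with a before margin of P1D.
window = impacted_window(datetime(2023, 12, 24), datetime(2023, 12, 25),
                         before_margin=timedelta(days=1))
print(window)  # the window now starts on Dec 23 0:00
print(overlaps(datetime(2023, 12, 23, 6), datetime(2023, 12, 23, 7), window))
```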

POSTPROCESS

Data mutability

name	description	default value
mutabilityPeriod	Use if your data is mutable. ThirdEye will keep the detection results up to date over the mutable period. For instance, if your last 10 days of data are mutable, set `P10D`. At each cron detection job, the detection results for the last 10 days will be updated.	P0D
reNotifyPercentageThreshold	For detection replay when data is mutable. If the percentage difference between an existing anomaly and a new anomaly on the same time frame is above this threshold, renotify. Combined with `reNotifyAbsoluteThreshold`: both thresholds must pass for the anomaly to be renotified. If zero, always renotify. If null or negative, never renotify.	-1
reNotifyAbsoluteThreshold	For detection replay when data is mutable. If the absolute difference between an existing anomaly and a new anomaly on the same time frame is above this threshold, renotify. Combined with `reNotifyPercentageThreshold`: both thresholds must pass for the anomaly to be renotified. If zero, always renotify. If null or negative, never renotify.	-1
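
One possible reading of the two renotification thresholds is sketched below. The table does not say which anomaly values are compared or how the percentage is computed, so comparing a single value per anomaly and dividing by the existing value are assumptions, as are the function names.

```python
def threshold_passes(difference, threshold):
    """Apply one threshold: null/negative never passes, zero always passes."""
    if threshold is None or threshold < 0:
        return False
    if threshold == 0:
        return True
    return difference > threshold

def should_renotify(old_value, new_value, pct_threshold=-1, abs_threshold=-1):
    """Renotify only when both the percentage and absolute thresholds pass."""
    abs_diff = abs(new_value - old_value)
    # Assumption: percentage is relative to the existing anomaly's value.
    pct_diff = abs_diff / abs(old_value) * 100 if old_value != 0 else float("inf")
    return (threshold_passes(pct_diff, pct_threshold)
            and threshold_passes(abs_diff, abs_threshold))

print(should_renotify(100, 120))                                     # defaults: never
print(should_renotify(100, 120, pct_threshold=10, abs_threshold=5))  # both pass
print(should_renotify(100, 120, pct_threshold=10, abs_threshold=50)) # absolute fails
```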

Anomaly merger

name	description	default value
mergeMaxGap	Maximum gap between 2 anomalies for anomalies to be merged. In ISO-8601 format. Example: `PT2H`. The default behavior is to merge consecutive anomalies only. To disable anomaly merging entirely, set this value to `P0D`.
mergeMaxDuration	Maximum duration of an anomaly merger. At merge time, if an anomaly merger would get bigger than this limit, the anomalies are not merged. In ISO-8601 format. Example: `P7D`.
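
The interaction of the two merger limits can be sketched as a single pass over time-ordered anomalies. This is an illustration under assumptions: anomalies are plain `(start, end)` tuples, a gap equal to the limit still merges, and a zero gap is treated as "merging disabled" to match the `P0D` note above.

```python
from datetime import datetime, timedelta

def merge_anomalies(anomalies, max_gap, max_duration):
    """Merge (start, end) anomalies whose gap and merged length stay within limits."""
    ordered = sorted(anomalies)
    if max_gap == timedelta(0):  # P0D: merging disabled entirely
        return ordered
    merged = []
    for start, end in ordered:
        if merged:
            prev_start, prev_end = merged[-1]
            new_end = max(prev_end, end)
            if start - prev_end <= max_gap and new_end - prev_start <= max_duration:
                merged[-1] = (prev_start, new_end)
                continue
        merged.append((start, end))
    return merged

t0 = datetime(2024, 1, 1)
hour = timedelta(hours=1)
anomalies = [(t0, t0 + hour), (t0 + 2 * hour, t0 + 3 * hour),
             (t0 + 10 * hour, t0 + 11 * hour)]
# mergeMaxGap = PT2H, mergeMaxDuration = P7D: the first two anomalies merge.
result = merge_anomalies(anomalies, max_gap=2 * hour, max_duration=timedelta(days=7))
print(len(result))
```

Lowering `max_duration` below the length of the would-be merged anomaly keeps the anomalies separate even when the gap is small enough.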

RCA

name	description	default value
rcaAggregationFunction	The aggregation function to use for RCA. If the detection metric name is known to ThirdEye, this parameter is optional.
rcaIncludedDimensions	List of the dimensions (columns in the dataset) to use in RCA drill-downs. If not set or empty, all dimensions of the table are used.	[]
rcaExcludedDimensions	List of dimensions (columns in the dataset) to ignore in RCA drill-downs. If not set or empty, all dimensions of the table are used. rcaExcludedDimensions and rcaIncludedDimensions cannot be used at the same time.	[]
rcaEventTypes	A list of event types to filter on for RCA. Only events that match these types will be shown in the RCA related events tab.	[]
rcaEventSqlFilter	A SQL filter for RCA events. Only events that match the filter will be shown in the RCA related events tab.

DIMENSION_EXPLORATION

name	description	default value
enumerationItems	Array of enumerations. The detection pipeline will run for each enumeration. The format is the following: `[ { "name": "US country", "description": "slice for US only", "params": { "queryFilters": " AND country='US'" } }, ... ]`, where `...` stands for the other enumerations. To make a property configurable for each enumeration, set it to the special value `[DOLLAR]{myProperty}` (replace [DOLLAR] with the dollar character). In the example above, the `queryFilters` property must be set to `[DOLLAR]{queryFilters}`.	-
enumerationItemIdKeys	List of keys to use to identify the enumeration. The format is the following: `[ "queryFilters" ]` The keys must be present in the `params` object of each enumeration. The keys will be used to generate the dimension exploration id. The id will be used to identify the enumeration in the detection pipeline.	['queryFilters']
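
To illustrate how one template yields one pipeline run per enumeration, the sketch below substitutes each item's `params` into properties that use the `${...}` placeholder. It relies on Python's `string.Template`, which happens to use the same placeholder syntax; it is not ThirdEye's templating engine. The `orders` dataset, the second enumeration item, and the helpers `resolve` and `enumeration_id` are hypothetical, and the real id generation is not specified here.

```python
from string import Template

# Template properties; queryFilters is left configurable per enumeration.
properties = {"dataset": "orders", "queryFilters": "${queryFilters}"}

enumeration_items = [
    {"name": "US country", "description": "slice for US only",
     "params": {"queryFilters": " AND country='US'"}},
    {"name": "FR country", "description": "slice for FR only",
     "params": {"queryFilters": " AND country='FR'"}},
]
id_keys = ["queryFilters"]  # mirrors the enumerationItemIdKeys default

def resolve(props, params):
    """Fill ${name} placeholders in string properties from an item's params."""
    return {key: Template(value).safe_substitute(params) if isinstance(value, str) else value
            for key, value in props.items()}

def enumeration_id(item, keys):
    """Hypothetical identifier built from the selected params keys."""
    return tuple((key, item["params"][key]) for key in keys)

for item in enumeration_items:
    print(item["name"], resolve(properties, item["params"]), enumeration_id(item, id_keys))
```

Because the id is derived from the keys in `id_keys`, every key listed there has to be present in the `params` object of each enumeration, as the table states.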
