Mapping Functions

Data Ingestion – Mapping Functions

Data Ingestion mapping functions allows users to transform their source data column values before they are ingested into Pinot.

Notes

  1. Mapping functions are Scalar in nature, in the sense that, they operate on one or more values and generate a single output value.
  2. Column names in the mapping functions are not case-sensitive.
  3. While applying the Transformation Function, the column name should be different in Original Data and Pinot Data.

Math Functions

The following is an example of the Math add function.

Input Data

student_namesubjectscores*
JaneMaths4
JoePhysics3

The add mapping function is applied as follows.

add(scores, 10)

Output Data

student_namesubjectnew_scores*
JaneMaths14
JoePhysics13
💡

The following will not work as the add function works on two operands.

1. add(columnNameA, columnNameB, columnNameC)
2. add(columnNameA, 10, 20)

String Functions

The following is an example of the String upper function.

Input Data

student_name*subjectscores
JaneMaths4
JoePhysics3

The upper mapping function is applied as follows.

UPPER(student_name)

Output Data

student_name*subjectscores
JANEMaths4
JOEPhysics3

DateTime Functions

toDateTime

Converts from milliseconds to a formatted date-time string, based on the provided Joda-Time pattern. (opens in a new tab)

The following is an example of the toDateTime DateTime function.

Input Data

movie_namerelease_date*
MyMovie11118966400000
MyMovie21216339200000

The toDateTime mapping function is applied as follows.

toDateTime(release_date, 'yyyy-MM-dd''T''HH:mm:ss.SSS''Z''')

Output Data

movie_namerelease_date*
MyMovie12005-06-17T00:00:00.000Z
MyMovie22008-07-18T00:00:00.000Z
💡

If the time column in source data is named as timestamp then it needs to be quoted while applying the mapping function. Example: toDateTime("timestamp", 'yyyy-MM-dd')

fromDateTime

Converts a formatted date-time string to milliseconds, based on the provided Joda-Time pattern (opens in a new tab).

The following is an example of the fromDateTime DateTime function.

Input Data

movie_namerelease_date*
MyMovie12005-06-17T00:00:00.000Z
MyMovie22008-07-18T00:00:00.000Z

The fromDateTime mapping function is applied as follows.

fromDateTime(release_date, 'yyyy-MM-dd''T''HH:mm:ss.SSS''Z''')

Output Data

movie_namerelease_date*
MyMovie11118966400000
MyMovie21216339200000

DATETIMECONVERT

Converts value from one unit into another. Following conversions are possible.

FromTo
EpochEpoch
EpochISO
ISOISO
ISOEpoch

Examples using the DATETIMECONVERT function.

ColumnNameINPUT_VALUEOUTPUT_VALUEMAPPING_FUNCTION
timeFieldA"2024-03-07T00:23:06.407974"1709770986407DATETIMECONVERT("timeFieldA", 'SIMPLE_DATE_FORMAT|yyyy-MM-dd''T''HH:mm:ss.SSSSSS', 'EPOCH|MILLISECONDS|1', 'MILLISECONDS|1')
timeFieldB1709770986407"2024-03-07 00:23:06Z"DATETIMECONVERT("timeFieldB", 'EPOCH|MILLISECONDS|1', 'SIMPLE_DATE_FORMAT|yyyy-MM-dd HH:mm:ss''Z''', 'MILLISECONDS|1')
timeFieldC1709770986"2024-03-07 00:23:06Z"DATETIMECONVERT("timeFieldC", 'EPOCH|SECONDS|1', 'SIMPLE_DATE_FORMAT|yyyy-MM-dd HH:mm:ss''Z''', 'MILLISECONDS|1')
timeFieldD"2024-03-07"1709769600000DATETIMECONVERT("timeFieldD", 'SIMPLE_DATE_FORMAT|yyyy-MM-dd', 'EPOCH|MILLISECONDS|1', 'MILLISECONDS|1')
timeFieldE"2024-03-07T00:23:06"1709770986000DATETIMECONVERT("timeFieldE", 'SIMPLE_DATE_FORMAT|yyyy-MM-dd''T''HH:mm:ss', 'EPOCH|MILLISECONDS|1', 'MILLISECONDS|1')
timeFieldF"2024-03-07T00:23:06.123Z"1709770986000DATETIMECONVERT("timeFieldF", 'SIMPLE_DATE_FORMAT|yyyy-MM-dd''T''HH:mm:ss.SSS''Z''', 'EPOCH|MILLISECONDS|1', 'MILLISECONDS|1')
timeFieldG"2024-03-07 00:23:06.123Z"1709770986000DATETIMECONVERT("timeFieldG", 'SIMPLE_DATE_FORMAT|yyyy-MM-dd HH:mm:ss.SSS''Z''', 'EPOCH|MILLISECONDS|1', 'MILLISECONDS|1')
timeFieldH"2024-03-07 00:23:06.123 Z"1709770986000DATETIMECONVERT("timeFieldH", 'SIMPLE_DATE_FORMAT|yyyy-MM-dd HH:mm:ss.SSS ''Z''', 'EPOCH|MILLISECONDS|1', 'MILLISECONDS|1')

Json Functions

jsonPath

Extracts the object value from jsonField based on 'jsonPath', the result type is inferred based on JSON value.

Input Data

student_nameresults*
Jane{"results":{"subjects":[{"name":"maths","score":90},{"name":"english","score":70}]}}
Joe{"results":{"subjects":[{"name":"maths","score":88},{"name":"english","score":91}]}}

Output Data

student_namemath_score*
Jane90
Joe88

The jsonPath mapping function is applied as follows.

jsonPath(results, '$.results.subjects[0].score')

Binary Functions

toBase64

Input Data

resource_nameresource_path*
base64_docshttps://docs.oracle.com/javase/8/docs/api/java/util/Base64.Encoder.html (opens in a new tab)

Output Data

resource_nameresource_path*
base64_docsaHR0cHM6Ly9kb2NzLm9yYWNsZS5jb20vamF2YXNlLzgvZG9jcy9hcGkvamF2YS91dGlsL0Jhc2U2NC5FbmNvZGVyLmh0bWwK

The toBase64 mapping function is applied as follows.

toBase64(toUTF8(resource_path))

fromBase64

Input Data

resource_nameencoded_resource_path*
base64_docsaHR0cHM6Ly9kb2NzLm9yYWNsZS5jb20vamF2YXNlLzgvZG9jcy9hcGkvamF2YS91dGlsL0Jhc2U2NC5FbmNvZGVyLmh0bWwK

Output Data

resource_nameresource_path*
base64_docshttps://docs.oracle.com/javase/8/docs/api/java/util/Base64.Encoder.html (opens in a new tab)

The fromBase64 mapping function is applied as follows.

fromBytes(fromBase64(encoded_resource_path), 'UTF8')

Tips and Tricks

💡

tip How to check if a given function is supported?

Enter the function within the Mapping Function textbox and click on Test Transformation.

myCoolFunction()

If the function is not supported an error message is returned back.

Unsupported function: mycoolfunction not found