Ingesting JSON files

Learn how to ingest JSON documents from a newline-delimited JSON (JSON Lines) file. Watch the following video, or complete the tutorial below, starting with Prerequisites.
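For reference, a newline-delimited JSON file holds one complete JSON object per line, with no enclosing array and no commas between records. The sketch below writes a small sample file and counts its records. The temporary file is hypothetical, and the field names and values are taken from the query output at the end of this guide; the real data/ingest.json is only assumed to follow the same shape.

```shell
# Write two records, one JSON object per line (hypothetical sample file,
# not the recipe's actual data/ingest.json).
SAMPLE_FILE="$(mktemp)"
cat > "$SAMPLE_FILE" <<'EOF'
{"id": 300441473147483650, "title": "Dear John", "genre": "Drama", "year": 2010}
{"id": 332567813147483648, "title": "The Ugly Truth", "genre": "Comedy", "year": 2009}
EOF

# Each line is one record, so the line count equals the record count.
RECORD_COUNT="$(wc -l < "$SAMPLE_FILE" | tr -d ' ')"
echo "records: $RECORD_COUNT"
```

Because every line stands alone, a reader can process the file one record at a time instead of loading a single large JSON array.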

Prerequisites

To follow the code examples in this guide, you must install Docker locally and download the recipes.

Clone this repository and navigate to this recipe:

git clone git@github.com:startreedata/pinot-recipes.git
cd pinot-recipes/recipes/ingest-json-files

Run the recipe

Spin up a Pinot cluster using Docker Compose:

docker compose up
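The cluster can take a little while to start, and the commands that follow need the controller to be reachable. A minimal sketch of a wait loop is shown below. It assumes the controller is published on localhost:9000 (the port used by the query console URL later in this guide) and that it answers on a /health endpoint; `wait_for_controller` is a hypothetical helper, not part of the recipe.

```shell
# Assumption: the controller is published on localhost:9000.
CONTROLLER_URL="http://localhost:9000"

# Poll the controller health endpoint until it answers, or give up after
# the given number of attempts (hypothetical helper, default 30 tries).
wait_for_controller() {
  attempts="${1:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if curl -sf "$CONTROLLER_URL/health" > /dev/null 2>&1; then
      echo "controller is up"
      return 0
    fi
    i=$((i + 1))
    sleep 2
  done
  echo "controller did not respond" >&2
  return 1
}
```

Calling `wait_for_controller` in the second terminal before adding the table avoids running the next command against a cluster that is still starting.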

Open another terminal tab and add the movies table:

docker run \
   --network json \
   -v $PWD/config:/config \
   apachepinot/pinot:1.0.0 AddTable \
     -tableConfigFile /config/table.json \
     -schemaFile /config/schema.json \
     -controllerHost "pinot-controller-json" \
     -exec

Import data/ingest.json into Pinot:

docker run \
   --network json \
   -v $PWD/config:/config \
   -v $PWD/data:/data \
   apachepinot/pinot:1.0.0 LaunchDataIngestionJob \
     -jobSpecFile /config/job-spec.yml

Navigate to http://localhost:9000/#/query and run the following query:

select * 
from movies 
limit 10

You will see the following output:

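| genre   | id                 | title                               | year |
|---------|--------------------|-------------------------------------|------|
| Drama   | 300441473147483650 | Dear John                           | 2010 |
| Comedy  | 332567813147483648 | The Ugly Truth                      | 2009 |
| Romance | 346905752147483649 | P.S. I Love You                     | 2007 |
| Comedy  | 361248901147483647 | Valentine's Day                     | 2010 |
| Fantasy | 394030854147483651 | The Curious Case of Benjamin Button | 2008 |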
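The same query can also be sent from the command line instead of the query console. The sketch below assumes the controller from this recipe is reachable on localhost:9000 and accepts SQL as a JSON body at POST /sql; `run_query` is a hypothetical helper name.

```shell
# Assumption: the controller is reachable on localhost:9000 and accepts
# SQL queries as JSON at POST /sql.
CONTROLLER_URL="http://localhost:9000"
QUERY="select * from movies limit 10"

# Build the JSON request body that carries the SQL statement.
PAYLOAD="{\"sql\": \"$QUERY\"}"

# Hypothetical helper: send the query and print the raw JSON response.
run_query() {
  curl -s -X POST "$CONTROLLER_URL/sql" \
    -H "Content-Type: application/json" \
    -d "$PAYLOAD"
}

echo "$PAYLOAD"
```

Once the cluster is running and the data has been ingested, calling `run_query` should return the rows as JSON rather than as a rendered table.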