Ingesting JSON files

Learn how to ingest JSON documents from a newline-delimited JSON (JSON Lines) file. Watch the following video, or complete the tutorial below, starting with Prerequisites.

Pinot Version	1.0.0
Code	startreedata/pinot-recipes/ingest-json-files

Prerequisites

To follow the code examples in this guide, you must install Docker locally and download the recipes repository.

Clone this repository and navigate to this recipe:

git clone git@github.com:startreedata/pinot-recipes.git
cd pinot-recipes/recipes/ingest-json-files

Run the recipe

Spin up a Pinot cluster using Docker Compose:

docker compose up

Open another terminal tab and add the movies table:

docker run \
   --network json \
   -v $PWD/config:/config \
   apachepinot/pinot:1.0.0 AddTable \
     -tableConfigFile /config/table.json \
     -schemaFile /config/schema.json \
     -controllerHost "pinot-controller-json" \
     -exec
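To confirm that the table was created before ingesting data, you can ask the controller for its list of tables. The sketch below assumes the controller is reachable on localhost:9000 and that its /tables endpoint returns a JSON object containing the table names; has_table is a hypothetical helper name, not part of Pinot.

```shell
# Hypothetical helper: reads a controller /tables response on stdin and
# succeeds if the given table name appears as a quoted string in it.
has_table() {
  grep -q "\"$1\""
}
```

With the cluster running, `curl -s http://localhost:9000/tables | has_table movies && echo "movies table found"` should print the message once the table exists.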

Import data/ingest.json into Pinot:

docker run \
   --network json \
   -v $PWD/config:/config \
   -v $PWD/data:/data \
   apachepinot/pinot:1.0.0 LaunchDataIngestionJob \
     -jobSpecFile /config/job-spec.yml
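The job reads a newline-delimited JSON file, in which every line is one complete JSON document. The snippet below is an illustration only: it writes a hypothetical two-record sample to a temporary file, assuming the field names match the movies table columns (genre, id, title, year). The actual contents of data/ingest.json in the recipe may differ.

```shell
# Write a hypothetical two-record sample to a temporary file.
# This is not the recipe's real data file; field names are assumed.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
{"id": 300441473147483650, "title": "Dear John", "genre": "Drama", "year": 2010}
{"id": 332567813147483648, "title": "The Ugly Truth", "genre": "Comedy", "year": 2009}
EOF

# Each line is one document, so the line count equals the record count.
record_count="$(wc -l < "$sample" | tr -d ' ')"
echo "records: $record_count"
```

Because there is no enclosing array and no commas between records, such a file can be appended to or split by line without re-parsing the whole document.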

Navigate to the Pinot query console at http://localhost:9000/#/query and run the following query:

select * 
from movies 
limit 10

You will see the following output:

genre	id	title	year
Drama	300441473147483650	Dear John	2010
Comedy	332567813147483648	The Ugly Truth	2009
Romance	346905752147483649	P.S. I Love You	2007
Comedy	361248901147483647	Valentine's Day	2010
Fantasy	394030854147483651	The Curious Case of Benjamin Button	2008
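The same query can also be sent from the command line instead of the query console. This is a minimal sketch, assuming the controller is reachable on localhost:9000 and accepts SQL via a POST to its /sql endpoint; run_query is a hypothetical helper name.

```shell
# Assumption: the controller started by Docker Compose listens on localhost:9000.
controller_url="http://localhost:9000"
query="select * from movies limit 10"

# Build the JSON request body. This query contains no characters that
# need JSON escaping, so plain string formatting is enough here.
payload="$(printf '{"sql":"%s"}' "$query")"

# Hypothetical helper: POST the query to the controller and print the raw JSON response.
run_query() {
  curl -s -X POST \
    -H "Content-Type: application/json" \
    -d "$payload" \
    "$controller_url/sql"
}
```

Once the cluster is up and the data has been ingested, calling run_query should print a JSON response that includes the rows shown above.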
