Ingesting JSON files
Learn how to ingest JSON documents from a newline-delimited JSON (JSON Lines) file. Watch the following video, or complete the tutorial below, starting with Prerequisites.
Pinot Version | 1.0.0 |
Code | startreedata/pinot-recipes/ingest-json-files |
Prerequisites
To follow the code examples in this guide, you must install Docker locally and download the recipes repository.
Clone this repository and navigate to this recipe:
git clone git@github.com:startreedata/pinot-recipes.git
cd pinot-recipes/recipes/ingest-json-files
Run the recipe
Spin up a Pinot cluster using Docker Compose:
docker compose up
Open another terminal tab and add the movies table:
docker run \
--network json \
-v $PWD/config:/config \
apachepinot/pinot:1.0.0 AddTable \
-tableConfigFile /config/table.json \
-schemaFile /config/schema.json \
-controllerHost "pinot-controller-json" \
-exec
Import data/ingest.json into Pinot:
docker run \
--network json \
-v $PWD/config:/config \
-v $PWD/data:/data \
apachepinot/pinot:1.0.0 LaunchDataIngestionJob \
-jobSpecFile /config/job-spec.yml
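The job reads data/ingest.json, which is newline-delimited JSON: each line holds one complete JSON object. The following is a minimal sketch of that layout, not the recipe's actual data file. It assumes the field names mirror the movies table columns (genre, id, title, year) and writes two illustrative records to a temporary file.

```shell
# Write two illustrative records to a scratch file, one JSON object per line.
# Assumption: field names mirror the movies table columns; this is not the recipe's real data file.
SAMPLE_FILE="$(mktemp)"
cat > "$SAMPLE_FILE" <<'EOF'
{"id": 300441473147483650, "title": "Dear John", "genre": "Drama", "year": 2010}
{"id": 332567813147483648, "title": "The Ugly Truth", "genre": "Comedy", "year": 2009}
EOF

# Each record occupies exactly one line, so the line count equals the record count.
RECORD_COUNT="$(wc -l < "$SAMPLE_FILE" | tr -d ' ')"
echo "records: $RECORD_COUNT"
```

Because records are separated only by newlines, a record must not contain a raw line break, and the file as a whole is not a single JSON array.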
Navigate to http://localhost:9000/#/query and run the following query:
select *
from movies
limit 10
You should see output similar to the following:
genre | id | title | year |
---|---|---|---|
Drama | 300441473147483650 | Dear John | 2010 |
Comedy | 332567813147483648 | The Ugly Truth | 2009 |
Romance | 346905752147483649 | P.S. I Love You | 2007 |
Comedy | 361248901147483647 | Valentine's Day | 2010 |
Fantasy | 394030854147483651 | The Curious Case of Benjamin Button | 2008 |
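The same query can also be sent from the command line instead of the query console. The sketch below posts it to the Pinot broker's SQL endpoint (/query/sql). It assumes that the recipe's Docker Compose file publishes the broker on localhost:8099, which this guide does not confirm; the BROKER_URL variable and the build_payload and run_query helpers are hypothetical names introduced for illustration.

```shell
# Assumption: the broker is reachable on localhost:8099; override BROKER_URL if it is not.
BROKER_URL="${BROKER_URL:-http://localhost:8099}"

# Build the JSON request body for a SQL query.
# Note: this simple helper does not escape double quotes inside the query text.
build_payload() {
  printf '{"sql": "%s"}' "$1"
}

# POST the query to the broker's SQL endpoint and print the raw JSON response.
run_query() {
  curl -s -X POST \
    -H "Content-Type: application/json" \
    -d "$(build_payload "$1")" \
    "$BROKER_URL/query/sql"
}
```

With the cluster running, calling run_query "select * from movies limit 10" should return the rows as JSON rather than as a rendered table.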