JQ Tutorial

In this tutorial, I am going to show you how to use basic jq commands to manipulate JSON data and files. I am not an expert in this tool, so if you spot any mistakes, please reach out to me.

What is JQ?

jq is a powerful, lightweight, open-source command-line JSON processor. You can do many things with JSON files using this tool, and we will explore some of these capabilities.

Basic JSON object commands

How to generate a basic JSON file?

Let's assume we want to generate a file called omar.json that contains this data

{
  "first_name": "Omar",
  "last_name": "Qunsul",
  "timestamp": "Sat, 02 Jan 2021 00:06:37 +0100"
}

Using jq, we can generate JSON from scratch (the -n flag tells jq not to read any input) and save it directly to the file omar.json

export TIMESTAMP=$(date -R)
jq -n --arg first_name "Omar" --arg last_name "Qunsul" '{first_name: $first_name, last_name: $last_name, timestamp: env.TIMESTAMP}' > omar.json
We can notice two things in this command

  1. We can pass arguments to the jq command with --arg, and read them inside the filter using the $ sign
  2. jq can read environment variables as well, as you can see here with env.TIMESTAMP
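
To make these two points concrete, here is a small sketch. It pins TIMESTAMP to a fixed string so the result is reproducible, and it adds a hypothetical id field (not part of the example above) to show that --arg always passes a string, while --argjson passes a parsed JSON value such as a number.

```shell
# Fixed value instead of $(date -R), so the output is reproducible.
export TIMESTAMP="Sat, 02 Jan 2021 00:06:37 +0100"

# --arg binds a string; --argjson binds a parsed JSON value.
# "id" is a hypothetical field, added only for illustration.
result=$(jq -n -c --arg first_name "Omar" --argjson id 7 \
  '{first_name: $first_name, id: $id, timestamp: env.TIMESTAMP}')

echo "$result"
# {"first_name":"Omar","id":7,"timestamp":"Sat, 02 Jan 2021 00:06:37 +0100"}
```

If the id had been passed with --arg instead, it would show up in the output as the string "7" rather than the number 7.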

With this, we can now read the JSON file using the command

jq . omar.json
Output:
{
  "first_name": "Omar",
  "last_name": "Qunsul",
  "timestamp": "Sat, 02 Jan 2021 00:06:37 +0100"
}

Reading partial attributes

To read only a subset of the attributes, you can select the ones you want with a command like this

jq '{first_name, timestamp}' omar.json

Output:

{
  "first_name": "Omar",
  "timestamp": "Sat, 02 Jan 2021 00:06:37 +0100"
}
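
If you only need a single value rather than an object, a sketch like the following may help. It recreates omar.json in a scratch directory (the workdir variable is just a name I picked for this example) and compares the default output with the -r (--raw-output) flag.

```shell
# Recreate the sample file in a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/omar.json" <<'EOF'
{
  "first_name": "Omar",
  "last_name": "Qunsul",
  "timestamp": "Sat, 02 Jan 2021 00:06:37 +0100"
}
EOF

# By default jq prints valid JSON, so a string keeps its quotes.
quoted=$(jq '.first_name' "$workdir/omar.json")

# -r (--raw-output) prints strings without the surrounding quotes.
raw=$(jq -r '.first_name' "$workdir/omar.json")

echo "$quoted"   # "Omar"
echo "$raw"      # Omar
```

The raw form is usually what you want when the value is passed on to another shell command.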

But that's only part of the story. You can even generate new computed attributes in the output. For example, if you want to generate a full name from the first_name and last_name parts, you can run this command

jq '{full_name: (.first_name + " " + .last_name)}' omar.json

Output:

{
  "full_name": "Omar Qunsul"
}
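
As an alternative to joining strings with +, jq also supports string interpolation with \( ... ). The sketch below assumes a reduced copy of omar.json written to a scratch directory (workdir is a name chosen for this example).

```shell
# Minimal copy of the sample object in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' '{"first_name": "Omar", "last_name": "Qunsul"}' > "$workdir/omar.json"

# \( ... ) inside a jq string evaluates the expression and inserts its result.
full_name=$(jq -r '"\(.first_name) \(.last_name)"' "$workdir/omar.json")

echo "$full_name"   # Omar Qunsul
```

Interpolation also converts non-string values such as numbers to text, whereas + raises an error when you try to add a string and a number.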

Dealing with arrays

Generating a new array

Now let's switch to a more interesting problem to tackle with jq: dealing with arrays. In the previous section, we were dealing with a single object that contained one employee's data. Let's generate an array of employees, starting with an array that contains only the previous employee's data. You can do this by running this command

jq '[.]' omar.json > employees.json

Now let's output this generated file using the command

jq . employees.json

Output:

[
  {
    "first_name": "Omar",
    "last_name": "Qunsul",
    "timestamp": "Sat, 02 Jan 2021 00:06:37 +0100"
  }
]

What we did with the [.] part is wrap the whole object from omar.json in an array, and then write the output to a new file, employees.json.

Now by running this command, we can print the number of elements contained in the file employees.json

jq '. | length' employees.json

This prints 1, since the array holds a single employee so far.
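
The length function is not limited to arrays. Here is a short sketch using literal values passed with -n, so no input file is needed; the values themselves are made up for illustration.

```shell
# On an array, length is the number of elements.
array_len=$(jq -n '[10, 20, 30] | length')

# On an object, it is the number of keys.
object_len=$(jq -n '{"a": 1, "b": 2} | length')

# On a string, it is the number of characters.
string_len=$(jq -n '"Omar" | length')

echo "$array_len $object_len $string_len"   # 3 2 4
```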

Adding a new element to an array

Now let's add another employee to the list, using this command

export TIMESTAMP=$(date -R)
jq --arg first_name "John" --arg last_name "Something" '. + [{first_name: $first_name, last_name: $last_name, timestamp: env.TIMESTAMP}]' employees.json | sponge employees.json

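Note: We used the tool sponge (part of the moreutils package) to make sure the whole input is read before anything is written to the output file employees.json, which is also the input file. Redirecting with > employees.json directly would empty the file before jq gets a chance to read it.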
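
If sponge is not installed on your machine, a temporary file achieves the same effect. This is only a sketch of one common approach, using a reduced employees.json in a scratch directory (workdir and tmp are names chosen for this example).

```shell
# Reduced sample data in a scratch directory.
workdir=$(mktemp -d)
echo '[{"first_name": "Omar"}]' > "$workdir/employees.json"

# Write the result to a temporary file first, then replace the original
# only if jq succeeded. This way jq never reads a file that the shell
# has already truncated.
tmp=$(mktemp)
jq '. + [{first_name: "John"}]' "$workdir/employees.json" > "$tmp" \
  && mv "$tmp" "$workdir/employees.json"

count=$(jq 'length' "$workdir/employees.json")
echo "$count"   # 2
```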

Let's print the content of the file employees.json now.

jq . employees.json

Output:

[
  {
    "first_name": "Omar",
    "last_name": "Qunsul",
    "timestamp": "Sat, 02 Jan 2021 00:06:37 +0100"
  },
  {
    "first_name": "John",
    "last_name": "Something",
    "timestamp": "Sat, 02 Jan 2021 17:26:16 +0100"
  }
]

Sorting array elements by some attribute

We can also print the elements of the array sorted by some attribute, like first_name, using a command like this

jq '. | sort_by(.first_name)' employees.json

Output:

[
  {
    "first_name": "John",
    "last_name": "Something",
    "timestamp": "Sat, 02 Jan 2021 17:26:16 +0100"
  },
  {
    "first_name": "Omar",
    "last_name": "Qunsul",
    "timestamp": "Sat, 02 Jan 2021 00:06:37 +0100"
  }
]

Keep in mind that this won't affect the source file. It only prints the results without saving them anywhere.
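
A quick way to convince yourself of this is the sketch below. It builds a reduced employees.json in a scratch directory (workdir is a name chosen for this example), sorts it, and then checks the file again.

```shell
# Reduced sample data: Omar comes first in the file.
workdir=$(mktemp -d)
cat > "$workdir/employees.json" <<'EOF'
[
  {"first_name": "Omar", "last_name": "Qunsul"},
  {"first_name": "John", "last_name": "Something"}
]
EOF

# After sorting by first_name, John is the first element of the result.
first_sorted=$(jq -r 'sort_by(.first_name) | .[0].first_name' "$workdir/employees.json")
echo "$first_sorted"    # John

# The file itself was only read, so Omar is still its first element.
first_in_file=$(jq -r '.[0].first_name' "$workdir/employees.json")
echo "$first_in_file"   # Omar
```

To keep the sorted order, the result would have to be written back to the file, for instance with sponge as in the previous section.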

Reading a subset of the objects' attributes

As we did with the single object before, we can also print out the objects with only a subset of their attributes, for example the first name

jq '[.[] | {first_name}]' employees.json

Output:

[
  {
    "first_name": "Omar"
  },
  {
    "first_name": "John"
  }
]

Or, if you want, you can print the first names as an array instead.

jq '[.[] | .first_name]' employees.json

Output:

[
  "Omar",
  "John"
]
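
The [.[] | ...] pattern is common enough that jq has a shorthand for it: map. The sketch below, using a reduced employees.json in a scratch directory (workdir is a name chosen for this example), shows that both forms give the same result.

```shell
# Reduced sample data in a scratch directory.
workdir=$(mktemp -d)
echo '[{"first_name": "Omar"}, {"first_name": "John"}]' > "$workdir/employees.json"

# Explicit form: iterate over the elements and collect the results in an array.
explicit=$(jq -c '[.[] | .first_name]' "$workdir/employees.json")

# map(f) applies f to every element and returns a new array.
mapped=$(jq -c 'map(.first_name)' "$workdir/employees.json")

echo "$explicit"   # ["Omar","John"]
echo "$mapped"     # ["Omar","John"]
```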

Modifying array elements

Now let's continue by modifying the objects in the list, by adding a salary attribute.

jq '. | .[0].salary = 500 | .[1].salary = 850' employees.json | sponge employees.json
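
Addressing elements by index works for two employees, but it depends on their order in the file. As a sketch of an alternative, the salary can be chosen by matching on first_name instead. The reduced data and the workdir variable are assumptions made for this example.

```shell
# Reduced sample data in a scratch directory.
workdir=$(mktemp -d)
echo '[{"first_name": "Omar"}, {"first_name": "John"}]' > "$workdir/employees.json"

# For each element, pick the salary based on the first name
# instead of the position in the array.
updated=$(jq -c 'map(if .first_name == "Omar" then .salary = 500 else .salary = 850 end)' \
  "$workdir/employees.json")

echo "$updated"
# [{"first_name":"Omar","salary":500},{"first_name":"John","salary":850}]
```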

Filtering array elements

Now that we have some data that we can query, let's print out the full names of the employees whose salary is larger than 600

jq '.[] | select(.salary > 600) | {full_name: (.first_name + " " + .last_name)}' employees.json

Output:

{
  "full_name": "John Something"
}
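
The command above prints one object per match. If you would rather get plain text lines, or a single array, the following sketch shows both variants on a copy of the data written to a scratch directory (workdir is a name chosen for this example).

```shell
# Sample data with salaries, in a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/employees.json" <<'EOF'
[
  {"first_name": "Omar", "last_name": "Qunsul", "salary": 500},
  {"first_name": "John", "last_name": "Something", "salary": 850}
]
EOF

# One plain-text line per matching employee.
names=$(jq -r '.[] | select(.salary > 600) | "\(.first_name) \(.last_name)"' \
  "$workdir/employees.json")
echo "$names"      # John Something

# Keep the matches together in one array instead of separate outputs.
as_array=$(jq -c 'map(select(.salary > 600) | .first_name)' "$workdir/employees.json")
echo "$as_array"   # ["John"]
```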

That's cool, right?!


About Me

My name is Omar Qunsul. I write these articles mainly as a future reference for myself, so I dedicate some time to making them look shiny and share them with the public.

You can find me on Twitter @OmarQunsul, and on LinkedIn.

