jq is an incredibly powerful and efficient command-line tool for processing JSON data. It’s often referred to as “the sed or awk for JSON.” This crash course will cover the most essential concepts to get you started.

The Basic Syntax

The basic jq command has a simple structure:

jq '<filter>' <input_file>

Or, more commonly, it’s used in a pipeline:

<some_command_that_outputs_json> | jq '<filter>'

The filter is the core of jq. It’s the “program” that tells jq what to do with the JSON input.


The Identity Filter: .

The simplest and most fundamental filter is the dot, .. It’s the “identity” filter, meaning it takes the input and outputs it with no changes (other than pretty-printing it by default).

  • Example: Pretty-printing a minified JSON file.

    echo '{"name":"Alice","age":30}' | jq '.'
    # Output:
    # {
    #   "name": "Alice",
    #   "age": 30
    # }
    

    This is often the first thing people use jq for: making unreadable JSON readable.


Accessing Object Fields

You can access fields (key-value pairs) in a JSON object using the dot (.) followed by the field name.

  • Example: Getting a single value.

    echo '{"name":"Alice","age":30}' | jq '.name'
    # Output:
    # "Alice"
    
  • Example: Accessing a nested field.

    echo '{"user":{"name":"Bob","id":123}}' | jq '.user.name'
    # Output:
    # "Bob"
    

Working with Arrays

To access elements in a JSON array, you use square brackets ([]).

  • Example: Getting the first element of an array.

    echo '["apple", "banana", "cherry"]' | jq '.[0]'
    # Output:
    # "apple"
    
  • Example: Getting the value of a field in an object that is inside an array.

    echo '[{"name":"Alice"}, {"name":"Bob"}]' | jq '.[1].name'
    # Output:
    # "Bob"
    

Iterating Over Arrays: []

The [] filter, when used without an index, acts as an iterator. It takes an array and outputs each element on a new line. This is one of the most powerful features for filtering and transforming data.

  • Example: Extracting all names from an array of objects.

    echo '[{"name":"Alice"}, {"name":"Bob"}]' | jq '.[] | .name'
    # Output:
    # "Alice"
    # "Bob"
    

    Notice the pipe (|). This is a key operator in jq that “pipes” the output of one filter into the input of the next. Here, the . outputs the entire array, [] then iterates over it, and for each element, .name extracts the name.

Filtering with select()

The select() function is used to filter elements based on a condition. It takes a boolean expression as its argument.

  • Example: Find objects where a value is greater than a number.
    echo '[{"name":"Alice","age":30}, {"name":"Bob","age":45}]' | jq '.[] | select(.age > 40)'
    # Output:
    # {
    #   "name": "Bob",
    #   "age": 45
    # }
    

Creating New Objects and Values

You can construct new JSON objects or arrays on the fly.

  • Example: Creating a new object with specific fields.

    echo '{"name":"Alice","age":30,"city":"London"}' | jq '{username: .name, years_old: .age}'
    # Output:
    # {
    #   "username": "Alice",
    #   "years_old": 30
    # }
    

    This is extremely useful for transforming data into a different format.

  • Example: Creating a new array.

    echo '[{"name":"Alice"}, {"name":"Bob"}]' | jq '[.[] | .name]'
    # Output:
    # [
    #   "Alice",
    #   "Bob"
    # ]
    

    Here, the outer brackets [] tell jq to collect all the individual outputs from the inner filter (.[] | .name) and put them into a single array.


Key Flags and Options

  • -r (raw output): By default, jq outputs JSON values, which includes surrounding quotes for strings. The -r flag gives you the raw string output.

    echo '{"name":"Alice"}' | jq '.name'
    # Output: "Alice"
    
    echo '{"name":"Alice"}' | jq -r '.name'
    # Output: Alice
    
  • -c (compact output): This flag produces minified, single-line JSON output. It’s the opposite of the default pretty-printing.

    echo '{"name":"Alice","age":30}' | jq -c '.'
    # Output: {"name":"Alice","age":30}
    

This crash course covers the most essential tools in the jq toolbox. With these filters and flags, you can handle a vast majority of common JSON manipulation tasks. For more advanced use cases, remember to check the official jq manual for its extensive list of functions and operators.

jq '
  [
    .[] |
    select(.yaml_path) |
    select(.yaml_path | contains("registry")) |
    { rule: .rule_name, path: .yaml_path}
  ] as $filtered_data
  |
  {
    count: ($filtered_data | length),
    results: $filtered_data
  }
' bulk_converted_criteria.json

Explanation:

  1. [ ... ] as $filtered_data: This part builds the array of filtered objects exactly as before, but it stores the entire result in a variable called $filtered_data.

  2. { count: ..., results: ... }: This creates a new JSON object.

  3. ($filtered_data | length): It calculates the length of the stored array and assigns it to the count key.

  4. $filtered_data: It uses the stored array and assigns it to the results key.

  5. Why this additional select? This prevents from crashes when the json has no yaml_path object

I’ll explain the jq command that was used to reorder the fields:

Command Breakdown

jq '.[].criteria.actions |= map({pattern: .pattern, ACTION_UNIQUE_NAME: .ACTION_UNIQUE_NAME, rows: .rows})' updated_criteria.json > updated_criteria_reordered.json

Let me break this down piece by piece:

1. jq - The JSON processor tool

  • jq is a lightweight command-line JSON processor
  • It allows you to slice, filter, map, and transform structured data

2. '.[]' - Array iteration

  • .[] iterates over each element in the top-level array`
  • In your case, it processes each rule object in the JSON array

3. .criteria.actions - Navigate to the actions array

  • This selects the actions array within the criteria object of each rule

4. |= - Update operator

  • The |= operator updates the selected path with the result of the expression on the right
  • It’s equivalent to “modify in place”

5. map(...) - Transform each element

  • map applies a transformation to each element in the array
  • In this case, it transforms each action object

6. {pattern: .pattern, ACTION_UNIQUE_NAME: .ACTION_UNIQUE_NAME, rows: .rows} - Object reconstruction

  • This creates a new object with fields in the desired order
  • .pattern refers to the pattern field from the original object
  • .ACTION_UNIQUE_NAME refers to the ACTION_UNIQUE_NAME field
  • .rows refers to the rows field
  • Key point: The order of fields in this object literal determines the output order

Before vs After:

  • Before: {"ACTION_UNIQUE_NAME": "...", "pattern": "...", "rows": [...]}
  • After: {"pattern": "...", "ACTION_UNIQUE_NAME": "...", "rows": [...]}

The beauty of this command is that it preserves the entire JSON structure and all data - it only changes the order of the three fields within each action object.

concatenation of array values

jq '.[].criteria.actions[].rows[].VALUE |= (if type == "array" then map(if type == "string" then . else tostring end) | join("&comma;") else . end)' ./updated_crieteria_limited.json > ./updated_criteria_limited_transformed.json

concat the json values with , in place

Even if rows is always a JSON array in your data, the error message “Cannot iterate over null (null)” indicates that at least one of the parent fields, criteria or actions[0], is null for some objects in your file. This causes the jq path to break before it even reaches rows.

The error doesn’t mean rows itself is null. It means the expression .criteria.actions[0].rows evaluates to null because one of the intermediate steps, like .criteria, or .actions[0], is null or doesn’t exist.

A More Robust JQ Command

To handle this, a better jq command should check for the existence of the rows array and its parent objects. The ? operator is a concise way to handle null or missing values gracefully.

Bash

jq '.[] | select( (.criteria.actions[0].rows? // []) | any(.condition == "NOT_EXISTS") ) | .name' ./temp.json

Breakdown of the Fix

  • ?: The ? after rows is a “null-propagation” operator. It prevents the command from throwing an error if .criteria or .actions[0] is null. If rows is null or doesn’t exist, the expression to the left of the // operator will evaluate to null.

  • // []: The // operator is a “fallback” or “default” operator. If the expression on the left is null (which it will be if rows doesn’t exist), it substitutes the value with the expression on the right—in this case, an empty array [].

  • any(...): The any() function can then safely be applied to this new, guaranteed-to-be-an-array value. If the array is empty ([]), any() will simply return false, and the object won’t be selected.

This command is the most idiomatic and efficient way to handle potential nulls in nested JSON structures. It’s more concise and robust than multiple type checks.