jq is an incredibly powerful and efficient command-line tool for processing JSON data. It’s often referred to as “the sed or awk for JSON.” This crash course will cover the most essential concepts to get you started.
The Basic Syntax
The basic jq command has a simple structure:
jq '<filter>' <input_file>
Or, more commonly, it’s used in a pipeline:
<some_command_that_outputs_json> | jq '<filter>'
The filter is the core of jq. It’s the “program” that tells jq what to do with the JSON input.
The Identity Filter: .
The simplest and most fundamental filter is the dot, .. It’s the “identity” filter, meaning it takes the input and outputs it with no changes (other than pretty-printing it by default).
-
Example: Pretty-printing a minified JSON file.
echo '{"name":"Alice","age":30}' | jq '.' # Output: # { # "name": "Alice", # "age": 30 # }This is often the first thing people use
jqfor: making unreadable JSON readable.
Accessing Object Fields
You can access fields (key-value pairs) in a JSON object using the dot (.) followed by the field name.
-
Example: Getting a single value.
echo '{"name":"Alice","age":30}' | jq '.name' # Output: # "Alice" -
Example: Accessing a nested field.
echo '{"user":{"name":"Bob","id":123}}' | jq '.user.name' # Output: # "Bob"
Working with Arrays
To access elements in a JSON array, you use square brackets ([]).
-
Example: Getting the first element of an array.
echo '["apple", "banana", "cherry"]' | jq '.[0]' # Output: # "apple" -
Example: Getting the value of a field in an object that is inside an array.
echo '[{"name":"Alice"}, {"name":"Bob"}]' | jq '.[1].name' # Output: # "Bob"
Iterating Over Arrays: []
The [] filter, when used without an index, acts as an iterator. It takes an array and outputs each element on a new line. This is one of the most powerful features for filtering and transforming data.
-
Example: Extracting all names from an array of objects.
echo '[{"name":"Alice"}, {"name":"Bob"}]' | jq '.[] | .name' # Output: # "Alice" # "Bob"Notice the pipe (
|). This is a key operator injqthat “pipes” the output of one filter into the input of the next. Here, the.outputs the entire array,[]then iterates over it, and for each element,.nameextracts the name.
Filtering with select()
The select() function is used to filter elements based on a condition. It takes a boolean expression as its argument.
- Example: Find objects where a value is greater than a number.
echo '[{"name":"Alice","age":30}, {"name":"Bob","age":45}]' | jq '.[] | select(.age > 40)' # Output: # { # "name": "Bob", # "age": 45 # }
Creating New Objects and Values
You can construct new JSON objects or arrays on the fly.
-
Example: Creating a new object with specific fields.
echo '{"name":"Alice","age":30,"city":"London"}' | jq '{username: .name, years_old: .age}' # Output: # { # "username": "Alice", # "years_old": 30 # }This is extremely useful for transforming data into a different format.
-
Example: Creating a new array.
echo '[{"name":"Alice"}, {"name":"Bob"}]' | jq '[.[] | .name]' # Output: # [ # "Alice", # "Bob" # ]Here, the outer brackets
[]telljqto collect all the individual outputs from the inner filter (.[] | .name) and put them into a single array.
Key Flags and Options
-
-r(raw output): By default,jqoutputs JSON values, which includes surrounding quotes for strings. The-rflag gives you the raw string output.echo '{"name":"Alice"}' | jq '.name' # Output: "Alice" echo '{"name":"Alice"}' | jq -r '.name' # Output: Alice -
-c(compact output): This flag produces minified, single-line JSON output. It’s the opposite of the default pretty-printing.echo '{"name":"Alice","age":30}' | jq -c '.' # Output: {"name":"Alice","age":30}
This crash course covers the most essential tools in the jq toolbox. With these filters and flags, you can handle a vast majority of common JSON manipulation tasks. For more advanced use cases, remember to check the official jq manual for its extensive list of functions and operators.
print more than one value
jq '
[
.[] |
select(.yaml_path) |
select(.yaml_path | contains("registry")) |
{ rule: .rule_name, path: .yaml_path}
] as $filtered_data
|
{
count: ($filtered_data | length),
results: $filtered_data
}
' bulk_converted_criteria.jsonExplanation:
-
[ ... ] as $filtered_data: This part builds the array of filtered objects exactly as before, but it stores the entire result in a variable called$filtered_data. -
{ count: ..., results: ... }: This creates a new JSON object. -
($filtered_data | length): It calculates the length of the stored array and assigns it to thecountkey. -
$filtered_data: It uses the stored array and assigns it to theresultskey. -
Why this additional select? This prevents from crashes when the json has no yaml_path object
I’ll explain the jq command that was used to reorder the fields:
Command Breakdown
jq '.[].criteria.actions |= map({pattern: .pattern, ACTION_UNIQUE_NAME: .ACTION_UNIQUE_NAME, rows: .rows})' updated_criteria.json > updated_criteria_reordered.jsonLet me break this down piece by piece:
1. jq - The JSON processor tool
jqis a lightweight command-line JSON processor- It allows you to slice, filter, map, and transform structured data
2. '.[]' - Array iteration
.[]iterates over each element in the top-level array`- In your case, it processes each rule object in the JSON array
3. .criteria.actions - Navigate to the actions array
- This selects the
actionsarray within thecriteriaobject of each rule
4. |= - Update operator
- The
|=operator updates the selected path with the result of the expression on the right - It’s equivalent to “modify in place”
5. map(...) - Transform each element
mapapplies a transformation to each element in the array- In this case, it transforms each action object
6. {pattern: .pattern, ACTION_UNIQUE_NAME: .ACTION_UNIQUE_NAME, rows: .rows} - Object reconstruction
- This creates a new object with fields in the desired order
.patternrefers to thepatternfield from the original object.ACTION_UNIQUE_NAMErefers to theACTION_UNIQUE_NAMEfield.rowsrefers to therowsfield- Key point: The order of fields in this object literal determines the output order
Before vs After:
- Before:
{"ACTION_UNIQUE_NAME": "...", "pattern": "...", "rows": [...]} - After:
{"pattern": "...", "ACTION_UNIQUE_NAME": "...", "rows": [...]}
The beauty of this command is that it preserves the entire JSON structure and all data - it only changes the order of the three fields within each action object.
concatenation of array values
jq '.[].criteria.actions[].rows[].VALUE |= (if type == "array" then map(if type == "string" then . else tostring end) | join(",") else . end)' ./updated_crieteria_limited.json > ./updated_criteria_limited_transformed.jsonconcat the json values with , in place
Even if rows is always a JSON array in your data, the error message “Cannot iterate over null (null)” indicates that at least one of the parent fields, criteria or actions[0], is null for some objects in your file. This causes the jq path to break before it even reaches rows.
The error doesn’t mean rows itself is null. It means the expression .criteria.actions[0].rows evaluates to null because one of the intermediate steps, like .criteria, or .actions[0], is null or doesn’t exist.
A More Robust JQ Command
To handle this, a better jq command should check for the existence of the rows array and its parent objects. The ? operator is a concise way to handle null or missing values gracefully.
Bash
jq '.[] | select( (.criteria.actions[0].rows? // []) | any(.condition == "NOT_EXISTS") ) | .name' ./temp.json
Breakdown of the Fix
-
?: The?afterrowsis a “null-propagation” operator. It prevents the command from throwing an error if.criteriaor.actions[0]isnull. Ifrowsisnullor doesn’t exist, the expression to the left of the//operator will evaluate tonull. -
// []: The//operator is a “fallback” or “default” operator. If the expression on the left isnull(which it will be ifrowsdoesn’t exist), it substitutes the value with the expression on the right—in this case, an empty array[]. -
any(...): Theany()function can then safely be applied to this new, guaranteed-to-be-an-array value. If the array is empty ([]),any()will simply returnfalse, and the object won’t be selected.
This command is the most idiomatic and efficient way to handle potential nulls in nested JSON structures. It’s more concise and robust than multiple type checks.