Working with JSON Logs

LynxDB provides multiple tools for working with JSON log data, from zero-config dot-notation to full structural extraction and array manipulation. This guide covers all the approaches and when to use each one.

Quick reference

Approach	Use when	Example
Dot-notation	Ad-hoc queries on nested JSON	`\| where response.status >= 500`
`\| json`	Quick extraction of all/some fields	`\| json \| stats count by level`
`\| unpack_json`	Production pipelines with prefix/field control	`\| unpack_json prefix=app_`
`json_extract()`	Extracting one value in an eval/where	`\| eval name = json_extract(data, "user.name")`
`\| unroll`	Exploding JSON arrays into rows	`\| unroll field=items`

Dot-notation: zero-config nested access

LynxDB supports dot-notation for accessing nested JSON fields directly in where, eval, and stats without any explicit extraction step.

# Filter on a nested field -- no unpack needed
echo '{"level":"error","request":{"method":"POST","duration_ms":5012}}' \
  | lynxdb query '| where request.duration_ms > 1000 | table level, request.method'

How it works

When the VM encounters a field like request.method:

Direct lookup -- checks if a field literally named request.method exists (e.g., from a previous unpack_json).
Root extraction -- if not found, looks for a field named request and extracts method from its JSON value.
_raw fallback -- if the root field doesn't exist either, tries to extract request.method from _raw.

This means dot-notation works whether or not you've explicitly extracted fields.

Multi-level nesting

-- Access deeply nested values
| where response.headers.content_type = "application/json"
| eval origin = request.headers.origin
| stats count by response.status

`| json` -- quick extraction

The | json command extracts all JSON keys from _raw (or a specified field) into top-level fields.

# Extract all fields
cat app.json | lynxdb query '| json | stats count by level'

# Extract specific paths only (faster for wide objects)
cat app.json | lynxdb query '| json level, status, duration_ms | where status >= 500'

# Extract from a non-default field
cat logs.json | lynxdb query '| json field=message | table level, service'

`| unpack_json` -- production extraction

For production pipelines, unpack_json provides full control over extraction:

-- Add a prefix to avoid field name collisions
| unpack_json prefix=app_
| where app_level = "error"

-- Extract only specific fields (performance optimization)
| unpack_json fields=level,status,duration_ms

-- Keep the original _raw field
| unpack_json keep_original=true

-- Extract from a specific field
| unpack_json field=metadata prefix=meta_

JSON eval functions

Use JSON functions in eval and where for precise extraction and validation.

Extract a single value

| eval user = json_extract(payload, "user.name")
| eval first_tag = json_extract(metadata, "tags.0")

Validate JSON before processing

| where json_valid(message) = true
| unpack_json field=message

Inspect structure

-- Get keys of a JSON object
| eval keys = json_keys(config)

-- Get length of a JSON array
| eval num_items = json_array_length(order, "items")
| where num_items > 10

Build JSON output

-- Construct a JSON object for export
| stats count by host, level
| eval summary = json_object("host", host, "level", level, "count", count)
| table summary

See the JSON Functions Reference for the complete API.

`| unroll` -- exploding arrays

When a field contains a JSON array, | unroll creates one row per element.

echo '{"order":"ORD-1","items":[{"sku":"A1","qty":2},{"sku":"B3","qty":1}]}' \
  | lynxdb query '| json | unroll field=items | table order, items.sku, items.qty'

Output:

order    items.sku    items.qty
ORD-1    A1           2
ORD-1    B3           1

Array of objects

Object elements are flattened with dot-notation. For field=items:

items.sku = the sku key from each element
items.qty = the qty key from each element

Array of scalars

Scalar elements replace the field value directly:

-- Input: {"name": "alice", "tags": ["admin", "user"]}
| json | unroll field=tags
-- Row 1: name=alice, tags=admin
-- Row 2: name=alice, tags=user

Aggregate over unrolled data

| json
| unroll field=items
| stats sum(items.qty) AS total_sold, dc(order_id) AS orders by items.sku
| sort -total_sold
| head 20

Chaining parsers

Real-world logs often have nested formats. Chain extraction commands to handle them:

Docker JSON logs with embedded application log

# Docker wraps each log line in JSON: {"log":"...","stream":"stdout","time":"..."}
# The inner "log" field contains the application's logfmt output
cat docker-logs.json | lynxdb query '
  | json
  | unpack_logfmt field=log prefix=app_
  | where app_level = "error"
  | stats count by app_service'

Syslog with embedded JSON

# Syslog header wraps a JSON application message
cat syslog.log | lynxdb query '
  | unpack_syslog
  | unpack_json field=message prefix=app_
  | stats count by hostname, app_level'

Nginx access log with JSON body field

cat access.log | lynxdb query '
  | unpack_combined
  | unpack_json field=request_body prefix=body_
  | where body_action = "purchase"
  | stats sum(body_amount) by client_ip'

Performance tips

Use dot-notation for ad-hoc queries. No extraction overhead for fields you don't materialize.
Specify fields= when using unpack_json. Extracting 3 fields from a 50-key JSON object is much faster than extracting all 50.
Use json_extract() in eval/where for single values. When you only need one field, json_extract avoids parsing the entire object.
Place extraction early in the pipeline. Extract before where so the filter can operate on typed fields:
```
| json | where status >= 500   -- fast: status is an integer
```
Use json_valid() to guard against malformed data. Avoid parse errors in production pipelines:
```
| where json_valid(_raw) = true | json
```

Next steps

json command reference
unpack_json reference
unroll reference
JSON Functions reference
Field Extraction Guide -- REX, EVAL, and schema-on-read

Quick reference​

Dot-notation: zero-config nested access​

How it works​

Multi-level nesting​

| json -- quick extraction​

| unpack_json -- production extraction​

JSON eval functions​

Extract a single value​

Validate JSON before processing​

Inspect structure​

Build JSON output​

| unroll -- exploding arrays​

Array of objects​

Array of scalars​

Aggregate over unrolled data​

Chaining parsers​

Docker JSON logs with embedded application log​

Syslog with embedded JSON​

Nginx access log with JSON body field​

Performance tips​

Next steps​