Your First LynxFlow Query

LynxFlow is LynxDB's query language. It's a pipeline language inspired by Splunk's SPL -- data flows left to right through pipe (|) operators.

The Pipeline Concept

Every LynxFlow query is a pipeline of operators:

from <dataset> [search terms] | operator1 args | operator2 args | operator3 args

Data starts on the left (a from stage with optional search terms) and flows through each operator. Each operator transforms the data and passes it to the next.

Step 1: Search

The simplest query is a keyword search attached to the from stage:

from main error

This finds all events containing the word "error". You can also search specific fields:

from main level=error

Combine terms with boolean operators:

from main level=error source=nginx
from main level=error OR level=warn
from main level=error NOT source=redis

tip

If your query omits the from stage, LynxDB reads from the default main dataset. So stats count() is equivalent to from main | stats count(). Note that search-term sugar like level=error only works in the from stage -- everywhere else, use where with ==.

Step 2: Filter with WHERE

Use where for precise filtering. Inside where, == compares (a bare = binds values, so it is not a comparison):

from main source=nginx | where status >= 500
from main source=nginx | where status >= 500 and duration_ms > 1000
from main source=nginx | where contains(uri, "/api/")

Step 3: Aggregate with STATS

stats computes aggregations. Aggregate calls always take parentheses -- count(), not count:

// Count events
from main | stats count()

// Count by field
from main level=error | stats count() by source

// Multiple aggregations
from main source=nginx | stats count(), avg(duration_ms), p99(duration_ms) by uri

// With renaming
from main source=nginx | stats count() as requests, avg(duration_ms) as avg_latency by uri

Step 4: Sort and Limit

Alias count() with as count when later stages reference it:

// Sort descending (prefix with -)
from main source=nginx | stats count() as count by uri | sort -count

// Take top N
from main source=nginx | stats count() as count by uri | sort -count | head 10

// Or use the TOP shortcut
from main source=nginx | top 10 uri

Step 5: Select Columns

// Pick specific fields
from main level=error | keep _time, source, message

// Remove fields
from main level=error | drop _raw

Step 6: Transform with EXTEND

Create computed fields:

from main source=nginx
  | stats count() as total, count(where status >= 500) as errors by uri
  | extend error_rate = round(errors / total * 100, 1)
  | where error_rate > 5
  | sort -error_rate
  | keep uri, total, errors, error_rate

Step 7: Time Series with EVERY

Aggregate over time buckets:

// Error count per 5-minute bucket
from main level=error | every 5m stats count()

// Error count by source per 5 minutes
from main level=error | every 5m by source stats count()

every is sugar for grouping by a time bucket -- every 5m stats count() desugars to stats count() by bin(_time, 5m).

Step 8: Extract Fields with PARSE

Extract new fields from raw text using regex:

from main "connection refused"
  | parse regex r"host=(?P<host>\S+) port=(?P<port>\d+)"
  | stats count() as count by host, port
  | sort -count

Putting It All Together

Here's a real-world query that finds the slowest API endpoints with high error rates:

from main source=nginx
  | stats count() as total,
          count(where status >= 500) as errors,
          avg(duration_ms) as avg_latency,
          p99(duration_ms) as p99_latency
    by uri
  | extend error_rate = round(errors / total * 100, 1)
  | where error_rate > 5 or p99_latency > 1000
  | sort -error_rate
  | keep uri, total, errors, error_rate, avg_latency, p99_latency

Operator Quick Reference

Operator	What it does	Example
`from`	Choose dataset + search sugar	`from main "connection refused"`
`where`	Filter rows	`\| where status >= 500`
`stats`	Aggregate	`\| stats count(), avg(x) by y`
`extend`	Compute fields	`\| extend rate = errors / total * 100`
`sort`	Order results	`\| sort -count`
`head`	Limit results	`\| head 10`
`keep`	Select columns	`\| keep uri, count`
`drop`	Remove fields	`\| drop _raw`
`parse`	Extract via regex/json/logfmt	`\| parse regex r"host=(?P<host>\S+)"`
`every`	Time series	`\| every 5m stats count()`
`top`	Top N values	`\| top 10 uri`
`dedup`	Remove duplicates	`\| dedup host`

Next Steps

LynxFlow v2 Reference -- Full language reference
Searching & Filtering -- Advanced search techniques
Aggregations -- All aggregation functions
STATS operator -- Detailed operator reference

The Pipeline Concept​

Step 1: Search​

Step 2: Filter with WHERE​

Step 3: Aggregate with STATS​

Step 4: Sort and Limit​

Step 5: Select Columns​

Step 6: Transform with EXTEND​

Step 7: Time Series with EVERY​

Step 8: Extract Fields with PARSE​

Putting It All Together​

Operator Quick Reference​

Next Steps​