Time Series Analysis

Log analytics often requires understanding trends over time: when do errors spike? Is latency increasing? How does traffic change throughout the day? LynxDB provides the every stage and the bin() function for time-based aggregation (formerly TIMECHART, BIN, and time_bucket() in SPL2).

Time bucketing with `every`

The every stage is the primary tool for time series analysis. It buckets events into time intervals and computes aggregations for each bucket.

Basic time series

Count events per 5-minute interval:

lynxdb query 'from main level=error | every 5m stats count() as count'

every is pure sugar: it desugars to stats count() as count by bin(_time, 5m) (visible with --show-rewritten), and the bucket timestamp emits as _time.

Choose the time span

The span argument controls the bucket size. Combine it with a bracket time range on from:

# 1-minute granularity (high detail)
lynxdb query 'from main[-1h] level=error | every 1m stats count() as count'

# 1-hour granularity (overview)
lynxdb query 'from main[-7d] level=error | every 1h stats count() as count'

# 1-day granularity (trend)
lynxdb query 'from main[-30d] level=error | every 1d stats count() as count'

Common span values: 1m, 5m, 10m, 15m, 30m, 1h, 6h, 12h, 1d.

Aggregation functions in `every`

Use any aggregate function, not just count:

# Average latency over time
lynxdb query 'from nginx | every 5m stats avg(duration_ms) as avg_lat'

# P99 latency over time
lynxdb query 'from nginx | every 5m stats p99(duration_ms) as p99_lat'

# Multiple aggregations
lynxdb query 'from nginx | every 5m stats count() as count, avg(duration_ms) as avg_lat, p99(duration_ms) as p99_lat'

# Sum of bytes transferred
lynxdb query 'from nginx | every 1h stats sum(bytes) as total_bytes'

Split by a field

Use by <field> to produce separate series for each value:

# Error count over time, split by source
lynxdb query 'from main level=error | every 5m by source stats count() as count'

# Latency by endpoint
lynxdb query 'from nginx | every 5m by uri stats avg(duration_ms) as avg_lat'

# Status code distribution over time
lynxdb query 'from nginx | every 5m by status stats count() as count'

warning

In every, the by clause comes before the stats keyword: every 5m by source stats count(). Writing every 5m stats count() by source is a parse error.

The `rate` shortcut

For plain event counts per bucket, the rate sugar stage is even shorter — it desugars to every <per> [by <keys>] stats count() as rate:

lynxdb query 'from nginx | rate per 5m by service'

`bin()` -- explicit time buckets in `stats`

bin(_time, <span>) snaps a timestamp to a bucket boundary. Used as a group key in stats, it is the explicit form of every — useful when you want to combine time bucketing with other groupings or computations.

info

_time is the canonical timestamp field in LynxFlow queries and API output. In a stats by list, the binned key emits as _time.

Basic binning

lynxdb query 'from nginx
  | stats count() as count, avg(duration_ms) as avg_lat by bin(_time, 5m)'

Correlate metrics with binned timestamps

lynxdb query 'from postgres duration_ms>1000
  | stats count() as slow_queries, avg(duration_ms) as avg_latency by bin(_time, 5m)
  | where slow_queries > 10'

`bin()` in `extend`

You can also materialize the bucket as a named column:

lynxdb query 'from nginx
  | extend hour = bin(_time, 1h)
  | stats avg(duration_ms) as avg_lat, count() as count by hour
  | sort hour'

`every` vs explicit `bin()`

Feature	`every`	`stats ... by bin(_time, ...)`
Convenience	One keyword, reads naturally	Explicit group key
Split by	`by` clause before `stats`	Any mix of keys in the `by` list
Flexibility	Fixed to time grouping	Combine time with other groupings, name the bucket via `extend`
Typical use	Quick time series charts	Complex multi-dimensional analysis

Both compile to the same plan — every is pure sugar.

`bin()` in materialized views

bin()-based bucketing (formerly time_bucket()) is essential for defining materialized views:

lynxdb mv create mv_errors_5m \
  'from main level=error | every 5m by source stats count() as count, avg(duration) as avg_duration' \
  --retention 90d

Practical examples

Find error spikes

Identify 5-minute windows with abnormally high error counts:

lynxdb query 'from main[-24h] level=error
  | every 5m stats count() as count
  | where count > 100'

Compare latency across endpoints

lynxdb query 'from nginx[-6h] | every 10m by uri stats avg(duration_ms) as avg_lat'

Traffic pattern analysis

See how request volume changes hour by hour:

lynxdb query 'from nginx[-7d] | every 1h stats count() as count'

Correlate errors with slow queries

lynxdb query 'from main[-6h] level=error or (source=postgres duration_ms>1000)
  | extend event_type = if(source == "postgres", "slow_query", "error")
  | every 5m by event_type stats count() as count'

Compute moving averages with streamstats

Smooth out spiky time series with a running average:

lynxdb query 'from nginx
  | every 5m stats avg(duration_ms) as avg_lat
  | sort +_time
  | streamstats window=6 avg(avg_lat) as moving_avg
  | keep _time, avg_lat, moving_avg'

The window=6 option computes the average over the trailing 6 rows (30 minutes at 5-minute intervals).

Time series on local files

Time analysis works in pipe mode and file mode:

# Analyze a local access log
lynxdb query --file access.log '| every 1h stats count() as count'

# Analyze kubectl output over time
kubectl logs deploy/api --since=6h | lynxdb query '
  | stats count() as count, avg(duration_ms) as avg_dur by bin(_time, 5m)'

Output format for time series

Time series data often needs to be consumed by other tools. Use --format csv for spreadsheet compatibility or --format json for programmatic use:

# CSV for spreadsheets or graphing tools
lynxdb query 'from main[-7d] level=error | every 1h stats count() as count' --format csv > errors_by_hour.csv

# JSON for programmatic use
lynxdb query 'from main[-7d] level=error | every 1h stats count() as count' --format json

Next steps

Run aggregations -- general aggregation patterns beyond time series
Materialized views -- precompute time-bucketed aggregations
every reference -- full every syntax
stats reference -- stats with bin() group keys
from reference -- bracket time ranges (from main[-1h], [@d], absolute ranges)

Time bucketing with every​

Basic time series​

Choose the time span​

Aggregation functions in every​

Split by a field​

The rate shortcut​

bin() -- explicit time buckets in stats​

Basic binning​

Correlate metrics with binned timestamps​

bin() in extend​

every vs explicit bin()​

bin() in materialized views​

Practical examples​

Find error spikes​

Compare latency across endpoints​

Traffic pattern analysis​

Correlate errors with slow queries​

Compute moving averages with streamstats​

Time series on local files​

Output format for time series​

Next steps​