Upgrading from grep/awk

If you live in the terminal and your log analysis toolkit is grep, awk, sed, and jq, LynxDB pipe mode gives you the power of a full analytics engine with zero setup. Same philosophy: read from stdin, process, write to stdout. No server, no config file, no daemon.

How Pipe Mode Works

LynxDB's query command detects when data is piped via stdin. It creates an ephemeral in-memory engine, ingests the data, runs your SPL2 query, prints results, and exits. Nothing is saved to disk.

cat app.log | lynxdb query '| stats count by level'

This is the equivalent of a full analytics pipeline in a single command.
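For contrast, here is roughly what that one command replaces when built from classic tools. The sed capture of the level= field is an assumption about the log format; getting that extraction right per-format is exactly the fragility the comparisons below illustrate:

```shell
# Classic equivalent of `| stats count by level`:
# extract the field, sort, count, rank -- one hand-built stage per question.
printf 'level=info msg=a\nlevel=error msg=b\nlevel=error msg=c\n' |
  sed -n 's/.*level=\([a-z]*\).*/\1/p' |
  sort | uniq -c | sort -rn
```

Every additional question (an average, a percentile, a time bucket) means another hand-built stage; in pipe mode it is one more clause in the query.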

Side-by-Side Comparisons

Count Lines Matching a Pattern

# grep
grep -c "ERROR" app.log

# LynxDB
lynxdb query --file app.log 'level=error | stats count'

Count by Field Value

# grep + sort + uniq
grep -oP 'level=\K\w+' app.log | sort | uniq -c | sort -rn

# awk
awk -F'level=' '{print $2}' app.log | awk '{print $1}' | sort | uniq -c | sort -rn

# LynxDB
lynxdb query --file app.log '| stats count by level | sort -count'

Filter and Aggregate

# grep + awk (fragile, depends on log format)
grep "status=5" access.log | awk '{print $7}' | sort | uniq -c | sort -rn | head -10

# LynxDB (works with any log format)
lynxdb query --file access.log '| where status>=500 | stats count by uri | sort -count | head 10'

Average of a Numeric Field

# awk
awk '{sum+=$NF; n++} END {print sum/n}' data.log

# LynxDB
lynxdb query --file data.log '| stats avg(duration_ms)'

Percentiles

# awk (requires writing a percentile function)
# ... complex multi-line awk script ...

# LynxDB
lynxdb query --file data.log '| stats p50(duration_ms), p95(duration_ms), p99(duration_ms)'
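To make the comparison concrete, here is one way such a percentile function can look in portable sort + awk, using the nearest-rank method (a sketch: seq stands in for extracting duration_ms from data.log):

```shell
# Nearest-rank percentiles over a numeric stream, the sort + awk way.
seq 1 100 |            # stand-in for: awk '{print $NF}' data.log
  sort -n | awk '
    { v[NR] = $1 }
    # nearest-rank index ceil(q*N), with an epsilon for float noise
    function pct(q,  i) {
      i = int(q * NR); if (q * NR - i > 1e-9) i++; if (i < 1) i = 1
      return v[i]
    }
    END { printf "p50=%s p95=%s p99=%s\n", pct(0.50), pct(0.95), pct(0.99) }'
# -> p50=50 p95=95 p99=99
```

Correct, but every detail (sorting, rank rounding, float noise) is on you; the one-line stats clause above is why the built-in wins.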

Time-Based Aggregation

# awk (requires parsing timestamps, bucketing, counting)
# ... very complex awk script ...

# LynxDB
lynxdb query --file app.log 'level=error | timechart count span=5m'

Top Values

# grep + sort + uniq + head
grep -oP 'host=\K\S+' app.log | sort | uniq -c | sort -rn | head -5

# LynxDB
lynxdb query --file app.log '| top 5 host'

JSON Logs

# jq (one field at a time)
cat app.json | jq -r '.level' | sort | uniq -c | sort -rn

# jq (complex aggregation -- difficult)
cat app.json | jq -r '[.level, .source] | @tsv' | sort | uniq -c | sort -rn

# LynxDB (handles JSON natively)
cat app.json | lynxdb query '| stats count by level, source | sort -count'

Extracting Fields with Regex

# grep -oP
grep -oP 'duration=\K\d+' app.log

# LynxDB (named capture groups)
lynxdb query --file app.log '| rex field=_raw "duration=(?P<dur>\d+)" | table dur'

Chaining with Unix Tools

LynxDB outputs NDJSON when piped, so it composes with standard tools:

# LynxDB aggregation -> jq for further processing
lynxdb query --file app.log '| stats count by host' | jq '.host'

# LynxDB filter -> CSV export -> sort
lynxdb query --file app.log '| stats count by status' --format csv | sort -t, -k2 -rn

# LynxDB as a filter in a pipeline
cat huge.log | lynxdb query '| where level="ERROR"' | wc -l

Common Recipes

Quick Error Count

cat app.log | lynxdb query '| where level="ERROR" | stats count'

Errors Per Service in the Last Hour

# Against a running server
lynxdb query 'level=error | stats count by source' --since 1h

# Against a local file
lynxdb query --file app.log '| where level="ERROR" | stats count by source'

Slow Requests

kubectl logs deploy/api | lynxdb query '| where duration_ms > 1000 | stats avg(duration_ms), count by endpoint | sort -count'

HTTP Status Code Distribution

lynxdb query --file access.log '| stats count by status | sort -count'

Unique Visitors

lynxdb query --file access.log '| stats dc(client_ip) as unique_visitors'

Error Spike Detection

lynxdb query --file app.log 'level=error | timechart count span=1m'

Parse Unstructured Logs

# Extract IP and status from Apache combined log format
lynxdb query --file access.log \
'| rex field=_raw "^(?P<ip>\S+) .* \"(?P<method>\S+) (?P<uri>\S+) .*\" (?P<status>\d+)"
| stats count by status | sort -count'

Docker/Kubernetes Logs

# Docker
docker logs myapp 2>&1 | lynxdb query '| search "OOM" | stats count by container'

# Kubernetes
kubectl logs deploy/api --since=1h | lynxdb query '| stats avg(duration_ms) by endpoint'

# Multiple pods
kubectl logs -l app=api --all-containers | lynxdb query '| where level="ERROR" | stats count by pod'

Process Compressed Logs

zcat /var/log/app.log.gz | lynxdb query '| stats count by level'

Why LynxDB Over grep/awk

Capability              grep/awk/jq                  LynxDB Pipe Mode
Simple text search      Easy                         Easy
Count by field          Awkward (sort | uniq -c)     stats count by field
Averages, percentiles   Write your own function      Built-in (avg, p99, etc.)
Time-based buckets      Very difficult               timechart count span=5m
JSON parsing            jq (separate tool)           Native
Multiple aggregations   Near impossible              stats count, avg(x), p99(x) by y
Top-N                   sort | head (no ties)        top 10 field
Joins                   Not possible                 JOIN, CTEs
Output formats          Text only                    JSON, table, CSV, TSV
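The Top-N row deserves a quick demonstration: head truncates at an arbitrary boundary when counts tie, so one of two equally frequent values silently disappears:

```shell
# 'a' and 'b' both appear twice, but head -1 keeps only one of them --
# and which one depends on the sort implementation's tie-breaking.
printf 'a\na\nb\nb\nc\n' | sort | uniq -c | sort -rn | head -1
```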

When to Keep Using grep

  • Simple text search in a single file: grep "error" app.log is faster for one-off searches
  • When you need regex match highlighting
  • When you need line numbers: grep -n "pattern" file

LynxDB complements grep rather than replacing it. Use grep for quick text searches, and LynxDB when you need aggregation, statistics, or structured analysis.

Next Steps