Upgrading from grep/awk
If you live in the terminal and your log analysis toolkit is grep, awk, sed, and jq, LynxDB pipe mode gives you the power of a full analytics engine with zero setup. Same philosophy: read from stdin, process, write to stdout. No server, no config file, no daemon.
How Pipe Mode Works
LynxDB's query command detects when data is piped via stdin. It creates an ephemeral in-memory engine, ingests the data, runs your LynxFlow query, prints results, and exits. Nothing is saved to disk.
cat app.log | lynxdb query 'stats count() by level'
That one command ingests the log, groups it by level, and counts each group, work that would otherwise take a chain of grep, sort, and uniq.
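Detecting piped input is a common Unix pattern. As a rough illustration only (this is not LynxDB's implementation, and `input_mode` is a hypothetical helper name), a shell script can tell the two cases apart by checking whether stdin is attached to a terminal:

```shell
# Hypothetical helper: report how this process is receiving input.
# `[ -t 0 ]` succeeds when file descriptor 0 (stdin) is a terminal.
input_mode() {
    if [ -t 0 ]; then
        echo "terminal"   # nothing was piped or redirected in
    else
        echo "piped"      # data is arriving on stdin (pipe or redirect)
    fi
}
```

Running `cat app.log | input_mode` prints `piped`, while calling `input_mode` directly at an interactive prompt prints `terminal`.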
Side-by-Side Comparisons
Count Lines Matching a Pattern
# grep
grep -c "ERROR" app.log
# LynxDB
lynxdb query --file app.log 'from main level=error | stats count()'
Count by Field Value
# grep + sort + uniq
grep -oP 'level=\K\w+' app.log | sort | uniq -c | sort -rn
# awk
awk -F'level=' '{print $2}' app.log | awk '{print $1}' | sort | uniq -c | sort -rn
# LynxDB
lynxdb query --file app.log 'stats count() by level | sort -count'
Filter and Aggregate
# grep + awk (fragile, depends on log format)
grep "status=5" access.log | awk '{print $7}' | sort | uniq -c | sort -rn | head -10
# LynxDB (works with any log format)
lynxdb query --file access.log 'where status >= 500 | stats count() by uri | sort -count | head 10'
Average of a Numeric Field
# awk (assumes the value is the last field on each line)
awk '{sum+=$NF; n++} END {if (n) print sum/n}' data.log
# LynxDB
lynxdb query --file data.log 'stats avg(duration_ms)'
Percentiles
# awk (requires writing a percentile function)
# ... complex multi-line awk script ...
# LynxDB
lynxdb query --file data.log 'stats p50(duration_ms), p95(duration_ms), p99(duration_ms)'
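To give a sense of what the awk side involves, here is one possible sketch. It is not the script the comparison above has in mind: `percentile` is a hypothetical helper, it assumes the numeric value is the last whitespace-separated field on each line, and it uses the nearest-rank definition of a percentile, which may differ from how LynxDB computes p50/p95/p99.

```shell
# percentile P: read lines on stdin and print the P-th percentile
# (nearest-rank method) of the last field on each line.
percentile() {
    awk '{ print $NF }' | sort -n | awk -v p="$1" '
        { v[NR] = $1 }                # values arrive sorted ascending
        END {
            if (NR == 0) exit 1       # no data, nothing to report
            x = p * NR / 100          # fractional rank
            rank = int(x)
            if (rank < x) rank++      # round up: rank = ceil(x)
            if (rank < 1) rank = 1
            print v[rank]
        }'
}
```

Usage would look like `percentile 95 < data.log`. Each percentile needs its own pass over the data, and the whole input is sorted and held in memory, which is part of why this gets unwieldy compared with a single stats call.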
Time-Based Aggregation
# awk (requires parsing timestamps, bucketing, counting)
# ... very complex awk script ...
# LynxDB
lynxdb query --file app.log 'from main level=error | every 5m stats count()'
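For comparison, a minimal awk sketch of 5-minute bucketing is shown here under strong assumptions: every line begins with a timestamp shaped like 2024-05-01T12:03:27Z, and error lines contain the literal text level=error. Both the log format and the helper name `bucket_errors_5m` are hypothetical, and a different timestamp layout would need different substr offsets.

```shell
# bucket_errors_5m: count lines containing "level=error" per 5-minute bucket.
# Assumes field 1 is a timestamp like 2024-05-01T12:03:27Z, so the minute
# occupies characters 15-16 and "2024-05-01T12:" is the first 14 characters.
bucket_errors_5m() {
    awk '/level=error/ {
        minute = substr($1, 15, 2)          # e.g. "03"
        start  = minute - (minute % 5)      # 3 -> 0, 17 -> 15
        bucket = sprintf("%s%02d", substr($1, 1, 14), start)
        count[bucket]++
    }
    END { for (b in count) print b, count[b] }' | sort
}
```

Usage: `bucket_errors_5m < app.log` prints one line per bucket, such as `2024-05-01T12:00 2`. Buckets with no errors are simply absent from the output.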
Top Values
# grep + sort + uniq + head
grep -oP 'host=\K\S+' app.log | sort | uniq -c | sort -rn | head -5
# LynxDB
lynxdb query --file app.log 'top 5 host'
JSON Logs
# jq (one field at a time)
cat app.json | jq -r '.level' | sort | uniq -c | sort -rn
# jq (complex aggregation -- difficult)
cat app.json | jq -r '[.level, .source] | @tsv' | sort | uniq -c | sort -rn
# LynxDB (handles JSON natively)
cat app.json | lynxdb query 'stats count() by level, source | sort -count'
Extracting Fields with Regex
# grep -oP
grep -oP 'duration=\K\d+' app.log
# LynxDB (named capture groups)
lynxdb query --file app.log 'parse regex r"duration=(?P<dur>\d+)" | keep dur'
Chaining with Unix Tools
LynxDB outputs NDJSON when piped, so it composes with standard tools:
# LynxDB aggregation -> jq for further processing
lynxdb query --file app.log 'stats count() by host' | jq '.host'
# LynxDB filter -> CSV export -> sort
lynxdb query --file app.log 'stats count() by status' --format csv | sort -t, -k2 -rn
# LynxDB as a filter in a pipeline
cat huge.log | lynxdb query 'where level == "ERROR"' | wc -l
Common Recipes
Quick Error Count
cat app.log | lynxdb query 'where level == "ERROR" | stats count()'
Errors Per Service in the Last Hour
# Against a running server
lynxdb query 'from main level=error | stats count() by source' --since 1h
# Against a local file
lynxdb query --file app.log 'where level == "ERROR" | stats count() by source'
Slow Requests
kubectl logs deploy/api | lynxdb query 'where duration_ms > 1000 | stats avg(duration_ms), count() by endpoint | sort -count'
HTTP Status Code Distribution
lynxdb query --file access.log 'stats count() by status | sort -count'
Unique Visitors
lynxdb query --file access.log 'stats dc(client_ip) as unique_visitors'
Error Spike Detection
lynxdb query --file app.log 'from main level=error | every 1m stats count()'
Parse Unstructured Logs
# Extract IP, method, URI, and status from Apache combined log format
lynxdb query --file access.log \
'parse regex r"^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] .(?P<method>\S+) (?P<uri>\S+) \S+ (?P<status>\d+)"
| stats count() by status | sort -count'
Docker/Kubernetes Logs
# Docker
docker logs myapp 2>&1 | lynxdb query 'from main "OOM" | stats count() by container'
# Kubernetes
kubectl logs deploy/api --since=1h | lynxdb query 'stats avg(duration_ms) by endpoint'
# Multiple pods
kubectl logs -l app=api --all-containers | lynxdb query 'where level == "ERROR" | stats count() by pod'
Process Compressed Logs
zcat /var/log/app.log.gz | lynxdb query 'stats count() by level'
Why LynxDB Over grep/awk
| Capability | grep/awk/jq | LynxDB Pipe Mode |
|---|---|---|
| Simple text search | Easy | Easy |
| Count by field | Awkward (sort \| uniq -c) | stats count() by field |
| Averages, percentiles | Write your own function | Built-in (avg, p99, etc.) |
| Time-based buckets | Very difficult | every 5m stats count() |
| JSON parsing | jq (separate tool) | Native |
| Multiple aggregations | Near impossible | stats count(), avg(x), p99(x) by y |
| Top-N | sort \| head (no ties) | top 10 field |
| Joins | Not possible | join, let bindings |
| Output formats | Text only | JSON, table, CSV, TSV |
When to Keep Using grep
- Simple text search in a single file: grep "error" app.log is faster for one-off searches
- When you need regex match highlighting
- When you need line numbers: grep -n "pattern" file
LynxDB complements grep rather than replacing it. Use grep for quick text searches, and LynxDB when you need aggregation, statistics, or structured analysis.
Next Steps
- Pipe Mode Guide -- full pipe mode documentation
- LynxFlow Reference -- learn the query language
- First Query -- your first LynxFlow query
- Server Mode -- when you need persistence