Retention and Data Lifecycle

LynxDB automatically manages data lifecycle through retention policies, tiered storage, and materialized view retention. Older data is deleted or moved to cheaper storage without manual intervention.

Retention Policy

The global retention setting controls how long data is kept before automatic deletion.

Config Key	`retention`
Env Var	`LYNXDB_RETENTION`
Default	`7d`
Hot-Reloadable	Yes

retention: "30d"

# Set via CLI flag
LYNXDB_RETENTION=30d lynxdb server

# Change at runtime (no restart)
lynxdb config set retention 30d
lynxdb config reload

Accepted duration formats:

7d -- 7 days
4w -- 4 weeks (28 days)
6h -- 6 hours (for testing)
90d -- 90 days
365d -- 1 year

How Retention Works

Segments older than the retention period are deleted during compaction:

The compaction scheduler runs at storage.compaction_interval (default: every 30 seconds)
It checks each segment's time range
Segments where all events are older than the retention period are deleted
Segments that partially overlap the retention boundary are compacted -- old events are dropped, recent events are kept

This means:

Deletion is not exact to the second -- it depends on segment boundaries
A segment spanning the retention boundary is not deleted until compaction splits it
Compaction must be running for retention to take effect

Tiered Storage Lifecycle

When S3 tiering is enabled, data moves through three tiers based on age:

Hot (local SSD)  -->  Warm (S3 Standard)  -->  Cold (S3 Glacier)
  Recent data          Older data               Archive data

Configuration

LynxDB handles hot-to-warm transitions automatically. Warm-to-cold transitions use S3 Lifecycle rules:

# LynxDB config -- hot to warm
storage:
  s3_bucket: "my-lynxdb-logs"
  tiering_interval: "5m"      # Check every 5 minutes

# AWS S3 Lifecycle -- warm to cold (Glacier after 90 days)
aws s3api put-bucket-lifecycle-configuration \
  --bucket my-lynxdb-logs \
  --lifecycle-configuration '{
    "Rules": [
      {
        "ID": "GlacierTransition",
        "Status": "Enabled",
        "Filter": {},
        "Transitions": [
          {"Days": 90, "StorageClass": "GLACIER"}
        ]
      },
      {
        "ID": "DeleteAfter365",
        "Status": "Enabled",
        "Filter": {},
        "Expiration": {"Days": 365}
      }
    ]
  }'

Example Lifecycle Policies

Startup (cost-sensitive):

retention: "30d"
storage:
  s3_bucket: "my-logs"
  # S3 Lifecycle: delete after 30 days

Mid-size company (compliance):

retention: "365d"
storage:
  s3_bucket: "my-logs"
  # S3 Lifecycle: Glacier after 90 days, delete after 365 days

Enterprise (long-term archive):

retention: "2555d"  # 7 years
storage:
  s3_bucket: "my-logs"
  # S3 Lifecycle: IA after 30 days, Glacier after 90 days, Deep Archive after 365 days

Materialized View Retention

Materialized views have their own retention policy, independent of the raw data retention:

# Create a view with 90-day retention
lynxdb mv create mv_errors_5m \
  'level=error | stats count, avg(duration) by source, time_bucket(timestamp, "5m") AS bucket' \
  --retention 90d

# Create a cascading view with longer retention
lynxdb mv create mv_errors_1h \
  '| from mv_errors_5m | stats sum(count) AS count by source, time_bucket(bucket, "1h") AS hour' \
  --retention 365d

This pattern lets you keep detailed data for a short period and pre-aggregated summaries for much longer:

Data	Retention	Query Speed
Raw events	7d	Normal
5-minute aggregates (MV)	90d	~400x faster
1-hour aggregates (cascading MV)	365d	~400x faster

Monitoring Retention

Check current data age and storage usage:

# View data age range
lynxdb status
# Oldest: 2026-02-01T10:30:00Z

# Check storage breakdown
lynxdb status --format json | jq '{total_events, storage_bytes, oldest_event}'

Changing Retention

Retention is hot-reloadable. Changing it takes effect at the next compaction cycle:

# Increase retention
lynxdb config set retention 90d
lynxdb config reload

# Decrease retention (data will be deleted at next compaction)
lynxdb config set retention 7d
lynxdb config reload

caution

Decreasing retention causes immediate data deletion at the next compaction cycle. This cannot be undone. Ensure you have backups if needed.

Estimating Storage Needs

Rule of thumb for storage estimation:

Raw Log Size	LynxDB Storage (LZ4)	Compression Ratio
1 GB/day raw logs	~200-400 MB/day on disk	2.5-5x
10 GB/day raw logs	~2-4 GB/day on disk	2.5-5x
100 GB/day raw logs	~20-40 GB/day on disk	2.5-5x

Example: 10 GB/day raw logs with 30-day retention = ~60-120 GB total disk usage.

With S3 tiering:

Hot tier (7 days on SSD): ~14-28 GB
Warm tier (remaining 23 days in S3): ~46-92 GB at S3 Standard pricing

Next Steps

S3 Tiering Configuration -- configure tiered storage
Materialized Views -- pre-aggregate for long-term retention
Backup and Restore -- protect against data loss
Performance Tuning -- optimize compaction

Retention Policy​

How Retention Works​

Tiered Storage Lifecycle​

Configuration​

Example Lifecycle Policies​

Materialized View Retention​

Monitoring Retention​

Changing Retention​

Estimating Storage Needs​

Next Steps​