Index Is Just One Tool: Why Observability Needs Multiple Storage Patterns


TL;DR: Indexes are great, but they’re not a religion. At modern scale, “index everything” slows writes and bloats storage. Observability works best when multiple storage patterns work together.
A single index can’t handle every need. Durable object storage keeps all data safe and complete. Small, focused indexes make frequent searches faster. Traces link logs from the same request, showing the full path it took. You get high fidelity, sane cost, and fast queries without pre‑deciding what to drop.
Observability shouldn’t be a choice between high costs and losing detail; there’s a better way. Instead of one store that tries to do everything, use different storage patterns, each built for a specific type of question. By matching the storage pattern to the type of question (wide scans, targeted lookups, or context-driven analysis) you keep fidelity high, control costs, and get answers faster.
Indexes make searches faster when you know the question. But observability questions are noisy, emergent, and messy. Production traffic shifts. Incident queries are often ad hoc. If your system uses one big index for everything, you’ll run into at least five big problems:
High cardinality → index blow-up
Fields like user_id, session_id, or request_id create millions of distinct values. The index grows fast, shards keep splitting and rebalancing, and writes slow down. Memory and storage climb just to keep the index healthy.
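A back-of-the-envelope sketch of why this happens (field names and counts here are illustrative, not measurements from any real system): an inverted index keeps one posting list per distinct term, so high-cardinality ID fields come to dominate its size while low-cardinality fields stay tiny.

```python
# Sketch: count the distinct terms an inverted index would carry per field.
# All numbers are made up for illustration, not benchmarks.
from collections import defaultdict

logs = [
    {"level": "error", "service": "checkout", "user_id": f"u{i}"}
    for i in range(10_000)
]

distinct_terms = defaultdict(set)
for record in logs:
    for field, value in record.items():
        distinct_terms[field].add(value)

# level and service each need one posting list; user_id needs 10,000.
for field, terms in distinct_terms.items():
    print(field, len(terms))
```

Ten thousand log lines produce one term for `level`, one for `service`, and ten thousand for `user_id`; the index work scales with the ID field, not the useful pivots.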
Multi-modal queries → cross-shard slowdowns
Incident queries mix styles: full-text search in logs, time-range filters, and joins to traces. A single index can’t optimize for all of that. The query has to touch many shards and then merge the results. Latency spikes right when you need answers most.
Lifecycle churn → constant background work
Real systems roll retention, replay data, and backfill missed events. In a single-index setup, each of these triggers re-indexing and segment moves. That background work competes with live traffic and turns routine maintenance into risk.
Cold data economics → paying for “hot” storage you don’t use
Keeping months or years of data hot and indexed is costly. Most questions hit the last few hours or days, but the index still carries the whole corpus. You pay the “hot” tax even when you only need cold data for occasional forensics or compliance.
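A rough cost model makes the “hot tax” concrete. The prices and volumes below are made-up placeholders, not quotes from any provider; the point is the ratio, not the numbers.

```python
# Illustrative cost model: keeping a full corpus hot and indexed
# vs. tiering cold data to object storage. Prices are placeholders.
HOT_PER_GB = 0.25      # indexed, replicated hot storage ($/GB-month)
COLD_PER_GB = 0.02     # compressed object storage ($/GB-month)

corpus_gb = 10_000     # e.g., a year of retained telemetry
hot_window_gb = 500    # the few days most queries actually touch

all_hot = corpus_gb * HOT_PER_GB
tiered = (hot_window_gb * HOT_PER_GB
          + (corpus_gb - hot_window_gb) * COLD_PER_GB)

print(f"all hot: ${all_hot:,.0f}/mo, tiered: ${tiered:,.0f}/mo")
```

Under these assumed prices, keeping everything hot costs roughly 8x the tiered setup, even though the query workload is identical.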
Operational fragility → hotspots and skew
Traffic isn’t even. Some services or tenants are noisier than others. Those keys create hot shards and skew. Teams spend time on shard sizing, capacity planning, and firefighting the index instead of debugging the incident.
Bottom line: observability questions are diverse. The more your queries vary, the less any one index fits them all. That’s why a multi-pattern approach (lake for truth, micro-indexes for speed, trace links for context, and light summaries for quick checks) holds up better under real load.
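The multi-pattern idea can be sketched as a router that sends each question to the pattern built for it. The query shapes and pattern names below are hypothetical, not a real API:

```python
# Sketch: route a query to the storage pattern that fits its shape.
# Query keys (trace_id, pivot fields, needle) are illustrative only.
def route(query: dict) -> str:
    if "trace_id" in query:
        return "trace-links"      # pivot by ID across logs and spans
    if query.keys() <= {"time_range", "service", "env", "level"}:
        return "micro-index"      # targeted lookup on indexed pivots
    return "lake-scan"            # full-fidelity scan of object storage

assert route({"trace_id": "abc123"}) == "trace-links"
assert route({"time_range": "-1h", "service": "checkout"}) == "micro-index"
assert route({"needle": "OOMKilled", "time_range": "-30d"}) == "lake-scan"
```

The design choice is that no single path has to be good at everything: ID pivots, pivot-field lookups, and ad hoc scans each get the store they are cheap on.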
CtrlB’s architecture was designed around these principles.
When you’re comparing architectures, the real question isn’t “Which index is best?” It’s “Which mix of approaches gives quick answers, keeps all the details, and stays affordable?”
Indexing is a powerful tool. But in observability, data is multi-modal and questions are unpredictable. Treat the index as one tool in the kit, not the entire workshop. When object storage, micro-indexes, trace-aware relationships, and lightweight caches work together, you get all three: speed, fidelity, and cost control.
Do I need to index everything?
No. Index the pivots (time, service, env, level, trace_id). Keep full fidelity in the lake.
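A minimal sketch of “index the pivots, keep fidelity in the lake” (the object keys, field names, and in-memory structures are hypothetical stand-ins for real storage):

```python
from collections import defaultdict

# The "lake": full-fidelity records, addressed by object key.
lake = {
    "s3://logs/part-0": {"ts": 1700000000, "service": "api", "level": "error",
                         "trace_id": "t1", "body": "upstream timeout"},
    "s3://logs/part-1": {"ts": 1700000060, "service": "api", "level": "info",
                         "trace_id": "t2", "body": "request ok"},
}

# Micro-index: only the pivot fields, each mapping value -> object keys.
PIVOTS = ("service", "level", "trace_id")
index = {p: defaultdict(list) for p in PIVOTS}
for key, rec in lake.items():
    for p in PIVOTS:
        index[p][rec[p]].append(key)

# A targeted lookup touches the tiny index, then fetches full records
# from the lake; the index never stores the high-cardinality body.
hits = [lake[k] for k in index["level"]["error"]]
```

Only the pivot values live in the index; everything else, including high-cardinality bodies and IDs you rarely search, stays in the durable lake until a query actually needs it.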
Will a lake‑first design make queries slow?
Not if you pair it with micro‑indexes and a small hot cache.
How do I tie logs to traces without timestamp joins?
Propagate and store trace_id (and service keys) in both. Pivot by ID.
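A small sketch of pivoting by ID instead of joining on timestamps (the records below are hypothetical; in practice the trace_id is propagated by your tracing instrumentation):

```python
from collections import defaultdict

# Both signals carry the same propagated trace_id.
spans = [{"trace_id": "t1", "service": "api", "duration_ms": 120}]
logs = [
    {"trace_id": "t1", "level": "error", "body": "db timeout"},
    {"trace_id": "t2", "level": "info", "body": "healthy"},
]

# Group logs by trace_id once; no fragile timestamp-window joins.
logs_by_trace = defaultdict(list)
for rec in logs:
    logs_by_trace[rec["trace_id"]].append(rec)

for span in spans:
    related = logs_by_trace[span["trace_id"]]
    # 'related' now holds every log line from the same request.
```

Because the join key is an exact ID rather than a time window, clock skew and retries can’t pull in unrelated lines.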
Where do I start if my data is already indexed elsewhere?
Keep the index for the hot path, but move the durable source of truth to object storage. Layer micro‑indexes and trace IDs over time.