Why Disk-Less Data Lakes Are the Future 

May 15, 2025

Let’s face it: working with logs and large-scale data systems can be a real grind. If you’ve ever stared at a spinning loader after hitting “search” or wondered why your storage bill looks like a phone number, you’re not alone.

The tools we’ve used for years just aren’t keeping up. Traditional data lakes, built around disk-based storage, were never meant for real-time data, massive scale, or the kind of flexibility modern teams need.

But what if you could ditch the slow storage layer altogether?

That’s where disk-less data lakes come in, and platforms like CtrlB are already showing how this approach makes log management not just faster, but smarter and cheaper.


The Problem with Traditional Data Lakes

When data lakes first came into the picture, they felt like a game-changer. Just throw in all your data (structured, unstructured, logs, metrics) and figure things out later. No need to set up strict rules or worry about structure upfront. It sounded perfect.

But that dream hasn’t quite worked out. In reality, a lot of companies find themselves stuck. Traditional setups like Hadoop or Elasticsearch (ES) rely on virtual machines (VMs) and local disk storage, and that comes with significant baggage:

  • Redundancy: You end up storing the same log or data point multiple times.
  • Duplication: Systems like ES create extra shards and replicas, inflating storage and compute costs.
  • Durability limits: Local disks on VMs are failure-prone. Even with replication, data loss isn’t rare.
  • High compute costs: You pay upfront to parse and index everything, even if you never query most of it.
  • Slow data access: Cold logs are often archived or offloaded, making them slow and expensive to retrieve.
  • Painfully slow queries: When data lives on disk, search times suffer. The more data you have, the slower it gets.

So What’s Disk-Less, and Why Does It Matter?

Disk-less data lakes break the marriage between storage and compute. Instead of tying everything to disks, they:

  • Store data in cheap, scalable object storage like Amazon S3.
  • Spin up compute only when you need it, and keep data stored for as long as you need it.
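
To make the split concrete, here is a minimal Python sketch of the idea. It is an illustration under stated assumptions, not any vendor's implementation: `InMemoryObjectStore` is a hypothetical stand-in for an object store such as S3 (a dict, so the example runs offline), and the hour-partitioned key layout produced by `hour_key` is just one plausible convention.

```python
from datetime import datetime, timezone


class InMemoryObjectStore:
    """Hypothetical stand-in for an object store such as S3."""

    def __init__(self):
        self._objects = {}

    def put(self, key, body):
        self._objects[key] = body

    def list_keys(self, prefix):
        # Object stores typically support listing keys by prefix.
        return sorted(k for k in self._objects if k.startswith(prefix))

    def get(self, key):
        return self._objects[key]


def hour_key(service, ts, batch_id):
    """Build an hour-partitioned key (assumed layout, for illustration only)."""
    return f"logs/{service}/{ts:%Y/%m/%d/%H}/batch-{batch_id:04d}.log"


def query(store, prefix, needle):
    """Stateless compute: read only the objects under `prefix`, keep nothing afterwards."""
    hits = []
    for key in store.list_keys(prefix):
        for line in store.get(key).decode("utf-8").splitlines():
            if needle in line:
                hits.append(line)
    return hits


store = InMemoryObjectStore()
nine = datetime(2025, 5, 15, 9, tzinfo=timezone.utc)
ten = datetime(2025, 5, 15, 10, tzinfo=timezone.utc)
store.put(hour_key("checkout", nine, 1), b"INFO order placed\nERROR payment timeout\n")
store.put(hour_key("checkout", ten, 1), b"ERROR card declined\n")

# Only the 09:00 partition is listed and read; the 10:00 object is never touched.
errors = query(store, "logs/checkout/2025/05/15/09/", "ERROR")
print(errors)
```

Because `query` holds no state between calls, the compute that runs it can be started for a single search and shut down afterwards, while the data stays put in the store.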

Blob storage (like AWS S3, GCS, etc.) is built for scale. It’s not tied to a physical disk or virtual machine (VM). That means:

  • You don’t need to worry about managing servers or storage disks.
  • You get very high durability. Amazon S3, for example, is designed for 99.999999999% durability (11 nines).
  • Your data is stored across multiple locations by default, meaning built-in redundancy without you having to set anything up.
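
To get a feel for what 11 nines means, the short Python sketch below turns the figure into an expected loss rate. It assumes the durability target can be read as an independent, per-object, per-year loss probability of 1 in 10^11, which is a simplification of a design target rather than a guarantee. The function name and the object count are made up for illustration.

```python
from fractions import Fraction


def expected_annual_losses(object_count, nines):
    """Expected number of objects lost per year at a durability of `nines` nines.

    Assumes each object is lost independently with probability 10**-nines per year.
    """
    annual_loss_probability = Fraction(1, 10 ** nines)
    return object_count * annual_loss_probability


# Hypothetical workload: ten million stored log objects.
losses = expected_annual_losses(10_000_000, 11)
years_per_lost_object = 1 / losses
print(losses, years_per_lost_object)
```

Under these assumptions, ten million objects would lose on average one object every 10,000 years. Exact fractions are used instead of floats so the tiny probability does not pick up rounding error.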

Compared to storing logs on VMs, which can fail, require maintenance, and aren’t built to scale indefinitely, blob storage is cheaper, safer, and far more flexible. It’s a subtle architectural shift with massive implications.

CtrlB’s Take: Disk-Less, Durable, and Efficient

At CtrlB, we’ve seen firsthand how painful it can be to work with traditional systems, especially when your team needs answers immediately. While we’re positioning ourselves as a data lake, our architecture already leans into disk-less principles. Here is what that looks like in the platform:

  • It stores raw logs directly on blob storage, with no disk dependence.
  • It uses on-demand compute, pulling and processing only what you query.
  • Logs live in efficient, scalable storage. Cold logs aren’t constantly loaded; they’re just fetched when needed.
  • Durability is inherited from blob storage, meaning your logs are safer and cheaper to store.
  • Search is fast: users get what they need without waiting forever.
  • It builds context dynamically, with sub-second latency.
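
The "fetch only what you query" idea can be sketched in a few lines of Python. This is a generic illustration, not CtrlB's actual implementation: the `cold_store` dict stands in for blob storage, the object keys and log lines are invented, and `fetch` records which objects were actually pulled.

```python
import gzip
from itertools import islice


def scan(fetch, keys, needle):
    """Lazily yield matching lines, fetching each object only when more results are needed."""
    for key in keys:
        text = gzip.decompress(fetch(key)).decode("utf-8")
        for line in text.splitlines():
            if needle in line:
                yield line


# Hypothetical compressed log objects sitting in blob storage.
cold_store = {
    "logs/2025/05/14.log.gz": gzip.compress(b"INFO boot\nERROR disk full\n"),
    "logs/2025/05/15.log.gz": gzip.compress(b"ERROR timeout\n"),
}

fetched = []  # records which objects were actually downloaded


def fetch(key):
    fetched.append(key)
    return cold_store[key]


# Ask for the first match only; the generator stops before touching the second object.
hits = list(islice(scan(fetch, sorted(cold_store), "ERROR"), 1))
print(hits, fetched)
```

Since `scan` is a generator, nothing is downloaded or decompressed until a caller asks for results, and a query that is satisfied early never pays for the remaining objects.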

It’s Time to Rethink the Stack

The old model made sense in the early days of cloud, when data was smaller and real-time wasn’t a requirement. But today’s systems are noisy, distributed, and always-on. Disk-less isn’t just a performance boost; it’s a shift in how we think about scale, flexibility, and cost. And once you’ve worked with a platform that doesn’t make you wait around for logs, it’s hard to go back.

If you’re:

  • Tired of bloated observability bills
  • Sick of maintaining costly VM-based setups
  • Storing more and more logs while querying only a fraction of them

It’s time to rethink the foundation.

CtrlB is the future of data lakes: fast, durable, and ready when you are.

Check out ctrlb.ai and see how fast log search and analytics can be.

