How One Startup Is Taming the High Cardinality Problem
- By Winston Thomas
- December 16, 2024
![](/sites/default/files/inline-images/iStock-1302924529.jpg)
There is a reason why observability became a major buzzword in 2024. Sprawling cloud platforms and distributed applications mean IT teams are always scrambling when things go south.
Gartner calls observability a competitive business advantage. Analysts also predict the market will reach USD4.1 billion by 2028.
But there’s a catch. Observability promised to tame the complexity of distributed, cloud-native apps, but those same apps gave rise to another beast of a problem: what industry insiders call the high cardinality problem.
Deconstructing the beast
Cardinality is, in this case, the number of unique combinations a metric can take. The more dimensions and more key-value pairs, the higher the cardinality gets.
Unlike traditional monolithic apps, cloud-native apps can make cardinality explode. Microservices architecture, dynamic creation and destruction of instances, rich instrumentation, and a torrent of user metrics and environment-specific metrics all add up to a high cardinality scenario.
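To make the arithmetic concrete, here is a minimal Python sketch of how label dimensions multiply. The label names and counts are hypothetical illustrations, not figures from Kloudfuse or any other vendor, and the two helper functions are invented for this example.

```python
from math import prod


def max_cardinality(label_value_counts: dict[str, int]) -> int:
    """Upper bound on distinct series for one metric:
    the product of the number of distinct values per label."""
    return prod(label_value_counts.values())


def observed_cardinality(samples: list[dict[str, str]]) -> int:
    """Number of distinct label combinations actually seen in the samples."""
    return len({tuple(sorted(s.items())) for s in samples})


# Hypothetical label counts for a single request-count metric.
monolith = {"endpoint": 20, "status": 5, "host": 4}
cloud_native = {**monolith, "service": 50, "pod": 200, "region": 3}

print(max_cardinality(monolith))      # 400
print(max_cardinality(cloud_native))  # 12000000

# Hypothetical samples: short-lived pods add new combinations on their own.
samples = [
    {"endpoint": "/cart", "status": "200", "pod": "a1"},
    {"endpoint": "/cart", "status": "200", "pod": "a1"},  # duplicate combination
    {"endpoint": "/cart", "status": "500", "pod": "a1"},
    {"endpoint": "/cart", "status": "200", "pod": "b7"},
]
print(observed_cardinality(samples))  # 3
```

The product is only an upper bound, since real systems rarely emit every possible combination. Still, it shows why adding one high-churn label such as a pod identifier can multiply the number of series a backend has to store and index.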
The problem with high cardinality is that it can cripple performance, obscure crucial patterns, and, most importantly, send budgets into a death spiral. Many platforms resort to drastic measures to combat this, advising users to limit data dimensions, warning about performance bottlenecks, or restricting unique labels per metric.
But Krishna Yadappanavar saw an opportunity. “And then that’s where we started focusing on building this observability data lake,” he says.
Solving the IT Gordian Knot
With a resume that includes a decade at VMware and a successful exit to Cisco for close to USD400 million, Yadappanavar is no stranger to the quirks of distributed, cloud-native apps.
His latest venture, Kloudfuse, isn’t solving a small problem — it’s tackling what he calls the “5C problem” in observability: cardinality, control, causality, consolidation and cost. While most observability platforms buckle under the high cardinality pressure, this startup wants to turn it into a playground.
Kloudfuse’s approach hinges on an observability data lake. It wrangles different telemetry streams — metrics, events, logs, and traces — for real-time decision-making.
This unification simplifies operations by eliminating the need for multiple disparate tools and allowing customers to keep their data within their virtual private clouds (VPCs), reducing complexity and operational costs.
“Kloudfuse has transformed the way we monitor and manage our infrastructure and applications,” said Pankaj Pandey, director of engineering and architect at Tata 1mg, in a recent press release. “It has not only enhanced our operational efficiency but also empowered our teams to proactively address issues before they impact our customers.”
Architecting the ultimate single point of view
Where other observability platforms offer fragmented views, Yadappanavar sees Kloudfuse providing a unified command center that sits above your entire infrastructure, orchestrating data like a digital symphony conductor. Think of it as air traffic control for your applications. And that was version 1.0.
With version 3.0, Kloudfuse is pushing even further into the frontier of observability. By integrating advanced machine learning models initially developed by tech giants like Facebook and Amazon, the platform is not just monitoring systems; it is predicting their behavior. It also unifies observability across the front ends and back ends of applications.
“Kloudfuse 3.0 sets a new standard in unified observability by focusing on critical areas such as data, AI and analytics, handling large volumes of data to ensure scalability, deployment flexibility, and enterprise-grade features,” declares Yadappanavar. “Customers can now gain deeper insights into their digital experiences and optimize performance in real-time.”
Kloudfuse has features designed to give developers X-ray vision into their applications. Continuous Profiling helps pinpoint performance bottlenecks down to the line of code, while Real User Monitoring (RUM) offers a granular view of user experiences.
Advanced analytics and AI tools like K-Lens and Facebook’s Prophet enhance anomaly detection, sniffing out outliers and performance hiccups with laser precision. And for those drowning in log data, Log Archival and Hydration offer a lifeline, efficiently storing logs without sacrificing accessibility.
Visualizing complex datasets is a breeze with heatmaps and multi-attribute charts, which make it easier to spot those hidden gremlins in the system. Kloudfuse even has its own query language, FuseQL, designed to overcome the limitations of traditional log querying and turbocharge performance. The company has also thrown in support for Arm processors, VPCs, and a suite of enterprise-grade security features like RBAC, SSO, and multi-key authentication, just for good measure.
One view to rule all Ops
What truly sets Kloudfuse apart from established companies like Datadog, Grafana, New Relic, etc., is its vision of a unified operations landscape.
In a world increasingly fragmented by a dizzying array of “Ops” — DevOps, SecOps, FinOps, and now LLMOps — Yadappanavar sees a different future. One where these aren’t warring factions but collaborators, each with a unique lens on the same underlying data.
Whether you’re a DataOps specialist looking at data quality or an LLMOps engineer tracking model performance, the underlying data “is similar,” says Yadappanavar. He sees each team viewing it through a different lens.
This might sound deceptively simple, but a single platform offering multiple perspectives makes sense. “So that’s the model we are moving towards, where a single platform can solve all the operations [needs],” says Yadappanavar.
The road ahead
Kloudfuse is not done yet. With upcoming features like security information and event management (SIEM) integration and advanced LLM observability, they’re positioning themselves at the cutting edge of an observability revolution.
As app development and IT continue their relentless march toward complexity, observability is no longer a luxury but a necessity. Kloudfuse is not alone; big names like Splunk, IBM, ManageEngine, Datadog, and New Relic are retooling their platforms with AI. But Yadappanavar is betting that platforms built from the ground up to tame high cardinality will have a distinct advantage.
Cloud-native app management may be a tornado, but it is no longer something to fear. Yadappanavar sees it as something to conduct.
Image credit: iStockphoto/Amiak
Winston Thomas
Winston Thomas is the editor-in-chief of CDOTrends. He likes to piece together the weird and wondering tech puzzle for readers and identify groundbreaking business models led by tech while waiting for the singularity.