In recent years, a new breed of cloud data platforms has sprung up in the backyard of hyperscale mainstays like AWS and Microsoft. Today, Snowflake, Databricks and a handful of others are successfully driving enterprise data efforts, enabling global giants to connect, store and generate insights from information coming from different sources.
These solutions offer companies enormous power and possibilities. But their dominance has also led to a kind of ‘gold rush’: a massive proliferation of upstack data infrastructure tools.
In the wake of the successes of Snowflake and Databricks, a busy ecosystem of tools has emerged. The tool suppliers are trying to unlock the potential of modern data platforms. But as their ranks grow, they may also see consolidation. Signs of that were seen earlier this week in the agreement of analytics engineering house dbt Labs to acquire Transform, which has sought to create a semantic data layer to better integrate the modern data stack.
While players like Snowflake and Databricks provide a platform to host the data and build applications, they can’t do it all. There are many areas of the data lifecycle that these solutions don’t fully support, such as data ingestion, transformation, orchestration, management, and observability. Modern upstack tools, provided by third-party vendors, fill these gaps.
“A large number of companies compete to offer different products and services to companies [that are] trying to build on the Snowflake and Databricks ecosystems,” said Sean Knapp, founder and CEO of Ascend.io, which automates data and analytics engineering workloads. Knapp told VentureBeat that crowding in this space has been exacerbated by overfunding, which has allowed many narrowly focused offerings to thrive as individual companies.
Evolution of data monoliths
As data platforms came to the fore, early adopters sought to address their immediate pain points by building the required software solutions themselves. This was the first wave in the evolution of upstack data tools, when there was no pattern or widespread adoption to justify the existence of enterprise solutions.
Gradually, as needs became clearer after the early adopter era, a second wave of point solutions emerged. This is where most enterprises are today. They use whatever specialized data tools they can find to solve small pieces of the puzzle and make significant gains in a short period of time.
Today, Snowflake and Databricks support dozens of partner tools. Some popular ones come from dbt Labs, Matillion, and Prophecy (for data preparation and transformation); Hightouch, Hevo, and Fivetran (for data capture); and Anomalo and Lightup (for data quality).
Meanwhile, trusted business intelligence offerings like Alteryx, Power BI, and Tableau provide custom analytics and visualization tools that are now widely used in Snowflake and Databricks implementations.
There is a lot of overlap in what the vendors offer, and many solutions also cover aspects such as data science and observability.
Most available upstack tools do their job well, but if there are too many solutions for different capabilities on the same infrastructure, teams can end up designing extremely complex data ecosystems. They have to assemble, integrate and manage all of their different tools simultaneously, which means they pay not only for the technology used, but also for engineering time and opportunity costs. This has a direct impact on ROI.
Further, when data bounces between multiple tools, it becomes very difficult to tune and optimize its movement and processing.
“Moving from a simple monolithic model to a complex model with hundreds or even thousands of interdependencies can lead to a data ecosystem that is difficult to understand and maintain, requires many expensive licenses, and requires a steep learning curve for user training and onboarding,” Ben Haynes, co-founder and CEO of Directus, told VentureBeat. Directus offers a data platform with a “back-end-as-service engine” for developers, along with no-code tooling for non-technical users.
The various component services within a stack are also constantly moving targets.
“If one of the services moves forward and another stagnates or becomes unsupported, the integrations and dependencies between them can break,” added Directus’ Haynes. “One dependency break can have a domino effect, bringing operations to a halt. Because microservices often don’t fit together perfectly, there may also be capability gaps that need to be filled with custom code and logic.”
Are new waves of consolidation on the horizon?
As teams grow weary of managing dozens of tools, and as standard patterns of what is needed in the long run become clear, a third wave, “rapid consolidation,” is expected to follow. Here, teams will try to implement a single platform that unifies most, if not all, of the capabilities they use. Such capabilities often include ingestion, transformation, and observability. The aim is to reduce complexity and focus better on core product requirements.
“What our data does, how we do it or how we apply the information may differ, but there are many common patterns. As we see these patterns emerge, there is immense value in creating a single platform that brings together many more of these capabilities,” explains Knapp.
“With consolidation, our teams don’t have to spend most of their time merging and integrating tools, which doesn’t add value,” he added. “The more uniform system makes them more efficient and paves the way for new developments. For example, you can apply really advanced layers of intelligence to the data lifecycle because you have more unified metadata and can build automated systems.”
Directus’ Haynes, in turn, sees a balanced “hub-and-spoke” model emerging, in which the hub serves as the foundation for common or critical functionality and does 80% of the work, but still makes it easy to connect other critical, hyper-specialized tools like those from Stripe, HubSpot, or Salesforce.
In general, the consolidation of upstack tools is expected to come through M&A, driven both by private equity firms and by the dominant data platforms.
For example, Snowflake recently announced its decision to acquire Myst for time series forecasting and SnowConvert to support cloud migration. Similarly, Thoma Bravo-owned Qlik announced last month its intention to join forces with Talend, another company owned by Thoma Bravo.
“It makes perfect sense that the Snowflakes and the Databricks of the world are very greedy. Whether we see really big acquisitions or if they come in the second half of this year or next year is an open question. I would probably bet more on the second half of this year and the beginning of next year,” Knapp said. He added that Snowflake and Databricks will likely be cautious about acquiring entities that could create competitive dynamics within their ecosystems.