When COVID hit the world a few months ago, an extended period of gloom seemed all but inevitable. Yet many companies in the data ecosystem have not just survived but in fact thrived.

Perhaps most emblematic of this is the blockbuster IPO of data warehouse provider Snowflake, which took place a couple of weeks ago and catapulted Snowflake to a $69 billion market cap at the time of writing – the biggest software IPO ever (see the S-1 teardown). And Palantir, an often controversial data analytics platform focused on the financial and government sectors, became a public company via direct listing, reaching a market cap of $22 billion at the time of writing (see the S-1 teardown).

Meanwhile, other recently IPO’ed data companies are performing very well in public markets. Datadog, for example, went public almost exactly a year ago (an interesting IPO in many ways, see my blog post here). When I hosted CEO Olivier Pomel at my monthly Data Driven NYC event at the end of January 2020, Datadog was worth $12 billion. A mere eight months later, at the time of writing, its market cap is $31 billion.

Many economic factors are at play, but ultimately financial markets are rewarding an increasingly clear reality long in the making: To succeed, every modern company will need to be not just a software company but also a data company. There is, of course, some overlap between software and data, but data technologies have their own requirements, tools, and expertise. And some data technologies involve an altogether different approach and mindset – machine learning, for all the discussion about commoditization, is still a very technical area where success often comes in the form of 90-95% prediction accuracy, rather than 100%. This has deep implications for how to build AI products and companies.

Of course, this fundamental evolution is a secular trend that started in earnest perhaps 10 years ago and will continue to play out over many more years. To keep track of this evolution, my team has been producing a “state of the union” landscape of the data and AI ecosystem every year; this is our seventh annual one. For anyone interested in tracking the evolution, here are the prior versions: 2012, 2014, 2016, 2017, 2018 and 2019 (Part I and Part II).

This post is organized as follows:

Key trends in data infrastructure

Key trends in analytics and enterprise AI

The 2020 landscape — for those who don’t want to scroll down, here is the landscape image

Let’s dig in.

Key trends in data infrastructure

There’s plenty going on in data infrastructure in 2020. As companies start reaping the benefits of the data/AI initiatives they started over the last few years, they want to do more. They want to process more data, faster and cheaper. They want to deploy more ML models in production. And they want to do more in real time.

This raises the bar on data infrastructure (and the teams building/maintaining it) and offers plenty of room for innovation, particularly in a context where the landscape keeps shifting (multi-cloud, etc.).

In the 2019 edition, my team highlighted a few trends:

A move from Hadoop to cloud services to Kubernetes + Snowflake

The increasing importance of data governance, cataloging, and lineage

The rise of an AI-specific infrastructure stack (“MLOps”, “AIOps”)

While those trends are still very much accelerating, here are a few more that are top of mind in 2020:

1. The modern data stack goes mainstream. The concept of “modern data stack” (a set of tools and technologies that enable analytics, particularly for transactional data) has been many years in the making. It started appearing as far back as 2012, with the launch of Redshift, Amazon’s cloud data warehouse.

But over the last couple of years, and perhaps even more so in the last 12 months, the popularity of cloud warehouses has grown explosively, and so has a whole ecosystem of tools and companies around them, going from leading edge to mainstream...
