Industry Deep Dive: Open Source Databases

This industry is interesting not only because databases are the core of every data strategy (and data is becoming the core of every strategy) but also because the business model itself is still in question. With Benchmark recently leading the Series A of Timescale, a time-series database packaged on PostgreSQL, and with MongoDB's 2017 IPO, I thought it might be a good time to dig into the industry, companies, business models and prospects for startups.

Primer: Ajay Kulkarni, Co-Founder and CEO of Timescale, has a great post on the history of SQL versus NoSQL and where the industry is heading today.

To get context on growth rates, margins and revenue targets for new open source database startups, I looked at the IPO filings of four recent open source enterprise companies (Hortonworks, MongoDB, Pivotal and Cloudera) to understand what "success" looks like, or at least what Wall St. expects should a company make it to IPO. Importantly, these expectations influence the internal growth targets and financial forecasts of new startups.

Business Model and Growth Targets

Open source database companies typically generate revenue through subscription agreements (which let enterprises use the software for commercial purposes) and through professional services. Red Hat was able to build a multi-billion dollar business through services revenue alone, but this is most likely because it was part of the generational shift from Unix to Linux. Although the code is open source, companies implement an "open-core" model, where the core of the code is open source but enterprise-grade features require a license.

The tables below show Subscription revenue and gross margin at IPO (Yr 0), along with growth rates and margins for the 2 years preceding the IPO. We can use them to set targets at IPO for our fictional new open source database company: $150m of Subscription revenue at an 85% gross margin (and $30m of Services revenue at 15%).
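To make those targets concrete, here is a rough sketch of what they imply for total revenue and blended gross margin. The revenue and margin figures are the targets stated above; the variable names and the blended-margin calculation are my own illustration, not figures taken from the filings.

```python
# Targets for the fictional company at IPO, taken from the text above.
# All dollar amounts are in millions.
subscription_revenue_m = 150.0
subscription_margin = 0.85
services_revenue_m = 30.0
services_margin = 0.15

# Gross profit contributed by each revenue line.
subscription_gross_profit_m = subscription_revenue_m * subscription_margin
services_gross_profit_m = services_revenue_m * services_margin

# Blended figures across both lines (an illustrative calculation,
# not a number reported in any filing).
total_revenue_m = subscription_revenue_m + services_revenue_m
total_gross_profit_m = subscription_gross_profit_m + services_gross_profit_m
blended_margin = total_gross_profit_m / total_revenue_m

print(f"Total revenue: ${total_revenue_m:.0f}m")
print(f"Total gross profit: ${total_gross_profit_m:.0f}m")
print(f"Blended gross margin: {blended_margin:.1%}")
```

Under these assumptions, services would make up a sixth of revenue but only a small fraction of gross profit, which is the dynamic the services discussion below picks up on.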


The above graphs show why open source database companies are moving away from relying on services revenue as a key driver. Both the growth rates and margins are unreliable: the average services revenue growth rate at IPO was 24% at a gross margin of only 13%.

Funding Required to Achieve Growth Targets and IPO

The amount of capital required to go public (or just to build something of value) depends heavily on the company's industry. The low capital needs of software businesses are why they have allowed for another golden age of VC (with obvious exceptions: according to Crunchbase, Snap raised $4.6b in VC funding).

Below are graphs showing the amount of capital raised by each company before and after monetization. This distinction is important. The burn for engineers to build the database is small, so the amount raised (and therefore founder dilution) is low. During the monetization phase, however, significant amounts of capital are required. This money goes into the sales and marketing needed for the new database to compete against Oracle, Amazon and the other open source solutions available.


The practical implication of raising this much (per Crunchbase, MongoDB's Series F was $150m and Cloudera's Series F was $740m) is that for this amount of capital to be invested, the valuation must reach commensurately high levels. For example, if the founders were willing to give away 20% of the company in a $150m round, the post-money valuation would have to be $750m, which means a pre-money valuation of $600m. It would be interesting to see how this was justified, especially given that monetization had only just begun.
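The valuation arithmetic above can be sketched as follows. The round sizes are the Crunchbase figures quoted in the text; the 20% stake is the hypothetical from the example, not a reported term of either round, and `implied_valuations` is a helper name I made up for illustration.

```python
def implied_valuations(round_size_m: float, stake_sold: float) -> tuple[float, float]:
    """Return (pre_money, post_money) in $m for a round of the given size.

    stake_sold is the fraction of the company the new investors own
    after the round, so post-money = round size / stake sold.
    """
    if not 0 < stake_sold < 1:
        raise ValueError("stake_sold must be a fraction between 0 and 1")
    post_money = round_size_m / stake_sold
    pre_money = post_money - round_size_m
    return pre_money, post_money


# Round sizes quoted in the text (Crunchbase), in $m.
rounds_m = {"MongoDB Series F": 150.0, "Cloudera Series F": 740.0}

# Hypothetical dilution from the example above; the real terms may differ.
assumed_stake = 0.20

for name, size_m in rounds_m.items():
    pre_m, post_m = implied_valuations(size_m, assumed_stake)
    print(f"{name}: pre-money ${pre_m:,.0f}m, post-money ${post_m:,.0f}m")
```

Note how sensitive the result is to the assumed stake: halving the stake sold to 10% roughly doubles the valuation needed to absorb the same amount of capital.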

Industry Development and Future Prospects 

The shift back to SQL from the NoSQL sidetrack is going to offer massive potential for value generation. The question is who will benefit. AWS CEO Andy Jassy recently said on CNBC that Amazon Aurora (the scalable SQL database) is their fastest growing product ever. Google's leadership in the database community basically ushered in the move back to SQL (after Google had itself ushered in the first move from SQL to NoSQL). Given that reputation, its Spanner product is dominant.

But the startups are taking advantage of companies' growing despondency over perceived lock-in with larger vendors. Oracle has been the most damaging in this regard, and the same concern carries over directly to Amazon and Google. Companies want strategic flexibility, especially given the mission-critical nature of the database. They can only really get this by deploying a truly independent offering like that from Cockroach Labs, Timescale or the many other independent solutions. Still, it is hard to bet against Amazon and Google. It will be very interesting to see how the future unfolds.