Machine Learning in Private Investment

Short post: The establishment tools used to 'model' the value of a company (CAPM, DCF, etc.) use inputs such as variance in returns and revenue to quantify and ascribe 'value' to a company. People are comfortable with them (despite Mandelbrot, Taleb et al.) because the quantities they use are 'primary' to the underlying asset (the company): revenue, variance of returns, growth rate, etc.

I think in applying machine learning, we may look at 'secondary' values, so-called 'alternative data': satellite images, metadata on companies, investor network graphs, hidden relationships. The private investment system is complex, but there must be a mechanical, abstract representation of the system that we can use to improve the likelihood of investment success beyond 50-year-old establishment-level financial theory.

Note: For a great primer on the usefulness of mechanical representations (models) in system representation and the application of data (machine learning) in solving problems read Peter Norvig's classic 'The Unreasonable Effectiveness of Data' or watch here

Seed Fund: Number of Portfolio Companies, Ownership and Fund Returns

Return On Invested Capital (ROIC) is a common metric for describing the performance of an early stage fund. Common ROIC targets (Rt) for seed funds are 3.0x – 5.0x (the higher the better). Over N portfolio companies at an initial check size i (assumed to be constant), the target dollar return is just Rt × N × i (for example, a $50m fund at 5.0x ROIC must return $250m = 5.0x × 50 × $1m [1]).

It is very difficult to attribute success outcomes to portfolio companies at the earliest investment stage (will this be a $3b exit or a $300m exit?), so we can look instead at the portfolio aggregate exit value, or the Implied Portfolio Market Cap on exit (IPMC). That way we don’t need to assign individual success outcomes to specific companies; we just look at the overall market value of our portfolio at exit. This can be represented as:

When pr is 0 (no reserves for pro rata) IPMC simplifies to:
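The worked example in the next paragraph (Rt = 5.0x, N = 30, Vpost = $10m, d = 0.5, IPMC = $3b) is consistent with the form below. This is a reconstruction from those numbers rather than the original formula, and it assumes d is the fraction of initial ownership still held at exit, so that ownership at exit is (i / Vpost) × d:

```latex
\mathrm{IPMC}
  = \frac{R_t \cdot N \cdot i}{\dfrac{i}{V_{post}} \cdot d}
  = \frac{R_t \cdot N \cdot V_{post}}{d}
```

The check size i cancels, which is why the marginal figure quoted further down does not depend on it.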

This states that adding more portfolio companies generates a higher required portfolio market cap on exit (all else being equal). As an example, say our return target is 5.0x, we invest in 30 companies (N = 30) at a first check of $10m post money (Vpost = $10m), and we do not execute pro rata (d = 0.5): our required combined exited market cap is $3b. So we would need at least $3b in exited value from our 30 companies to achieve 5.0x ROIC (note: if we had executed every pro rata opportunity, d = 1 and IPMC = $1.5b).

Indeed, we can calculate the marginal increase in IPMC for every new portfolio company added (just set d = 0.5, Rt = 5.0x, Vpost = $10m). In this generalized scenario, adding another portfolio company (independent of initial check size) adds another $100m to the required portfolio market cap on exit [2].
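A minimal sketch of the arithmetic above, assuming the no-reserves relationship IPMC = Rt × N × Vpost / d that the quoted figures imply. The function and parameter names are ours, introduced only for illustration:

```python
def ipmc(target_roic, n_companies, post_money, retention):
    """Required combined exit value (IPMC) when no reserves are held for pro rata.

    retention is d from the text: the share of initial ownership still held
    at exit (0.5 = diluted by half, 1.0 = ownership fully maintained).
    """
    return target_roic * n_companies * post_money / retention


def marginal_ipmc(target_roic, post_money, retention):
    """Extra required exit value per additional portfolio company."""
    return ipmc(target_roic, 1, post_money, retention)


# Figures from the text, in $m.
base = ipmc(5.0, 30, 10.0, 0.5)              # 3000.0, i.e. $3b
full_pro_rata = ipmc(5.0, 30, 10.0, 1.0)     # 1500.0, i.e. $1.5b
per_company = marginal_ipmc(5.0, 10.0, 0.5)  # 100.0, i.e. $100m
print(base, full_pro_rata, per_company)
```

Because the check size does not appear in either function, the $100m marginal figure holds for any initial check size under these assumptions.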

We can use IPMC as a way to 'sanity check' the construction of our portfolio. Will we see/have access to enough high quality companies to generate a portfolio with $3b in combined exited value?

It is also helpful in the initial portfolio construction process. The chart below shows the IPMC for different targeted ROIC scenarios at different levels of aggregate ownership at exit (based on a $50m fund). We can see that as ownership decreases, IPMC increases steeply, in inverse proportion to ownership (this is also bad given the IPMC is tied to underlying portfolio company performance, which is modeled as a power law).
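The grid behind such a chart can be approximated with a few lines, assuming the simple relationship IPMC = ROIC × fund size / aggregate ownership at exit. The ownership levels below are illustrative choices of ours, not values read from the original chart:

```python
FUND_SIZE_M = 50.0  # $m, the fund size used for the chart in the text


def required_ipmc(target_roic, ownership_at_exit, fund_size=FUND_SIZE_M):
    """Exit value the whole portfolio must reach so that
    ownership_at_exit * IPMC equals target_roic * fund_size."""
    return target_roic * fund_size / ownership_at_exit


# Ownership levels are illustrative, not taken from the original chart.
for roic in (3.0, 4.0, 5.0):
    cells = [f"{required_ipmc(roic, own):>8,.0f}" for own in (0.10, 0.05, 0.025)]
    print(f"{roic:.1f}x", *cells)
```

Under this relationship, each halving of ownership doubles the exit value the portfolio has to reach.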


[1] This is the simple case where there are no reserves for pro rata

[2] It should be noted that in the power law environment this marginal contribution model makes a little less sense, since the fund returns are not made up of a little return from many companies but in fact the opposite: a large return from a small number of companies. I still believe this framework is of strong value; even if returns come from just one company, this can still be represented in this model.

Optimizations and Likelihoods of Fund Returns

Returns from a VC fund are driven by the exit outcomes of the underlying portfolio companies and the fund’s ownership of these companies at exit. For seed stage, it is common to model the company level returns using a power law. The first sentence above can be represented as the following (leaving the representation of the power law aside for the moment):
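Going by the description in words (the exit value of company n multiplied by the ownership at exit of company n, summed over exited companies), one way to write this is shown below. O_n is a symbol introduced here for ownership at exit; E_n is the exit value used in the text:

```latex
R_{\text{fund}} = \sum_{n=1}^{N} E_n \cdot O_n
```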

Ownership at exit can also be represented as:
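A plausible form, assuming the same ingredients used in the IPMC section (first check i, post money valuation Vpost, and a retention factor d for the share of initial ownership left after dilution), is the following. The symbol O_n for ownership at exit and the per-company subscripts are ours, and the original expression may have differed:

```latex
O_n = \frac{i_n}{V_{post,n}} \cdot d_n
```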

As is clear, the fund returns are just the sum of contributions from exited portfolio companies: the exit value of company n multiplied by the ownership at exit of company n. The interesting thing to note here is that if exit ownership is reduced linearly (either through a smaller initial check size or greater dilution from skipping pro rata), then for returns to be the same, the size of the exit En (in $) must compensate and be higher. But this is far less likely, since we are in a power law environment where the probability of an exit falls off steeply with its size. A large En (the ‘unicorn’ scenario) is the scarce element in this power law environment and (generally) not the ownership at seed.
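To make the trade-off concrete, here is a small sketch that assumes exit values follow a Pareto tail, P(E > x) = (x_min / x)^alpha. The tail form, x_min, alpha and the $250m target are all illustrative assumptions of ours, not estimates from data:

```python
def required_exit(target_return, ownership_at_exit):
    """Exit value one company must reach to return target_return on its own."""
    return target_return / ownership_at_exit


def tail_probability(exit_value, x_min=10.0, alpha=1.5):
    """P(exit > exit_value) under an assumed Pareto tail.

    x_min (in $m) and alpha are illustrative placeholders, not estimates.
    """
    if exit_value <= x_min:
        return 1.0
    return (x_min / exit_value) ** alpha


target = 250.0  # $m to return, illustrative
for ownership in (0.10, 0.05, 0.025):
    exit_needed = required_exit(target, ownership)
    print(f"{ownership:.1%} ownership -> exit of ${exit_needed:,.0f}m, "
          f"tail probability {tail_probability(exit_needed):.5f}")
```

With these placeholder parameters, each halving of ownership doubles the required exit and multiplies its tail probability by 2^(-alpha), so the heavier the penalty on large exits, the more ownership matters.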

Structuring a Seed Fund

There are factors that structurally limit the size of a seed VC fund, such as deal flow, allocations, check sizes and pro rata participation. Thoughts on each of these are below, with the size of the fund at the end.

  • Step 1: Deal Flow and First Check Sizes
  • Step 2: Pro Rata
  • Step 3: Fund Size

Step 1: Deal Flow and Ownership

Deal Flow and Portfolio Companies

  • The number of companies you will have in your fund may be influenced by the following:
    • Where/how you get your deal flow
    • Whether you have a sector focus and how specific this is
    • GP threshold of deal quality
  • There are funds that will do 4 deals a month and others that will do 4 deals a year
    • The number of companies in your portfolio is the first limiting factor in fund construction

First Check Sizes

  • Together with the post money valuation (pre-money valuation + money raised in round), the fund’s first check into a company will determine the fund’s initial ownership
  • This is critical since it sets the upper limit on ownership in the company (assuming no super pro rata rights), and this ownership will likely be diluted (assuming the seed investors do not participate in later stage pro rata rounds, even if they have pro rata rights in the first place)
  • Lead investors in the seed round generally end up with 15% to 20% of the company
    • For example: raising $1.5m at an $8m post, with the lead investor deploying $1.2m (15%) and leaving $300k for angels
  • You should also determine whether the fund will lead seed investments (sometimes taking a board seat, guiding/advising the company on strategy, supporting hiring, guiding the company to Series A through networks)
  • This is a key differentiator for micro-VC seed stage funds, since generally you do want an active lead investor getting the company to Series A; if this is not your fund you will own significantly less than the lead, and this changes fund economics (deploy $300k into an $8.0m post and own 3.75%, versus 15%)
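The ownership figures in the bullets above follow from dividing the check by the post money valuation. A minimal sketch, with a helper name of our own choosing:

```python
def ownership(check, post_money):
    """Fraction of the company bought by a check at a given post-money valuation."""
    return check / post_money


post_money = 8.0  # $m, from the example in the text
print(f"lead:     {ownership(1.2, post_money):.2%}")  # $1.2m check
print(f"non-lead: {ownership(0.3, post_money):.2%}")  # $300k check
```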


Step 2: Pro Rata

  • But ownership in the company at exit is also heavily affected by whether the fund participates in pro rata rounds
  • Many times seed investors will not be given pro rata rights, and in some instances they will not be able to exercise these rights even if they were granted at the time of the seed (in some cases this can be positive if a very strong Series A investor is leading the A, therefore giving the company a higher potential exit opportunity)
    • An illustrative example (real raising amounts/valuations will change):
      • Let’s say the fund invests $1.0m as part of a $1.5m round at a $10m post valuation. Here the fund owns 10% of the company


  • The table on the bottom left shows the subsequent funding rounds, how much the company is raising, how much the fund’s pro rata allocation would be to maintain 10%, and the pre and post money valuations used to work out dilution
  • The bottom right table shows the effect of this dilution if the fund did not participate in any pro rata, if the fund did Series A pro rata only, and if the fund did Series A and Series B
  • Note: it is unlikely that the seed fund would be able to participate in Series C rounds, since here the fund allocation would be $10m; the Series A, B and C investors would also be aggressive in gaining as much ownership as they can
  • Dilution calculation: If the fund did not participate in the Series A:
  • Fund owns 10% of the pre-money $30m, $3m
  • But company raises $5m cash, so we now own $3m in a $35m company ($3m/$35m = 8.6%)
  • From the table, we see taking no pro rata we get diluted to 4.5% ownership, while participating through the B we get diluted to 5.9%
  • In a billion dollar exit this represents a $14.5m difference
  • We have effectively exchanged $2.5m to retain 1.4%
  • If we did not invest in the A or B, we would have $2.5m to deploy to other seed companies
  • At $1m per check this would be 2.5 more companies
  • To make up this $14.5m through 2.5 other seed checks (assuming 1 would become a winner), we would need that company to realize an exit of at least $330m
  • Math as follows: in that other seed company our 10% ($1m into a $10m post) would be diluted to roughly 4.5%, and this stake would have to cover $14.5m: 4.4% × $330m ≈ $14.5m (at exactly 4.5% the required exit would be about $322m)
  • So if we did not participate in the A or B pro rata rounds we would need 1 out of the next 2.5 companies to realize an exit of $330m+ in order for this scenario to be better than participating in the Series A, Series B pro rata rounds (assuming you knew at Series A if it is likely to be successful)
  • Having a $1b+ and $300m+ realized exit in 4 companies may be difficult given the power law in seed stage returns
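The dilution step and the break-even exit above can be sketched as follows. Only the Series A terms ($30m pre-money, $5m raised) are stated in the text, so the sketch stops there instead of guessing the later rounds from the tables; the function names are ours:

```python
def ownership_after_round(ownership, pre_money, raise_amount, pro_rata_check=0.0):
    """Ownership after a priced round.

    The existing stake is worth ownership * pre_money going into the round;
    any pro rata check buys new shares at the same price, and the company is
    worth pre_money + raise_amount afterwards.
    """
    post_money = pre_money + raise_amount
    return (ownership * pre_money + pro_rata_check) / post_money


def breakeven_exit(value_to_recover, ownership_at_exit):
    """Exit size at which a stake is worth value_to_recover."""
    return value_to_recover / ownership_at_exit


# Series A from the text: $30m pre-money, $5m raised, fund starts at 10%.
no_pro_rata = ownership_after_round(0.10, 30.0, 5.0)         # about 8.6%
full_pro_rata = ownership_after_round(0.10, 30.0, 5.0, 0.5)  # stays at 10%
print(f"{no_pro_rata:.1%} vs {full_pro_rata:.1%}")

# Exit another seed company would need in order to make up $14.5m.
print(f"${breakeven_exit(14.5, 0.044):.0f}m at 4.4%, "
      f"${breakeven_exit(14.5, 0.045):.0f}m at 4.5%")
```

Applying ownership_after_round once per round, with each round's own valuation and raise, gives the kind of dilution path the tables describe.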


Step 3: Fund Size

So now putting it all together we can generate the size of the fund that fits with our strategy

Number of companies:

  • Say we could identify 1 company per month that crosses the threshold of quality (sector, team, market etc.): that’s 12 companies per year, and with a 2 year investment period on a micro-VC fund that’s 24, say 25 companies
  • At $1m per check that’s $25m in primary investments
  • Assuming we have pro rata in all companies and 60% of companies survive to Series A, that’s 15 companies, of which the fund will choose to participate in 10 Series A rounds: that’s $5m (assuming we own 10% and they each raise $5m at the A)
  • We already know that in some instances we will not be able to participate for our full pro rata amount given deal dynamics, so this seems to be an upper bound
  • Say half of these 10 companies survive to Series B: that’s 5 companies, and the fund would participate in 3 of them, which is $6m (assuming we still own 10% and they each raise $20m)
  • So the total through 25 primary checks at $1m each over 2 years, Series A pro rata in 10 companies totalling $5m and Series B pro rata in 3 companies totalling $6m is $36m, giving a total fund size of ~$40m
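The build-up in the bullets above can be sketched as a small calculation. The function name and signature are ours; the inputs are the figures quoted in the text, in $m:

```python
def fund_capital(n_seed, seed_check, n_a, a_raise, n_b, b_raise, ownership=0.10):
    """Capital needed for seed checks plus pro rata in Series A and B ($m).

    Pro rata per round is ownership * round size, i.e. the check needed
    to hold ownership constant through that round.
    """
    primary = n_seed * seed_check
    series_a = n_a * ownership * a_raise
    series_b = n_b * ownership * b_raise
    return primary, series_a, series_b, primary + series_a + series_b


primary, series_a, series_b, total = fund_capital(
    n_seed=25, seed_check=1.0, n_a=10, a_raise=5.0, n_b=3, b_raise=20.0
)
print(f"{primary:.0f} + {series_a:.0f} + {series_b:.0f} = {total:.0f} ($m)")
```

The modeled deployments sum to $36m, which sits a little under the ~$40m fund size used in the rest of the post.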


  • 6.0x ROIC on a $40m fund is $240m
  • In 25 companies, say 5 are winners at blended 5.5% ownership: this implies each success case should exit at roughly $900m
  • This seems unreasonable
  • Instead, assume 1 company runs to $5b: at 5.5% ownership this would return $275m


Fund Construction Summary:

  • $40m fund, approximately 60% to primary seed checks, 40% reserved for pro rata
  • 25 companies at seed, participate in 10 Series A rounds and 3 Series B
  • Will generate 6.0x ROIC at 5.5% ownership (diluted after Series B) if 1 out of 25 companies exits at $5b (roughly 3.0x ROIC if 1 out of 25 companies exits at $2b)
  • Even if the fund cannot maintain pro rata (or does not get pro rata rights) and is diluted to 4.5%, a $5b exit is still ~6.0x ROIC
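As a rough check on the scenarios in this summary, the following sketch recomputes the multiples from the stated fund size, ownership levels and exit values. The helper names are ours, and fees and carry are ignored:

```python
FUND_SIZE_M = 40.0  # $m


def required_exit_per_winner(target_roic, winners, ownership, fund_size=FUND_SIZE_M):
    """Exit value each of `winners` equal-sized winners must reach."""
    return target_roic * fund_size / (winners * ownership)


def roic_from_single_exit(exit_value, ownership, fund_size=FUND_SIZE_M):
    """Fund multiple if one exit carries the whole fund."""
    return exit_value * ownership / fund_size


print(f"5 winners at 5.5%: ${required_exit_per_winner(6.0, 5, 0.055):,.0f}m each")
for exit_value, own in ((5000.0, 0.055), (2000.0, 0.055), (5000.0, 0.045)):
    print(f"${exit_value:,.0f}m exit at {own:.1%}: "
          f"{roic_from_single_exit(exit_value, own):.2f}x")
```

The unrounded multiples land close to, but not exactly on, the round numbers quoted in the summary, which are approximations.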

Will need to prove:

  • High quality deal flow
  • Ability to write $1m seed checks into $1.5m seed round
  • Ability to maintain pro rata through Series A

Human Learning vs. Machine Learning

Machines may undergo extremely accelerated learning cycles given the inherent interconnectedness of their network. By this I mean that when a human learns the optimal strategy for a particular event, it is a single point of learning affecting that one individual. When a machine learns the optimal strategy, it can be seamlessly distributed to every other machine in the network.

There are very clear, inherent limitations in human learning. You need to "teach" (verb) me your understanding. It takes time to get you from 0 to 1. Once one machine gets to 1, every machine has got to 1. 

Imagine every human benefiting from the collective intelligence of every other human on the planet every day for the entire history of humanity. No rote learning, repetition or time delay through new cognition. Instant understanding, unlimited progress. 

Decision Making in a Low Information Environment:

Applications in Angel and Seed Investing

By December of 2000 the NASDAQ had fallen 51%. In an effort to help understand why, researchers from the Berkeley International Computer Science Institute and Rutgers authored a paper examining the effects of learning with limited information, focusing on strategic interactions “on the Internet.” However, these academic models can also help in understanding the decisions made today by participants in another low information environment: angel and seed investing.

The paper examines the monopoly, synchronous and asynchronous Cournot and Stackelberg competition environments. We will consider all here (since it is difficult to label the ‘correct’ model, if there even is one). The model that is potentially most applicable to angel/seed investing is the asynchronous Stackelberg environment; indeed, the study shows that asynchronous environments tend towards Stackelberg competition. Here there is a leader (large VCs participating in Series A, B) and follower (angel/seed funds) structure where the leader knows ex ante that the follower observes the leader’s actions.

Lesson 1: Players are responsive and react quickly to changes in environment even if their current near-equilibrium payoffs are not affected

Since angel and seed investing is a low information environment, players may find it hard to attribute high payoff events to their corresponding investment actions. As a result, the actions that produced a high ‘equilibrium’ payoff state are potentially unknown, so when players observe experimentation by others they may perceive the other player to have more direct (higher payoff) information. The importance of building a strong negative feedback loop is examined later.

Lesson 2: Instead of independent probabilities of experimentation subjects appear to enter autocorrelated ‘experimentation phases’

This is a very surprising result and may provide some insight into how tech bubbles are formed around these ‘autocorrelated experimentation phases’. Autocorrelated because they follow a repeatable pattern in each returning occurrence of the experimentation phase.

Lesson 3: The greater variation a player can have in the payoffs of opponents the greater the inherent instability of the system where the opponent is more likely to experiment

It could be argued that venture capital follows a Stackelberg competitive environment; there is a leader and a follower and the leader knows ex ante the follower will respond to his or her actions. Here the leader is the institutional VC and the follower is the angel. The angel is incentivized to align with actions of the institutional VC since the angel’s investment is potentially dependent on future funding from a VC. In this system the variation is extreme from bankruptcy to IPO leading to, as the paper finds, greater instability and higher rates of experimentation.

Lesson 4: Players can be led to confuse experimentation by opponents with changes in underlying payoffs leading to a cascading effect of experimentation

Cascades of experimentation in the startup investing environment don’t sound good. But they do give us greater insight into the competitive pressures leading to tech bubbles. One VC ‘experimenting’ in a new internet startup is observed by all others as a change in the underlying payoffs each could expect to gain, leading to cascading experimentation from all.

Understanding the competition in a low information environment could provide valuable insights into the creation and sustaining nature of tech bubbles. Since it is very difficult to connect high payoff events (10x investment) with the direct causes of this success (team, market, execution, tech?) players are incentivized to experiment (all the way to NASDAQ -75%.)

Structure for reviewing potential investment: Input is information on new investment (the pitch), B outcome of previous investment decisions, A is structure for filtering investments based on inputs (diligence etc.), output is investment decision

So what can we learn from this? In my view it highlights the value in building a robust negative feedback loop for investment decisions. Here we take as ‘Input’ the information on the current potential investment as well as the information on competitive experimentation (from other angels/VCs). We also take ‘B’, the result of our previous investment decisions, and feed both through ‘A’, our decision model for the current investment, to arrive at ‘Output’, our investment decision (the result of which is then fed back to inform the next potential investment).

Without doubt this system is employed by a great number of successful angels and VCs. Formalizing its structure can be beneficial so as to balance ‘new’ competitive experimentation information with one’s own investment theses and diligence structures (to avoid cascading experimentation).

Most importantly this formalized negative feedback loop allows the angel or VC to learn what decisions/theses/ideas/diligence allowed them to arrive at a successful payoff (and which didn’t). The issue here is the latency between input decision and output (payoff result) which should be the subject of another post since this one is almost TL;DR. Takeaway: Build formalized negative feedback loops for your investment decisions.

*There may be situations in the angel/seed application here that do not correlate exactly to theory (Cournot and Stackelberg competition) however this application is designed to illustrate concepts for discussion rather than add to the academic literature

Investment Opportunities in Artificial Intelligence

Disclaimer: Much of the following post will be proven wrong by the relentless judgement of time. Indeed, there may be something to learn from where we were wrong (in January of 2017). This post is designed as an exploration tool rather than a descriptor of fact.

There are numerous investment opportunities provided by the awakening of AI, both in seed stage and in 30-year-plus public companies. This post is an attempt at structuring our current explorations and understandings of the investment opportunities afforded by the development of (current, narrow) AI. This forms the basis of our A3 Thesis.

The Three Pillars of AI

It looks like the earliest documented mention of the three pillars of AI was in Wired magazine in October 2014. These are:

  • Appropriate Algorithms
  • Data Availability
  • Computational Power

1. Algorithms

Per Wired (and numerous academic journal articles) the algorithms that power much of the hyper AI tech of 2017 originated much earlier (the 50’s, the 80’s etc.) and have just been waiting. Geoff Hinton famously sparked a resurgence in neural nets in the 2012 ImageNet competition with the now ubiquitous ‘deep learning’ approach. It seems to us that the commercial value in algorithm development may be in specialized use cases (outside traditional image recognition etc.), but it is hard to imagine such opportunities today. Also, much of this value lies in the prodigious minds of the human capital of the company or university, which is hard to monetize. It appears much of the value in AI algorithm development has been commoditized.

Opportunity: Large public companies (FANGs); recruitment and retention of key human capital (large public tech may have the means necessary to provide a utopian research environment for academic talent).

2. Data

There are incredible opportunities for early stage technology companies in data, specifically ‘data generative assets’: any company that creates differentiated, deep, clean, useful data through sensors, incentives or any other means possible. Ginni Rometty has called data the new natural resource of our time. In fact, it is even better; companies can create their own data (effectively creating their own oil).

Opportunity: Startups: industrial sensors, new business models that incentivize people to track data, satellites for more pervasive, cleaner data, collaborative business models etc. are all attractive.  

3. Computation Power

Given the capital intensity of developing deep learning-specialized GPUs we had originally thought the best investment opportunity here was with NVIDIA. Indeed, if you had invested at the beginning of 2016 it would have been a remarkable investment (+260% and the best performer in the S&P 500). However, recently we came across Cerebras Systems, a startup run by the team that created SeaMicro and sold it to AMD. Cerebras is funded by Benchmark, suggesting that perhaps there is room for a proven team with a specialized focus and deep-pocketed, long-term, committed investors.

Opportunity: Market neutral NVDA/INTC; NVIDIA has been aggressive in its adoption of deep learning based AI even as early as 2014. Intel has missed the mobile revolution and risks being left behind in deep learning. Very hard to bet against Intel. More to come.