Modernists and Mavericks: New York Tech

“Gayford is in effect recounting the fall of Paris as the adjudicatory centre, the supreme court, of modern art. From the Impressionists, to Cézanne, to Matisse and Picasso, Paris ruled.”

The FT on Martin Gayford’s ‘Modernists and Mavericks’

In 1969 Francis Bacon, while in London, painted “Three Studies of Lucian Freud”. In 2013 Christie’s sold the work for $142m. The sale was a commercial representation of the social and economic network that had been built in London’s modern art world since 1940. At that time, however, Paris, not London, was considered the “supreme court” of modern art. Read from the perspective of a participant in the developing New York tech ecosystem, “Modernists and Mavericks”, which tells the story of how London created the social and economic network required to pull the center of gravity away from Paris, provides an interesting parallel to what’s happening today between Silicon Valley and New York.

“Three Studies of Lucian Freud” Francis Bacon, 1969


Shifting the center of gravity of something so deeply geographically centered doesn’t happen often, because the geographic center is a physical instantiation of the underlying social and economic networks present within that geography. To examine this, we need the tools of modern network theory. For example, the diagram below represents a selection of the most influential technology companies and associated investors of the last six decades. A network representation is essential to understanding how this ecosystem (and ecosystems like it) develops.


Visually, the overwhelming degree centrality (the number of links incident on a node) of both Fairchild Semiconductor and KPCB is clearly evident (indeed, the K in KPCB, Eugene Kleiner, was an employee of Fairchild). It is also clear how such a node can influence the ecosystem for decades: Fairchild is first-degree connected to Apple, KPCB, Intel, and Sequoia, and second-degree connected to Google, Amazon, and Netscape.

The influence of the network is particularly exacerbated in the venture industry given the presence of a positive feedback loop: Sequoia invests in a company; better talent is attracted; better guidance is provided; there is less competition; the investment outperforms; other founders want to be associated with Sequoia; they pitch Sequoia; Sequoia sees better deals. In network theory, this resembles the idea of positive assortativity: relatively high-degree nodes have a higher tendency to be connected to other high-degree nodes. This is exactly why the Power Law is the topic of choice for VC dinner conversations.
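The degree centrality mentioned above is easy to compute by hand. Here is a minimal sketch on a toy, hand-built edge list loosely based on the companies named in the text (the edges, including the Fairchild-AMD "Fairchildren" link, are illustrative and not the actual diagram):

```python
# Toy sketch (not the actual diagram): a hand-built adjacency list for a few
# of the nodes named in the text, used to illustrate degree centrality.
edges = [
    ("Fairchild", "Apple"), ("Fairchild", "KPCB"),
    ("Fairchild", "Intel"), ("Fairchild", "Sequoia"),
    ("Fairchild", "AMD"),
    ("KPCB", "Google"), ("KPCB", "Amazon"), ("KPCB", "Netscape"),
    ("Sequoia", "Apple"), ("Sequoia", "Google"),
]

# Degree centrality: a node's degree divided by (n - 1), the maximum
# possible number of neighbours.
nodes = {v for e in edges for v in e}
degree = {v: 0 for v in nodes}
for a, b in edges:
    degree[a] += 1
    degree[b] += 1
n = len(nodes)
centrality = {v: d / (n - 1) for v, d in degree.items()}

top = max(centrality, key=centrality.get)
print(top, round(centrality[top], 3))
```

Even on this toy graph, Fairchild dominates the centrality ranking, which is the visual point the diagram makes.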

These networks are ‘easy’ to generate in a historical, static capacity. What’s more interesting (and difficult) is to generate them dynamically, in a contemporary way that may enable us both to forecast the development of an ecosystem in real time and to understand the requirements for a new ecosystem to flourish. For SV, once Fairchild gained its incredible concentration of talent, and that talent then dissipated, the ecosystem exploded. Such is the present hope for New York tech.

“People feel that it is very important for artists to have an aim. Actually, what’s vital is to have a beginning. You find your aim in the process of working. You discover it.” - Bridget Riley, “Modernists and Mavericks”

Can You Build a Quant VC Fund?

Many people have pointed out the irony that venture funds invest in the most cutting-edge technology yet still operate with Excel, QuickBooks, and "gut feel". Given the current awareness of the value of data and the rise of machine learning, is there an opportunity for technology to radically alter the venture landscape like it did for the public markets in the '80s?

At my previous fund, Hone Capital based in Palo Alto, we thought so. We built machine learning models to enhance GP decision-making across investments in over 350 early stage technology companies. You can read more about our approach in McKinsey Quarterly.

In this post I want to first look at quantitative approaches in the public markets: how the market structure influenced these strategies, and why they don't directly translate to the private markets. Then I'll look at how the private market architecture is changing and how that might present new opportunities for quant strategies in VC.

The Different Goals of Traditional vs. Quant

The goal of traditional investing, put simply, is to find undervalued companies. Traditional investors have always used data (well, maybe I shouldn't say always: Benjamin Graham introduced the idea of an "investment policy," as opposed to speculation, and hoped to implant in the reader of his canonical "The Intelligent Investor" a "tendency to measure or quantify"). The traditional investor's data consists of revenue, margins, growth rates, etc.: metrics I call 'primary' to the company.

Perhaps inevitably, the data used by traditional investors is growing. Now the term 'quantamental' is used for those using "alternative data" like satellite images, app download numbers, etc. This is still, however, using data (albeit new forms of data) to achieve the same goal: identify undervalued businesses.

It's important to note, when translating public quant strategies to VC, that quant hedge funds don't use data to enhance the traditional goal. Rather, they created an entirely new goal (and have profited handsomely, to say the least): find repeatable, short-term statistical patterns. In effect, the quant strategies grew out of the "data exhaust" of traditional investing.

Comparing Public and Private Market Structures

Longer term trading makes algorithms less useful
— Jim Simons

The architecture of public and private markets is very different. Below is an examination of the elements of the public markets that led to the development of the quant strategies that have been so successful. To make the comparison with the private markets clear (since our goal here is to explore opportunities for quant VC), I also list the challenges of translating the public market strategies to the private market.

1. Investment Time Horizon

Public market: Short. The ability to trade within minutes (seconds, microseconds, etc.) allows the quant hedge funds to isolate a statistical driver of profit. This is originally how Renaissance Technologies started: they "predicted" the very short-term response to a macroeconomic announcement (non-farm payrolls, consumer price index, GDP growth, etc.) by analyzing a huge (analog) database of how securities had responded to those announcements in the past. By trading in and out within minutes, no other exogenous factors influence the security's response in that window (theoretically).
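The event-study idea described above can be sketched in a few lines. The numbers here are made up for illustration; a real system would use a large historical database of announcement responses:

```python
# Illustrative sketch of the event-study idea: average a security's
# historical short-term returns following a given announcement, then
# trade in the direction of the average response. Numbers are made up.

# Hypothetical 30-minute returns of one security after past
# non-farm-payroll surprises of the same sign.
historical_responses = [0.004, 0.002, 0.005, -0.001, 0.003]

avg_response = sum(historical_responses) / len(historical_responses)

# Go long if the average historical response is positive, short if negative.
position = "long" if avg_response > 0 else "short"
print(position, round(avg_response, 4))
```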

Private market: Very Long. Venture investments have a long time horizon and the investment is extremely illiquid (this is changing now to some extent with companies like EquityZen, SharesPost, and Equidate, but even these are mostly for later-stage secondaries and still only offer "trade" windows of months at best). The long time horizon between opportunities to exit an investment means that many exogenous, unplanned, and unpredictable factors undermine the potential for statistical patterns to provide any alpha. These exogenous factors lead to an exponential decay in the accuracy of any forecast over time.
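The exponential-decay claim above can be made concrete with a toy model. The functional form and the decay rate here are illustrative assumptions, not estimates:

```python
import math

# Toy model of the claim above: if exogenous shocks arrive over the holding
# period, the accuracy of a point forecast decays roughly exponentially with
# time. The baseline and decay rate are arbitrary illustrative assumptions.
def forecast_accuracy(t_years, baseline=0.9, decay_rate=1.5):
    """Accuracy above chance (0.5) decays as exp(-decay_rate * t)."""
    return 0.5 + (baseline - 0.5) * math.exp(-decay_rate * t_years)

print(round(forecast_accuracy(0.1), 3))   # minutes-to-weeks horizon
print(round(forecast_accuracy(7.0), 3))   # typical venture holding period
```

At a public-market horizon the signal is nearly intact; at a venture holding period it has decayed to roughly a coin flip.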

2. Feature Extraction

Public market: Difficult. Quant hedge funds compete on identifying a "signal" with which to trade. RenTech famously released one signal they had found: fine weather in the morning correlated with an upward trend in the stock market in that city. However, the trend wasn't big enough to overcome transaction costs, so they released it publicly. The point here is that the quant hedge funds have an almost unlimited amount of data to mine: intra-second, machine-readable, tick-by-tick price movements on thousands of securities and derivatives all over the world.

Private market: Difficult. By definition, private companies keep information private. There are no tick-by-tick data libraries of valuation movements (indeed, valuations only move in discrete steps). There is also a limited historical set of information on startups (changing rapidly thanks to PitchBook, Crunchbase, and CBInsights). This means that if you want to build quant models for the private market you need to get creative (beyond PitchBook, Crunchbase, etc.). For example, although it was still for their public quant hedge fund, Winton released a blog post systematically examining a company's proclivity to register its domain name in different countries and whether that could be a signal for the competence of the company's technical leadership. Systematic feature extraction seems to have been the original direction of Google Ventures when they discussed strategy more publicly back in 2013:

There are a number of signals you can mine and draw patterns from, including where successful entrepreneurs went to school, the companies they worked for and more
— Graham Spencer, Founding Partner - GV
Graham Spencer and Bill Maris, New York Times, Google Ventures Stresses Science of Deal Not Art of Deal


3. Availability of Data

Public market: Rich. Long historical record of continuous, machine readable and easily accessible data.

Private market: Sparse. Various sources offer incomplete data (missing data, missing funding rounds, conflicting reports, etc.). The data is often not easily machine readable: the name of a single fund could appear as First Round, First Round Capital, FRC, Josh Kopelman, or FirstRound. Extensive data cleaning is required (not to say this doesn't happen at the quant hedge funds, but less is required there given the maturity of the data market for quant funds).
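The entity-resolution problem above can be sketched with a simple alias table. The table itself is a hypothetical hand-built example; a real pipeline would add fuzzy matching and human review:

```python
# Minimal sketch of normalizing the many spellings of one fund to a
# canonical name. The alias table is a hypothetical hand-built example.
ALIASES = {
    "first round": "First Round Capital",
    "first round capital": "First Round Capital",
    "frc": "First Round Capital",
    "firstround": "First Round Capital",
    "josh kopelman": "First Round Capital",
}

def normalize_fund(raw_name):
    key = raw_name.strip().lower()
    return ALIASES.get(key, raw_name.strip())

rounds = ["FRC", "First Round", "FirstRound", "Sequoia Capital"]
print([normalize_fund(r) for r in rounds])
```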

4. Ease of Signal Expression

Public market: Easy. The public markets are continuous and liquid. One only needs to identify a signal; expression of that signal is trivial (assuming liquidity is not an issue, which it most certainly is in some high-frequency trading scenarios).

Private market: Hard. Even after the data is acquired, cleaned, and made machine readable, and a signal is found, the venture investor has to identify a new opportunity that matches that signal and also win access to that deal. Whereas the public markets are liquid and freely accessible, the private markets, again almost by definition, require 'permission-based access.'

5. Required Accuracy of Signal

Public market: Low. The CIO of a $30b+ quant hedge fund once told me that if a signal is >50.1% accurate it is in play. The only way this works is if there are thousands of possible trades using this signal: invoking the Law of Large Numbers, a 50.1% signal becomes profitable. So the extremely high number of possible trades (given the highly liquid, global, permissionless public market architecture) makes it a lot easier to identify a signal that can be used.
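The Law of Large Numbers argument can be made explicit. On a symmetric $1 win/lose bet, a 50.1% edge earns only about $0.002 per trade in expectation, so profitability depends almost entirely on trade count (the bet structure is an illustrative assumption):

```python
import random

# A 50.1% edge on a symmetric $1 win/lose bet: expected profit per trade is
# 2p - 1, i.e. about $0.002. At public-market trade counts this compounds
# into reliable profit; at venture "trade" counts it is pure noise.
p = 0.501
expected_profit_per_trade = p * 1 + (1 - p) * (-1)   # = 2p - 1

def simulate(n_trades, seed=0):
    """Seeded simulation of total P&L over n_trades bets."""
    rng = random.Random(seed)
    return sum(1 if rng.random() < p else -1 for _ in range(n_trades))

print(round(expected_profit_per_trade, 3))   # per-trade edge
print(simulate(1_000_000))                   # noisy, but positive in expectation
```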

Private market: High. In contrast, a venture fund has to have some concentration in portfolio companies to achieve the traditional 3.0x+ ROIC that LPs expect. This translates to a low number of portfolio companies (to build and maintain high ownership). A low number of "trades" therefore requires a highly accurate signal, much greater than 50% (a false negative in venture is very bad). One could push back on this, however: in trying to develop a benchmark for our ML models, I once asked a Partner at Sequoia what he considers a "high" seed-to-Series-A conversion rate. His answer: 50%. I've explored the mathematical dynamics and efficacy of a larger, index-style VC fund here.

So how do we address these limitations of the private markets? Well the early stage private market architecture itself has been changing.

Changing Private Market Architecture

In an almost Soros-like reflexive loop, the number of deals done and the amount of data on these deals has dramatically increased over time. 


The graph below (from PwC Moneytree) shows the number of deals done (line) and amount of capital deployed to 'seed stage' deals from 1Q02 to 1Q18. It shows a 30x increase in the number of seed deals and a 40x increase in capital deployed at seed. 

Over that time, companies were created to operationalize the data generated from these deals: PitchBook (2007), Crunchbase (2007), CBInsights (2008), AngelList (2010). Over the same timeframe, LinkedIn grew from 10 profiles at launch in 2003 to over 500m today (the same trend can probably be seen in AngelList, Twitter, and ProductHunt profiles). This increase in the number of deals and the data available (on those deals, founders, and other features) means more training data for machine learning models. The amount and quality of data will only increase.



Liquidity has substantially increased in the early-stage private market for three reasons: the incredible proliferation of 'micro-VCs', mega-funds (SoftBank et al.), and new secondary market options. Samir Kaji, Senior Managing Director at First Republic (and a great VC blogger), recently mentioned in a post that they are tracking close to 650 micro-VCs, and he characteristically offers some insight into how this growth might influence the market architecture going forward.


Given this dispersion of ownership interests among many distinct funds, like Samir I find it very likely that consolidation will occur - and potentially the development of a new market - something I'm calling Synthetic Liquidity. This would be when a fund sells its ownership position prematurely (at, say, Series A or B) and the buying fund pays the selling fund carry. Obviously the selling fund is forgoing potentially lucrative upside, but it is buying quick liquidity. I see AngelList as very well positioned to be the intermediary here.
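A hypothetical worked example of this Synthetic Liquidity mechanism might look as follows. The mechanism itself is speculative and every number here is made up; one plausible reading of "the buying fund pays the selling fund carry" is carry on the unrealized gain at the time of the secondary sale:

```python
# Hypothetical Synthetic Liquidity example (all numbers made up): a seed
# fund sells its stake at Series B, and the buyer pays the seller carry on
# the unrealized gain in lieu of far-off exit proceeds.
cost_basis = 1_000_000          # seed fund's original investment
mark_at_series_b = 5_000_000    # value of the stake at the secondary sale
carry_rate = 0.20

unrealized_gain = mark_at_series_b - cost_basis
carry_payment = carry_rate * unrealized_gain

# Seller receives the mark plus carry; buyer pays a premium for the position.
seller_proceeds = mark_at_series_b + carry_payment
print(carry_payment, seller_proceeds)
```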

Shortening the time to realization may make some quant strategies viable (like using ML to forecast 1 - 2 years out rather than 10.) The idea here is that the early stage venture market is in a period of realignment which may introduce opportunity for new quant strategies.

Not much needs to be said here about the other secondary options, except that they have contributed to the changing landscape of liquidity in the private market - SharesPost (2009), EquityZen (2013), and Equidate (2014) - and that the innovation and growth here unfortunately seem to have plateaued.

Types of Data

Public quant funds used the data exhaust of the traditional investors as their fuel to build incredibly successful funds. The data exhaust here is the time-series price fluctuation in security markets. Quant funds ran statistical models on these time-series data sets and identified repeatable statistical patterns that became the foundation of their fund.

So what is the data exhaust in venture? "Secondary" information about funding rounds: who the investors in the round were, who the lead investor was, whether they were new or follow-on investors, what industry the company is in, how much VC funding has gone into that industry this year, the growth of VC funding in the industry, the number of VC deals in the industry, the location of the startup, the schools of the founders, whether they are first-time founders, and so on.


The above data can be ripped from Crunchbase, AngelList, PitchBook, CBInsights, LinkedIn, SEC EDGAR, PwC MoneyTree and many other creative sources. 
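A minimal sketch of turning that "secondary" round data into a feature vector for an ML model might look as follows. The field names and values are hypothetical stand-ins; a real pipeline would pull and clean them from the sources listed above:

```python
# Sketch of encoding "secondary" round data as model features. All field
# names and values are hypothetical.
raw_round = {
    "lead_investor_is_top_tier": True,
    "num_investors": 4,
    "num_follow_on_investors": 1,
    "industry_deal_count_ytd": 230,
    "founder_is_repeat": False,
    "hq_city": "New York",
}

def featurize(r):
    """Encode booleans as 0/1, one ratio, and one simple location indicator."""
    return [
        int(r["lead_investor_is_top_tier"]),
        r["num_investors"],
        r["num_follow_on_investors"] / max(r["num_investors"], 1),
        r["industry_deal_count_ytd"],
        int(r["founder_is_repeat"]),
        int(r["hq_city"] == "New York"),
    ]

print(featurize(raw_round))
```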

    Future Directions - Quantamental VC

    I believe the attempt to make a thinking machine will help us greatly in finding out how we think ourselves.
    — Alan Turing

    Part of the benefit of examining the efficacy of quant approaches to VC (and indeed of building ML models to support VC investment decisions) is that it forces an examination of the way we currently understand the venture business. Here we've systematically analyzed the architecture of the public and private markets, which I think greatly helps us understand, in Turing's words, "how we think ourselves."

    Today, there is a question of how much data could or should be used in venture. In High Output Management, Andy Grove said, "Anything that can be done, will be done, if not by you then by someone else." I believe leveraging data in the venture investment process can be done. Building a fully standalone quant fund, however, may still be some years off. The reason, I believe, lies in the developing architecture of the private market: the 'electronification' of the public markets in the '80s greatly enhanced the ease with which quant strategies could be built and deployed, and we are seeing an equivalent "datafication" of the venture business today.

    I believe the near future is quantamental VC funds. This is already starting to be realized: Sequoia has a data science group, Social Capital had a Head of Data Science, and many other funds are not public about their data efforts in the hope of maintaining a competitive advantage. The combination of the relentless growth of available data, the changing architecture of the early-stage market, and the extreme need for differentiation given the explosion of funds will, I believe, lead to inevitable innovation in the venture industry in the coming years.

    Industry Deep Dive: Open Source Databases

    This industry is incredibly interesting, not only because databases are the core of every data strategy (and data is becoming the core of every strategy) but also because the business model is itself still in question. With Benchmark's recent Series A lead in Timescale, a time-series database packaged on PostgreSQL, and MongoDB's 2017 IPO, I thought it might be a good time to dig into the industry, companies, business models, and prospects for startups.

    Primer: Ajay Kulkarni, Co-Founder and CEO of Timescale, has a great post on the history of SQL versus NoSQL and where the industry is heading today.

    In looking to get context on growth rates, margins, and revenue targets for new open source database startups, I looked at the IPO filings of four recent open source enterprise companies (Hortonworks, MongoDB, Pivotal, and Cloudera) to understand what "success" looks like - at least what Wall St. expects should a company make it to IPO. Importantly, this influences the internal growth targets and financial forecasts of new startups.

    Business Model and Growth Targets

    Open source database companies typically generate revenue through subscription agreements (for enterprises to use their software for commercial purposes) and through professional services. Red Hat was able to build a multi-billion-dollar business through services revenue alone (but this is most likely because they rode the generational shift from UNIX to Linux). Although the code is open source, companies implement an "open-core" model: the core of the code is open source, but enterprise-grade features require a license.

    The tables below show subscription revenue and gross margin at IPO (Yr 0), and growth rates and margins for the two years preceding the IPO. We can use them to generate a target for our fictional new open source database company at IPO: $150m of subscription revenue at an 85% gross margin (and $30m of services revenue at 15%).
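The blended picture implied by those targets is worth spelling out ($150m subscription at 85%, $30m services at 15%):

```python
# Blended revenue and gross margin implied by the IPO targets above.
subscription_rev, subscription_margin = 150e6, 0.85
services_rev, services_margin = 30e6, 0.15

total_rev = subscription_rev + services_rev
gross_profit = subscription_rev * subscription_margin + services_rev * services_margin
blended_margin = gross_profit / total_rev

print(round(total_rev / 1e6), round(blended_margin, 3))   # → 180 0.733
```

The low-margin services line drags the blended gross margin from 85% down to roughly 73%, which is part of why these companies de-emphasize services.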


    The graphs above show why open source database companies are moving away from relying on services revenue as a key driver: both the growth rates and the margins are unreliable. The average services revenue growth rate at IPO was 24%, at a gross margin of only 13%.

    Funding Required to Achieve Growth Targets and IPO

    The amount of capital required to go public (or just to build something of value) depends heavily on the industry of the company. This is why software businesses have enabled another golden age of VC (there are obvious exceptions, however: according to Crunchbase, Snap raised $4.6b in VC funding).

    Below are graphs showing the amount of capital raised by each company before and after monetization. This distinction is important. The burn for engineers to build the database is small and as such the amount raised (and therefore founder dilution) is low. However, during the monetization phase significant amounts of capital are required. This money goes into the sales and marketing process for the new database to compete against Oracle, Amazon and the other open source solutions available.  


    The practical implication of raising this much (MongoDB's Series F was $150m, Cloudera's Series F was $740m, per Crunchbase) is that for this amount of capital to be invested, the valuation must reach commensurately high levels. For example, if in MongoDB's $150m round the founders were willing to give away 20% of the company, the pre-money valuation would have to have been $600m. It would be interesting to see how this was justified, especially given monetization had only just begun.
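The dilution arithmetic behind that example is simple: if a round of a given size buys the investors a given fraction of the company, the post-money valuation is the round size divided by that fraction, and the pre-money is the post-money minus the round:

```python
# Dilution arithmetic for the example above: post-money = round / dilution,
# pre-money = post-money - round.
def pre_money(round_size, dilution):
    post = round_size / dilution
    return post - round_size

print(round(pre_money(150e6, 0.20) / 1e6))   # $150m round at 20% → 600 ($m)
```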

    Industry Development and Future Prospects 

    The shift back to SQL from the NoSQL sidetrack is going to offer massive potential for value generation. The question is: who will benefit? AWS CEO Andy Jassy recently said on CNBC that Amazon Aurora (their scalable SQL database) is Amazon's fastest-growing product ever. And Google's leadership in the database community basically ushered in the move back to SQL (after they were the ones to usher in the first move from SQL to NoSQL). Their Spanner product, given that reputation, is dominant.

    But the startups are taking advantage of companies' growing wariness of perceived lock-in with larger vendors. Oracle has been the most damaging in this regard, and that wariness translates directly to Amazon and Google. Companies want strategic flexibility, especially given the mission-critical nature of the database, and they can only really get it by deploying a truly independent offering like those from Cockroach Labs, Timescale, or the many other independent solutions. Still, it is hard to bet against Amazon and Google. It will be very interesting to see how the future unfolds.

    Thoughts on Fund Sizing

    This question seems to be on the minds of many investors today. From SoftBank's giant $100b fund to the proliferation of hundreds of "micro-VCs", understanding the importance of fund size has become critical.

    In this post I want to provide some ways to formalize the fund sizing question for the emerging manager. Simply put, the returns of a fund are the sum of exited portfolio company valuations multiplied by the ownership of those companies at exit. Although the mathematical formulation is trivial (see below), I believe it helps in understanding the dynamics of fund size and ownership:
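The original equation image is missing here; a reconstruction consistent with the sentence above (symbols mine) is:

```latex
\text{Fund Multiple} \;=\; \frac{\sum_{i=1}^{N} V_i \, O_i}{\text{Fund Size}}
```

where \(V_i\) is the exit valuation of portfolio company \(i\) and \(O_i\) is the fund's ownership of that company at exit. As a sanity check against the table example later in the post: \(0.03 \times \$15\text{b} / \$150\text{m} = 3.0\text{x}\).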


    This equation shows why ownership is so important for VC funds: if ownership decreases, then to maintain the same returns the exit value of the companies must increase. However, given the power law of VC outcomes, this is exponentially more difficult.

    This is shown below for three hypothetical funds: $100m, $150m, and $200m. Assuming an average exit ownership of 3.0%, the graph shows the required exited market cap to generate various fund multiples. Again, moving up the y-axis is exponentially more difficult given the power law of VC.


    In a recent post, Founder Collective Partner Micah Rosenbloom stated, "it's easier to make money on carry if you make money on fees." To drive this home for the emerging manager, we next look at how carry dollars are influenced by fund size. Table 2.0 presents the graph above in table form (for example: a $150m fund with 3.0% exit ownership returning 3.0x requires a combined market cap on exit of $15b). Table 1.0 links to Table 2.0 and shows the carry dollars to the Partner group in each fund size and outcome scenario.


    Here again, moving along the x-axis (higher exits) is exponentially more difficult. But what is most interesting in the tables above is comparing the likelihood of scenarios. What is more likely: generating $4.2b of exits on a $50m fund, or $7.5b on a $150m fund? Both generate $15m in carry for the Partner group. At the risk of sounding like a broken record (the power law of VC again), I would say the smaller required exit value is better (though, as is the game in VC, both scenarios are very unlikely).
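The $15m carry comparison above can be checked directly. This assumes the post's 3.0% average exit ownership and a standard 20% carry on profits above fund size:

```python
# Checking the carry comparison above. Assumes 3.0% average exit ownership
# and a standard 20% carry on profits above fund size.
def carry_dollars(fund_size, total_exit_value, ownership=0.03, carry=0.20):
    proceeds = total_exit_value * ownership
    profit = max(proceeds - fund_size, 0)
    return carry * profit

small = carry_dollars(50e6, 4.2e9)    # $50m fund, $4.2b of exits
large = carry_dollars(150e6, 7.5e9)   # $150m fund, $7.5b of exits
print(round(small / 1e6, 1), round(large / 1e6, 1))   # → 15.2 15.0
```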

    There are many factors that can alter the above calculus (initial ownership, ability to execute pro-rata, exposure to quality companies etc.) but hopefully this formalization can contribute to building the most optimal fund given your specific circumstances and goals.

    Note: Most material in the above post is taken from my introductory Fund Sizing Deck I use when consulting with emerging managers. For a copy of the full deck and/or the excel model please email

    The Power of Networks in the Golden Age of Silicon Valley

    A wise man once told me that a business is just a group of people. We have a tendency to think of some very successful companies as larger than life, for many reasons (like the "taste the feeling" ephemeral joy of Coca-Cola or the minimalist sophistication of Apple). But history is littered with once-vaunted companies that suffered leadership complacency or willful blindness in the face of change.

    What I want to explore here, however, is the network element of the "group of people" description of a business. We will see how valuable the network was in filling leadership roles at new startups and sourcing new investment opportunities for early VC funds. The 1950 - 1985 period of U.S. and Silicon Valley history is immensely rich, with network effects permeating everything from the birth of the "minicomputer" to the development of synthetic insulin.

    There are incredible resources that explore the narratives behind these networks in more detail (see notes at the end). Below is a concise filtering of the key network elements, in rough chronological order, with a diagram for further effect.


    • HBS Professor Georges Doriot ("the General") created the first modern venture fund in the U.S., American Research and Development (ARD), on June 6, 1946 [1].
    • In 1956, Nobel Laureate and co-inventor of the transistor William Shockley created Shockley Semiconductor. In 1957 the "traitorous eight" left Shockley to create their own firm [2].
    • Arthur Rock, who had been a student of Georges Doriot at HBS, was introduced to these eight engineers by Eugene Kleiner (yes, THAT Kleiner), whose father had a brokerage account with Rock's firm. Rock convinced Sherman Fairchild to provide the capital to create Fairchild Semiconductor [1].
    • So many new companies, like AMD, were created by Fairchild alumni that they were called the "Fairchildren" [2] - analogous to the Tiger Cubs in the hedge fund industry.
    • In 1957 came Doriot and ARD's big win: an investment in Digital Equipment Corporation (DEC), an early computer and software developer.
    • During succession discussions, Doriot reached out to Thomas Perkins at HP - Perkins would later create KPCB with Eugene Kleiner [1].
    • In 1965 William Elfers, who had worked as Doriot's "right hand man," left ARD to create his own venture fund, Greylock Partners (based on the east coast, they only opened their first Silicon Valley office in 1999) [1].
    • In 1968, Gordon Moore and Robert Noyce (part of the Shockley "traitorous eight") left Fairchild to create Intel - funded, again, by Arthur Rock.
    • In 1972, ex-Fairchild marketing executive Don Valentine created Sequoia Capital.
    Don Valentine (Sequoia) in front of early Apple advertisement



    • Atari founders Nolan Bushnell and Ted Dabney (who met at the wildly successful Ampex, along with Atari's employee number 1, Al Alcorn) reached out to Sequoia's Valentine in 1972.
    • In 1972, Eugene Kleiner (former Shockley and Fairchild engineer) and Thomas Perkins (from HP) created Kleiner Perkins.
    • In 1975, ex-Atari employee Steve Jobs was introduced to Valentine at Sequoia. Valentine thought Jobs and Wozniak needed a more business-minded third co-founder, so he brought in a former Fairchild colleague, Mike Markkula.
    • In 1975, Robert Swanson was a former VC looking to form a company. He had heard about an exciting new technology, recombinant DNA, and got his hands on a list of industry professionals who had attended an international conference in the field. He began cold calling the scientists on the list alphabetically. Herbert Boyer (B) was the first to agree to take the call [3].
    • Robert Swanson pitched his former boss Thomas Perkins for $500k. Kleiner Perkins invested $100k and they created Genentech (Gen-etic En-gineering Tech-nology) [3].

    Now, a couple of points from the above network. There are a number of instances where new people were introduced to the network with limited connections or personal history (the Atari founders to Valentine, Robert Swanson cold calling scientists, etc.). So even though the power of networks here is self-evident, there are exceptions for exceptional people. [See the John Doerr - Amazon bonus note below.]

    Second, working at Fairchild (or in any other part of this network) was neither a necessary nor a sufficient condition for building a successful startup in this period. It took an intense amount of hard work and (probably) equal parts luck. But investors in startups often talk about reducing risk as much as possible; perhaps being in this network was a tangible risk reducer.

    Think about Valentine's investment in Apple. He didn't invest directly; he brought in Mike Markkula (whom he must have trusted, given their professional relationship) because Valentine thought Jobs and Wozniak didn't have "any sense of the size of the market, they weren't thinking anywhere near big enough." Even after bringing in Markkula, Valentine only invested in the next round [1].


    • In 1995, former Intel executive and Kleiner Perkins Partner John Doerr invested in Amazon founded by Jeff Bezos
    • In 1998, Jeff Bezos invested in the Angel round of Google
    • Less than a year later, Google's Series A was led by... John Doerr at Kleiner Perkins.
    • How did Jeff Bezos get to John Doerr? Well it was actually the other way around. Listen to this podcast where Madrona Partner and lead investor in Amazon seed round Tom Alberg describes the connection:
      • Tom Alberg was on the Vizio Board with one of Doerr's Partners
      • Alberg got home one night and his wife said "someone called John Doerr has been calling every 15 minutes and says he needs to speak to you now!"
      • Such an awesome story... not the power of networks but the power of persistence! 


    • [1] Creative Capital: Georges Doriot and the Birth of Venture Capital, Spencer E. Ante
    • [2] Troublemakers, Leslie Berlin
    • [3] The Gene, Siddhartha Mukherjee

    The End of our Shared Experience in Physical Space

    A few months ago I came across Jido Maps, a startup enabling persistent AR. While I'm not exactly sure yet what could act as an adoption catalyst for the developer community to build persistent AR apps (perhaps another Pokémon Go or equivalent?), I do believe that AR will inevitably transform the city environment.



    But whereas traditional, physical advertising (and indeed every other physical object in the city) is limited in that it can only ever be one instance of a billboard, neon sign, or display in a shop window, with AR these advertisements can be individually targeted in physical space just as they are online today. This idea is not new: back in 2007 (!) Microsoft patented "Personal augmented reality advertising" - see below for an image from that patent.



    Taking this to the extreme, we could be walking together through the city (with some sort of AR contact lenses - hopefully soon!) and our experiences could be very different. I'm interested in exploring what will happen in this "hyper-unique," fragmented cultural environment. We've already seen how dangerous this is online; will it be equally troubling in an augmented offline world?

    The beginnings of this dangerous world were implicitly foreshadowed in "Cellular Convergence and the Death of Privacy" by Stephen B. Wicker, published by Cambridge University Press. The book describes the forgotten narrative of how the explosive adoption of the smartphone (by users and developers) created a single point of failure for privacy (that is to say, one device with data on every aspect of your life.) What will happen when we have, as Steve Gu at AiFi phrased it so elegantly, pervasive, perceptual computing?

    To me the technology underlying this "pervasive, perceptual computing" has already been discovered, and the applications that could run on top of this platform could save lives, save money and entertain. It seems inevitable that all of it will be deployed and adopted (persistent AR, fragmented and hyper-personalized city environments and pervasive, perceptual computing.) In this world, therefore, the key questions are those being explored around data ownership and monetization.

    From a technology perspective, the decentralization and self-sovereignty movements enabled by blockchain technology provide an enticing potential solution to many of these problems. From a legal and social responsibility perspective, it's hard to look past the incredible content coming out of the Yale Law School Information Society Project. Despite running for a long time, Jack Balkin's ISP seems perfectly timed for the questions we are raising today.



    I don't know what the right answers are, but I am definitely interested in finding them. 


    Exploring Startup Activity: NYC vs. SF (03/05/18 - 03/12/18)

    Running the startup miner uncovers great startups almost every day (like Thread Genius, Trove and Lively.) Side note: you can see the updated list of mined, high-potential startups here. I started to get curious whether some value could also be extracted from a 'meta-data'-style analysis.

    The output of the miner is the list of startups that were listed on AngelList the previous day per geographic region; I run it for NYC and SF Bay (note: you could also cut the newly listed startups by market, so you could scrape the new blockchain, healthcare or AI startups from the previous day.) So now we can explore the raw number of startups listed on AngelList per week and the market composition of those startups. This could help create real-time awareness of what founders are excited about, and of potential differences between regions.

    To explore this idea, I ran these numbers for the week of 03/05/18 to 03/12/18.
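    As a sketch of the kind of aggregation involved (the records and field names below are hypothetical stand-ins for the miner's actual output schema), counting listings per region and market takes only a few lines of Python:

```python
from collections import Counter

# Hypothetical miner output: one record per newly listed startup.
# The "region" and "market" field names are illustrative assumptions.
listings = [
    {"name": "A", "region": "NYC", "market": "Consumer"},
    {"name": "B", "region": "SF Bay", "market": "Enterprise Software"},
    {"name": "C", "region": "NYC", "market": "Blockchain"},
    {"name": "D", "region": "SF Bay", "market": "Consumer"},
    {"name": "E", "region": "SF Bay", "market": "Education"},
]

# Raw listing counts per region, and per (region, market) pair.
per_region = Counter(s["region"] for s in listings)
per_market = Counter((s["region"], s["market"]) for s in listings)

# The SF Bay : NYC listing ratio falls out directly.
ratio = per_region["SF Bay"] / per_region["NYC"]
```

Running the same aggregation weekly would give the time series of regional listing volume and market composition described below.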


    So we see approximately 1.7x the number of startups listed in SF Bay compared to NYC. This seems much smaller than common wisdom suggests (given the prominent position SV holds in the tech community.) Indeed this is mirrored in data from PwC MoneyTree: taking the median of the number of startups and the amount of capital deployed to NYC and SF Bay seed-stage startups (since this most closely approximates AngelList startups) over the last 2 years, we see 2.1x the number of startups in SV over NYC and 2.0x the capital deployed.

    The market breakdown for the NYC startups is shown below: 


    It might be hard to make any meaningful observations from this data in isolation [1], but some things are clear: Consumer dominates, and Blockchain and AI/ML/Data startups are low (relative to my prior expectation.) Comparing with SF Bay will be most helpful; the market breakdown for SF Bay area startups for the previous week is shown below.


    Here we can see a (nice, somewhat predictable) balance between Enterprise Software and Consumer startups (this may be representative of the "maturity" of SF Bay as a startup ecosystem.) Healthcare and Blockchain seem low, and Education surprisingly high.

    As common VC wisdom suggests, I think the (ongoing) market examination of these startup ecosystems will be helpful in a contrarian way: the best startups are often tackling markets that are not hot (home sharing, transportation, social etc.), and many resist, and in fact break, rigid data structures by virtue of being highly innovative (which is precisely the point.)

    I'm looking forward to continuing this series (with more than just one week's worth of data!)


    [1] AngelList's UI allows users to write free-form text for their market categorization when creating a new startup profile. If it matches a previous tag it autofills, but if not, a new market tag is created. This makes it a little difficult to run analytics (some of the best market categories for scraped startups last week include: swimming, USA and livestock options.) So I created my own 'umbrella' market tags to consolidate the free-form tags. Disclaimer: obviously this could introduce distortion, but I assume it to be negligible.
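    A minimal sketch of that consolidation step (the mapping below is purely illustrative; it is not the actual umbrella tag set used for the charts above):

```python
# Illustrative mapping from free-form AngelList tags to 'umbrella' markets.
# Both the keys and the umbrella names here are assumptions for the sketch.
UMBRELLA = {
    "cryptocurrency": "Blockchain",
    "bitcoin": "Blockchain",
    "machine learning": "AI/ML/Data",
    "big data": "AI/ML/Data",
    "e-commerce": "Consumer",
}

def consolidate(tag: str) -> str:
    """Map a free-form market tag to an umbrella market, else 'Other'."""
    return UMBRELLA.get(tag.strip().lower(), "Other")
```

Junk tags like "swimming" or "livestock options" simply fall into 'Other', so the distortion mentioned above stays visible rather than hidden.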

    Exogenous Factors and Company Success

    The timing of Uber’s launch may be another (less talked about) important element of its eventual success. According to Crunchbase, it was founded in 2009 - superficially not the best year to start building a business, right after the biggest financial collapse since the Great Depression.


    But for Uber’s business model the crisis may in fact have been beneficial. Uber relied on the assumption that it could become an attractive option for the ‘non-taxi-driver’ public to start driving and earn a bit of money on the side (Uber even later centered a whole marketing campaign around promoting the “side hustle.”) At a time when the unemployment rate was 8.5%, having an ‘easy’ way to earn more money probably attracted many people to join Uber as drivers, and because their model is sticky (which I think it is), they were able to create a stable, large network of ‘contractor’ drivers for the Uber network. Indeed there are similarities with Airbnb, founded in 2008, which enabled people to list their spare space and earn a bit of “side hustle” income at a time when it was most needed.

    This then begs the question: how important are exogenous factors (stock market value, unemployment, social unrest, political climate, pop culture) in successful early-stage technology company building? This isn't an altogether new question. In Internet Dreams (only $7.00!), a wonderfully prophetic philosophical exploration of the potential of the Internet published in 1997, Mark Stefik provides a passage on the "Gutenberg Myth" from Scott D. N. Cook:

    "At the very least, this account of the printing revolution suggests that the traditional, one dimensional model of new technologies (or a single new material gadget) causing broad social change must be regarded with deep suspicion."

    Cook introduces the beautiful phrase 'political and moral myopia' for revisionist history that weights the technological innovation too heavily within a whirlpool of social, political and economic upheaval (the revolutionary period in France and the United States created a social climate of equality leading to increased education, literacy and therefore demand for printing - which the printing press could accommodate.)

    So what are the exogenous political, moral, economic factors most important today?

    • All time high on the Dow, NASDAQ and S&P500 (despite Govt. shutdown, political turmoil, NK etc.)
    • Unemployment at 4.1% ('full employment') with interest rates rising
    • Potential social and political backlash against monopolist technology
    • General lack of appetite for going public, with SoftBank Vision Fund elephant in the room
    • Increasing polarization of social and political beliefs?
    • Breakdown of the Section 230 safe harbor provision for 'non-publisher' online platforms?
    • Ageing population as Baby Boomers retire?
    • Millennial generation adopting network based, rental (house, furniture, clothes, experiences) economy versus ownership (car, house etc.) economy?
    • Stagnating wages, rising inequality?

    The benefit of this exploration is in the brainstorm and not the answer (even if there is one.) There are many ways to explore the above.

    This website could be helpful in exogenous factor exploration:

    Things That Change and Things That Stay the Same (Or The Technological Event Horizon)

    This Jeff Bezos quote has done the rounds for a while, and for good reason (see Bill Gurley and Collab Fund). It's equal parts obvious and contrarian. It has been in the back of my mind for a while, and I think I've finally understood why.


    “I very frequently get the question: ‘What’s going to change in the next 10 years?’ And that is a very interesting question; it’s a very common one. I almost never get the question: ‘What’s not going to change in the next 10 years?’ And I submit to you that that second question is actually the more important of the two — because you can build a business strategy around the things that are stable in time. … [I]n our retail business, we know that customers want low prices, and I know that’s going to be true 10 years from now. They want fast delivery; they want vast selection. It’s impossible to imagine a future 10 years from now where a customer comes up and says, ‘Jeff I love Amazon; I just wish the prices were a little higher,’ [or] ‘I love Amazon; I just wish you’d deliver a little more slowly.’ Impossible.”  - Jeff Bezos

    For me, understanding what doesn't change is only half the equation. The other half is understanding where these things that don't change happen: the technological event horizon. OK, yes, that is super jargon, but hear me out. It seems as though there are values that don't change and outcomes that don't change. We can look at this using personal transportation as an example:

    • Outcome that doesn't change: getting from point A to point B.
    • Value that doesn't change: getting there the cheapest, quickest, safest
    • Where it happened: walking, horse, horse carriage, electric street car, car, taxi, Uber, AV

    Personal entertainment offers another example: 

    • Outcome that doesn't change: personal entertainment
    • Value that doesn't change: choice, value, quality, ease of access
    • Where it happened: book, nickelodeon, cinema, radio, TV, internet

    In both cases the value doesn't change and the outcome doesn't change, but where these things happen - the technological event horizon - does change. Even Jeff Bezos knows this: Amazon was built on people wanting wide selection at low prices, enabled by the thing that has changed: the internet.
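    As a toy illustration, the split above could be captured in a small data structure (the class and field names are mine, purely illustrative):

```python
from dataclasses import dataclass

@dataclass
class EventHorizon:
    """Sketch of the outcome / value / venue split described above."""
    outcome: str   # what people achieve (doesn't change)
    values: list   # how people judge it (doesn't change)
    venues: list   # where it happens (the part that does change)

transport = EventHorizon(
    outcome="get from A to B",
    values=["cheapest", "quickest", "safest"],
    venues=["walking", "horse", "street car", "car", "taxi", "Uber", "AV"],
)

# Only the venue list keeps growing; outcome and values are stable in time.
latest_venue = transport.venues[-1]
```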

    A Framework for Company Formation Diligence

    Disclaimer: Even before I write this post I know it's going to appear very buzzwordy. I'm going to attempt to make it as practical and actionable as possible.

    How do you know your idea for a company has potential? I'm not talking about superficial halo effect success metrics like VC funding, co-founder excitement or press. I mean how do you structure a rigorous examination of the potential of your idea?

    I looked online briefly and didn't really find any compelling information. Google offered "10 Ways to Know if You Have a Good Business Idea" which seemed to be the online equivalent of a tax "expert" at a strip mall. First Round's new First Search tool offered Chris Dixon's "Why You Shouldn't Keep Your Startup Idea Secret," still not exactly what I was looking for. Sequoia has a brief but interesting post on "Writing a Business Plan" that is definitely worth checking out.

    So it seems there may be room for a structured examination of the diligence that could be applied to the brainstorming session of a co-founding team. Below is that examination. It is still a work in progress but, like most helpful frameworks, it breaks the problem down into 3 main components: the thematic layer, the strategic layer and the tactical layer.

    (Image: framework for company diligence)

    Thematic Layer

    The math of venture requires that funds invest in companies that have, as Sequoia puts it, "legendary" potential. These legendary companies generally take advantage of tectonic shifts in history. These shifts can be difficult for incumbents to identify, since incumbents have a financial incentive to believe in the stability of their value proposition. As 'The Sovereign Individual' points out, "if you know nothing else about the future, you can rest assured that dramatic changes will be neither welcomed nor advertised by conventional thinkers...the tendency will be to downplay the inevitability of these changes." Famous themes that led to the development of legendary companies include:

    • Increased necessity and potential of personal computation as a tool and platform
    • Efficiently and effectively organizing the world's increasingly abundant digital information
    • Allowing the world to maintain, build and share social connections online

    Examples of decades-long themes that could form the core of a company today include:

    • Increased decentralization and modularization of information and cultural production
    • Increased regulatory burden on data-centric technology companies
    • Pervasive perceptual computing as a new consumer interface

    Resources that help with the thematic layer diligence of a company idea:

    • Technological Revolutions and Financial Capital, Carlota Perez
    • The Rise and Fall of American Growth, Robert J. Gordon 
    • The British Industrial Revolution in Global Perspective, Robert C. Allen
    • A whiteboard

    Strategic Layer

    Assuming you are able to get funding, ship product and get customers... what is your company's competitive differentiation that allows you to keep those customers, and what gives your company escape velocity? Is it internally building a proprietary database? Being first mover and product leader? Brand differentiation? Design differentiation? Interestingly, in First Round's 2017 State of Startups only 5% of founders said they think they could fail because a "competitor outdid them."

    Resources that help with strategic layer diligence: 

    • The Innovator's Solution, Clayton M. Christensen and Michael E. Raynor
    • Creative Confidence, David M. Kelley and Tom Kelley
    • Zero to One: Notes on Startups, or How to Build the Future, Peter Thiel and Blake Masters
    • Crossing the Chasm, Geoffrey Moore
    • Stratechery
    • Public company filings, SEC Edgar

    Tactical Layer

    Getting from 0 to 1 is the hardest, loneliest stage. Where do you start? What are your priorities? How much runway do you have? Which verticals do you start with? Who do you know in VC? Who could you convince to join in the earliest stages? Who else is out there trying to do what you are doing (be honest)? Why are you a better team to build this company than they are?

    Resources that help with the tactical layer of diligence:

    • First Search, First Round
    • High Output Management, Andy Grove
    • The Hard Thing About Hard Things, Ben Horowitz
    • Experience

    China, Technology Development and the Future

    Legendary Sequoia Partner Mike Moritz recently penned an article in the Financial Times, as he is wont to do, titled "Silicon Valley Would be Wise to Follow China's Lead." It describes the culture of working at a startup in China: long hours, family sacrifices and a "furious" pace of work. This would be fine, in my opinion, if it were a singular exploration of a very unique culture. But by not only contrasting it with Silicon Valley, but also prescriptively saying we should "follow" the "lead" of China, Moritz misses a chance to focus on output as the main measure of success rather than the outward impression of how busy you are. The two cultures are so drastically different that a suggestion that one should simply follow the other seems sub-optimal.

    Having said all that, the comparison of China and the U.S. - technologically, culturally, intellectually - is a long-running one. In exploring output, or value creation, as the main comparison factor, we see there are many elements of this story worth investigating:

    • Will the relative lack of privacy concerns on personal data in China enable it to create massive databases to train its models, unavailable to the U.S. with its stricter data controls?
    • Will different government policies have a long-term impact on the locus of power in A.I.? (In an official report, China's government issued a strong call to arms: "we must take the initiative to seek change and keep a firm grasp of major historical opportunities in the development of artificial intelligence." Meanwhile the current U.S. administration is cutting funding to A.I. research.)
    • Does the U.S. offer the state of the art in mobile design? China effectively skipped the PC revolution and directly adopted the smartphone, with interesting consequences. Dan Grover, former Product Manager at WeChat (now at Facebook), detailed in 2014 and 2016 the differences in mobile design between the U.S. and China - not hinting at which is better, but fascinating nonetheless.
    • Does the U.S. offer the state of the art in AI/ML? Andrew Ng is not shy about burning SV from time to time.

    In my view, a major story that will play out over the next decade is the internal conflict of technological regulation: it is both a necessity for maintaining the core value of privacy and a potential hindrance to technological advancement. Will China's more lax regulations allow it to adopt and deploy autonomous vehicles quicker? Virtual reality as entertainment? A.I. for medical diagnosis? Everything?

    People have seen this coming: a16z has Partners who worked in the White House under Bush and Obama, and Tusk Ventures was recently created specifically to help startups "thrive in heavily regulated markets." It's going to be a fascinating decade.

    Notes: The British Industrial Revolution in Global Perspective

    Buy it, Cambridge University Press, Robert C. Allen

    When you stop to think about it, the idea of using the past to help predict the future of technological development seems kind of self-contradictory. But there is obvious value in understanding the path of development. Below are some interesting insights from renowned historian Robert C. Allen on the British Industrial Revolution that I think can be translated to the present, with the goal of shedding light on the highest-potential opportunities that exist today for company builders.

      • 7: "Turning scientific knowledge into working technology was an expensive proposition, and it was a worthwhile investment only in Britain where the large coal industry created a high demand for drainage and an unlimited supply of virtually free fuel"
      • 15: "This book explores how Britain's high wages and cheap energy increased the demand for technology by giving British businesses an exceptional incentive to invent techniques that substituted capital and energy for labor."
      • 42: "In the mid-eighteenth century it was the high wages in Britain...that played the important role of imparting a labor-saving bias to technical change"
      • 57: "Greater food production and lower farm employment led to an expanded urban population. The result was greater manufacturing production and economic growth"
      • 79: "London and the proto-industrial sectors were the engines of growth. Their expansion raised wage rates and drew labor out of agriculture. Small farmers either sold out and moved to the city or improved their methods and raised their yields in order to keep up with urban incomes and participate in the consumer revolution"
      • 82: "Abundant coal [in Northern England] made energy very cheap. Coal was also important...for its technological spin-offs, the steam engine and the railway."
      • 92: "Copying and elaborating innovations was the way the coal burning house evolved. In this model, which is called 'collective innovation', the rate of experimentation depended crucially on the rate of house building"
      • 105: "One of the puzzling features of the high wage economy was how British firms could pay more for their labor than French firms...One reason is that cheap energy offset the burden of high wages"
      • 128: "The intercontinental trade boom was a key development that propelled northwestern Europe forward."
      • 137: "Any theory that explains British success by positing a British genius for invention is immediately suspect."
      • 141: "A change in the relative prices of the factors of production is itself a spur to innovation and to inventions of a particular kind - directed at economizing the use of a factor which has become relatively expensive."
      • 149: "The French shifted to mineral fuel smelting very quickly: a 'tipping point' was reached. The French jumped directly to the most advanced blast furnace technology and skipped all the intermediate stages through which the British progressed. Britain's competitive advantage had been based on the invention of technology that benefited it differentially. It is ironic that the success of Britain's engineers in perfecting that technology destroyed the country's competitive advantage."
      • 151: "Macro-inventions are characterized by a radical change in factor proportions"
      • 173: "The real cost of rotary power in the mid-1840's was about one-third what it had been...while the real cost of pumping power dropped by about half. The efficiency of the pumping engine had doubled and that of the rotary engine tripled"
      • 190: "Most macro-inventions were inspired by knowledge or practice from outside the industry"
      • 199: "The real issues involved in 'inventing' mechanical spinning...was not in thinking up the roller; rather, the challenges were the practical issues of making the roller work in the application."
      • 225: "In the early 1730s, he proposed to expand the business by producing cast iron parts for steam engines, which he anticipated would be a growing business after the expiration of the Savery-Newcomen patent in 1733."
      • 241: "The third aspect of the Industrial Revolution is the application of the scientific method to the study of technology through experimentation. The 'legitimization of systematic experimentation'."
      • 255: "Experimentation was, therefore, the common feature that characterized eighteenth century inventions"
      • 273: "Steam technology accounted for close to half of the growth in labor productivity in Britain in the second half of the eighteenth century"
      • 273: "The steam engine was invented to drain coal mines"
      • 275: "The British were simply luckier in their geology"

      Questions that arose after reading and reviewing:

      1. What is the 'bias' in today's technological change? What factor of production are we economizing for? Does it depend on industry? Energy/labor/materials/attention/human creativity?
      2. Should we look to China as the leader in the AI revolution? China skipped the PC and went straight to a (superior?) mobile platform. What does this mean in terms of leadership position in the future? What are they doing now? Specifically in consumer-driven companies? Perspectives from Mike Moritz (FT), Dan Grover, Facebook

      Founder Preparation/Diligence and 'Dynamic Balance'

      Looking back on it now, I was unequivocally unprepared for the process of becoming a founder. I say this with the benefit of hindsight, obviously, but also because in my current job, after meeting with many founders, I’ve devoted more time to thinking about the “profile” of a “good” founder (even exploring whether we could build predictive algos around this profile). I put these terms in quotation marks very deliberately because this profile does not exist. But that doesn’t mean there is nothing we can do to prepare ourselves to become, or to identify, creative, high-potential founders.

      I want to outline how hard it is being a founder by exploring some of the things we try to look for in the best ones. What we’re looking for is an idea of dynamic balance. I don’t mean work-life balance (although that is important.) What I mean is: how do they manage two perhaps contradictory, and potentially both “good”, responses to key questions, strategy directions, decisions etc.?

      Ben Horowitz hints at this idea in “The Hard Thing About Hard Things” when referring to CEO stress, saying CEOs make one of the following mistakes: “1. They take things too personally, 2. They do not take things personally enough.” The “right” response is highly context-dependent; sometimes taking things personally is the “right” call, sometimes not. What is important here is that the founder has the ‘dynamic balance’ to make the correct judgement.

      For me the best way to think about it is using a fulcrum to visualize the competing responses. The key founder criteria, with competing responses, are shown below:


      Let's investigate these a little further.


      In 'Make Your Bed', a book with a deceptively simple title, former U.S. Navy Admiral William McRaven highlights the benefits of being "unshackled by fear." Founders need this too. But he also outlined how detailed and calculated each move his teams made was. The dynamic balance here is understanding the fact (more than just a cursory acceptance) that your company will most probably die, and still being unafraid (and excited!)


      This one is really interesting. Jim Simons at Renaissance Technologies, arguably the best quant hedge fund in the history of the world, famously does not hire anyone out of Wall Street. He wants physics PhDs straight out of school because he doesn't want them to be 'corrupted' by the 'wrong' ways of making investment decisions. 

      MIT Professor Andrew Lo has a similar take on this; in his new book Adaptive Markets, he describes the scene of a shark thrashing about on the shore of a beach. Lo does this to illuminate how a creature of such (terrifying) hunting perfection can be reduced to ridicule simply by changing context. Now, this may be obvious, but the point is that the shark is so hyper-adapted to hunting in the ocean that it is useless on land. I say this to explore the idea of why Hilton didn't make Airbnb. Hilton (or any other global hotel chain) was so hyper-adapted to hunting in its own context that it missed (and arguably could never have seen or executed on) this multi-billion dollar opportunity.

      So, depending on the market in which the founder is operating, we either tolerate (or indeed seek out) creative, contrarian outsiders, or the market will necessitate domain knowledge (like building ML-specific ASICs, for example.)

      Bias for Action

      Being decisive, taking action, making decisions, failing fast, etc., versus being deliberate about building the "communication architecture", as Ben Horowitz calls it, around seeking feedback from team members, investors and advisers on important questions. Again, the idea here is that every situation will be different; what we want to diligence is the founder's awareness of when to tip towards one or the other.

      Vision and Adaptability

      Andy Grove pivoted his huge, public company away from what it had been doing successfully for many, many years in order to save the entire company. For me, vision and adaptability are actually one and the same thing: you need to think of adaptations to support having a thriving, continuing vision; the shark needs to think about whether it will ever need legs.

      There is no "right" answer to the dynamic balance on these questions, and they change depending on many factors. But we believe this high-level, translatable framework is helpful in learning more about how the founder will run her business in the future.

      (Also, let’s talk!)


      An Adaptive Markets Based Approach to Venture Capital Investments

      “Financial markets don’t follow economic laws. Financial markets are a product of human evolution and follow biological laws instead”

      - Andrew Lo, Adaptive Markets

      In attempting to understand (and exploit) the operation of modern financial markets, academics and investors alike have long found comfort in the reductionism of all-encompassing equations. The problem is that these equations can sometimes be wrong, and when they are wrong, they are destructively wrong. Richard Feynman once quipped, “imagine how difficult physics would be if electrons had feelings,” pointing out, with characteristic wit, the inappropriateness of translating physics to human financial markets. In trying to solve this problem, there has been a recent shift towards recognizing the role that humans play in these markets (and the irrationality and unpredictability they create) and, by extension, the role that biological laws play in finance. From Herbert Simon’s bounded rationality to Daniel Kahneman’s heuristics and biases, this focus on human biology in financial markets has been a long time coming.



      Our focus here will be on how we incorporate (and exploit) these new techniques and tools as an investor in the venture capital market. In this context, we can see that attempting to incorporate the ‘physics’ of machine learning alone will be suboptimal. We need to leverage biological laws in optimizing our investments. We do this by interpreting the venture capital market as a complex adaptive system, drawing on insights from machine learning, theoretical computer science, graph theory, and evolutionary game theory.

      Why Machine Learning Alone is Not Enough

      Building quantitative tools to support investment decisions is valuable in itself. Alan Turing once said, “I believe that the attempt to make a thinking machine will help us greatly in finding out how we think ourselves.” I believe all venture investors, for every decision, invoke an internal ‘model’ that they’ve ‘learned’ over their career through all the companies reviewed, decisions made, successes and failures. In building a model we can learn more about how we make decisions, and how we can improve them. This is motivated by the problem that humans forget, are biased, and generally make sub-optimal, heuristic-based shortcut decisions. Also, how many sufficiently detailed deals could a human have possibly seen? 2,000? 10,000? And of those deals, how many ‘features’ do they remember about each one? Psychologist George Miller of Princeton famously found that humans can only hold 7 objects (plus or minus 2) in their working memory.

      But a machine learning model does not forget, is not biased (as long as the training data is appropriate) and can evaluate all 30,000+ deals in making a decision. But what happens when the first Blockchain deal is reviewed by the model? What ‘market’ feature do we assign to this new market? Here we see the breakdown of using only machine learning to make decisions: it violates the invariance assumption (from Theoretical Computer Science Professor Leslie Valiant), which states that the context in which the generalization (prediction) is to be applied cannot be fundamentally different from that in which it was made.

      And in almost every successful case, the entrepreneur is deliberately trying to violate this assumption. ‘We are doing something completely unique’: every entrepreneur is deliberately trying to break the current context and introduce something new, and if it is sufficiently new that it is unrecognizable to a model that has learned over the past 10 years (like blockchain technology), the model is broken.
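      A toy sketch (with hypothetical data) makes the failure mode concrete: a 'model' that has only learned outcomes for markets seen in training has nothing to say about a brand-new category like Blockchain.

```python
from collections import defaultdict

# Hypothetical training set: (market, outcome) pairs, 1 = success.
training_deals = [
    ("Consumer", 1), ("Enterprise", 0), ("Consumer", 0),
    ("Enterprise", 1), ("Consumer", 1),
]

# Trivial "model": observed success rate per market category.
counts = defaultdict(lambda: [0, 0])  # market -> [successes, total]
for market, outcome in training_deals:
    counts[market][0] += outcome
    counts[market][1] += 1

def predict(market):
    """Return the learned success rate, or None for an unseen market."""
    if market not in counts:
        return None  # the invariance assumption is violated
    successes, total = counts[market]
    return successes / total

consumer_rate = predict("Consumer")      # learned from training data
blockchain_rate = predict("Blockchain")  # None: no context to generalize from
```

Real models fail less loudly than this: rather than returning None, they tend to silently map the new category onto the nearest old one, which is arguably worse.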

      Incorporating Elements of Systems Biology

      “I remember the first time I met Edsger Dijkstra. He was noted not only for his pioneering contributions to computer science, but also for having strong opinions and a stinging wit. He asked me what I was working on. Perhaps just to provoke a memorable exchange I said, “AI.” To that he immediately responded, ‘Why don’t you just work on I.’ ”

      Harvard Professor Leslie Valiant, Probably Approximately Correct

      So it is that highly complex, non-linear systems must be treated (at the time of writing) with more than just artificial intelligence. We are trying to get to a more complete understanding, and with that goal in mind we introduce elements of biology into the mental model.

      The venture capital market lends itself naturally to biology: it is completely driven by human interactions, networks and relationships, it’s constantly evolving, and it involves concepts of competition and survival analogous to evolutionary biology. Indeed, some (all?) companies are inherently human; Sequoia Capital reportedly analyzes which of the ‘7 Deadly Sins’ the company in question exploits.

      Applying X to venture capital:

      • Evolutionary biology
      • Graph theory
      • Diversity and complexity
      • Theory of ‘Ecorithms’

      To be continued.




      Running the Math on a Larger VC Portfolio

      Note: Kendrick Kho contributed heavily to the modeling in this post

      There are a few examples of funds building very large portfolios of early stage companies, mainly YC, 500 Startups, SV Angel and, to some extent, A16Z. The reasons for building a larger portfolio vary between funds (since they have different incentives), but common to all of them is a higher likelihood of having a hyper-successful company in the portfolio. The trade-off, however (assuming the fund dollar amount is constant), is that the ownership per portfolio company will be lower (assuming post-money valuations are steady also). So is the higher likelihood of a hyper-successful portfolio company worth the lower ownership per portfolio company?

      Returns in VC are extremely Power Law sensitive (what Nassim Nicholas Taleb famously called the Black Swan effect), to the point where one Airbnb is (currently) worth roughly 30 'regular' unicorns [1], so precise analytical answers are elusive. But what we can do is build a higher level abstraction of what happens with a larger portfolio, shown below.

      Effectively what we are doing is moving from the light blue line (high likelihood of low return, but very small likelihood of extremely high return - far right) to that of the dark blue line (high likelihood of 'decent' return but lower likelihood of extremely high return - far right.)

      This is because the small portfolio (say 20 - 30 companies) is concentrated, and there will be some simulations where an Airbnb (or equivalent) lands in this concentrated portfolio, leading to extremely high returns. In the larger portfolio, most simulations will include a hyper-success company, but with lower ownership that company might return 1.5x the fund rather than 10x. This happens far more often than in the smaller portfolio, since we are now very diversified.

      The above is merely meant to represent the idea of portfolio diversity and concentration. We now have simulated these two scenarios using the Power Law Hybrid outlined in a previous post. Results of this (1,000 trial Monte Carlo simulation) are shown below. 

      Here we see (in a bit more detail) the effect described above. Almost 50% of the time, our smaller portfolio (light blue) loses money (negative IRR). But, 10% of the time we return >30% IRR. Compare this with the larger portfolio (dark blue). Here we see (almost) 0% chance of returning <0% IRR (graph rounds very small numbers to 0%) but also (almost) 0% chance of returning >30%. Herein lies the tradeoff. 
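      The two scenarios above can be sketched directly in a few lines. The sampler below is only an illustrative stand-in for a Power Law Hybrid return distribution; the lognormal parameters and the 5% Pareto tail weight are assumptions for this sketch, not the simulation used above:

```python
import math
import random
import statistics

def company_multiple(rng, tail_p=0.05, alpha=1.159, mean=0.3, sd=1.0):
    """Assumed per-company return: lognormal body with a Pareto tail."""
    if rng.random() < tail_p:
        return rng.paretovariate(alpha)  # hyper-success draws
    # Convert the target mean/sd of the multiple to the underlying normal.
    sigma2 = math.log(1 + (sd / mean) ** 2)
    return rng.lognormvariate(math.log(mean) - sigma2 / 2, math.sqrt(sigma2))

def fund_multiples(n_companies, trials=1000, seed=42):
    """Equal checks from a fixed fund, so the fund multiple is the
    average of the per-company multiples."""
    rng = random.Random(seed)
    return [statistics.fmean(company_multiple(rng) for _ in range(n_companies))
            for _ in range(trials)]

def iqr(xs):
    """Interquartile range: a tail-robust measure of outcome spread."""
    s = sorted(xs)
    return s[3 * len(s) // 4] - s[len(s) // 4]

small = fund_multiples(25)   # concentrated portfolio
large = fund_multiples(250)  # diversified portfolio
print(iqr(small), iqr(large))  # the spread of outcomes narrows with size
```

      Under these assumed parameters the larger portfolio's outcomes cluster much more tightly, which is the move from the light blue to the dark blue line described above.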

      The larger portfolio is more reliable in returning an IRR of 15%+, but we miss the chance of returning 30%+. In many cases LPs can deploy capital into other asset classes that reliably generate 15%; they choose early stage venture specifically for the extreme, out-sized returns (despite this occurring only 10% of the time in this simulation).

      So now we can look at estimating how many portfolio companies are required to build a 'diversified' early stage portfolio. The graph below holds the fund $ size constant and adjusts the ownership per portfolio company as the number of companies in our portfolio (the x-axis) increases. The dark blue line is the median expected ROIC of the portfolio (again over a 1,000 trial Monte Carlo simulation) and the shaded blue band represents the 3rd and 1st quartiles. From the graph we can see that at around 250 - 300 companies the benefits of diversification are realized (disclaimer: this simulation is highly stylized and for discussion purposes only.)
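      A sweep like the one behind that graph can be sketched as follows. Again, the return sampler is an assumed stand-in (lognormal body, Pareto tail, 5% mixing weight), not the actual model:

```python
import math
import random
import statistics

def company_multiple(rng, tail_p=0.05, alpha=1.159, mean=0.3, sd=1.0):
    """Assumed stand-in for the per-company return distribution:
    lognormal body with a Pareto tail."""
    if rng.random() < tail_p:
        return rng.paretovariate(alpha)
    sigma2 = math.log(1 + (sd / mean) ** 2)
    return rng.lognormvariate(math.log(mean) - sigma2 / 2, math.sqrt(sigma2))

def roic_quartiles(n_companies, trials=1000, seed=7):
    """1st quartile, median and 3rd quartile of fund ROIC at a given
    portfolio size (fund size held constant, so ownership scales as 1/n)."""
    rng = random.Random(seed)
    outcomes = sorted(
        statistics.fmean(company_multiple(rng) for _ in range(n_companies))
        for _ in range(trials))
    return outcomes[trials // 4], outcomes[trials // 2], outcomes[3 * trials // 4]

for n in (25, 50, 100, 250, 500):
    q1, med, q3 = roic_quartiles(n)
    print(f"{n:>4} companies: median {med:.2f}x, IQR {q3 - q1:.2f}x")
```

      As the portfolio grows, the interquartile band tightens around the median, which is the narrowing shaded region the graph above depicts.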



      Modeling Returns in Venture Capital: Power Law Hybrid

      Before building models to sensitize the returns of different portfolio constructions we need a representation of an individual company's return. Empirically we know roughly 50% - 75% of seed stage companies die and that the hyper success case is, well, hyper-rare, maybe 1 in 200. Of course any mathematical representation of the return profile of a seed stage company is steeped in uncertainty. The best we can do is use a distribution that doesn't seem too implausible.

      By many people's estimation the Power Law is the most 'usable' and 'accurate' distribution to model seed stage returns. The Power Law is also the basis of our approximation. But we create a mutant hybrid with the Log Normal because in generating and experimenting with various Power Law configurations it seems a little too harsh on the death rate (>75% die in many cases.) The Power Law also seems a little too willing to produce hyper-success cases (1 - 2 in 100.)

      In sensitizing and understanding the returns of our portfolio constructions, these two 'errors' (the lower than realistic death rate and the higher than realistic unicorn rate) can be accepted. Indeed, in any fund there must be some 'unfair advantages' that the fund manager has (superior network, proprietary deal flow, superior selection rate etc.) for the fund to exist in the first place. For Hone Capital, machine learning models support our lower seed stage death rate and the AngelList network supports the higher than normal unicorn rate.

      With return measured as a multiple of the original post-money valuation, the Log Normal is run with the mean at 0.3x and the standard deviation at 1.0x. The Power Law is modeled using a Power Law coefficient of 1.159. Both are shown below (vertical axis cut for clarity.)

      With these coefficients we can see that the 'effective' death rate is approximately 40%, another 40% return 1.0x - 3.0x heavily weighted towards the former and around 1.5% go to exit at 'unicorn' status. It is pretty clear these are just approximations which help us get a sense of the sensitivity of a portfolio construction. They are at best over-engineered and at worst wrong. 
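      One way to sanity-check a distribution like this is to draw from it and measure the effective rates empirically. The sketch below is one possible hybrid, not the exact model: the 5% Pareto mixing weight, the 0.25x 'death' threshold and the 10x 'unicorn' proxy are all assumptions for illustration:

```python
import math
import random

def hybrid_multiple(rng, tail_p=0.05, alpha=1.159, mean=0.3, sd=1.0):
    """One possible lognormal/power-law hybrid; tail_p (the Pareto
    mixing weight) is an assumption of this sketch."""
    if rng.random() < tail_p:
        return rng.paretovariate(alpha)  # power-law tail, coefficient 1.159
    # Lognormal body parameterized by the mean/sd of the multiple itself.
    sigma2 = math.log(1 + (sd / mean) ** 2)
    return rng.lognormvariate(math.log(mean) - sigma2 / 2, math.sqrt(sigma2))

rng = random.Random(0)
draws = [hybrid_multiple(rng) for _ in range(100_000)]
dead = sum(d < 0.25 for d in draws) / len(draws)     # assumed 'death' threshold
unicorn = sum(d > 10.0 for d in draws) / len(draws)  # assumed hyper-success proxy
print(f"dead: {dead:.1%}, unicorn: {unicorn:.2%}")
```

      With these assumed parameters the empirical rates will not match the 40% / 1.5% figures quoted above; the point is only that the death rate and unicorn rate can be tuned independently through the body and tail parameters.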


      Graph Theory Applications in Venture Scout Programs

      Investors in early stage private technology companies produce a very rich graph to investigate. The graph is a secondary result of the investment activity (no one focuses on the graph while they invest). In this regard it is more like 'data exhaust'.

      But the information is very valuable since it allows you to quantify the 'connectedness' of an individual, or their centrality. There are many different forms of centrality (eigenvector centrality, betweenness centrality etc.) that are appropriate depending on the context of the problem. 

      In investigating emerging early stage investors, one could look at the 'ego network' of the investor in question at two points in time (after 2 or 3 deals and then again a year later, assuming they've executed more deals in that time). Using open source software from Stanford [1], you can quantify the difference in that investor's centrality. In an environment where access via relationships is important (as in early stage tech investing), this quantification of an investor's increase in centrality could be very valuable as a proxy for their future success.
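      As a sketch of the two-snapshot comparison (using networkx here rather than the Stanford toolkit referenced above; the co-investment edges are entirely hypothetical):

```python
import networkx as nx  # assumed available; stands in for the Stanford toolkit

def centrality(coinvestments, investor):
    """Degree and betweenness centrality of one investor in a
    co-investment graph built from (investor_a, investor_b) edges."""
    g = nx.Graph(coinvestments)
    return nx.degree_centrality(g)[investor], nx.betweenness_centrality(g)[investor]

# Snapshot 1: the scout's ego network after their first few deals.
early = [("scout", "vc_a"), ("vc_a", "vc_b"), ("vc_b", "vc_c"), ("vc_c", "vc_d")]
# Snapshot 2: a year later, with new co-investors reached through the scout.
later = early + [("scout", "vc_b"), ("scout", "vc_e")]

d1, b1 = centrality(early, "scout")
d2, b2 = centrality(later, "scout")
print(d2 > d1, b2 > b1)  # both centrality measures grew between snapshots
```

      Betweenness is the more interesting signal here: it rises when other investors' shortest paths to each other start running through the scout, i.e. when the scout becomes a broker of access rather than just a participant.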


      Notes from 'High Output Management'

      "First, everything happens faster. Second, anything that can be done will be done, if not by you then by someone else."

      Notes taken from Andy Grove's High Output Management, originally published in 1983:

      Page xv (Introduction): The motto I'm advocating is "Let chaos reign, then rein in chaos"

      You could almost extract value out of each word in this sentence: 'Let', as in you are in control and allowing this chaos to reign; 'chaos', as in things can devolve into absolute disorder and yet you are still in control (allowing it); and so on.

      Page xvii (Introduction): You need to plan the way a fire department plans. It cannot anticipate where the next fire will be, so it has to shape an energetic and efficient team that is capable of responding to the unanticipated as well as to any ordinary event. 

      Here we see the benefits of building a dynamic and flexible business versus a defensible business. Some investors ask how the current team/product etc. can be defended against would-be attackers. However, as elegantly outlined in the above quote, it's more important to build flexible teams and products that can actively respond to competitive threats.

      Nassim Nicholas Taleb (author of The Black Swan and Fooled by Randomness) calls a company (or any other entity) that follows this idea antifragile. The human immune system is an example of an antifragile system: one that improves with every new threat [1].

      Page 16: You should guard against overreacting. This you can do by pairing indicators, so that together both effect and counter-effect are measured. 

      Page 60: Delegation without follow-through is abdication.

      Page 104: You should attempt to determine your customers' expectations and their perception of your performance.

      Page 189: [On giving feedback] Level, listen and leave yourself out...The purpose of the review is not to cleanse your system...but to improve his/her performance. 

      Page 194: Moving from blaming others to assuming responsibility constitutes an emotional step, while the move from assuming responsibility to finding the solution is an intellectual one.


      The Startup Genome

      Hidden within Siddhartha Mukherjee's magisterial work The Gene is this incredible quasi-equation about the interplay between the code of humanity, the genotype, and the physical representation of that code, the phenotype. Basically, it is not a one-to-one mapping: code != reality. This is because the phenotype is also shaped by the environment the person is in and by random chance, hence:

      phenotype = genotype + environment + triggers + chance

      It seems plausible to port this into the realm of early stage companies to support the discussion around using quantitative tools for private investment. Here the phenotype represents the eventual outcome of the venture, the genotype the attributes of the company, the environment the competitive and broader financial landscape, and the triggers and chance their usual meanings.

      Building machine learning tools for venture investment selection really only uses the genotype of the deal (and perhaps the environment). This lets us see the inherent limitation of quantitative tools in private investment (as Chamath Palihapitiya recently quipped on Twitter, to paraphrase: it's hard to use Excel to predict the future), since the phenotype also depends on triggers and chance.

      But this hasn't stopped medicine from turbo-charging genomics to do what it can to eradicate disease. So shouldn't we do what we can (build databases, ML models, graph theory etc.) to understand the startup genome as best we can, and remove as much uncertainty as possible? Obviously I think the answer is yes, which is why we've built and deployed ML models to support our investment decisions.