## Abstract

There are many methods for conducting performance attribution with portfolios containing only liquid assets. A lack of periodic asset return data and a clear definition of what constitutes an appropriate market benchmark thwarts efforts to perform similar types of attribution analyses for portfolios of private equity funds (and other illiquid investments). In this article, the authors propose a method for decomposing private fund portfolio performance into effects from timing, strategy selection, geographic focus, sizing of fund allocation, and fund selection attributes. They test the method with a simulation study and derive approximate confidence intervals for assessing attribute selection skill using a large historical dataset of buyout and venture capital funds.

**TOPICS:** Private equity, portfolio construction, performance measurement

**Key Findings**

▪ The method in this article evaluates the value created by decisions about the timing, size, geography, and asset class of private fund commitments, and it also yields a residual component. Analysts and investors can interpret this component as value added by fund selection.

▪ The method provides intuitive estimates in terms of contributions to return multiples (MOICs) and IRRs.

▪ The authors conduct a simulation study using a large sample of private equity buyout and venture capital funds to estimate approximate confidence intervals for each allocation characteristic. Practitioners can use these intervals to statistically separate skill from luck.

While there exists a huge body of work examining performance attribution of portfolios holding publicly traded securities (see for example, Lo 2008 and citations therein), there is a paucity of research decomposing performance in the private asset space.^{1} However, the need for solutions is growing, as private funds are increasingly common in institutional portfolios. For example, large university endowments hold more than half of their portfolios in alternative investments, and the average US public pension fund holds around 20% of assets in alternatives (Hochberg and Rauh 2013 and Binfare et al. 2019).

There have been few efforts to perform a rigorous quantitative analysis on the performance decomposition of private fund portfolios for at least two reasons. First, there are no periodically reported returns for private funds. The returns on private funds are derived from cash paid into the fund by limited partners (LPs) relative to cash paid out of the fund to LPs over periods commonly reaching 15 years. Occasionally (and increasingly), LP ownership stakes will transact in the secondary market, but these are irregular, and prices are not typically observable. Consequently, fund performance is most commonly measured in terms of a multiple of invested capital (henceforth, MOIC) or an internal rate of return (hereafter, IRR) using all cash flows from a fund and often an assumed terminal net asset value (when the fund has not yet exited all investments). Second, there is no established benchmark for private funds. There are many different types of private funds, such as venture capital, buyout, credit, real estate, and infrastructure. Thus, an investor may want to benchmark a private fund portfolio to a set of other similar funds or even an appropriate public market index (e.g., the Russell 2000 for US buyouts). Given the lack of publicly available, comprehensive cashflow data for private funds, the performance analysis of private fund portfolios faces many practical challenges.

Despite these issues, we present a practical and transparent framework for private portfolio attribution analysis within this article. Our method isolates the importance of several factors associated with private equity fund performance and allows for attribution of those factors to a portfolio of private funds. In particular, we decompose performance into a market component and components due to differences between the portfolio of interest and simply investing in the market portfolio. Our method allows for a user-defined market based on the investor’s mandate or investment policy statement, but in our analysis, we consider the market to be the full universe of comparable private funds with available cashflow data. We then compare the performance of a given portfolio to that market portfolio based on differences in the timing of investments, the strategies of constituent funds (e.g., venture versus buyout), the geographic focus of constituent funds (e.g., North America versus the rest of the world), the relative size of commitments, and a residual component that we characterize as a selection component (though it can include other unspecified attributes and interaction effects between attributes).

Traditional private market benchmarking aims to measure relative performance generated by an LP’s portfolio. While these analyses vary on the benchmark used (either public or private) and the underlying calculation methodology, the end goal of measuring performance differences (i.e., alpha) is the same. As an LP, knowing if you outperformed an appropriate benchmark is important. However, in order to make performance data useful for evaluating past investment decisions and thus informing future allocation decisions, it is crucial to understand how the decisions made while constructing the portfolio impacted the performance. Our methodology aims to explain how an LP’s tactical and strategic allocation decisions contributed to performance relative to an appropriate private markets benchmark. Like methods commonly used in public markets, our methodology provides estimates of performance contributions for each characteristic relative to what would have been achieved by investing in the benchmark. Consequently, a positive attribution number implies an allocation decision that outperformed the market benchmark and vice versa. The primary difference is that in our analysis, the relative performance is measured in terms of IRR or MOIC instead of annualized public-market returns. For convenience, we call these attributions alphas.

While our analysis will generate estimates of attribution alpha for each portfolio management characteristic we examine, it does not provide any measure of statistical significance. The notion of statistical significance is vague because the null hypothesis of “no ability” is not well defined given the absence of a passive investible benchmark for private assets. To provide measures of statistical significance, we generate bootstrapped confidence intervals of attributes based on specific null hypotheses for each of the characteristics we examine. Our analysis relies on the Burgiss Manager Universe (BMU) sample of buyout and venture capital funds and utilizes data for vintage years ranging from 1987 to 2018. We estimate and report confidence intervals for each characteristic so that users implementing our method can have an approximate indication of where their specific attributions fall in the distribution of random outcomes associated with feasible LP allocations. Specifically, we report the distribution of attribution alphas for portfolios that choose ten funds from each vintage at random based on particular allocation criteria. We do this separately for venture capital and buyout funds (as well as a combined portfolio) and both IRRs and MOICs. In practice, confidence intervals for a specific investment universe or time horizon can be generated using the same bootstrapping method. Likewise, a set of customized confidence intervals can be created using the same average number of randomly chosen funds per vintage year instead of a fixed number (as we report).

We consider the case where the full history of a private portfolio is evaluated, but it is possible to isolate effects from sub-periods. If the goal is to evaluate investment ability, there is the inherent challenge of assessing a long enough performance history that estimated results reflect actual results. Given that the typical private fund has an investment period of around five years and then holds significant assets for another 5 to 10 years, there is no practical way to evaluate skill over short horizons of a year or two using our method (or likely any other). In addition, analysis over short and recent horizons presents the inescapable limitations of relying on fund net asset values (NAVs) at the analysis ending date. NAVs are known to be systematically biased, but this is a limitation for all types of private fund analysis, not just attribution analysis.^{2} Nonetheless, evaluation could be done for a sub-period by simply conducting the attribution analysis using the funds selected during the sub-period. Likewise, an analysis for a specific team or even an individual investment professional can be conducted by isolating the fund investments they recommended inside a larger portfolio. The method allows for complete customization of the funds of interest and so could, for example, only consider funds in the portfolio that are past their investment period. This would prevent commingling with funds not of interest because they are not mature enough to have reliable performance characteristics. The benchmark can also be customized, for example, by only including funds in which the portfolio could have invested (e.g., incorporating size and geography restrictions or limited ability to access top venture funds). This can be done to ensure that the benchmark (and all of its associated cash flows) represents a realistic investible universe for comparison.

In sum, our new method offers several benefits to practitioners:

▪ First, the effects of important investment decisions such as allocation timing, size, geography, and strategy that are well-known determinants of public asset portfolio performance can be identified in private portfolios. Knowing these effects can help asset managers understand where skill has been added by active decision-making.

▪ Second, our simulation study provides individual confidence intervals for each allocation characteristic that can be used to (statistically) separate skill from luck. Being able to identify skill is useful for properly rewarding performance as well as allocating resources to the highest-value analysis.

▪ Third, understanding the source of return variation from active decision-making allows for a better understanding of risks not just in the private fund portfolio but also in a broader portfolio including other assets. This is especially important if active management decisions are correlated across public and private assets.

This article proceeds as follows. The next section defines the methodology. The third section reports Monte Carlo simulation results on a simulated market that verify the accuracy of the method and its convergence rate. The fourth section presents the results of the confidence interval estimation for real-world data. The final section concludes.

## RETURN ATTRIBUTION

For a private markets portfolio, portfolio construction is driven by decisions regarding investment size, time, location, and type. To benchmark the effectiveness of those decisions, one can ask the following questions:

▪ **Commitment Timing:** Compared to the broader market, did you invest in better performing or underperforming vintages?

▪ **Strategy Selection:** You split your commitments between different strategies (Buyouts, Venture, and others). Would you have gained or lost in performance if you had committed to these strategies as the broader market did?

▪ **Geography Selection:** You split your commitments between different geographies (North America, Europe, and others). Would you have gained or lost in performance if you had committed to these geographies as the broader market did?

▪ **Commitment Sizing:** Compared to a portfolio with equal commitments to all funds, did you overcommit to your portfolio’s strongest performers?

▪ **Residual Factors:** Other effects drive the difference between the performance of the overall private market and your portfolio. These can include, but are not limited to, interactions of the above variables, foreign exchange fluctuations, and manager selection (your commitments matched market trends, strategies, and geographies, yet the particular funds in your portfolio outperformed the market). Did the contribution of these factors help or hurt your performance relative to the broader market?

Given the historical performance of a portfolio, an effective attribution method should help you learn what decisions led to lower or higher performance than the general market. Exhibit 1 provides an example of a private markets attribution analysis that decomposes a portfolio’s MOIC into the above decision categories. In this example, the portfolio’s MOIC was 1.82x, while the general market achieved a 1.35x MOIC. The portfolio invested in better vintages and strategies than the overall market (contributing 0.13x and 0.17x to the portfolio’s MOIC). The geography selection was below market (taking away 0.24x in performance). The portfolio invested more, on average, into its top-performing funds (sizing performance gain of 0.15x). After accounting for those factors, there was a positive residual performance of 0.26x. Below is a description of how to calculate these attribution figures using the decision categories listed above.

## THE SETUP

The market is a universe of $M$ funds. The portfolio is a subset of $N$ funds sampled from the market, $N < M$. Let $r_m$ be the market return, $r_p$ the portfolio return, $\alpha_v$ the timing alpha, $\alpha_s$ the strategy selection alpha, $\alpha_g$ the geographic selection alpha, $\alpha_c$ the commitment sizing alpha and $\alpha_\varepsilon$ the residual alpha, then:

$$r_p = r_m + \alpha_v + \alpha_s + \alpha_g + \alpha_c + \alpha_\varepsilon$$

which applies for both MOIC and IRR.
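As a quick arithmetic check of this additive decomposition, the short Python sketch below sums the Exhibit 1 example figures quoted earlier (in MOIC terms). The function and variable names are illustrative assumptions, not notation from the article.

```python
# Exhibit 1 example figures (MOIC terms), as quoted in the text.
market_moic = 1.35
alphas = {
    "timing": 0.13,
    "strategy": 0.17,
    "geography": -0.24,
    "sizing": 0.15,
    "residual": 0.26,
}


def reconstruct_portfolio_return(r_m, attribution):
    """Market return plus the sum of all attribution alphas."""
    return r_m + sum(attribution.values())


portfolio_moic = reconstruct_portfolio_return(market_moic, alphas)
print(f"{portfolio_moic:.2f}x")  # 1.82x
```

The five alphas add back to the portfolio's 1.82x MOIC, which is the property that makes the residual a well-defined balancing term.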

A market of $M$ funds has the following structure: We index each fund in the market by $i$ for $i = 1, \dots, M$. Each fund has a set of attributes:

$$(\nu_i, g_i, s_i)$$

where $\nu_i$ is the fund’s vintage, $g_i$ is the fund’s geography and $s_i$ is the fund’s strategy. Let $V$ be the set of vintages in the universe, $G$ the set of geographies and $S$ the set of strategies. For example, we could have a market where $V$ = {2012, 2014, 2015}, $G$ = {North America, Rest of World} and $S$ = {Buyout, Venture}. Further, we have one time grid, $t_j$ for $j = 1, \dots, T$, on which we observe cash flows for all funds across all vintages. For simplicity, let $t_1 = 0$. Then, the contributions for fund $i$ at time $t_j$ will be denoted:

$$c_i(t_j; \nu_i, g_i, s_i)$$

or $c_i(t_j)$ for short. Similarly the distributions are $d_i(t_j; \nu_i, g_i, s_i)$.

Note that these values can be zero: If the time point $t_j$ is earlier than the fund’s first cashflow, $c_i(t_j) = 0$. Similarly, $c_i(t_j) = 0$ if $t_j$ is after the fund has liquidated. At the last observed time point, when $j = T$, $d_i(t_T)$ is the sum of distributions and any remaining NAV for fund $i$.

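The performance of the *overall market* in terms of IRR is obtained by solving for $r_m$ in

$$\sum_{i=1}^{M}\sum_{j=1}^{T}\frac{d_i(t_j)-c_i(t_j)}{(1+r_m)^{t_j}} = 0.$$

In terms of MOIC, it is defined as:

$$r_m = \frac{\sum_{i=1}^{M}\sum_{j=1}^{T} d_i(t_j)}{\sum_{i=1}^{M}\sum_{j=1}^{T} c_i(t_j)}.$$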

The portfolio is *N* funds out of the overall market, *N* < *M*; assume funds *i* = 1, …, *N* index the portfolio. The exposition below is most applicable to a market of liquidated or nearly liquidated funds. However, the procedure can be applied to a younger portfolio subject to the caveats discussed in the introduction.
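To make the pooled IRR and MOIC definitions concrete, here is a minimal Python sketch on a toy two-fund market. The function names (`pooled_moic`, `pooled_irr`), the bisection solver with its bracket, and the toy cash flows are all illustrative assumptions. Time is measured in years so that the solved rate is annual.

```python
def pooled_moic(contribs, dists):
    """Total distributions (incl. terminal NAV) over total contributions."""
    return sum(map(sum, dists)) / sum(map(sum, contribs))


def pooled_irr(times, contribs, dists, lo=-0.99, hi=10.0, tol=1e-12):
    """Rate r at which the pooled net cash flows have zero present value."""
    # Net cash flow of the whole pool at each grid point t_j.
    net = [sum(d[j] - c[j] for c, d in zip(contribs, dists))
           for j in range(len(times))]

    def npv(rate):
        return sum(cf / (1.0 + rate) ** t for cf, t in zip(net, times))

    if npv(lo) * npv(hi) > 0:
        raise ValueError("NPV does not change sign on the bracket [lo, hi]")
    # Plain bisection: keep the half-interval on which the NPV changes sign.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if npv(lo) * npv(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


times = [0.0, 1.0, 2.0]                 # years since the first cash flow
contribs = [[100.0, 0.0, 0.0],          # fund 1 contributions c_1(t_j)
            [100.0, 0.0, 0.0]]          # fund 2 contributions c_2(t_j)
dists = [[0.0, 0.0, 121.0],             # fund 1 distributions d_1(t_j)
         [0.0, 110.0, 0.0]]             # fund 2 distributions d_2(t_j)

market_moic = pooled_moic(contribs, dists)
market_irr = pooled_irr(times, contribs, dists)
print(f"MOIC = {market_moic:.3f}x, IRR = {market_irr:.1%}")
# MOIC = 1.155x, IRR = 10.0%
```

Both toy funds earn 10% per year, so the pooled IRR is also 10%. Bisection is used here only to keep the sketch dependency-free; any root finder would do.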

## TIMING ALPHA

The calculation of the timing alpha seeks to answer the question: Did the portfolio invest more money in better-performing vintages? To assess this, we take the portfolio investments made in each vintage and invest them into a cross-section of funds available in the broad market that year. This eliminates performance effects from the selected set of funds while evaluating the relative over- or under-investment into each vintage. To do this, we first compute the fraction of capital invested in the portfolio each vintage year as:

$$w^{p}_{\nu} = \frac{\sum_{i=1}^{N}\sum_{j=1}^{T} c_i(t_j)\,\mathbb{1}\{\nu_i=\nu\}}{\sum_{i=1}^{N}\sum_{j=1}^{T} c_i(t_j)}$$

for each $\nu \in V$. The numerator is the total capital the portfolio invested in vintage $\nu$. The denominator is the total capital the portfolio deployed across all funds and vintages. Next, we invest in the broad market but commit $w^{p}_{\nu}$ dollars to each vintage. Then the new portfolio will have a cross-section of all funds in each vintage. Note, however, that a vintage that experienced large fundraising in the general market may receive a small commitment if $w^{p}_{\nu}$ is small.

Concretely, we adjust all cash flows in the market as follows:

$$\tilde c_i(t_j) = c_i(t_j)\,\frac{w^{p}_{\nu_i}}{\sum_{k=1}^{M}\sum_{l=1}^{T} c_k(t_l)\,\mathbb{1}\{\nu_k=\nu_i\}}$$

for $i = 1, \dots, M$ and $j = 1, \dots, T$. The denominator is total contributions to vintage $\nu_i$ across the market, so the adjusted contributions of that vintage sum to $w^{p}_{\nu_i}$. Similarly, for $\tilde d_i(t_j)$. This amounts to committing $w^{p}_{\nu}$ dollars, *pro rata*, to all vintage $\nu$ funds (more of $w^{p}_{\nu}$ goes to larger funds, less to smaller funds).

The performance of this portfolio, $r_v$, in terms of IRR is obtained by solving for $r_v$ in:

$$\sum_{i=1}^{M}\sum_{j=1}^{T}\frac{\tilde d_i(t_j)-\tilde c_i(t_j)}{(1+r_v)^{t_j}} = 0.$$

In terms of MOIC, it is:

$$r_v = \frac{\sum_{i=1}^{M}\sum_{j=1}^{T} \tilde d_i(t_j)}{\sum_{i=1}^{M}\sum_{j=1}^{T} \tilde c_i(t_j)}.$$

The additional performance, $\alpha_v$, from over-weighting and under-weighting certain vintages is then:

$$\alpha_v = r_v - r_m.$$
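The timing calculation can be sketched in Python for the MOIC case, where only each fund's total contributions and distributions are needed. This is a minimal illustration under stated assumptions: the function name, the tuple layout, and the toy figures are hypothetical, and the rescaling uses total market contributions per vintage as the pro rata denominator.

```python
from collections import defaultdict


def timing_alpha_moic(market, portfolio):
    """MOIC-based timing alpha.

    market:    list of (vintage, total_contributions, total_distributions),
               one entry per market fund.
    portfolio: list of (vintage, contributions) for the portfolio's holdings.
    """
    # Portfolio vintage weights: share of portfolio capital in each vintage.
    port_by_vintage = defaultdict(float)
    for vintage, contrib in portfolio:
        port_by_vintage[vintage] += contrib
    port_total = sum(port_by_vintage.values())
    weight = {v: c / port_total for v, c in port_by_vintage.items()}

    # Total market contributions per vintage (the pro rata denominator).
    mkt_by_vintage = defaultdict(float)
    for vintage, contrib, _ in market:
        mkt_by_vintage[vintage] += contrib

    # Rescale every market fund so that vintage v receives weight[v] dollars.
    adj_c = adj_d = 0.0
    for vintage, contrib, dist in market:
        scale = weight.get(vintage, 0.0) / mkt_by_vintage[vintage]
        adj_c += contrib * scale
        adj_d += dist * scale

    r_v = adj_d / adj_c
    r_m = sum(d for _, _, d in market) / sum(c for _, c, _ in market)
    return r_v - r_m


# Toy market: vintage 2012 returns 1.5x, vintage 2014 returns 2.0x.
market = [(2012, 100.0, 150.0), (2012, 300.0, 450.0),
          (2014, 200.0, 500.0), (2014, 200.0, 300.0)]
# The portfolio puts 25% of its capital in 2012 and 75% in 2014.
portfolio = [(2012, 25.0), (2014, 50.0), (2014, 25.0)]

alpha_v = timing_alpha_moic(market, portfolio)
print(f"{alpha_v:+.3f}x")  # +0.125x
```

In this toy case the market splits capital evenly across the two vintages (1.75x overall), while the portfolio tilts toward the stronger 2014 vintage (1.875x), giving a timing alpha of +0.125x regardless of which 2014 funds it actually picked.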

## STRATEGY & GEOGRAPHY SELECTION

For strategy and geography selection, the approach is the same. For simplicity, we only show the calculations for strategy selection. Here, the question is: Did the portfolio invest in better performing strategies (e.g., relatively more in venture if venture had higher returns)? For each vintage $\nu \in V$, we compute the *market vintage weight:*

$$w^{m}_{\nu} = \frac{\sum_{i=1}^{M}\sum_{j=1}^{T} c_i(t_j)\,\mathbb{1}\{\nu_i=\nu\}}{\sum_{i=1}^{M}\sum_{j=1}^{T} c_i(t_j)}$$

This is the percent of investments that are linked to vintage $\nu$ and can be thought of as the level of fundraising for that vintage. For each strategy $\sigma \in S$ in the portfolio, we calculate the *portfolio strategy-vintage weight*, the fraction of capital invested in each strategy for that vintage, as:

$$w^{p}_{\sigma,\nu} = \frac{\sum_{i=1}^{N}\sum_{j=1}^{T} c_i(t_j)\,\mathbb{1}\{s_i=\sigma,\ \nu_i=\nu\}}{\sum_{i=1}^{N}\sum_{j=1}^{T} c_i(t_j)\,\mathbb{1}\{\nu_i=\nu\}}$$

The numerator is the total portfolio contributions in strategy $\sigma$ and vintage $\nu$, and the denominator is total portfolio contributions in vintage $\nu$. This is the fraction of contributions split across strategies for a given vintage. To examine the effect of strategy selection alone, we keep the commitment sizing of the general market, but the strategy split of the portfolio. This means we need to invest $w^{m}_{\nu}\,w^{p}_{\sigma,\nu}$ to each vintage and strategy. Then we adjust all cashflows in the market to be:

$$\hat c_i(t_j) = c_i(t_j)\,\frac{w^{m}_{\nu_i}\,w^{p}_{s_i,\nu_i}}{\sum_{k=1}^{M}\sum_{l=1}^{T} c_k(t_l)\,\mathbb{1}\{s_k=s_i,\ \nu_k=\nu_i\}}$$

for $i = 1, \dots, M$ and $j = 1, \dots, T$. Similarly, for $\hat d_i(t_j)$. This amounts to committing $w^{m}_{\nu}\,w^{p}_{\sigma,\nu}$ dollars, *pro rata*, to all strategy $\sigma$ funds of vintage $\nu$.

The performance of this portfolio, $r_s$, in terms of IRR is obtained by solving for $r_s$ in:

$$\sum_{i=1}^{M}\sum_{j=1}^{T}\frac{\hat d_i(t_j)-\hat c_i(t_j)}{(1+r_s)^{t_j}} = 0.$$

In terms of MOIC, it is:

$$r_s = \frac{\sum_{i=1}^{M}\sum_{j=1}^{T} \hat d_i(t_j)}{\sum_{i=1}^{M}\sum_{j=1}^{T} \hat c_i(t_j)}.$$

The additional performance, $\alpha_s$, from over-weighting and under-weighting certain strategies is therefore:

$$\alpha_s = r_s - r_m.$$

To calculate additional performance $\alpha_g$ from over/under-weighting certain geographies and vintages, we can replace $s$ with $g \in G$ in the above so that:

$$\alpha_g = r_g - r_m.$$

We note that the commitment amount in a given year is proportional to the total value of funds in that vintage year. If, instead, the benchmark is a fixed commitment amount each vintage year, these terms must be adjusted to ensure that the impact of investing in a given vintage is not double counted.
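The strategy (or geography) selection step can be sketched the same way for MOICs. The Python below is a minimal illustration, not a definitive implementation: the function name, tuple layout, and toy numbers are hypothetical, and the sketch assumes the portfolio holds at least one fund in every market vintage (it raises an error otherwise, since the portfolio's split for that vintage would be undefined).

```python
from collections import defaultdict


def selection_alpha_moic(market, portfolio):
    """MOIC-based strategy (or geography) selection alpha.

    market:    list of (vintage, label, contributions, distributions).
    portfolio: list of (vintage, label, contributions).
    `label` is the fund's strategy (or its geography).
    """
    mkt_total = sum(c for _, _, c, _ in market)
    mkt_vintage = defaultdict(float)   # market contributions per vintage
    mkt_cell = defaultdict(float)      # ... per (vintage, label) cell
    for v, s, c, _ in market:
        mkt_vintage[v] += c
        mkt_cell[(v, s)] += c

    port_vintage = defaultdict(float)  # portfolio contributions per vintage
    port_cell = defaultdict(float)     # ... per (vintage, label) cell
    for v, s, c in portfolio:
        port_vintage[v] += c
        port_cell[(v, s)] += c

    missing = set(mkt_vintage) - set(port_vintage)
    if missing:
        raise ValueError(f"portfolio holds no funds in vintages {sorted(missing)}")

    adj_c = adj_d = 0.0
    for v, s, c, d in market:
        w_market_vintage = mkt_vintage[v] / mkt_total               # market timing
        w_port_split = port_cell.get((v, s), 0.0) / port_vintage[v]  # portfolio split
        scale = w_market_vintage * w_port_split / mkt_cell[(v, s)]
        adj_c += c * scale
        adj_d += d * scale

    r_s = adj_d / adj_c
    r_m = sum(d for _, _, _, d in market) / mkt_total
    return r_s - r_m


market = [(2012, "Buyout", 100.0, 150.0), (2012, "Venture", 100.0, 250.0),
          (2014, "Buyout", 200.0, 300.0), (2014, "Venture", 200.0, 400.0)]
portfolio = [(2012, "Venture", 10.0),
             (2014, "Buyout", 30.0), (2014, "Venture", 10.0)]

alpha_s = selection_alpha_moic(market, portfolio)
print(f"{alpha_s:+.4f}x")  # +0.0833x
```

Because vintage weights are taken from the market and only the within-vintage split comes from the portfolio, the resulting alpha is not contaminated by the portfolio's own vintage tilt. Passing geography labels instead of strategy labels gives the geographic selection alpha.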

## COMMITMENT SIZING

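Calculating an alpha for commitment sizing lets us answer the question: Did the portfolio commit relatively more assets to better performing funds? To assess this, we compare portfolio return, $r_p$, with the performance of a portfolio that makes equally sized commitments to all funds. We normalize cash flows by their associated fund sizes (assuming the commitment is 100% drawn) so that:

$$\bar c_i(t_j) = \frac{c_i(t_j)}{\sum_{l=1}^{T} c_i(t_l)}, \qquad \bar d_i(t_j) = \frac{d_i(t_j)}{\sum_{l=1}^{T} c_i(t_l)}$$

for $i = 1, \dots, N$ and $j = 1, \dots, T$. Then the performance of this equal-sized portfolio, $r_{eq}$, in terms of IRR is obtained by solving for $r_{eq}$ in:

$$\sum_{i=1}^{N}\sum_{j=1}^{T}\frac{\bar d_i(t_j)-\bar c_i(t_j)}{(1+r_{eq})^{t_j}} = 0.$$

In terms of MOIC, it is:

$$r_{eq} = \frac{\sum_{i=1}^{N}\sum_{j=1}^{T} \bar d_i(t_j)}{\sum_{i=1}^{N}\sum_{j=1}^{T} \bar c_i(t_j)}.$$

The additional performance, $\alpha_c$, from investing more or less in strong performing funds is:

$$\alpha_c = r_p - r_{eq}.$$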
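For MOICs, the sizing comparison reduces to the pooled portfolio MOIC minus the simple average of fund-level MOICs. The Python sketch below illustrates this under the assumption that each fund's size equals the capital it actually drew; the function name and toy figures are hypothetical.

```python
def sizing_alpha_moic(portfolio):
    """MOIC-based commitment sizing alpha.

    portfolio: list of (contributions, distributions), one entry per fund held.
    """
    # Actual portfolio: cash flows pooled at their true sizes.
    r_p = sum(d for _, d in portfolio) / sum(c for c, _ in portfolio)
    # Normalizing by capital drawn gives every fund one unit of contributions,
    # so the equal-sized MOIC is the plain average of fund-level MOICs.
    r_eq = sum(d / c for c, d in portfolio) / len(portfolio)
    return r_p - r_eq


# A 2.0x fund held at three times the size of a 1.0x fund.
portfolio = [(300.0, 600.0), (100.0, 100.0)]
alpha_c = sizing_alpha_moic(portfolio)
print(f"{alpha_c:+.2f}x")  # +0.25x
```

Here the pooled MOIC is 1.75x against an equal-sized 1.50x, so overweighting the stronger fund added 0.25x. Reversing the two commitment sizes would flip the sign.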

## SELECTION AND RESIDUAL FACTORS

The alphas calculated above will not capture all the differences in performance between the portfolio and the market. Consequently, we define an additional category that serves as a residual component and bridges the performance between the portfolio’s actual performance and that which can be attributed to the previous components. Specifically, we define:

$$\alpha_\varepsilon = r_p - r_m - \alpha_v - \alpha_s - \alpha_g - \alpha_c.$$

This term can be considered at least partially a fund selection component insofar as the performance of the other characteristics has been accounted for. However, this component would also include any other factors not considered as well as interaction effects among the factors considered above.

## SIMULATION EXPERIMENT

We now run a set of Monte Carlo simulations that evaluate randomly sampled portfolios from a simulated market. The cash flows in this market are stylized versions of actual fund cash flows but are randomly generated with known distributions. This process provides confirmation that the methodology isolates the attributes we seek to measure. In short, we create a fictitious universe of 2,400 funds for which we know the average return differences by strategy, geography, vintage year, etc. We then create portfolios that are weighted toward these factors and confirm that our methodology recovers the appropriate return attributes that we have hard-wired into the portfolios. This sort of simulation analysis is useful because we do not have analytical solutions for the attribution factors that can be used to check for the consistency of estimates.

We define our simulated fund universe in the following manner: We assume there are *M* = 2,400 funds over 20 vintage years. Each fund is in existence for 60 quarters (or 15 years). Half of the funds are buyout funds with relatively low average returns and risk (i.e., dispersion in cash flows), and the other half are venture capital funds with relatively high average returns and risk. Likewise, half of the buyout and venture capital funds are from one geography (i.e., North America) with relatively high average returns. The other half is from another geography (i.e., the rest of the world, or ROW) with relatively low average returns. We also create good (i.e., odd) and bad (i.e., even) vintage years with differential average returns. Finally, half of the funds have inherently better returns on average (i.e., are “good” funds), so there exists an opportunity for fund selection (holding other attributes the same). All funds are of unit size (1.0), but we vary commitment size in our simulations as needed to examine sizing effects. Every attribute is a random variable with a known (uniform) distribution, so no two simulated funds have the same cash flows. Thus, it is possible, for example, to have a “bad” ROW venture capital fund underperform a “good” North American buyout fund.

Once this market of *M* funds has been established, the portfolio is defined as a subset of *N* funds from this market (where *N* < *M*). Portfolios are created either completely randomly or by deliberately oversampling certain types of funds. We calculate return attributions for both MOIC- and IRR-based measures for each portfolio. We then repeat the random portfolio creation and return attribution calculation process several times to make sure attributes converge to their specified values. Our analysis primarily examines portfolios with an average of ten randomly selected funds each vintage year, but the method’s validity is not sensitive to the average number of funds per vintage in the portfolio.

## BASE CASE: RANDOM SAMPLING

Our first set of tests shows the results from completely randomly sampling ten funds per vintage. Because there is no bias in fund selection to any particular attribute, all attribution returns should converge to zero with a sufficiently large number of simulation trials. Exhibit 2 shows the results of this analysis for MOICs. The market MOIC is 2.50 by definition. For low numbers of simulations, some of the attribute returns are not zero; however, all attribute returns are less than 0.01 when we average across 512 or more simulations, suggesting that convergence of the simulation results will be fairly efficient and unbiased for MOICs.

We repeat the calculations using IRRs (in annual return percentages) as the return metric instead of MOICs and report the results in Exhibit 3. Our simulated market has an IRR of 25.0 percent. For low numbers of simulations, some of the attributes are different from 0.0 percent. For simulations with 512 or more trials, only the fund selection attribute has a value not equal to 0.0 percent (rounded). This result indicates that IRRs also converge fairly quickly, but the persistent, albeit small, bias on fund selection suggests that IRR-based measures may not be quite as accurate as MOIC-based measures. We return to this topic when examining the real data. We subsequently report results primarily for 1,024 trials because the simulations appear to converge at around 512 trials.

## BASE CASE: CONVERGENCE TEST

To further assess the precision of our estimates from the simulation analysis, we also alter the test described above to select a different number of funds per vintage. In general, when the number of funds per vintage is higher, the portfolio will converge to the market in a smaller number of simulations. We examine cases with 2, 5, 10, 20, and 25 funds per vintage. To test the rate of convergence, we compute the following metric for different numbers of simulations where each variable is the MOIC attribution (e.g., values shown in Exhibit 2):

We would expect this metric to eventually converge to zero in the case of random sampling. The results are displayed below in Exhibit 4 for 500, 1,000, and 1,500 simulations.

These values suggest that sampling of 10 funds per year with about 1,000 simulations is sufficiently close to the underlying market to have attributes round accurately within 0.01 for MOICs.

## SAMPLING SPECIFIC CLASSES OF FUNDS IN EACH PORTFOLIO

We now examine whether our method recovers the assumed return attributes when we select portfolios based on those attributes. Exhibits 5 and 6 summarize results from these experiments. The first two rows of Exhibits 5 and 6 show results when the portfolio is invested in “good” or “bad” vintages. By construction, these funds have MOICs that are 0.40 greater or less than the overall market. The results indicate that the timing MOIC attributions are ±0.40, and all other attributes are zero. For IRRs, the timing attribution is 5.4%, and all other attributes are small (sizing is −0.2 percent and fund selection is an offsetting +0.2 percent). This indicates that the methodology is very accurate for MOICs and slightly biased for IRRs.

The next two rows of Exhibits 5 and 6 show what happens when we select only venture capital funds or only buyout funds, which have, by construction, average MOICs that are 0.5 above and 0.5 below the market as a whole, respectively. Results indicate that the strategy selection MOIC is 0.50 for venture funds and other attributes are close to zero. For buyout funds, the strategy selection MOIC is −0.49, which is very close to the assumed value with the 0.01 difference spread across other attributes. When we evaluate IRRs, the strategy selection attribute has a value of 6.7% for venture funds and −7.2% for buyouts with other attributes contributing 0.2% or less in magnitude. As with the timing analysis, the strategy selection results for IRR are close to the assumed values with net errors below ±0.2%.

Exhibits 5 and 6 also show results when we select only North American funds or ROW funds with MOICs that are ±0.3 relative to the market, respectively. We find, as expected, that the Geography (Geo) Selection attribute has MOIC values of almost exactly ±0.30, and other attributes have values within 0.01 of zero. When we examine IRRs, we find that the Geographic Selection attribute has a value of 3.7% for North American funds and −4.2% for ROW funds. Other attributes have magnitudes of 0.1 percent or less for North American Funds, but are somewhat larger in magnitude (up to 0.4%) for ROW funds.

The next sections of Exhibits 5 and 6 show results when commitment sizing is adjusted to overweight funds with higher returns so that MOICs would be an expected 0.20 higher. We find that the commitment sizing attribute has a MOIC contribution of 0.21 and other attributes are zero. Regarding IRR, the commitment sizing attribute is 3.0%, and other attributes have magnitudes of 0.1 percent or less. Finally, the last sections of Exhibits 5 and 6 show results when portfolios only include “good” or “bad” funds with MOICs that are on average ±0.3 relative to the market. The results show that the fund selection attributes are ±0.3 and all other attributes are zero. When we examine IRRs, the “good” (“bad”) fund selection attribute is 3.9% (−4.0%), and other attributes are 0.3 percent in magnitude or smaller.

Overall, the results in Exhibits 5 and 6 confirm that our attribution model recovers the attributes in our simulated portfolios. The precision and accuracy are both very high for MOIC-based performance attribution. The IRR-based measures appear to be accurate but not quite as precise, showing some deviations on the order of a few tenths of a percent for portfolios with an average of ten funds per vintage year over our 35-year simulation period.

## BOOTSTRAPPED CONFIDENCE LEVELS

The simulation results in the previous section provide us with a high degree of confidence that the methodology works as designed. However, any portfolio smaller than the market will have variation in performance attribution that is due to chance as well as skill. Unlike performance attribution in traditional markets that can easily generate the statistical significance of performance attributions (e.g., using regression analysis and well-known statistical properties of least squares estimates), there is no easy way to obtain the statistical properties of our private fund performance attributes. Consequently, this section undertakes another large simulation analysis using actual fund data from the Burgiss Manager Universe to generate bootstrapped confidence intervals for each attribute. We do this for both MOIC-based and IRR-based estimates using reasonable assumptions on portfolio allocations (described below). Because of the well-known differences in risk and the findings shown above, we conduct the bootstrap simulations for buyout and venture capital funds separately, as well as for a portfolio that has a mix of buyout and venture capital funds.

Specifically, we utilize fund cash flow data from the Burgiss Manager Universe that covers more than 6,000 buyout and venture funds with vintage years from 1987 to 2018. Burgiss gathers data from the financial transactions of LPs and reports of GPs provided to the LPs who are Burgiss clients. Because Burgiss observes all cash flows and reports, there are no selection biases or missing values in the data. Additional details on the Burgiss data are available in Brown et al. (2015).

## ALL FUNDS

To generate bootstrapped confidence intervals, we generate 2,500 random portfolios across strategy and geography and vary the timing and sizing of commitments. For strategy and geography, we use the full sample in our base case; we select ten funds per vintage year and sample with replacement for all the funds associated with a given vintage. We invest an average of $100 per year on a fund-size-weighted basis across selected funds. To generate randomness in our sizing of positions, we over- or under-weight each fund commitment using a randomly generated value between 0.5 and 1.5 (from a uniform distribution). We use the same fund-specific value to scale all the cash flows in that fund. To generate randomness in the timing of commitments, we over- or under-weight the value of vintage year commitments by selecting a value using randomly generated values between $50 and $150 (from a uniform distribution). While these ranges are admittedly arbitrary, they are consistent with the ranges discussed in Brown et al. (2020).
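One way to read the portfolio-generation step is sketched below in Python. This is an interpretation under stated assumptions, not the authors' code: the function name, the `(fund_id, fund_size)` layout, and the toy universe are hypothetical, and the sketch assumes each vintage's randomly drawn budget is split across the ten sampled funds in proportion to fund size times the random sizing multiplier.

```python
import random


def draw_bootstrap_portfolio(funds_by_vintage, rng, funds_per_vintage=10):
    """Draw one random portfolio as a list of (vintage, fund_id, commitment).

    funds_by_vintage: dict mapping vintage -> list of (fund_id, fund_size).
    """
    portfolio = []
    for vintage, funds in sorted(funds_by_vintage.items()):
        # Sample funds for this vintage with replacement.
        picks = rng.choices(funds, k=funds_per_vintage)
        # Timing randomness: vintage budget uniform on [$50, $150].
        budget = rng.uniform(50.0, 150.0)
        # Sizing randomness: fund-size weight times a uniform [0.5, 1.5] multiplier.
        raw = [size * rng.uniform(0.5, 1.5) for _, size in picks]
        total = sum(raw)
        for (fund_id, _), weight in zip(picks, raw):
            portfolio.append((vintage, fund_id, budget * weight / total))
    return portfolio


# Hypothetical toy universe: three vintages with fifteen funds each.
funds_by_vintage = {
    v: [(f"{v}-{k}", 100.0 + 50.0 * k) for k in range(15)]
    for v in (2000, 2001, 2002)
}

rng = random.Random(42)  # fixed seed so the draw is reproducible
portfolios = [draw_bootstrap_portfolio(funds_by_vintage, rng) for _ in range(2500)]
print(len(portfolios), len(portfolios[0]))
```

Each resulting commitment would then scale all cash flows of the corresponding fund before the attribution is run, and the distribution of each alpha across the 2,500 draws supplies the confidence intervals.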

Exhibit 7 shows results for combined portfolios of buyout funds and venture capital funds. We report the average and standard deviation for return attributes as well as various percentiles of performance factors. Panel A shows MOIC-based results. As expected, the mean attribution results are equal to or close to zero for all attributes. Only commitment sizing has a value that does not round to 0.00. The range of bootstrapped portfolio outcomes is reasonably wide, with 90% of outcomes falling between 1.49 (5th percentile) and 1.86 (95th percentile), and the standard deviation of overall portfolio returns is 0.11. This gives us a feel for (and some comfort interpreting) the confidence intervals for individual attributes.

Looking more closely at the attribute results, the standard deviations vary from a low of 0.02 for timing to 0.12 for fund selection. So, for example, a low level (say 0.05) of timing attribution is likely the result of skill (i.e., is around the top 2.5% of simulations). In contrast, the same amount of performance attribution in fund selection is more likely due to chance (i.e., inside the interquartile range). The various percentiles of the distributions of each MOIC-based measure provide more precise estimates of the reliability of each performance measure. Overall, the confidence intervals are symmetric around zero for MOICs.
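To make the comparison against the simulated distribution concrete, the following sketch shows one way to read off a percentile-based interval and the empirical rank of an observed attribution. The function names and the linear-interpolation percentile rule are assumptions chosen for illustration; the exhibits may use a different percentile convention.

```python
def percentile(sorted_vals, q):
    """Percentile q (0 to 100) of pre-sorted data, by linear interpolation."""
    if not sorted_vals:
        raise ValueError("need at least one value")
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1.0 - frac) + sorted_vals[hi] * frac

def bootstrap_interval(draws, level=0.90):
    """Two-sided percentile interval from simulated attribution values."""
    ordered = sorted(draws)
    tail = (1.0 - level) / 2.0 * 100.0
    return percentile(ordered, tail), percentile(ordered, 100.0 - tail)

def empirical_rank(draws, observed):
    """Share of simulated attributions strictly below the observed value."""
    return sum(d < observed for d in draws) / len(draws)
```

An observed attribution whose `empirical_rank` is near 0.975 or above sits in roughly the top 2.5% of simulated outcomes, whereas a rank between 0.25 and 0.75 falls inside the interquartile range.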

Exhibit 7 Panel B shows IRR-based results for the same set of simulations. Here again, most performance factor attributions are small; however, the values are not as close to zero as for MOICs (relative to the standard deviation of portfolio returns). For example, strategy selection has an average value of 1.0% and a 90% confidence interval of −1.8% to +4.1%. As we show next, the IRR-based measure exhibits significant skewness that is particularly pronounced for venture capital funds, which makes IRR-based inference about strategy selection less reliable when venture funds are involved. Other attributes have confidence intervals that are more symmetric around zero. Overall, the results in Exhibit 7 suggest that it is possible to identify reasonable confidence intervals from real data for the various performance attributes. Yet, the confidence intervals depend substantially on the individual attribute and the measure (MOIC-based or IRR-based).
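For readers less familiar with the two measures, the sketch below computes a MOIC and an IRR from a single stream of net cash flows. It rests on simplifying assumptions that are not taken from the study: one cash flow per year, contributions recorded as negative values, any residual NAV treated as a final distribution, and an IRR located by bisection within a fixed bracket.

```python
def moic(cash_flows):
    """Total distributions divided by total contributions.

    Contributions are negative entries and distributions are positive.
    """
    paid_in = -sum(cf for cf in cash_flows if cf < 0)
    distributed = sum(cf for cf in cash_flows if cf > 0)
    return distributed / paid_in

def irr(cash_flows, lo=-0.99, hi=10.0, tol=1e-10):
    """Annual IRR by bisection, assuming one cash flow per year."""
    def npv(rate):
        return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))

    if npv(lo) * npv(hi) > 0:
        raise ValueError("no sign change in NPV over the bracket")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        # Keep the half of the bracket that still contains the root.
        if npv(lo) * npv(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2.0
```

Scaling every cash flow of a fund by the same positive factor leaves both measures unchanged for that fund, so a sizing multiplier affects portfolio results only through the fund's weight in the pooled cash flows.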

The results in Exhibit 7 are worth reflecting on further. While the previous section showed that the model provides estimates as expected, it does not mean that this is the “true” model for performance attribution. In other words, there may be other important factors that performance attribution should consider but that we do not account for here. The relatively high standard deviation of the residual (Fund Selection) component could signify that additional factors are needed in the model. If so, our estimates present too high a bar for identifying fund selection skill: if omitted factors exist and were included in the analysis, the standard deviation of the residual term would decline. On the other hand, private fund returns have always been characterized by greater cross-sectional variation than diversified (long-only) public funds such as mutual funds. Consequently, it is intuitive that there should be more unexplained variation and that, therefore, the threshold for identifying skilled fund selection is truly high. One way to perhaps achieve a clearer view of skill is to limit the market comparison set to funds with more similar strategies, and we do this next by analyzing buyout and venture funds separately.

Another concern is that attribution factors are correlated, and it is difficult to isolate them empirically (which might also result in a high standard deviation of the residual). To gain insight into this possibility, we estimate the correlations across attributions for our bootstrapped portfolios. High correlations could indicate an identification problem. However, we find that none of the pairwise Pearson correlations are statistically different from zero, which suggests individual attribution identification is not difficult in the actual data.
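A correlation check of this kind could be sketched as follows. The input layout (a dict mapping each attribute name to its values across simulated portfolios) and the function names are assumptions, and the p-value uses a large-sample normal approximation to the usual t-statistic, which is reasonable for samples on the order of 2,500 portfolios but is not necessarily the test used in the study.

```python
import math
from itertools import combinations
from statistics import NormalDist

def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_table(attributions):
    """Map each attribute pair to (correlation, approximate two-sided p-value)."""
    table = {}
    for a, b in combinations(attributions, 2):
        x, y = attributions[a], attributions[b]
        r = pearson(x, y)
        denom = 1.0 - r * r
        if denom <= 0.0:
            p = 0.0  # perfectly correlated columns
        else:
            # t-statistic for H0: rho = 0, with a normal approximation.
            t = r * math.sqrt((len(x) - 2) / denom)
            p = 2.0 * (1.0 - NormalDist().cdf(abs(t)))
        table[(a, b)] = (r, p)
    return table
```

Pairs with small p-values would flag attributes that tend to move together across simulated portfolios and may therefore be hard to separate empirically.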

Given the large differences between the return distributions of buyout and venture capital funds, and the potential bias documented above, we now turn to analyze these fund types separately. We repeat the same simulation as described above separately for venture and buyout funds. We run the experiment 2,500 times and allocate dollars in the same fashion as was done for the full market.

## BUYOUT FUNDS ONLY

Exhibit 8 shows results from a simulation with only buyout funds. Panel A shows values for simulated MOICs. MOIC attributes are zero on average except for timing, which has a value of 0.01. In each case, the estimated standard deviations are much smaller for buyouts than for the full sample including buyout and venture capital funds. Fund selection still has the largest standard deviation (0.07), suggesting selection has the highest hurdle for differentiating skill from luck. Panel B shows IRRs are also better behaved for portfolios with just buyout funds than for mixed portfolios. The largest (in magnitude) average attribution is 0.4% for fund selection, and standard deviations of IRR attributes are about half those of the mixed portfolios.

## VENTURE CAPITAL FUNDS ONLY

Exhibit 9 shows results from a simulation with only venture funds. Panel A shows MOICs. Average attributes are close to zero, as was the case for the other portfolios. However, standard deviations are much larger than for the portfolios with just buyout funds or a mix of funds. This is as expected given the much greater range of venture capital fund returns. The results suggest that even a fund selection attribution of 0.20 MOIC in a venture portfolio could not be reliably attributed to skill over luck (i.e., 0.2 is inside the 90% confidence interval). Panel B shows that IRRs also have large confidence intervals for portfolio attribution characteristics, suggesting that the wide range of performance in venture funds means that it is difficult to differentiate skill from luck.

## CONCLUSIONS

This study proposes a method for performing attribution analysis on a portfolio of closed-end drawdown funds such as private equity buyout funds and venture capital funds. Our method is intuitive in isolating performance attributes related to fund strategy, fund geography, commitment timing, and commitment sizing. A residual (unexplained) component can be viewed as a fund selection attribute. However, the residual component also includes performance that cannot be explained by the chosen attributes and interaction effects between attributes, so it should be interpreted with some caution.^{3}

Our method provides attribution magnitudes both in terms of contribution to portfolio investment multiples (MOICs) and contribution to portfolio internal rates of return (IRRs). Consequently, a geography factor attribution of, for example, 0.05 MOIC and 1.6% IRR has a very intuitive interpretation as adding those amounts to overall performance over the period examined. We can also utilize the large historical data set of Burgiss to generate confidence intervals for attribution factors for portfolios of buyout and venture capital funds (as well as portfolios with both types of funds). While these confidence intervals are specific to the periods we use for our analysis, they provide a reasonable gauge for understanding what thresholds could be considered significant skill for each attribute over a fairly long history.

Our bootstrap analysis has the potential to be combined with a database of portfolio holdings for institutional investors to identify what decisions investors are good at making. Analysis of LP portfolios could determine if active management adds value overall, as in Cremers and Petajisto (2009). This could lead to more efficient portfolio management decisions (e.g., where to allocate effort and dollars). For example, a better understanding of investment skills could mitigate underperformance attributed to poor geographic and fund selection by public pension funds, as documented by Hochberg and Rauh (2013). In aggregate, this could result in more efficient capital formation in the broad economy. More generally, we believe that the type of analysis presented here could offer a guide and tool for managers to interpret and understand the landscape of private investments in a broader portfolio. For example, the framework can be extended to include liquid and semi-liquid assets by including those assets in the benchmark and accounting for periodic investment (or continuous re-investment) in these additional “funds.” In this sense, traditional public market performance attribution models can be seen as special cases of our private fund attribution model.

Further research could also generate more modest extensions of our methodology. First, we consider just two geographies and strategies, but it would be straightforward to have more granular regions (e.g., North America, Europe, Asia/Other) and additional strategies (real estate, private credit, infrastructure, etc.). Second, our metrics do not adjust for market returns or risk, so future research could derive metrics for public market equivalents (see, for example, Kaplan and Schoar 2005). Third, our method combines interaction effects between factors into the residual component, so a future analysis could examine how important these effects are for typical portfolios. Fourth, subsequent work could devise an efficient method for empirical bootstrapping of confidence intervals for each factor for a specific benchmark and investment policy regime (for example, researchers could incorporate limits on portfolio weights for strategies and geographies) that would allow gauging statistical reliability across portfolios with different periods and assets.

## ACKNOWLEDGMENTS

The authors thank Burgiss, the Private Equity Research Consortium, and the Institute for Private Capital for the generous support of this project. We also thank Lisa Larsson and an anonymous referee for their valuable comments.

## ENDNOTES

^{1}A couple of exceptions include Ng (2015) and Ott and Pfister (2017). Ott and Pfister present a case study of two North American pension funds and argue for the value of sticking with a predefined asset allocation strategy over tactical decision making. See also Brown et al. (2020) for evidence on tactical allocation to private investments. Ng constructs a model to separate allocation effects and selection effects when looking at fund performance.

^{2}For example, Brown, Gredil, and Kaplan (2019) find that around fundraising for a new fund, NAVs are systematically downwardly biased for top-performing funds and upwardly biased for poor-performing funds.

^{3}For example, in our simulation exercise we find that the fund selection component always has the largest standard deviation and this, in turn, provides a specific “higher hurdle” for identifying statistically reliable selection skill.

- © 2021 Pageant Media Ltd