We Scored Every All-In Podcast Stock Pick Against the S&P 500 — Here's Who's Actually Right

The full besties leaderboard (Chamath, Sacks, Friedberg, Calacanis), ranked by real returns against the S&P 500.

The All-In podcast has spent five years turning private-jet group chats into a public market thesis machine. The besties call stocks, crypto, gold, bonds — "this is a generational buy," "I'm long this," "short that into the ground." The audience nods. Nobody writes it down.

So we wrote it down. Every public-market call we could attribute to a name, scored on a rubric, opened on the date it was made, and run through a portfolio engine against a single, brutally simple benchmark: what if you'd just held the S&P 500 instead?

That's the whole game. The question is not "did the stock go up?" (stocks go up) but whether the call beat the index it was implicitly competing with. Here's what the numbers say.

How we scored it

Every call on the podcast gets the same treatment, so nobody gets graded on vibes:

Attribution. A specific person says a specific thing about a specific ticker on a specific date. No date, no ticker, no score.
Rubric. Each call is scored for conviction and direction (a hedged "I kind of like it" is not a fist-on-the-table buy), then translated into a position — bullish or bearish.
Portfolio engine. The position opens at the call date and runs to its natural close (a year later, or until the thesis is closed out). Capital is sized by conviction.
SPY-as-cash benchmark. This is the part that matters. Money not in a live call sits in SPY, and every position is measured against what SPY did over the exact same window. We call it the "balanced" methodology. A pick that rises in a bull market hasn't beaten the market; a pick that beats SPY over its own holding window has.
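As a rough illustration of the benchmark step, here is a minimal Python sketch of how a single call could be scored against SPY over its own window. It is not our production engine: the `Call` fields, the `vs_spy_pp` name, and above all the treatment of a bearish call as a simple short (position return = minus the asset's return) are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Call:
    """One scored call. All field names are hypothetical."""
    ticker: str
    direction: str      # "bullish" or "bearish"
    entry_price: float  # asset price on the call date
    exit_price: float   # asset price when the position closes
    spy_entry: float    # SPY price on the call date
    spy_exit: float     # SPY price on the same close date


def vs_spy_pp(call: Call) -> float:
    """Outperformance vs SPY over the same window, in percentage points."""
    asset_return = call.exit_price / call.entry_price - 1.0
    spy_return = call.spy_exit / call.spy_entry - 1.0
    # Assumption: a bearish call is modeled as a plain short position.
    if call.direction == "bullish":
        position_return = asset_return
    else:
        position_return = -asset_return
    return (position_return - spy_return) * 100.0


# A stock that went up 5% while SPY went up 10% still loses to the index.
laggard = Call("XYZ", "bullish", 100.0, 105.0, 400.0, 440.0)
print(round(vs_spy_pp(laggard), 1))
```

The laggard example is the case the benchmark exists to catch: the stock rose, yet the call scores about −5 pp because SPY rose more over the same dates.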

The headline number per host is vs-SPY in percentage points (pp): average outperformance, reported both as a simple average and as a dollar-weighted average. Positive means they beat the index. Negative means they'd have been better off just holding SPY.
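To make the two averages concrete, here is a small Python sketch of how a simple average, a dollar-weighted average, and a beat rate could be computed from per-call scores. The function names and the sample host below are hypothetical, and the dollar amounts stand in for whatever conviction-based sizing the engine assigns.

```python
def simple_average(scores_pp):
    """Every call counts equally, regardless of position size."""
    return sum(scores_pp) / len(scores_pp)


def dollar_weighted_average(scores_pp, dollars):
    """Each call counts in proportion to the capital behind it."""
    weighted = sum(s * d for s, d in zip(scores_pp, dollars))
    return weighted / sum(dollars)


def beat_rate(scores_pp):
    """Share of calls that finished ahead of SPY (score above zero)."""
    return sum(1 for s in scores_pp if s > 0) / len(scores_pp)


# Hypothetical host: one small low-conviction winner, two large losers.
scores = [40.0, -10.0, -5.0, 3.0]          # vs-SPY per call, in pp
dollars = [1000.0, 3000.0, 3000.0, 1000.0]  # capital per call

print(simple_average(scores))                    # 7.0
print(dollar_weighted_average(scores, dollars))  # -0.25
print(beat_rate(scores))                         # 0.5
```

In this made-up case the simple average is positive while the dollar-weighted figure is slightly negative, because the big win sat in a small position. That is why both columns appear side by side.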

Full methodology, every cycle, every ticker, live and updating: standardpoorly.com/pundit-poorly/all-in.

The leaderboard

Ranked by simple average outperformance vs SPY (percentage points), for everyone with more than one scored call except Brad Gerstner, whose outlier-driven numbers are discussed below the table. "Beat rate" is the share of individual calls that beat SPY over their window.

Rank	Host	Scored calls	Beat rate	Avg vs SPY (pp)	$-weighted vs SPY (pp)
1	Gavin Baker	5	40%	+15.4	+12.5
2	Ray Dalio	3	33%	+8.0	+7.2
3	David Sacks	9	67%	+6.6	+6.0
4	David Friedberg	16	50%	+5.9	+10.2
5	Bill Ackman	5	40%	−1.1	−0.4
6	Chamath Palihapitiya	26	38%	−1.8	−0.3
7	Jason Calacanis	19	37%	−6.5	−2.6
8	Cathie Wood	3	0%	−30.9	−27.6
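The order depends on which column you sort by. The Python sketch below re-ranks the eight rows above on each metric; the figures are copied from the table, while the `ROWS` layout and the `rank` helper are illustrative names only.

```python
# (host, scored calls, avg vs SPY in pp, dollar-weighted vs SPY in pp)
ROWS = [
    ("Gavin Baker", 5, 15.4, 12.5),
    ("Ray Dalio", 3, 8.0, 7.2),
    ("David Sacks", 9, 6.6, 6.0),
    ("David Friedberg", 16, 5.9, 10.2),
    ("Bill Ackman", 5, -1.1, -0.4),
    ("Chamath Palihapitiya", 26, -1.8, -0.3),
    ("Jason Calacanis", 19, -6.5, -2.6),
    ("Cathie Wood", 3, -30.9, -27.6),
]


def rank(rows, column):
    """Return host names sorted best-first on the given column index."""
    ordered = sorted(rows, key=lambda row: row[column], reverse=True)
    return [row[0] for row in ordered]


by_simple = rank(ROWS, 2)  # the published order
by_dollar = rank(ROWS, 3)  # the same hosts, dollar-weighted

for place, host in enumerate(by_dollar, start=1):
    print(place, host)
```

Sorted by the dollar-weighted column, Friedberg moves from fourth to second and Chamath edges ahead of Ackman; the top and bottom of the table stay put.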

A second tier of single-call guests is too small to rank seriously, but for the record: Kyle Samani (+34.2 pp, one crypto call), Elon Musk (+1.6 pp, one Tesla call), Dan Loeb (−3.3 pp), Marc Benioff (−4.9 pp), Thomas Laffont (−17.3 pp), and Ben Shapiro, whose lone year-long bet on long-dated Treasuries trailed SPY by a punishing 36.4 pp. One call is an anecdote, not a track record; we list them for transparency, not for the standings.

A note on Brad Gerstner: across 10 scored calls his simple average is a gaudy +33.3 pp, pulled way up by a single 2023 NVDA call that beat SPY by 179 pp. His dollar-weighted figure, which is what you'd actually have felt in your account, is a more honest +19.9 pp. We left him out of the main eight only because that NVDA outlier distorts the simple-average ranking so heavily; on a dollar-weighted basis he'd top the board. The lesson is in the gap between those two numbers.
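A back-of-the-envelope check shows how much one call moves a ten-call simple average. The Python sketch below uses only the rounded figures quoted above, so the result is approximate, and the variable names are illustrative.

```python
n_calls = 10
simple_avg_pp = 33.3  # published simple average, rounded
outlier_pp = 179.0    # the 2023 NVDA call, rounded

# Sum of all ten scores implied by the simple average.
total_pp = n_calls * simple_avg_pp

# Simple average of the other nine calls once the outlier is removed.
avg_without_outlier = (total_pp - outlier_pp) / (n_calls - 1)

print(round(avg_without_outlier, 1))  # 17.1
```

On these rounded inputs, the other nine calls average roughly +17 pp, so the NVDA call alone accounts for about 16 pp of the +33.3 pp headline figure.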

What surprised us

Sacks is the quiet winner. No single call detonated like Gerstner's NVDA, but David Sacks beat SPY on 6 of 9 scored calls — a 67% beat rate, the best of anyone with real volume. NFLX (+53.6 pp), NVDA (+43.7 pp), and META (+39.0 pp) did the heavy lifting. Consistency, not fireworks.

Chamath is almost exactly the market. This is the headline most people get wrong. For all the conviction, the most prolific bestie's 26 scored calls land at −1.8 pp simple and essentially flat (−0.3 pp) dollar-weighted. Some monster wins, including a 2021 META bear call that beat SPY by 90 pp and a gold call that beat it by 25.7 pp, get offset by SPCE (−65 pp), Solana (−82 pp), and SHOP (−55 pp). Net result: a very expensive, very loud way to roughly match an index fund.

Gold and China were the besties' best ideas. Several of the strongest non-outlier calls across the whole panel were unglamorous: Friedberg's BABA (+80.5 pp) and gold (+49.9 pp), Dalio's gold (+79.5 pp). The flashy AI-darling calls were a coin flip; the boring macro hedges quietly won.

The index is undefeated as a baseline. Four of the eight ranked names sit below SPY on simple average. That's not a roast; it's the base rate. Beating a roaring index over multi-year windows is genuinely hard, and the people doing it for a living split roughly 50/50.

The caveats (read these)

We're a serious newspaper about an unserious idea, so here's where the data is thin:

Small samples. Half this list has five or fewer scored calls. Gavin Baker's +15.4 pp rests on five calls, two of which (NVDA, SMH) were semis during a semis melt-up. Cathie Wood's −30.9 pp is three calls in one rough window. Don't mistake a small-n number for destiny.
Timing is our interpretation. We open positions at the call date and close them on a rule. Real people enter and exit when they want. A great call held one quarter too long can flip to a loss in this engine.
Survivorship and attribution. We only score calls we can pin to a name, a ticker, and a date. Vague enthusiasm doesn't get logged — which means the record skews toward the calls confident enough to be specific.
Recency. A chunk of 2026 calls are days old as of writing, held for under two weeks. Those move the average around and will keep moving. Treat anything with a four-day holding window as noise, not signal.

The honest read: nobody here is a fraud, and nobody here is an oracle. A couple of besties genuinely beat the market on a real number of calls; most cluster within a few points of just owning SPY; and the single biggest returns came from a handful of outlier bets that are easy to celebrate after the fact and were not obviously right before.

See the receipts

Every call, every cycle, every ticker — scored, dated, and updating — lives here:

→ standardpoorly.com/pundit-poorly/all-in

Argue with the methodology. Check our math. Watch the leaderboard reshuffle the next time someone says "this is a generational buy" on a podcast. That's the point: opinions are free, but they should leave a paper trail.

Not investment advice.