researching alpha in kalshi markets

2026.03.24

kalshi is a regulated us prediction market where you buy and sell yes/no contracts on real-world events. elections, sports games, fed decisions, things like that. every contract eventually settles at $1 or $0, which makes the math clean. every trade has two sides. the maker is the side that was resting on the book waiting for someone to come. the taker is the side that crossed the spread and hit them. if one of those two sides has a structural edge over the other, the trade tape will show it in dollars, because every position here eventually collapses to a real settlement price.

the dataset i’m using is jonathan becker’s prediction-market-analysis archive, which covers about 72M trades and $18B of notional volume, already cleaned and categorized by market type. my own cross-sections on top of it live at github.com/SahilSC/kalshi. the short answer to the question i went in with is that the maker–taker gap is real, and where it concentrates is quite a bit weirder than i expected.

numbers below are in percentage points, which i’ll write as “pp” from here on. a move from a 50% market price to a 52% market price is 2 pp, not 2%. it’s the raw difference between two percentages, and it’s the natural unit for thinking about “how much extra money does one side make, in absolute dollar terms,” when every position eventually resolves at either $1 or $0.
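to make the unit concrete, here is a tiny python sketch of the difference between a move in pp and a move in percent. the function names are made up for illustration.

```python
def pp_move(p_before: float, p_after: float) -> float:
    """Absolute change between two prices/probabilities, in percentage points."""
    return (p_after - p_before) * 100


def relative_move(p_before: float, p_after: float) -> float:
    """Relative change, in percent of the starting price."""
    return (p_after - p_before) / p_before * 100


# the 50% -> 52% example from the text
print(round(pp_move(0.50, 0.52), 2))        # 2.0  (pp)
print(round(relative_move(0.50, 0.52), 2))  # 4.0  (percent)
```

on a contract that settles at $1, a 2 pp edge is 2 cents per contract, which is why pp maps so directly onto dollars here.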

the overall picture

Quarterly maker-taker excess return over time: quarterly maker–taker excess return, pooled across all markets. the gap reaches +2.86 pp in 2024 q4 and keeps climbing into 2025.

averaged across everything, the maker side beats the taker side by somewhere around 2 pp. the real question is where that edge concentrates, because the answer is not “evenly.”
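one way to turn a trade tape into a maker-versus-taker number is sketched below. this is an assumption about the definition, not necessarily how the archive or the charts compute it: pnl is measured per contract against the $1/$0 settlement, fees are ignored, the gap is maker average minus taker average, and the field names (`yes_price`, `taker_side`, `size`, `resolved_yes`) are hypothetical.

```python
def taker_pnl(yes_price: float, taker_side: str, resolved_yes: bool) -> float:
    """Per-contract taker pnl in dollars, ignoring fees (an assumption)."""
    outcome = 1.0 if resolved_yes else 0.0
    edge_on_yes = outcome - yes_price
    # a taker who bought "no" holds the mirror image of the yes position
    return edge_on_yes if taker_side == "yes" else -edge_on_yes


def maker_taker_gap_pp(trades) -> float:
    """Size-weighted maker average pnl minus taker average pnl, in pp."""
    total_size = sum(t["size"] for t in trades)
    taker_avg = sum(
        taker_pnl(t["yes_price"], t["taker_side"], t["resolved_yes"]) * t["size"]
        for t in trades
    ) / total_size
    maker_avg = -taker_avg  # before fees, the maker is the exact other side
    return (maker_avg - taker_avg) * 100


# toy tape, made-up numbers
toy_trades = [
    {"yes_price": 0.60, "taker_side": "yes", "size": 10, "resolved_yes": False},
    {"yes_price": 0.60, "taker_side": "yes", "size": 10, "resolved_yes": True},
    {"yes_price": 0.30, "taker_side": "no", "size": 20, "resolved_yes": False},
]
print(maker_taker_gap_pp(toy_trades))
```

under this definition the gap is twice the maker's own per-contract edge, because the trade is zero-sum before fees. a different convention (for example, reporting only the maker side) would halve the number.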

nfl is the only sports market that is not tight

Calibration scatter plots across six sports — calibration across six sports. a well-calibrated market should sit on the diagonal (a contract priced at 60c should resolve yes 60% of the time). five of the six basically do. nfl shows a visible s-curve dip at 55 to 70c favorites of about 5 to 7 pp, meaning those favorites resolve yes less often than the price implies.

i went in expecting a general sportsbook pattern. the literature on horse racing and traditional sportsbooks has documented favorite–longshot bias forever, and i assumed it would show up across most sports here too. it doesn’t. mlb, nba, tennis, ncaa hoops, and ncaa football all hug the diagonal. the only market with a clear structural mispricing in the favorite-side pocket is nfl, and it’s concentrated tightly enough in that 55 to 70c band to be worth looking at on its own. more on that one another time.

but the markets open sharp

Pre-game calibration of Kalshi sports markets — calibration of *pre-game* prices across the same six sports. for each market, the price plotted on the x-axis is the volume-weighted average of every trade in the window [market close minus five hours, market close minus three hours], which covers the hours before tipoff for most sports. the y-axis is the empirical fraction of those markets that resolved yes. five of the six sports sit almost exactly on the diagonal.

this is where it gets interesting. if nfl has a structural favorite-side mispricing, you’d expect to see it in pre-game prices too. the traders setting those prices have access to the same information (injuries, weather, lineups) that the later-game retail supposedly gets wrong. and yet: nfl pre-game calibration is one of the cleanest on the entire chart. deviations from the diagonal almost never exceed 2 to 3 pp, and the s-curve dip from the previous chart is basically gone.

the implication is pretty concrete. the nfl mispricing develops during the game, not before it. pre-game prices are set by traders who have had hours to stare at the matchup and line it up against external sportsbooks. once the game starts, the tape shifts to reactive retail, and that is where the 55 to 70c favorite-side bleed actually shows up. this is consistent with the hour-of-day chart further down: the sports for which the gap is largest during play are the same ones whose pre-game prices already look efficient.

if you want to reproduce the chart yourself, here are the pieces:

start from jonathan becker’s trade archive (link above). pull the finalized kalshi sports markets only, keyed by market id, with per-trade fields (timestamp, price, side, size) and the market’s final resolution (yes or no).
for each market, take the subset of trades whose timestamp falls in [market_close − 5h, market_close − 3h]. the two-hour window is a rough pre-game proxy: it sits before tipoff for nfl and nba, and mostly before first pitch for mlb. some games (long nfl games, early nba games) will sneak a few in-game trades into the window, which is a real wart worth knowing about.
compute the volume-weighted average price (vwap) in that window per market. this is that market’s one data point on the chart. trades are volume-weighted inside a market, but every market counts once on the chart, so a market with 10 trades contributes the same as a market with 10,000 trades. this is deliberate: weighting markets by trade count would let a few popular contracts dominate.
drop markets that had no trades in the window. the final per-sport sample sizes are: nfl = 66,166, ncaa football = 11,353, nba = 5,271, mlb = 5,121, ncaa basketball = 3,486, nhl = 2,346.
bin the vwap on the x-axis (say, 20 bins of width 5c between $0 and $1). for each bin, plot the empirical fraction of markets in that bin that resolved yes. that is one sport-colored series on the chart. repeat per sport.
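the steps above can be sketched in plain python roughly like this. it is a minimal sketch, not the actual notebook code: the trade fields (`ts`, `yes_price`, `size`) and both function names are hypothetical, and the real archive schema may differ.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def window_vwap(trades, close_time, start_h=5, end_h=3):
    """Volume-weighted yes price over [close - start_h, close - end_h].

    Returns None when the market has no volume in the window,
    so the caller can drop it.
    """
    lo = close_time - timedelta(hours=start_h)
    hi = close_time - timedelta(hours=end_h)
    in_window = [t for t in trades if lo <= t["ts"] <= hi]
    volume = sum(t["size"] for t in in_window)
    if volume == 0:
        return None
    return sum(t["yes_price"] * t["size"] for t in in_window) / volume


def calibration_curve(points, n_bins=20):
    """points: iterable of (vwap, resolved_yes), one pair per market.

    Returns [(bin_center, fraction_resolved_yes, n_markets), ...].
    Each market counts once, regardless of how much it traded.
    """
    counts = defaultdict(lambda: [0, 0])  # bin index -> [n_markets, n_yes]
    for vwap, resolved_yes in points:
        b = min(int(vwap * n_bins), n_bins - 1)  # clamp a price of 1.0 into the top bin
        counts[b][0] += 1
        counts[b][1] += int(resolved_yes)
    return [((b + 0.5) / n_bins, n_yes / n, n)
            for b, (n, n_yes) in sorted(counts.items())]


# toy market, made-up numbers: closes at 23:00, so the window is 18:00 to 20:00
close = datetime(2025, 1, 5, 23, 0)
trades = [
    {"ts": datetime(2025, 1, 5, 18, 30), "yes_price": 0.60, "size": 10},
    {"ts": datetime(2025, 1, 5, 19, 30), "yes_price": 0.70, "size": 30},
    {"ts": datetime(2025, 1, 5, 21, 0), "yes_price": 0.90, "size": 100},  # outside window
]
print(round(window_vwap(trades, close), 3))  # 0.675
print(calibration_curve([(0.62, True), (0.63, False), (0.12, False)]))
# [(0.125, 0.0, 1), (0.625, 0.5, 2)]
```

running `window_vwap` per market, dropping the `None` results, and feeding the surviving `(vwap, resolved_yes)` pairs into `calibration_curve` once per sport gives one series per sport. the third tuple element is the bin's market count, which is worth keeping around because thin bins are noisy.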

the one sport that does not fit the “markets open sharp” story is mlb. two points sit noticeably below the diagonal at roughly 75c and 85c, meaning mlb favorites priced that high only win 70 to 75% of the time. so pre-game mlb favorites look something like 5 to 10 pp overpriced, and it lines up with the sharp pro flow that the hour-of-day chart below picks up at 11:00 et. the best-fitting story is that sharps are actively fading the retail mlb favorite bias in the hours before first pitch.

things this chart does not prove:

the window isn’t truly pre-game for every sport. nfl runs long, nba short, and so a universal 5-hour-to-3-hour window will straddle tipoff differently per sport.
excluding markets with no trades in the window introduces survivorship bias toward liquid contracts. thin markets (pre-season games, minor college matchups, weeknight nhl) are systematically underrepresented here.
nhl looks loose on the chart: multiple points in the 40 to 55c zone fall 5 to 10 pp below the diagonal. that might be a real sharp-vs-retail edge, or it might just be the smallest sample on the plot showing sampling noise. i wouldn’t trade on it without more data.
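a quick way to sanity-check the "is this just noise" question is the binomial standard error of a calibration bin. the sketch below assumes markets in a bin are independent and that the bin's price is the true probability under the null. the per-bin count of 100 is a made-up example, not a number from the dataset.

```python
import math


def bin_standard_error(price: float, n_markets: int) -> float:
    """Standard error of the observed yes-rate if the true rate equals price."""
    return math.sqrt(price * (1 - price) / n_markets)


def within_noise(price: float, observed_rate: float, n_markets: int, z: float = 1.96) -> bool:
    """True if the miss is inside a rough 95% band around the diagonal."""
    return abs(observed_rate - price) <= z * bin_standard_error(price, n_markets)


# hypothetical bin: priced at 50c, 100 markets in it
se_pp = bin_standard_error(0.50, 100) * 100
print(round(se_pp, 1))                  # about 5 pp of standard error
print(within_noise(0.50, 0.43, 100))    # True: a 7 pp miss is ~1.4 standard errors
print(within_noise(0.50, 0.43, 1000))   # False: the same miss with 10x the markets
```

with only a hundred or so markets in a bin, a 5 to 10 pp miss near 50c is one to two standard errors, which is why a small-sample sport can look loose without any real edge behind it.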

the edge moves violently with the clock

Maker-taker gap by hour of day — maker–taker gap by hour of day in eastern time. nba spikes to +9 pp around 01:00 et, which is market close and roughly when retail is cleaning up positions. mlb dips to around -3 pp at 11:00 et, which is the hour with the sharpest pro flow on the tape. gray bars at the bottom are volume.

two things here matter if you’re actually trading this. first, the gap is not stable over time. it breathes with when retail is online, and the aggregate “~2 pp maker advantage” number from the first chart is really a weighted average over some hours where the number is enormous and other hours where it flips sign. second, a lot of the volume, and therefore a lot of the signal, clusters exactly at market close and around game time, which are the hours you care about most if you’re sizing a strategy. the edge lives where the money lives.
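bucketing the gap by hour of day is mostly a timezone exercise. here is a minimal sketch, assuming each trade already carries a timezone-aware timestamp and a per-contract maker-minus-taker pnl. the field names `ts`, `maker_minus_taker`, and `size` are hypothetical. the example uses a fixed utc-4 offset as a stand-in for eastern daylight time; on a machine with the tz database installed, `zoneinfo.ZoneInfo("America/New_York")` would handle daylight saving properly.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def gap_by_hour(trades, tz):
    """Size-weighted maker-minus-taker gap per local hour.

    Returns {hour: (gap_in_pp, total_size)} with hours in the given tz.
    """
    sums = defaultdict(lambda: [0.0, 0.0])  # hour -> [size-weighted pnl, total size]
    for t in trades:
        hour = t["ts"].astimezone(tz).hour
        sums[hour][0] += t["maker_minus_taker"] * t["size"]
        sums[hour][1] += t["size"]
    return {h: (pnl / size * 100, size) for h, (pnl, size) in sorted(sums.items())}


# stand-in for eastern daylight time; it ignores the switch to standard time
edt = timezone(timedelta(hours=-4))

# toy tape with utc timestamps, made-up numbers
toy_tape = [
    {"ts": datetime(2025, 7, 2, 5, 10, tzinfo=timezone.utc), "maker_minus_taker": 0.10, "size": 10},
    {"ts": datetime(2025, 7, 2, 5, 40, tzinfo=timezone.utc), "maker_minus_taker": 0.08, "size": 30},
    {"ts": datetime(2025, 7, 2, 15, 20, tzinfo=timezone.utc), "maker_minus_taker": -0.03, "size": 50},
]
for hour, (gap_pp, volume) in gap_by_hour(toy_tape, edt).items():
    print(hour, round(gap_pp, 2), volume)
```

returning the volume next to the gap matters for exactly the reason above: an hour with a huge gap and almost no volume is not the same thing as an hour with a modest gap and most of the tape.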

one sport stands out, downward

Maker-taker gap by sub-sport, horizontal bars — maker–taker gap broken out by sub-sport. every category shows a positive maker gap except atp tennis, which comes in at -0.09 pp on $226M of volume. nfl games dominate on the other end at +2.54 pp on $1.45B.

atp is the cleanest pro-versus-retail split in the dataset. tennis has had a reputation as a pro-dominated market for about as long as people have been betting on it, and here’s the receipt: the one sub-sport where the makers, who are typically the more sophisticated side, don’t have a meaningful edge. the story i find most plausible is that the retail side of the atp book has mostly been eaten by sharps over the years, so there isn’t much donor flow left for the makers to extract from.

caveats

everything above is historical and in-sample. nothing here is a live recommendation.

the split between “pro-dominated” and “retail-dominated” is a hypothesis i can’t verify from the trade tape alone. i’m inferring it from behavior, which is a little circular if you squint. calibration plots can also hide real noise in their tails, because the extreme bins have many fewer trades than the middle ones do. and the ordering of most of these bars is more stable than the exact numbers, which shift noticeably depending on whether you volume-weight or equal-weight.
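the weighting point is easy to see on toy numbers. the sketch below compares an equal-weighted and a volume-weighted average of per-market gaps. the three markets and their values are invented for illustration.

```python
def equal_weighted_gap(markets) -> float:
    """Plain mean of per-market gaps; every market counts once."""
    return sum(gap for gap, _ in markets) / len(markets)


def volume_weighted_gap(markets) -> float:
    """Mean of per-market gaps weighted by each market's volume."""
    total_volume = sum(volume for _, volume in markets)
    return sum(gap * volume for gap, volume in markets) / total_volume


# (gap in pp, volume) per market, made-up numbers
toy_markets = [(5.0, 100), (1.0, 10_000), (-2.0, 500)]
print(round(equal_weighted_gap(toy_markets), 2))
print(round(volume_weighted_gap(toy_markets), 2))
```

the small market with the big gap pulls the equal-weighted number up, while the volume-weighted number sits close to the one heavily traded market. neither is wrong, they answer different questions.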

if you wanna build a kalshi market-making bot, lmk.