The Scouting Gap: What the NFL Combine Teaches Everyone Who Makes High-Stakes Talent Decisions
The most sophisticated evaluation machine any industry has ever built still gets it wrong about half the time.
The 2026 NFL Scouting Combine is the most intensive talent evaluation process in American professional life. Over the next four months, NFL teams will invest somewhere between 150 and 400 person-hours evaluating each first-round prospect. They’ll still get it wrong about half the time.
The NFL isn’t bad at this; really, nobody does it better. How the NFL built this machine, and where it still breaks down, has something to teach everyone who makes high-stakes talent decisions.

The NFL wasn’t always this obsessive
Before 1982, NFL scouting was a mess. Teams ran their own evaluations independently: scheduling individual visits with prospects, flying scouts around the country, duplicating each other’s work with no shared infrastructure. The closest thing to organization came from three rival scouting cooperatives: LESTO (formed in 1963 by the Lions, Eagles, and Steelers), which became BLESTO when the Bears joined; Troika, launched in 1964 by the Cowboys, Rams, and 49ers (later renamed Quadra when the Saints joined in 1967); and National Football Scouting, Inc. Each ran separate camps, evaluating overlapping pools of players for their member clubs. Redundant, expensive, and inconsistent.
Then Tex Schramm—the Dallas Cowboys president and general manager who’d already pioneered computer-assisted player evaluation—proposed something obvious that nobody had done: put everyone in one room. In 1982, National Football Scouting held the first National Invitational Camp in Tampa, Florida. A total of 163 players showed up. Sixteen teams participated. The primary purpose was basic: share medical information on draft-eligible prospects so every team wasn’t paying for its own set of X-rays.
For the first three years, the rival camps kept running their own events. But in 1985, all 28 NFL teams agreed to merge into a single camp—splitting costs, pooling data, standardizing evaluation. After two years in Tampa, followed by New Orleans, Arizona, and New Orleans again, the Combine settled permanently in Indianapolis in 1987, where it’s been ever since.
What started as a cost-sharing arrangement for medical exams evolved into something no other industry has replicated: a centralized, standardized, four-day evaluation of the top ~300 prospects in a talent pool, attended by every hiring organization in the field simultaneously. Psychological testing, athletic measurables, formal interviews, informal hallway conversations, medical records shared through a unified electronic system. The Combine didn’t just change how NFL teams scout. It created an evaluation infrastructure that makes corporate hiring look like guesswork by comparison.
Which, to be fair, it mostly is.
What 319 prospects just endured
The Combine is just the showcase. The real evaluation started months ago.
Greg Gabriel, who spent nine years as Director of College Scouting for the Chicago Bears and more than three decades in the NFL, detailed the full cycle: six area scouts divide the country into regions covering roughly 15 major schools each, visiting each a minimum of three times per year. December meetings narrow over 1,000 names to 450–500 workable prospects. Cross-check scouts evaluate roughly 30 players each, watching four to six games per prospect. Then the character investigation, the pro day visit, the private workout. Gabriel’s standard: “I always told scouts that they could miss on the talent evaluation because we had others to cross check that area. They had to be completely accurate on character evaluation.”
At the Combine itself, each of the 32 teams conducts 45 formal interviews at 18 minutes each—a format updated in 2020 from the previous standard of 60 interviews at 15 minutes. Players undergo orthopedic exams, X-rays, MRIs, and specialist consultations. They run the 40-yard dash, bench 225 pounds, and sprint through position-specific drills while Sparta Science force plate technology generates “Movement Signatures” from a database of over two million scans.
Scouting departments run annual budgets of $2–3 million, according to agent Jack Bechta, with typical department sizes of 15–25 people. Former Washington and Houston GM Charley Casserly has noted that Green Bay’s scouts spend approximately 17 days reviewing three games on every single draftable player—something few teams match.
Forty-four years after Tampa. Hundreds of millions of dollars in cumulative investment. The most sophisticated talent evaluation apparatus any industry has ever built.
And it still whiffs constantly!
The busts that haunt the process
JaMarcus Russell. The Raiders selected him first overall in 2007 out of LSU. Six-six, 265 pounds, a cannon for an arm. He signed a six-year, $68 million contract with $31.5 million guaranteed. Three seasons later: 7-18 as a starter, 52.1% completion rate, released. Matt Millen warned Al Davis not to draft him: “Do not draft this guy. … I don’t think he is the guy who people believe he is.” Head coach Lane Kiffin wanted Calvin Johnson instead. Overruled by ownership.
Vernon Gholston. The Jets took him sixth overall in 2008 after he crushed the Combine: 37 bench press reps at 225 pounds, a 4.67 40-yard dash, a 35.5-inch vertical. At Ohio State, he’d racked up 14 sacks in 13 games. In the NFL he recorded zero sacks in 45 games. Rex Ryan later: “Well, then I failed as far as the numbers go.” A five-year, $32 million contract for a defensive end who couldn’t get to the quarterback.
Darrius Heyward-Bey. The Raiders—again—picked him seventh overall in 2009. Why? He ran a 4.30-second 40-yard dash, fastest at the Combine. Al Davis loved speed above all else. Heyward-Bey signed for $38.25 million with $23.5 million guaranteed and never became a number one receiver. Michael Crabtree, picked three spots later at No. 10, became a productive starter for a decade—637 receptions and 7,499 yards over 11 seasons.
Three different teams. Three different years. Three guys who dominated every measurable the Combine could throw at them. The scouting machine did everything it was designed to do—it measured what it could measure. The problem was in what it couldn’t.
And then there’s Brock Purdy
If the busts show the system failing at the top, Purdy shows it failing at the bottom—in the opposite direction.
The San Francisco 49ers selected him 262nd overall in the 2022 draft. Last pick. Mr. Irrelevant—a nickname the league has given to the final selection since 1976, because historically, that player almost never matters. Purdy’s father joked about it. His mother brought cake and balloons. He was ready to go undrafted entirely.
The same San Francisco 49ers had, one year earlier, traded three first-round picks and a third-rounder to Miami to move up to the No. 3 overall pick and draft quarterback Trey Lance out of North Dakota State. Lance was the prototype: young, athletic, enormous arm, limitless upside. The 49ers bet four draft picks—including first-rounders in 2021, 2022, and 2023—on Lance being their franchise quarterback. The ten picks taken after Lance in that 2021 first round included eight Pro Bowlers: Ja’Marr Chase, Kyle Pitts, Penei Sewell, Jaycee Horn, Micah Parsons, Patrick Surtain II, DeVonta Smith, Rashawn Slater.
Lance played eight games in a 49ers uniform. Four starts. Then injuries, then a demotion to third string behind Brock Purdy and Sam Darnold, then a trade to Dallas for a fourth-round pick. Three first-rounders in, one fourth-rounder out.
Meanwhile, pick 262 stepped in after Lance and Jimmy Garoppolo both went down with injuries in 2022 and won every regular-season start. Led the 49ers to the NFC Championship as a rookie. Led the league in passer rating (113.0) and yards per attempt (9.6) the following year. Took them to Super Bowl LVIII. Finished fourth in MVP voting.
In May 2025, Brock Purdy signed a five-year, $265 million extension with $182.55 million guaranteed—the largest contract in 49ers franchise history. He earned $2.6 million total over his first three seasons. His new deal pays him $2.9 million per week.
The scouting apparatus evaluated both players. It correctly identified Lance as an elite physical talent. It rated Purdy as the least valuable prospect in the entire draft class.
Steve Young (Hall of Famer, former 49er) explained what went wrong: “The [quarterback] position is really about guile and an innate gift from heaven, in some ways, to be able to have your heart rate go down when everyone else is in anxiety and pressure… The draft doesn’t understand that thing.”
A week into training camp in 2022, coach Kyle Shanahan told 49ers CEO Jed York he thought Purdy was their best quarterback. The third-string rookie, not the franchise investment. Shanahan saw it in practice. The Combine never measured it.
The data confirms what the stories suggest
RotoWire analyzed all 800 first-round picks from 2000–2024 and found picks 22 and 26 share the highest bust rate at 57%. Pick 32 sits at 50%. Even picks 1–10 produce Pro Bowlers only about half the time. ESPN’s Paul Hembekides analyzed 20 drafts—first-round quarterbacks hit at just 46%, wide receivers at 27%. The 33rd Team found only 31% of first-round picks from 2010–2017 signed second contracts with the team that drafted them.
Ravens GM Eric DeCosta, one of the sharpest evaluators in football, acknowledged on the Ravens’ team podcast “The Lounge” in May 2025 that some view the draft as “inherently sort of a luck-driven process.” He hedged, said he doesn’t fully believe that, but argued the only rational response is to accumulate more picks to get more at-bats against long odds.
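DeCosta’s more-at-bats logic is easy to make concrete. Treating each first-round pick as an independent coin flip with a roughly 50% hit rate (a deliberate simplification; real picks are neither independent nor identically distributed), the odds of landing at least one hit climb quickly with volume:

```python
def p_at_least_one_hit(picks: int, hit_rate: float = 0.5) -> float:
    """Probability that at least one of `picks` independent selections
    succeeds, given a per-pick hit rate (~50% for first-rounders)."""
    return 1 - (1 - hit_rate) ** picks

for n in (1, 2, 3, 4):
    print(n, round(p_at_least_one_hit(n), 3))  # 0.5, 0.75, 0.875, 0.938
```

Four first-rounders take you from a coin flip to roughly 94% odds of at least one hit, which is the whole argument for trading down and accumulating picks.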
Roughly 40–50% of first-round picks don’t work out. After 44 years of Combine evolution, billions in scouting investment, and technology that can measure the force a player generates when he pushes off the ground. The machine is extraordinary. And about half the time, it’s wrong.
So. Why does this matter if you’ve never scouted a football player?
The corporate parallel nobody wants to hear
Because your company fails at the same rate with a fraction of the effort. Which is some small comfort.
CEB (now Gartner) studied nearly 30,000 leaders and found that 50% of executive transitions underperform: 3% fail outright and 47% fall short of expectations. That study includes both external hires and internal promotions into new roles. Leadership IQ tracked 5,247 hiring managers across 312 organizations: 46% of new hires fail within 18 months. The reasons: coachability (26%), emotional intelligence (23%), motivation (17%), temperament (15%). Technical competence—what interviews are supposed to test—accounted for just 11%. (Leadership IQ is a corporate training firm. Proprietary research, not peer-reviewed. Still widely cited.)
CareerBuilder’s 2017 Harris Poll of 2,257 hiring managers: 74% admitted to making a bad hire, average cost $14,900.
The NFL invests 10–20x more per hire, uses dedicated evaluation teams who do nothing but assess talent year-round, and deploys medical technology no corporation could justify. Corporate America runs unstructured interviews and gut-feel decisions. And the failure rates land in the same range.
That convergence has a mathematical explanation.
The ceiling nobody can break
Schmidt and Hunter’s 1998 meta-analysis in Psychological Bulletin, synthesizing 85 years of selection research, established the benchmarks. General mental ability (GMA) tests predicted job performance at .51. Structured interviews: .51. The best combination of GMA plus integrity testing reached .65.
Then Sackett, Zhang, Berry, and Lievens published a reanalysis in 2022 in the Journal of Applied Psychology that shook the field. Validity had been “substantially overestimated.” Structured interviews emerged as the best single predictor at .42. GMA dropped from .51 to .31. Years of experience collapsed to .07.
A validity of .65 explains approximately 42% of performance variance. The majority of what determines whether a hire succeeds can’t be measured at the point of hiring. Sackett’s revisions push the ceiling even lower.
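The step from a validity coefficient to “variance explained” is just squaring the correlation, and it’s worth making explicit because it drives the whole ceiling argument:

```python
def variance_explained(r: float) -> float:
    """A selection method with validity (correlation) r explains
    r**2 of the variance in later job performance."""
    return r ** 2

# Benchmarks cited in the text
print(f"{variance_explained(0.65):.0%}")  # Schmidt & Hunter's best combo -> 42%
print(f"{variance_explained(0.42):.0%}")  # Sackett et al., structured interviews -> 18%
print(f"{variance_explained(0.31):.0%}")  # Sackett et al., GMA -> 10%
```

Under Sackett’s revised numbers, even the best single predictor leaves more than 80% of performance variance unexplained at the point of hiring.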
NFL draft bust rates: ~40–50%. Corporate executive failure rates: ~40–50%. Best-case predictive validity: ~42% of variance explained. Three independent lines of evidence, same range.
Steve Young nailed it without knowing the research. “The draft doesn’t understand that thing”—the intangible quality that separates Purdy from Lance, the trait no Combine drill captures. Selection science says roughly half of what determines success lives outside anything any evaluation process can reach. Young was describing the ceiling from the inside.
The organizations getting closest aren’t spending more
Google abandoned brainteasers in 2013 after Laszlo Bock told the New York Times: “Brainteasers are a complete waste of time. They don’t predict anything.” Internal data: four interviews capture 86% of the predictive value—the “Rule of Four” from Bock’s book Work Rules! After four rounds, you’re adding noise. Google uses hiring committees reaching decisions by consensus. The hiring manager can’t unilaterally approve.
Al Davis could.
Amazon’s Bar Raiser program, founded in 1999, puts an independent evaluator from a different department in every hiring loop—with veto power. Over 10,000 Bar Raisers and Bar Raisers in Training globally. An independent eye whose job is to challenge the primary assessors’ judgment.
Automattic—which runs WordPress.com—supplements standard interviews with a paid trial: 25 to 40 hours for most roles, at $25/hour. CEO Matt Mullenweg has described this as the closest thing to watching someone actually perform instead of listening to them talk about performing. That’s the closest thing in corporate America to what scouts do with game film.
The Cleveland Browns—rated the NFL’s most analytically advanced organization in ESPN’s annual survey—are the football version. GM Andrew Berry (Harvard economics and computer science, youngest GM in NFL history at 32) and former Chief Strategy Officer Paul DePodesta (yes, Moneyball Paul DePodesta) built the league’s largest analytics staff before DePodesta departed for Major League Baseball in late 2025. The Browns swept all three categories in ESPN’s 2024 analytics survey.
What do these organizations share? They didn’t add more hours. They structured evaluation differently: independent checks, work samples over interview talk, consensus decisions, systematic reduction of individual bias. They accepted the ceiling and focused on getting as close to it as possible.
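Google’s Rule of Four also has a textbook shape behind it: pooling more independent evaluators improves reliability with sharply diminishing returns. The Spearman-Brown formula is one way to sketch why (the per-interviewer correlation of 0.5 below is an illustrative assumption, not Google’s internal figure):

```python
def pooled_reliability(k: int, r: float) -> float:
    """Spearman-Brown: reliability of the average of k raters whose
    individual ratings correlate r with one another."""
    return k * r / (1 + (k - 1) * r)

# Assumed per-interviewer agreement of 0.5 (illustrative only)
for k in (1, 2, 4, 8):
    print(k, round(pooled_reliability(k, 0.5), 3))  # 0.5, 0.667, 0.8, 0.889
```

Going from one interviewer to four buys a large jump; going from four to eight buys comparatively little. That is the statistical intuition behind cutting the loop off at four.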
The accountability gap is the real scandal
SHRM’s 2025 benchmarking survey found roughly one in five organizations track quality of hire. Dan Rymer, Executive Vice President at executive search firm Redgrave, wrote: “Few organizations invest even 10 percent of their search budget into ensuring a new executive’s success.”
NFL teams grade every draft class against projections. They know which scouts hit and which don’t. They run accountability loops through the entire process—and even with all that, they hover around 50%.
Most companies don’t track. Don’t measure. Don’t learn. No feedback loop. No post-hire audit. No systematic study of why hires fail. If you’re running a business, ask yourself: do you know the hit rate on your last five senior hires? Could your CHRO tell you? Is anyone feeding failure data back into your process?
The NFL can’t break 50% with obsessive accountability. Corporate America isn’t going to break it by not even keeping score.
A Raiders fan’s sidebar: the ghost of JaMarcus Russell
I’m a Raiders fan, the Combine just wrapped, and this franchise is staring down its past.
The Raiders hold the No. 1 overall pick, and every executive ESPN polled expects them to take Indiana quarterback Fernando Mendoza. GM John Spytek spoke at the Combine podium this week, saying he’s “not necessarily in favor of running him out there right away either” and would want to take pressure off a young quarterback as much as possible.
The last time the Raiders picked first was 2007. JaMarcus Russell. We know how that ended.
What haunts me about Russell isn’t the bust itself; busts happen, and that’s the whole point of this piece. It’s how it happened. Millen warned Davis. Kiffin wanted someone else. The scouting reports flagged work ethic. No structural check on the owner’s preference existed. Al Davis wanted the big arm, and nobody could say no.
Google requires consensus. Amazon gives Bar Raisers veto power. JaMarcus Russell happened because one man could override every evaluator in the room.
Mendoza, Heisman winner, national champion, is the consensus pick. That’s not the risk. The risk is the structure around the pick. Does Spytek have genuine authority, or can ownership, including minority owner Tom Brady, override the room the way Davis did? That’s not a question about Fernando Mendoza. It’s a question about organizational design. And it matters more than any 40 time at the Combine. We haven’t even dug into the importance of the team around the quarterback.
The prediction problem doesn’t get solved. It gets managed.
Three hundred and nineteen prospects were in Indianapolis this week. Thirty-two teams deploying the most sophisticated talent evaluation apparatus any industry has ever built. And roughly half of the resulting first-round picks won’t earn a second contract with the team that drafted them.
The 49ers traded three first-round picks for Trey Lance and found their franchise quarterback with pick 262. The Combine measured Lance’s arm strength and Purdy’s 40 time and missed the thing Steve Young says actually matters—the ability to stay calm when everyone around you is panicking.
Better process matters. Picks 1–10 produce Pro Bowlers at roughly twice the rate of picks 21–32. Structured interviews predict at .42 versus .19 for unstructured. Google’s Rule of Four captures 86% of value. Those margins are real.
But the organizations that thrive, both in football and in business, aren’t the ones trying to crack the prediction problem. They build systems that assume roughly half their bets are wrong: fast identification of misses, rapid reallocation, cultures that treat evaluation failure as data rather than shame.
The scouting gap between the NFL and corporate America is real and enormous. The prediction gap is the same on both sides.
Have a take on the scouting gap, or a better example of structured evaluation done right? Let me know!



