There's a number I keep hearing in conversations with deal teams that I can't get out of my head.
Two to three weeks.
That's how long it takes most healthcare M&A teams to go from a raw data pull to a cleaned, enriched, prioritized target list that's actually ready for outreach. Not because the team is slow. Because the tools they're paying six figures a year for don't finish the job.
I've spent the last few weeks talking to fund origination leads, sell-side bankers, and platform corp dev teams about what they're actually using to source deals. Every single one of them has a stack. Most are spending $50K to $200K a year across three to five platforms - real line items on real P&Ls, renewed annually without anyone asking whether the output justifies the cost. And all of them have the same complaint: the data gets them to a universe, not a map.
The Stack Everyone's Running
If you're in healthcare M&A origination right now, you probably have some combination of these:
PitchBook ($12-40K/year depending on seat count) - The default. Good for transaction comps, fund activity, and identifying who's been acquired. Weak on independent practices that haven't transacted. One origination lead I spoke with recently said it plainly: "PitchBook has no value for what we're trying to do." That's a $40K line item.
Capital IQ ($12-30K/year) - Strong financial data on larger entities. Falls off completely below $10M revenue. Most independent physician practices don't show up at all.
ZoomInfo ($15-60K/year) - Contact data and firmographics. Useful for getting a name and phone number. Does not tell you whether a 4-physician orthopedic group in Tampa is independent, MSO-affiliated, or already under LOI with someone else.
Definitive Healthcare ($25-100K/year) - The closest thing to a healthcare-specific data platform. Claims and referral data, physician affiliations, facility profiles. Good for understanding market share. Does not tell you whether the physician owner is 63 years old with no succession plan and a lease expiring in 18 months.
Grata (merged with SourceScrub, both now under Datasite) ($15-25K/year estimated) - Private company search with web-scraped enrichment. Better than PitchBook for finding companies that haven't transacted yet. Still gives you a list of names, not a diligence-ready profile. One fund I spoke with uses Grata alongside Apify for web scraping and Claude for manual categorization - three tools bolted together to do what none of them does alone.
There are newer entrants too. Rubi.ai is trying to use AI to match buyers with deals. Cyndx offers AI-powered deal sourcing at mid-five-figure annual contracts. Inven runs around $10K/year for AI-assisted target identification. Axial operates as a deal marketplace, charging per-deal fees.
Every one of these tools solves a piece of the problem. None of them solve the whole thing.
The Gap
Here's what none of these platforms deliver:
Provider-level intelligence. How many physicians are at the practice? Are they all revenue-generating, or are two of them semi-retired, still listed in the NPI registry but seeing 5 patients a week? What's the service mix - is it all evaluation and management, or do they run infusion, imaging, procedures? These distinctions are the difference between a $2M revenue practice and an $8M revenue practice, and no subscription database will tell you.
Transaction readiness signals. Is the founder approaching retirement age? Did a junior partner just leave? Is the lease coming up for renewal? Are they in a market where a competitor just got acquired, which means a platform is consolidating around them? These are the signals that separate "interesting practice" from "practice that might actually transact in the next 12-18 months." No platform tracks this. It requires triangulating across multiple data sources that most deal teams don't have time to access, let alone cross-reference.
Financial estimation from public data. You can build a defensible revenue range for most physician practices using publicly available benchmarks - specialty-specific revenue per provider, site-of-service adjustments, ancillary revenue multipliers for things like in-house pharmacy or imaging. But it requires knowing which benchmarks to use, when to adjust, and how to present ranges that an investment committee will take seriously. The platforms give you revenue data for companies that self-report. Independent practices don't self-report.
Prioritization. This is the one that costs deal teams the most time. You pulled 200 names from a combination of PitchBook, Definitive, and your own network. Which 20 should your team spend time on this month? Not the largest - the most likely to transact, the best strategic fit for your platform thesis, the ones where outreach timing matters. That prioritization layer requires judgment layered on top of data, and it's the part that takes your VP of Corp Dev 40 hours per market.
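To make the last two items concrete, here is a minimal Python sketch of a benchmark-based revenue range and a simple readiness ranking. It is an illustration, not my methodology: the benchmark dollar figures, the field names, the score weights, and the age and lease thresholds are all placeholder assumptions chosen only to show the shape of the calculation.

```python
from dataclasses import dataclass

# Hypothetical revenue-per-provider benchmarks (low, high) in USD.
# Placeholder values for illustration only, not sourced figures.
BENCHMARKS = {
    "orthopedics": (900_000, 1_500_000),
    "dermatology": (700_000, 1_200_000),
}


@dataclass
class Practice:
    name: str
    specialty: str
    active_providers: float            # FTE-adjusted: a semi-retired physician counts as < 1
    ancillary_multiplier: float = 1.0  # assumed uplift for in-house imaging, pharmacy, etc.
    founder_age: int = 50
    lease_months_left: int = 60
    partner_departed: bool = False


def revenue_range(p: Practice) -> tuple[float, float]:
    """Return a (low, high) revenue estimate from per-provider benchmarks."""
    low, high = BENCHMARKS[p.specialty]
    factor = p.active_providers * p.ancillary_multiplier
    return (low * factor, high * factor)


def readiness_score(p: Practice) -> int:
    """Add up simple transaction-readiness signals (weights are assumptions)."""
    score = 0
    if p.founder_age >= 60:
        score += 2  # founder approaching retirement age
    if p.lease_months_left <= 18:
        score += 1  # lease renewal coming up
    if p.partner_departed:
        score += 1  # recent partner departure
    return score


def prioritize(practices: list[Practice], top_n: int) -> list[Practice]:
    """Rank by readiness score, highest first, and keep the top_n."""
    return sorted(practices, key=readiness_score, reverse=True)[:top_n]


if __name__ == "__main__":
    universe = [
        Practice("Practice A", "orthopedics", 3.5, ancillary_multiplier=1.25,
                 founder_age=63, lease_months_left=12),
        Practice("Practice B", "dermatology", 2.0),
    ]
    for p in prioritize(universe, top_n=1):
        print(p.name, revenue_range(p), readiness_score(p))
```

A real version would need far more signals and a fit-to-thesis dimension, and the ranking would still need a human judgment pass. The point is only that the estimate is a range driven by adjusted provider counts, and the shortlist is ordered by likelihood to transact rather than by size.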
What $100K+ in Platforms Actually Gets You (And Where It Stops)
Let me walk through exactly what I'm seeing in the sourcing workflows of the teams I've spoken with recently.
The buy-side fund workflow typically looks like this: an investment theme gets approved, someone pulls a universe from PitchBook or Grata filtered by SIC/NAICS code and geography, an analyst starts scrubbing - removing hospitals, removing practices that already transacted, removing anything clearly too large or too small. That scrubbing process alone takes days. Then comes enrichment - adding provider counts, checking affiliations, estimating size. More days. Then someone has to actually look at each one and make a judgment call about fit and approachability.
The total elapsed time from theme approval to "here's a list of 15-20 practices worth calling" is consistently 2-3 weeks across every fund I've talked to. Not because anyone is doing it wrong - because the tools don't automate the middle layer.
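The scrubbing pass in that workflow is mostly rule-based filtering, which is why it is a good candidate for automation. A minimal Python sketch, assuming each record in the raw pull is a dictionary with the hypothetical fields `entity_type`, `already_transacted`, and `provider_count` (real exports will use different column names and need messier matching), with the size cutoffs as arbitrary placeholders:

```python
def scrub(universe, min_providers=2, max_providers=50):
    """Drop hospitals, already-transacted practices, and out-of-range sizes.

    Field names and the size thresholds are illustrative assumptions.
    """
    kept = []
    for rec in universe:
        if rec["entity_type"] == "hospital":
            continue  # hospitals are out of scope for a practice search
        if rec["already_transacted"]:
            continue  # already acquired or affiliated
        if not (min_providers <= rec["provider_count"] <= max_providers):
            continue  # clearly too small or too large
        kept.append(rec)
    return kept


if __name__ == "__main__":
    raw_pull = [
        {"name": "Regional Hospital", "entity_type": "hospital",
         "already_transacted": False, "provider_count": 120},
        {"name": "Independent Ortho Group", "entity_type": "practice",
         "already_transacted": False, "provider_count": 4},
        {"name": "Acquired Derm Group", "entity_type": "practice",
         "already_transacted": True, "provider_count": 6},
    ]
    for rec in scrub(raw_pull):
        print(rec["name"])
```

The filter itself is trivial. What takes days is populating those three fields reliably for every name in the pull, which is the enrichment problem the platforms leave open.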
The sell-side workflow has the same gap from a different angle. The banker needs to identify comparable platforms or potential strategic acquirers for a client engagement. They pull from CapIQ and PitchBook for transaction comps. They use Definitive or ZoomInfo for market participants. Then an analyst manually builds the competitive landscape and buyer universe. Same 2-3 weeks. Same bottleneck.
What the newer AI tools promise vs. what they deliver. I've looked at every AI-powered sourcing tool I can find in this space. Most of them are doing one of two things: either they're building better search interfaces on top of the same underlying data (which means the data gaps persist), or they're using LLMs to generate summaries of company profiles (which means they're dressing up the same incomplete information in a more readable format). Neither approach solves the fundamental problem: the data that matters for healthcare M&A origination - provider-level clinical and operational intelligence - isn't in any of these databases.
Where the Data Actually Lives
The information that makes a sourcing report actionable in healthcare exists. It's just not where most deal teams look.
There are public data sources - CMS files, state licensing databases, NPI registries, facility linkage files - that contain provider counts, practice affiliations, geographic coverage, and clinical service indicators at a level of detail that no commercial platform matches. The issue is that these files are massive, unstructured, and designed for regulatory purposes, not for M&A teams. Knowing they exist and knowing how to extract actionable intelligence from them are two very different things.
The teams that figure out how to work with this data have a structural sourcing advantage. They're not waiting for PitchBook to add a practice to its database after the deal already happened. They're building target universes from primary source data that most of their competitors don't know how to access.
I'm not going to walk through the specific methodology here. That's the work product. But I'll say this: the gap between what $100K in platform subscriptions tells you and what's available in public data sources is massive. And it's the reason most sourcing reports look the same - because everyone is pulling from the same five platforms.
The Real Question for Your Team
If you're running origination at a fund, bank, or platform, here's the question worth asking this week: what percentage of your analysts' time on a sourcing project is spent on data gathering and cleaning vs. actual strategic analysis?
I wrote about this capacity problem in "Your Corp Dev Team Is Running a 2026 Pipeline with 2020 Capacity" - the tools are a big part of why that ratio stays stuck. From every conversation I've had, roughly 60-70% of analyst time on a sourcing project goes to data gathering and cleaning. The remaining 30-40% goes to actual strategic judgment. That ratio is backwards.
Your analysts and VPs should be spending their time on the judgment calls - which practices fit the thesis, how to approach the founder, what the integration path looks like. The research and enrichment layer is the part that should be compressed or outsourced entirely.
That's the gap I'm building Healthcare M&AI to fill. Not another platform. Not another database subscription you log into and still have to do the work yourself. The actual finished work product - sourced, cleaned, enriched, financially estimated, and prioritized with transaction readiness signals - delivered in days instead of weeks, from someone whose full-time job is building this intelligence layer. I don't have a day job running deals and sitting in corporate meetings. This is what I do, all day, and I've spent years figuring out how to pull more signal out of public data than most teams know exists.
If you're staring at a sourcing project right now and your analyst just started the scrubbing process, reply to this email with the market and specialty. I'll tell you what I can see from public data that your platforms are missing.
-Shawn
This newsletter is for informational purposes only and does not constitute investment, legal, or financial advice.

