How We Get Our Data

11,322

Healthcare sites indexed

860+

Procedures cataloged

3M+

Payer rate records

15

Major insurers covered

98.7%

Validation pass rate

50

States covered

Built on federally mandated data

We aggregate pricing data across 11,322 healthcare sites — including 4,756 hospitals, 5,258 freestanding ambulatory surgery centers (ASCs), and 1,308 hospital-operated ASCs. The CMS Price Transparency Rule requires all US hospitals to publish their negotiated rates, and the Transparency in Coverage Rule requires insurers to publish theirs — making this data publicly available for the first time.

We supplement these federally mandated files with 20+ government data sources to build the most comprehensive picture of healthcare pricing available.

Hospital & ASC MRFs — Negotiated rates published by hospitals and surgery centers as required by federal law. Under CMS v3.0 (effective April 1, 2026), MRFs include actual allowed amount statistics (median, 10th–90th percentiles) from insurer remittance data, organizational Type 2 NPIs, and senior official attestation
CMS Cost Reports — Actual hospital operating cost data from Medicare
Government Benchmarks — Medicare reimbursement rates, physician fee schedules, and geographic cost indices
Payer Rate Data — 3M+ negotiated rate records from 15 major insurers including Aetna, Anthem, UnitedHealthcare, Cigna, Humana, and BCBS plans
Quality & Safety Data — CMS CAHPS patient satisfaction surveys, ASC quality reporting, and safety metrics across 36,896 quality records
Financial Assistance — IRS 501(r) financial assistance policies from nonprofit hospitals, covering 33,032 assistance records
Compliance Data — CMS price transparency enforcement actions and NCCI coding edits
Verified Public Sources — State all-payer claims data, FDA drug databases, and other publicly available datasets

Rigorous validation

Every price record passes through an automated validation pipeline before it reaches you. We don't just collect data — we verify it across multiple dimensions including price plausibility, data completeness, and cross-validation against independent sources.

98.7% of our negotiated price records pass validation (782,309 of 792,375 records). Records that fail critical checks are excluded from the platform entirely.
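As a rough illustration of the kinds of checks described above, the sketch below flags incomplete records, implausible prices, and large disagreements with an independent benchmark. It is not the production pipeline: the `PriceRecord` fields, the thresholds, and the function names are all hypothetical and chosen only for this example. The last function reproduces the pass-rate arithmetic from the figures quoted above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PriceRecord:
    # All field names are hypothetical, for illustration only.
    procedure_code: str
    payer: str
    negotiated_rate: Optional[float]
    benchmark_rate: Optional[float] = None  # e.g. an independent reference rate


# Illustrative thresholds; real limits would be tuned per procedure.
MIN_PLAUSIBLE = 1.0
MAX_PLAUSIBLE = 5_000_000.0
MAX_BENCHMARK_RATIO = 50.0


def critical_failures(rec: PriceRecord) -> List[str]:
    """Return the names of the critical checks this record fails."""
    # Completeness: required fields must be present.
    if not rec.procedure_code or not rec.payer or rec.negotiated_rate is None:
        return ["incomplete"]

    failures = []
    # Plausibility: the rate must fall inside a sane dollar range.
    if not (MIN_PLAUSIBLE <= rec.negotiated_rate <= MAX_PLAUSIBLE):
        failures.append("implausible_price")
    # Cross-validation: compare against an independent benchmark if one exists.
    if rec.benchmark_rate and rec.negotiated_rate / rec.benchmark_rate > MAX_BENCHMARK_RATIO:
        failures.append("benchmark_mismatch")
    return failures


def pass_rate(passed: int, total: int) -> float:
    """Percentage of records that passed, rounded to one decimal place."""
    return round(100 * passed / total, 1)
```

A record is kept only when `critical_failures` returns an empty list, and `pass_rate(782_309, 792_375)` gives the 98.7% figure quoted above.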

Detector confidence tiers

Our Bill Intelligence Engine uses 21 specialized detectors to surface likely billing issues. Each detector is assigned a confidence tier based on how deterministic its findings are:

Tier A — Confirmed issue (7 detectors) — Deterministic checks with high confidence. These are clear errors like duplicate charges, arithmetic mismatches, or inverted values.
Tier B — Likely issue (8 detectors) — Evidence-based findings that need provider verification. Examples include upcoding patterns, unbundling signals, and balance billing concerns.
Tier C — Potential opportunity (5 detectors) — Pattern-based estimates and optimization hints, such as negotiation opportunities and financial assistance eligibility screening.

This tiered approach ensures transparency about the confidence level of each finding. Taven is a decision-support system — it surfaces likely issues and estimated opportunity ranges to help you take informed action, not an autonomous advisor that predicts exact outcomes.
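To make the idea of a deterministic detector concrete, here is a minimal sketch of how a Tier A duplicate-charge check could attach its tier to each finding. The `LineItem`, `Finding`, and `detect_duplicate_charges` names are assumptions made for this example, not the Bill Intelligence Engine's actual interfaces, and a real detector would also need to account for legitimately repeated services.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import List


class Tier(Enum):
    A = "Confirmed issue"
    B = "Likely issue"
    C = "Potential opportunity"


@dataclass(frozen=True)  # frozen makes line items hashable, so Counter can group them
class LineItem:
    code: str
    service_date: str
    units: int
    charge: float


@dataclass
class Finding:
    detector: str
    tier: Tier
    message: str


def detect_duplicate_charges(items: List[LineItem]) -> List[Finding]:
    """Flag line items that appear more than once with identical fields."""
    counts = Counter(items)
    return [
        Finding(
            detector="duplicate_charge",
            tier=Tier.A,
            message=f"{item.code} on {item.service_date} appears {n} times",
        )
        for item, n in counts.items()
        if n > 1
    ]
```

Because the check compares exact field values, the same bill always produces the same result, which is what makes it a Tier A candidate rather than a pattern-based estimate.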

ERA-based allowed amounts (CMS v3.0)

Under the CMS v3.0 schema (effective April 1, 2026), hospitals must publish actual allowed amount statistics when a negotiated rate is expressed as a percentage of billed charges or an algorithm rather than a fixed dollar amount. These statistics are calculated from EDI 835 Electronic Remittance Advice (ERA) data — the electronic records insurers send to providers detailing how claims were actually paid.
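The condition described above can be written as a small predicate. This is a sketch under stated assumptions: the three parameters are hypothetical stand-ins for however a rate entry records a fixed dollar amount, a percentage of billed charges, or an algorithm, and they are not taken from the CMS schema itself.

```python
from typing import Optional


def needs_allowed_amount_stats(
    dollar: Optional[float],
    percentage: Optional[float],
    algorithm: Optional[str],
) -> bool:
    """True when a rate is a percentage or algorithm with no fixed dollar amount.

    Parameter names are illustrative, not CMS field names.
    """
    return dollar is None and (percentage is not None or bool(algorithm))
```

For example, a rate stored only as "62% of billed charges" would return `True`, while a flat $1,200 rate would return `False`.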

What the statistics mean

Median allowed amount: The midpoint of the allowed amounts on actual claims. Half of all claims had a higher allowed amount and half had a lower one. This is the best single estimate of the total paid for a procedure (typically the insurer's payment plus your cost share), not of what you'd owe out of pocket.
10th percentile allowed amount: The low end of the typical range. Only 10% of claims had an allowed amount below this figure. Useful for understanding a near-best-case scenario.
90th percentile allowed amount: The high end of the typical range. Only 10% of claims had an allowed amount above this figure. Useful for understanding near-worst-case cost exposure.
Count of allowed amounts: The number of claims used in the calculation. Higher counts indicate more reliable statistics, so we display the count to help you gauge confidence.

How they're calculated

Hospitals calculate these statistics from 12–15 months of ERA data. Each insurer's actual payments for a given procedure at that hospital are aggregated, and the median and percentile boundaries are computed. This means the numbers reflect real payments, not theoretical rates or estimates.
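The aggregation can be sketched with Python's standard `statistics` module. This is an illustration only: the interpolation method used here (`method="inclusive"`) is an assumption, and hospitals may follow a different percentile convention. The function name and the returned keys are also hypothetical.

```python
import statistics
from typing import Dict, Sequence


def allowed_amount_stats(payments: Sequence[float]) -> Dict[str, float]:
    """Summarize one payer's allowed amounts for one procedure at one hospital."""
    if len(payments) < 2:
        raise ValueError("need at least two remittance amounts")

    # quantiles(n=10) returns the nine decile cut points;
    # the first is the 10th percentile and the last is the 90th.
    deciles = statistics.quantiles(payments, n=10, method="inclusive")
    return {
        "median": statistics.median(payments),
        "p10": deciles[0],
        "p90": deciles[-1],
        "count": len(payments),
    }
```

With eleven allowed amounts running from $100 to $1,100 in $100 steps, this returns a median of 600, a 10th percentile of 200, a 90th percentile of 1,000, and a count of 11.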

How they improve on v2.0

Under the previous v2.0 schema, hospitals could publish an estimated_allowed_amount — a single estimate that was often inaccurate or missing. The v3.0 approach is superior in three ways:

Based on actual claims data — real payments from insurers, not hospital estimates
Shows the full range — the 10th–90th percentile spread shows you how much prices vary, not just a single number
Includes sample size — the count of claims lets you judge how reliable the statistics are

When you'll see them

On Taven hospital pages, allowed amount columns appear when a hospital's data includes v3.0 ERA statistics. The "Median Allowed" and "Range (10th–90th)" columns are displayed alongside standard negotiated rates, giving you both the contracted rate and what insurers actually pay. When v3.0 data isn't yet available for a hospital, we continue showing the standard negotiated rate averages.
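The fallback behavior amounts to a simple presence check. In the sketch below, the dictionary keys (`median_allowed`, `p10_allowed`, `p90_allowed`) are hypothetical names for this example. Only the two column labels come from the description above.

```python
from typing import List, Mapping, Optional


def display_columns(row: Mapping[str, Optional[float]]) -> List[str]:
    """Pick the columns to show for one hospital rate row."""
    columns = ["Negotiated Rate"]
    # Show the ERA-based columns only when every v3.0 statistic is present.
    v3_fields = ("median_allowed", "p10_allowed", "p90_allowed")
    if all(row.get(field) is not None for field in v3_fields):
        columns += ["Median Allowed", "Range (10th–90th)"]
    return columns
```

Requiring all three statistics before adding the extra columns avoids rendering a half-filled range when a file carries only part of the v3.0 data.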

Limitations & honesty

⚠️ Important to understand

Prices shown are negotiated rates, not what you'll actually pay. These are the rates hospitals and insurers have agreed to. Your actual cost depends on your insurance plan, deductible, and specific circumstances.

Not all hospitals comply fully. While federal law requires price transparency, some hospitals publish incomplete data or update infrequently. We flag these issues when we detect them.

This is not medical advice. Taven is a pricing transparency tool. Choosing a provider should involve clinical quality, your doctor's guidance, and factors beyond price alone.

🔒 Want the full methodology?

Our complete data methodology — including our validation framework, confidence scoring model, and data pipeline architecture — is available to verified partners.

Full validation framework
Confidence scoring model
Data quality reports
Pipeline architecture
