# CMS Medicaid Provider Spending Dataset

### About the Dataset

The **CMS Medicaid Provider Spending Dataset** delivers **227+ million** outpatient **Medicaid** claims aggregated by provider, procedure, and month across 50+ states — **automatically updated** when **CMS** publishes new **T-MSIS** (Transformed Medicaid Statistical Information System) data. This analytics-ready product enriches spending records with **NPI** provider demographics from **NPPES** and state-level data quality ratings from the **CMS DQ Atlas**, so you know which states to trust before you query. With 6 states flagged unusable and 16 more rated high concern by CMS, the built-in quality scoring saves analysts weeks of cross-referencing. Designed for health policy analysts, Medicaid program administrators, and healthcare consultants who need provider-level Medicaid expenditure intelligence without months of data wrangling.

{% hint style="success" %}
**Now Available** on the Snowflake Marketplace — [Get Listing](https://app.snowflake.com/marketplace/listing/GZT1Z7QRSYVE)
{% endhint %}

### Quick Access

**Tables**: PROVIDER\_SPENDING, BILLING\_PROVIDERS, SERVICING\_PROVIDERS, HCPCS\_CODES, STATE\_QUALITY\
**Sources**: CMS T-MSIS Analytic Files, NPPES, CMS DQ Atlas\
**Automatic Updates**: Pipeline monitors CMS for new releases and loads data within days of publication\
**Coverage**: 50+ US states and territories, 2018-present (7+ years)

## Overview

The CMS Medicaid Provider Spending Dataset provides comprehensive access to Medicaid outpatient spending data including:

* **PROVIDER\_SPENDING** (spending) - 200M+ claims aggregated by billing provider NPI, servicing provider NPI, HCPCS procedure code, and month — with total paid, claims, and beneficiary counts
* **BILLING\_PROVIDERS** (billing\_providers) - 600K+ billing provider demographics including name, address, specialty taxonomy, and NPI assignment date
* **SERVICING\_PROVIDERS** (servicing\_providers) - 1.5M+ servicing provider demographics with the same NPI-enriched fields
* **HCPCS\_CODES** (hcpcs\_codes) - 20K+ HCPCS Level II and CPT procedure code descriptions from CMS reference files
* **STATE\_QUALITY** (state\_quality) - CMS DQ Atlas state-level data quality ratings for T-MSIS expenditure data across 54 states/territories

### Metadata Tables

Every Dataplex data product includes these standard metadata tables:

| Table             | Purpose                                                        |
| ----------------- | -------------------------------------------------------------- |
| `FEEDS`           | Dataset catalog — available tables, descriptions, update dates |
| `FEEDS_FILES`     | Batch load history with `is_latest` flag for data freshness    |
| `CHANGELOG`       | Change log — data loads, schema changes, corrections           |
| `DATA_DICTIONARY` | Column descriptions for all tables                             |

## Entity Relationship Diagram

![CMS Medicaid Provider Spending Entity Relationship Diagram](/files/KxeXIhBol9dKhXtdkypu)

PROVIDER\_SPENDING is the central fact table. Join to BILLING\_PROVIDERS via `BILLING_PROVIDER_NPI_NUM = NPI`, to SERVICING\_PROVIDERS via `SERVICING_PROVIDER_NPI_NUM = NPI`, and to HCPCS\_CODES via `HCPCS_CODE` (LEFT JOIN — \~83% coverage). STATE\_QUALITY joins to BILLING\_PROVIDERS via `STATE = STATE_CODE` for quality-adjusted analysis. All tables link to FEEDS and FEEDS\_FILES via `feed_id` and `feeds_files_id` for data lineage.

## Medicaid Provider Spending Tables

### Provider Spending (PROVIDER\_SPENDING)

Medicaid outpatient claims aggregated by billing provider, servicing provider, HCPCS procedure code, and month. The core fact table with 200M+ rows spanning 7+ years of monthly data.

**Key Features:**

* 200M+ rows — one per billing NPI + servicing NPI + HCPCS code + month
* Total paid, total claims, unique beneficiaries per aggregation
* Computed metrics: average paid per claim and per beneficiary
* CLAIM\_FROM\_DATE enables native date arithmetic and time-series functions
* Privacy floor: minimum 12 beneficiaries per row (CMS suppression)

#### Column Reference

| Column                        | Type      | Description                                                                  |
| ----------------------------- | --------- | ---------------------------------------------------------------------------- |
| BILLING\_PROVIDER\_NPI\_NUM   | VARCHAR   | National Provider Identifier of the billing provider                         |
| SERVICING\_PROVIDER\_NPI\_NUM | VARCHAR   | National Provider Identifier of the servicing provider                       |
| HCPCS\_CODE                   | VARCHAR   | Healthcare Common Procedure Coding System Level II code                      |
| CLAIM\_FROM\_MONTH            | VARCHAR   | Month of claim service (YYYY-MM format)                                      |
| CLAIM\_FROM\_DATE             | DATE      | First day of the claim month (enables date arithmetic)                       |
| TOTAL\_UNIQUE\_BENEFICIARIES  | NUMBER    | Number of unique Medicaid beneficiaries served (min 12 for privacy)          |
| TOTAL\_CLAIMS                 | NUMBER    | Total number of Medicaid claims submitted                                    |
| TOTAL\_PAID                   | NUMBER    | Total Medicaid amount paid in USD                                            |
| AVG\_PAID\_PER\_CLAIM         | NUMBER    | Average payment per claim (TOTAL\_PAID / TOTAL\_CLAIMS)                      |
| AVG\_PAID\_PER\_BENEFICIARY   | NUMBER    | Average payment per beneficiary (TOTAL\_PAID / TOTAL\_UNIQUE\_BENEFICIARIES) |
| feed\_id                      | VARCHAR   | FK to FEEDS — identifies which dataset this row belongs to                   |
| feeds\_files\_id              | VARCHAR   | FK to FEEDS\_FILES — identifies which batch loaded this data                 |
| created\_at                   | TIMESTAMP | When the source data was loaded into the warehouse                           |
| updated\_at                   | TIMESTAMP | When dbt last rebuilt this table                                             |

***

### Billing Providers (BILLING\_PROVIDERS)

Billing provider demographics enriched with NPI data — 600K+ unique NPIs with name, address, specialty, and credentials.

**Key Features:**

* 600K+ unique billing NPIs
* NPI-linked demographics: name, address, phone, taxonomy code
* Entity type distinguishes individuals (1) from organizations (2)
* State field for geographic analysis (mostly 2-letter codes; \~0.01% non-standard from NPPES)

#### Column Reference

| Column            | Type      | Description                                                  |
| ----------------- | --------- | ------------------------------------------------------------ |
| NPI               | VARCHAR   | National Provider Identifier                                 |
| ENTITY\_TYPE      | VARCHAR   | 1 = Individual provider, 2 = Organization                    |
| ORG\_NAME         | VARCHAR   | Organization name (NULL for individual providers)            |
| LAST\_NAME        | VARCHAR   | Provider last name (NULL for organizations)                  |
| FIRST\_NAME       | VARCHAR   | Provider first name (NULL for organizations)                 |
| MIDDLE\_NAME      | VARCHAR   | Provider middle name                                         |
| CREDENTIAL        | VARCHAR   | Provider credential (MD, DO, NP, etc.)                       |
| ADDRESS\_LINE1    | VARCHAR   | Practice location street address                             |
| CITY              | VARCHAR   | Practice location city                                       |
| STATE             | VARCHAR   | Practice location state (2-letter code)                      |
| ZIP               | VARCHAR   | Practice location ZIP code (5-digit)                         |
| PHONE             | VARCHAR   | Practice phone number                                        |
| SEX               | VARCHAR   | Provider sex (individuals only)                              |
| TAXONOMY\_CODE    | VARCHAR   | Primary healthcare provider taxonomy code                    |
| ENUMERATION\_DATE | DATE      | Date NPI was assigned                                        |
| feed\_id          | VARCHAR   | FK to FEEDS — identifies which dataset this row belongs to   |
| feeds\_files\_id  | VARCHAR   | FK to FEEDS\_FILES — identifies which batch loaded this data |
| created\_at       | TIMESTAMP | When the source data was loaded into the warehouse           |
| updated\_at       | TIMESTAMP | When dbt last rebuilt this table                             |

***

### Servicing Providers (SERVICING\_PROVIDERS)

Servicing provider demographics — 1.5M+ unique NPIs. Same schema as BILLING\_PROVIDERS. The servicing provider is the individual or organization that actually performed the service.

**Key Features:**

* 1.5M+ unique servicing NPIs
* Same NPI-linked fields as BILLING\_PROVIDERS
* Larger population because servicing includes both billing and non-billing providers

#### Column Reference

Same columns as BILLING\_PROVIDERS above.

***

### HCPCS Codes (HCPCS\_CODES)

HCPCS Level II and CPT procedure/service code descriptions. Combines CMS Physician Fee Schedule (CPT codes) and CMS HCPCS quarterly file (Level II alpha-prefix codes).

**Key Features:**

* 20K+ procedure codes with descriptions
* Covers \~83% of distinct codes in PROVIDER\_SPENDING
* Remaining \~17% are temporary codes, state-specific codes, and ADA dental codes — use LEFT JOIN

#### Column Reference

| Column           | Type      | Description                                                  |
| ---------------- | --------- | ------------------------------------------------------------ |
| HCPCS\_CODE      | VARCHAR   | HCPCS Level II (alpha prefix A-V) or CPT (numeric) code      |
| DESCRIPTION      | VARCHAR   | Short description of the procedure or service                |
| feed\_id         | VARCHAR   | FK to FEEDS — identifies which dataset this row belongs to   |
| feeds\_files\_id | VARCHAR   | FK to FEEDS\_FILES — identifies which batch loaded this data |
| created\_at      | TIMESTAMP | When the source data was loaded into the warehouse           |
| updated\_at      | TIMESTAMP | When dbt last rebuilt this table                             |

***

### State Quality (STATE\_QUALITY)

CMS DQ Atlas state-level data quality ratings for T-MSIS spending data. CMS rates most states as "Unusable" under their stringent DQ Atlas methodology — this reflects CMS's quality standards, not that the spending data is invalid.

**Key Features:**

* 2,400+ quality assessments across states and topic areas
* OVERALL\_QUALITY rollup across 4 spending-relevant topics
* Use to flag states with known CMS-identified reporting gaps
* Source: download.medicaid.gov DQ Atlas bulk CSV

#### Column Reference

| Column           | Type      | Description                                                                 |
| ---------------- | --------- | --------------------------------------------------------------------------- |
| STATE\_CODE      | VARCHAR   | Two-letter US state/territory code                                          |
| STATE\_NAME      | VARCHAR   | Full state/territory name                                                   |
| MEASURE\_NAME    | VARCHAR   | CMS DQ Atlas measure name (e.g., 'IP Stays', 'Enrollment Counts')           |
| TOPIC\_AREA      | VARCHAR   | DQ Atlas topic area grouping                                                |
| RATING           | VARCHAR   | Quality rating for this state + measure (Low/Medium/High Concern, Unusable) |
| OVERALL\_QUALITY | VARCHAR   | Worst-of quality rollup across spending-relevant DQ Atlas topics            |
| feed\_id         | VARCHAR   | FK to FEEDS — identifies which dataset this row belongs to                  |
| feeds\_files\_id | VARCHAR   | FK to FEEDS\_FILES — identifies which batch loaded this data                |
| created\_at      | TIMESTAMP | When the source data was loaded into the warehouse                          |
| updated\_at      | TIMESTAMP | When dbt last rebuilt this table                                            |

## Data Quality

### Standardization

* All columns use UPPERCASE naming consistent with Snowflake conventions
* CLAIM\_FROM\_DATE computed from CLAIM\_FROM\_MONTH for native date arithmetic
* AVG\_PAID\_PER\_CLAIM and AVG\_PAID\_PER\_BENEFICIARY pre-computed for convenience
* Provider NPI fields validated as 10-digit identifiers
* STATE\_QUALITY filtered to 4 spending-relevant topic areas with OVERALL\_QUALITY rollup
* HCPCS\_CODES merged from 3 CMS reference sources with deduplication

### Data Freshness

Check when data was last updated:

{% tabs %}
{% tab title="Snowflake" %}

```sql
SELECT source_name, data_period, row_count, created_at, is_latest
FROM DWV.FEEDS_FILES
WHERE is_latest = TRUE;
```

{% endtab %}

{% tab title="Databricks" %}

```sql
SELECT source_name, data_period, row_count, created_at, is_latest
FROM cms_tmsis_provider_spending_dwv.feeds_files
WHERE is_latest = TRUE;
```

{% endtab %}
{% endtabs %}

## How to Query CMS Medicaid Provider Spending Data

### Platform Schema Reference

This dataset is available on both Snowflake and Databricks. Queries use schema-only references — the database is already set by the share or catalog context:

| Platform       | Schema                            | Example                                             |
| -------------- | --------------------------------- | --------------------------------------------------- |
| **Snowflake**  | `DWV`                             | `DWV.PROVIDER_SPENDING`                             |
| **Databricks** | `cms_tmsis_provider_spending_dwv` | `cms_tmsis_provider_spending_dwv.provider_spending` |

### Discover Available Data

Start with the `FEEDS` table to see what's available, and `FEEDS_FILES` to understand data freshness and load history.

{% tabs %}
{% tab title="Snowflake" %}

```sql
-- Browse the data catalog
SELECT table_name, description, latest_data_period,
       latest_row_count, last_loaded_at, total_batches
FROM DWV.FEEDS
ORDER BY table_name;
```

{% endtab %}

{% tab title="Databricks" %}

```sql
-- Browse the data catalog
SELECT table_name, description, latest_data_period,
       latest_row_count, last_loaded_at, total_batches
FROM cms_tmsis_provider_spending_dwv.feeds
ORDER BY table_name;
```

{% endtab %}
{% endtabs %}

{% tabs %}
{% tab title="Snowflake" %}

```sql
-- Check data freshness per source
-- is_latest = TRUE marks the current batch for each source
SELECT source_name, data_period, row_count, row_count_delta,
       created_at AS loaded_at, is_latest
FROM DWV.FEEDS_FILES
ORDER BY source_name, created_at DESC;
```

{% endtab %}

{% tab title="Databricks" %}

```sql
SELECT source_name, data_period, row_count, row_count_delta,
       created_at AS loaded_at, is_latest
FROM cms_tmsis_provider_spending_dwv.feeds_files
ORDER BY source_name, created_at DESC;
```

{% endtab %}
{% endtabs %}

### Working with Data Lineage

Every data row links to `FEEDS_FILES` via `feeds_files_id`, which tells you exactly which batch loaded that data. Use this to filter to the current data version or trace any row back to its source load.

{% tabs %}
{% tab title="Snowflake" %}

```sql
-- Get current spending data with batch lineage
SELECT
    ps.BILLING_PROVIDER_NPI_NUM,
    ps.HCPCS_CODE,
    ps.TOTAL_PAID,
    ps.TOTAL_CLAIMS,
    ff.data_period,
    ff.created_at AS data_loaded_at
FROM DWV.PROVIDER_SPENDING ps
JOIN DWV.FEEDS_FILES ff ON ps.feeds_files_id = ff.id
WHERE ff.is_latest = TRUE
LIMIT 20;
```

{% endtab %}

{% tab title="Databricks" %}

```sql
SELECT
    ps.BILLING_PROVIDER_NPI_NUM,
    ps.HCPCS_CODE,
    ps.TOTAL_PAID,
    ps.TOTAL_CLAIMS,
    ff.data_period,
    ff.created_at AS data_loaded_at
FROM cms_tmsis_provider_spending_dwv.provider_spending ps
JOIN cms_tmsis_provider_spending_dwv.feeds_files ff ON ps.feeds_files_id = ff.id
WHERE ff.is_latest = TRUE
LIMIT 20;
```

{% endtab %}
{% endtabs %}

### Top Billing Providers by State

Identify the highest-spending Medicaid billing providers in any state.

{% tabs %}
{% tab title="Snowflake" %}

```sql
SELECT bp.STATE, bp.NPI, bp.ORG_NAME, bp.LAST_NAME, bp.FIRST_NAME,
       SUM(ps.TOTAL_PAID) AS total_paid,
       SUM(ps.TOTAL_CLAIMS) AS total_claims
FROM DWV.PROVIDER_SPENDING ps
JOIN DWV.BILLING_PROVIDERS bp ON ps.BILLING_PROVIDER_NPI_NUM = bp.NPI
WHERE bp.STATE = 'CA'
GROUP BY 1, 2, 3, 4, 5
ORDER BY total_paid DESC
LIMIT 25;
```

{% endtab %}

{% tab title="Databricks" %}

```sql
SELECT bp.STATE, bp.NPI, bp.ORG_NAME, bp.LAST_NAME, bp.FIRST_NAME,
       SUM(ps.TOTAL_PAID) AS total_paid,
       SUM(ps.TOTAL_CLAIMS) AS total_claims
FROM cms_tmsis_provider_spending_dwv.provider_spending ps
JOIN cms_tmsis_provider_spending_dwv.billing_providers bp ON ps.BILLING_PROVIDER_NPI_NUM = bp.NPI
WHERE bp.STATE = 'CA'
GROUP BY 1, 2, 3, 4, 5
ORDER BY total_paid DESC
LIMIT 25;
```

{% endtab %}
{% endtabs %}

### Monthly Medicaid Spending Trends

Track Medicaid spending and provider participation over 7+ years of monthly data, including pandemic impact periods.

{% tabs %}
{% tab title="Snowflake" %}

```sql
SELECT CLAIM_FROM_MONTH,
       COUNT(DISTINCT BILLING_PROVIDER_NPI_NUM) AS unique_providers,
       SUM(TOTAL_CLAIMS) AS total_claims,
       SUM(TOTAL_PAID) AS total_paid
FROM DWV.PROVIDER_SPENDING
GROUP BY 1
ORDER BY 1;
```

{% endtab %}

{% tab title="Databricks" %}

```sql
SELECT CLAIM_FROM_MONTH,
       COUNT(DISTINCT BILLING_PROVIDER_NPI_NUM) AS unique_providers,
       SUM(TOTAL_CLAIMS) AS total_claims,
       SUM(TOTAL_PAID) AS total_paid
FROM cms_tmsis_provider_spending_dwv.provider_spending
GROUP BY 1
ORDER BY 1;
```

{% endtab %}
{% endtabs %}

### Spending by HCPCS Procedure Category

Analyze which procedures drive the most Medicaid spending.

{% tabs %}
{% tab title="Snowflake" %}

```sql
SELECT ps.HCPCS_CODE, h.DESCRIPTION,
       SUM(ps.TOTAL_PAID) AS total_paid,
       SUM(ps.TOTAL_CLAIMS) AS total_claims,
       SUM(ps.TOTAL_UNIQUE_BENEFICIARIES) AS total_beneficiaries
FROM DWV.PROVIDER_SPENDING ps
LEFT JOIN DWV.HCPCS_CODES h ON ps.HCPCS_CODE = h.HCPCS_CODE
GROUP BY 1, 2
ORDER BY total_paid DESC
LIMIT 50;
```

{% endtab %}

{% tab title="Databricks" %}

```sql
SELECT ps.HCPCS_CODE, h.DESCRIPTION,
       SUM(ps.TOTAL_PAID) AS total_paid,
       SUM(ps.TOTAL_CLAIMS) AS total_claims,
       SUM(ps.TOTAL_UNIQUE_BENEFICIARIES) AS total_beneficiaries
FROM cms_tmsis_provider_spending_dwv.provider_spending ps
LEFT JOIN cms_tmsis_provider_spending_dwv.hcpcs_codes h ON ps.HCPCS_CODE = h.HCPCS_CODE
GROUP BY 1, 2
ORDER BY total_paid DESC
LIMIT 50;
```

{% endtab %}
{% endtabs %}

### State Quality-Adjusted Spending Analysis

Cross-reference spending totals with CMS DQ Atlas quality ratings to identify states where data quality may affect analysis reliability.

{% tabs %}
{% tab title="Snowflake" %}

```sql
SELECT bp.STATE, sq.OVERALL_QUALITY,
       SUM(ps.TOTAL_PAID) AS total_paid,
       COUNT(DISTINCT ps.BILLING_PROVIDER_NPI_NUM) AS unique_providers
FROM DWV.PROVIDER_SPENDING ps
JOIN DWV.BILLING_PROVIDERS bp ON ps.BILLING_PROVIDER_NPI_NUM = bp.NPI
LEFT JOIN (
    SELECT DISTINCT STATE_CODE, OVERALL_QUALITY
    FROM DWV.STATE_QUALITY
) sq ON bp.STATE = sq.STATE_CODE
GROUP BY 1, 2
ORDER BY total_paid DESC;
```

{% endtab %}

{% tab title="Databricks" %}

```sql
SELECT bp.STATE, sq.OVERALL_QUALITY,
       SUM(ps.TOTAL_PAID) AS total_paid,
       COUNT(DISTINCT ps.BILLING_PROVIDER_NPI_NUM) AS unique_providers
FROM cms_tmsis_provider_spending_dwv.provider_spending ps
JOIN cms_tmsis_provider_spending_dwv.billing_providers bp ON ps.BILLING_PROVIDER_NPI_NUM = bp.NPI
LEFT JOIN (
    SELECT DISTINCT STATE_CODE, OVERALL_QUALITY
    FROM cms_tmsis_provider_spending_dwv.state_quality
) sq ON bp.STATE = sq.STATE_CODE
GROUP BY 1, 2
ORDER BY total_paid DESC;
```

{% endtab %}
{% endtabs %}

### Billing Outlier Detection

Find providers billing significantly above state medians for specific procedures — useful for fraud detection and billing compliance.

{% tabs %}
{% tab title="Snowflake" %}

```sql
WITH state_medians AS (
    SELECT bp.STATE, ps.HCPCS_CODE,
           MEDIAN(ps.AVG_PAID_PER_CLAIM) AS median_per_claim
    FROM DWV.PROVIDER_SPENDING ps
    JOIN DWV.BILLING_PROVIDERS bp ON ps.BILLING_PROVIDER_NPI_NUM = bp.NPI
    GROUP BY 1, 2
    HAVING median_per_claim > 1.00
)
SELECT ps.BILLING_PROVIDER_NPI_NUM, bp.STATE, ps.HCPCS_CODE,
       h.DESCRIPTION,
       ps.AVG_PAID_PER_CLAIM, sm.median_per_claim,
       ps.AVG_PAID_PER_CLAIM / sm.median_per_claim AS ratio_to_median
FROM DWV.PROVIDER_SPENDING ps
JOIN DWV.BILLING_PROVIDERS bp ON ps.BILLING_PROVIDER_NPI_NUM = bp.NPI
JOIN state_medians sm ON bp.STATE = sm.STATE AND ps.HCPCS_CODE = sm.HCPCS_CODE
LEFT JOIN DWV.HCPCS_CODES h ON ps.HCPCS_CODE = h.HCPCS_CODE
WHERE ps.AVG_PAID_PER_CLAIM > sm.median_per_claim * 3
ORDER BY ratio_to_median DESC
LIMIT 100;
```

{% endtab %}
{% endtabs %}

### Tracking Data Changes Over Time

`FEEDS_FILES` records every batch load with `row_count_delta` showing what changed. Use this to monitor source data updates.

{% tabs %}
{% tab title="Snowflake" %}

```sql
SELECT source_name, data_period, row_count, row_count_delta,
       created_at AS loaded_at, is_latest
FROM DWV.FEEDS_FILES
WHERE source_name = 'spending'
ORDER BY created_at;
```

{% endtab %}

{% tab title="Databricks" %}

```sql
SELECT source_name, data_period, row_count, row_count_delta,
       created_at AS loaded_at, is_latest
FROM cms_tmsis_provider_spending_dwv.feeds_files
WHERE source_name = 'spending'
ORDER BY created_at;
```

{% endtab %}
{% endtabs %}

## Who Uses This Data

### Common Use Cases

* **Medicaid billing outlier detection** — Identify providers billing significantly above state medians by procedure code and geography, flagging potential fraud or billing errors
* **Medicaid spending trend analysis** — Track spending patterns across 84 months (2018-2024) including COVID-19 pandemic impact on Medicaid utilization and costs
* **Cross-state Medicaid comparisons** — Compare per-provider and per-procedure spending across 50+ states with quality-adjusted analysis using CMS DQ Atlas ratings
* **Provider network analysis** — Map billing-to-servicing provider relationships to identify coverage gaps, network concentration, and referral patterns
* **HCPCS procedure cost benchmarking** — Benchmark procedure-level costs against state and national medians for rate-setting and contract negotiation
* **Grant writing and policy research** — Access structured Medicaid expenditure data for FQHC grant applications, state Medicaid program evaluations, and federal policy analysis

### Related Datasets and Research

This dataset pairs well with:

* [**CMS NPPES Provider Dataset**](/data-catalog/cms-nppes-provider-dataset.md) — Extended NPI provider attributes including additional practice locations, other names, and full taxonomy classification details
* [**HRSA Healthcare Resources**](/data-catalog/hrsa-dataset.md) — County-level healthcare workforce and shortage area data for correlating spending patterns with provider supply
* [**CMS Data Feeds Dataset**](/data-catalog/cms-data-feeds-dataset.md) — Medicare physician spending comparisons via the Medicare Provider Utilization and Payment Data feeds

### Frequently Asked Questions

**What states are included in the T-MSIS provider spending data?** All 50 US states plus DC, Puerto Rico, US Virgin Islands, and Guam — 54 jurisdictions total. Coverage varies by year as states onboarded to T-MSIS between 2015-2020. By 2020, all states report through T-MSIS.

**How often is the data updated?** CMS publishes new T-MSIS spending data approximately annually. Our pipeline automatically monitors for new releases and loads updated data within days of publication — no manual intervention needed. Coverage currently spans 2018-present and grows with each CMS release.

**What does "Unusable" mean in STATE\_QUALITY?** CMS's DQ Atlas applies stringent quality thresholds to T-MSIS data. Most states (\~50 of 54) receive an "Unusable" rating on at least one spending-related measure. This reflects CMS's internal quality standards — it does not mean the spending data is invalid or should be discarded. Use OVERALL\_QUALITY to identify states with known reporting gaps and adjust analysis accordingly.

**Why do only \~83% of HCPCS codes have descriptions?** The HCPCS\_CODES table combines CMS's Physician Fee Schedule (CPT codes) and HCPCS Level II quarterly file. The remaining \~17% are temporary local codes (e.g., T-codes, S-codes), state-specific codes, and ADA dental codes not published in national CMS reference files. Always use LEFT JOIN when joining PROVIDER\_SPENDING to HCPCS\_CODES.

{% hint style="success" %}
**Ready to access CMS Medicaid Provider Spending data?**
{% endhint %}

| Platform       | Action                                                                    |
| -------------- | ------------------------------------------------------------------------- |
| **Snowflake**  | [Get Listing](https://app.snowflake.com/marketplace/listing/GZT1Z7QRSYVE) |
| **Databricks** | [Free Trial](https://trial.dataplex-consulting.com)                       |

{% hint style="success" %}
Questions? [Contact our team](mailto:support@dataplex-consulting.com) for a walkthrough.
{% endhint %}


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.dataplex-consulting.com/data-catalog/cms-medicaid-provider-spending-dataset.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
