# HRSA Healthcare Resources Dataset

### About the Dataset

The HRSA Healthcare Resources Dataset provides comprehensive, county-level access to US healthcare workforce, shortage area designations, and health center locations from the Health Resources and Services Administration. Four government datasets are combined into a single database: physician and nursing workforce counts (AHRF), medically underserved area designations (MUA/MUP), health professional shortage area scores (HPSA), and 14,000+ federally qualified health center sites.

{% hint style="info" %}
**Get Full Access** | [Snowflake Marketplace](https://app.snowflake.com/marketplace/listing/GZT1Z7QRSW96) | [Free Trial](https://trial.dataplex-consulting.com)
{% endhint %}

### Quick Access

**Tables**: AHRF, SHORTAGE\_DESIGNATIONS, HPSA, HEALTH\_CENTER\_SITES\
**Sources**: 4 government data sources from data.hrsa.gov\
**Update Frequency**: AHRF annual, Shortage Designations monthly, HPSA and Health Center Sites quarterly\
**Geography**: All 3,200+ US counties

## Overview

The HRSA Healthcare Resources Dataset provides comprehensive access to US healthcare workforce and shortage data including:

* **AHRF** (Area Health Resources Files) - Physician and nursing workforce counts with per-100k population rates by county
* **SHORTAGE\_DESIGNATIONS** - Medically Underserved Areas and Populations (MUA/MUP) designations by county
* **HPSA** - Health Professional Shortage Area scores for Primary Care, Dental, and Mental Health by county
* **HEALTH\_CENTER\_SITES** - FQHC and Look-Alike service delivery site locations with addresses and coordinates

### Metadata Tables

Every Dataplex data product includes these standard metadata tables:

| Table             | Purpose                                                        |
| ----------------- | -------------------------------------------------------------- |
| `FEEDS`           | Dataset catalog — available tables, descriptions, update dates |
| `FEEDS_FILES`     | Batch load history with `is_latest` flag for data freshness    |
| `CHANGELOG`       | Change log — data loads, schema changes, corrections           |
| `DATA_DICTIONARY` | Column descriptions for all tables                             |

## Entity Relationship Diagram

![HRSA Entity Relationship Diagram](/files/KpDTexccNTIULH6vXZk7)

All four data tables join on `fips_county` for cross-source county-level analysis. Each table links to `FEEDS` and `FEEDS_FILES` via `feed_id` and `feeds_files_id` for data lineage and freshness tracking.

## Data Tables

### AHRF (Area Health Resources Files)

Health workforce data by US county — physicians, nurses, and population with per-100k rates. Unpivoted to latest and previous data periods for trend analysis.

**Key Features:**

* \~3,235 rows per vintage (one per US county)
* Physician and nursing counts with population-adjusted rates
* Latest and previous data years for year-over-year comparison
* Annual updates from HRSA (\~Q1 each year)

#### Column Reference

| Column                    | Type      | Description                                                                                        |
| ------------------------- | --------- | -------------------------------------------------------------------------------------------------- |
| fips\_county              | VARCHAR   | Five-digit Federal county code (state FIPS + county FIPS)                                          |
| county\_name              | VARCHAR   | County name with state abbreviation (e.g., New York, NY)                                           |
| state\_name               | VARCHAR   | Full state name                                                                                    |
| state\_abbrev             | VARCHAR   | Two-letter USPS state abbreviation                                                                 |
| fips\_state               | VARCHAR   | Two-digit state FIPS code                                                                          |
| vintage\_year             | VARCHAR   | AHRF release identifier (e.g., '2024-2025'). First four digits = latest data year in that release. |
| data\_year                | NUMBER    | Calendar year for this data point (e.g., 2024). Query WHERE data\_year = 2024 for specific years.  |
| total\_mds                | NUMBER    | Non-federal physicians (MDs) in the county                                                         |
| rn\_full\_time            | NUMBER    | Full-time registered nurses                                                                        |
| rn\_part\_time            | NUMBER    | Part-time registered nurses                                                                        |
| rn\_total                 | NUMBER    | Total registered nurses (full-time + part-time)                                                    |
| lpnlvn\_full\_time        | NUMBER    | Full-time licensed practical/vocational nurses                                                     |
| lpnlvn\_part\_time        | NUMBER    | Part-time licensed practical/vocational nurses                                                     |
| lpnlvn\_total             | NUMBER    | Total LPN/LVNs (full-time + part-time)                                                             |
| nursing\_workforce\_total | NUMBER    | All nurses: RN + LPN/LVN, full and part-time                                                       |
| population                | NUMBER    | County population estimate (NULL for some territories)                                             |
| mds\_per\_100k            | NUMBER    | Physicians per 100,000 population                                                                  |
| nurses\_per\_100k         | NUMBER    | Total nursing workforce per 100,000 population                                                     |
| feed\_id                  | VARCHAR   | FK to FEEDS — identifies which dataset this row belongs to                                         |
| feeds\_files\_id          | VARCHAR   | FK to FEEDS\_FILES — identifies which batch loaded this data                                       |
| created\_at               | TIMESTAMP | When the source data was loaded into the warehouse                                                 |
| updated\_at               | TIMESTAMP | When the table was last rebuilt                                                                    |

### SHORTAGE\_DESIGNATIONS (MUA/MUP)

Medically Underserved Areas and Populations by county — federal shortage area designations with population counts.

**Key Features:**

* \~3,200 rows (one per US county)
* MUA/MUP designation counts and affected population
* Derived from HRSA shapefiles with geospatial county aggregation
* Monthly updates from HRSA

#### Column Reference

| Column                  | Type      | Description                                                  |
| ----------------------- | --------- | ------------------------------------------------------------ |
| fips\_county            | VARCHAR   | Five-digit Federal county code                               |
| mua\_designation\_count | NUMBER    | Number of MUA/MUP designations in the county                 |
| mua\_population         | NUMBER    | Population in Medically Underserved Areas within the county  |
| feed\_id                | VARCHAR   | FK to FEEDS — identifies which dataset this row belongs to   |
| feeds\_files\_id        | VARCHAR   | FK to FEEDS\_FILES — identifies which batch loaded this data |
| created\_at             | TIMESTAMP | When the source data was loaded into the warehouse           |
| updated\_at             | TIMESTAMP | When the table was last rebuilt                              |

### HPSA (Health Professional Shortage Areas)

Health Professional Shortage Areas by county and discipline — Primary Care, Dental, and Mental Health shortage scores aggregated to county level.

**Key Features:**

* \~18,000 designation records across 3 disciplines
* HPSA scores (0-26 scale, higher = more severe shortage)
* Provider shortage estimates (additional providers needed)
* Quarterly updates from HRSA with snapshot-based history

#### Column Reference

| Column                    | Type      | Description                                                                                               |
| ------------------------- | --------- | --------------------------------------------------------------------------------------------------------- |
| fips\_county              | VARCHAR   | Five-digit Federal county code                                                                            |
| county\_name              | VARCHAR   | County name                                                                                               |
| state\_name               | VARCHAR   | Full state name                                                                                           |
| state\_abbrev             | VARCHAR   | Two-letter USPS state abbreviation                                                                        |
| discipline                | VARCHAR   | HPSA discipline: PRIMARY\_CARE, DENTAL, or MENTAL\_HEALTH                                                 |
| snapshot\_month           | VARCHAR   | Month this data snapshot was captured (e.g., '2026-02'). Use for trend analysis across time periods.      |
| hpsa\_score               | NUMBER    | Worst (highest) HPSA score in this county for this discipline (0-26 scale, higher = more severe shortage) |
| designation\_count        | NUMBER    | Number of active HPSA designations in this county for this discipline                                     |
| designated\_population    | NUMBER    | Total population in designated shortage areas within this county                                          |
| underserved\_population   | NUMBER    | Estimated underserved population in this county for this discipline                                       |
| provider\_shortage        | NUMBER    | Number of additional providers needed to remove the shortage designation                                  |
| last\_designation\_update | DATE      | Most recent designation update date for this county and discipline                                        |
| feed\_id                  | VARCHAR   | FK to FEEDS — identifies which dataset this row belongs to                                                |
| feeds\_files\_id          | VARCHAR   | FK to FEEDS\_FILES — identifies which batch loaded this data                                              |
| created\_at               | TIMESTAMP | When the source data was loaded into the warehouse                                                        |
| updated\_at               | TIMESTAMP | When the table was last rebuilt                                                                           |

### HEALTH\_CENTER\_SITES (FQHC and Look-Alike Sites)

FQHC and Look-Alike service delivery sites — individual facility locations with addresses, coordinates, and site types.

**Key Features:**

* \~14,000 individual facility locations
* Addresses, lat/long coordinates, and operating hours
* Site status and health center type classification
* Quarterly updates with snapshot-based history

#### Column Reference

| Column                      | Type      | Description                                                                                          |
| --------------------------- | --------- | ---------------------------------------------------------------------------------------------------- |
| site\_key                   | VARCHAR   | Site identifier (MD5 hash of organization + site name + address)                                     |
| health\_center\_type        | VARCHAR   | Type of health center (e.g., Community Health Center, Migrant Health Center, Look-Alike)             |
| health\_center\_name        | VARCHAR   | Organization name operating this site                                                                |
| site\_name                  | VARCHAR   | Name of this service delivery site                                                                   |
| site\_address               | VARCHAR   | Street address (suppressed for domestic violence shelters)                                           |
| site\_city                  | VARCHAR   | City where the site is located                                                                       |
| state\_abbrev               | VARCHAR   | Two-letter USPS state abbreviation                                                                   |
| zip\_code                   | VARCHAR   | ZIP code (suppressed for domestic violence shelters)                                                 |
| phone                       | VARCHAR   | Site telephone number                                                                                |
| operating\_hours\_per\_week | NUMBER    | Hours per week the site is open for patient visits                                                   |
| site\_status                | VARCHAR   | Current site status (e.g., Active, Inactive)                                                         |
| fips\_county                | VARCHAR   | Five-digit Federal county code — join key to AHRF, HPSA, and other county-level tables               |
| county\_name                | VARCHAR   | County name                                                                                          |
| state\_name                 | VARCHAR   | Full state name                                                                                      |
| longitude                   | NUMBER    | Site longitude (WGS84)                                                                               |
| latitude                    | NUMBER    | Site latitude (WGS84)                                                                                |
| snapshot\_month             | VARCHAR   | Month this data snapshot was captured (e.g., '2026-02'). Use for trend analysis across time periods. |
| feed\_id                    | VARCHAR   | FK to FEEDS — identifies which dataset this row belongs to                                           |
| feeds\_files\_id            | VARCHAR   | FK to FEEDS\_FILES — identifies which batch loaded this data                                         |
| created\_at                 | TIMESTAMP | When the source data was loaded into the warehouse                                                   |
| updated\_at                 | TIMESTAMP | When the table was last rebuilt                                                                      |

## Data Quality

### Standardization

* **Column mapping**: AHRF source columns embed year suffixes (e.g., `MD_NF_24`, `MD_NF_23`) that change annually. An automated column mapper produces year-agnostic names (`total_mds`, `mds_per_100k`).
* **Type casting**: All raw data arrives as VARCHAR and is cast to appropriate types (NUMBER, DATE) in the staging layer using `TRY_TO_NUMBER()`.
* **Geospatial aggregation**: Shortage Designation shapefiles are intersected with county boundaries to produce county-level MUA/MUP summaries.
* **HPSA aggregation**: Component-level HPSA records are aggregated to county level per discipline, taking the worst (highest) score per county.
* **Deduplication**: Each source publishes full files. The pipeline deduplicates to the latest version per primary key.

### Data Freshness

Check when data was last updated:

```sql
SELECT source_name, data_period, row_count, created_at, is_latest
FROM DWV.FEEDS_FILES
WHERE is_latest = TRUE;
```

## Getting Started

### Platform Schema Reference

This dataset is available on both Snowflake and Databricks. Queries use schema-only references — the database is already set by the share or catalog context:

| Platform       | Schema     | Example         |
| -------------- | ---------- | --------------- |
| **Snowflake**  | `DWV`      | `DWV.AHRF`      |
| **Databricks** | `hrsa_dwv` | `hrsa_dwv.ahrf` |

### Discover Available Data

Start with the `FEEDS` table to see what's available, and `FEEDS_FILES` to understand data freshness and load history.

{% tabs %}
{% tab title="Snowflake" %}

```sql
-- Browse the data catalog
SELECT table_name, description, latest_data_period,
       latest_row_count, last_loaded_at, total_batches
FROM DWV.FEEDS
ORDER BY table_name;
```

{% endtab %}

{% tab title="Databricks" %}

```sql
-- Browse the data catalog
SELECT table_name, description, latest_data_period,
       latest_row_count, last_loaded_at, total_batches
FROM hrsa_dwv.feeds
ORDER BY table_name;
```

{% endtab %}
{% endtabs %}

{% tabs %}
{% tab title="Snowflake" %}

```sql
-- Check data freshness per source
-- is_latest = TRUE marks the current batch for each source
SELECT source_name, data_period, row_count, row_count_delta,
       created_at AS loaded_at, is_latest
FROM DWV.FEEDS_FILES
ORDER BY source_name, created_at DESC;
```

{% endtab %}

{% tab title="Databricks" %}

```sql
-- Check data freshness per source
SELECT source_name, data_period, row_count, row_count_delta,
       created_at AS loaded_at, is_latest
FROM hrsa_dwv.feeds_files
ORDER BY source_name, created_at DESC;
```

{% endtab %}
{% endtabs %}

### Working with Data Lineage

Every data row links to `FEEDS_FILES` via `feeds_files_id`, which tells you exactly which batch loaded that data. Use this to filter to the current data version or trace any row back to its source load.

{% tabs %}
{% tab title="Snowflake" %}

```sql
-- Get current workforce data with batch lineage
-- JOIN to FEEDS_FILES to see when data was loaded and which period it covers
SELECT
    a.fips_county,
    a.county_name,
    a.state_abbrev,
    a.data_year,
    a.total_mds,
    a.mds_per_100k,
    a.nurses_per_100k,
    ff.data_period,
    ff.created_at AS data_loaded_at
FROM DWV.AHRF a
JOIN DWV.FEEDS_FILES ff ON a.feeds_files_id = ff.id
WHERE ff.is_latest = TRUE
  AND a.data_year = (SELECT MAX(data_year) FROM DWV.AHRF)
ORDER BY a.mds_per_100k DESC NULLS LAST
LIMIT 20;
```

{% endtab %}

{% tab title="Databricks" %}

```sql
-- Get current workforce data with batch lineage
SELECT
    a.fips_county,
    a.county_name,
    a.state_abbrev,
    a.data_year,
    a.total_mds,
    a.mds_per_100k,
    a.nurses_per_100k,
    ff.data_period,
    ff.created_at AS data_loaded_at
FROM hrsa_dwv.ahrf a
JOIN hrsa_dwv.feeds_files ff ON a.feeds_files_id = ff.id
WHERE ff.is_latest = TRUE
  AND a.data_year = (SELECT MAX(data_year) FROM hrsa_dwv.ahrf)
ORDER BY a.mds_per_100k DESC NULLS LAST
LIMIT 20;
```

{% endtab %}
{% endtabs %}

### Healthcare Workforce Analysis

{% tabs %}
{% tab title="Snowflake" %}

```sql
-- Counties with critical workforce shortages
-- Cross-source analysis: workforce + HPSA scores + MUA designations
SELECT
    a.fips_county,
    a.county_name,
    a.state_abbrev,
    a.total_mds,
    a.mds_per_100k,
    a.nurses_per_100k,
    h.hpsa_score,
    h.provider_shortage AS providers_needed,
    s.mua_designation_count
FROM DWV.AHRF a
JOIN DWV.FEEDS_FILES ff_a ON a.feeds_files_id = ff_a.id AND ff_a.is_latest = TRUE
JOIN DWV.HPSA h ON a.fips_county = h.fips_county AND h.discipline = 'PRIMARY_CARE'
JOIN DWV.FEEDS_FILES ff_h ON h.feeds_files_id = ff_h.id AND ff_h.is_latest = TRUE
LEFT JOIN DWV.SHORTAGE_DESIGNATIONS s ON a.fips_county = s.fips_county
WHERE a.data_year = (SELECT MAX(data_year) FROM DWV.AHRF)
  AND h.hpsa_score >= 15
ORDER BY h.hpsa_score DESC
LIMIT 50;
```

{% endtab %}

{% tab title="Databricks" %}

```sql
-- Counties with critical workforce shortages
SELECT
    a.fips_county,
    a.county_name,
    a.state_abbrev,
    a.total_mds,
    a.mds_per_100k,
    a.nurses_per_100k,
    h.hpsa_score,
    h.provider_shortage AS providers_needed,
    s.mua_designation_count
FROM hrsa_dwv.ahrf a
JOIN hrsa_dwv.feeds_files ff_a ON a.feeds_files_id = ff_a.id AND ff_a.is_latest = TRUE
JOIN hrsa_dwv.hpsa h ON a.fips_county = h.fips_county AND h.discipline = 'PRIMARY_CARE'
JOIN hrsa_dwv.feeds_files ff_h ON h.feeds_files_id = ff_h.id AND ff_h.is_latest = TRUE
LEFT JOIN hrsa_dwv.shortage_designations s ON a.fips_county = s.fips_county
WHERE a.data_year = (SELECT MAX(data_year) FROM hrsa_dwv.ahrf)
  AND h.hpsa_score >= 15
ORDER BY h.hpsa_score DESC
LIMIT 50;
```

{% endtab %}
{% endtabs %}

{% tabs %}
{% tab title="Snowflake" %}

```sql
-- State-level workforce summary for grant writing
-- Aggregate county data to state level for federal funding applications
SELECT
    a.state_abbrev,
    a.state_name,
    COUNT(DISTINCT a.fips_county) AS counties,
    SUM(a.population) AS total_population,
    SUM(a.total_mds) AS total_physicians,
    ROUND(SUM(a.total_mds) * 100000.0 / NULLIF(SUM(a.population), 0), 1) AS state_mds_per_100k,
    SUM(h.provider_shortage) AS total_providers_needed,
    COUNT(CASE WHEN h.hpsa_score >= 15 THEN 1 END) AS critical_shortage_counties
FROM DWV.AHRF a
JOIN DWV.FEEDS_FILES ff ON a.feeds_files_id = ff.id AND ff.is_latest = TRUE
LEFT JOIN DWV.HPSA h ON a.fips_county = h.fips_county AND h.discipline = 'PRIMARY_CARE'
    AND h.feeds_files_id IN (SELECT id FROM DWV.FEEDS_FILES WHERE source_name = 'hpsa' AND is_latest = TRUE)
WHERE a.data_year = (SELECT MAX(data_year) FROM DWV.AHRF)
GROUP BY a.state_abbrev, a.state_name
ORDER BY total_providers_needed DESC NULLS LAST;
```

{% endtab %}

{% tab title="Databricks" %}

```sql
-- State-level workforce summary for grant writing
SELECT
    a.state_abbrev,
    a.state_name,
    COUNT(DISTINCT a.fips_county) AS counties,
    SUM(a.population) AS total_population,
    SUM(a.total_mds) AS total_physicians,
    ROUND(SUM(a.total_mds) * 100000.0 / NULLIF(SUM(a.population), 0), 1) AS state_mds_per_100k,
    SUM(h.provider_shortage) AS total_providers_needed,
    COUNT(CASE WHEN h.hpsa_score >= 15 THEN 1 END) AS critical_shortage_counties
FROM hrsa_dwv.ahrf a
JOIN hrsa_dwv.feeds_files ff ON a.feeds_files_id = ff.id AND ff.is_latest = TRUE
LEFT JOIN hrsa_dwv.hpsa h ON a.fips_county = h.fips_county AND h.discipline = 'PRIMARY_CARE'
    AND h.feeds_files_id IN (SELECT id FROM hrsa_dwv.feeds_files WHERE source_name = 'hpsa' AND is_latest = TRUE)
WHERE a.data_year = (SELECT MAX(data_year) FROM hrsa_dwv.ahrf)
GROUP BY a.state_abbrev, a.state_name
ORDER BY total_providers_needed DESC NULLS LAST;
```

{% endtab %}
{% endtabs %}

### FQHC Network Analysis

{% tabs %}
{% tab title="Snowflake" %}

```sql
-- Health centers in high-shortage counties
-- Site-level facility data joined with county shortage context
SELECT
    hc.health_center_name,
    hc.site_name,
    hc.site_city,
    hc.state_abbrev,
    hc.operating_hours_per_week,
    h.hpsa_score,
    h.designated_population,
    a.mds_per_100k
FROM DWV.HEALTH_CENTER_SITES hc
JOIN DWV.FEEDS_FILES ff_hc ON hc.feeds_files_id = ff_hc.id AND ff_hc.is_latest = TRUE
JOIN DWV.HPSA h ON hc.fips_county = h.fips_county AND h.discipline = 'PRIMARY_CARE'
JOIN DWV.FEEDS_FILES ff_h ON h.feeds_files_id = ff_h.id AND ff_h.is_latest = TRUE
JOIN DWV.AHRF a ON hc.fips_county = a.fips_county
    AND a.data_year = (SELECT MAX(data_year) FROM DWV.AHRF)
WHERE h.hpsa_score >= 20
  AND hc.site_status = 'Active'
ORDER BY h.hpsa_score DESC, hc.health_center_name
LIMIT 25;
```

{% endtab %}

{% tab title="Databricks" %}

```sql
-- Health centers in high-shortage counties
SELECT
    hc.health_center_name,
    hc.site_name,
    hc.site_city,
    hc.state_abbrev,
    hc.operating_hours_per_week,
    h.hpsa_score,
    h.designated_population,
    a.mds_per_100k
FROM hrsa_dwv.health_center_sites hc
JOIN hrsa_dwv.feeds_files ff_hc ON hc.feeds_files_id = ff_hc.id AND ff_hc.is_latest = TRUE
JOIN hrsa_dwv.hpsa h ON hc.fips_county = h.fips_county AND h.discipline = 'PRIMARY_CARE'
JOIN hrsa_dwv.feeds_files ff_h ON h.feeds_files_id = ff_h.id AND ff_h.is_latest = TRUE
JOIN hrsa_dwv.ahrf a ON hc.fips_county = a.fips_county
    AND a.data_year = (SELECT MAX(data_year) FROM hrsa_dwv.ahrf)
WHERE h.hpsa_score >= 20
  AND hc.site_status = 'Active'
ORDER BY h.hpsa_score DESC, hc.health_center_name
LIMIT 25;
```

{% endtab %}
{% endtabs %}

### Complete County Health Profile

{% tabs %}
{% tab title="Snowflake" %}

```sql
-- Full cross-source county profile: workforce + shortages + facilities
-- Joins all 4 data tables on fips_county using is_latest for current data
SELECT
    a.fips_county,
    a.county_name,
    a.state_abbrev,
    a.total_mds,
    a.mds_per_100k,
    a.nurses_per_100k,
    h.hpsa_score AS pc_shortage_score,
    h.provider_shortage AS pc_providers_needed,
    s.mua_designation_count,
    s.mua_population,
    COUNT(hc.site_key) AS health_center_count,
    ff_a.data_period AS ahrf_period,
    ff_h.data_period AS hpsa_period
FROM DWV.AHRF a
JOIN DWV.FEEDS_FILES ff_a ON a.feeds_files_id = ff_a.id AND ff_a.is_latest = TRUE
LEFT JOIN DWV.HPSA h ON a.fips_county = h.fips_county AND h.discipline = 'PRIMARY_CARE'
LEFT JOIN DWV.FEEDS_FILES ff_h ON h.feeds_files_id = ff_h.id AND ff_h.is_latest = TRUE
LEFT JOIN DWV.SHORTAGE_DESIGNATIONS s ON a.fips_county = s.fips_county
LEFT JOIN DWV.HEALTH_CENTER_SITES hc ON a.fips_county = hc.fips_county
    AND hc.site_status = 'Active'
    AND hc.feeds_files_id IN (SELECT id FROM DWV.FEEDS_FILES WHERE source_name = 'health_center_sites' AND is_latest = TRUE)
WHERE a.data_year = (SELECT MAX(data_year) FROM DWV.AHRF)
GROUP BY 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13
ORDER BY h.hpsa_score DESC NULLS LAST
LIMIT 50;
```

{% endtab %}

{% tab title="Databricks" %}

```sql
-- Full cross-source county profile
SELECT
    a.fips_county,
    a.county_name,
    a.state_abbrev,
    a.total_mds,
    a.mds_per_100k,
    a.nurses_per_100k,
    h.hpsa_score AS pc_shortage_score,
    h.provider_shortage AS pc_providers_needed,
    s.mua_designation_count,
    s.mua_population,
    COUNT(hc.site_key) AS health_center_count,
    ff_a.data_period AS ahrf_period,
    ff_h.data_period AS hpsa_period
FROM hrsa_dwv.ahrf a
JOIN hrsa_dwv.feeds_files ff_a ON a.feeds_files_id = ff_a.id AND ff_a.is_latest = TRUE
LEFT JOIN hrsa_dwv.hpsa h ON a.fips_county = h.fips_county AND h.discipline = 'PRIMARY_CARE'
LEFT JOIN hrsa_dwv.feeds_files ff_h ON h.feeds_files_id = ff_h.id AND ff_h.is_latest = TRUE
LEFT JOIN hrsa_dwv.shortage_designations s ON a.fips_county = s.fips_county
LEFT JOIN hrsa_dwv.health_center_sites hc ON a.fips_county = hc.fips_county
    AND hc.site_status = 'Active'
    AND hc.feeds_files_id IN (SELECT id FROM hrsa_dwv.feeds_files WHERE source_name = 'health_center_sites' AND is_latest = TRUE)
WHERE a.data_year = (SELECT MAX(data_year) FROM hrsa_dwv.ahrf)
GROUP BY 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13
ORDER BY h.hpsa_score DESC NULLS LAST
LIMIT 50;
```

{% endtab %}
{% endtabs %}

### Payer Network Adequacy

{% tabs %}
{% tab title="Snowflake" %}

```sql
-- Shortage gap analysis for Medicare Advantage network adequacy
-- Counties where provider supply falls short of HPSA-designated need
SELECT
    a.state_abbrev,
    a.county_name,
    a.fips_county,
    a.mds_per_100k,
    a.population,
    h_pc.hpsa_score AS pc_score,
    h_pc.provider_shortage AS pc_gap,
    h_dent.hpsa_score AS dental_score,
    h_dent.provider_shortage AS dental_gap,
    h_mh.hpsa_score AS mh_score,
    h_mh.provider_shortage AS mh_gap
FROM DWV.AHRF a
JOIN DWV.FEEDS_FILES ff ON a.feeds_files_id = ff.id AND ff.is_latest = TRUE
LEFT JOIN DWV.HPSA h_pc ON a.fips_county = h_pc.fips_county AND h_pc.discipline = 'PRIMARY_CARE'
    AND h_pc.feeds_files_id IN (SELECT id FROM DWV.FEEDS_FILES WHERE source_name = 'hpsa' AND is_latest = TRUE)
LEFT JOIN DWV.HPSA h_dent ON a.fips_county = h_dent.fips_county AND h_dent.discipline = 'DENTAL'
    AND h_dent.feeds_files_id IN (SELECT id FROM DWV.FEEDS_FILES WHERE source_name = 'hpsa' AND is_latest = TRUE)
LEFT JOIN DWV.HPSA h_mh ON a.fips_county = h_mh.fips_county AND h_mh.discipline = 'MENTAL_HEALTH'
    AND h_mh.feeds_files_id IN (SELECT id FROM DWV.FEEDS_FILES WHERE source_name = 'hpsa' AND is_latest = TRUE)
WHERE a.data_year = (SELECT MAX(data_year) FROM DWV.AHRF)
  AND (h_pc.hpsa_score >= 10 OR h_dent.hpsa_score >= 10 OR h_mh.hpsa_score >= 10)
ORDER BY COALESCE(h_pc.hpsa_score, 0) + COALESCE(h_dent.hpsa_score, 0) + COALESCE(h_mh.hpsa_score, 0) DESC
LIMIT 25;
```

{% endtab %}

{% tab title="Databricks" %}

```sql
-- Shortage gap analysis for Medicare Advantage network adequacy
SELECT
    a.state_abbrev,
    a.county_name,
    a.fips_county,
    a.mds_per_100k,
    a.population,
    h_pc.hpsa_score AS pc_score,
    h_pc.provider_shortage AS pc_gap,
    h_dent.hpsa_score AS dental_score,
    h_dent.provider_shortage AS dental_gap,
    h_mh.hpsa_score AS mh_score,
    h_mh.provider_shortage AS mh_gap
FROM hrsa_dwv.ahrf a
JOIN hrsa_dwv.feeds_files ff ON a.feeds_files_id = ff.id AND ff.is_latest = TRUE
LEFT JOIN hrsa_dwv.hpsa h_pc ON a.fips_county = h_pc.fips_county AND h_pc.discipline = 'PRIMARY_CARE'
    AND h_pc.feeds_files_id IN (SELECT id FROM hrsa_dwv.feeds_files WHERE source_name = 'hpsa' AND is_latest = TRUE)
LEFT JOIN hrsa_dwv.hpsa h_dent ON a.fips_county = h_dent.fips_county AND h_dent.discipline = 'DENTAL'
    AND h_dent.feeds_files_id IN (SELECT id FROM hrsa_dwv.feeds_files WHERE source_name = 'hpsa' AND is_latest = TRUE)
LEFT JOIN hrsa_dwv.hpsa h_mh ON a.fips_county = h_mh.fips_county AND h_mh.discipline = 'MENTAL_HEALTH'
    AND h_mh.feeds_files_id IN (SELECT id FROM hrsa_dwv.feeds_files WHERE source_name = 'hpsa' AND is_latest = TRUE)
WHERE a.data_year = (SELECT MAX(data_year) FROM hrsa_dwv.ahrf)
  AND (h_pc.hpsa_score >= 10 OR h_dent.hpsa_score >= 10 OR h_mh.hpsa_score >= 10)
ORDER BY COALESCE(h_pc.hpsa_score, 0) + COALESCE(h_dent.hpsa_score, 0) + COALESCE(h_mh.hpsa_score, 0) DESC
LIMIT 25;
```

{% endtab %}
{% endtabs %}

### Tracking Data Changes Over Time

`FEEDS_FILES` records every batch load with `row_count_delta` showing what changed. Use this to monitor source data updates and understand how the dataset evolves.

{% tabs %}
{% tab title="Snowflake" %}

```sql
-- Track how HPSA data has evolved across batches
-- row_count_delta shows net new designations per load
SELECT
    source_name,
    data_period,
    row_count,
    row_count_delta,
    created_at AS loaded_at,
    is_latest
FROM DWV.FEEDS_FILES
WHERE source_name = 'hpsa'
ORDER BY created_at;
```

{% endtab %}

{% tab title="Databricks" %}

```sql
-- Track how HPSA data has evolved across batches
SELECT
    source_name,
    data_period,
    row_count,
    row_count_delta,
    created_at AS loaded_at,
    is_latest
FROM hrsa_dwv.feeds_files
WHERE source_name = 'hpsa'
ORDER BY created_at;
```

{% endtab %}
{% endtabs %}

{% hint style="success" %}
**Ready to access HRSA data?**
{% endhint %}

| Platform       | Action                                                                           |
| -------------- | -------------------------------------------------------------------------------- |
| **Snowflake**  | [Get on Marketplace](https://app.snowflake.com/marketplace/listing/GZT1Z7QRSW96) |
| **Databricks** | [Free Trial](https://trial.dataplex-consulting.com)                              |

{% hint style="success" %}
Questions? [Contact our team](mailto:support@dataplex-consulting.com) for a walkthrough.
{% endhint %}

## HRSA Data Sources

* [HRSA Data Downloads](https://data.hrsa.gov/data/download)
* [Area Health Resources Files (AHRF)](https://data.hrsa.gov/topics/health-workforce/ahrf)
* [HPSA Find](https://data.hrsa.gov/tools/shortage-area/hpsa-find)
* [Health Center Program](https://data.hrsa.gov/tools/data-portal)


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.dataplex-consulting.com/data-catalog/hrsa-dataset.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
