# CMS Data Research Dataset

The CMS Data Research Dataset is a comprehensive collection of the CMS data feeds published at cms.gov/data-research, with files dating back to 2001. The dataset offers:

* A view per feed with aligned and properly cast file attributes
* Automatic updates when new feed files are received
* Addition of new views as new CMS data research feeds are published
* As of December 2023, 16 feeds with over 2,500 files
* Available on the Snowflake Marketplace

A 14-day free trial is available and includes the first 1,500 rows from every feed file table.

See the full list of feeds included in the [CMS Data Research Catalog](/data-catalog/cms-data-research-dataset/cms-data-research-catalog.md).

### Dataset Features

* Extensive coverage of CMS data feeds
* Automatic daily updates
* Properly aligned and cast attributes for each feed
* Views that are automatically updated with new feed files
* Expandable dataset with new feeds added as they become available
* Easy access through the Snowflake Marketplace

### Data Quality and Maintenance

At Dataplex Consulting & Data Products, we prioritize data quality:

* Daily monitoring of ingestion and ETL jobs
* Automated data quality checks to prevent bad data from reaching customers
* Timely updates when CMS publishes new information
* Consistent data structure across feeds for ease of use

### Business Applications

The CMS Data Research Dataset supports a range of uses, including:

* Enriching or augmenting existing datasets
* Analyzing published feed metrics over time
* Performing segmentation analysis
* Training machine learning models
* Conducting geospatial analysis

### Example Use Cases

1. Analyzing enrollment counts by plan
2. Tracking eligible and enrolled individuals in Part D plans by location
3. Monitoring Special Needs Dual-Eligible Enrollment Counts over time
4. Examining enrollment metrics by state
5. Accessing the most recent plan crosswalk data

### Data Structure

The dataset includes 16 feeds as of December 2023:

1. 2015 Part C\&D Plan Crosswalk
2. Enrollment by Contract
3. MA Contract Service Area
4. MA Enrollment by SCC
5. MA Enrollment by SCP
6. MA State/County Penetration
7. Monthly Enrollment by CPSC
8. Monthly Enrollment by Plan
9. Monthly Enrollment by State
10. PBP Benefits 2017
11. PDP Contract Service Area
12. PDP Enrollment by SCC
13. PDP Enrollment by SCP
14. PDP State/County Penetration
15. SNP Comprehensive Report
16. State Service Area

### Entity Relationship Diagram

![CMS Data Research Schema](/files/MERU1grGR9OCM3U11V6T)

### Sample Queries

#### Query the Enrollment Count by Plan as of October 2019

```sql
-- Enrollment per plan for the October 2019 report period, largest plans first.
select s.organization_type,
       s.plan_id,
       s.plan_type,
       s.organization_name,
       s.enrollment
from dwv.feeds f
join dwv.feeds_files ff
  on f.id = ff.feed_id
join dwv.MONTHLY_ENROLLMENT_BY_PLAN s
  on ff.id = s.file_id
 and f.id = s.file_feed_id
-- Filter on the file's report period rather than inside the join condition.
where ff.file_report_period = to_date('2019-10-01', 'YYYY-MM-DD')
order by s.enrollment desc nulls last;
```
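
When the report period is not known in advance, the same join can target the latest file instead of a fixed date. The following is a minimal sketch, not a definitive query: it assumes only the tables and columns used in the October 2019 query, and that `file_report_period` is a date so that `max()` returns the newest period. The `limit 20` cutoff is arbitrary and only there for illustration.

```sql
-- Sketch: enrollment by plan for the most recent report period on file.
select s.organization_name,
       s.plan_id,
       s.plan_type,
       s.enrollment
from dwv.feeds f
join dwv.feeds_files ff
  on f.id = ff.feed_id
join dwv.MONTHLY_ENROLLMENT_BY_PLAN s
  on ff.id = s.file_id
 and f.id = s.file_feed_id
where ff.file_report_period = (
        -- Newest period that actually has rows in this view.
        select max(ff2.file_report_period)
        from dwv.feeds_files ff2
        join dwv.MONTHLY_ENROLLMENT_BY_PLAN s2
          on ff2.id = s2.file_id
      )
order by s.enrollment desc nulls last
limit 20;  -- arbitrary cutoff for illustration
```

Restricting the subquery to files joined to this view keeps the result from picking up a newer report period that belongs to a different feed.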

#### Query Eligible and Enrolled Counts for Part D Plans in West Baton Rouge by Month

```sql
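-- Monthly eligible, enrolled, and penetration figures for one Louisiana parish.
select ff.file_report_period,
       s.eligibles,
       s.enrolled,
       s.penetration
from dwv.feeds f
join dwv.feeds_files ff
  on f.id = ff.feed_id
join dwv.PDP_STATE_COUNTY_PENETRATION s
  on ff.id = s.file_id
 and f.id = s.file_feed_id
-- Qualify every column so the query stays unambiguous if the views change.
where s.state_name = 'Louisiana'
  and s.county_name = 'West Baton Rouge'
order by ff.file_report_period;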
```
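
To see how enrollment moves from one period to the next, a window function can be layered on the same county query. This is a sketch under stated assumptions: it assumes `enrolled` is numeric and that the view holds one row per report period for the county. `enrolled_change` is an illustrative alias, not a column in the dataset.

```sql
-- Sketch: period-over-period change in Part D enrollment for one county.
select ff.file_report_period,
       s.enrolled,
       -- lag() returns the previous period's value; the first row yields NULL.
       s.enrolled - lag(s.enrolled) over (order by ff.file_report_period)
         as enrolled_change
from dwv.feeds f
join dwv.feeds_files ff
  on f.id = ff.feed_id
join dwv.PDP_STATE_COUNTY_PENETRATION s
  on ff.id = s.file_id
 and f.id = s.file_feed_id
where s.state_name = 'Louisiana'
  and s.county_name = 'West Baton Rouge'
order by ff.file_report_period;
```

If a period is missing from the feed, the difference spans more than one month, so it is worth checking `file_report_period` for gaps before reading the change as monthly.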

### Support and Contact

For any questions or assistance with the CMS Data Research Dataset:

* Email: <support@dataplex-consulting.com>
* Daily monitoring and support provided by the Dataplex Consulting & Data Products team

### About Dataplex

Dataplex Consulting & Data Products delivers turnkey, analytics-ready data products that make complex public and commercial data easy to use across modern data platforms. Our data pipelines include automated quality checks and active monitoring to ensure timely, reliable, and well-structured data that is ready for downstream analytics, machine learning, and operational use.

In addition to data products, Dataplex provides data engineering and analytics consulting services to organizations of all sizes. We bring deep, hands-on experience supporting both early-stage companies and large enterprises, helping teams build scalable data platforms, improve data reliability, and become more data-driven.


