FDA FAERS Dataset

About the Dataset

The FDA FAERS (FDA Adverse Event Reporting System) Dataset is a comprehensive data product providing normalized, cleaned, and analytically-ready access to pharmaceutical drug adverse event reports. This dataset encompasses over 18 million adverse event records, 65 million drug records, 54 million reaction records, and 18 million patient records, with advanced data quality enhancements and 4 pre-aggregated analytical reporting models for drug safety surveillance.

Quick Access

Base Tables: Adverse events, drugs involved, patient demographics, reported reactions, and report sources Aggregate Models: 4 pre-built analytical views for drug safety analysis, risk profiling, and trend monitoring Update Frequency: Quarterly from FDA FAERS database

Overview

The FDA FAERS Dataset provides comprehensive access to pharmaceutical drug adverse event data including:

  • Core Event Data (drug__events) - Main table with cleaned adverse event reports

  • Drug Information (drug__events_drugs) - Detailed information about drugs involved in events

  • Patient Demographics (drug__events_patients) - Normalized patient data with age and demographic information

  • Adverse Reactions (drug__events_reactions) - MedDRA-coded reactions and outcomes

  • Report Sources (drug__events_report_sources) - Information about who reported the events

  • Analytical Reports - Pre-aggregated intelligence for drug safety analysis, demographic risk profiling, and trend monitoring

Dataset Structure

The FDA FAERS dataset is organized around adverse event reports, with connected tables containing detailed information about the drugs involved, patient demographics, reported reactions, and report sources.

FDA FAERS Dataset Structure

Base Tables

drug__events (Main Events Table)

Core adverse event reports containing event-level information. Each record represents a single adverse event report identified by a unique safety_report_id.

Key Features:

  • Over 18 million adverse event records

  • Clean, standardized data ready for analysis

  • Complete event details including dates, countries, and seriousness criteria

  • Reporter organization information for source tracking

drug__events_drugs (Drug Details Table)

Drug-specific information for each adverse event. Each record represents one drug involved in an adverse event report.

Key Features:

  • Over 65 million drug records with detailed information

  • Drug characterization (Suspect, Concomitant, Interacting)

  • Dosage information and administration routes

  • Indications for use and actions taken after the event

drug__events_patients (Patient Information Table)

Patient demographic information with comprehensive data cleaning and normalization.

Key Features:

  • Over 18 million patient records

  • Standardized age data in years and FDA age group categories

  • Weight data normalized to kilograms

  • Age at onset information for temporal analysis

drug__events_reactions (Adverse Reactions Table)

Reported adverse reactions and outcomes using standardized MedDRA terminology.

Key Features:

  • Over 54 million reaction records

  • MedDRA Preferred Terms and Lowest Level Terms

  • Reaction outcomes (Recovered, Fatal, etc.)

  • Links to specific patients and drugs through sequence numbers

drug__events_report_sources (Report Sources Table)

Information about the sources of adverse event reports.

Key Features:

  • Reporter qualifications and credentials

  • Primary source country information

  • Report source types for quality assessment

  • Links to main event records

Aggregate/Reporting Models

drug__agg_drug_reaction_associations

Pre-aggregated drug-reaction associations with statistical significance measures for safety signal detection.

Use Cases:

  • Drug safety signal detection and monitoring

  • Adverse reaction frequency analysis

  • Comparative safety assessments across drugs

  • Pharmacovigilance reporting and alerts

Key Metrics:

  • Association strength scores between drugs and reactions

  • Occurrence counts and statistical significance

  • Serious event and death rates by drug-reaction pair

  • Percentage of drug reports with specific reactions

drug__agg_manufacturer_adverse_events

Manufacturer-level adverse event analytics with portfolio safety metrics and competitive benchmarks.

Use Cases:

  • Manufacturer safety performance benchmarking

  • Portfolio risk assessment across drug products

  • Competitive intelligence for pharmaceutical companies

  • Regulatory compliance monitoring

Key Metrics:

  • Total reports by manufacturer with YoY comparisons

  • Serious event and death rates by company

  • Number of unique drugs with adverse events

  • Trend analysis and growth metrics

drug__agg_demographic_risk_profiles

Demographic risk analysis segmented by patient age groups and gender for targeted safety monitoring.

Use Cases:

  • Identifying high-risk patient populations

  • Age and gender-based safety signal detection

  • Pediatric and geriatric drug safety monitoring

  • Demographic-specific risk assessment

Key Metrics:

  • Risk scores by demographic segments

  • Event counts and rates by age group and gender

  • Comparative risk levels across populations

  • Statistical significance thresholds (10+ events)

Time-series safety trends with statistical trend detection and forecasting capabilities.

Use Cases:

  • Safety signal trend monitoring over time

  • Emerging risk detection and early warnings

  • Seasonal pattern analysis in adverse events

  • Post-market surveillance and monitoring

Key Metrics:

  • Multiple time granularities (Daily, Weekly, Monthly, Quarterly)

  • Moving averages and trend indicators

  • Serious event and death rate trends

  • Statistical trend direction classifications

Data Quality Improvements

Patient Information Standardization

Age Normalization

  • All patient ages converted to years for consistent analysis

  • FDA age group categories applied (Neonate, Infant, Child, Adolescent, Adult, Elderly)

  • Handles various reporting formats and units

Weight Standardization

  • All weights normalized to kilograms

  • Automatic unit detection and conversion

  • Validation for realistic weight ranges

Demographic Enhancement

  • Gender categories standardized (Male, Female, NULL for unknown)

  • Missing data clearly identified for transparency

  • Age at onset calculations for temporal analysis

Drug Information Enhancements

Drug Name Standardization

  • Medicinal product names cleaned and normalized

  • Brand names, generic names, and ingredients handled

  • Consistent formatting for analysis

Characterization Categories

  • Clear classification: Suspect, Concomitant, Interacting

  • Action taken standardization (Withdrawn, Dose reduced, etc.)

  • Route of administration normalization

Reaction Coding

MedDRA Standardization

  • Preferred Terms (PT) and Lowest Level Terms (LLT) included

  • Hierarchical coding system for reaction analysis

  • Outcome standardization for severity assessment

Getting Started

Basic Query Examples

Recent Adverse Events Overview

-- Search for adverse events involving a specific drug
SELECT 
    e.safety_report_id,
    e.receipt_date,
    e.occur_country,
    e.serious,
    d.medicinalproduct,
    d.drug_characterization,
    d.drug_indication,
    d.drug_dose_text
FROM drug__events e
INNER JOIN drug__events_drugs d
    ON e.safety_report_id = d.safety_report_id
WHERE d.medicinalproduct ILIKE '%ozempic%'
LIMIT 100;

Top Adverse Reactions Analysis

-- Count top adverse reactions for a specific drug
SELECT 
    r.reaction_meddra_pt as adverse_reaction,
    COUNT(*) as report_count,
    COUNT(CASE WHEN e.serious = true THEN 1 END) as serious_cases,
    ROUND(COUNT(CASE WHEN e.serious = true THEN 1 END) * 100.0 / COUNT(*), 2) as serious_percentage
FROM drug__events e
INNER JOIN drug__events_drugs d
    ON e.safety_report_id = d.safety_report_id
INNER JOIN drug__events_reactions r
    ON e.safety_report_id = r.safety_report_id
WHERE d.medicinalproduct ILIKE '%rinvoq%'
    AND d.drug_characterization = 'Suspect'
    AND r.reaction_meddra_pt IS NOT NULL
GROUP BY r.reaction_meddra_pt
ORDER BY report_count DESC
LIMIT 10;

Geographic Distribution Analysis

-- Analyze adverse events by country and seriousness
SELECT 
    occur_country,
    COUNT(*) as total_reports,
    COUNT(CASE WHEN serious = true THEN 1 END) as serious_reports,
    COUNT(CASE WHEN seriousness_death = true THEN 1 END) as fatal_reports,
    ROUND(COUNT(CASE WHEN serious = true THEN 1 END) * 100.0 / COUNT(*), 2) as serious_percentage
FROM drug__events
WHERE receipt_date >= CURRENT_DATE - INTERVAL '2 years'
    AND occur_country IS NOT NULL
GROUP BY occur_country
ORDER BY total_reports DESC
LIMIT 15;

Advanced Analytics Examples

Demographic Risk Analysis

-- Analyze patient demographics for serious adverse events
SELECT 
    p.patient_sex,
    p.patient_age_group,
    COUNT(*) as patient_count,
    AVG(p.patient_age_years) as avg_age,
    COUNT(CASE WHEN e.seriousness_death = true THEN 1 END) as fatal_cases
FROM drug__events e
INNER JOIN drug__events_patients p
    ON e.safety_report_id = p.safety_report_id
WHERE e.serious = true
    AND e.receipt_date >= '2022-01-01'
    AND p.patient_sex IS NOT NULL
    AND p.patient_age_group IS NOT NULL
GROUP BY p.patient_sex, p.patient_age_group
ORDER BY patient_count DESC;

Manufacturer Safety Performance

-- Top pharmaceutical manufacturers by adverse event volume
SELECT 
    e.sender_organization_name as manufacturer,
    COUNT(DISTINCT e.safety_report_id) as total_reports,
    COUNT(DISTINCT d.medicinalproduct) as unique_drugs,
    COUNT(CASE WHEN e.serious = true THEN 1 END) as serious_events,
    ROUND(COUNT(CASE WHEN e.serious = true THEN 1 END) * 100.0 / COUNT(*), 2) as serious_rate
FROM drug__events e
LEFT JOIN drug__events_drugs d
    ON e.safety_report_id = d.safety_report_id
WHERE e.receipt_date >= '2023-01-01'
    AND e.sender_organization_name IS NOT NULL
    AND e.sender_organization_name != ''
GROUP BY e.sender_organization_name
HAVING COUNT(DISTINCT e.safety_report_id) >= 100
ORDER BY total_reports DESC
LIMIT 20;

Temporal Trend Analysis

-- Monthly adverse event trends
SELECT 
    DATE_TRUNC('month', receipt_date) as report_month,
    COUNT(*) as total_events,
    COUNT(CASE WHEN serious = true THEN 1 END) as serious_events,
    COUNT(DISTINCT sender_organization_name) as reporting_companies,
    COUNT(DISTINCT occur_country) as reporting_countries
FROM drug__events
WHERE receipt_date >= CURRENT_DATE - INTERVAL '2 years'
  AND receipt_date < CURRENT_DATE
GROUP BY DATE_TRUNC('month', receipt_date)
ORDER BY report_month DESC;

Using Pre-Aggregated Analytics

Drug-Reaction Association Analysis

-- Use pre-built drug-reaction associations for quick insights
SELECT 
    drug_name,
    reaction,
    occurrence_count,
    serious_rate,
    death_rate,
    association_strength_score
FROM drug__agg_drug_reaction_associations
WHERE occurrence_count >= 1000  -- Focus on statistically significant associations
    AND serious_rate >= 50.0     -- High seriousness rate
ORDER BY association_strength_score DESC
LIMIT 25;

Demographic Risk Profiling

-- Identify high-risk demographics for specific drugs
SELECT 
    drug_name,
    age_group,
    patient_sex,
    event_count,
    serious_rate,
    risk_score,
    risk_level
FROM drug__agg_demographic_risk_profiles
WHERE drug_name ILIKE '%metformin%'
    AND event_count >= 10  -- Statistical significance threshold
ORDER BY risk_score DESC;

Safety Trend Monitoring

-- Monitor monthly safety trends for a drug
SELECT 
    drug_name,
    time_period,
    period_type,
    event_count,
    serious_rate,
    trend_direction,
    moving_avg_30d
FROM drug__agg_safety_trends
WHERE drug_name ILIKE '%humira%'
    AND period_type = 'Monthly'
    AND time_period >= '2023-01-01'
ORDER BY time_period DESC;

Why Choose This Dataset

  • Ready for Analysis: Clean, normalized data with standardized terminology

  • Comprehensive Coverage: Complete adverse event data with all related information connected

  • Pre-Built Intelligence: Aggregated analytics for immediate insights without complex queries

  • Statistical Significance: Built-in thresholds and association measures for reliable analysis

  • Performance Optimized: Denormalized keys and pre-aggregations for fast query performance

Data Update Frequency and Freshness

Update Schedule

  • FDA Updates: FDA releases new FAERS data quarterly

  • Our Updates: We check for new data twice daily and load it immediately when available

  • Typical Data Lag: Drug adverse event data is typically 3-6 months behind real-time due to FDA's quarterly release cycle

Understanding Data Timeliness

The FDA FAERS data has a significant lag due to the quarterly reporting cycle:

  • Adverse events are collected throughout each quarter

  • FDA processes and validates reports after quarter end

  • Data is typically released 3-4 months after the quarter closes

What This Means for You

  • Most recent adverse event data will be 3-6 months old

  • This is inherent to FDA's quarterly release schedule for ensuring data quality

Important Data Interpretation Guidelines

  • Adverse event reports are voluntary submissions and may not represent all occurrences

  • Reports do not establish causation between drugs and adverse events

  • Duplicate reports may exist despite FDA's deduplication efforts

  • Reporting rates can be influenced by media attention, new drug launches, and other factors

  • Statistical associations should be validated with clinical expertise

FDA Documentation Resources

Last updated