Census LEHD LODES Employment Dataset
About the Dataset
Census block-level employment data from the U.S. Census Bureau's Longitudinal Employer-Household Dynamics (LEHD) program. Origin-destination commute flows, workforce demographics, and job characteristics, with a geographic crosswalk for county, metro, and state aggregation. 22 years of annual data (2002-2023) covering all 50 states, DC, and Puerto Rico.
75,000 files condensed into one SQL query.
Quick Access
Tables: OD, RAC, WAC, XWALK + 4 metadata tables
Sources: 4 Census LEHD data sources
Coverage: All US Census blocks (~8 million), 22 years (2002-2023)
Update Frequency: Annually (1-2 year lag from reference year to Census release)
Overview
The Census LEHD LODES dataset provides comprehensive access to employment geography data including:
Origin-Destination (OD) - Block-to-block commute flows: where workers live and where they work. 2.6 billion rows across 22 years.
Residence Area Characteristics (RAC) - Jobs by where workers live with 42 demographic columns (age, earnings, 20 NAICS sectors, race, ethnicity, education, sex). 119 million rows.
Workplace Area Characteristics (WAC) - Jobs by where people work with 52 columns including firm age and firm size (WAC-only). 48.5 million rows.
Geographic Crosswalk (XWALK) - Maps every 2020 Census block to tracts, counties, metros, congressional districts, ZCTAs, and coordinates. 8.2 million rows. Always fully available (no trial limit).
Metadata Tables
Every Dataplex data product includes these standard metadata tables:
FEEDS - Dataset catalog: available tables, descriptions, update dates
FEEDS_FILES - Batch load history with is_latest flag for data freshness
CHANGELOG - Change log: data loads, schema changes, corrections
DATA_DICTIONARY - Column descriptions for all tables
Entity Relationship Diagram

Join pattern: OD, RAC, and WAC all join to XWALK via geocode columns (w_geocode or h_geocode = tabblk2020) for geographic aggregation from blocks to counties, metros, and states. All data tables link to FEEDS and FEEDS_FILES via feed_id and feeds_files_id for data lineage.
Data Tables
OD (Origin-Destination)
Block-to-block commute flows: where workers live and where they work. Each row represents a unique home-block to work-block pair for a given year. JOIN to XWALK on w_geocode or h_geocode to aggregate to county, metro, or state level.
Key Features:
2.6 billion rows across 22 years (2002-2023)
Census block-level granularity (15-digit FIPS codes)
Job counts segmented by age (3), earnings (3), and industry (3)
main files (both residence and workplace in same state) and aux files (workplace in state, residence elsewhere)
Primary Key: w_geocode + h_geocode + year + part
See Schema Reference for all 19 columns.
RAC (Residence Area Characteristics)
Jobs by where workers live. Each row is a Census block where workers reside, with 42 demographic breakdown columns. JOIN to XWALK on h_geocode to aggregate.
Key Features:
119 million rows across 22 years
20 NAICS industry sectors (CNS01-CNS20)
7 race categories, 2 ethnicity groups, 4 education levels, 2 sex categories
3 age segments and 3 earnings brackets
Primary Key: h_geocode + year
See Schema Reference for all 48 columns.
WAC (Workplace Area Characteristics)
Jobs by where people work. Each row is a Census block where jobs are located. Identical demographic columns to RAC plus firm age (CFA) and firm size (CFS) columns not available in RAC.
Key Features:
48.5 million rows across 22 years
All RAC columns plus 5 firm age groups (CFA01-CFA05) and 5 firm size groups (CFS01-CFS05)
WAC uses noise infusion (NOT synthetic like RAC/OD) — reliable at 10+ jobs per block
Primary Key: w_geocode + year
See Schema Reference for all 58 columns.
XWALK (Geographic Crosswalk)
Reference table mapping every 2020 Census block to higher geographies. This is the aggregation enabler — JOIN OD/RAC/WAC to XWALK to roll up block-level data to counties, metros, states, congressional districts, or ZCTAs.
Key Features:
8.2 million rows (one per 2020 Census block)
Maps to: state, county, tract, block group, CBSA/metro, ZCTA, congressional district, place, school district, and more
Includes block centroid coordinates (latitude/longitude)
Always fully available — no trial limit (essential for any analysis)
Primary Key: tabblk2020
See Schema Reference for all 45 columns.
Data Quality
Data Generation Methods
OD - Full synthetic data generation. Statistically representative; reliable at county level and above.
RAC - Full synthetic data generation. Statistically representative; reliable at county level and above.
WAC - Multiplicative noise infusion. Reliable at 10+ jobs per block.
XWALK - Exact Census geography. Exact.
Standardization
All geocode columns zero-padded to 15 digits (preserved as strings, not integers)
All geographic codes preserve leading zeros (state, county, tract, CBSA)
Year extracted from filenames and added as a typed integer column
All job count columns cast to NUMBER with TRY_TO_NUMBER for safe handling
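The zero-padding rule matters most when geocodes pass through a tool that reads them as integers. The sketch below shows one way to restore the 15-digit form on the client side and slice it into the standard Census components (2-digit state, 3-digit county, 6-digit tract, 4-digit block). The helper names normalize_geocode and split_geocode are hypothetical and not part of this data product.

```python
def normalize_geocode(value) -> str:
    """Return a 15-character block geocode, restoring any lost leading zeros."""
    text = str(value).strip()
    if not text.isdigit() or len(text) > 15:
        raise ValueError(f"not a block geocode: {value!r}")
    return text.zfill(15)


def split_geocode(value) -> dict:
    """Slice a block geocode into cumulative FIPS prefixes (2 + 3 + 6 + 4 digits)."""
    g = normalize_geocode(value)
    return {
        "state": g[:2],         # state FIPS
        "county": g[:5],        # state + county
        "tract": g[:11],        # state + county + tract
        "block_group": g[:12],  # tract + first digit of the block
        "block": g,             # full 15-digit block code
    }


# A California block read as an integer has lost its leading zero.
parts = split_geocode(60014001001000)
print(parts["state"], parts["county"], parts["block"])
```

Keeping the codes as strings end to end avoids needing this repair at all; the helper is only useful at the boundary with tools that coerce types.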
Data Freshness
Check when data was last updated:
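A freshness check might look like the sketch below. It runs against an in-memory SQLite stand-in with made-up rows so it can be executed anywhere. The columns feed_id, feeds_files_id, and is_latest are named in this document; loaded_at is a hypothetical timestamp column, so check DATA_DICTIONARY for the actual name before using the query on Snowflake or Databricks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE feeds_files (
    feeds_files_id INTEGER PRIMARY KEY,
    feed_id        INTEGER,
    is_latest      INTEGER,
    loaded_at      TEXT      -- hypothetical column name
);
-- Made-up rows for illustration only.
INSERT INTO feeds_files VALUES
    (1, 10, 0, '2024-01-15'),
    (2, 10, 1, '2025-02-03'),
    (3, 20, 1, '2025-02-04');
""")

# One row per feed: the batch currently flagged as latest.
FRESHNESS_SQL = """
SELECT feed_id, feeds_files_id, loaded_at
FROM feeds_files
WHERE is_latest
ORDER BY feed_id
"""

latest_loads = conn.execute(FRESHNESS_SQL).fetchall()
for feed_id, batch_id, loaded_at in latest_loads:
    print(feed_id, batch_id, loaded_at)
```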
Getting Started
Platform Schema Reference
This dataset is available on both Snowflake and Databricks. Queries use schema-only references — the database is already set by the share or catalog context:
Snowflake - schema DWV, for example DWV.OD
Databricks - schema census_lehd_lodes_dwv, for example census_lehd_lodes_dwv.od
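If the same query text has to run on both platforms, the schema prefix can be filled in by a small helper. This is only a sketch: qualified_table and the SCHEMAS mapping are hypothetical names, and the upper/lower casing simply mirrors the two examples above.

```python
# Schema names as listed in this section.
SCHEMAS = {
    "snowflake": "DWV",
    "databricks": "census_lehd_lodes_dwv",
}


def qualified_table(platform: str, table: str) -> str:
    """Return a schema-qualified table reference for the given platform."""
    key = platform.lower()
    if key not in SCHEMAS:
        raise ValueError(f"unknown platform: {platform!r}")
    # Match the casing used in the examples above.
    name = table.upper() if key == "snowflake" else table.lower()
    return f"{SCHEMAS[key]}.{name}"


query = f"SELECT COUNT(*) FROM {qualified_table('snowflake', 'od')}"
print(query)
```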
Discover Available Data
Start with the FEEDS table to see what's available, and FEEDS_FILES to understand data freshness.
Working with Data Lineage
Every data row links to FEEDS_FILES via feeds_files_id, which tells you exactly which batch loaded that data. Use this to filter to the current data version or trace any row back to its source load.
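As a sketch of the lineage filter, the example below joins WAC rows to FEEDS_FILES on feeds_files_id and keeps only rows from the batch flagged is_latest. It uses an in-memory SQLite stand-in with made-up rows. C000 is the total-jobs column in the public LODES layout and is assumed to carry over here, and the idea that a superseded batch can sit next to the current one is an assumption made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE feeds_files (feeds_files_id INTEGER, feed_id INTEGER, is_latest INTEGER);
CREATE TABLE wac (w_geocode TEXT, year INTEGER, C000 INTEGER, feeds_files_id INTEGER);
-- Made-up rows: batch 1 is superseded, batch 2 is current.
INSERT INTO feeds_files VALUES (1, 10, 0), (2, 10, 1);
INSERT INTO wac VALUES
    ('060014001001000', 2022, 40, 1),
    ('060014001001000', 2022, 42, 2),
    ('060014001001001', 2022,  7, 2);
""")

# Keep only rows loaded by the batch currently flagged as latest.
CURRENT_ROWS_SQL = """
SELECT w.w_geocode, w.year, w.C000
FROM wac AS w
JOIN feeds_files AS f
  ON f.feeds_files_id = w.feeds_files_id
WHERE f.is_latest
ORDER BY w.w_geocode
"""

current_rows = conn.execute(CURRENT_ROWS_SQL).fetchall()
print(current_rows)
```

Dropping the WHERE clause and selecting f.feeds_files_id instead gives the reverse direction: which batch loaded a given row.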
Top Employment Counties
Aggregate workplace block data to county level to find the highest employment centers.
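One way to write this rollup is sketched below on an in-memory SQLite stand-in with made-up rows. The join key (w_geocode = tabblk2020) comes from this document. The column names C000 (total jobs), cty, and ctyname follow the public LODES file layout and are assumed to be preserved in this product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE xwalk (tabblk2020 TEXT PRIMARY KEY, cty TEXT, ctyname TEXT);
CREATE TABLE wac (w_geocode TEXT, year INTEGER, C000 INTEGER);
-- Made-up rows for illustration only.
INSERT INTO xwalk VALUES
    ('060014001001000', '06001', 'Alameda County, CA'),
    ('060014001001001', '06001', 'Alameda County, CA'),
    ('060371011101000', '06037', 'Los Angeles County, CA');
INSERT INTO wac VALUES
    ('060014001001000', 2022, 120),
    ('060014001001001', 2022,  30),
    ('060371011101000', 2022, 500),
    ('060371011101000', 2021, 480);
""")

# Roll workplace blocks up to counties for one year and rank by total jobs.
TOP_COUNTIES_SQL = """
SELECT x.cty, x.ctyname, SUM(w.C000) AS total_jobs
FROM wac AS w
JOIN xwalk AS x
  ON x.tabblk2020 = w.w_geocode
WHERE w.year = ?
GROUP BY x.cty, x.ctyname
ORDER BY total_jobs DESC
LIMIT 10
"""

top_counties = conn.execute(TOP_COUNTIES_SQL, (2022,)).fetchall()
print(top_counties)
```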
Commute Flows Between Counties
Find the largest commute flows between counties using OD + XWALK.
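The sketch below joins OD to XWALK twice, once for the home block and once for the work block, then sums flows per county pair. It runs on an in-memory SQLite stand-in with made-up rows. S000 (total jobs), cty, and ctyname follow the public LODES layout and are assumed to be preserved here. Both main and aux rows are kept, since cross-state flows live in the aux part.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE xwalk (tabblk2020 TEXT PRIMARY KEY, cty TEXT, ctyname TEXT);
CREATE TABLE od (w_geocode TEXT, h_geocode TEXT, year INTEGER, part TEXT, S000 INTEGER);
-- Made-up rows for illustration only.
INSERT INTO xwalk VALUES
    ('060014001001000', '06001', 'Alameda County, CA'),
    ('060014001001001', '06001', 'Alameda County, CA'),
    ('060750101001000', '06075', 'San Francisco County, CA');
INSERT INTO od VALUES
    ('060750101001000', '060014001001000', 2022, 'main', 25),
    ('060750101001000', '060014001001001', 2022, 'main', 10),
    ('060014001001000', '060750101001000', 2022, 'main',  5),
    ('060014001001000', '060014001001001', 2022, 'main',  8);
""")

# h = crosswalk row for the home block, w = crosswalk row for the work block.
COUNTY_FLOWS_SQL = """
SELECT h.ctyname AS home_county,
       w.ctyname AS work_county,
       SUM(o.S000) AS commuters
FROM od AS o
JOIN xwalk AS h ON h.tabblk2020 = o.h_geocode
JOIN xwalk AS w ON w.tabblk2020 = o.w_geocode
WHERE o.year = ?
  AND h.cty <> w.cty          -- drop within-county commutes
GROUP BY h.cty, h.ctyname, w.cty, w.ctyname
ORDER BY commuters DESC
LIMIT 10
"""

county_flows = conn.execute(COUNTY_FLOWS_SQL, (2022,)).fetchall()
print(county_flows)
```

On the full 2.6-billion-row table, filtering to a single year and, where possible, a single state before the joins is likely to matter for cost.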
Workforce Demographics by Metro
Analyze workforce age, earnings, and industry composition at the metropolitan area level.
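A metro-level profile could be computed as in the sketch below, which uses an in-memory SQLite stand-in with made-up rows. The column names follow the public LODES layout and are assumed to be preserved here: C000 is total jobs, CA01 is jobs held by workers age 29 or younger, and CE03 is the highest earnings bracket. The cbsa and cbsaname columns come from the crosswalk. Industry columns (CNS01-CNS20) can be summed the same way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE xwalk (tabblk2020 TEXT PRIMARY KEY, cbsa TEXT, cbsaname TEXT);
CREATE TABLE wac (w_geocode TEXT, year INTEGER, C000 INTEGER, CA01 INTEGER, CE03 INTEGER);
-- Made-up rows for illustration only.
INSERT INTO xwalk VALUES
    ('060014001001000', '41860', 'San Francisco-Oakland-Berkeley, CA'),
    ('060750101001000', '41860', 'San Francisco-Oakland-Berkeley, CA'),
    ('060371011101000', '31080', 'Los Angeles-Long Beach-Anaheim, CA');
INSERT INTO wac VALUES
    ('060014001001000', 2022, 100,  20,  40),
    ('060750101001000', 2022, 100,  30,  60),
    ('060371011101000', 2022, 400, 100, 300);
""")

# Shares are computed from metro-level sums, not averaged across blocks.
METRO_PROFILE_SQL = """
SELECT x.cbsa,
       x.cbsaname,
       SUM(w.C000) AS total_jobs,
       1.0 * SUM(w.CA01) / NULLIF(SUM(w.C000), 0) AS share_age_29_or_younger,
       1.0 * SUM(w.CE03) / NULLIF(SUM(w.C000), 0) AS share_top_earnings
FROM wac AS w
JOIN xwalk AS x
  ON x.tabblk2020 = w.w_geocode
WHERE w.year = ?
GROUP BY x.cbsa, x.cbsaname
ORDER BY total_jobs DESC
"""

metro_profile = conn.execute(METRO_PROFILE_SQL, (2022,)).fetchall()
for row in metro_profile:
    print(row)
```

Swapping wac and w_geocode for rac and h_geocode gives the same profile by where workers live rather than where they work.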
Employment Trend Over Time
Track total job counts by county across years to identify growth and decline.
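A trend query can group by county and year, with the change from the previous available year computed afterwards. The sketch below does the grouping in SQL on an in-memory SQLite stand-in with made-up rows and derives the change in Python. On Snowflake or Databricks a LAG window function would be a natural alternative. C000, cty, and ctyname follow the public LODES layout and are assumed to be preserved here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE xwalk (tabblk2020 TEXT PRIMARY KEY, cty TEXT, ctyname TEXT);
CREATE TABLE wac (w_geocode TEXT, year INTEGER, C000 INTEGER);
-- Made-up rows for illustration only.
INSERT INTO xwalk VALUES
    ('060014001001000', '06001', 'Alameda County, CA'),
    ('060014001001001', '06001', 'Alameda County, CA'),
    ('060371011101000', '06037', 'Los Angeles County, CA');
INSERT INTO wac VALUES
    ('060014001001000', 2021, 100), ('060014001001001', 2021,  40),
    ('060371011101000', 2021, 500),
    ('060014001001000', 2022, 110), ('060014001001001', 2022,  45),
    ('060371011101000', 2022, 480);
""")

TREND_SQL = """
SELECT x.cty, x.ctyname, w.year, SUM(w.C000) AS total_jobs
FROM wac AS w
JOIN xwalk AS x
  ON x.tabblk2020 = w.w_geocode
GROUP BY x.cty, x.ctyname, w.year
ORDER BY x.cty, w.year
"""

# Rows arrive sorted by county then year, so the previous row for the same
# county is the previous available year.
trend = []
previous_jobs = {}
for cty, name, year, jobs in conn.execute(TREND_SQL):
    prior = previous_jobs.get(cty)
    change = None if prior is None else jobs - prior
    trend.append((name, year, jobs, change))
    previous_jobs[cty] = jobs

for row in trend:
    print(row)
```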
Tracking Data Changes Over Time
FEEDS_FILES records every batch load with row_count_delta showing what changed. Use this to monitor source data updates.
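A monitoring query along these lines is sketched below on an in-memory SQLite stand-in with made-up rows. It lists only the batches that changed the row count. The columns feed_id, feeds_files_id, is_latest, and row_count_delta are named in this document; treating a higher feeds_files_id as a later load is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE feeds_files (
    feeds_files_id  INTEGER PRIMARY KEY,
    feed_id         INTEGER,
    is_latest       INTEGER,
    row_count_delta INTEGER
);
-- Made-up rows for illustration only.
INSERT INTO feeds_files VALUES
    (1, 10, 0, 1000),
    (2, 10, 0,    0),
    (3, 10, 1,   25),
    (4, 20, 1,   -3);
""")

# Batches that added or removed rows, in assumed load order per feed.
CHANGES_SQL = """
SELECT feed_id, feeds_files_id, row_count_delta
FROM feeds_files
WHERE row_count_delta <> 0
ORDER BY feed_id, feeds_files_id
"""

changes = conn.execute(CHANGES_SQL).fetchall()
for feed_id, batch_id, delta in changes:
    print(feed_id, batch_id, delta)
```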