Census LEHD LODES Employment Dataset

About the Dataset

Census block-level employment data from the U.S. Census Bureau's Longitudinal Employer-Household Dynamics (LEHD) program. Includes origin-destination commute flows, workforce demographics, and job characteristics, with a geographic crosswalk for county, metro, and state aggregation. 22 years of annual data (2002-2023) covering all 50 states, DC, and Puerto Rico.

75,000 files condensed into one SQL query.

Quick Access

  • Tables: OD, RAC, WAC, XWALK + 4 metadata tables

  • Sources: 4 Census LEHD data sources

  • Coverage: All US Census blocks (~8 million), 22 years (2002-2023)

  • Update Frequency: Annually (1-2 year lag from reference year to Census release)

Overview

The Census LEHD LODES dataset provides comprehensive access to employment geography data including:

  • Origin-Destination (OD) - Block-to-block commute flows: where workers live and where they work. 2.6 billion rows across 22 years.

  • Residence Area Characteristics (RAC) - Jobs by where workers live with 42 demographic columns (age, earnings, 20 NAICS sectors, race, ethnicity, education, sex). 119 million rows.

  • Workplace Area Characteristics (WAC) - Jobs by where people work with 52 columns including firm age and firm size (WAC-only). 48.5 million rows.

  • Geographic Crosswalk (XWALK) - Maps every 2020 Census block to tracts, counties, metros, congressional districts, ZCTAs, and coordinates. 8.2 million rows. Always fully available (no trial limit).

Metadata Tables

Every Dataplex data product includes these standard metadata tables:

| Table | Purpose |
| --- | --- |
| FEEDS | Dataset catalog: available tables, descriptions, update dates |
| FEEDS_FILES | Batch load history with is_latest flag for data freshness |
| CHANGELOG | Change log: data loads, schema changes, corrections |
| DATA_DICTIONARY | Column descriptions for all tables |

Entity Relationship Diagram

Figure: Census LEHD LODES Entity Relationship Diagram

Join pattern: OD, RAC, and WAC all join to XWALK via geocode columns (w_geocode or h_geocode = tabblk2020) for geographic aggregation from blocks to counties, metros, and states. All data tables link to FEEDS and FEEDS_FILES via feed_id and feeds_files_id for data lineage.

Data Tables

OD (Origin-Destination)

Block-to-block commute flows: where workers live and where they work. Each row represents a unique home-block to work-block pair for a given year. JOIN to XWALK on w_geocode or h_geocode to aggregate to county, metro, or state level.

Key Features:

  • 2.6 billion rows across 22 years (2002-2023)

  • Census block-level granularity (15-digit FIPS codes)

  • Job counts segmented by age (3), earnings (3), and industry (3)

  • Main files (residence and workplace in the same state) and aux files (workplace in the state, residence in another state)

Primary Key: w_geocode + h_geocode + year + part

See Schema Reference for all 19 columns.

RAC (Residence Area Characteristics)

Jobs by where workers live. Each row is a Census block where workers reside, with 42 demographic breakdown columns. JOIN to XWALK on h_geocode to aggregate.

Key Features:

  • 119 million rows across 22 years

  • 20 NAICS industry sectors (CNS01-CNS20)

  • 7 race categories, 2 ethnicity groups, 4 education levels, 2 sex categories

  • 3 age segments and 3 earnings brackets

Primary Key: h_geocode + year

See Schema Reference for all 48 columns.

WAC (Workplace Area Characteristics)

Jobs by where people work. Each row is a Census block where jobs are located. WAC has the same demographic columns as RAC, plus firm age (CFA) and firm size (CFS) columns that are not available in RAC.

Key Features:

  • 48.5 million rows across 22 years

  • All RAC columns plus 5 firm age groups (CFA01-CFA05) and 5 firm size groups (CFS01-CFS05)

  • WAC is protected by noise infusion rather than synthetic data generation (which RAC and OD use), so its counts are reliable at 10+ jobs per block

Primary Key: w_geocode + year

See Schema Reference for all 58 columns.

XWALK (Geographic Crosswalk)

Reference table mapping every 2020 Census block to higher geographies. This is the aggregation enabler — JOIN OD/RAC/WAC to XWALK to roll up block-level data to counties, metros, states, congressional districts, or ZCTAs.

Key Features:

  • 8.2 million rows (one per 2020 Census block)

  • Maps to: state, county, tract, block group, CBSA/metro, ZCTA, congressional district, place, school district, and more

  • Includes block centroid coordinates (latitude/longitude)

  • Always fully available — no trial limit (essential for any analysis)

Primary Key: tabblk2020

See Schema Reference for all 45 columns.

Data Quality

Data Generation Methods

| Table | Method | Reliability |
| --- | --- | --- |
| OD | Full synthetic data generation | Statistically representative; reliable at county level and above |
| RAC | Full synthetic data generation | Statistically representative; reliable at county level and above |
| WAC | Multiplicative noise infusion | Reliable at 10+ jobs per block |
| XWALK | Exact Census geography | Exact |

Standardization

  • All geocode columns zero-padded to 15 digits (preserved as strings, not integers)

  • All geographic codes preserve leading zeros (state, county, tract, CBSA)

  • Year extracted from filenames and added as a typed integer column

  • All job count columns cast to NUMBER with TRY_TO_NUMBER for safe handling
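Because geocodes are fixed-width strings, higher-level FIPS codes can be read directly off the front of a block code: a 15-digit Census block GEOID is 2 state digits, 3 county digits, 6 tract digits, and 4 block digits (the first of which is the block group). The sketch below shows this in Python and also why the string type matters. The helper name `split_geocode` is hypothetical, not part of the dataset.

```python
def split_geocode(geocode: str) -> dict:
    """Split a 15-digit Census block GEOID into its nested FIPS prefixes."""
    if len(geocode) != 15 or not geocode.isdigit():
        raise ValueError(f"expected a 15-digit string, got {geocode!r}")
    return {
        "state": geocode[:2],         # 2-digit state FIPS
        "county": geocode[:5],        # state + 3-digit county
        "tract": geocode[:11],        # county + 6-digit tract
        "block_group": geocode[:12],  # tract + first block digit
        "block": geocode,
    }


# Reading a geocode as an integer drops the leading zero (here, Alabama's "01"),
# so it must be re-padded to 15 digits before it can be used as a join key.
as_int = int("010010201001000")
restored = str(as_int).zfill(15)
parts = split_geocode(restored)
print(parts["state"], parts["county"], parts["tract"])
```

If a downstream tool has already converted geocodes to integers, re-padding as shown restores a key that matches tabblk2020.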

Data Freshness

Check when data was last updated:
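A minimal sketch of such a freshness check follows. It runs the SQL against an in-memory SQLite stand-in so that it is self-contained; on Snowflake the same SELECT would be issued against DWV.FEEDS_FILES directly. The columns feeds_files_id, feed_id, and is_latest come from this page, while `loaded_at` is an assumed name for the load timestamp and the rows are made-up samples.

```python
import sqlite3

# In-memory stand-in for the shared schema; rows are made-up samples.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS DWV")  # mimic the Snowflake schema name
conn.execute("""
    CREATE TABLE DWV.FEEDS_FILES (
        feeds_files_id INTEGER,
        feed_id        INTEGER,
        is_latest      INTEGER,  -- likely a BOOLEAN on Snowflake or Databricks
        loaded_at      TEXT      -- assumed column name for the load timestamp
    )""")
conn.executemany(
    "INSERT INTO DWV.FEEDS_FILES VALUES (?, ?, ?, ?)",
    [(1, 100, 0, "2024-09-01"), (2, 100, 1, "2025-09-01"), (3, 200, 1, "2025-09-02")],
)

FRESHNESS_SQL = """
    SELECT feed_id, feeds_files_id, loaded_at
    FROM DWV.FEEDS_FILES
    WHERE is_latest = 1
    ORDER BY feed_id
"""
freshness = conn.execute(FRESHNESS_SQL).fetchall()
for row in freshness:
    print(row)
```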

Getting Started

Platform Schema Reference

This dataset is available on both Snowflake and Databricks. Queries use schema-only references — the database is already set by the share or catalog context:

| Platform | Schema | Example |
| --- | --- | --- |
| Snowflake | DWV | DWV.OD |
| Databricks | census_lehd_lodes_dwv | census_lehd_lodes_dwv.od |

Discover Available Data

Start with the FEEDS table to see what's available, and FEEDS_FILES to understand data freshness.
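One way to sketch this discovery step is to join FEEDS to FEEDS_FILES and keep only the latest batch per feed. The example runs against an in-memory SQLite stand-in with made-up rows. The join key feed_id and the is_latest flag are named on this page, while `table_name`, `description`, and `row_count` are assumed column names; check DATA_DICTIONARY for the real ones.

```python
import sqlite3

# In-memory stand-in for the shared schema; rows are made-up samples.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS DWV")
conn.executescript("""
    -- table_name, description, and row_count are assumed column names
    CREATE TABLE DWV.FEEDS (feed_id INTEGER, table_name TEXT, description TEXT);
    CREATE TABLE DWV.FEEDS_FILES (feeds_files_id INTEGER, feed_id INTEGER,
                                  is_latest INTEGER, row_count INTEGER);
    INSERT INTO DWV.FEEDS VALUES
        (1, 'OD',  'Origin-destination commute flows'),
        (2, 'WAC', 'Workplace area characteristics');
    INSERT INTO DWV.FEEDS_FILES VALUES
        (10, 1, 0, 900), (11, 1, 1, 1000), (12, 2, 1, 50);
""")

DISCOVER_SQL = """
    SELECT f.table_name, f.description, ff.feeds_files_id, ff.row_count
    FROM DWV.FEEDS AS f
    JOIN DWV.FEEDS_FILES AS ff ON ff.feed_id = f.feed_id
    WHERE ff.is_latest = 1          -- keep only the current batch per feed
    ORDER BY f.table_name
"""
available = conn.execute(DISCOVER_SQL).fetchall()
for row in available:
    print(row)
```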

Working with Data Lineage

Every data row links to FEEDS_FILES via feeds_files_id, which tells you exactly which batch loaded that data. Use this to filter to the current data version or trace any row back to its source load.
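The lineage filter can be sketched as a join from a data table to FEEDS_FILES on feeds_files_id, keeping rows whose batch is flagged is_latest. The example below uses an in-memory SQLite stand-in with made-up rows, and it assumes that WAC keeps the standard LODES total-jobs column `C000` and that superseded batches remain queryable alongside the current one.

```python
import sqlite3

# In-memory stand-in for the shared schema; rows are made-up samples.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS DWV")
conn.executescript("""
    -- C000 (total jobs) is the standard LODES column name, assumed to be kept here
    CREATE TABLE DWV.WAC (w_geocode TEXT, year INTEGER, C000 INTEGER,
                          feeds_files_id INTEGER);
    CREATE TABLE DWV.FEEDS_FILES (feeds_files_id INTEGER, feed_id INTEGER,
                                  is_latest INTEGER);
    INSERT INTO DWV.FEEDS_FILES VALUES (10, 2, 0), (11, 2, 1);
    -- the same block and year loaded twice: an older batch and the current one
    INSERT INTO DWV.WAC VALUES
        ('010010201001000', 2022, 40, 10),
        ('010010201001000', 2022, 42, 11);
""")

CURRENT_ROWS_SQL = """
    SELECT w.w_geocode, w.year, w.C000, w.feeds_files_id
    FROM DWV.WAC AS w
    JOIN DWV.FEEDS_FILES AS ff ON ff.feeds_files_id = w.feeds_files_id
    WHERE ff.is_latest = 1
"""
current_rows = conn.execute(CURRENT_ROWS_SQL).fetchall()
print(current_rows)
```

Dropping the is_latest predicate and selecting the FEEDS_FILES columns instead traces each row back to the batch that loaded it.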

Top Employment Counties

Aggregate workplace block data to county level to find the highest employment centers.
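A hedged sketch of this roll-up joins WAC to XWALK on w_geocode = tabblk2020 and sums jobs per county. It runs against an in-memory SQLite stand-in with made-up job counts. The column names `C000` (total jobs), `cty`, and `ctyname` follow the public LODES file layout and are assumed to be unchanged in this product.

```python
import sqlite3

# In-memory stand-in for the shared schema; job counts are made-up samples.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS DWV")
conn.executescript("""
    -- C000, cty, and ctyname follow the public LODES layout (assumed here)
    CREATE TABLE DWV.XWALK (tabblk2020 TEXT, cty TEXT, ctyname TEXT);
    CREATE TABLE DWV.WAC (w_geocode TEXT, year INTEGER, C000 INTEGER);
    INSERT INTO DWV.XWALK VALUES
        ('060371011101000', '06037', 'Los Angeles County, CA'),
        ('060371011101001', '06037', 'Los Angeles County, CA'),
        ('170310101001000', '17031', 'Cook County, IL');
    INSERT INTO DWV.WAC VALUES
        ('060371011101000', 2022, 120),
        ('060371011101001', 2022, 80),
        ('170310101001000', 2022, 150),
        ('170310101001000', 2021, 999);
""")

TOP_COUNTIES_SQL = """
    SELECT x.cty, x.ctyname, SUM(w.C000) AS total_jobs
    FROM DWV.WAC AS w
    JOIN DWV.XWALK AS x ON w.w_geocode = x.tabblk2020
    WHERE w.year = 2022
    GROUP BY x.cty, x.ctyname
    ORDER BY total_jobs DESC
    LIMIT 10
"""
top_counties = conn.execute(TOP_COUNTIES_SQL).fetchall()
for row in top_counties:
    print(row)
```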

Commute Flows Between Counties

Find the largest commute flows between counties using OD + XWALK.
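County-to-county flows need XWALK twice: once for the home block and once for the work block. The sketch below shows that double join against an in-memory SQLite stand-in with made-up flows. `S000` (total jobs in OD), `cty`, and `ctyname` follow the public LODES layout and are assumptions about this product's schema, and the aliases `hx` and `wx` are illustrative.

```python
import sqlite3

# In-memory stand-in for the shared schema; flows are made-up samples.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS DWV")
conn.executescript("""
    -- S000, cty, and ctyname follow the public LODES layout (assumed here)
    CREATE TABLE DWV.XWALK (tabblk2020 TEXT, cty TEXT, ctyname TEXT);
    CREATE TABLE DWV.OD (w_geocode TEXT, h_geocode TEXT, year INTEGER, S000 INTEGER);
    INSERT INTO DWV.XWALK VALUES
        ('060371011101000', '06037', 'Los Angeles County, CA'),
        ('060371011101001', '06037', 'Los Angeles County, CA'),
        ('060590011011000', '06059', 'Orange County, CA');
    INSERT INTO DWV.OD VALUES
        ('060371011101000', '060590011011000', 2022, 30),
        ('060371011101001', '060590011011000', 2022, 20),
        ('060590011011000', '060371011101000', 2022, 10),
        ('060371011101000', '060371011101001', 2022, 500);  -- within-county flow
""")

COMMUTE_SQL = """
    SELECT hx.ctyname AS home_county, wx.ctyname AS work_county,
           SUM(od.S000) AS commuters
    FROM DWV.OD AS od
    JOIN DWV.XWALK AS hx ON od.h_geocode = hx.tabblk2020  -- where workers live
    JOIN DWV.XWALK AS wx ON od.w_geocode = wx.tabblk2020  -- where they work
    WHERE od.year = 2022
      AND hx.cty <> wx.cty                                -- cross-county flows only
    GROUP BY hx.cty, hx.ctyname, wx.cty, wx.ctyname
    ORDER BY commuters DESC
    LIMIT 10
"""
flows = conn.execute(COMMUTE_SQL).fetchall()
for row in flows:
    print(row)
```

Removing the `hx.cty <> wx.cty` predicate would also count workers who live and work in the same county.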

Workforce Demographics by Metro

Analyze workforce age, earnings, and industry composition at the metropolitan area level.
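As a sketch, the metro profile below aggregates WAC to CBSA level and converts a few job counts to shares of total jobs. It runs on an in-memory SQLite stand-in with made-up counts. The column meanings follow the public LODES layout and are assumed to carry over: `C000` total jobs, `CA01` workers aged 29 or younger, `CE03` the highest earnings bracket, `CNS05` manufacturing, plus `cbsa` and `cbsaname` in XWALK.

```python
import sqlite3

# In-memory stand-in for the shared schema; counts are made-up samples.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS DWV")
conn.executescript("""
    -- column names follow the public LODES layout (assumed here)
    CREATE TABLE DWV.XWALK (tabblk2020 TEXT, cbsa TEXT, cbsaname TEXT);
    CREATE TABLE DWV.WAC (w_geocode TEXT, year INTEGER, C000 INTEGER,
                          CA01 INTEGER, CE03 INTEGER, CNS05 INTEGER);
    INSERT INTO DWV.XWALK VALUES
        ('060371011101000', '31080', 'Los Angeles-Long Beach-Anaheim, CA'),
        ('060371011101001', '31080', 'Los Angeles-Long Beach-Anaheim, CA'),
        ('170310101001000', '16980', 'Chicago-Naperville-Elgin, IL-IN-WI');
    INSERT INTO DWV.WAC VALUES
        ('060371011101000', 2022, 100, 20, 50, 10),
        ('060371011101001', 2022, 100, 30, 30, 30),
        ('170310101001000', 2022,  50, 10, 25,  5);
""")

METRO_SQL = """
    SELECT x.cbsa, x.cbsaname,
           SUM(w.C000) AS total_jobs,
           ROUND(100.0 * SUM(w.CA01)  / SUM(w.C000), 1) AS pct_age_29_or_younger,
           ROUND(100.0 * SUM(w.CE03)  / SUM(w.C000), 1) AS pct_top_earnings,
           ROUND(100.0 * SUM(w.CNS05) / SUM(w.C000), 1) AS pct_manufacturing
    FROM DWV.WAC AS w
    JOIN DWV.XWALK AS x ON w.w_geocode = x.tabblk2020
    WHERE w.year = 2022
    GROUP BY x.cbsa, x.cbsaname
    ORDER BY total_jobs DESC
"""
metros = conn.execute(METRO_SQL).fetchall()
for row in metros:
    print(row)
```

To profile metros by where workers live rather than where they work, the same query shape should apply to RAC joined on h_geocode.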

Employment Trend Over Time

Track total job counts by county across years to identify growth and decline.
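A trend query can be sketched as a county-by-year aggregate followed by a LAG window function that compares each year with the previous one. The example uses an in-memory SQLite stand-in (window functions need SQLite 3.25 or later) with made-up counts. `C000`, `cty`, and `ctyname` are assumed to match the public LODES layout.

```python
import sqlite3

# In-memory stand-in for the shared schema; counts are made-up samples.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS DWV")
conn.executescript("""
    -- C000, cty, and ctyname follow the public LODES layout (assumed here)
    CREATE TABLE DWV.XWALK (tabblk2020 TEXT, cty TEXT, ctyname TEXT);
    CREATE TABLE DWV.WAC (w_geocode TEXT, year INTEGER, C000 INTEGER);
    INSERT INTO DWV.XWALK VALUES ('170310101001000', '17031', 'Cook County, IL');
    INSERT INTO DWV.WAC VALUES
        ('170310101001000', 2021, 100),
        ('170310101001000', 2022, 110),
        ('170310101001000', 2023, 105);
""")

TREND_SQL = """
    WITH county_year AS (
        SELECT x.cty, x.ctyname, w.year, SUM(w.C000) AS total_jobs
        FROM DWV.WAC AS w
        JOIN DWV.XWALK AS x ON w.w_geocode = x.tabblk2020
        GROUP BY x.cty, x.ctyname, w.year
    )
    SELECT cty, ctyname, year, total_jobs,
           total_jobs - LAG(total_jobs) OVER (PARTITION BY cty ORDER BY year)
               AS change_vs_prior_year   -- NULL for each county's first year
    FROM county_year
    ORDER BY cty, year
"""
trend = conn.execute(TREND_SQL).fetchall()
for row in trend:
    print(row)
```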

Tracking Data Changes Over Time

FEEDS_FILES records every batch load with row_count_delta showing what changed. Use this to monitor source data updates.
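A monitoring query along these lines might list the batches for one feed that actually changed the row count, newest first. The sketch runs on an in-memory SQLite stand-in with made-up load history. row_count_delta, feed_id, and feeds_files_id are named on this page, while `loaded_at` is an assumed name for the load timestamp.

```python
import sqlite3

# In-memory stand-in for the shared schema; load history is made up.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS DWV")
conn.execute("""
    CREATE TABLE DWV.FEEDS_FILES (
        feeds_files_id  INTEGER,
        feed_id         INTEGER,
        loaded_at       TEXT,     -- assumed column name for the load timestamp
        row_count_delta INTEGER
    )""")
conn.executemany(
    "INSERT INTO DWV.FEEDS_FILES VALUES (?, ?, ?, ?)",
    [
        (1, 100, "2023-09-01", 1000),  # initial load
        (2, 100, "2024-09-01", 0),     # reload with no net change
        (3, 100, "2025-09-01", 50),    # new reference year added
    ],
)

CHANGES_SQL = """
    SELECT feeds_files_id, loaded_at, row_count_delta
    FROM DWV.FEEDS_FILES
    WHERE feed_id = 100
      AND row_count_delta <> 0      -- skip batches that changed nothing
    ORDER BY loaded_at DESC
"""
changes = conn.execute(CHANGES_SQL).fetchall()
for row in changes:
    print(row)
```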
