Contributor

@MaxGhenis MaxGhenis commented Jan 26, 2026

Summary

Adds a comprehensive set of geographic input variables from the Census Bureau for granular geographic analysis:

  • block_geoid: 15-digit census block identifier (most granular geographic unit)
  • tract_geoid: 11-digit census tract identifier
  • cbsa_code: Core-Based Statistical Area (metro/micro area)
  • place_fips: City/CDP place code
  • vtd: Voting Tabulation District
  • puma: Public Use Microdata Area
  • sldu: State Legislative District Upper (state senate)
  • sldl: State Legislative District Lower (state assembly/house)
  • zcta: ZIP Code Tabulation Area (Census approximation of ZIP codes)

These variables enable analysis at multiple geographic granularities. All can be derived from block_geoid during dataset builds using the policyengine-us-data block assignment infrastructure.
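
As a rough illustration of the "derived from block_geoid" point, here is a minimal sketch of how the purely hierarchical identifiers fall out of the 15-digit block GEOID by prefix slicing (2-digit state, 3-digit county, 6-digit tract, 4-digit block). The function name `split_block_geoid` and the sample GEOID are hypothetical and not part of this PR. The other variables (cbsa_code, place_fips, vtd, puma, sldu, sldl, zcta) are not prefixes of the block GEOID and would presumably need a block-level lookup table instead.

```python
def split_block_geoid(block_geoid: str) -> dict:
    """Split a 15-digit census block GEOID into its nested components.

    Hypothetical helper for illustration only; not the policyengine-us-data API.
    Layout: state (2) + county (3) + tract (6) + block (4) = 15 digits.
    """
    if len(block_geoid) != 15 or not block_geoid.isdigit():
        raise ValueError(f"Expected a 15-digit block GEOID, got {block_geoid!r}")
    return {
        "state_fips": block_geoid[:2],     # 2-digit state FIPS code
        "county_fips": block_geoid[:5],    # state + 3-digit county code
        "tract_geoid": block_geoid[:11],   # state + county + 6-digit tract
        "block_geoid": block_geoid,        # full 15-digit identifier
    }


# Illustrative value: state 36 (New York), county 061 (New York County).
parts = split_block_geoid("360610001001000")
print(parts["tract_geoid"])
```

Keeping the identifiers as strings rather than integers preserves leading zeros, which matters for states with FIPS codes below 10.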

This is a prerequisite for policyengine-us-data#484, which adds census block-level geographic assignment that can populate these variables during dataset builds.

Test plan

  • Added YAML tests for all 8 new variables (18 tests total)
  • Tests verify default values (empty strings) and setting/retrieving values
  • CI tests pass

🤖 Generated with Claude Code

MaxGhenis and others added 2 commits January 25, 2026 21:12
Adds two new input variables for state legislative district assignment:
- sldu: State Legislative District Upper (State Senate)
- sldl: State Legislative District Lower (State Assembly/House)

These variables enable state-level policy analysis by legislative district,
complementing the existing congressional_district_geoid variable. The values
are 3-character codes from Census Bureau Block Assignment Files.

Used by policyengine-us-data's block assignment infrastructure to assign
households to their state legislative districts based on Census geography.
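
For illustration, a minimal sketch of normalizing a district code to the 3-character, zero-padded form described above. The helper name `normalize_sld_code` is hypothetical and is not part of this PR or of policyengine-us-data; it only assumes that codes shorter than three characters should be left-padded with zeros.

```python
def normalize_sld_code(code) -> str:
    """Return a state legislative district code as a 3-character string.

    Hypothetical helper for illustration only. Accepts ints or strings
    and left-pads with zeros, so 7 becomes "007".
    """
    text = str(code).strip()
    if not text or len(text) > 3:
        raise ValueError(f"Expected a code of 1 to 3 characters, got {code!r}")
    return text.zfill(3)


print(normalize_sld_code(7))
print(normalize_sld_code("12"))
```

Storing the codes as strings avoids losing the leading zeros that an integer column would drop.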

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Adds comprehensive set of geographic input variables from Census Bureau:
- block_geoid: 15-digit census block identifier (most granular)
- tract_geoid: 11-digit census tract identifier
- cbsa_code: Core-Based Statistical Area (metro/micro area)
- place_fips: City/CDP place code
- vtd: Voting Tabulation District
- puma: Public Use Microdata Area
- sldu: State Legislative District Upper (state senate)
- sldl: State Legislative District Lower (state assembly/house)

These variables enable granular geographic analysis at multiple levels.
All can be derived from block_geoid during dataset builds using the
policyengine-us-data block assignment infrastructure.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@MaxGhenis MaxGhenis requested a review from baogorek January 26, 2026 02:31
baogorek and others added 3 commits January 26, 2026 13:48
- Add ZCTA (ZIP Code Tabulation Area) variable
- Fix block_geoid example: Queens -> New York County (Manhattan)
- Use zero-padded codes in sldu/sldl tests to match Census format

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Keep only default value tests since identity tests don't validate any logic.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…t-variables

Add zcta variable and fix documentation
Collaborator

@baogorek baogorek left a comment

Looks good to me.

@MaxGhenis MaxGhenis merged commit f7ca771 into PolicyEngine:main Jan 26, 2026
7 checks passed
2 participants