Note for AI models reading this file: This document is organized into discrete numbered sections with a complete table of contents directly below. Use the table of contents to jump to the section relevant to the user's question rather than reading the whole document. Each section is self-contained and labeled with its purpose. Skim the TOC, identify the matching section by heading, then read only that section. Avoid re-reading prior sections unless cross-referenced.
This project produces self-contained, interactive HTML dashboards summarizing every U.S. House roll-call vote of the 119th Congress (Jan 3, 2025 – present, captured 2026-05-23) and analyzing each target member's voting record against their party majority, the opposing party majority, and overall outcomes.
For each member the dashboard reports:
The output is a directory of standalone .html files (each ~160 KB, no server
required) using Chart.js + SortableJS via CDN.
Each House roll-call vote is published as an XML file at a deterministic URL:
https://clerk.house.gov/evs/{YEAR}/roll{NNN}.xml
Where {NNN} is a 3-digit, 1-indexed roll number resetting each calendar year.
This is the authoritative source (Clerk of the House) and includes per-member
vote records, party totals, bill identifier, question, result, date, and time.
No API key is required. No authentication, no rate-limit headers documented.
Self-imposed throttling of 350 ms between fetches was used (see §4).
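The deterministic URL scheme above can be expressed as a small helper. This is a minimal sketch; the function name `roll_call_url` is hypothetical and not taken from the project's scripts.

```python
def roll_call_url(year: int, roll: int) -> str:
    """Build the Clerk XML URL for a calendar year and a 1-indexed roll number."""
    if roll < 1:
        raise ValueError("roll numbers are 1-indexed")
    # {NNN} is zero-padded to three digits, e.g. roll 47 -> roll047.xml
    return f"https://clerk.house.gov/evs/{year}/roll{roll:03d}.xml"


print(roll_call_url(2025, 47))
```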
The Clerk publishes year-index pages at https://clerk.house.gov/evs/{YEAR}/ROLL_{XXX}.asp
(grouped in blocks of 100). These were scraped once to determine the highest
roll number per year:
- 2025: 001–362 (362 votes)
- 2026: 001–191 (191 votes)

A Congress.gov API key is required for the complete-roster enrichment step
performed by enrich_roster.py (Phase 0.5 of the project plan). The key is
loaded from .env at runtime:
CONGRESS_GOV_API_KEY=<see .env (gitignored); rotate via https://api.congress.gov/sign-up/ if exposed>
Endpoint: https://api.congress.gov/v3/...?api_key=<key>
The clerk.house.gov XML provides every field required for the core roll-call analysis (vote breakdown, member vote, bill ID, question, result, date), so the original vote-tally pipeline does not call this API. The key is used by the roster enrichment pass to fill in members who never appear in a roll-call vote during the analysis window, and remains available for extensions needing bill subject codes, cosponsor lists, or roll-call cross-references that the Clerk XML omits.
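Loading the key from `.env` at runtime, as described above, might look like the sketch below. It assumes a plain `KEY=value` file and uses only the standard library; the helper names `load_env_file` and `get_api_key` are hypothetical, and the real `enrich_roster.py` may load the key differently (for example via a dotenv package).

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict:
    """Parse simple KEY=value lines, ignoring blanks and # comments."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_api_key(path: str = ".env") -> str:
    """Prefer the process environment, then fall back to the .env file."""
    key = os.environ.get("CONGRESS_GOV_API_KEY") or load_env_file(path).get(
        "CONGRESS_GOV_API_KEY"
    )
    if not key:
        raise RuntimeError("CONGRESS_GOV_API_KEY is not set; add it to .env")
    return key
```

Failing loudly when the key is missing keeps the enrichment step from silently producing an incomplete roster.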
Security note: The API key has been moved out of this document and into
`.env` (gitignored). To rotate: (1) sign up for a replacement key at https://api.congress.gov/sign-up/, (2) put the new key into `.env` as `CONGRESS_GOV_API_KEY=...`, (3) delete the old key from the Congress.gov dashboard. If the previously exposed key is found in git history or in redistributed copies of this doc, rotate immediately.
| Display Name | Bioguide | Party | Chamber | Dashboard file |
|---|---|---|---|---|
| Thomas Massie (KY-4) | M001184 | R | House | ThomasMassie119.html |
| Ro Khanna (CA-17) | K000389 | D | House | RoKhanna119.html |
| Alexandria Ocasio-Cortez (NY-14) | O000172 | D | House | AlexandriaOcasioCortez119.html |
| Ilhan Omar (MN-5) | O000173 | D | House | IlhanOmar119.html |
| Marjorie Taylor Greene (GA-14) | G000596 | R | House | MarjorieTaylorGreene119.html |
| Jim Jordan (OH-4) | J000289 | R | House | JimJordan119.html |
| Byron Donalds (FL-19) | D000032 | R | House | ByronDonalds119.html |
The three Trump-loyal Republican House members selected (Greene, Jordan, Donalds) were chosen because political-press coverage most frequently identified them as among Donald Trump's most consistent House allies and surrogates during the 119th Congress.
| Requested Name | Reason for exclusion |
|---|---|
| Lindsey Graham | U.S. Senator (R-SC), not a House member. clerk.house.gov data does not cover the Senate. Senate roll-call XML lives at senate.gov/legislative/LIS/roll_call_votes/... with a different schema. Would require a separate fetcher; out of scope for this build. |
| Ron Paul | Not a member of the 119th Congress. Last served TX-14 in the 112th Congress (ended Jan 3, 2013). No 119th roll-call record exists. |
Some members have circumstances that materially affect how their voting record should be read. To prevent misinterpretation, the builder supports an optional member note — a short banner rendered directly under the dashboard header, above the KPI grid, styled as a yellow-bordered callout. The note is also preserved in source so it can be discovered when reading the roster.
Add a member note whenever the raw numbers would mislead a reader who does not already know the member's circumstances. Trigger conditions:
Notes are passed as the optional 5th element of a ROSTER tuple inside
build_member.py (House) or build_senator.py (Senate):
("G000596", "Marjorie Taylor Greene", "R", "MarjorieTaylorGreene119.html",
"Rep. Greene publicly announced her resignation from the House in late 2025, "
"effective January 5, 2026. This explains her substantially lower "
"participation count (325 of 553) versus other members analyzed. Votes "
"after her departure date are necessarily recorded as Not Voting in "
"clerk.house.gov data."),
The text appears verbatim inside a <div class="member-note"> banner.
Keep it under ~3 sentences. Lead with the fact; follow with the
implication for the metrics on this dashboard.
python3 build_member.py G000596 "Marjorie Taylor Greene" R MarjorieTaylorGreene119.html
# (Single-member CLI mode does not accept a note — re-run the full ROSTER
# loop with `python3 build_member.py` to pick up note changes.)
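A builder that accepts both 4-element and 5-element ROSTER tuples could unpack them as sketched below. The helper names are hypothetical, and escaping the note with `html.escape` is an assumption added for safety; the text above only says the note appears inside the banner.

```python
import html

# Two entries in the shape described above: the 5th element (note) is optional.
ROSTER = [
    ("M001184", "Thomas Massie", "R", "ThomasMassie119.html"),
    ("G000596", "Marjorie Taylor Greene", "R", "MarjorieTaylorGreene119.html",
     "Rep. Greene publicly announced her resignation from the House in late 2025, "
     "effective January 5, 2026."),
]


def unpack_entry(entry):
    """Return (bioguide, name, party, filename, note) with note=None if absent."""
    bioguide, name, party, filename, *rest = entry
    note = rest[0] if rest else None
    return bioguide, name, party, filename, note


def render_note(note):
    """Render the banner markup, or an empty string when there is no note."""
    if not note:
        return ""
    return f'<div class="member-note">{html.escape(note)}</div>'


for entry in ROSTER:
    print(unpack_entry(entry)[1], "->", bool(render_note(unpack_entry(entry)[4])))
```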
The fetcher (fetch_votes.py) executed once and cached every XML locally so
subsequent member-by-member analysis re-parses from disk (no re-fetch).
Steps performed:

1. Discovered the highest roll number per year: fetched
   `https://clerk.house.gov/evs/{YEAR}/ROLL_{000,100,200,300}.asp`
   for each calendar year, grep'd `rollnumber=NNN` parameters, took the max.
2. For each `year ∈ {2025, 2026} × roll ∈ [1..max_roll]`:
   - Fetched `https://clerk.house.gov/evs/{year}/roll{roll:03d}.xml`.
   - Saved the response to `vote_cache/{year}_{roll:03d}.xml`.
   - Sent a `User-Agent: Mozilla/5.0 (research; polisci-analysis)` header.
   - Called `time.sleep(0.35)` between successful network fetches (≈2.9 req/s),
     well under any conservative rate-limit threshold.
3. Left every fetched XML file in `vote_cache/`.

Total wall-clock time for the cold fetch: ≈3.5 minutes (553 × 0.35 s + transfer).
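The fetch procedure above can be sketched with the standard library. This is an illustration under stated assumptions, not the contents of `fetch_votes.py`: the function names are hypothetical, skipping files already on disk is assumed from the caching description, and the `get` parameter exists only so the loop can be exercised without network access.

```python
import re
import time
from pathlib import Path
from urllib.request import Request, urlopen

USER_AGENT = "Mozilla/5.0 (research; polisci-analysis)"


def max_roll(index_html: str) -> int:
    """Highest rollnumber=NNN parameter found in a year-index page (0 if none)."""
    rolls = [int(n) for n in re.findall(r"rollnumber=(\d+)", index_html)]
    return max(rolls, default=0)


def http_get(url: str) -> bytes:
    """Fetch one URL with the research User-Agent header."""
    req = Request(url, headers={"User-Agent": USER_AGENT})
    with urlopen(req, timeout=30) as resp:
        return resp.read()


def fetch_year(year, last_roll, cache_dir="vote_cache", get=http_get, delay=0.35):
    """Download rolls 1..last_roll into the cache; return how many were fetched."""
    cache = Path(cache_dir)
    cache.mkdir(parents=True, exist_ok=True)
    fetched = 0
    for roll in range(1, last_roll + 1):
        target = cache / f"{year}_{roll:03d}.xml"
        if target.exists():  # assumption: cached files are never re-fetched
            continue
        url = f"https://clerk.house.gov/evs/{year}/roll{roll:03d}.xml"
        target.write_bytes(get(url))
        fetched += 1
        time.sleep(delay)  # throttle only after a real network fetch
    return fetched
```

A cold run would call `fetch_year(2025, 362)` and `fetch_year(2026, 191)`; a second run does no network work because every target file already exists.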
Each roll{NNN}.xml follows DTD vote v1.0 20031119. Key extracted fields:
<rollcall-vote>
<vote-metadata>
<majority> R or D
<rollcall-num> e.g. 47
<legis-num> e.g. "H R 1234"
<vote-question> e.g. "On Passage"
<vote-result> e.g. "Passed" / "Failed" / "Agreed to" / "Rejected"
<action-date> e.g. "3-Jan-2025"
<vote-desc> short bill title
<vote-totals>
<totals-by-party> { party, yea-total, nay-total, present-total, not-voting-total } × R/D/I
<vote-data>
<recorded-vote>
<legislator name-id="M001184" party="R" state="KY" ...>Massie</legislator>
<vote>Yea | Nay | Aye | No | Present | Not Voting</vote>
... (one per legislator)
Aye/No are emitted for procedural questions; Yea/Nay for ordinary
passage. Both are normalized to Yea/Nay for analysis purposes.
A member's vote is looked up by exact match on legislator/@name-id against
the target Bioguide ID. If the member did not vote, the field is absent
entirely and we record None (rendered as "absent" in classification).
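The lookup and normalization described above can be sketched with `xml.etree.ElementTree`. The function name `member_vote` and the inline sample document are illustrative assumptions; the element and attribute names come from the schema outline above.

```python
import xml.etree.ElementTree as ET

# Aye/No are folded into Yea/Nay; other values pass through unchanged.
NORMALIZE = {"Aye": "Yea", "No": "Nay"}

SAMPLE = """
<rollcall-vote>
  <vote-data>
    <recorded-vote>
      <legislator name-id="M001184" party="R" state="KY">Massie</legislator>
      <vote>No</vote>
    </recorded-vote>
    <recorded-vote>
      <legislator name-id="K000389" party="D" state="CA">Khanna</legislator>
      <vote>Aye</vote>
    </recorded-vote>
  </vote-data>
</rollcall-vote>
"""


def member_vote(root, bioguide_id):
    """Return the member's normalized vote, or None if no record matches."""
    for record in root.iter("recorded-vote"):
        legislator = record.find("legislator")
        if legislator is not None and legislator.get("name-id") == bioguide_id:
            raw = (record.findtext("vote") or "").strip()
            return NORMALIZE.get(raw, raw)
    return None


root = ET.fromstring(SAMPLE)
print(member_vote(root, "M001184"))
```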
For every vote, the analyzer determines each party's majority position:
party_position = Yea if yea > nay
= Nay if nay > yea
= Split otherwise (tie or zero)
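As a direct transcription of the rule above (the function name is hypothetical):

```python
def party_position(yea: int, nay: int) -> str:
    """Majority position of one party on one vote."""
    if yea > nay:
        return "Yea"
    if nay > yea:
        return "Nay"
    return "Split"  # tie, or nobody in the party voted Yea or Nay


print(party_position(210, 5), party_position(3, 200), party_position(0, 0))
```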
Per vote, the member's normalized vote (Yea/Nay) is compared to each party's
majority position:

| Condition | Classification |
|---|---|
| Both R-pos and D-pos match member's vote | Helped Both |
| Only R-pos matches | Helped Republicans |
| Only D-pos matches | Helped Democrats |
| Neither matches (both opposed member) | Helped Neither |
| Member did not vote / voted Present | `N/A: <state>` |

"Helped Both" arises on bipartisan votes where both party majorities aligned (common on naming-a-post-office bills, suspension-calendar items, etc.). "Helped Neither" arises when the member votes against both party majorities, usually as part of a small protest/defector cluster.
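The alignment classes can be sketched as a pure function. The name `classify` is a stand-in (the project's own classifier is `classify_vote` in `analyze.py`, whose exact signature is not shown here), and treating a `Split` party position as a non-match is an assumption.

```python
def classify(member_vote, r_pos, d_pos):
    """Map a normalized member vote plus both party positions to a class label."""
    if member_vote not in ("Yea", "Nay"):
        # None means no record for the member; rendered as "absent".
        state = member_vote if member_vote is not None else "absent"
        return f"N/A: {state}"
    with_r = member_vote == r_pos  # a "Split" position never matches
    with_d = member_vote == d_pos
    if with_r and with_d:
        return "Helped Both"
    if with_r:
        return "Helped Republicans"
    if with_d:
        return "Helped Democrats"
    return "Helped Neither"


print(classify("Nay", "Yea", "Yea"))
```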
A "blocking win" is recorded when:

1. the member voted `Nay`, AND
2. the measure failed (`result` matches fail, reject, not agreed, not passed), AND
3. exactly one party's majority backed the measure, which sets the attribution:
   - `blocked = "Democrat"` if Dem majority was Yea and Rep majority was not Yea
     (i.e., Dems backed it, member's Nay helped sink it).
   - `blocked = "Republican"` if Rep majority was Yea and Dem majority was not Yea
     (i.e., GOP backed it, member's Nay helped sink it).
This metric attributes a single "share" of credit to the member for the defeat, regardless of margin. Members with many such tallies are disproportionately blocking their own caucus's agenda (notable for Massie).
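A sketch of the blocking-win test follows. The case-insensitive substring match on the result text is an assumption about how "matches" is implemented, and the function name is hypothetical.

```python
FAIL_MARKERS = ("fail", "reject", "not agreed", "not passed")


def blocking_win(member_vote, result, r_pos, d_pos):
    """Return "Democrat", "Republican", or None for a single vote."""
    if member_vote != "Nay":
        return None
    if not any(marker in result.lower() for marker in FAIL_MARKERS):
        return None  # the measure did not fail
    if d_pos == "Yea" and r_pos != "Yea":
        return "Democrat"    # Dems backed it; the Nay helped sink it
    if r_pos == "Yea" and d_pos != "Yea":
        return "Republican"  # GOP backed it; the Nay helped sink it
    return None  # both or neither party majority backed it


print(blocking_win("Nay", "Failed", "Yea", "Nay"))
```

Note that "Agreed to" and "Passed" contain none of the failure markers, so successful votes never count.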
Across votes where each party's majority took a definite position (not Split),
count how many times the member's normalized vote matched (with) or differed
(against) that party's majority. Reported as KPI cards plus a stacked bar
chart with raw counts and percentages.
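The with/against tallies could be computed as sketched below. The record layout (`member`, `r_pos`, `d_pos` keys) is hypothetical, and skipping votes where the member did not cast Yea or Nay is an assumption.

```python
def tally_alignment(votes, party_key):
    """Count with/against one party's majority, skipping Split positions."""
    with_count = against_count = 0
    for vote in votes:
        position = vote[party_key]
        if position == "Split" or vote["member"] not in ("Yea", "Nay"):
            continue
        if vote["member"] == position:
            with_count += 1
        else:
            against_count += 1
    total = with_count + against_count

    def pct(n):
        return round(100 * n / total, 1) if total else 0.0

    return {
        "with": with_count,
        "against": against_count,
        "with_pct": pct(with_count),
        "against_pct": pct(against_count),
    }


SAMPLE_VOTES = [
    {"member": "Yea", "r_pos": "Yea", "d_pos": "Nay"},
    {"member": "Nay", "r_pos": "Yea", "d_pos": "Nay"},
    {"member": "Yea", "r_pos": "Split", "d_pos": "Yea"},
    {"member": None, "r_pos": "Yea", "d_pos": "Yea"},
]
print(tally_alignment(SAMPLE_VOTES, "r_pos"))
```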
A vote is a lone-wolf defection if all of the following hold:
This identifies the "stubborn outliers" within a caucus.
Each vote's action-date is parsed (DD-Mon-YYYY) and bucketed by
YYYY-MM. The four primary alignment classes are summed per month
and rendered as a multi-series line chart.
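The monthly bucketing can be sketched with `datetime.strptime`. The input shape (a list of `(action_date, label)` pairs) is an assumption for illustration, and `%b` parsing assumes English month abbreviations in the active locale.

```python
from collections import Counter, defaultdict
from datetime import datetime

PRIMARY = ("Helped Both", "Helped Republicans", "Helped Democrats", "Helped Neither")


def month_bucket(action_date: str) -> str:
    """Convert a Clerk date such as "3-Jan-2025" into a "YYYY-MM" bucket."""
    return datetime.strptime(action_date, "%d-%b-%Y").strftime("%Y-%m")


def monthly_counts(votes):
    """Sum the four primary alignment classes per month, in chronological order."""
    buckets = defaultdict(Counter)
    for action_date, label in votes:
        if label in PRIMARY:  # N/A rows are not part of the line chart
            buckets[month_bucket(action_date)][label] += 1
    return {month: dict(counts) for month, counts in sorted(buckets.items())}


print(month_bucket("3-Jan-2025"))
```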
build_member.py holds a single HTML template string with __PLACEHOLDER__
tokens. The Python builder fills in the data, embeds the full per-vote JSON
payload (typically ~150 KB) inline, and writes the result to
results/<Member>119.html. The output has no runtime dependencies on local
files — Chart.js and SortableJS load from public CDNs.
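Placeholder substitution of this kind can be sketched as follows. The token names `__MEMBER_NAME__` and `__VOTES_JSON__` and the tiny template are hypothetical; the real template in `build_member.py` is much larger. Escaping `</` inside the JSON is an added precaution so that payload text cannot close the surrounding `<script>` element.

```python
import html
import json

TEMPLATE = """<!doctype html>
<title>__MEMBER_NAME__ (119th Congress)</title>
<script>const VOTES = __VOTES_JSON__;</script>
"""


def render(template: str, member_name: str, votes: list) -> str:
    """Fill the placeholder tokens and embed the per-vote payload inline."""
    payload = json.dumps(votes, separators=(",", ":"))
    payload = payload.replace("</", "<\\/")  # "\/" is a valid JSON escape for "/"
    return (
        template
        .replace("__MEMBER_NAME__", html.escape(member_name))
        .replace("__VOTES_JSON__", payload)
    )


print(render(TEMPLATE, "Thomas Massie", [{"roll": 1, "vote": "Yea"}]))
```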

- Card order is persisted in `localStorage` under a per-bioguide key, so each member's
  dashboard remembers its own layout.
- Each `<canvas>` is wrapped in a `position:relative; height:300px` container
  to prevent Chart.js's responsive-resize loop from infinitely growing the page.
- `.card { min-width: 0 }` is set so CSS grid columns can shrink properly with
  long content.

The tree is split into a working /data/ area (raw caches, intermediate
JSONL, per-member metric files) that the pipeline reads and writes, and a
/results/ area that is the actual embeddable artifact shipped to hosts.
Everything under /results/<C>/ is self-contained — no external network
calls at runtime, all vendored — while /data/ retains the upstream
caches and build metadata needed to re-derive results from scratch.
polisci/
├── DOCUMENTATION.md # this file
├── NOTES.md # deferred concerns; see file
├── PROJECT_SCOPE.md # PM-owned scope record (created/updated by PM agent only)
├── CLAUDE.md # Claude Code project context
├── Methodology.md # end-user methodology doc (copied into results/<C>/)
├── .env # gitignored — CONGRESS_GOV_API_KEY=...
├── .gitignore
├── .claude/skills/bulk-update/SKILL.md # project skill: post-change PM+docs+commit orchestration
├── fetch.py # idempotent network fetch
├── parse.py # XML → votes.jsonl + roster.json (+ merge w/ Congress.gov directory)
├── analyze.py # pure analytics; classify_vote + aggregate
├── enrich_roster.py # Congress.gov roster pull + LIS↔bioguide crosswalk
├── build_members.py # parallel per-member JSON build
├── build_app.py # template → results/<C>/ embeddable artifact
├── build_all.py # one-command orchestration
├── tests/ # pytest unit tests for analyze.py + parity_check.py
│ ├── fixtures/*.xml
│ ├── test_analyze.py
│ └── parity_check.py
├── template/
│ ├── app.html, app.css, app.js
│ ├── compare.html, compare.js
│ ├── ranking.html, ranking.js
│ └── vendor/
│ ├── chart.umd.min.js # Chart.js 4.4.0
│ └── sortable.min.js # SortableJS 1.15.2
├── data/119/
│ ├── house/{cache/, votes.jsonl, roster.json}
│ ├── senate/{cache/, votes.jsonl, roster.json}
│ ├── members/<id>.json # per-member metrics
│ ├── manifest.json # member index for picker
│ ├── members_directory.json # Congress.gov roster (~551 members)
│ ├── lis_to_bioguide.json # Senate ID crosswalk
│ ├── api_cache/ # cached Congress.gov responses
│ └── build_report.json
├── results/119/ # embeddable artifact — what ships
│ ├── app.html, compare.html, ranking.html, app.js, compare.js, ranking.js, app.css
│ ├── vendor/{chart…, sortable…}
│ ├── data/{manifest.json, members/<id>.json}
│ ├── Methodology.md # copied at build time by build_app.py
│ └── README.md
└── legacy/ # archived pre-pivot single-member dashboards
python3 fetch.py --congress 119 # idempotent; near-zero work if cache populated
python3 parse.py --congress 119 # XML → votes.jsonl + roster.json
python3 enrich_roster.py --congress 119 # Congress.gov API → members_directory.json + lis_to_bioguide.json
python3 parse.py --congress 119 # re-run: essential — re-does directory-merge now that members_directory.json + lis_to_bioguide.json are populated
pytest tests/ # gate: classifier behavior frozen
python3 build_members.py --congress 119 # parallel: writes per-member JSON + manifest + build_report
python3 build_app.py --congress 119 # template/ + vendor + data → results/<C>/
python3 build_all.py --congress 119

- `fetch.py`: downloads House (clerk.house.gov) and Senate
  (senate.gov/LIS) roll-call XML into `data/<C>/{house,senate}/cache/`.
  Idempotent; skips files already on disk.
- `parse.py`: parses cached XML into `votes.jsonl` and `roster.json`
  per chamber. If `members_directory.json` exists, merges it into the
  roster so members who never cast a vote are still listed.
- `enrich_roster.py`: pulls the canonical Congress.gov member
  directory and emits `members_directory.json` plus the
  `lis_to_bioguide.json` Senate-ID crosswalk. Requires
  `CONGRESS_GOV_API_KEY` in `.env`.
- `analyze.py`: pure functions (`classify_vote`, `aggregate`) shared
  by `build_members.py` and the parity tests. No I/O.
- `build_members.py`: parallel per-member metric computation;
  writes `data/<C>/members/<id>.json`, `manifest.json`, and
  `build_report.json`.
- `build_app.py`: copies `template/` + vendored libraries + the
  built data into `results/<C>/`, the self-contained, embeddable
  artifact.
- `build_all.py`: orchestrates the seven steps above with a single
  `--congress` argument.
- `pytest tests/`: gate run between parse and build; freezes
  classifier behavior and includes the legacy 8-member KPI parity check.

python3 build_all.py --congress 120
No code changes required; the pipeline is parameterized end-to-end on
--congress.

- Senate roll-call XML lives at `senate.gov/legislative/LIS/...`.
- […] `aggregate()` in `build_member.py` if needed.
- `Aye` is normalized to `Yea` for analysis; the distinction (passage vs. procedural) is preserved
  in the per-vote table column.
- Re-run `fetch_votes.py` and `build_member.py` to refresh.

| Date | Change |
|---|---|
| 2026-05-23 | Initial Massie dashboard built (553 votes, 6 KPIs, 5 charts, filterable table) |
| 2026-05-23 | Fixed Chart.js infinite-resize bug by wrapping canvases in fixed-height containers |
| 2026-05-23 | Added percentages to doughnut legend + tooltip |
| 2026-05-23 | Added inline count + % labels above each bar in vote-distribution chart |
| 2026-05-23 | Added "Voted Against GOP/Dem Majority" KPI cards + with/against stacked bar chart |
| 2026-05-23 | Added blocking-wins horizontal bar; percentages on all KPI cards |
| 2026-05-23 | Added SortableJS drag-and-drop card reordering with localStorage persistence |
| 2026-05-23 | Merged "Total Roll Calls" and "Massie Voted" into single Participation card |
| 2026-05-23 | Parameterized builder (build_member.py); generated dashboards for 6 additional House members |
| 2026-05-23 | Wrote DOCUMENTATION.md; moved Massie dashboard to results/ThomasMassie119.html |
| 2026-05-24 | Added §3.1 member-notes guidance; rendered MTG resignation banner on her dashboard |
| 2026-05-24 | Wrote Senate fetcher + builder; generated LindseyGraham119.html (see §12) |

[…] (`build_*`, `template/`, `results/`)

Replaced the 8 standalone dashboards in legacy/ with a parameterized
pipeline producing a single interactive SPA covering every 119th-Congress
member (552 total: 449 House + 103 Senate). Single-page member picker
with searchable typeahead + sidebar filters; comparison view overlays up
to 6 members across 5 charts; URL-deep-linkable; framework-free and
embeddable into third-party hosts via standalone, iframe, or inline
modes (see results/119/README.md). Roster completeness now sourced
from the Congress.gov API (enrich_roster.py); the Congress.gov API
key moved from §2 of this document into .env (gitignored). Phase 3
KPI-parity gate confirmed 8/8 legacy members reproduce exactly.
Generalizes to future Congresses via --congress N.
Known limitations carried forward to v1.1: see NOTES.md.
Four rounds of post-launch enhancements:
Rankings & UX (Round A) — New ranking.html / ranking.js page sorts
House or Senate members by any of 14 metrics with chamber/party filters and
shareable URL state. Manifest now carries per-member KPI dict (k) so the
page needs a single fetch. build_app.py inlines the full manifest into
both HTML heads, restoring file:// support for the picker (per-member JSON
still requires HTTP). Fixed a Chart.js infinite-growth feedback loop by
giving .chart-frame a fixed height and wrapping each canvas in a
.chart-canvas-wrap with position: relative + flex: 1 1 auto. Two new
KPI tiles ("Voted With GOP" / "Voted With Dem", now 8 total). Page footers
now link to the underlying clerk.house.gov / senate.gov XML and to a new
Methodology.md (end-user methodology doc, also copied into the shipping
artifact). New CLAUDE.md (Claude Code project context).
Delegate banner (Round B) — Detected the 6 territorial House delegates
(AS, DC, GU, MP, PR, VI) by USPS code; parse.py overrides the vote XMLs'
"XX" state code with the directory's real code. Per-member JSON gets an
is_delegate flag; app.js renders a yellow banner explaining that
delegates may vote in committee and on Committee-of-the-Whole amendments
but not on House final passage — their low participation rate is
structural, not absenteeism.
Structural banners for member-elect / replaced / died (Round C) —
enrich_roster.py now does three extra passes beyond the original bulk
fetch: (1) Rescue — individual /v3/member/{bg} lookups for any
vote-derived House bioguide missing from the per-Congress bulk listing
(recovers Matt Gaetz, FL-1, never-seated member-elect, plus his full name
and term history); (2) Replacement-linking — pairs predecessor↔successor
by (state, district) within the Congress window, emitting replaces /
replaced_by bioguide refs; (3) Detail-enrichment — individual lookups
for every member on a replacement chain to get accurate per-Congress
congress_term (startYear/endYear/district) and death_year (the bulk
listing only carries chamber + startYear). parse.py and build_members.py
propagate these into the per-member JSON; app.js renderNote() branches
by status with priority delegate > unseated > died > replaced_by > replaces >
served_partial. Predecessor and successor names render as in-app links via
`manifestById` lookup. 8 House replacement pairs auto-linked in the 119th (Gaetz→Patronis FL-1, Waltz→Fine FL-6, Grijalva R.→Grijalva A. AZ-7, Turner→Menefee TX-18, Connolly→Walkinshaw VA, Greene→Fuller GA, Green→Van Epps TN, Sherrill→Mejia NJ).
Developer tooling (Round D) — .claude/skills/bulk-update/SKILL.md
defines a project-level Claude Code skill that orchestrates the full
post-change cycle: PM updates PROJECT_SCOPE.md, parallel programmer agents
sync CLAUDE.md and DOCUMENTATION.md, then git add/commit/push.
This section captures the design for extending the House pipeline to U.S. Senators, plus the implementation choices for the first build (Lindsey Graham, R-SC).
Senate roll-call XML lives on senate.gov, not clerk.house.gov:

- Per-session vote index:
  `https://www.senate.gov/legislative/LIS/roll_call_lists/vote_menu_119_{S}.xml`
  where `{S}` is 1 (2025) or 2 (2026).
- Per-vote XML:
  `https://www.senate.gov/legislative/LIS/roll_call_votes/vote119{S}/vote_119_{S}_{NNNNN}.xml`
  (5-digit zero-padded vote number; note the different padding from House
  rolls, which use 3 digits).

No API key, no authentication. Same 350 ms throttle policy applies.
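The two Senate URL patterns can be sketched as helpers. The function names are hypothetical; the only detail that differs materially from the House scheme is the 5-digit padding.

```python
BASE = "https://www.senate.gov/legislative/LIS"


def senate_menu_url(congress: int, session: int) -> str:
    """URL of the per-session vote index (session 1 = 2025, 2 = 2026 for the 119th)."""
    return f"{BASE}/roll_call_lists/vote_menu_{congress}_{session}.xml"


def senate_vote_url(congress: int, session: int, vote: int) -> str:
    """URL of one roll-call vote; the vote number is zero-padded to 5 digits."""
    return (
        f"{BASE}/roll_call_votes/vote{congress}{session}/"
        f"vote_{congress}_{session}_{vote:05d}.xml"
    )


print(senate_vote_url(119, 1, 7))
```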

| Field | House (clerk.house.gov) | Senate (senate.gov) |
|---|---|---|
| Root element | `rollcall-vote` | `roll_call_vote` |
| Per-party totals | `vote-totals/totals-by-party` | Not present; must aggregate from per-member records |
| Member ID | Bioguide (`name-id="M001184"`) | LIS (`lis_member_id`, e.g. `S293`) |
| Vote element | `<vote>Yea</vote>` | `<vote_cast>Yea</vote_cast>` |
| Member party | attribute on `<legislator>` | child `<party>` element |
| Date format | `3-Jan-2025` | `January 9, 2025, 02:54 PM` |
| Vote question | `<vote-question>` + `<vote-desc>` | `<question>` + `<vote_title>` + `<vote_document_text>` |
| Result | `<vote-result>` (e.g. "Failed") | `<vote_result>` (e.g. "Cloture on the Motion to Proceed Agreed to") |

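Because the two chambers use different date formats, a shared analysis step needs to normalize them first. The sketch below converts both to ISO dates; the function name and the `chamber` argument are hypothetical, and month-name parsing assumes an English locale.

```python
from datetime import datetime

# Format strings for the two examples shown in the comparison table.
DATE_FORMATS = {
    "house": "%d-%b-%Y",               # 3-Jan-2025
    "senate": "%B %d, %Y, %I:%M %p",   # January 9, 2025, 02:54 PM
}


def to_iso_date(chamber: str, raw: str) -> str:
    """Return YYYY-MM-DD for a chamber-specific date string."""
    parsed = datetime.strptime(raw.strip(), DATE_FORMATS[chamber])
    return parsed.date().isoformat()


print(to_iso_date("senate", "January 9, 2025, 02:54 PM"))
```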
Two new scripts, mirroring the House pair:

- `fetch_senate.py`: fetches `vote_menu_119_{1,2}.xml`, discovers max
  vote number per session, then loops to download every per-vote XML to
  `senate_vote_cache/{S}_{NNNNN}.xml`. Idempotent like the House fetcher.
- `build_senator.py`: parses cached XML, aggregates per-party totals
  from member records, runs the same classification logic as the House
  builder (§6) so the output is methodologically comparable across chambers.
  Emits HTML to `results/<Senator>119.html` using the same template
  (substituting "House" → "Senate" in the header and source attribution).

Implementation choices:

- Per-party totals: iterate `<members>/<member>`
  and increment `{R,D,I} × {yea,nay,present,not_voting}` based on
  `<party>` and `<vote_cast>`. Independents are tallied separately but
  classified by which caucus they conference with (Sanders, King → D
  for majority-position computation, since they reliably caucus with
  Democrats).
- Vote values: the Senate uses only `Yea`/`Nay`/`Present`/`Not Voting`;
  no Aye/No distinction. No normalization needed.
- Bill identifier: `<document>/<document_type>` +
  `<document_number>` (e.g. "S. 5"). For nominations, fall back to
  `<vote_title>` (which contains "Motion to Invoke Cloture: ...").
- Member IDs for Graham: `S293` (LIS), `G000359` (bioguide).

| Display Name | LIS ID | Party | Chamber | Dashboard file |
|---|---|---|---|---|
| Lindsey Graham (SC) | S293 | R | Senate | LindseyGraham119.html |
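Since Senate XML carries no per-party totals, they have to be rebuilt from member records as described in the implementation choices above. A minimal sketch follows; the function names and the trimmed sample document are assumptions, and folding all independents into the Democratic caucus reflects the Sanders/King rule stated above rather than a general rule.

```python
import xml.etree.ElementTree as ET
from collections import Counter

COLUMN = {"Yea": "yea", "Nay": "nay", "Present": "present", "Not Voting": "not_voting"}

SAMPLE = """
<roll_call_vote>
  <members>
    <member><lis_member_id>S293</lis_member_id><party>R</party><vote_cast>Yea</vote_cast></member>
    <member><party>R</party><vote_cast>Yea</vote_cast></member>
    <member><party>D</party><vote_cast>Nay</vote_cast></member>
    <member><party>I</party><vote_cast>Nay</vote_cast></member>
    <member><party>D</party><vote_cast>Not Voting</vote_cast></member>
  </members>
</roll_call_vote>
"""


def party_totals(root):
    """Tally {R, D, I} x {yea, nay, present, not_voting} from member records."""
    totals = {party: Counter() for party in ("R", "D", "I")}
    for member in root.iterfind("members/member"):
        party = (member.findtext("party") or "").strip()
        column = COLUMN.get((member.findtext("vote_cast") or "").strip())
        if party in totals and column:
            totals[party][column] += 1
    return totals


def caucus_totals(totals):
    """Fold independents into D for majority-position computation."""
    return {"R": totals["R"], "D": totals["D"] + totals["I"]}


totals = party_totals(ET.fromstring(SAMPLE))
print(dict(caucus_totals(totals)["D"]))
```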
Suggested next senators (not built yet):
Add via ROSTER in build_senator.py once written.
The HTML template, CSS, and JS in build_senator.py are intentionally
identical to build_member.py so dashboards are directly comparable.
The only structural changes:

- The header reads "Senate" instead of "House".
- Source attribution points to senate.gov instead of clerk.house.gov.

This is intentional: the comparative value of these dashboards depends on consistent visual + methodological treatment across chambers.