
Project: 119th Congress Voting Dashboard

Written post-hoc by the PM agent (2026-05-24) to codify shipped reality through Phase 6. The implementation plan that drove the build lives at research/PLAN.md. Only the PM agent may edit this file.


1. Project Overview

A static-file, framework-free interactive dashboard that surfaces member-level roll-call voting behavior for every member of the 119th US Congress. Eight standalone legacy HTML files (Massie, Khanna, AOC, Omar, MTG, Jordan, Donalds, Graham) were replaced with a unified single-page application covering all ~552 seated members, served from a single build artifact at results/119/.

The project is a personal research and analysis tool, built and operated by a solo hobbyist analyst. It is designed for the analyst's own use and for sharing with third-party hosts who can embed the artifact via iframe or inline <div>.


2. Target Users

| User | Description |
| --- | --- |
| Primary | The analyst themselves — personal research, exploration, and editorial work on congressional voting patterns. |
| Secondary | Third-party hosts who embed the artifact via iframe or inline <div> into their own pages. Embedding is fully supported by design. |

3. Core Value Proposition

Given any member of the 119th Congress, instantly render a reproducible, citation-quality breakdown of how their roll-call votes aligned with or diverged from each party's majority position — without a page reload, without a server, and without any external network dependency after the initial page load.


4. Scope of Work

In Scope (MVP — as built through Phase 6)

| Feature | Description | Priority |
| --- | --- | --- |
| Single-member SPA dashboard | app.html — searchable typeahead + sidebar filters (chamber, party, state); on selection, fetches per-member JSON and mutates Chart.js datasets in place without teardown | Must Have |
| 5 core charts | Vote distribution, alignment doughnut, blocking bars, alignment-over-time line, with/against stacked bar — created once at page init | Must Have |
| Sortable/filterable vote table | Per-vote rows; all upstream strings rendered via textContent (no innerHTML) | Must Have |
| URL deep-linking | pushState on member selection; replaceState on filter typing; reload restores state | Must Have |
| localStorage persistence | Last-selected member persisted as polisci:v119:lastMember; validated against manifest on read | Could Have (shipped) |
| Comparison view | compare.html — multi-select pills (up to 6 members); 5 overlay charts; shareable ?ids= URL | Must Have |
| All-member coverage | enrich_roster.py pulls the complete 119th roster from Congress.gov API; members with zero roll-call votes receive a served_partial banner | Must Have |
| Framework-free embedding | Three modes: standalone, iframe (sandbox="allow-scripts allow-same-origin"), inline <div id="polisci-root" data-base="…"> | Must Have |
| Zero CDN dependencies | Chart.js 4.4.0 + SortableJS 1.15.2 vendored locally in template/vendor/; no runtime external requests | Must Have |
| Parameterized pipeline | All build scripts accept --congress N; generalizes to future Congresses without code changes | Must Have |
| Reproducibility metadata | Every per-member JSON includes a _meta block: schema version, pipeline version, classifier hash, data snapshot date, source XML counts | Must Have |
| Test suite | pytest tests/test_analyze.py with frozen XML fixtures covering partisan, bipartisan, absent-member, and failed-blocking cases | Must Have |
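
A minimal sketch of how a _meta block of this kind might be assembled. The key names, the version strings, the placeholder date, and the choice of SHA-256 over the classifier source are assumptions for illustration; the real keys are whatever build_members.py emits.

```python
import hashlib
import json
from datetime import date


def build_meta(classifier_source: bytes, house_xml: int, senate_xml: int,
               snapshot: date) -> dict:
    """Assemble a reproducibility block for one per-member JSON file."""
    return {
        "schema_version": "1",        # hypothetical value
        "pipeline_version": "0.0.0",  # hypothetical value
        # Hashing the classifier source makes a logic change detectable.
        "classifier_hash": hashlib.sha256(classifier_source).hexdigest(),
        "data_snapshot": snapshot.isoformat(),
        "source_xml_counts": {"house": house_xml, "senate": senate_xml},
    }


# Placeholder inputs: the bytes and the date are illustrative only.
meta = build_meta(b"def classify(vote): ...", 553, 789, date(2026, 1, 1))
print(json.dumps(meta, indent=2))
```

Embedding the hash and snapshot date in every file lets a reader tell whether two member JSONs came from the same classifier and the same data pull.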

Out of Scope (Future / Deferred — see NOTES.md for full rationale)

  • Editorial label rewording — "Helped Republicans / Blocked Dem-Backed" language kept per user preference; neutral alternatives deferred to v1.1 before any third-party publication (NOTES item 1, compliance Finding 1, High risk)
  • Comparison chart trimming — all 5 overlay charts shipped; reducing the set deferred pending user-research feedback (NOTES item 2)
  • Visible caveats panel — collapsible "How to read this" panel describing methodology limitations deferred to v1.1 (NOTES item 3, compliance Finding 2)
  • Full localStorage persistence — only lastMember is persisted; filter state persistence deferred (NOTES item 4)
  • 120th Congress dry-run — pipeline is parameterized but cannot be validated end-to-end until 120th data exists (NOTES item 5)
  • True own-party-defection series — compare.html uses Helped Neither as a proxy; a precise monthly series requires a future analyze.py enhancement (NOTES item 6)

5. Feature Flow

Single-member view (app.html):

1. Page loads → fetches manifest.json (552 members, version-stamped)
2. Sidebar populated (chamber, party, state checkboxes)
3. User types in typeahead or applies sidebar filters → list narrows live
4. User selects a member → pushState updates URL to ?id=<bioguide>
5. App fetches data/members/<id>.json (cache-busted by manifest version)
6. Chart.js datasets mutated in place; chart.update('none') called — no teardown
7. KPI cards, sortable/filterable vote table, and member-note banner update
8. Reloading the URL restores the same member

Comparison view (compare.html):

1. Page loads → same manifest fetch and sidebar setup
2. User selects members via typeahead → color-coded pills appear
3. App fetches each selected member's JSON; 5 overlay charts update
4. URL updates to ?ids=<id1>,<id2>,... (shareable, capped at 6)
5. Pill click opens member's app.html?id=<id> in a new tab
6. Reloading restores all selected members from URL
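
The restore-from-URL step can be sketched in Python for illustration, although the shipped logic lives in the page's JavaScript. The ID pattern is the one quoted under Security posture. The function name, dropping duplicates, and silently ignoring IDs beyond the cap are assumptions, and the IDs in the usage line are hypothetical.

```python
import re

# Pattern quoted from the Security posture section of this document.
ID_PATTERN = re.compile(r"^[A-Z]\d{6}$|^S\d{3}$")
MAX_COMPARE = 6  # comparison view cap


def parse_ids(raw: str, manifest_ids: set[str]) -> list[str]:
    """Return validated member IDs from an ?ids= value, in URL order."""
    selected: list[str] = []
    for candidate in raw.split(","):
        if len(selected) == MAX_COMPARE:
            break  # assumption: extra IDs are ignored rather than rejected
        if (ID_PATTERN.fullmatch(candidate)
                and candidate in manifest_ids
                and candidate not in selected):
            selected.append(candidate)
    return selected


manifest = {"A000001", "B000002", "S123"}  # hypothetical IDs
print(parse_ids("A000001,<script>,S123,A000001", manifest))
```

Checking the pattern before the manifest lookup means malformed input never reaches a fetch URL or the DOM.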

6. Usability Concerns

  • Mobile: sidebar collapses on small viewports; responsive @media (max-width: 768px) block in app.css
  • Accessibility: all upstream strings rendered via textContent; bill links built via createElement with validated href
  • Embedding safety: all CSS namespaced under #polisci-root (0 unscoped rules per Phase 6 audit); data-base attribute makes data path host-configurable
  • Performance: 552-member manifest loads once; per-member JSON is ~80 KB; switching members costs one fetch, not a page load; full build completes in under 5 seconds
  • No external requests after page load: verified by Phase 6 grep audit and HTTP smoke test

7. Technical Considerations

Tech stack:

  • Python 3, stdlib only (no third-party packages at runtime); pytest for tests
  • Vanilla JS (ES2020, no framework, no transpiler)
  • HTML/CSS (no preprocessor)
  • Chart.js 4.4.0 + SortableJS 1.15.2, vendored locally

Data pipeline:

clerk.house.gov XML  ──┐
                       ├─→ fetch.py → parse.py → enrich_roster.py
senate.gov XML     ────┘                              │
Congress.gov API ─────────────────────────────────────┘
                                     ↓
                             build_members.py (parallel pool)
                             → data/119/members/<id>.json × 552
                             → data/119/manifest.json
                                     ↓
                               build_app.py
                             → results/119/  (embeddable artifact)

Data sources:

  • clerk.house.gov — House roll-call XML (553 votes cached)
  • senate.gov — Senate roll-call XML (789 votes cached)
  • congress.gov/v3 API — complete 119th roster + Senate LIS-to-bioguide crosswalk; API key in .env (gitignored)

Security posture:

  • All upstream strings rendered via textContent, never innerHTML
  • parse.py rejects strings containing <, >, or control characters
  • Query string id matched against ^[A-Z]\d{6}$|^S\d{3}$ and verified against manifest before any fetch or DOM use
  • ids capped at 6; each validated against manifest
  • localStorage values regex-validated against manifest allowlist on read
  • No postMessage API (frame-boundary attack surface closed in v1)
  • No CDN, so no Subresource Integrity (SRI) hashes to maintain
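
A minimal sketch of the kind of string check parse.py is described as applying. The function name is hypothetical, and treating "control characters" as Unicode category Cc (which includes tab and newline) is an assumption; the real rule may be narrower.

```python
import unicodedata


def is_safe_upstream_string(value: str) -> bool:
    """Return False for strings containing '<', '>', or a control character."""
    for ch in value:
        if ch in "<>":
            return False
        # Category "Cc" covers the ASCII and C1 control characters.
        if unicodedata.category(ch) == "Cc":
            return False
    return True


print(is_safe_upstream_string("H.R. 1 - On Passage"))
print(is_safe_upstream_string("<b>On Passage</b>"))
```

Rejecting at parse time complements the textContent rule: markup is stopped before it reaches a member JSON, and would be rendered inertly even if it did.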

Deployment: fully static; serve results/119/ from any file host or web server. Recommended CSP and iframe sandbox snippet documented in results/119/README.md.


8. Implementation Plan

Phase 0 — Reset and scaffolding

  • Task 0.1 — Create .env with API key; add .gitignore (.env, __pycache__/, *.pyc, data/*/cache/)
  • Task 0.2 — Redact API key from DOCUMENTATION.md §2; add rotation guidance
  • Task 0.3 — Create NOTES.md with 6 deferred concerns
  • Task 0.4 — Download Chart.js 4.4.0 + SortableJS 1.15.2 into template/vendor/

Phase 0.5 — Complete-roster enrichment

  • Task 0.5.1 — Write enrich_roster.py: paginate Congress.gov /member/congress/119; write members_directory.json; build lis_to_bioguide.json via second-pass senator detail fetch
  • Task 0.5.2 — Modify parse.py: merge members_directory.json into roster.json; apply served_partial flag to zero-vote members; validate len(roster) >= 535
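
The merge in Task 0.5.2 could look roughly like the sketch below. The dictionary shapes, the function name, and the rule that XML-derived fields win on conflict are assumptions; only the served_partial flag and the 535-member floor come from the task description.

```python
MIN_ROSTER = 535  # floor quoted from Task 0.5.2


def merge_roster(voted: dict[str, dict], directory: dict[str, dict],
                 min_size: int = MIN_ROSTER) -> dict[str, dict]:
    """Merge the full member directory into the roster built from roll-call XML."""
    roster = {member_id: dict(info) for member_id, info in voted.items()}
    for member_id, info in directory.items():
        if member_id in roster:
            # Directory fields fill gaps; XML-derived fields win on conflict.
            roster[member_id] = {**info, **roster[member_id]}
        else:
            # Listed in the directory but absent from every roll call.
            roster[member_id] = {**info, "served_partial": True}
    if len(roster) < min_size:
        raise ValueError(
            f"roster has {len(roster)} members, expected at least {min_size}")
    return roster
```

Failing loudly on a short roster keeps a partial Congress.gov fetch from silently producing an incomplete build.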

Phase 1 — Foundation

  • Task 1.1 — Write build_members.py: multiprocessing.Pool, atomic writes, _meta block, build_report.json, manifest array + version field
  • Task 1.2 — Write tests/fixtures/*.xml (partisan, bipartisan, absent, failed-blocking) + tests/test_analyze.py
  • Task 1.3 — Write skeleton template/app.html + template/app.css (namespaced under #polisci-root)
  • Task 1.4 — Gate: enrich_roster.py runs clean; parse.py produces merged roster; pytest passes; build_members.py emits ≥535 JSONs + manifest
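
Task 1.1 names two techniques, atomic writes and a multiprocessing.Pool. The sketch below shows one common way to combine them using only the standard library. The function names and the stub payload are hypothetical, and the real per-member analysis is elided.

```python
import json
import os
import tempfile
from multiprocessing import Pool
from pathlib import Path


def write_json_atomic(path: Path, payload: dict) -> None:
    """Write to a temp file beside the target, then rename it into place."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle)
        # os.replace is atomic when source and target share a filesystem.
        os.replace(tmp_name, path)
    except BaseException:
        os.unlink(tmp_name)  # do not leave a half-written temp file behind
        raise


def build_one(job: tuple[str, str]) -> str:
    """Worker: build one member file (stub payload stands in for analysis)."""
    out_dir, member_id = job
    write_json_atomic(Path(out_dir) / f"{member_id}.json", {"id": member_id})
    return member_id


def build_all(out_dir: str, member_ids: list[str]) -> list[str]:
    """Fan the per-member jobs out across a process pool."""
    with Pool() as pool:
        return pool.map(build_one, [(out_dir, m) for m in member_ids])
```

Call build_all from under an `if __name__ == "__main__":` guard so it also works on platforms that spawn worker processes. Because each worker renames a finished file into place, a reader never sees a truncated member JSON even if the build is interrupted.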

Phase 2 — Single-member view

  • Task 2.1 — Write template/app.js: manifest loader, sidebar filters, typeahead, member-fetch + in-place Chart.js updates for all 5 charts
  • Task 2.2 — Sortable/filterable vote table (ported from legacy; textContent substitution)
  • Task 2.3 — URL deep-linking (pushState on selection, replaceState on filter typing, popstate handler)
  • Task 2.4 — Member-note banner for served_partial members
  • Task 2.5 — localStorage persistence for lastMember (namespaced + validated)

Phase 3 — KPI parity gate (hard gate)

  • Task 3.1 — Regenerate 8 legacy members through new pipeline; diff all KPIs against legacy/*.html
  • Task 3.2 — Confirm MTG banner, deep-links, and no CDN traffic
  • Task 3.3 — Gate result: 8/8 PASS — safe to proceed to Phase 4
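
The parity gate amounts to a per-member KPI diff. A sketch under stated assumptions: KPIs are taken to be flat name-to-number mappings already extracted from the legacy HTML and the new JSON, and the function names, the tolerance parameter, and the KPI name in the usage line are hypothetical.

```python
def diff_kpis(legacy: dict[str, float], rebuilt: dict[str, float],
              tolerance: float = 0.0) -> dict[str, tuple]:
    """Return {kpi_name: (legacy_value, rebuilt_value)} for every mismatch."""
    mismatches = {}
    for name in sorted(legacy.keys() | rebuilt.keys()):
        old, new = legacy.get(name), rebuilt.get(name)
        # A KPI missing on either side counts as a mismatch.
        if old is None or new is None or abs(old - new) > tolerance:
            mismatches[name] = (old, new)
    return mismatches


def parity_gate(members: dict[str, tuple[dict, dict]]) -> bool:
    """Print a verdict per member; pass only if every member matches."""
    results = {mid: diff_kpis(old, new) for mid, (old, new) in members.items()}
    for mid, diff in results.items():
        print(f"{mid}: {'PASS' if not diff else 'FAIL ' + repr(diff)}")
    return all(not diff for diff in results.values())


# Hypothetical member ID and KPI name.
ok = parity_gate({"A000001": ({"total_votes": 10}, {"total_votes": 10})})
```

Treating the gate as all-or-nothing matches its role as a hard gate: one mismatched KPI blocks the full build.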

Phase 4 — Full member build

  • Task 4.1 — build_members.py --congress 119 for all members; confirm 552 JSONs, 0 failures in build_report.json
  • Task 4.2 — Smoke-test 10 randomly chosen members across both chambers and all parties

Phase 5 — Comparison view

  • Task 5.1 — Write compare.html + multi-select pills, shareable ?ids= URL (scaffolding)
  • Task 5.2 — Overlay chart 1: alignment-over-time line (per member, with alignment-class switcher)
  • Task 5.3 — Overlay chart 2: voted-against-own-party rate over time (Helped Neither proxy)
  • Task 5.4 — Overlay chart 3: side-by-side KPI grouped bar
  • Task 5.5 — Overlay chart 4: defection scatter (X: % against GOP, Y: % against Dem)
  • Task 5.6 — Overlay chart 5: vote-distribution grouped bar (Yea/Nay/Present/Not Voting)

Phase 6 — Embedding, security hardening, polish

  • Task 6.1 — Write build_app.py: copy template → results; stamp manifest version into HTML; copy data; write results/119/README.md with CSP + sandbox snippet
  • Task 6.2 — CSS namespace audit: 0 unscoped rules confirmed; inline-div embed smoke test in results/119/_embed_test.html
  • Task 6.3 — data-base attribute support; iframe embed test (_iframe_test.html, _iframe_compare_test.html); HTTP smoke via python3 -m http.server 8765; external-URL grep audit (zero runtime external calls)
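
The external-URL audit in Task 6.3 is described as a grep. An equivalent standard-library sketch follows; the function name, the regular expression, and the set of scanned file suffixes are assumptions. Hits still need manual triage, since a URL in a vendored library's license comment is not a runtime request.

```python
import re
from pathlib import Path

URL_RE = re.compile(r"""https?://[^\s"'<>)]+""")
SCANNED_SUFFIXES = {".html", ".js", ".css"}  # assumed set of code files


def find_external_urls(root: Path) -> list[tuple[str, int, str]]:
    """Return (relative_path, line_number, url) for each absolute URL found."""
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in SCANNED_SUFFIXES:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            for match in URL_RE.finditer(line):
                hits.append((str(path.relative_to(root)), lineno, match.group()))
    return hits
```

Running it against the built artifact directory and reviewing the returned list gives a repeatable version of the manual audit.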

Phase 7 — Documentation and close-out (in progress)

  • Task 7.1 — Update DOCUMENTATION.md §8 (new file layout), §9 (new regeneration commands), §11 (change-log entries)
  • Task 7.2 — Write PROJECT_SCOPE.md reflecting shipped reality (this file)
  • Task 7.3 — Delete legacy/ after user confirmation

9. Success Criteria

  • pytest tests/ — all green
  • build_members.py --congress 119 — 552 member JSONs, 0 failures, completes in under 5 seconds
  • Phase 3 KPI parity gate — 8/8 members PASS against legacy output
  • Phase 6 CSS audit — 0 unscoped rules
  • Phase 6 external-URL audit — 0 runtime external network calls
  • Switching members in app.html re-renders without a page load
  • Comparison view accepts up to 6 members and renders all 5 overlay charts
  • Iframe and inline-div embedding verified
  • Manual cross-browser smoke (Chrome, Firefox, Safari) — deferred to user (no headless browser available on build host)

10. Known Risks and Open Items

See NOTES.md for the full 6-item list with rationale. Summary:

  1. Editorial label wording (High, compliance Finding 1) — biggest risk before any third-party publication. Labels impute intent; neutral alternatives exist. Must resolve before v1.1 public release.
  2. Comparison chart redundancy (Medium) — 5 charts shipped; trim later if user research shows overlap.
  3. Visible caveats panel (Medium, compliance Finding 2) — methodology limitations not surfaced to end users yet; deferred to v1.1.
  4. localStorage scope (Low) — only lastMember persisted; full filter persistence not implemented.
  5. 120th Congress validation (Low) — pipeline parameterization untested against real data; validate when 120th data lands.
  6. Own-party-defection proxy (Low) — Helped Neither used as proxy in compare chart; a true series requires a future analyze.py change.

11. Future Congresses

The pipeline is fully parameterized. To build the 120th Congress dashboard:

python3 fetch.py          --congress 120   # once roll-call data is available
python3 parse.py          --congress 120
python3 enrich_roster.py  --congress 120
pytest tests/
python3 build_members.py  --congress 120
python3 build_app.py      --congress 120
# artifact at results/120/

No code changes are required. The only prerequisite is live data from clerk.house.gov and senate.gov for the 120th Congress.
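
The command sequence above could be wrapped in a small driver that stops at the first failing step. This is a sketch, not part of the shipped pipeline: the function names are hypothetical, and it assumes the scripts sit in the current working directory and that `python -m pytest` is an acceptable stand-in for the bare `pytest` command.

```python
import subprocess
import sys


def build_commands(congress: int) -> list[list[str]]:
    """Return the pipeline steps, in the documented order, as argv lists."""
    flag = ["--congress", str(congress)]
    return [
        [sys.executable, "fetch.py", *flag],
        [sys.executable, "parse.py", *flag],
        [sys.executable, "enrich_roster.py", *flag],
        [sys.executable, "-m", "pytest", "tests/"],
        [sys.executable, "build_members.py", *flag],
        [sys.executable, "build_app.py", *flag],
    ]


def run_pipeline(congress: int) -> None:
    """Run each step in order; check=True aborts on the first non-zero exit."""
    for cmd in build_commands(congress):
        print("running:", " ".join(cmd))
        subprocess.run(cmd, check=True)
```

Calling `run_pipeline(120)` would then reproduce the manual sequence, with the test suite acting as a gate before any member files are built.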


12. Extra Features

Features added after initial scope. Complete current Implementation Plan progress before starting these.

| Feature | Description | Added On | Rationale |
| --- | --- | --- | --- |
| (none yet) | | | |