# Methodology

How the 119th Congress voting-dashboard metrics are computed, and what they do
and don't measure.

## 1. Data sources

- **House roll calls** — Clerk of the House XML, one file per roll-call vote
  (`https://clerk.house.gov/evs/<year>/roll<NNN>.xml`). User-facing index:
  https://clerk.house.gov/Votes
- **Senate roll calls** — Senate.gov LIS XML, one file per roll-call vote
  (`https://www.senate.gov/legislative/LIS/roll_call_votes/...`). User-facing
  index: https://www.senate.gov/legislative/votes_new.htm
- **Member roster** — Congress.gov v3 API
  (`/member/congress/119`) supplements vote-derived rosters so members who
  served but never cast a recorded vote are still represented.

Coverage for the 119th Congress as of the most recent build: 553 House +
789 Senate roll-call votes; 552 distinct members (449 House + 103 Senate).

## 2. Vote normalization

Each member's recorded vote on a roll call is one of `Yea`, `Nay`, `Aye`,
`No`, `Present`, `Not Voting`. For analysis, `Aye`/`Yea` are merged into
`Yea` and `No`/`Nay` are merged into `Nay` — the procedural/substantive
distinction is preserved only in the per-vote table column.

For every vote, each party's **majority position** is computed:

```
party_position = Yea   if yea > nay
               = Nay   if nay > yea
               = Split otherwise (tie or zero)
```

## 3. Alignment classification

Per vote, the member's normalized `Yea`/`Nay` is compared to each party's
majority position:

| Condition                                | Label                |
|------------------------------------------|----------------------|
| Member matches BOTH party majorities     | `Helped Both`        |
| Member matches only Republican majority  | `Helped Republicans` |
| Member matches only Democratic majority  | `Helped Democrats`   |
| Member matches NEITHER party majority    | `Helped Neither`     |
| Member did not cast a Yea/Nay            | `N/A: <state>`       |

- **Helped Both** arises on bipartisan votes (post-office namings, suspension
  calendar items, broadly popular measures).
- **Helped Neither** arises when the member is on the losing side relative to
  both party leaderships — typically a small protest/defector cluster.

## 4. Blocking analysis

A **blocking win** is recorded when ALL of the following hold:

1. The member voted `Nay`.
2. The measure failed (`result` matches `fail`, `reject`, `not agreed`,
   `not passed`, `not invoked`).
3. The other party's majority was on the opposite side (it was a partisan
   defeat, not a bipartisan one).

- `blocked = "Democrat"` — Democratic majority was Yea, Republican majority
  was not Yea, member voted Nay → counted as helping sink a Democrat-backed
  measure.
- `blocked = "Republican"` — symmetric.

This is a per-share count, not a marginal causal estimate: the metric
credits the member individually for a defeat that may have involved hundreds
of other Nay votes.

## 5. Voted-with / voted-against by party majority

Across votes where each party had a definite majority (not Split), the
member's normalized vote is counted as matching (`with`) or differing
(`against`) that party's majority. Reported as KPI tiles and as a
stacked bar chart of raw counts and percentages.

## 6. Lone-wolf defection

A vote counts as a **lone-wolf defection** when:

1. The member's own party (per roster) had a definite majority position.
2. The member's normalized vote opposed that majority.
3. The number of fellow same-party defectors was at most the chamber
   threshold (**5 in the House, 3 in the Senate**).

Identifies stubborn outliers — members who repeatedly break with their
own caucus when very few others do.

## 7. Monthly trend

Each vote's date is bucketed by `YYYY-MM`. The four primary alignment
classes (Helped Republicans / Helped Democrats / Helped Both /
Helped Neither) are summed per month and rendered as a multi-series
line chart.

## 8. Ranking metrics

The Rankings page sorts the chosen chamber by any of:

- Raw counts: Total Votes, Yeas, Nays, Voted With/Against GOP, Voted
  With/Against Dem, Lone Wolf Votes
- Percentages of votes cast: Participation %, Voted With/Against GOP %,
  Voted With/Against Dem %, Lone Wolf %

Percentage metrics exclude members with zero votes cast (denominator
undefined).

## 9. Comparison view

The comparison view overlays up to six members on five charts: alignment
over time, "voted against own party" rate over time (using
`Helped Neither` as a proxy — see Limitations §10), grouped KPI bar,
defection scatter (% against GOP vs % against Dem), and vote distribution.

## 10. Known limitations and caveats

- **Roll-call votes only.** Voice votes, unanimous-consent agreements, and
  motions adopted without a recorded vote are invisible. A member's
  silence on a controversial measure that passed by voice cannot be
  detected here.
- **Procedural vs. substantive votes are treated equally.** Bill-subject
  classification is not attempted. Heavy weighting of procedural-calendar
  votes can inflate "Helped Republicans" / "Helped Democrats" counts vs.
  the substantive picture.
- **"Helped Both" interpretation.** A bipartisan vote that passes
  overwhelmingly is a real instance of the member helping both sides — but
  it can visually dilute the more interesting partisan classes.
- **Blocking-wins attribution.** Each blocking tally credits the member
  individually for a defeat that may have involved hundreds of other Nay
  votes. The metric is a count of partisan defeats the member's Nay vote
  belonged to, not a marginal causal estimate.
- **Lone-wolf threshold** (≤5 House / ≤3 Senate) is a heuristic.
- **"Voted against own party" overlay is a proxy.** The comparison-page
  monthly chart uses `Helped Neither` per month as an approximation; the
  pipeline does not currently emit a true monthly own-party-defection
  series.
- **Mid-term resignations / replacements** are not flagged on dashboards
  beyond the `served_partial` banner for members who cast zero recorded
  votes. A low participation count may reflect resignation, illness, or
  campaigning for higher office — check the underlying date pattern in
  the votes table.
- **Editorial label wording.** Labels such as "Helped Republicans" and
  "Blocked Dem-Backed" attribute intent that the math does not measure
  directly. They describe a counting relationship between the member's
  vote and party majorities, not motive.

## 11. Reproducibility

Each per-member JSON includes a `_meta` block with `schema_version`,
`pipeline_version`, `classifier_hash` (SHA-256 of `analyze.py`),
`data_snapshot_date`, and per-chamber `source_xml_count`. The classifier
hash changes if and only if the classification logic changes.