How The Optimal Classification Scaling (OC) Program Works In One Dimension:

The "Edith" Algorithm

7 May 2001

I first started working on modeling roll call voting in 1978 when I was at the University of Oregon. I developed a simple scaling program that I dubbed

YYYYYYYYYYYYYYYYYNNNNNNNNNNNNNNNNNN * NNNYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY * YYYYYYYYNNNNNNNNNNNNNNNNNNNNNNNNNNN * Etc.

The asterisks indicate the "cutting point" for the roll call; that is, the midpoint of the Yea and Nay outcomes. A legislator located exactly on the midpoint would be indifferent between voting Yea or voting Nay.

Suppose roll call voting is in accord with this model. Then the scaling problem consists
of taking a roll call matrix and "unscrambling" it Â– that is, finding a rank ordering
of legislators and the correct **"polarity"** (**Yea** to the left of the cutpoint or **Yea** to the right of the cutpoint) for each roll call such that patterns like those above
are produced for each roll call. This is what **Edith** is designed to do. If the data is perfect, then the solution is easy and the correct
rank ordering is always found (see Proof that if Voting is Perfect in One Dimension, then the First Eigenvector Extracted
from the Double-Centered Transformed Agreement Score Matrix has the Same Rank Ordering
as the True Data).

However, when there is error in the data, things get a bit complicated. When there
is error the aim is to find a rank ordering that maximizes the correct classification
of the observed Yeas and Nays. This is easier said than done because if there are
n legislators, then there are n!/2 possible rank orders to check to find the best
one. For example, for 50 legislators this number is about 1.52 * 10^{64} Â– a formidable number even with modern supercomputers. Consequently, **Edith** embodies a "sensible" search procedure (what the Operations Researchers call a "Heuristic")
to find a solution. Namely, a good starting rank order of the legislators is generated
using the technique discussed for the perfect voting problem ( Proof that if Voting is Perfect in One Dimension...) and the corresponding cutting points are found. These cutting points are used to
get a new rank ordering of the legislators, and so on. At each step the correct classification
increases until a rank order is found that produces cutting points that in return
reproduce the rank order.

Finding the optimal rank ordering requires three steps:

- First, generate a starting estimate of the rank order of the legislators using the
method cited above and translate this rank ordering into evenly spaced coordinates
between -1.0 and +1.0;
- Second, given these coordinates, find the best cutting point for each roll call that
maximizes correct classification;
- Third, given these cutting points, find a new coordinate for each legislator that
maximizes correct classification.

To see how this process works, let the current ordering of the legislators from left
to right be represented by coordinates from **x _{1}** to

**-1 Â£ x _{1}Â£ x_{2}Â£ x_{3}Â£ ... Â£ x_{p}Â£ +1.**

**(-1, x _{1}), (x_{1} , x_{2}), ... , (x_{p} , +1)**

Actual Voting Pattern

Y Y Y Y Y Y . . . . Y Y * N Y * N Y * N N . . .N N

___________________________________________________________

-1.0 x_{1} x_{2} x_{3} x_{4} . . . . . . . 0.0 . . . . . . . x_{p-1}x_{p} +1.0

Perfect Voting Patterns

(-1 , x_{1}) produces nnnnnnnnn.....nn or yyyyyyyyy.....yy

(x_{1} , x_{2}) produces ynnnnnnnn.....nn or nyyyyyyyy.....yy

(x_{2} , x_{3}) produces yynnnnnnn.....nn or nnyyyyyyy.....yy

(x_{3} , x_{4}) produces yyynnnnnn.....nn or nnnyyyyyy.....yy

(x_{4} , x_{5}) produces yyyynnnnn.....nn or nnnnyyyyy.....yy

(x_{5} , x_{6}) produces yyyyynnnn.....nn or nnnnnyyyy.....yy

etc.

(x_{p-1} , x_{p}) produces yyyyyyyyy.....yn or nnnnnnnnn.....ny

(x_{p} , +1) produces yyyyyyyyy.....yy or nnnnnnnnn.....nn

Since there are only 2p perfect patterns, it is a simple matter to compare each perfect
pattern with the actual pattern of votes. This can be done very efficiently by first
assuming that the cutting point is in the region **(-1 , x _{1})** and calculating the corresponding number of correct classifications. Next assume
that the cutting point is in the region

Let the ordering of the q roll call midpoints estimated above be represented by coordinates
from **z _{1}** to

**-1 Â£ z _{1}Â£ z_{2}Â£ z_{3}Â£ ... Â£ z_{q}Â£ +1.**

LLLLLLLLLCCCCCCCCCCCCCCCCCCCCCCCCC

or

LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLCCCCCC

The Figure below shows that by taking the polarity of the roll calls into account the legislator problem is equivalent to the roll call cutting point problem. Hence, the algorithm described above can be used to find the optimal legislator location. There are only 2q possible perfect patterns and only 2q calculations are required to find the maximum classification location for each legislator.

Actual Voting Pattern

L L L L L L . . . . L L * C L * C L * C C . . .C C

___________________________________________________________

-1.0 z_{1} z_{2} z_{3} z_{4} . . . . . . . 0.0 . . . . . . . z_{q-1}z_{q} +1.0

Perfect Voting Patterns

(-1 , z_{1}) produces CCCCCCCCC.....CC or LLLLLLLLL.....LL

(z_{1} , z_{2}) produces LCCCCCCCC.....CC or CLLLLLLLL.....LL

(z_{2} , z_{3}) produces LLCCCCCCC.....CC or CCLLLLLLL.....LL

(z_{3} , z_{4}) produces LLLCCCCCC.....CC or CCCLLLLLL.....LL

(z_{4} , z_{5}) produces LLLLCCCCC.....CC or CCCCLLLLL.....LL

(z_{5} , z_{6}) produces LLLLLCCCC.....CC or CCCCCLLLL.....LL

etc.

(z_{q-1} , z_{q}) produces LLLLLLLLL.....LC or CCCCCCCCC.....CL

(z_{q} , +1) produces LLLLLLLLL.....LL or CCCCCCCCC.....CC

This simple algorithm converges very rapidly to a solution in which the rank ordering
of the legislators and the rank ordering of the roll call midpoints reproduce each
other. This is a very strong form of ** conditional global maximum**. Technically, if there are multiple sets of parameters (for example, as in this case,
parameters corresponding to rows of a matrix and parameters corresponding to columns
of a matrix), and every set of parameters is at a global maximum conditioned on the
other sets being held fixed, and these sets reproduce each other, then they are at
a conditional global maximum. Note that the overall global maximum, by definition,
is a conditional global maximum.

Conditional Global maxima are quite rare because of the nature of the constraints. Indeed, in the metric similarities problem, conditional global minima are very rare and their number declines as the number of parameters increases (Poole, 1990).

A simple illustration of how- Suppose we start with the scrambled ordering of the legislators shown below. Using
the method described above, we would place the Ace at the left end of the dimension
because this results in only one classification error. Placing it between the Ace
and Two of diamonds would produce four classifcation errors -- the Six, Seven, Four,
and Nine of Diamonds. The same logic holds for the Two and Three of Spades. No interior
point does better. When we get to the Four of Spades, however, placing it between
the Three and Eight of Diamonds produces four classification errors (the Six, Seven,
Nine and Five of Diamonds). The Four of Spades could be placed at the left end also
producing four errors. In the case of a tie, I always put the cutpoint/legislator
at the interior most position.

- With respect to the ordering of the cutpoints represented by the Ace to the Nine of
Spades, the new legislator ordering is shown below. The Ace of Diamonds is placed
to the left of the Ace, Two, and Three of Spades, the Two and Three of Diamonds are
placed at the same position as the Ace, Two, and Three of Spades because of the tied
ranks. The Four of Diamonds is placed between the Ace, Two, Three stack, and the Four
to Seven stack. The Five, Six, Seven of Diamonds are placed at the same position as
the Four to Seven stack of Spades. The Eight of Diamonds is placed between the Four
to Seven stack and the Eight and Nine of Spades. The Nine of Diamonds is placed at
the same position as the Eight and Nine of Spades. Finally, the Ten of Diamonds is
place to the right of the Eight and Nine of Spades.

- With respect to the ordering of the legislators represented by the Ace to Ten of Diamonds,
the new ordering of the cutpoints is shown below. The Ace of Spades is placed between
the Ace of Diamonds and the Two and Three of Diamonds. The Two of Spades is placed
at the same position and the tied Two and Three of Diamonds. The Three of Spaces is
placed between the tied Two and Three of Diamonds and the Four of Diamonds. The Four
of Spaces is placed between the Four of Diamonds the the tied Five to Seven stack.
The Five and Six of Spaces are placed at the same position as the Five to Seven stack
of Diamonds. The Seven of Spades is placed between the Five to Seven stack and the
Eight of Diamonds. The Eight of Spades is placed between Eight and Nine of Diamonds.
Finally, the Nine of Spades is placed between the Nine and Ten of Diamonds.

- With respect to the ordering of the roll cutpoints represented by the Ace to Nine
of Spades, the ordering of the legislators represented by the Ace to Ten of Diamonds
is shown below. All are now in the correct position with perfect classification.

NOMINATE Data, Roll Call Data, and Software

Course Web Pages: University of Georgia (2010 - )

Course Web Pages: UC San Diego (2004 - 2010)

University of San Diego Law School (2005)

Course Web Pages: University of Houston (2000 - 2005)

Course Web Pages: Carnegie-Mellon University (1997 - 2000)

Recent Working Papers

Analyses of Recent Politics

About This Website

K7MOA Log Books: 1960 - 2015

Bio of Keith T. Poole

Related Links