Why Would Russia Target Voter Registration Databases?
Why were voter registration databases targeted? How could hacking a voter registration database be part of hacking an election?
The most obvious answer is disenfranchisement. If a voter is deleted from a voter registration database, or if the voter’s address or other identifying information is changed, that voter may be issued a provisional ballot, or simply walk away confused, without casting a vote.
Hacking and/or altering voter registration databases is a significant act. It’s highly unlikely this would be done without a specific outcome in mind. Voter registration lists include specific, personal information. Altered information in the databases, and the resulting disenfranchisement, could strategically target specific groups of voters based on gender, race, address, political party, or other criteria in an attempt to suppress their votes.
Disenfranchisement could be a “tasty meatball” for someone attempting to influence election outcomes. However, that is not the only way a hacked voter registration database can be used for the purpose of hacking an election.
Warren Richey identifies another, even more ominous, way that an altered registration database could contribute to election hacking.
Hiding Election Hacking by Manipulating Registration Data
When most people think of “election hacking” they think of vote totals being changed – hacked electronic machines flipping vote totals as voting occurs, or hackers later changing the information stored on individual voting machines, or even hackers altering tallies electronically. As hackers at the 2017 DefCon discovered, these activities could be easy to accomplish, particularly on a smaller scale, such as for a specific precinct, or for a specific type of voting machine. These types of hacks could be enough to alter an election outcome when smaller numbers of votes make the difference between winning candidates.
But what if large numbers of votes needed to be changed in order to “flip” an election?
Following what we could jokingly call the “Law of Laziness”, people do the minimum needed to achieve the objective. It’s just practical. Hacking and altering information in a voter registration database is a high-difficulty, high-risk activity. So, why not stick to the easy hacks described above? Likely this would only happen because there was a need for it. Hacking voter registration databases may be the minimum needed to achieve the objective of altering the outcome of the election. This bigger hack offers the potential to manipulate and control large numbers of votes, paired with the ability to mask the activity by changing or adding “voters” in the database. Change a few numbers, manipulate the voter demographics that everyone looks at and talks about, control the narrative. It’s high-risk, high-reward for nefarious entities who intend to commit election fraud.
What if voter rolls were changed by the addition of “zombie voters” who have no idea they have been added to the rolls and no intention of voting? How could these “zombie” voters be used? Could their numbers be used to falsely inflate totals of registered voters for a specific party? Could they be included in totals used to manipulate the narrative regarding voter turnout? Could electronic votes cast by these zombie voters actually be recorded in the system?
Anyone altering election outcomes by changing vote totals surely would want to avoid detection and make us think nothing really happened. Blatant election fraud would risk having the falsified election result challenged and nullified, and could have severe consequences. An election whose results are questioned because they make no sense could blow the secrecy of the hack, burning it as a covert operation, so it cannot be used again in the future.
To hide a massive hack like changing large numbers of vote totals, the hacker must create an explanation for the unexpected outcome, in order to keep us from rethinking our election systems and ending the game for good. The intent would be to create falsified data that people would believe are reasonable, data that would indicate the election results were real and would provide an apparent explanation for what occurred. If changes were made to voter registration database information, nearly impossible-to-believe results can begin to look valid.
Nothing to See Here. Please Move Along
We rely on voter registration lists to give us information about voters. How many are registered in each precinct? Are they Democrats? Republicans? Not affiliated with either major party? How many voters went to the polls? Was there low turnout in some areas? Higher turnout in others? Registration data answers these questions, and the answers to these questions help support explanations for unexpected outcomes. In the case of the 2016 presidential election, researchers and journalists turned to registration data to make sense of the unexpected outcome. If “zombie voters” were strategically added to the registration data to create a false picture of increased numbers of voting Republicans in rural and suburban areas, and large numbers of apparently apathetic Democrats in urban areas and college towns, a narrative could be created that explains what happened.
What if some voters were disenfranchised through altered data in voter registration lists, thus preventing them from voting or causing them to cast provisional ballots that may or may not have been counted? What if “zombie” voters were also added to the voter registration databases in order to alter apparent demographics?
Voter registration databases have the potential to be mighty weapons for those seeking a high-risk, high-reward way to manipulate the outcomes of elections.
Please read our full analysis here.
There are several ways this might be done. One is SQL injection, in which a hacker takes advantage of code vulnerabilities to alter a database by sending code through a form input or even through the URL.
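To make the idea concrete, here is a minimal sketch of why SQL injection works and how a parameterized query blocks it. It uses Python's built-in sqlite3 module with an in-memory database; the table name, columns, and sample rows are hypothetical and have nothing to do with any real registration system.

```python
import sqlite3

# Hypothetical table and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE voters (voter_id TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO voters VALUES (?, ?)",
    [("001", "Alice Example"), ("002", "Bob Example")],
)

def lookup_unsafe(voter_id):
    # Vulnerable: the form input is pasted directly into the SQL text.
    query = "SELECT name FROM voters WHERE voter_id = '" + voter_id + "'"
    return conn.execute(query).fetchall()

def lookup_safe(voter_id):
    # Parameterized: the driver treats the input strictly as a value.
    return conn.execute(
        "SELECT name FROM voters WHERE voter_id = ?", (voter_id,)
    ).fetchall()

malicious = "x' OR '1'='1"
unsafe_rows = lookup_unsafe(malicious)  # the WHERE clause becomes always true
safe_rows = lookup_safe(malicious)      # no voter has this literal ID
```

In the unsafe version the crafted input rewrites the query so that it matches every row. In the parameterized version the same input is compared as an ordinary string and matches nothing.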
In fact, hackers targeted voter registration systems in as many as 21 states during the 2016 campaign season, according to the Department of Homeland Security, including known successful penetrations in Illinois and Arizona in the months before the 2016 general election. Though Arizona allegedly only had the username and password of a single county employee compromised, access was gained to the Illinois registration system. At least one report indicates that some registration records were altered during the breach.
The Illinois attack has been fairly well documented and examined. A timeline provided by the state gives a substantial amount of detail on the attack and the response to it. The attack was used to compromise the database – accessing, and possibly altering, an unknown number of records. Wisconsin, in a response memo written after the DHS report, noted that either the same SQLi vulnerability or a similar one was present in their systems but had been addressed during upgrades conducted in January 2016.
Despite these documented attempts to compromise election systems, no state has come forward with a statement that its voter registration system was altered by these attempts, and some states have denied the DHS reports.
But there are other ways to penetrate these systems.
For instance, a voter registration system could be compromised by an employee or contractor working on one of these systems via a built-in “backdoor” in the code. A small snippet of malicious code could easily be hidden in the systems that allow privileged users to add, update, or access data in these databases. This code could be used to steal passwords, or simply to allow remote database access to selected outside users.
Finally, a voter registration database can be altered maliciously by any member of the public via the “Change Your Registration” form on the websites of many states. A study published in September by Harvard researchers shows how easy it would be to manipulate voter data using these forms.
The researchers found that many states allow you to purchase enough information about voters that anyone can impersonate that voter and change information – address, party preference, or even name – using these online forms.
We quickly found at least one state where we could easily have manipulated voter data this way. We made no actual changes, because that would be a felony. The state we looked at was Pennsylvania, where Trump edged out Clinton by under 45,000 votes out of 6.2 million, or 0.73%, and where the smallest hiccup in the electoral process could have affected this thin margin.
We purchased a snapshot of the Pennsylvania voter registration data for $20. Over 8.5 million records. Names, addresses, dates of birth, political affiliation, voting history. A wealth of data. These data sets are available to the public here.
Armed with this information, we went to the “Change your Registration” page of the Pennsylvania Department of State website. The information marked with red is mandatory. Everything else is optional. Our $20 investment gives us all we need to impersonate a voter, or many voters.
According to the instructions, this form can be used to change the party of a voter. Or his or her name, or address.
Using this form we could move a number of voters to different polling places. We could outright prevent these voters from voting, by moving them away from their local precincts. Or we could look through the data for voters with no recent history of voting, perhaps the very elderly. By changing the addresses and perhaps the political parties of voters who are unlikely to actually show up at the polls we could change the apparent demographics of a precinct without ever being detected.
In this article, Jonathan Albright documents some Python scripts that were posted to the code-sharing website GitHub by an employee of Cambridge Analytica. One of the scripts finds the geographic coordinates for a given address. Oddly, this script specifically mentions “VoterID”.
This script is capable of creating valid new addresses to assign our voters to. Using this script and the information contained in our purchased data set, voters could be electronically moved to new precincts. This could even be done automatically via a browser plugin that could read a list of voters and desired addresses and fill out the Pennsylvania “change in registration” form automatically.
But how do we figure out which addresses are assigned to which polling places? We simply use this nifty online polling place locator interface, brought to you by the ever-helpful state of Pennsylvania:
This convenient public-facing voter registration hacking API is very well documented. Even a non-Russian could probably figure it out!
Journalists and politicians, and even the Department of Homeland Security, insist that despite these obvious vulnerabilities, voter registrations weren’t changed. But how do they know that for sure?
Let’s start with the 1993 National Voter Registration Act, also known as NVRA, or the “motor voter” law. NVRA tells the states what they must do to maintain accurate and current voter registration lists.
Under NVRA, all who register to vote must submit a signed voter registration application. Submitting a false application is prohibited by law. No one can register another person to vote!
Under NVRA, the states are required to mail a notice to each voter who submits an application. This notice is either a valid voter ID card, or a rejection notice if the voter is found to be ineligible.
NVRA tells the states that they are not allowed to remove voters due to a failure to vote! And before a voter is removed because he or she moved to a new county, the state MUST send a notice to the new address, so the voter can easily update his or her records.
NVRA tells the states that “all records and papers relating to any application, registration, or other act requisite to voting in any election for federal office, be preserved for a period of twenty-two months from that federal election”. If something is changed in a voter registration record, the state is required to document that change.
The state of Pennsylvania has its own law about voter registration – Title 25. It is the section of Pennsylvania’s statutes that deals with all election related legislation. The majority of its current framework was accepted Jan. 31, 2002.
Section 1222 provides for the creation of the Statewide Uniform Registry of Electors, also known as the SURE database, to store data on each unique voter in the state, including identifying information such as name, date of birth, and state ID (PennDOT) number. SURE was set up to be the definitive repository of voter registrations in the state – ensuring that people who die, change their names, or move between counties are reflected accurately in the registration.
Once a voter is registered ANY CHANGE to his or her registration requires some action by the voter, except for changes to address or district caused by street renaming or redistricting. In these cases the voter must be notified.
Changing other personal data is more complicated. Date of birth and gender are considered part of an individual’s identity, thus they cannot easily be changed without full documentation. To correct a DOB mistake you need to provide a copy of your birth certificate, and submit this form.
Title 25 does not directly address gender changes, but presumably a similar process is followed. This form can be signed by a licensed medical or social worker and submitted in person to a PennDOT licensing center.
Once the records have been altered to reflect the voter’s request, the voter then re-submits their registration with the updated information. Under section 1328 this will be evaluated, and the voter informed of the acceptance or rejection of the registration by mail.
To recap, street name changes or redistricting can cause changes without voter involvement. All other changes need to be initiated by the voter, involve mailed confirmation, and generally require additional documentation from the voter. What we are saying here is that voter registration information isn’t supposed to just change by itself. It isn’t supposed to change because of some kind of database error. Or sunspots. Or Mercury retrograde. There is a specific procedure, and the voter is always involved.
So, how’s Pennsylvania doing with all this?
By both their own state laws and by federal law, we shouldn’t find too many oddities in the SURE voter registration database. Especially not ones the voter doesn’t know about.
Mike’s companion Twitter thread is here.
Each data set has over 8.5 million voter registration records. Each voter’s record contains 153 data fields. That’s over 50 million records, and nearly 8 BILLION pieces of data.
Rather than importing these records into a database and running simple statistical queries, we uploaded them onto a high-speed cloud computer and wrote scripts to look for duplicate and anomalous data within each data set, and also to look for changes in voter data between data sets.
At first we didn’t know what to look for, so we just probed the data. We counted the voters in each county, for each data set, by party. Then we started noticing odd things. Duplicate records. Voter records changing in ways that didn’t make any sense at all.
We wrote more scripts, based on what we found. Scripts to look for duplicate and anomalous data within each data set. Looking for changes in voter data between data sets. We rubbed our eyes and went back to double-check the raw data. And yep, it was THAT weird. We pulled out over a million lines of data for closer examination. The more we looked the less it made sense.
This took months of work – writing code, improving script efficiency, moving data to the cloud and back, and analyzing what we found.
What we found was shocking.
Unhack the Vote Investigative Team
Even that simple exercise yields very interesting results. We found that in the seven months between April 4, 2016 and the election the total number of registration records increased by nearly 460,000 voter registrations. Looking at these records by party and by county, we found something striking. In that short period of time there was an apparent decrease in Democrats of 0.7% overall, and up to nearly 3% in some counties.
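The counting exercise described here can be sketched in a few lines of Python. This is only an illustration, not the scripts we actually ran: the field names "county" and "party", the party code "D", and the tiny sample snapshots are all assumptions made up for the example.

```python
from collections import Counter

def party_share(records, party):
    """Fraction of records registered with `party`, overall and per county."""
    totals, members = Counter(), Counter()
    for rec in records:
        totals[rec["county"]] += 1
        if rec["party"] == party:
            members[rec["county"]] += 1
    overall = sum(members.values()) / sum(totals.values())
    by_county = {c: members[c] / totals[c] for c in totals}
    return overall, by_county

# Two made-up snapshots standing in for the April and November data sets.
april = [
    {"county": "A", "party": "D"}, {"county": "A", "party": "R"},
    {"county": "B", "party": "D"}, {"county": "B", "party": "D"},
]
november = [
    {"county": "A", "party": "D"}, {"county": "A", "party": "R"},
    {"county": "A", "party": "R"}, {"county": "A", "party": "R"},
    {"county": "B", "party": "D"}, {"county": "B", "party": "D"},
]

april_overall, april_by_county = party_share(april, "D")
nov_overall, nov_by_county = party_share(november, "D")

# Change in party share per county between the two snapshots.
share_change = {c: nov_by_county[c] - april_by_county[c] for c in april_by_county}
```

Comparing shares rather than raw counts keeps the comparison meaningful even when the total number of records grows between snapshots.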
We became curious about exactly what was causing this statewide change in apparent political affiliation. Were people actually making changes to their political party after the deadline for the primary, when such changes would make no difference at all?
To answer this question we began looking at the records of individual voters. How many were added? How many were removed? How many showed a change in party? As long as we were checking, we looked for other changes, again tracking specific voters.
Changes to the Data
The increase in records between the April 4 data set and the November 7 data set was caused by the addition of nearly 550,000 new voter IDs, the removal of over 93,000 voter IDs, and approximately 2000 additional duplicate voter IDs. Duplicate voter IDs, you ask? Yes. We will get to that later.
Oddly enough, 5000 of the voters who were added during this time period were marked as “Inactive” in the November 7 data set.
Over 200,000 voters changed party in the seven months between April 4, 2016 and the election. This is 2.3% of the voting population. Nearly 120,000 of these changes occurred after August 15. Why would ANY voter change party in the months after the primary?
Looking at the “record changed” date for voters that appear in both the April 4th and the November 7th data set, we see there were a total of nearly 1.5 million changes made to the 8.5 million (on average) voter registration records in this seven month period. In other words, more than one in six records was altered in some way during this time.
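A comparison of this kind between two snapshots can be sketched as follows. This is a simplified illustration under assumptions: each snapshot is modeled as a dictionary keyed by voter ID, the "party" field name is hypothetical, and the sample IDs are invented. The last line simply re-adds the rounded figures quoted above to show that they are consistent with the net increase of nearly 460,000.

```python
def diff_snapshots(old, new):
    """Compare two snapshots, each a dict mapping voter ID -> record dict."""
    added = new.keys() - old.keys()      # IDs present only in the newer snapshot
    removed = old.keys() - new.keys()    # IDs present only in the older snapshot
    common = old.keys() & new.keys()
    party_changed = {v for v in common if old[v]["party"] != new[v]["party"]}
    return added, removed, party_changed

# Invented sample data standing in for the April 4 and November 7 data sets.
old = {"1-01": {"party": "D"}, "2-01": {"party": "R"}, "3-02": {"party": "D"}}
new = {"1-01": {"party": "R"}, "3-02": {"party": "D"}, "4-02": {"party": "D"}}

added, removed, party_changed = diff_snapshots(old, new)

# Rounded figures from the text: additions, removals, extra duplicates.
net_change = 550_000 - 93_000 + 2_000
```

Tracking individual IDs, rather than only comparing totals, is what makes it possible to say which specific records were added, removed, or altered.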
Last Minute Changes
As reported in the Philadelphia Inquirer, in the weeks before the election county officials reported a huge surge in the number of voter registrations that were being filed. Although the deadline for registering or changing was October 11, election officials worked hard up to the day of the election processing these registrations. The dates of registration shown in the November 7 data set reflect this surge. Interestingly, the records show an even greater number of changes to registration information than to new registrations.
In all, during this eleven day period there were over 270,000 new registrations, and over 615,000 registrations that were changed in some way. An overwhelming number indeed.
Even more interestingly, of the voters who made these last-minute registrations and changes, over 20% were NOT marked as having voted in the November 2016 election, according to the February 27, 2017 data set. Here are the counts of recorded new registrations and changed registrations for each two-week period between May 1st and the election:
All of these changes, especially the last minute ones, perplexed us.
So we went even deeper.
A few features of the data immediately became apparent.
1. There are a great many January 1 birthdays. We counted them. The data from our November 7, 2016 data set shows that 34,384 of our 8.73 million voters were born on January 1. Since January 1 occurs once every 365.25 days, we roughly estimated that one in 365.25 voters, or approximately 24,000, should have that birthday. The odds against 10,000 extra people having that particular birth date seem high to us, especially since date of birth is considered an identifying feature and a correct date of birth must be provided in order to register to vote or to obtain a Pennsylvania identification card.
2. Looking at these January 1 birth dates more closely, we noticed a great many birth dates of January 1, 1900, and January 1, 1800. In fact 202 voters have a birth date of January 1, 1900, and 1689 voters were born on January 1, 1800.
It has been suggested that the Jan 1, 1800 birth dates were a placeholder for unknown birth dates – perhaps from older registration data that was missing this information. We thought we would have a closer look at these voters to see if that was true.
Of the 1689 voters with a Jan 1, 1800 birthday, 426 apparently registered to vote in 2000 or later, and 928 registered in 1990 or later, leaving only 711, or 42% who registered before 1990.
Looking even more closely, we started searching for these people by name, one by one. The results were very interesting. We found good matches for 17 of the first 19 we looked for. The matches we found ranged in age from 29 to 83, with only five over the age of 60. Most of these voters must have registered in the 1980s, or later. It seems unlikely that the registration system used at that time would have omitted date of birth.
3. The discovery of these extremely old voters led us to count the voters in this data set who were over the age of 100 on election day. The answer? 7,572. This is particularly extraordinary considering that only about 0.0173% of Americans are centenarians. Given Pennsylvania’s population of 12.78 million, there should be approximately 2,200 centenarians in the entire state.
4. Since it’s possible that some people who had passed away were not removed from the voter rolls, an even more interesting question is how many people over the age of 100 cast a ballot on November 8. To estimate this we looked at the February 27, 2017 data set. 6.2 million of the records in that data set show a “last voted” date of 11/08/2016. Of these voters, how many were over 100?
Of these voters, 2,109 were over 100.
That’s nearly all of Pennsylvania’s 2,200 centenarians.
Of these very old voters, 798 were apparently born on Jan 1, 1800. Three other voters had a birthday of zero, making them over 2000 years old. Additionally, 69 were born between Feb 9, 1800 and Jan 1, 1900, making them older than the oldest known living person at the time of the election. 13 people were born on Jan 1, 1900.
That leaves 1,223 people between the ages of 100 and 117 with legitimate looking birth dates who voted in the election.
31 of these voters, ranging in age from 100.2 to 109.9 registered in 2016! 440 of them first registered after 2000, at a minimum age of 84.
5. Finally, we noticed a voter whose “last voted” date was 11/06/2012, and whose birth date was listed as 12/04/1997. This indicates that that voter was under the age of 15 when she last voted. Looking at the November 7 data, we counted all such voters and found 257 who were under 18 when they last voted. Looking at the February 27, 2017 data set we found 144 such voters, including two who voted in the November 8, 2016 election.
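The checks in this list can be approximated with a short Python sketch. It rests on stated assumptions: birthdays are treated as uniformly spread over the year (the same rough estimate used in item 1), the January 1 count is modeled as a binomial variable, and the short list of sample birth dates is invented for illustration. The voter totals and dates come from the figures quoted above.

```python
import math
from datetime import date

def jan1_excess(n_voters, observed):
    """Expected January 1 birthdays under a uniform-birthday assumption,
    plus how many standard deviations the observed count sits above it."""
    p = 1 / 365.25
    expected = n_voters * p
    sd = math.sqrt(n_voters * p * (1 - p))  # binomial standard deviation
    return expected, (observed - expected) / sd

def age_on(birth, when):
    """Age in whole years on a given date."""
    years = when.year - birth.year
    if (when.month, when.day) < (birth.month, birth.day):
        years -= 1  # birthday has not yet occurred that year
    return years

ELECTION_DAY = date(2016, 11, 8)

# Item 1: 34,384 January 1 birthdays among 8.73 million voters.
expected, z = jan1_excess(8_730_000, 34_384)

# Items 3 and 4: flag voters aged 100 or more on election day (sample dates are invented).
sample_births = [date(1800, 1, 1), date(1900, 1, 1), date(1915, 6, 30), date(1997, 12, 4)]
over_100 = [b for b in sample_births if age_on(b, ELECTION_DAY) >= 100]

# Item 5: the voter born 12/04/1997 whose last vote is dated 11/06/2012.
age_at_last_vote = age_on(date(1997, 12, 4), date(2012, 11, 6))
```

Under these assumptions the expected January 1 count is a little over 23,900, and the observed count sits dozens of standard deviations above it, which is why chance alone is not a plausible explanation for the excess.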
In order to look more closely at this, we made some new “flat arrays” from our complete set of voters. By this we mean we made lists that contained just one text entry per voter, with that text entry containing a subset of the voter’s information.
In Pennsylvania each voter record has a “base id” of nine digits, then a hyphen, then a two digit code indicating the county. For instance the voter id 00123467-01 indicates that the voter lives in county 01, or Adams County. If that voter moved to Allegheny county, his or her new “full id” would be 00123467-02. That voter’s “root id” is 00123467, and that id should identify the voter as long as he or she is a voter in Pennsylvania.
As stated in Pennsylvania’s Title 25, above, each voter should have exactly one ID. This means that when the voter moves to a new county, the previous id MUST be deleted.
Given this, we made lists of all the “full ids” and “root ids” in each data set, and checked for duplicates.
We found nearly 13,000 exact duplicate records (same root id and same county) and over 2000 “two county” duplicates (same root id, different counties) in our November 7th and November 28th data sets. This is particularly surprising because very simple database maintenance should have detected these duplicates and deleted them, perhaps after mailing requests for address verification in the case of “two county” duplicates.
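The duplicate check on these flat lists can be sketched as below. This is an illustration rather than our original script: it assumes only that each full id is a root id and a county code joined by a hyphen, as described above, and the sample ids are invented.

```python
from collections import Counter, defaultdict

def find_duplicates(full_ids):
    """Return (exact duplicates, two-county duplicates) from a flat list of
    'root-county' voter IDs."""
    # Exact duplicates: the same full id appears more than once.
    exact = {fid for fid, n in Counter(full_ids).items() if n > 1}

    # Two-county duplicates: the same root id appears with different county codes.
    counties = defaultdict(set)
    for fid in full_ids:
        root, county = fid.rsplit("-", 1)
        counties[root].add(county)
    two_county = {root for root, codes in counties.items() if len(codes) > 1}
    return exact, two_county

# Invented sample list: one exact duplicate and one two-county duplicate.
ids = [
    "00123467-01", "00123467-01",
    "00555555-01", "00555555-02",
    "00999999-03",
]
exact, two_county = find_duplicates(ids)
```

Both checks run in a single pass over the list, which is the sense in which very simple database maintenance should be able to catch these records.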
These duplicates could not have been caused by any action, inaction, or attempted fraud on the voter’s part.
Even more surprisingly, we found that ALL the exact duplicates were missing in our post-election data sets, and that the number of “two county” duplicates was greatly reduced.
We then took a closer look at the complete records for the voters with two identical ids, and we found something truly bizarre and extraordinary.
The two records were completely identical, except in one respect. Pennsylvania keeps a record of whether you did or didn’t vote in the last 40 elections. There is no commonality in the data between the voting patterns of the duplicate voter and the legitimate voter. Because of these staggered election dates between the voter and his or her clone, if the vote count for a given election was tallied in a database only one of the records would show up for any election that was being examined.
In all of these records we could distinguish the original from the duplicate by looking at the “last voted date” field that is separate from the election history data. The last voted date for the original record corresponds to a record in the election history. In all cases the duplicate voter’s last voted date does not match up with any of the elections in that record’s election history.
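The test described here, checking whether the last voted date appears in the record's own election history, can be sketched as follows. The record layout is hypothetical: "last_voted" holds a date string and "history" holds the set of election dates on which the record is marked as having voted. The real export stores this information differently, so treat the field names and sample values as assumptions.

```python
def last_voted_is_consistent(record):
    """True if the record's last-voted date is one of the elections in its history."""
    return record["last_voted"] in record["history"]

def split_pair(records):
    """Separate records sharing one full id into consistent and suspect ones."""
    originals = [r for r in records if last_voted_is_consistent(r)]
    suspects = [r for r in records if not last_voted_is_consistent(r)]
    return originals, suspects

# Invented pair of records with the same full id.
pair = [
    {"full_id": "00123467-01", "last_voted": "11/06/2012",
     "history": {"11/04/2008", "11/06/2012"}},
    {"full_id": "00123467-01", "last_voted": "11/08/2016",
     "history": {"05/20/2014"}},
]
originals, suspects = split_pair(pair)
```

In this sample the first record's last voted date matches its history and the second record's does not, mirroring the pattern we saw in the duplicated records.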
How many of the duplicates cast a ballot? It’s hard to know exactly because only 53% of the votes had been counted by November 28, and in our next snapshot these duplicates were all gone. But we counted the duplicates who were marked as having voted in the November 28 data set, just to get a sense. The answer: 7,299 of our 12,939 ghost twins were marked as having voted. That’s 56%. And only 53% of the votes had been counted. We think it’s a safe guess that almost all of our 13,000 ghost voters were marked as having voted.
Does this mean that their votes were counted? We have no idea.
© 2017 Unhackthevote, all rights reserved