goncvoters

package module
v1.3.2 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jan 28, 2024 License: MIT Imports: 4 Imported by: 0

README

go-ncvoters

Go Report Card PkgGoDev

Contents:

Overview

Downloads selected columns from the NC voter registration data at the NC Board of Elections website and create an SQLite database file from it. The data is updated every Saturday.

The file at the website is a very large (about 478MB) zip file, and the CSV file it contains is nearly 4GB, which can make it hard to manage. Not all the columns may be of interest, so this script allows you to select a subset of columns.

File layout

The file format is described at https://s3.amazonaws.com/dl.ncsbe.gov/data/layout_ncvoter.txt, which I summarize below:

/***********************************************************************
* name:     layout_ncvoter.txt
* purpose:  Contains all legally available voter specific information.
*           Personal identifying information (PII) such as birth date
*           and drivers license number are not included. Voter registrations
*           with a voter_status_desc of 'Removed' are omitted whenever the
*           most recent last voted date is greater than 10 years.
*           This is a weekly point-in-time snapshot current per file date
*           and time.
* updated:  02/11/2022
* format:   tab delimited
* instructions:
*            1) extract using a file archiving and compression program
*               (e.g. WinZip)
*            2) can be linked to ncvhis file by ncid
************************************************************************/
Choosing the columns

Here are all the available columns:

------------------------------------------------------------------------
-- File layout (from 02/12/2022)
-- ---------------------------------------------------------------------
name                       data type          description
------------------------------------------------------------------------
0: county_id               int                County identification number
1: county_desc             varchar(15)        County name
2: voter_reg_num           char(12)           Voter registration number (unique to county)
3: ncid                    char(12)           North Carolina identification (NCID) number
4: last_name               varchar(25)        Voter last name
5: first_name              varchar(20)        Voter first name
6: middle_name             varchar(20)        Voter middle name
7: name_suffix_lbl         char(3)            Voter suffix name (JR, III, etc.)
8: status_cd               char(2)            Registration status code
9: voter_status_desc       varchar(25)        Registration status description
10: reason_cd               varchar(2)         Registration status reason code
11: voter_status_reason_desc varchar(60)       Registration status reason description
12: res_street_address      varvarchar(65)     Residential street address
13: res_city_desc           varchar(60)        Residential city name
14: state_cd                varchar(2)         Residential address state code
15: zip_code                char(9)            Residential address zip code
16: mail_addr1              varchar(40)        Mailing address line 1
17: mail_addr2              varchar(40)        Mailing address line 2
18: mail_addr3              varchar(40)        Mailing address line 3
19: mail_addr4              varchar(40)        Mailing address line 4
20: mail_city               varchar(3)0        Mailing address city name
21: mail_state              varchar(2)         Mailing address city code
22: mail_zipcode            char(9)            Mailing address zip code
23: full_phone_number       varchar(12)        Full phone number including area code
24: confidential_ind        char(1)            Confidential indicator (by General Statute
                                               certain data is confidential for this record)
25: registr_dt              char(10)           Registration date
26: race_code               char(3)            Race code
27: ethnic_code             char(3)            Ethnicity code
28: party_cd                char(3)            Registered party code
29: gender_code             char(1)            Gender/sex code
30: birth_year              char(4)            Year of birth
31: age_at_year_end         char(3)            Age at end of the year (was: birth_age - 02/09/2022)
32: birth_state             varchar(2)         Birth state
33: drivers_lic             char(1)            Drivers license (Y/N)
34: precinct_abbrv          varchar(6)         Precinct abbreviation
35: precinct_desc           varchar(60)        Precinct name
36: municipality_abbrv      varchar(6)         Municipality jurisdiction abbreviation
37: municipality_desc       varchar(60)        Municipality jurisdiction name
38: ward_abbrv              varchar(6)         Ward jurisdiction abbreviation
39: ward_desc               varchar(60)        Ward jurisdiction name
40: cong_dist_abbrv         varchar(6)         Congressional district abbreviation
41: super_court_abbrv       varchar(6)         Superior court jurisdiction abbreviation
42: judic_dist_abbrv        varchar(6)         Judicial district abbreviation
43: nc_senate_abbrv         varchar(6)         NC Senate jurisdiction abbreviation
44: nc_house_abbrv          varchar(6)         NC House jurisdiction abbreviation
45: county_commiss_abbrv    varchar(6)         County commisioner jurisdiction abbreviation
46: county_commiss_desc     varchar(60)        County commisioner jurisdiction name
47: township_abbrv          varchar(6)         Township jurisdiction abbreviation
48: township_desc           varchar(60)        Township jurisdiction name
49: school_dist_abbrv       varchar(6)         School district abbreviation
50: school_dist_desc        varchar(60)        School district name
51: fire_dist_abbrv         varchar(6)         Fire district abbreviation
52: fire_dist_desc          varchar(60)        Fir district name
53: water_dist_abbrv        varchar(6)         Water district abbreviation
54: water_dist_desc         varchar(60)        Water district name
55: sewer_dist_abbrv        varchar(6)         Sewer district abbreviation
56: sewer_dist_desc         varchar(60)        Sewer district name
57: sanit_dist_abbrv        varchar(6)         Sanitation district abbreviation
58: sanit_dist_desc         varchar(60)        Sanitation district name
59: rescue_dist_abbrv       varchar(6)         Rescue district abbreviation
60: rescue_dist_desc        varchar(60)        Rescue district name
61: munic_dist_abbrv        varchar(6)         Municpal district abbreviation
62: munic_dist_desc         varchar(60)        Municipal district name
63: dist_1_abbrv            varchar(6)         Presecutorial district abbreviation
64: dist_1_desc             varchar(60)        Presecutorial district name
65: vtd_abbrv               varchar(6)         Voter tabulation district abbreviation
66: vtd_desc                varchar(60)        Voter tabulation district name
------------------------------------------------------------------------------------

See selected_columns in the sample configuration YAML file for a list of what I selected as the columns of interest. You can change this by adding or deleting lines.

Installation

Configuration

Create a subdirectory named go-ncvoters in your user configuration directory, which is:

  • On Unix/Mac: $HOME/.config/go-ncvoters
  • On Windows: %USERPROFILE%\AppData\Roaming\go-ncvoters

Copy sample_config.yaml to that directory, renaming it config.yaml. There are two sections:

selected_columns:
  - county_id
  - voter_reg_num
  - last_name
    ...
  - age_at_year_end
  - birth_state

This is where you specify the subset of the available columns described above that you want to be included in your local copy of the database. Note that you need to indent each line in the list by two spaces (or a tab character)

sanitize_columns:
  - last_name
  - first_name
  - middle_name
  - res_street_address
  - res_city_desc

The sanitize_columns section is where you list the column names that need to be cleaned up - they may have sets of multiple whitespace characters that need to be replaced by a single space.

Edit these two sections as you like to specify how to build your database. If you are not familiar with YAML, a good introductory page is YAML tutorial.

Additional tables

You can also add a section to describe additional tables you want to have added to the database after its construction. Create a tables section with a list of entries for each additional table. Specify the SQL statements that will create your additional tables as follows:

tables:
  - |
    CREATE TABLE family AS
        SELECT
            last_name, first_name, middle_name
            res_street_address
            (etc.)
        FROM        voters
        WHERE       first_name = 'LARRY' AND last_name = 'FINE'
        OR          first_name = 'CURLY  AND last_name = 'HOWARD'
        OR          first_name = 'MOE'   AND last_name = 'HOWARD'
        ORDER BY    1, 2, 3
        ;
        
  - |
    CREATE TABLE friends AS
        SELECT
            last_name, first_name, middle_name
            res_street_address
            (etc.)
        FROM        voters
        WHERE       first_name = 'TOM' AND last_name = 'THUMB'
        OR          first_name = 'SNOW  AND last_name = 'WHITE'
        ORDER BY    1, 2, 3
        ;
        

Note the pipe character "|" after the hyphen that begins the SQL for your table. This allows you to continue the SQL onto multiple lines.

Running the application

There is a mainline in cmd/get_voter_data.go that will run the download application.

You can also compile a native executable by running the following command from the root directory of the project:

$ go install cmd/create/get_voter_data.go

This will create an executable in your path named get_voter_data, which you can invoke from a command line like any other program:

$ get_voter_data -h
usage: get_voter_data [OPTIONS] [DBNAME]

Creates a database of North Carolina voter registrations

positional arguments:
  dbname         Name of database file to be created (default /tmp/voter_data.db)

options:
  -h, --help     Show this help text and exit
  -f, --force    Force the zip file to be downloaded, not reused

This takes a little over two minutes on my Linux machine.

I generally run this every Saturday night, since the database is updated on the website that day. The database is built as voter_data.db in the /tmp directory (%TEMP%, on Windows). I typically create a soft link to it in my home directory and the desktop for quick access.

You need to delete the .zip file from /tmp before the next Saturday update, because the system will reuse the existing ZIP file if it exists. You can do these delete and download steps in your cron table so that it is all done automatically every Saturday night.

Viewing the database

I use DB Browser for SQLite to work with the database. It provides a very good user interface into the data and the table metadata. You can also run SQL queries from the UI to select subsets of the database in which you are interested.

There is also a command-line tool named sqlite3 that allows you to run queries and export data into CSV, JSON, HTML, and other formats.

SQLite3 is also supported natively in the standard Python library.

References

Documentation

Overview

Package goncvoters provides a Go program to download selected rows and column from the voter registration data at the NC Board of Elections website and create an SQLite database file from it. The data is updated every Saturday.

See the README.md for details.

Index

Constants

View Source
const PACKAGE_NAME = "go-ncvoters"

Variables

View Source
var Configuration *configuration
View Source
var OptQuiet bool

Functions

func Map added in v1.3.2

func Map(f func(any) any, inch chan any, bufsize ...int) chan any

Map accepts input from a channel and applies the specified function to it, writing it to an output channel

func Reduce added in v1.3.2

func Reduce(f func(x, y any) any, inch chan any, initial any) any

Reduce applies the specified function repeatedly to accumulate a single value.

Types

This section is empty.

Directories

Path Synopsis
cmd
create
A Go program that addresses the requirements of optimizing performance, memory usage, and disk space while downloading the zip file, processing the CSV data, and creating an SQLite database.
A Go program that addresses the requirements of optimizing performance, memory usage, and disk space while downloading the zip file, processing the CSV data, and creating an SQLite database.
Package create has functions to create an SQLite3 database from a zipped CSV file.
Package create has functions to create an SQLite3 database from a zipped CSV file.
Package download has functions to download the latest voter registration file from the NC Board of Elections website
Package download has functions to download the latest voter registration file from the NC Board of Elections website
Package util contains utility functions used by the download and create packages.
Package util contains utility functions used by the download and create packages.
Package webdata has functions to handle downloading the file layout from the web and creating DDL for it.
Package webdata has functions to handle downloading the file layout from the web and creating DDL for it.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL