It is hard to keep up with the fast-paced work of Congress. Tens of thousands of bills are introduced each year, and each new bill may be a clone of (or contain similar text to) other bills from this year or previous years. Finding those similar bills is often useful for tracking any newly introduced bill and predicting its fate.
I’ve previously described the technical challenge of finding similar bills, and the beginnings of our efforts, working with Daniel Schuman at Demand Progress, to translate that work into an API: a web service that answers the question, which bills are similar to this one?
Our team in Ukraine has taken this work to another level:
API of bills and titles
With Oleksandr and his team, we’ve built on the earlier API to match bills by both title and text. A bill may have the same title as a previous bill, meaning that it intends to accomplish the same purpose, while much of its text has changed. It is valuable to find those matches and see why legislators chose a different approach. Bills may also have different titles but contain the text of a previous bill, either in whole or in part. Finding that previous bill shows who sponsored it, and what budget assessments or Congressional reports may already have been written about it.
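The two kinds of match described above can be illustrated with a minimal sketch. It is not the API's actual implementation: the function names, the dictionary layout of a bill, the three-word shingles and the 0.5 threshold are all assumptions chosen for illustration.

```python
import re


def normalize_title(title: str) -> str:
    """Lowercase a title and collapse punctuation so trivial variants compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def shingles(text: str, k: int = 3) -> set:
    """Return the set of overlapping k-word sequences in the text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def containment(text_a: str, text_b: str, k: int = 3) -> float:
    """Fraction of the shorter text's shingles that also appear in the other text.

    Unlike a symmetric overlap score, this stays high when a short bill is
    embedded whole inside a much longer one.
    """
    sa, sb = shingles(text_a, k), shingles(text_b, k)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / min(len(sa), len(sb))


def classify_match(bill_a: dict, bill_b: dict, threshold: float = 0.5) -> str:
    """Label a pair of bills by whether their titles and texts match.

    Each bill is a hypothetical dict with "title" and "text" keys.
    """
    same_title = normalize_title(bill_a["title"]) == normalize_title(bill_b["title"])
    shared_text = containment(bill_a["text"], bill_b["text"]) >= threshold
    if same_title and shared_text:
        return "same title, shared text"
    if same_title:
        return "same title, different text"
    if shared_text:
        return "different title, shared text"
    return "no match"
```

A pair labelled "same title, different text" is the case where legislators kept the purpose but changed the approach; "different title, shared text" is the case where an earlier bill has been folded into a new one.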
Finding matches faster
Dmytro, working with Oleksandr, developed two new approaches to quickly find similar text among billions of pairwise comparisons (50,000 bills compared to 50,000 other bills is about 2.5 billion pairs; 300,000 bill sections compared to 300,000 sections is about 90 billion). The solution, which we describe in a separate blog post, involves converting the text into a series of zeros and ones, storing these in a database, and performing clever searches to find patterns that ‘match’ the zeros and ones corresponding to any new bill.
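To make the idea of matching on zeros and ones concrete, here is a minimal sketch using SimHash, one well-known way to turn text into a short binary fingerprint. This post does not say which method the team used, so SimHash, the 64-bit size, the four-band lookup and every name below are assumptions; an in-memory dictionary stands in for the database.

```python
import hashlib
import re
from collections import defaultdict

BITS = 64              # fingerprint length (assumed)
BANDS = 4              # fingerprint is split into 4 bands of 16 bits each
BAND_BITS = BITS // BANDS


def _hash64(token: str) -> int:
    """Stable 64-bit hash of a string (first 8 bytes of its MD5 digest)."""
    return int.from_bytes(hashlib.md5(token.encode("utf-8")).digest()[:8], "big")


def simhash(text: str, k: int = 3) -> int:
    """Convert text to a 64-bit fingerprint; similar texts tend to differ in few bits."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    features = [" ".join(words[i:i + k]) for i in range(max(len(words) - k + 1, 1))]
    counts = [0] * BITS
    for feature in features:
        h = _hash64(feature)
        for bit in range(BITS):
            counts[bit] += 1 if (h >> bit) & 1 else -1
    # Each output bit is the majority vote of that bit across all features.
    return sum(1 << bit for bit in range(BITS) if counts[bit] > 0)


def hamming(a: int, b: int) -> int:
    """Number of bit positions where two fingerprints differ."""
    return bin(a ^ b).count("1")


def bands(fp: int) -> list:
    """Split a fingerprint into (band index, band value) lookup keys."""
    mask = (1 << BAND_BITS) - 1
    return [(i, (fp >> (i * BAND_BITS)) & mask) for i in range(BANDS)]


class FingerprintIndex:
    """In-memory stand-in for a database table of bill fingerprints."""

    def __init__(self):
        self._buckets = defaultdict(set)
        self._fingerprints = {}

    def add(self, bill_id: str, text: str) -> None:
        fp = simhash(text)
        self._fingerprints[bill_id] = fp
        for key in bands(fp):
            self._buckets[key].add(bill_id)

    def query(self, text: str, max_distance: int = 3) -> list:
        """Return ids of stored bills within max_distance bits of the query text.

        With 4 bands, two fingerprints that differ in at most 3 bits must
        agree exactly on at least one band, so no such match is missed.
        Larger max_distance values may miss some matches.
        """
        fp = simhash(text)
        candidates = set()
        for key in bands(fp):
            candidates |= self._buckets.get(key, set())
        return sorted(b for b in candidates
                      if hamming(fp, self._fingerprints[b]) <= max_distance)
```

The point of the banded lookup is that a new bill is compared only against stored bills sharing at least one band, rather than against every bill in the collection.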
We’re excited to provide the API for public use, and to test this approach on bills in Ukraine and other countries.