Thursday, October 22, 2015

Git for Law Revisited

Laws change. Each time a new U.S. law is enacted, it enters a backdrop of approximately 22 million words of existing law. The new law may strike some text, add some text, and make other adjustments that trickle through the legal corpus. Seeing these changes in context would help lawmakers and the public better understand their impact.

To software engineers, this problem sounds like a perfect application for automated change management. Input an amendment, output tracked changes (see sample below). In the ideal system such changes could be seen as soon as the law is enacted -- or even while a bill is being debated. We are now much closer to this ideal.

Changes to 16 U.S.C. 3835 by law 113-79

On Quora, on this blog, and elsewhere, I've discussed some of the challenges of using git, an automated change management system, to track laws. The biggest technical challenge has been that most laws, and most amendments to those laws, have not been structured in a computer-friendly way. But that is changing.

The Law Revision Counsel (LRC) compiles the U.S. Code through careful analysis of new laws: identifying the parts of existing law that will be changed (a process called Classification) and making those changes by hand. The drafting and revision process takes great skill and legal expertise.

So, for example, the LRC makes changes to the current U.S. Code, following the language of a law such as this one:
Sample provision, 113-79 section 2006(a)
LRC attorneys identify the affected provisions of the U.S. Code and then carry out each of these instructions (strike "The Secretary", insert "During fiscal year"...). Since 2011, the LRC has been using and publishing the final result of this analysis in XML format. One consequence of this format change is that it becomes feasible to automatically match the "before" text to the "after" text and produce a redlined version as seen above, showing the changes in context.
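To make the idea concrete, here is a minimal sketch in Python of a strike-and-insert instruction followed by a word-level redline. It uses only the standard library's difflib. The sentence, the function names, and the `[-...-]` / `{+...+}` markers are all invented for illustration; this is not actual statutory text and not how the LRC or any production tool does it.

```python
import difflib


def apply_strike_insert(text, strike, insert):
    """Carry out a 'strike X and insert Y' instruction on plain text.

    Assumption for this sketch: the struck phrase appears exactly once.
    Real amendments need far more careful targeting than this.
    """
    if text.count(strike) != 1:
        raise ValueError(f"expected exactly one occurrence of {strike!r}")
    return text.replace(strike, insert)


def redline(before, after):
    """Return a word-level redline of two strings.

    Deleted words are wrapped in [-...-] and inserted words in {+...+}.
    """
    old, new = before.split(), after.split()
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.extend(old[i1:i2])
            continue
        if tag in ("delete", "replace"):
            out.append("[-" + " ".join(old[i1:i2]) + "-]")
        if tag in ("insert", "replace"):
            out.append("{+" + " ".join(new[j1:j2]) + "+}")
    return " ".join(out)


# Hypothetical provision and amendment, invented for this example.
before = "The Administrator shall publish an annual report."
after = apply_strike_insert(before, "an annual", "a quarterly")
print(redline(before, after))
```

Comparing words rather than characters keeps the output readable: a reader sees whole phrases struck and inserted, which is closer to how a redlined statute is normally presented.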

To produce this redlined version, I ran xml_diff, an open-source program written by Joshua Tauberer of govtrack.us, who also works with my company, Xcential, on modernization projects for the U.S. House. The results can be remarkably accurate. The prerequisites are a "before" and an "after" version in XML format and a stretch of text small enough to make the comparison manageable.
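One way to keep the comparison manageable is to narrow it to the sections that actually changed before running any detailed diff. The sketch below does this with Python's standard xml.etree.ElementTree. It does not use xml_diff or reproduce its interface, and the `section` element name, the `id` attribute, and the sample documents are assumptions made up for the example rather than the schema the LRC publishes.

```python
import xml.etree.ElementTree as ET

# Hypothetical "before" and "after" documents, invented for this example.
BEFORE_XML = """<title>
  <section id="s1"><heading>Definitions</heading><p>The term "agency" means an office.</p></section>
  <section id="s2"><heading>Reports</heading><p>The Administrator shall publish an annual report.</p></section>
</title>"""

AFTER_XML = """<title>
  <section id="s1"><heading>Definitions</heading><p>The term "agency" means an office.</p></section>
  <section id="s2"><heading>Reports</heading><p>The Administrator shall publish a quarterly report.</p></section>
  <section id="s3"><heading>Funding</heading><p>There are authorized such sums as necessary.</p></section>
</title>"""


def section_texts(xml_string):
    """Map each section id to its text, with whitespace normalized."""
    root = ET.fromstring(xml_string)
    texts = {}
    for section in root.iter("section"):
        # itertext() yields the text of the section and all its descendants.
        texts[section.get("id")] = " ".join(" ".join(section.itertext()).split())
    return texts


def changed_sections(before_xml, after_xml):
    """Return {id: (old_text, new_text)} for sections that differ.

    A section present in only one version gets None on the other side.
    """
    before = section_texts(before_xml)
    after = section_texts(after_xml)
    changes = {}
    for sid in sorted(before.keys() | after.keys()):
        old, new = before.get(sid), after.get(sid)
        if old != new:
            changes[sid] = (old, new)
    return changes


for sid, (old, new) in changed_sections(BEFORE_XML, AFTER_XML).items():
    print(sid, "|", old, "->", new)
```

Only the sections this returns would then need a detailed, word-by-word comparison, which keeps each stretch of text small.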

Automation of this analysis is in its infancy and won't (yet) work for every law. However, the progress made so far points the way toward a future in which such redlining can be shown in real time for laws around the world.