Friday, November 18, 2011

Legislative Model: How Much to Open Source?

Should legislative data schemes be open source?  That is the question that Grant Vergottini raises in his blog post today, "To Go Open Source or Not."  It's a thought-provoking topic, and I encourage you to join the discussion on Grant's blog. Some background and a bit of my thinking are below:

Some states (e.g., Oregon) have claimed copyright in the organization of state statutes in order to protect the state's contracts with legal publishers, or other monopoly arrangements, for publishing the state's laws. That is not the case with most states or jurisdictions, whose bills and statutes themselves are indisputably part of the public domain.

However, even when the legislative text and organization are part of the public domain, access is limited by inconsistent publishing formats and a lack of common standards.  Anyone who has tried to use the public internet to search for information on state or even federal laws realizes how difficult this can be.  I have discussed the situation with the U.S. tax code, which my company, Tabulaw, is working to make more accessible at tax26.com.

I have also discussed the difficulty of accessing California's laws, which gave rise to a hackathon to improve the situation.  California, thanks to Grant's work, does have an underlying XML-based data structure, SLIM, that allows California's legislature to easily research and modify the laws and makes the technical process of writing bills more efficient.  However, this benefit has not, until recently, translated into improved access for the public.  Grant and his company have recently open-sourced SLIM, which, in theory, could make California's laws more accessible to the public and also make the model available for use with legislation in other jurisdictions.  This could move us toward a standardization of legislative data.

That would be a big step forward for public access, but it does raise some concerns, as Grant points out: it would mean that one company (in this case Grant's) would own the basic data structure for public laws.  This is something that already happens, de facto, with large swaths of government documents stored in PDF, a proprietary but open-sourced format.  I am also disturbed by the claim, by the private publishers of the Bluebook, of copyright in a principal standard that has been adopted for citations of legal sources, and by other copyright claims that encumber the basic ways that legal citations are written. So there are clear potential problems with a privately owned standard, even if it is open-sourced.

Wouldn't it be nice if governments at all levels would collaborate to create a single nationwide public domain data standard for legislation? That would, for example, make it easier to identify all state laws related to abortion or to compare education laws across jurisdictions.  It might be nice, but it's also less likely than the Congressional SuperCommittee reaching a compromise.  I won't be holding my breath.

I do think that a privately created, widely adopted, and open-sourced standard is the next best option.  I think that the value of having a standard set of metadata in legislation outweighs the risks of private ownership of the standard.  And I believe that it is in the interests of all involved, including the owners of the standard, to make the open source licensing of the standard clear and permanent, in order to encourage the widest possible use of the standard.