Tuesday, January 31, 2012

Cato Institute: Legislative Transparency and Data Model

An email to the Sunlight Foundation's OpenGov listserv from Jim Harper of the Cato Institute points to his blog post, chock full of links and information, about the upcoming House conference on legislative transparency.  Among the references are a number of quite detailed and on-target recommendations that Cato is making for the drafting process and content.

In particular, Cato proposes a legislative data model that would include a great deal of useful metadata at every level of a bill.  The model is quite similar to the California legislative model described at legix.info, parts of which I worked with in the California Law Hackathon.  Although the model requires adding many kinds of metadata to the text, all of that data is readily available while a bill is being written (e.g., the bill sponsor); it is much more difficult to extract by parsing after the fact.  As I've discussed in earlier posts, a richly marked-up legislative text would be very valuable, and it goes hand-in-hand with my recommendations for making the text itself more amenable to automated analysis.
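To make the "metadata at every level" idea concrete, here is a minimal sketch in Python.  The field names are my own invention, not Cato's or legix.info's actual schema; the point is only that facts like the sponsor are trivial to record at drafting time and painful to recover by parsing later.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class BillSection:
    """One section of a bill, carrying metadata at the section level."""
    number: str
    heading: str
    text: str
    amends: Optional[str] = None  # citation of the provision being replaced

@dataclass
class Bill:
    """A bill with metadata known at drafting time attached up front."""
    congress: int
    chamber: str
    sponsor: str  # trivial to record while drafting, hard to parse out later
    title: str
    sections: List[BillSection] = field(default_factory=list)

bill = Bill(congress=112, chamber="House", sponsor="Rep. Example",
            title="A bill to illustrate drafting-time metadata")
bill.sections.append(BillSection(number="1", heading="Short title",
                                 text="This Act may be cited as ..."))
```

A real standard would express this in XML rather than Python objects, but the structure -- explicit fields at the bill and section level -- is the same.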

Monday, January 30, 2012

Make Change Consistent: Legislative Recommendation #2

In preparation for the U.S. House conference on data and transparency, I'm making four basic recommendations that, together, would create a framework for a more efficient electronic representation of U.S. Federal statutes.

The attention from the House on data transparency, combined with the current partisan gridlock on issues of policy, makes this a perfect time to "reboot" Federal legislation for the 21st century.  The reboot comes in two mutually reinforcing parts.  The first is the subject of this blog post: make changes to existing law in a consistent manner.  The second is to push ahead with codification of existing law, so that future legislation can be built on a cleaner, more consistent, and more accessible platform.  That is the topic of my next post.

Current legislation is riddled with language describing the changes that should be made to the law.  Take, as an example, the Health Care Reform Act (HR 1692), which I've also referred to in my Quora answer:
Section 1848(d) of the Social Security Act (42 U.S.C. 1395w–4(d)) is amended—
(1) in paragraph (10), in the heading, by striking ‘‘PORTION’’
and inserting ‘‘JANUARY THROUGH MAY ’’; and
(2) by adding at the end the following new paragraph...
Instead of this word-by-word description of the edits, a wholesale replacement should be made, preferably at the section level, swapping the old section for the new one.  In California, for example, this is done with language like this: "Section [53395.1] of the [Government Code] is amended to read:"

If you are co-authoring a memo, Congress's writing approach is the equivalent of describing edits to your co-author in the text of an email ("In sentence three, take out the first two words...").  What I am recommending, and what California does, essentially, is redlining.  Make changes in a consistent way, and replace entire sections with any amendment.
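The contrast can be sketched in a few lines of Python.  This is a toy model with invented section text, not any drafting office's actual tooling: under the replacement approach, amending the law is a simple substitution, and the word-by-word "strike/insert" instructions fall out mechanically as a diff.

```python
import difflib

# Toy statute: section citations mapped to their current text (invented).
law = {"Sec. 1848(d)(10)": "Limitation on portion of 2010 payments."}

def amend(law, citation, new_text):
    """Apply an amendment expressed as a full-text section replacement."""
    amended = dict(law)
    amended[citation] = new_text
    return amended

new_law = amend(law, "Sec. 1848(d)(10)",
                "Limitation on January through May 2010 payments.")

# The strike/insert instructions become a mechanically generated redline:
redline = list(difflib.ndiff(law["Sec. 1848(d)(10)"].split(),
                             new_law["Sec. 1848(d)(10)"].split()))
```

Note that the old text survives unchanged in `law`, so version history is preserved for free -- the very thing the prose-instruction style makes so hard to reconstruct.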

I understand, and have heard, many of the reasons why this kind of change is not easy.  Tradition and bureaucratic inertia play a large part in how Congressional language is currently crafted.  Writing by committee is already difficult.  Imagine the challenge of writing by various committees, hundreds of members, and two chambers with, to put it mildly, some disagreements in priorities.

There are many reasons to think, however, that this technical change to the drafting form will be welcomed on Capitol Hill. It will provide more clarity, not only for the public, but for Congressional offices themselves, about what impacts a bill would have on existing law. The "replacement" method of legislative drafting would ultimately be easier for each Congressional office to participate in.  And there are models to follow: California's legislature, not known for its easygoing legislative process, is, by and large, able to make its changes using this method.

A major technical challenge to adopting this method at the Federal level is that many of our statutes are "free floating".  Either they stand apart from the U.S. Code, and exist only in the "Statutes at Large", or they have been incorporated into a Title of the U.S. Code, but that Title, as an organized volume, has not been passed into law through the process known as positive law codification.  Congress, then, cannot technically refer to the existing text as "section 501 of Title 26", because Title 26 is not "positive law".

Instead, Congress refers to the original Acts which passed and which are being modified (e.g. the "Internal Revenue Code of 1986" or the "Patient Protection and Affordable Care Act"), and may include a parallel citation to the Code Title.  These Acts, in turn, make their amendments to prior Acts, some of which have been codified and some of which haven't. This has led to a significant, tangled backlog of legislation, which just makes the current system more difficult to change.  And that is why this change goes hand-in-hand with positive law codification, the subject of my next post.


Wednesday, January 25, 2012

Write in Plain English: Legislative Recommendation #1


I earlier listed recommendations for the U.S. House's Feb. 2 conference on legislative transparency and accessibility.  These recommendations, to improve the human and machine accessibility of Federal legislation, first require changes in the way that legislation is written and second focus on the technology to support those changes.  Some of these changes will meet with more cultural resistance than others and will be harder to implement.  That is probably the case with my first recommendation, to Write in Plain English, but I believe that Congress can take initial steps toward this goal right away.

By plain English, I mean writing clearly and consistently.  In the highly nuanced and technical areas addressed by much of our legislation, plain language will still include technical language, and statutes will still require some expertise to understand and apply.  Legislation will still include ambiguity: in fact, much legislative compromise is built on carefully crafted, ambiguous language.  However, such ambiguity also carries heavy costs in decreased certainty, increased litigation costs, and increased polarization.

As for implementation, there are already plain language initiatives that apply to Federal agencies, as well as stylistic guidelines for how to write in plain language. Although Congress is clearly a different beast, lessons from these initiatives can apply to legislative drafting as well.  The centralized Office of the Legislative Counsel already plays a very large role in drafting laws and crafting legislative language, for which most offices are very grateful. Strong plain language guidelines can be incorporated into the existing OLC guidelines (pdf) and, especially when backed by intelligent automation, can eventually make drafting a bit more of a science and less of an art.  For example, to the basic stylistic guidelines, we can add a bit of technology: expand the vocabulary that is explicitly defined, set standards for the categories and kinds of words and phrases that require definitions, and work to harmonize definitions, to the extent possible, going forward.
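As one small illustration of the kind of automation that could back such guidelines, here is a hypothetical check (my own toy convention, not any OLC tool) that flags quoted, capitalized terms used in a bill's text without a corresponding definition:

```python
import re

def undefined_terms(definitions, text):
    """Return quoted, capitalized phrases used in `text` (a toy convention
    for defined terms) that are missing from the set of `definitions`."""
    used = set(re.findall(r'"([A-Z][A-Za-z ]+)"', text))
    return sorted(used - set(definitions))

definitions = {"Covered Entity"}
body = 'A "Covered Entity" shall notify the "Secretary" within 30 days.'
missing = undefined_terms(definitions, body)  # flags "Secretary"
```

A drafter could run a check like this before a bill leaves the office, the same way a programmer runs a linter; harmonizing definitions across bills is the natural next step.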

Another simple, but very powerful, stylistic change is to write all legislation as a full text replacement, at the section level, as is done in California and some other state legislative systems.  I will discuss this more in my next post.

Friday, January 20, 2012

Legislative Standards: U.S. House Conference

The U.S. House will hold a conference about transparency and legislative data on February 2, 2012 (with remote streaming?).  As Daniel Schuman of the Sunlight Foundation writes, "This is a big deal."  In this time of extreme partisanship and one-upmanship in politics, legislative data transparency is both important and something that both sides can, in principle, agree on.

I've written before about the importance of consistent standards in legislation, the use of meaningful metadata, and the value of version control. Federal legislative data currently passes through at least five offices on the way to being codified.  Because of many historical quirks, the codified version (the "U.S. Code") is often not the actual law, though for convenience most people, even lawyers, pretend that it is.  Simplifying the content of the law, as President Obama and others have called for in tax law, becomes harder when the basic technical issues of changing the law are so arcane and complex.

The Feb. 2 conference brings together six offices that prepare legislative data on its way from bill to code, and it can spark the creation of a unified data standard that would make the law easier both to create and to understand.  In the next few posts, I will detail some suggestions of key changes that would advance these goals.  For example:

  1. Write in plain English
  2. Write changes to statutory sections as full replacements of previous sections. Use a consistent format to make changes (e.g. "Section 444 is amended to read as follows:"). 
  3. Commit to enacting positive law codification for all Titles. A positive law codification project for Title 26 should be considered as a non-partisan starting point for the effort to simplify tax law.
  4. Adopt a clean and simple XML data standard for Federal Legislation.
These are not original suggestions, nor are they comprehensive, but making these changes would provide the foundations for a far more transparent legislative framework.
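Recommendation #2 also pays a direct dividend for recommendation #4: if every amendment follows one consistent formula, recognizing it mechanically is a one-line pattern match.  A hypothetical sketch, assuming the phrasing quoted in the list above:

```python
import re

# A hypothetical pattern for the consistent amendment formula above.
AMEND_RE = re.compile(
    r"Section (?P<section>[\w.()]+) of the (?P<act>[\w ]+) "
    r"is amended to read as follows:")

line = "Section 444 of the Example Act is amended to read as follows:"
m = AMEND_RE.match(line)
# m.group("section") -> "444"; m.group("act") -> "Example Act"
```

Today, by contrast, a parser has to cope with open-ended prose ("in paragraph (10), in the heading, by striking..."), which is why so little of this tooling exists.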

Tuesday, January 17, 2012

New Users = New Bugs. Patience requested.

We're getting a lot of new sign-ups for the private beta of Tabulaw's legal research and writing platform.  Most of this was triggered by the review by Bob Ambrogi on his blog, LawSitesBlog.com, and word-of-mouth growth from there.  One of the results is that the new users have quickly found many bugs in the software: some small (logins are case-sensitive), some large (rendering doesn't work on all versions of Internet Explorer), and many that will require significant changes in the application.

I've been pleasantly surprised by the range of people who are interested in an integrated legal research and writing platform like Tabulaw: lawyers from large and small firms, many law professors and students, legal technology entrepreneurs, and government lawyers from an impressive selection of agencies from across the country.

Along with bugs and feature requests, the higher volume has also created a significant problem that may take more than software engineering to resolve: access to Google Scholar from the application has been disabled.  The reason -- Scholar limits search requests from a single source -- points out one of the paradoxes of Google Scholar.  Scholar has become a terrific free resource for legal research.  Many articles have pointed out the potential for making Google Scholar (and Google Documents) a mainstay of legal work. As a free source of court opinions, connected to the open web, Google Scholar stands in contrast to the proprietary, walled-off databases of the major legal publishers.  This makes it -- theoretically -- possible to create a fluid workflow for lawyers who want to access legal information on the web and directly integrate it into their writing, without the inconvenience and cost of a publisher's paywall.

That is where Tabulaw comes in.  We are building a set of tools that helps to organize sources such as Google Scholar and integrate them seamlessly into the documents that lawyers and legal researchers are writing.  It makes it possible to imagine an ecosystem of applications by entrepreneurial startups using a common set of open access data, similar to the access developers have to APIs (programming interfaces) from Google and other tech leaders.

But there's the rub... Google Scholar Law, like other free online sources, still has hidden limits, vestiges of the way that legal data is collected and owned, which make this vision of an open access web of legal data part of a more distant future.  Through initiatives like the California Law Hackathon, and our development of primary tax law resources at tax26.com, we are working to bring that future closer.

I hope that some of the people who are trying out the private beta will work with us toward that goal. In the meantime, we have a lot of bugs to fix!

Wednesday, January 11, 2012

Tax Simplification: Use Data

This year, I hope politicians on both sides of the aisle heed Professor Annette Nellen's advice to simplify tax law.  Nellen's article for the American Institute of CPAs (AICPA) highlights remarks by IRS Commissioner Douglas H. Shulman on simplifying tax law.  Nellen provides a table showing broad agreement between the Commissioner's prescription for simplification, and recommendations by the AICPA.  They both start with setting common standards and seeking a single, simple approach to a tax problem.

I have often thought about tax simplification from the starting point of tax statutes, and simplification of the underlying Code. That may be what David Brooks had in mind, in this week's column, advising Obama to champion simplifying the tax code among other good government measures.  However, by using the data at its disposal, the IRS can go a long way toward simplification on its own.  What I mean is this:

The IRS has massive amounts of data on the transactions of citizens and businesses and what impact those transactions had on tax determinations by the IRS.  This data could be used to provide deterministic answers for a huge variety of specific taxpayer questions.

The IRS already provides guidance in a number of forms to individuals and businesses, with respect to the agency's interpretation of the law for particular circumstances.  However, this advice comes in the form of written documents and letters that add to the overall volume of information that is required to understand and act on the law.

What I imagine is a tax calculator -- something like H&R Block or TurboTax software  -- but using prior years' data, combined with IRS policy decisions, to prospectively help taxpayers determine the consequences of certain events or tax decisions.
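In spirit, such a calculator is a lookup over historical determinations.  Here is a deliberately simplified sketch with entirely invented data and categories (real IRS determinations turn on vastly more dimensions than an amount bracket):

```python
# Invented historical outcomes: (event type, amount bracket) -> treatment.
prior_determinations = {
    ("home_office_deduction", "under_1500"): "allowed, simplified method",
    ("home_office_deduction", "over_1500"): "allowed with Form 8829",
}

def bracket(amount):
    """Bucket a dollar amount into the (invented) brackets used above."""
    return "under_1500" if amount < 1500 else "over_1500"

def predict_treatment(event_type, amount):
    """Answer a prospective taxpayer question from prior-year outcomes."""
    return prior_determinations.get(
        (event_type, bracket(amount)), "no prior determination on record")

answer = predict_treatment("home_office_deduction", 900)
```

The real system would be statistical rather than a dictionary lookup, but the interface is the point: a taxpayer describes a prospective event and gets back the agency's likely treatment, grounded in data the IRS already holds.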

Of course, the devil is in the details.  But this is a place where the high volume of data that the IRS deals with can actually be an advantage toward simplification, just as Google uses its tremendous data advantages to simplify many informational challenges that would otherwise be far too complex.