Why Do We Need an Open Data Benchmark Study?

Posted by Kevin Merritt on July 15th, 2010

Yesterday we announced that on July 21st, we will launch a broad market study to benchmark the state of Open Data in government. We believe this will be the very first time that government stakeholders, mainstream citizens and civic application developers will all be invited to share their perspectives on this nascent movement.

Why are we launching this study? To begin with, it’s important to find out how far we’ve come in this evolution towards government data transparency and civically-engaged citizens. For example, how aware is the public of Open Data initiatives? How do they feel about them? To be more precise, the responses to the survey will allow us to answer questions such as: What percentage of people believe Open Data is important enough to fund with taxpayer dollars? Would the public be more likely to support elected officials who champion data transparency? How would people expect to consume and interact with public data? For that matter, what would be examples of high-value datasets in their view?

For data publishers within government, we’re also eager to uncover what the motivators are for doing this. Is it because it’s fashionable? Perhaps it is a result of a mandate by elected officials. Or is it something more enduring and more fundamental than that? Is it simply the right thing to do in a 21st century democracy? If so, what are the real-world constraints that are hindering progress? Lack of funding? Political will? Data governance issues?

For Socrata, this study is important in many ways. We believe in the Open Data movement and have built our company to help enable it. Although we operate a public service for individuals and small organizations to find and share data on Socrata.com, we are also a for-profit technology startup which has its own business sustainability motive in seeing this movement grow and prosper. We believe that we need to help drive adoption among all its key constituents: governments, citizens and developers.  We’re also thinking about the role of media, researchers, analysts and all other data-consuming groups.  Since technology adoption is a function of usefulness and usability, we’re constantly looking for ways to improve our platform along those two dimensions.  And since we can’t solve every Open Data problem ourselves, we’re trying to make it easy for developers to extend our platform to create new data assets and civic applications for the benefit of governments and citizens. This study will help us gain additional insight into how publishers, developers and consumers expect to interact with data.

Finally, in yesterday’s announcement we wanted to recognize that in addition to political courage, this movement owes much of its current momentum to passionate advocates such as the Sunlight Foundation, which has worked to influence transparency policy for many years. We also feel indebted to technology visionaries like Tim O’Reilly who have made the case for technology’s transformative power in Open Government and have helped usher in new technology thought leadership in government.

Our announcement yesterday was our way of saying: If you care about Open Data in government, join us in this study! We welcome the participation of like-minded, passionate advocates, thought leaders and interested media organizations. We’ve come a long way thanks to the effort of so many people. We’re on the cusp of something great and transformative for our democracy. Let’s find out how we can make it better.

NYC Open Data Hearing

Posted by Kevin Merritt on June 20th, 2010

Led by New York City Council Member Gale A. Brewer, on Monday June 21, 2010 the New York City Council Committee on Technology will hold a hearing on Open Data standards for all NYC agencies. The specifics of the hearing are as follows:

The New York City Council Committee on Technology will hold an important hearing on open data standards for all city agencies at 10:00am on June 21, 2010 at 250 Broadway, New York, NY (across from City Hall). This bill, Introduction 029-2010 (formerly Intro. 991-2009), is an effort to increase government transparency and facilitate easier access to public data. Beyond the ‘good government’ benefits of this legislation, the bill will also unlock City data to enable web developers and entrepreneurs to interact with City government in new and unforeseen ways. Data published under this legislation will be readable by any computer device, whether that is a laptop or a phone, for innovative developments. This Gov 2.0 inspired transparency legislation targets application developers, startups, small businesses, and academics with the ultimate goal of strengthening the connection between government and the public, while re-energizing the small business-tech sectors.

Please visit http://nycctechcomm.wordpress.com/opengov/ for information on Int. 029-2010.  If you wish to testify, please contact the Office of Council Member Gale A. Brewer, Kunal Malhotra, Legislative and Budget Director, at (212) 788-6975 or  Samuel Wong, Legislative Aide on Technology, at (212) 788-6975.

Socrata plans to testify in person at the hearing.  The key points we wish to make via our testimony are:

1. Disseminating public data is the right thing to do;

2. Doing so helps hold government accountable, improves efficiency, reduces costs and ultimately stimulates economic growth;

3. There is no need to build an Open Data solution from scratch;

4. Socrata offers a purpose-built Open Data platform empowering government organizations large and small to share their data with the widest array of data consuming audiences. It’s proven in major U.S. cities like Seattle and Chicago as well as in federal agencies, states and counties;

5. Socrata delivers its configurable, customizable platform as a cloud-based, Software-as-a-Service (SaaS) solution. We are a market-driven shared services provider. Each organization invests a fraction of the cost alongside the other organizations on our platform. It’s very cost effective and affordable. Organizations benefit from our evolving platform as a monthly service subscription. Depending on the features, storage and bandwidth desired, Socrata has plans ranging from hundreds to thousands of dollars per month, not hundreds of millions of dollars as has been previously speculated.

6. We created a 6-minute screencast introducing the Socrata platform and encourage you to watch it as an overview of the platform’s capabilities. You can watch the video at http://links.socrata.com/yx2x/mockups/videos/socrata-platform-v3/

It’s great news that NYC is having these discussions and, just as importantly, that the discussions are taking place in an open and transparent way. We’ll see you at the hearing.

Accessibility, Section 508, and the Open Government Movement

Posted by Chris Metcalf on April 28th, 2010

Can the government really be “open” if it’s not open for everyone? As Federal, state, and local governments put more of their data online, it is important that they take into consideration the needs of all of their constituents, including those who are disabled and use special technologies to access the Internet. This becomes especially important as new data portals such as Data.gov are built and put online.

In 1998, Congress amended the Rehabilitation Act of 1973 to require Federal agencies to ensure that their information technology products were accessible to those with disabilities. Section 508 introduced specific requirements that define what it means for a product to be accessible, along with specific details on what it means for a web-based product to be considered compliant. The World Wide Web Consortium (W3C) has also provided additional guidance through their Web Accessibility Initiative (WAI). In essence, to be considered “accessible,” visitors must “have access to and use of information and data that is comparable to that provided to the public who are not individuals with disabilities.” Disabled visitors must be afforded the same access to the content of a website that is available to those without disabilities.

Here at Socrata, we take accessibility seriously, and strive to make it possible for everyone to explore government data catalogs and discover interesting and provocative datasets. As such, we’ve invested significant engineering effort into making Socrata data sites Section 508-compliant for visitors who use accessible technologies. Some of the steps we’ve taken to aid accessibility include:

  • Ensuring that all content images include descriptive “alt” attributes, that textual links are used for navigation whenever possible, and that form elements use “label” tags to describe form fields.
  • Designing our Cascading Style Sheets (CSS) to allow accessible technologies to automatically scale the text and content of our website to make it viewable for those with reduced vision.
  • Providing “skip links” to allow users to jump between sections on the page using only a keyboard.
  • Keeping the semantic layout of our pages separate from their visual layout, allowing them to gracefully degrade when Cascading Style Sheets are disabled or when the site is viewed with a screen reader or refreshable braille display.
  • For pages where the use of AJAX or JavaScript would interfere with the use of accessible technologies, we provide keyboard-accessible skip links to allow the visitor to switch to an alternate version of the site that does not make use of JavaScript or AJAX technologies.

Together, these features allow visitors taking advantage of technologies such as screen magnification tools, screen readers, or refreshable braille displays to discover, view, and manipulate data made available through Socrata-powered data sites. For more information about Socrata’s Section 508 efforts, visit our accessibility statement or our profile on GSA’s BuyAccessible Product Directory.
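
A few of the checks in the list above lend themselves to automation. The following is a minimal Python sketch, not Socrata’s actual tooling: the class name `AccessibilityAudit`, the `audit` helper and the sample markup are all made up for illustration. It treats every image as a content image, counts any in-page link (an `href` starting with `#`) as a skip link, and only recognizes labels attached through a `for` attribute, so it is far from a full Section 508 review.

```python
from html.parser import HTMLParser


class AccessibilityAudit(HTMLParser):
    """Collects a few basic accessibility signals from an HTML string."""

    def __init__(self):
        super().__init__()
        self.images_missing_alt = 0
        self.input_ids = []          # id of every visible <input> (None if absent)
        self.label_targets = set()   # values of <label for="...">
        self.has_skip_link = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img":
            # Assumption: every image is a content image and needs alt text.
            if not (attrs.get("alt") or "").strip():
                self.images_missing_alt += 1
        elif tag == "input" and attrs.get("type") != "hidden":
            self.input_ids.append(attrs.get("id"))
        elif tag == "label" and attrs.get("for"):
            self.label_targets.add(attrs["for"])
        elif tag == "a" and (attrs.get("href") or "").startswith("#"):
            # Simplification: any in-page link is counted as a skip link.
            self.has_skip_link = True


def audit(html):
    """Return a small report for the given HTML string."""
    parser = AccessibilityAudit()
    parser.feed(html)
    parser.close()
    unlabeled = [i for i in parser.input_ids if i not in parser.label_targets]
    return {
        "images_missing_alt": parser.images_missing_alt,
        "unlabeled_inputs": len(unlabeled),
        "has_skip_link": parser.has_skip_link,
    }


if __name__ == "__main__":
    # Made-up sample page: one image lacks alt text, one input lacks a label.
    sample = (
        '<a href="#content">Skip to content</a>'
        '<img src="map.png" alt="Map of city parks">'
        '<img src="chart.png">'
        '<label for="q">Search datasets</label><input id="q" type="text">'
        '<input id="zip" type="text">'
    )
    print(audit(sample))
```

A check like this only catches mechanical omissions. Whether an “alt” text is actually descriptive, or whether a page degrades gracefully in a screen reader, still needs a human review.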

Photo by lissalou66 and released under the CC BY-ND 2.0 license.

The Three Constituents of Open Data

Posted by Kevin Merritt on April 20th, 2010

There is a global open data movement underway. Cities, counties, states and national governments are sharing their data with citizens. But we citizens are not all alike. One size does not fit all.

Socrata has spent the majority of the last three years focused on understanding the consumption side of the data publishing equation. We’re passionate about making data accessible and comprehensible to the widest audiences possible. Our work in this area has led us to classify the kinds of data consumers: a taxonomy of data consumption, if you will.

There are three major constituent groups of people who consume data:

The Non-Technically Trained But Nonetheless Interested. In a retail analogy, this is the 7-Eleven shopper. This is the ad hoc class of data consumers. They are convenience driven. These people are not programmers or DBAs with extensive training in data analysis. They are mainstream people, including students, who perhaps most regularly use Facebook, Excel, Word, PowerPoint and Gmail. Their interest in data is often momentary. They want to look up how much ARRA money is being spent in their neighborhood. They want to know which year was the coldest on record, or perhaps how many wolves live in Yellowstone National Park. They want to know how their senator voted on the latest bill. Their mental picture of data varies from person to person and dataset to dataset. When asked “what does data look like?” one might say a table; another might say a graph or chart; another might say it looks like a map; another would say it looks like the search results on Yelp or LinkedIn; yet another might say it looks like the closing stock prices in the Wall Street Journal. In order to comprehend data, they want to at least absorb and digest it and preferably sort, filter and search through it. The key to this group’s positive data consumption experience is that it needs to be interactive and visual. Because their needs are so diverse, it’s the hardest group to satisfy well.

Programmers. This is the Radio Shack shopper. They want to build things with data. Technically speaking, they’d rather not consume the data itself; they prefer to consume an API (an application programming interface) that “points to” data. Providing bulk data in download format is actually a burden to this group. Raw data forces them to find a place to store it, such as a relational database, to devise some method for keeping it current, and to create their own API for accessing it once it is stored and up to date. They are writing a program or mashup they hope endures for quite some time. Write once, run forever. This group is interested in a consistent API from one dataset to another. What they really want is access to data through an open, standards-based REST API designed for consuming data programmatically. API-enabling data isn’t particularly hard, but it does require some deliberate design, effort and execution. And of course, if thousands of data publishers expend the energy and effort to offer home-grown APIs not based on open standards, the result will be an entirely different frustration for programmers: dealing with thousands of API variants, which ultimately means the bar will be too high for most programmers to bother writing programs that make interesting use of public data.
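
To make the “consistent API from one dataset to another” point concrete, here is a minimal Python sketch of what a programmer’s side of such an API might look like. Everything specific in it is an assumption for illustration: the host `data.example.gov`, the `/datasets/<id>/rows.json` URL layout and the function names are hypothetical and do not describe any particular publisher’s API.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_rows_url(base_url, dataset_id, **filters):
    """Build the rows URL for a dataset.

    The URL has the same shape for every dataset, so client code written
    for one dataset works unchanged for the next. The layout itself is
    hypothetical.
    """
    url = f"{base_url.rstrip('/')}/datasets/{dataset_id}/rows.json"
    query = urlencode(sorted(filters.items()))
    return f"{url}?{query}" if query else url


def fetch_rows(url, opener=urlopen):
    """Fetch a URL and decode its JSON body.

    `opener` is injectable so the function can be exercised without a
    network connection.
    """
    with opener(url) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical endpoint; only the URL is built here, nothing is fetched.
    print(build_rows_url("https://data.example.gov/api", "wolf-counts", year=2009))
```

The client stores nothing and has nothing to keep current: each run asks the publisher for the latest rows, which is the burden that a bulk download would otherwise shift onto the programmer.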

Analysts, Researchers, Scientists and the Media. This is the Costco shopper. They want data in bulk, machine-readable formats like XML, CSV, XLS and JSON or maybe even RSS or RDF. Often they want multiple datasets from multiple sources so they can pour them into their own analysis system. They want to mine the data, looking for undiscovered meaning, hidden and as yet untold truths. This is the domain of investigative journalists. This is the easiest group to satisfy, as the easiest way to share data is to make a CSV or Microsoft Access file available.
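
As a rough illustration of why bulk, machine-readable formats suit this group, the Python sketch below reads a small CSV extract and re-emits it as JSON using only the standard library. The column names and values are invented sample data, and the helper names are ours.

```python
import csv
import io
import json


def csv_to_records(csv_text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return [dict(row) for row in csv.DictReader(io.StringIO(csv_text))]


def records_to_json(records):
    """Serialize records as pretty-printed JSON."""
    return json.dumps(records, indent=2)


if __name__ == "__main__":
    # Invented sample extract, not real figures.
    sample_csv = "agency,year,amount\nParks,2009,1200\nTransit,2009,3400\n"
    records = csv_to_records(sample_csv)
    print(records_to_json(records))
```

Note that CSV carries no type information, so every value arrives as a string; an analyst’s own system decides how to interpret each column.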

The open data movement is good for us all. It will take time, but eventually it means that government will run more transparently and better. Maybe someday businesses will too. It means that new insights from a plethora of public data sources will be formed. But the bar for sharing data has been raised. It’s simply no longer acceptable to publish a circa-1996 five-page web page full of caveats, disclaimers and instructions for decoding encoded data, at the bottom of which is a link to download a 17MB Microsoft Access file. The new bar for sharing data is to publish data in the way that is the most accessible and the most comprehensible to the widest array of audiences by ensuring that all three core data consumption constituent groups are adequately represented.

So what’s your role in open data? It’s simply to raise your voice for your constituent group. Are you civic-minded but not technically trained? Demand that public data be shared in interactive ways that allow you to sift through it in real time, without requiring a download. Are you a programmer? Push for API access to data. Tell data publishers about SODA, the Socrata Open Data API. Don’t accept a download. Are you a scientist, researcher, analyst or part of the media? Ask for bulk, machine-readable access to data in the format that’s easiest for you to consume. Data publishers need to hear from you.

Where To Find Socrata in the Community

Posted by Kevin Merritt on March 15th, 2010

You know that the whole open government, transparency, government 2.0 movement is reaching critical mass when there are overlapping, conflicting events. You’ll find Socrata folks at two upcoming events, which might interest readers of this blog as well.

Socrata Technical Program Manager Chris Metcalf and CEO Kevin Merritt will be at Transparency Camp in DC March 27-28, 2010.

Closer to home, a number of our software engineers and our CTO will be attending Open Gov West in our hometown Seattle March 26-27, 2010.

If you’re interested in helping transform government, come join us.

Open Government Directive

Posted by Kevin Merritt on December 17th, 2009

Last week, Peter Orszag, Director of the Office of Management and Budget (OMB), issued a memorandum for the heads of executive departments and agencies. This memo is the Open Government Directive. You can read the full 11-page memo, including the attached Open Government Plan here. I read the memo in detail and wrote up an abbreviated outline. Some of you may be interested in my outline, so I’m sharing it here.

- Written by Peter Orszag, director OMB

- Effective date is December 8, 2009

- The OGD memo was written as directed by President Obama’s January 21, 2009 Memo on Transparency and Open Government

- That earlier memo identifies the three principles that form the cornerstone of an open government: a) Transparency; b) Participation; c) Collaboration

- The OGD memo establishes deadlines for action

- The OGD memo requires each department and agency to take 3 steps toward fulfilling the goal of creating a more open government

1. Publish government information online

a. Machine readable

b. Publish proactively, not just respond to FOIA requests

c. Have at least 3 datasets online by January 21, 2010

d. Have a web page up by February 8, 2010 that serves as a gateway for agency OGD related activities

e. Allow the public to provide feedback on the quality of published data, help prioritize the schedule for dissemination of data and provide input on the Open Government Plan

f. Comply with Presidential open government initiatives such as data.gov, recovery.gov, USAspending.gov

2. Improve the quality of government information

a. Appoint a data quality official by January 21, 2010

3. Create and institute a culture of open government

a. Publish an Open Government Plan on the agency’s website by April 8, 2010 describing how it will improve transparency and integrate public participation and collaboration

- The OGD memo requires the administration to take the following steps to support departments and agencies

1. In support of improving the quality of government information

a. OMB will issue a framework for federal spending data by February 8, 2010

b. OMB will issue a long-term strategy for federal spending transparency by April 8, 2010

2. In support of creating and institutionalizing a culture of open government

a. Federal CIO (Vivek Kundra) and CTO (Aneesh Chopra) will set up an Open Government dashboard on whitehouse.gov by February 8, 2010

b. OMB and the federal CIO and CTO will establish a transparency, accountability, participation and collaboration workgroup by January 21, 2010

c. OMB will issue guidance on how agencies can use contests and other incentives by March 8, 2010

3. Create an enabling policy framework for Open Government

a. Evolve policies to allow for use of emerging technologies, which can help agencies become more open

b. By April 10, 2010 OIRA will review existing OMB policies to identify impediments to Open Government and/or the use of emerging technologies and where necessary will provide clarifying guidance and/or propose appropriate revisions to those policies

- Attached to the Open Government Directive is an appendix that describes the Open Government Plan [see 3(a) above]

+ Each agency’s Open Government Plan is its detailed public roadmap for incorporating transparency, participation and collaboration into the agency’s core mission

+ Each agency’s Open Government Plan should be published in a machine readable format on its own agency Open Government page as well as the forthcoming Open Government dashboard

+ The components of each agency’s Open Government Plan

o) Transparency

* Inventories of what data is available online today

* Inventories of what data is not yet available online with a reasonable dissemination schedule

* Foster and promote the public use of your data

o) Participation

* What is your agency going to do to improve public participation?

o) Collaboration

* How is your agency going to more proactively collaborate with other agencies, private sector companies, universities and non-profits?

o) Flagship initiative

* Each agency’s Open Government Plan should describe at least one initiative that the agency is currently implementing

Overview of the initiative including how it fulfills at least one of the three openness principles

How will you engage the public?

With whom will you collaborate?

How will you measure success?

How will you sustain and evolve it?

o) Public and agency involvement

* Incorporate ideas and feedback from the public and from agency employees

* Stimulate ongoing public feedback as part of the periodic review process

This memo lays the foundation and direction for agencies to share their data more openly, to engage the public more proactively and to collaborate with each other, the private sector and universities. It is excellent and welcome news for all citizens.