Many Agencies Miss Obama’s Deadline For Openness

"Alarm Clock 3" by Flickr user alancleaver_2000

According to Doug Ward’s excellent OpenGovBlog, the first deadline under President Obama’s “Open Government Directive” has come and gone with 26 agencies failing to meet the Directive’s requirements. Here’s what Obama is requiring: “Within 45 days, each agency shall identify and publish online in an open format at least three high-value data sets and register those data sets via Data.gov. These must be data sets not previously available online or in a downloadable format.” The deadline was January 22.

What’s more, in many cases what counts as “meeting the requirements” is just lame. One agency took data that had been available in PDF form and posted it as an Excel spreadsheet with headers. Another agency reposted data that had been available since 2004, just labeling it with a more specific timeframe.

The Sunlight Foundation, which focuses on this issue relentlessly (and well), has written a piece recapping what they are seeing so far as they sift through the data. Their take:

As a first step toward making agency data available in more accessible formats for sophisticated users, the open government directive is so far somewhat successful–plenty of data sets that had been available only as PDFs, or had to be pulled down by scraping Web sites, are now there for the taking (we’ll have better counts of this later in the week). But new data sets are not predominant: the major agencies covered by the directive released 58 data sets, of which, by our count, 16 were previously unavailable in some format online.

That sounds like progress, I suppose . . . but a long way to go before we have real “transparency.”



Facebook Analyzes User Diversity Data

The Facebook data analysis team recently finished taking a hard look at the diversity statistics for the more than 94 million users who live in the United States. (Fun fact: There are more than 350 million users worldwide, making Facebook a larger “country” than the U.S.)

Ethnic Makeup Of Facebook Users

The data team dove deeply into the numbers and used a range of tools to make sure that they were doing their best to remove bias and error. The chief tools they used are statistical breakdowns of ethnicity and last names. Their report goes into detail about the methods they used, and while one can quibble with things here and there, the approach appears reasonable overall.

The upshot: “We discovered that Facebook has always been diverse and that the diversity has increased significantly over the past year to the point where U.S. Facebook users nearly mirror the diversity of the overall population of the country.”

The graph illustrates this. The dotted lines represent the distribution of various (nonwhite) ethnicities in the overall Internet population, while the solid lines represent U.S. Facebook users:

From Facebook

You can see that each solid line is trending toward its corresponding dotted line — implying that the ethnic distribution within Facebook is moving, over time, to match the distribution of general Internet users.

Ethnicity Of Internet Users Vs. All Americans

Note that the Facebook analysis team is comparing their statistics to Internet users, not the U.S. population as a whole. That raises the question: how do Internet penetration rates map onto the ethnic makeup of the U.S.?

The answer is that with overall Internet adoption reaching 80%, Facebook’s statistics tend to roughly mirror the U.S. population that is online, but digital divides persist. That’s because Internet use does not distribute across the population in the same way for each ethnicity.

According to the latest data from the Pew Internet And American Life Project, penetration rates are higher among whites (80% of Non-Hispanic Whites are online) than among Blacks (72% are online) and Hispanics (61% online).

Here’s another way to look at it, using data from NetRoots Nation and from the U.S. Census:

Internet Penetration Compared To Ethnic Distribution

In other words, White Non-Hispanics are slightly overrepresented online, while other ethnicities are slightly underrepresented. Hispanics show the widest gap.

(Note that I am comparing households and individuals here, so the numbers aren’t precisely comparable, but they illustrate the point.)

The Real Digital Divide

While there are very real divisions in the United States along lines of race and ethnicity, the larger drivers of the “Digital Divide” are economics and education (the latter itself driven in large part by economics).

For instance, according to the Pew Internet and American Life Project, 94% of college graduates are online, while just 72% of Americans with only a high school education are. And for adults who did not finish high school, the online rate is just 37%.

And, while 95% of people who make more than $75K per year are online, the number drops to 62% for those who make less than $30K.

This suggests an interesting avenue for the Facebook team to pursue: a study of economic and education data as it relates to Facebook users.



Personal And Public Security — What’s The Answer?

"lax security" by Flickr user Abulic Monkey

By now, most users of Twitter know of what’s come to be called “Twitfail” — French hackers gained access to the personal email accounts and passwords of top executives at Twitter. To prove it, they emailed a large cache of internal strategy documents to the widely-read Silicon Valley blog TechCrunch. After agonizing over it for a bit, and after informing Twitter executives, TechCrunch published some of the documents.

According to a recap of how the hacker (“Hacker Croll”) got it done: “The list of services affected either directly, or indirectly, are some of the most popular web applications and services in use today – Gmail, Google Apps, GoDaddy, MobileMe, AT&T, Amazon, Hotmail, Paypal and iTunes.”

Hacker Croll didn’t crack the main Twitter network first — he cracked the founder’s Gmail password. Once into that playground, he had access to almost everything. Twitter executives shared and interacted on a number of publicly available platforms, just like many people do. For any email user who uses Gmail, Yahoo, Hotmail, or any other public “cloud” service: imagine if you were a high profile person. In other words, a target. Now imagine what damage a dedicated person could do if they got full access to your email. You probably have usernames and passwords in there. You probably have password reminders too (pet names, for instance). On your public profile on Facebook or somewhere else, maybe you’ve mentioned where you were born.

With all this information, it’s possible to do a lot of damage. According to Nik Cubrilovic of TechCrunch:

Taken individually, most of these services have reasonable security precautions against intrusion. But there are huge weaknesses when they are looked at together, as an ecosystem. Like dominoes, once one fell (Gmail was the first to go), the others all tumbled as well. The end result was chaos, and raises important questions about how private corporate and personal information is managed and secured in a time when the trend is towards more data, applications and entire user identities being hosted on the web and ‘in the cloud’.

It does indeed raise a few questions. The first is, “What can I do to make sure this does not happen to me?” You are probably already doing it — by not being famous. While it is straightforward for a dedicated person to crack another’s accounts, it takes time and energy (unless you do something silly to make it easy like use an obvious word like “password” for your password).

More interesting to me, though, is a question about current online culture in general. The space rewards funky startups with moxie and attitude. These are enterprises started on a good idea, Red Bull, and (later) a small tranche of venture capital. This is a culture where even between competitors there is a high degree of trust and everyone is using the same tools (Gmail, Google Apps, etc.). It’s like everyone is locking their front doors, but they leave their cars unlocked and park them in the same lot.

Video producer Loren Feldman has a scathing critique of Twitter’s security (which uses a few bowling words, so you might want to put earphones on if you are at work):

It’s easy to say this has got to change. But a rampant culture of collaboration is part of what makes so much creativity possible. People can work together on new things seamlessly, with very little friction. So it’s a trade-off.

In such a situation, if you are the owner of a business — at what point do you realize you need to get serious about your personal security? Sure, the easy answer is “When you form your company.” But that’s not the issue. When do you decide to take the large step, perhaps, of resetting every one of your personal passwords? The inertia against doing that is high. Knowing what you now know, are you going to run out and change all your passwords so each one is different, unique, and unguessable? I didn’t think so.
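For anyone who does want to take that step, here is a minimal sketch of what “different, unique, and unguessable” could look like in practice. It uses Python’s standard `secrets` module to draw one random password per account. The site names, the 20-character default, and the 12-character minimum are my own assumptions for illustration, not a recommendation from any of the services mentioned above.

```python
import secrets
import string

# Characters to draw from: letters, digits, and punctuation.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length: int = 20) -> str:
    """Return a random password drawn from a cryptographically secure source."""
    # The 12-character floor is an assumption made for this sketch.
    if length < 12:
        raise ValueError("use at least 12 characters")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def passwords_for(sites: list[str], length: int = 20) -> dict[str, str]:
    """Give every site its own independently generated password."""
    return {site: generate_password(length) for site in sites}


if __name__ == "__main__":
    # Hypothetical list of accounts; substitute your own.
    for site, password in passwords_for(["gmail", "paypal", "itunes"]).items():
        print(f"{site}: {password}")
```

Because each password is generated independently, cracking one account tells an intruder nothing about the others. The catch, of course, is that nobody can memorize these, so in practice you would need somewhere safe to keep them.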

One answer, I think, is that a few large sites ought to become just a little less user friendly. It should be much harder to regain a lost password or reset a password. (If I know your dog’s name, I might be able to fool the “remind me my password” function of your favorite site just by guessing what your username might be.) But the user-friendliness is what has allowed such companies to thrive. Again, a trade-off.

How do we secure a system that relies on ease-of-use? That’s the key question. The very thing that has allowed today’s culture is the thing that could be its downfall. This is something we need to pay attention to.

Whoever answers it will become very wealthy.

Internet Radio, The Next Plastics

According to MinOnline, a little-noticed phenomenon began to coalesce last year. While everyone was watching social networking and video sites soar in popularity, Internet audio was quietly taking off too.

It may be one of the Big Things of 2009; indeed, it may already be. Says Arbitron: 33 million Americans listen to Internet radio each week. Among at-work listeners, Internet use went from 12% to 20%. And among college graduates, 30% of all radio listening is over the pipes.

It makes sense. Sites like Pandora and Last.fm allow listeners to tailor their experience and, more important, share favorites and playlists with ease. As people in general demand more and more customization from their organizational and institutional relationships, it boggles my mind that anyone still puts up with broadcast at all.

From the article:

Why is the rise and success of Internet radio important to publishers? On several grounds. First, this is what your prize in-office users are doing with much of their day. Finding ways to weave into one of the things they most enjoy about broadband should be a no-brainer for any veteran Web content provider. If you think they like social networks and video, then wait until you see how much users love their Pandora. The average session time is three hours. Also, Web radio is an enormously robust channel for audio programming, including podcasts. Services like Last.fm, for instance, let users find and save popular podcasts into their libraries for later playback as a channel.

More to the point, however, streaming audio represents a massively popular mode of online behavior that invites a range of publisher partnerships: branded audio channels or “editor’s choice” channels, for instance. Why shouldn’t an online site offer an audio feed of its editor’s Web radio channel or channels created by that issue’s featured celebrities? What would an Utne radio channel sound like, or a BHG or High Times channel, for that matter? Lifestyle, art, regional and certainly music publications all aggregate taste groups that likely share musical or even talk radio preferences. Web radio listeners already swap their music channels in much the same way the rest of us trade and share article links in social media. Audio is the next content type users will want to coalesce around and share. This is a Web trend in the making that Web publishers should not take lightly.

As I tweeted a few days ago, the rise of Internet radio seems to me to spell game over for satellite. SiriusXM recently got a reprieve from having their stock delisted from NASDAQ, but how do they stay afloat over the long term?

Taxonomy of Hotel Internet

I’ve been travelling again. It’s got me thinking about different kinds of hotel Internet access.

I see four basic kinds:

  • Just plain on: This is great, but unusual. Walk into the hotel room, flip on your laptop, find the wireless network, connect, go.
  • Free but signup: Courtyard by Marriott has this. It’s quite good. You need to sign on, but once you’re on, it’s pretty much bulletproof. Plus, you have a choice between wired and wireless.
  • Outside vendor: I was in a hotel recently that had teamed up with T-Mobile. I had to sign up for a day pass from T-Mobile to use the Internet. A bit of a pain (and I did not like paying), but it worked.
  • Fancy hotel, in-house system, barely works: That’s where I am now. I paid $12.95 for a system where I had to sign up through multiple screens including pop-ups, could only use wired (no wireless available at all) and . . . once I was on, the connection failed reliably every four minutes or so. Come on, people!

Let me know in the comments what kinds I’ve missed!