USEFUL RESOURCES FOR SOME, USELESS RANTS FOR OTHERS

Imagining A Reliability Index

This post stems from a discussion I had on Twitter last Friday. Journalism professor and media critic Jay Rosen tweeted:

Bloggers are increasingly credentialed as “press,” but that means we need a reliability index even more that we did before.

I responded that perhaps instead of a reliability index, it might better serve the users to present them with a trust index broken down by demographics. I want to expand on that thought here. I should clarify up front that I don’t think this should be the only tool we use for determining reliability of a story or source, but I do think it can be an effective one.

What would it look like?

Such an index would be compiled from votes by users on whether they trust a particular source or story. It would show an aggregate trust rating, but more importantly, it would also have multiple tabs showing different demographics and what percentage (and how many) within each group trusts that story or source. Here’s a quick-and-dirty mockup of what such an index would look like. Imagine something like this on every piece of journalism you come across online:

By demographics, I certainly mean basic categories such as age, education, or political affiliation (for stories where that’s relevant), but I also envision categories that are more specifically relevant to a particular story or source. For instance, the trust index on a science-related story would include a breakdown by voters who are scientists. A story about a particular neighborhood would include numbers for voters living within a certain proximity to that neighborhood. The index on an iPad review would include trust levels among people who own Macs vs. people who own PCs. A story about health-care reform would include a tab that shows how closely the voters in the index have kept up with the health-care debate, or how many of them work in the health-care industry, or how many are happy with their health care, or how many are from each income level.
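To make the tabs idea a bit more concrete, here is a minimal Python sketch of how the votes might be tallied into an aggregate rating plus one table per demographic category. Everything in it is an assumption made for illustration: the `Vote` class, the `trust_breakdown` function, and the category and group names are hypothetical, not part of any existing system.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Vote:
    trusts: bool  # True = "Trust", False = "Don't Trust"
    # Hypothetical self-reported attributes, e.g. {"age": "18-29", "scientist": "yes"}
    demographics: dict[str, str] = field(default_factory=dict)


def trust_breakdown(votes):
    """Return the aggregate trust share plus one table ("tab") per demographic category."""
    # counts[category][group] -> [trusting voters, total voters]
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for vote in votes:
        for category, group in vote.demographics.items():
            cell = counts[category][group]
            cell[0] += int(vote.trusts)
            cell[1] += 1

    # Each tab shows both the percentage and the raw number of voters in a group.
    tabs = {
        category: {
            group: {"trust": t, "voters": n, "percent": round(100 * t / n, 1)}
            for group, (t, n) in groups.items()
        }
        for category, groups in counts.items()
    }
    total = len(votes)
    trusting = sum(v.trusts for v in votes)
    overall = round(100 * trusting / total, 1) if total else None
    return {"overall_percent": overall, "voters": total, "tabs": tabs}


# Made-up votes, purely for illustration.
votes = [
    Vote(True, {"scientist": "no", "age": "18-29"}),
    Vote(True, {"scientist": "no", "age": "30-49"}),
    Vote(False, {"scientist": "yes", "age": "30-49"}),
    Vote(False, {"scientist": "yes", "age": "50+"}),
]
result = trust_breakdown(votes)
print(result["overall_percent"])            # aggregate rating
print(result["tabs"]["scientist"]["yes"])   # the "scientists" row of one tab
```

Keeping the raw voter count next to each percentage matters here: a 100-percent rating from two voters should not read the same as one from two thousand.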

How would this work?

Anyone would be able to sign up to use this trust index, both as a voter and as a source. The way I envision it, when you sign up, you would need to provide some basic demographic information, such as birth year, zip code, level of education, etc. All stories and sources are categorized, and users can pick which categories they want to vote on, such as politics, science, medicine, technology, or news about Chapel Hill, N.C.

From the other side, a user who is a source can also choose to add his/her story or site to the index for others to vote on. The index graphic will be displayed on the story or site, along with a “Trust/Don’t Trust” button for users to vote. When adding a story/site to the index, the source would have to pick which categories it falls into.

In order to vote on stories/sources in a particular category or to add a story/site to a category, users would first have to answer additional questions that provide more category-specific demographic data. For instance, if I want to vote in the science category, I would first have to answer questions such as:

  • Do you work in a job where you conduct scientific research?
  • Do you hold an undergraduate/advanced degree in a science field?
  • How many science-related stories do you read in a typical week?
  • Where do you come down on evolution-vs.-intelligent design?
  • Which organelles are the power plants of cells?
  • What is the process by which cells reproduce?

Such questions would not only ask users about their backgrounds as they relate to science, but also gauge, in some elementary way, their basic science literacy. The same goes for other categories. In the political news category, you might get tested on basic knowledge of how the government works; in news about a particular location, you might need to tell us how much time you’ve spent there. Now, considering that not all users would want some of these details associated with their names, I think the index must allow pseudonymous participation. Yes, that would open it up to being gamed, but then again, the same can be said of any open rating system. A hotel owner might sign in under several different names on TripAdvisor.com to give his own establishment five-star ratings. However, if you get enough participation, the numbers offset individuals’ attempts to game the system.
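As a rough sketch of how the category sign-up step described above might work, the Python below records a pseudonymous user's background answers along with a simple literacy score, and only lets the user vote in a category after that step is done. The names (`User`, `join_category`, `can_vote`, `SCIENCE_QUIZ`) and the accepted answers are my own assumptions for illustration. Note that the score is stored as one more demographic attribute rather than used to block anyone from voting.

```python
from dataclasses import dataclass, field

# Hypothetical quiz for the "science" category: question -> accepted answers.
SCIENCE_QUIZ = {
    "Which organelles are the power plants of cells?": {"mitochondria"},
    "What is the process by which cells reproduce?": {"mitosis", "cell division"},
}


@dataclass
class User:
    pseudonym: str  # no real name required
    # category -> answers to that category's background questions
    category_profiles: dict[str, dict[str, str]] = field(default_factory=dict)


def literacy_score(answers, quiz):
    """Fraction of quiz questions answered correctly (case-insensitive)."""
    correct = sum(
        answers.get(question, "").strip().lower() in accepted
        for question, accepted in quiz.items()
    )
    return correct / len(quiz)


def join_category(user, category, background, quiz_answers, quiz):
    """Store the category-specific answers; the user may vote there afterwards."""
    profile = dict(background)
    # The score becomes another attribute to break votes down by, not a gate.
    profile["literacy"] = f"{literacy_score(quiz_answers, quiz):.2f}"
    user.category_profiles[category] = profile


def can_vote(user, category):
    """A user can vote in a category only after answering its questions."""
    return category in user.category_profiles


reader = User("reader42")
print(can_vote(reader, "science"))  # not yet signed up for this category
join_category(
    reader,
    "science",
    background={"conducts_research": "no", "science_degree": "undergraduate"},
    quiz_answers={"Which organelles are the power plants of cells?": "Mitochondria"},
    quiz=SCIENCE_QUIZ,
)
print(can_vote(reader, "science"), reader.category_profiles["science"])
```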

In addition to the questions users must answer when they sign up to vote in or add a story/site to a category, a source can also add one or two questions on a specific story that ask for demographic information specifically relevant to the subject. But these questions would be purely optional for the users who vote. After all, how many hoops are we willing to jump through to vote on something? (That may be one of the reasons this idea won’t work: It might simply be too much hassle to gain widespread participation.)

Why would sources use it?

Legitimacy. If this index catches on and gains widespread use, then the downside of not displaying it on your site would outweigh concerns about using it and getting mediocre ratings. Think about it: When you’re looking at gadgets on Amazon.com or hotels on a travel site, how likely are you to pick one without any reviews? Having a rating on your blog could be in effect saying, “Hey, this blog is participating in this index, which shows, at the very least, that we take ourselves and our content seriously enough to put an accountability meter on our site.”

Why would readers use it?

For the same reasons they leave comments on blog posts and rate products they’ve purchased: to make their opinions known and to help fellow readers. The key, as I mentioned above, is to not make them jump through too many hoops to do it, and it’ll be a balancing act between user convenience and getting enough demographic data for the index to be useful.

Why do it this way?

There are a few reasons I favor doing a trust-by-demographics breakdown voted on by users over a one-number rating determined by one or a few editors running the index. First, it harnesses the power of the crowd, and in this case, I think that’s a good thing. Like product ratings at an e-commerce site or hotel reviews on TripAdvisor.com, the more participation you get from people who are actually using a particular piece of journalism, the more useful your rating system is to someone trying to decide whether to trust a source or a story.

My second reason has to do with the way I think people react to journalism versus the way they react to, say, a hotel room. On Twitter, Daniel Bachhuber responded to Jay’s and my tweets with the suggestion that reliability be derived from the quality of the content. My feeling on that is, while there is some kind of baseline, when you’re talking about how people view journalism, “quality of content” is usually a very subjective concept, in part because the subject matter journalism deals with often speaks to (or against) people’s deeply held beliefs and values. Think of it this way: Regardless of whether you are a Republican or Democrat, you are likely to have the same values when it comes to judging the quality of the hotel room you stayed in last night (cleanliness, comfy beds, nice toiletries, etc.). The same cannot be said for the values you use to judge government policies or stories about them. Thus, in the case of journalism, the baseline for quality that most people, regardless of their individual backgrounds and values, can agree on would be extremely basic and deal mostly with form and structure, such as being free of typos and using good grammar. The quality of the substance of a story, however, is something much more divisive and subjective.

Therefore, if a reliability rating is handed out by only one person or even just a few people, it’ll invariably raise the question among users of the index: “What makes their judgment any more legitimate than my own?” While I agree with Daniel that the rating may be less subjective if the metrics are transparent, I don’t think it’s possible to have an objective rating system or for a rating system to escape the (correct) perception of subjectivity. In fact, which metrics you pick to evaluate a story is a subjective decision in itself.

That doesn’t mean rating systems can’t be useful if they are subjective, but I think it does mean that instead of taking the tone of, “This story/source is reliable,” a reliability index would be more effective and more widely accepted if it focuses more on telling people, “Here is how much trust this story is getting from different groups of people, classified by background attributes relevant to this story. Use this information to help yourself decide whether you trust it.”

Certainly, I think the breakdown-by-demographics index can, and perhaps should, be complemented by a rating handed down by one or a few editors, much like how e-commerce sites have their own ratings for a product as well as a rating by users. However, personally, I’ve always found the user ratings to be much more helpful, due in no small part to the simple fact that there are more of them, giving me a wider range of opinions and a better idea of how people generally feel about a product. In that context, how much value is there in a numerical rating by one editor when it is compared to the ratings of tens, hundreds or even thousands of users? For that reason, I think an editor’s job should instead be to provide objective metrics about a story or source, such as how many factual errors there are or how many times this source has interviewed the same person in stories about the same subject in the past three months, and to point out any important relevant facts people should know when determining the trustworthiness of the story or source (e.g., a source writing about health-care reform moonlights as a lobbyist for insurance companies).

Most Importantly

To talk about what I feel is the most important strength of a breakdown-by-demographics index, let’s use an example. Let’s say someone writes a piece that presents intelligent design as having the same scientific legitimacy as evolution. It would probably spark the usual back-and-forth in the comment section, which is all fine and dandy for the spirit of free debate, but for someone who hasn’t made up their mind about whether this story is trustworthy, that back-and-forth is basically he-said-she-said.

In that scenario, a reliability rating handed out by one or a few editors does relatively little to help the reader, since that’s just one or a couple of voices in the crowd, and regardless of whether the rating is favorable to the story or not, it’s not that hard for the other side to call into question the legitimacy of the rating since it’s merely “the biased opinion of just a few people.” However, if you have a rating that’s a composite of hundreds if not thousands of users, then it begins to 1) attain more legitimacy, and 2) give you a better idea of how this story is viewed by the public. Furthermore, and more importantly, the breakdown by relevant demographics would play a crucial role here. Imagine how it would influence your decision on whether to trust the story if, say, you can see that 99 percent of voters who have an advanced science degree don’t trust this story, or that despite an 85-percent trust rating, you see that 300 of the 400 people who voted believe in intelligent design but that of the 35 voters who are scientists, none trusts the article.

That last point illustrates what I believe to be the greatest value of such an index: It’s not just an index on the reliability of the story or source being rated; it’s also an indirect index on the reliability of the people doing the rating. I believe the latter might even trump the former in significance in how someone decides whether to trust a story or source. If the news is social, then it is the people who discuss a report and the frame in which they pass it on that give it context and weight. Therefore, information about those people is crucial to our understanding of a story. If the story in the example above appeared in a forum that attracts a mostly pro-intelligent design crowd (but doesn’t clearly label itself as such), the composite reliability rating and most of the comments might favor the article, and someone who hasn’t been thoroughly informed on the subject might be misled by that seemingly lopsided discussion. However, if there’s a breakdown right there showing that most of the people who voted have a poor understanding of science, but that the few scientists who voted all distrust the article, that puts the discussion in a new context. In this way, the breakdown-by-demographics index alerts us when we stumble into echo chambers and acts as a check-and-balance mechanism for composite ratings and discussions that are skewed simply because the story is being presented to a skewed audience.
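The echo-chamber check could be sketched in a few lines of Python: compare each group's trust level with the composite rating and flag the groups that diverge sharply. This is only an illustration under my own assumptions. The function names and the `min_voters` and `gap` thresholds are arbitrary, and the trust split among intelligent-design believers is invented, since the example above gives only the 85-percent composite, the 400 total voters, and the 35 scientists who all distrust the story.

```python
def trust_percent(trusting, voters):
    """Share of voters in a group who trust the story, as a percentage."""
    return 100 * trusting / voters


def echo_chamber_warnings(overall, groups, min_voters=20, gap=40.0):
    """List groups whose trust level differs from the composite by at least `gap` points.

    overall: (trusting, voters) for everyone who voted.
    groups:  {group name: (trusting, voters)}.
    min_voters and gap are arbitrary illustrative thresholds.
    """
    overall_pct = trust_percent(*overall)
    warnings = []
    for name, (trusting, voters) in groups.items():
        if voters < min_voters:
            continue  # too few votes in this group to draw any conclusion
        pct = trust_percent(trusting, voters)
        if abs(pct - overall_pct) >= gap:
            warnings.append((name, round(pct, 1), round(overall_pct, 1)))
    return warnings


# From the example: 400 voters, 85 percent trust overall (340 voters),
# and none of the 35 scientists trusts the story.
# ASSUMPTION: 290 of the 300 intelligent-design believers trusting the story
# is an invented figure; the post does not give that split.
overall = (340, 400)
groups = {
    "scientists": (0, 35),
    "believes in intelligent design": (290, 300),
}
print(echo_chamber_warnings(overall, groups))
# prints [('scientists', 0.0, 85.0)]
```

The point of the flag is not to overrule the composite rating, only to tell the reader that a group with relevant background sees the story very differently from the crowd as a whole.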


2 Comments

  1. John, greetings. I wonder if you have read the book Anathem, by Neal Stephenson? There is a scene in the book in which a 'net-savvy user assesses the accuracy of online information by noting a writer's reliability score. My memory of the exact word Stephenson used is hazy, but that seems to be exactly what you're talking about.

    Keep it up. :) The 'net needs these indices.

    • Hi noxpopuli, thanks for reading and commenting. I have not read Anathem, but might have to check it out now. Thanks for the tip.