Build a business for your newsroom or make building a newsroom your business?

Posted: July 14, 2011 in My Data Journey

Burt Herman, former bureau chief and correspondent for The Associated Press, CEO of Storify and founder of Hacks/Hackers, gave the following webinar to the participants of the Knight-Mozilla Learning Lab:

Here is my take on how the elements needed to build a business for a newsroom are also what you need to make building a virtual newsroom your business.

Follow your passion

My passion is data journalism: data and journalism in equal measure. Burt established his career before he took the big step of starting his own business. I had yet to start my career when I decided to make data journalism my business. My foray into coding is not to build an end-of-line product to mediate information but to build machines for the factory: things that let journalists machine-read information and unearth stories that the human eye cannot find. It may be a quick way to gauge where Cabinet Office money is going (click on the image to get to the view). Every time the data is updated, the visual updates automatically.

Or a way to get information to the public that lets you catch the conversation, such as judges who have been reprimanded over personal conduct (read the blog post).

Or an email alert system to bring potential stories to the journalist.

Burt already had a name for himself in journalism. I do not. I’m not looking to make my name; I’m looking to make things that help me find stories that might not otherwise be told.

Build a community

Around this time last year I started this blog and my Twitter account, not to broadcast what I know but to act as a semantic sink for all things data journalism, so I could find the people who can educate me by what they publish: my data miners. As much as I was working for them, they were working for me. They led me to the Hacks/Hackers community and ultimately to the ScraperWiki team.

Build a team

ScraperWiki is not my team (as much as I love them). The ScraperWiki community is my team. The Australian building planning alerts is on my team. The Icelander looking into foreclosures is on my team. I can see their code; I know what they build; I can ask them for help. My scrapers are my team (and I’ve built those!).

Just build it

Just scrape it. Data in the public interest is public data. Now I can write a scraper in a day or add little things on. That’s how I’m learning to code, but every piece of code I write has to have a journalistic purpose (whatever way you define that!).

Listen to your users

Listening to the stream of information delivered to me by my data miners is what led me to take a leap of faith and leave CNN to join ScraperWiki. I would never have been able to judge, even from within a news organisation, that data journalism was worth pursuing. But I was able to glean this by tracking the metadata from my blog and my Twitter footprint.

Stay flexible

Use backend, barebones code. Make it open. Mould it to the purpose of your journalistic endeavour. Here’s what I want to make (which is limited by what I can make!).

Comments
  1. Nick Doiron says:

    For your word cloud, have you tried using the embeddable word cloud at http://www.roytanck.com/2008/03/06/wordpress-plugin-wp-cumulus-flash-based-tag-cloud/ ? It’s Flash, but I used it in a recent project and it worked out pretty well.

    • Nicola Hughes says:

      That cloud generates the font size depending on the frequency of the word occurring. I generated the word cloud using the font size as the logarithm of the total spending that they received from the Cabinet Office during the selected time period. Journalistically these are two very different things. I don’t want to know how often they received money during the selected time period. I want to know how much they got!
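
The difference can be sketched in a few lines of Python. This is only an illustration, not the code behind the view: the supplier names, the amounts, the pixel range and the `font_sizes` helper are all made up for the example. It sizes the same words once by how often a supplier was paid and once by the logarithm of the total it received.

```python
import math
from collections import defaultdict

# Hypothetical payment records (supplier, amount in pounds).
# These are invented for illustration; they are not Cabinet Office data.
payments = [
    ("Supplier A", 1_000_000.0),
    ("Supplier B", 500.0),
    ("Supplier B", 700.0),
    ("Supplier B", 300.0),
    ("Supplier C", 20_000.0),
]


def font_sizes(weights, transform=float, min_px=10.0, max_px=48.0):
    """Map each weight to a font size, linear in transform(weight).

    With transform=math.log the weights must be positive.
    """
    scaled = {name: transform(w) for name, w in weights.items()}
    lo, hi = min(scaled.values()), max(scaled.values())
    span = (hi - lo) or 1.0  # avoid dividing by zero when all weights match
    return {
        name: min_px + (max_px - min_px) * (value - lo) / span
        for name, value in scaled.items()
    }


# Aggregate the records two ways: number of payments and total paid.
counts = defaultdict(int)
totals = defaultdict(float)
for supplier, amount in payments:
    counts[supplier] += 1
    totals[supplier] += amount

by_frequency = font_sizes(counts)                     # how often they were paid
by_spending = font_sizes(totals, transform=math.log)  # how much they got

for name in sorted(totals):
    print(f"{name}: frequency {by_frequency[name]:.1f}px, "
          f"spending {by_spending[name]:.1f}px")
```

In this made-up data, the supplier paid most often is the biggest word in the frequency cloud but the smallest in the spending cloud. The logarithm simply stops one very large total from shrinking every other word to the minimum size.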

      Another ScraperWiki user, Tim Retout, forked the code and made a pie chart date slider for the HMRC that very night http://scraperwikiviews.com/run/hmrc_spending_pie_chart_date_slider/

  2. Great post. More like this. :)

    Also, have a look at TweeQL (https://github.com/marcua/tweeql) and DataWrangler (http://vis.stanford.edu/wrangler/) — you may find them useful for some of the tasks that you’ve outlined in the video above.

    – Phillip.
