
As part of my data journey, I’m learning to scrape, so I’m looking for small pieces of data in the usual formats to work on first. I decided to start with the Cabinet Office’s CSV file of UK ministerial gifts received in 2009-10.

For all you novices out there, CSV is a basic spreadsheet format which you can open in Excel, so it’s fairly usable. From a data journalism point of view, though, this file was less than clean. Every department was listed, and the ones which didn’t receive anything had a ‘NIL RETURN’ entry under ‘Minister’. I have no use for that. The department entry was also left empty whenever a gift fell under the same department as the one above it, so I had to fix that with code. The entry of dates is appalling. But my main issue is with the data collection. Only gifts worth over £140 are registered. I doubt some poor civil servant is calling up foreign dignitaries to ask how much that bottle of wine they gave the PM is worth, so most gifts are simply valued at ‘Over limit’. Regardless, as an exercise, here’s what’s of interest:

The King of Saudi Arabia, Abdullah bin Abdul Aziz, gave Alistair Darling jewellery! He also gave Gordon and Sarah Brown an ornament and jewellery.

Nick Keller, founder of Beyond Sport, gave Tessa Jowell a travel alarm clock worth over £140. Beyond Sport ambassadors include Tony Blair, Michael Johnson and Dame Kelly Holmes.

Other gifts from non-dignitaries include a bathrobe, slippers, a towel set and a bed linen set for Gordon Brown from Enrico Marinelli; a selection of CDs from EMI for Ben Bradshaw (some of which he purchased); jewellery and a scarf from Lola Rose for Sarah Brown; a hamper for her from Naomi Campbell; and a hamper for Gordon Brown from Sir Gulam Noon (a controversial Labour donor).

No. 10 must be full of rugs: three from Pakistan, one from Afghanistan and one from Azerbaijan.

Wine given by Nicolas Sarkozy and the President of Algeria, Abdelaziz Bouteflika, was used for official entertainment, whereas wine given by the President of Tunisia and the Sultan of Brunei was given to charity. Either the latter two didn’t bring good enough wine, or Sarkozy and Bouteflika didn’t trust No. 10 to stock good enough wine.

I think I’m going to hack gifts for a bit so stay tuned.
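
For the curious, the clean-up described above (dropping the ‘NIL RETURN’ rows and filling in the blank department cells) could look roughly like this in Python. This is a minimal sketch, not the scraper actually used for this post. ‘Minister’ is the column mentioned above, but the `Department`, `Gift` and `Value` column names are assumptions, and every sample row is made up for illustration. The real file’s headers may well differ.

```python
import csv
import io


def clean_gift_rows(rows):
    """Drop 'NIL RETURN' rows and fill blank department cells from the row above."""
    cleaned = []
    last_department = ""
    for row in rows:
        department = (row.get("Department") or "").strip()
        if department:
            # A non-empty cell starts a new department block.
            last_department = department
        minister = (row.get("Minister") or "").strip()
        if minister.upper() == "NIL RETURN":
            # This department received nothing, so skip the row entirely.
            continue
        # Copy the row, carrying the department down into blank cells.
        cleaned.append({**row, "Department": last_department})
    return cleaned


# Made-up sample data in the assumed layout (not taken from the real file).
SAMPLE = """Department,Minister,Gift,Value
Example Office,Minister A,Rug,Over limit
,Minister A,Wine,Over limit
Another Department,NIL RETURN,,
Third Department,Minister B,Hamper,Over limit
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
cleaned = clean_gift_rows(rows)

for row in cleaned:
    print(row["Department"], "|", row["Minister"], "|", row["Gift"])
```

The department is remembered before the ‘NIL RETURN’ check, so a skipped row can never leave a stale department name to be copied into the rows that follow it. The dates would need their own clean-up, which this sketch doesn’t attempt.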

Here is a timeline of my data journey, starting from when I first heard about this thing called Computer-Assisted Reporting (CAR).


Well, computers have moved on since journalists were hacking away on spreadsheets a decade ago, so I decided to see how CAR has come along. This proved puzzling: in almost all news institutions it has been overlooked.

During this time of data curiosity on my part, I had been at ITN and the BBC and was working at CNN. Social media provided some sort of platform to explore data in the newsroom, it being the latest buzzword that execs are actually interested in (unlike data which, in my opinion, is a much more fruitful venture when it comes to generating actual news).

So I did my data journalism stuff out of hours and gathered a lot of news from social media during hours. This was made possible by the many web applications built by developers (as they make money, which is the sore point in data journalism).

At the beginning of this year, I up and left the newsroom for the programming terminal, to look at applications for data, serious data. I’m now at ScraperWiki. The thinking behind this: The Times paired a programmer and a journalist to start working on stories for the web, so the programmer has the newspaper as his playing field. So what if you pair a journalist with programmers in the programming playing field? You can make the field. You create the platform for a purpose, and then repurpose it for the data, not the story.

It’s hard to explain but hopefully this blog about my progress will reveal whether this experiment will ultimately work.

In the meantime I should endeavor to be more punctual with my blog. My excuse is that I’ve been working on my first news project involving data. There will be a post. There will also be two more Hacks and Hackers Days with ScraperWiki this month. So hopefully new horizons will become clear.

Oh, and if you want to find out how I got the timeline onto this blog, it’s in this post.