Archive for the ‘News Story’ Category

It’s been a while since I liberated any data and that’s because I’ve been wrestling with a scraper of Government Salaries. I’ve only looked at pay floor for the tables below. This is the minimum pay with ceiling pay being £4,999 more than the floor. Salaries ranged from £35,000 to £235,000.

There was the coding to deal with, of course. I’ve had my third formal lesson. Lists, ooh er. But the main difficulty came from the awful state the data was in. All CSV files but spread over the web with a blatant disregard for consistency. Anyway, you can download the data for yourself. There are 440 rows so looking back at the poor quality of the data one could easily have done a copy and paste job, but a scraper that trawls a site for the links to the pages that have the download ensures you get all the data sets and makes the next collection a matter of hitting a button (hopefully).
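A rough sketch of what such a link-trawling step could look like in Python. This is not the actual scraper: the page markup, the example.org URLs and the names CsvLinkParser and find_csv_links are all made up for illustration, and a real run would fetch the HTML from the site first.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class CsvLinkParser(HTMLParser):
    """Collect the href of every <a> tag that points at a .csv file."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.csv_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Ignore any query string when checking the file extension.
        if href and href.lower().split("?")[0].endswith(".csv"):
            # Resolve relative links against the page they were found on.
            self.csv_links.append(urljoin(self.base_url, href))


def find_csv_links(html, base_url):
    parser = CsvLinkParser(base_url)
    parser.feed(html)
    return parser.csv_links


# Hypothetical index page standing in for a departmental download page.
sample_html = """
<ul>
  <li><a href="/data/dept-a-salaries.csv">Department A</a></li>
  <li><a href="about.html">About</a></li>
  <li><a href="https://example.org/files/dept-b.CSV">Department B</a></li>
</ul>
"""

links = find_csv_links(sample_html, "https://example.org/transparency/")
print(links)
```

Once the list of links is collected this way, re-running the collection only means fetching the index page again, which is the "hitting a button" part.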

Here is the top 10 for pay:

If you’re looking to climb your way up to the top pay band then here’s the top 10 departments:

And since the Information Commissioner has ordered the release of names of high-paid civil servants who did not want their salaries disclosed, the data shows 48 names have been withheld, amounting to £508,000 in cumulative pay. Most of these redactions came from the Cabinet Office and Home Office. I can understand why names from the Office for Security and Counter-Terrorism Unit were not disclosed. The one consistent pattern in the withholding of names is the lawyers. The entire list of names for the Office of the Parliamentary Counsel is N/D as is the legal advisers branch of the Cabinet Office. Also not named are the Media Director and Communication and Change Director. Here are the details where the names are not disclosed:


If you follow Scrape_No10 on twitter you will be receiving all the meetings, gifts and hospitalities received at No.10 by ministers, special advisers and permanent secretaries. If you follow #Scrape10 then you should be getting those as well as all the tweets relating to any item of data. The current database should be tweeted out by July.

The information contained in each tweet comes from the data published by the Cabinet Office. I scraped each data set and sorted the aggregated information chronologically, so that it can be tweeted out in the order in which the events happened (not every day was given). The links to the original data source and the scraped data are given at the bottom of this post.

With respect to what I have done, I would like to remind people that Freedom of Information does not equate to contextual knowledge regarding the information or making it useful. I am a data journalist. Data is my beat. But data is a public right. Not just your data but also the data of the people who work for you – government data.

Journalism involves information but also conversation. Each data entry now has the ability to start a conversation. Just use #Scrape10. If a tweet is interesting or someone somewhere has added a piece of news relating to the tweet, then #Scrape10 should trend and the tweet should be sent around to the community where the information matters. That’s the theory.

Information is now socially enabled and should be socially enabling. What you would like to know, what matters to you or your wider social community should not only be made available to you but should be made useable in a way that matters.

I am an experimental data journalist, playing with code. All my source code for collecting this data is open and you can download the entire dataset. The code to get it onto twitter is not available, as the publication of the authorisation keys would allow people to hack into the account. I have also written a scraper to store #Scrape10 tweets into a database every day, so you can catch them all here if you want to.

You can also read a previous post on the Special Advisers’ gifts and hospitality dataset here.

Source | Scraper
Permanent Secretaries’ Meetings | Permanent Secretaries’ Meetings
Ministerial Meetings | Ministerial Meetings
Ministerial Hospitality | Ministerial Hospitality
Ministerial Gifts | Ministerial Gifts
Special Advisers’ Gifts and Hospitality | Special Advisers’ Gifts and Hospitality

The Cabinet Office, in a move towards greater transparency, are attempting to publish all their data online. This isn’t really news but I don’t think news organizations are looking at this data so I’m scraping it and seeing what it has to offer. So as an exercise I’m scraping the page where ministerial gifts, hospitality, travel and meetings with external organisations are published as CSV or PDF. All this should be pretty much covered by Who’s Lobbying but I’m hoping to set up a little social media experiment (more on that to come). So here is all the data, set to scrape the site every month. You can download it all.

I whacked it into Google Refine to deal with the different spellings, nuances and the change in the format of the date. The date transformation option never seems to work for me in Refine so I exported it and opened it up in Excel to get the data out in chronological order. This may sound cumbersome to those who don’t work with data, but it’s actually quite quick and easy once you’ve tried it. Anyway, I looked at some of the more popular reasons for meeting ministers and grabbed a screen shot of the Excel table (Refine allows you to export an HTML table but I’ll have to get it to open up in Firefox so I can use my full page grab add-on).
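For anyone who would rather do the date step in code than in Excel, here is a minimal sketch in Python. The list of date formats and the sample rows are assumptions for illustration; the real files may well use other formats, and parse_date is a hypothetical helper, not part of any scraper mentioned here.

```python
from datetime import datetime

# Formats assumed for illustration; extend the list as new variants turn up.
KNOWN_FORMATS = ["%d/%m/%Y", "%d %B %Y", "%B %Y", "%Y-%m-%d"]


def parse_date(text):
    """Return a datetime for the first format that matches, else None."""
    cleaned = text.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt)
        except ValueError:
            continue
    return None


# Made-up rows with the kind of mixed date entries described above.
rows = [
    {"date": "14 June 2010", "purpose": "Introductory Meeting"},
    {"date": "03/05/2010", "purpose": "Big Society"},
    {"date": "July 2010", "purpose": "Renegotiation of Contract"},
    {"date": "sometime", "purpose": "Unknown"},
]

for row in rows:
    row["parsed"] = parse_date(row["date"])

# Chronological order; rows whose date could not be parsed go to the end.
ordered = sorted(
    rows,
    key=lambda r: (r["parsed"] is None, r["parsed"] or datetime.max),
)

for row in ordered:
    print(row["date"], "|", row["purpose"])
```

An entry that only gives a month, such as "July 2010", is treated as the first of that month, which is a choice worth keeping in mind when reading the sorted output.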

I looked at the meetings for Big Society:

The major meeting with the Prime Minister and Deputy Prime Minister in May involved Young Foundation, Community Links, Antigone, Big Society Network, Balsall Heath Forum, London Citizens, Participle, Talk About Local, CAN Breakthrough, Mayor of Middlesbrough, Business in the Community, Esmee Fairbairn, Greener Leith, St Giles Trust, Big Issue Invest, Kids Company. Since then there has been a steady trickle of over 30 meetings with Nick Hurd, Oliver Letwin and Francis Maude about Big Society. Note that these are all Conservative MPs so the Big Society is already looking smaller along coalition party lines.

Sure, they have the titles to be involved but the trend in the data seems to be more about big financing. Meetings with the likes of Goldman Sachs, Barclays, British Bankers’ Association and Co-op Financial Services lead one to believe that Big Society is being outsourced to local communities but the big financing has to come from the top. In Building the Big Society, the Cabinet Office writes:

We will use funds from dormant bank accounts to establish a Big Society
Bank, which will provide new finance for neighbourhood groups, charities,
social enterprises and other nongovernmental bodies

What are ‘funds from dormant bank accounts’ and why didn’t they use these instead of looking to the government to bail them out? The banks and their reckless trading in toxic assets and credit default swaps led to a massive recession. This shed light on reckless government borrowing and the massive deficit. This led to budget cuts to local services and the need for the Big Society. Which is now being funded by the banks! Am I missing something?

The next thing to look at from the data is the category ‘Introductory Meeting’:

Introductory meetings interest me as I imagine it pays to be at the back of a politician’s mind. It must be worthwhile to have some ear time and get your points across. I’m sure not any old Joe can get an introductory meeting. There must be PR companies that specialise in getting these meetings (lobby firms) so it’s interesting how many large companies are going to appear on this list. In fact, the purpose for one meeting was put down as ‘Lobbying’ with UK Public Affairs Council. They have a register of firms and clients published in evil PDF (go figure). Will have to scrape that.

Lastly, I thought the ‘Renegotiation of Contract’ category might be of interest so here it is:

A lot of these are big technology companies yet the government is notorious for accumulating huge costs with little effectiveness when it comes to implementing new IT systems. I also wonder whether Vodafone’s tax dispute was known during the negotiation of their contract with the Cabinet Office.

I’m getting the data out so that anyone with inside knowledge can put two and two together to further the information. I’m churning the data in so that what can be churned out is journalism and not churnalism. That’s the idea anyway. Just looking at the data is a step in the right direction so anyone interested in data journalism, just keep on looking at what’s coming out. And I’ll try and put it into a context that has journalistic value.

A recent blogpost by TotalPolitics says:


In order to get on top of growing mountains of correspondence and keep on digging through acres of committee and legislative papers MPs are having to take on more staff on a fixed staffing allowance, either paying lower wages or taking people on a volunteer basis.

This comes off the back of Nick Clegg’s initiative to get Westminster interns paid. The blog also addresses Westminster pay in general, quoting a staff survey. For a clearer picture a ScraperWiki user, MemeSpring, scraped the jobs data from Work4MP. This is the historic data from when the site first started in 2004.

So I threw the data into Google Refine and of the 2,661 job postings 30% were unpaid internships (791). Shockingly, there was only ever one internship posting that paid minimum wage and this was with Citigate Dewe Rogerson.
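The same kind of count can be sketched in a few lines of Python. The postings, employer names and the keyword list in is_unpaid are invented for illustration and are not the wording used in the Work4MP data; only the last line uses the real figures quoted above.

```python
from collections import Counter


def is_unpaid(salary_text):
    """Crude keyword check; the keywords are assumed, not the site's labels."""
    text = salary_text.lower()
    return any(word in text for word in ("unpaid", "expenses only", "voluntary"))


# Made-up postings standing in for the scraped job adverts.
postings = [
    {"employer": "MP office A", "salary": "Unpaid, travel and lunch expenses"},
    {"employer": "MP office A", "salary": "Expenses only"},
    {"employer": "Charity B", "salary": "£18,000"},
    {"employer": "Party C", "salary": "Voluntary"},
    {"employer": "Agency D", "salary": "National minimum wage"},
]

unpaid = [p for p in postings if is_unpaid(p["salary"])]
share = len(unpaid) / len(postings) * 100
by_employer = Counter(p["employer"] for p in unpaid)

print(f"{len(unpaid)} of {len(postings)} postings unpaid ({share:.0f}%)")
print(by_employer.most_common(1))

# The same arithmetic on the figures quoted above: 791 of 2,661 postings.
quoted_share = round(791 / 2661 * 100, 1)
print(quoted_share)
```

On the quoted figures this works out at just under 30%, which matches the rounded share given above.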

The highest demander of unpaid interns is actually the British Youth Council (with 18 postings) and the MP who advertised the most for unpaid work is Liz Lynne with a total of 10 internship positions. Now these internship listings include Parties and groups like Alcohol Concern. But looking at just MPs, Parties and Westminster, they account for over 300 unpaid positions with most ‘salaries’ consisting of travel, lunch and reasonable expenses.

It’s also odd that this is Nick Clegg’s initiative because the vast majority of internships sought by political party groups come from the Liberal Democrats.

So when TotalPolitics writes in defence of Government pay:


The vast majority of people involved in politics are volunteers – canvassers, committee members, deliverers, agents and organisers who want their party to succeed and gain office; of those few who are paid they are by and large paid poorly and work extraordinarily long hours, with precious little thanks.

Could this not be read the other way around? MPs getting paid whilst using an army of young naive interns to do their work for free. No doubt they put these interns on their list of expenses.

The road to No.10 is paved with advisers, they lead you in, they open doors. Often for themselves. Previous advisers include Alastair Campbell, Ed Balls and the Miliband brothers. Until they’re in the door they generally don’t command the political spotlight. That is, unless they’re on the way out like Andy Coulson. What they do command is fine wining and dining.

The Cabinet Office publishes Special Advisers’ gifts and hospitality in various Excel sheets that are filled in depending on how much coffee the civil servant had that morning, i.e. inconsistently. They weren’t even consistent with the appointed minister the adviser falls under. So I scraped it and put all the files into one download which covers May to September 2010. You can get it all by hitting the ‘Download spreadsheet (CSV)’ link here.
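To give a flavour of what pulling inconsistent sheets into one download involves, here is a minimal Python sketch. It is not the scraper itself: the column headers, the alias table and the two sample files are assumptions made up for illustration, and the sheets are taken to have been exported to CSV text already.

```python
import csv
import io

# Hypothetical header variants mapped to one canonical column name.
HEADER_ALIASES = {
    "special adviser": "adviser",
    "name of special adviser": "adviser",
    "date": "date",
    "date of hospitality": "date",
    "organisation": "giver",
    "name of organisation": "giver",
    "hospitality": "hospitality",
    "type of hospitality received": "hospitality",
}
CANONICAL = ["adviser", "date", "giver", "hospitality"]


def normalise_rows(csv_text):
    """Yield dicts keyed by canonical column names, skipping blank rows."""
    for raw in csv.DictReader(io.StringIO(csv_text)):
        row = {name: "" for name in CANONICAL}
        for header, value in raw.items():
            key = HEADER_ALIASES.get((header or "").strip().lower())
            if key:
                row[key] = (value or "").strip()
        if any(row.values()):
            yield row


# Two made-up files with different headers for the same information.
file_one = (
    "Special Adviser,Date,Organisation,Hospitality\n"
    "A. Adviser,May 2010,Some Newspaper,Lunch\n"
)
file_two = (
    "Name of Special Adviser,Date of Hospitality,"
    "Name of Organisation,Type of Hospitality Received\n"
    "B. Adviser,June 2010,Some Broadcaster,Dinner\n"
    ",,,\n"
)

merged = [row for text in (file_one, file_two) for row in normalise_rows(text)]

# Write everything out as a single CSV with one consistent header.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=CANONICAL)
writer.writeheader()
writer.writerows(merged)
print(out.getvalue())
```

The alias table is where the inconsistency gets absorbed: each new spelling of a header found in a sheet becomes one more entry rather than a change to the code.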

The Trends:

Here are the advisers listed according to the amount of hospitality they received:

Note that Nick Clegg’s chief adviser, Jonny Oates, has been taken out the most, followed by the then PM’s communications chief, Andy Coulson. Most hospitality is provided by media organisations (see table below) and by using Google Refine I dug deeper into the data to look for a bias between advisers for the Prime Minister and Deputy Prime Minister (seeing as there’s a party split). It turns out the BBC only court Cameron’s advisers (15 times in 5 months). The same is true of the Daily Mail. Whereas The Financial Times dine only with those close to Clegg. The Guardian similarly enjoy Lib Dem company, inviting them to their table twice as many times as they did the Tories.
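The split described here is a simple cross-tabulation of hospitality giver against whose adviser was entertained. A minimal Python sketch of that count follows; the records are invented and "principal" is an assumed column naming the adviser's boss, so none of the numbers below come from the real dataset.

```python
from collections import Counter, defaultdict

# Made-up (giver, principal) pairs, one per hospitality entry.
records = [
    ("Broadcaster X", "Prime Minister"),
    ("Broadcaster X", "Prime Minister"),
    ("Newspaper Y", "Deputy Prime Minister"),
    ("Newspaper Z", "Prime Minister"),
    ("Newspaper Z", "Deputy Prime Minister"),
    ("Newspaper Z", "Deputy Prime Minister"),
]

# One Counter per giver, counting entries for each principal.
table = defaultdict(Counter)
for giver, principal in records:
    table[giver][principal] += 1

for giver in sorted(table):
    print(giver, dict(table[giver]))

# Givers that only ever entertained one side of the coalition.
one_sided = sorted(g for g, counts in table.items() if len(counts) == 1)
print(one_sided)
```

With only five months of entries the counts are small, so a one-sided giver in a table like this is a prompt for questions rather than proof of a bias.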

What’s very noticeable from this information is that Special Advisers are wined and dined mostly by media organisations. Here is a list of the top 10 hospitality givers:

If you add up all of Rupert Murdoch’s empire, they account for 20 occasions split 13:7 Cameron’s to Clegg’s.

The close relationship between advisers and media organisations (this is all within a five month period) makes me wonder: when a ‘No.10 insider’ or ‘someone close to the Prime Minister’ is quoted, how often is that piece of information plucked from the lips of these well-fed advisers? A lot I imagine.

The Outliers:

In fact, media and PR are so predominant in hospitality for advisers, I’ve decided to list the rest of the givers in order of how many times they appear in the data: Bell Pottinger (mostly business clients, Airbus, Sky, Unilever, etc), News Corporation, Tetra Strategy (clients include the Government of Dubai and the jailed Russian billionaire, Mikhail Khodorkovsky), The Daily Telegraph, The Mail on Sunday, The Sunday Times, The Telegraph, Alexander Kutner, Baron Wolfson of Aspley Guise (Conservative life peer and CEO of Next), Business in the Community, Center for Court Innovation (New York think-tank), Citi, Connect Communications (lobbying), Demos (think-tank), General Sir Richard Dannatt, ITN, ITV, Ian Osborne and Partners, Institute for Public Policy Research (think-tank), Islamic Relief, James Kempton, Lansons Communications (clients include J.P. Morgan, Lloyds TSB and Barclays), London Palladium (Whoopi Goldberg), Malaria No More, Martyn Rose, Medley Global Advisors (“provider of macro policy intelligence service for the world’s top hedge funds, institutional investors, and asset managers”), News International, Lawn Tennis Association, Not to Scale, Open Road, Pakistan International Airlines, Policy Exchange (think-tank), RSA, Ramesh Dewan, Richard Thaler, Royal Bank of Scotland, SAB Miller (De Klerk Foundation Event), Save the Children, Taxpayers’ Alliance, The Daily Telegraph and The Daily Mirror, The Economist, The Evening Standard, The Spectator, The Sun, The Sunday Express, UK Music, Wall Street Journal and Wellington College.

Bell Pottinger and Martyn Rose are now with the Big Society Network.

Only six entries weren’t lunch or dinner dates. Steve Hilton was given champagne from Not to Scale, Tim Chatwin received concert tickets from Malaria No More, Naweed Khan got his flights upgraded by Pakistan International Airlines, Andy Coulson was given theatre tickets by Whoopi Goldberg and a bottle of wine by one Alexander Kutner.

The Anomalies:

Now the only Alexander Kutner I can find happens to have been the Vice President and Principal Engineer of Software Development at Electronic Evidence Discovery. They reduce the risk of electronic discovery, a process which involves digital forensics analysis for recovering evidence. (See comment below regarding the identity of Alexander Kutner.)

Also, Ian Osborne and Partners, who dined Tim Chatwin, has no existence according to Google.

The MetaData:

What’s missing is what went on at these meals. Who attended. What was said, or agreed upon. Who was being represented. What goes on is not an entry in the data sheets and it never will be. But this data should make you more aware of the existence of these meals on deals.

You can find a list of Special Advisers and their salaries here.

As part of my data journey, I’m learning to scrape. And so I’m looking for small pieces of data in the usual forms to work on first. That being said, I decided to scrape a csv file of UK Ministerial Gifts received in Cabinet Office 2009-10.

For all you novices out there, csv is a basic spreadsheet format which you can open in Excel so it’s fairly usable. That being said, from a data journalism point of view it was less than clean. All the departments were entered and ones which didn’t receive anything had a ‘NIL RETURN’ entry under ‘Minister’. I have no use for that. And the department entry was left empty if the next gift fell under the same department. I had to fix that with code. The entry of dates is appalling. But my main issue is with the data collection. Only gifts worth over £140 are registered. I doubt some poor civil servant is calling up foreign dignitaries to ask how much that bottle of wine you gave the PM is worth, so most gifts are valued at ‘Over limit’. Regardless, as an exercise, here’s what’s of interest:
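The two fixes described above (carrying the department down into blank cells and dropping the ‘NIL RETURN’ rows) can be sketched in Python like this. It is an illustration under assumptions rather than the original code: the column names and the sample rows are made up, and only the ‘NIL RETURN’ and ‘Over limit’ markers come from the file as described.

```python
import csv
import io

# Made-up extract in the shape described above.
raw = """Department,Minister,Date,Gift,From,Value
Department A,Minister One,June 2009,Rug,Visiting delegation,Over limit
,Minister One,July 2009,Wine,Visiting delegation,Over limit
Department B,NIL RETURN,,,,
Department C,Minister Two,Aug 2009,Hamper,A donor,Over limit
"""


def clean_gifts(csv_text):
    """Fill blank department cells from the row above and drop NIL RETURN rows."""
    cleaned = []
    current_department = ""
    for row in csv.DictReader(io.StringIO(csv_text)):
        department = row["Department"].strip()
        if department:
            # A non-empty cell starts a new department block.
            current_department = department
        if row["Minister"].strip().upper() == "NIL RETURN":
            # Departments that received nothing are of no use here.
            continue
        row["Department"] = current_department
        cleaned.append(row)
    return cleaned


gifts = clean_gifts(raw)
for gift in gifts:
    print(gift["Department"], "|", gift["Minister"], "|", gift["Gift"])
```

The order of the two checks matters a little: the department is remembered before the NIL RETURN test so that a skipped department never leaves a stale name to be copied into later rows.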

The King of Saudi Arabia, Abdullah bin Abdul Aziz, gave Alistair Darling jewellery! He also gave Gordon and Sarah Brown an ornament and jewellery.

Nick Keller, founder of Beyond Sport, gave Tessa Jowell a travel alarm clock worth over £140. Beyond Sport ambassadors include Tony Blair, Michael Johnson and Dame Kelly Holmes.

Other gifts by non-dignitaries include: Bathrobe, slippers, towels set, and a bed linen set for Gordon Brown from Enrico Marinelli, EMI gave a selection of CDs to Ben Bradshaw (some of which he purchased), Lola Rose gave jewellery and a scarf to Sarah Brown, Naomi Campbell gave her a hamper, and Sir Gulam Noon (a controversial Labour donor) gave Gordon Brown a hamper.

No. 10 must be full of rugs, 3 from Pakistan, 1 from Afghanistan and 1 from Azerbaijan.

Wine given by Nicolas Sarkozy and the President of Algeria, Abdelaziz Bouteflika, was used for official entertainment whereas that given by the President of Tunisia and the Sultan of Brunei was given to charity. Either they didn’t bring good enough wine or Nicolas Sarkozy and Abdelaziz Bouteflika didn’t trust No. 10 to stock good enough wine.

I think I’m going to hack gifts for a bit so stay tuned.

So far I’ve called myself a data journalist. But then again Paris Hilton calls herself a business woman. From my previous post, you can see my interest growing. But I haven’t really done anything. I am trying to learn the skills. These skills aren’t actually known and neither is the job description. But the best way to learn is to do. So here’s something I actually worked on.

This is a visual made from the most inaccessible (both data and journalistically) PDFs of the National Asset Register. The information it contained was used for a Dispatches live debate and this repurposing was put into an article on the Channel 4 News website. I was fortunate enough to be part of the ScraperWiki team that took on the project and produced it in a matter of days. I have written a blog post on ScraperWiki here.

We also made a map of county council brownfield sites available for redevelopment which featured on Channel 4. I actually made the scraper for this data set as it was contained on Excel sheets by region on the Homes and Communities Agency website.  The links to all the scrapers and code can be found on the ScraperWiki blogpost.

These show something of what ScraperWiki can do. I particularly like the fact that the bubbles link back to the data in the PDF. I think if you engage people in a data driven story with simple and effective visuals then they can consume the raw data, and possibly provide better insights.

The map is good in that it allows users to get to their local data, where it matters to them. Local information from a global story. Yet what tickled my journalistic senses is the use of feedback. I suggested to the ScraperWiki team that we always ask for feedback on what we do. We got responses on the story. Not great ones. One rather bluntly told us that a school was already being built on the land. The latest data available is only as recent as 2008.

But I engaged this user and found that the bluntness of the response was owing to the fact that the land was ‘once a well loved open space’. If he/she felt so strongly about this change of use I suggested requesting the consultation documents from the council under the Freedom of Information Act and pointed in the direction of WhatDoTheyKnow. Part of ScraperWiki’s remit is building a data democracy and data driven journalism should go some way to promoting an information democracy. I think news organizations fear this as their revenue is linked to their role as information gatekeepers. But social media and the web are breaking down this ideology.

Power comes in breaking down information structures. I wanted to do more with the asset bubbles. Looking back, the orbits are connected to the bureaucratic structure of the data. Given enough time, I would have liked the visual to build an asset pyramid. Where larger bubbles (those with assets of the highest value) would float to the top and lots of little bubbles would form the base of the pyramid. So by looking down you see the more asset intense areas of government which the country has invested in. But when you smash the bubble the components would then fall to their various levels. So by looking into the levels you see all the little areas, museums, barracks, hospital, etc, that equate to similar fixed asset investments. This would break through the departmental structure that was built in the chapters of PDFs.

For that, I need to learn to code. So I’d better get back. Again, if I’m able to do anything of interest you’ll hear about it!