Just to let you know that the Twitter account @Scrape_No10, which tweets out ministers’, special advisers’ and permanent secretaries’ meetings, gifts and hospitality, is back up and tweeting. You can read the post about its creation here and download all the data the account contains. This account needs more coding maintenance than the @OJCstatements account (read about it here) because the data is contained in CSV files posted onto a webpage, and I code the sentences to be tweeted from the rows and columns. The scraper feeding the Twitter account feeds off five separate scrapers of the CSV files. Because of this, the account is more likely to throw up errors than the simple scraping of the Office for Judicial Complaints site.
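To give a flavour of what "coding sentences from the rows and columns" can look like, here is a minimal Python sketch. It is not the actual scraper code: the column names (`Minister`, `Date`, `Organisation`, `Purpose`), the sample rows and the helper names `row_to_tweet` and `tweets_from_csv` are all made up for illustration, and the 140-character limit is an assumption about what Twitter allows.

```python
import csv
import io

# Hypothetical sample data; the real column names in the published CSVs may differ.
SAMPLE_CSV = """Minister,Date,Organisation,Purpose
Example Minister,May 2011,Example Organisation,General discussion
Another Minister,June 2011,,Introductory meeting
"""

MAX_TWEET_LENGTH = 140  # assumed character limit for a tweet


def row_to_tweet(row):
    """Build one sentence from a CSV row, or return None if key fields are missing."""
    minister = (row.get("Minister") or "").strip()
    organisation = (row.get("Organisation") or "").strip()
    if not minister or not organisation:
        # Incomplete rows are a common source of errors, so skip them.
        return None
    date = (row.get("Date") or "").strip()
    purpose = (row.get("Purpose") or "").strip()
    sentence = f"{minister} met {organisation}"
    if date:
        sentence += f" in {date}"
    if purpose:
        sentence += f" ({purpose})"
    if len(sentence) > MAX_TWEET_LENGTH:
        # Truncate and mark the cut with an ellipsis.
        sentence = sentence[: MAX_TWEET_LENGTH - 3] + "..."
    return sentence


def tweets_from_csv(text):
    """Turn CSV text into a list of tweetable sentences, dropping unusable rows."""
    reader = csv.DictReader(io.StringIO(text))
    return [t for t in (row_to_tweet(r) for r in reader) if t is not None]


if __name__ == "__main__":
    for tweet in tweets_from_csv(SAMPLE_CSV):
        print(tweet)
```

The second sample row has no organisation, so it is skipped rather than tweeted as a half-finished sentence. Rows like that are the kind of thing that can trip up a scraper built on CSV files.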
So I decided, as I’m learning to code and structure scrapers, to run the scrapers manually every time the Twitter account stops, fix the bugs and set the account tweeting again. There will be better ways to structure the scrapers, but right now I’m concentrating on the coding.
Learning to scrape CSVs is very handy, as lots of government data is released as CSV. There is a CSV documentation/tutorial on ScraperWiki, although it is aimed at programmers. For those interested in learning to code/scrape I would recommend “Learn Python the Hard Way” (which is the easiest for beginners; it’s just ‘hard’ for programmers because it involves typing code!). For more front-end work I have recently discovered Codecademy. I can’t vouch for it but it looks interesting enough. I have also put all the datasets for the @Scrape_No10 account on BuzzData as an experiment.
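For anyone wondering what scraping a CSV involves in practice, here is a small Python sketch using only the standard `csv` module. It is not taken from the ScraperWiki tutorial or from my scrapers; the function name `parse_csv`, the sample text and the tidying rules (lower-casing headers, dropping blank rows) are my own assumptions about what a messy CSV might need.

```python
import csv
import io


def parse_csv(text):
    """Return a list of dicts with tidied headers, skipping empty rows."""
    reader = csv.reader(io.StringIO(text))
    # Keep only rows that contain at least one non-blank cell.
    rows = [r for r in reader if any(cell.strip() for cell in r)]
    if not rows:
        return []
    # Normalise headers: trim spaces, lower-case, use underscores.
    headers = [h.strip().lower().replace(" ", "_") for h in rows[0]]
    return [dict(zip(headers, (c.strip() for c in r))) for r in rows[1:]]


if __name__ == "__main__":
    # Hypothetical CSV text with stray spaces and a blank row.
    sample = " Name of Minister ,Gift\n,\nExample Minister, Example gift \n"
    for record in parse_csv(sample):
        print(record)
```

In a real scraper the text would first be downloaded from the webpage, for example with `urllib.request.urlopen(url).read().decode("utf-8")`, where the URL and the encoding depend on the file in question.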