Sorting the Social Media Chaos

Posted: May 30, 2011 in My Data Journey

I gave a talk at News Rewired where, between some slight technical hiccups and a tight time frame, I spoke too fast to be coherent. But thanks to the power of Twitter (and the people who go to conferences just to tweet about them!) I’ve realized that there was huge interest in what I was saying (at high speed). I was speaking in the session called ‘Sorting the Social Media Chaos’ along with Alex Gubbay, social media editor of BBC News, Neal Mann, freelance producer at Sky News, and Fergus Bell, senior producer at the Associated Press.

To be put amongst the pros was an honour but also intimidating (which I’m sure added to the speed of my talk). The only leg I had to stand on was the fact that I had been assigned to the Digital Media Team at CNN by its head, Katherine Green (to whom I’m eternally grateful). I was not given a proper job description, and left to my own devices I decided to use the social web as my beat for newsgathering. My talk was based on what I had learnt and a few examples of how I used social media. So for anyone who was there and couldn’t quite catch it all, and indeed anyone who is interested in getting news from the ground as it is happening, here it is in web form:

On any given day the signal coming from the social web takes one of two forms. 1) Noisy – a regular day, where you see the top peaks of trending terms, and stories may be split across terms that apply to the same story. I used to be able to see these using Google ‘Hot Topics’, but that service has gone since I started. By probing the social web for 8 hours every day I began to get a feel for the moods with which it generates noise. The more useless white noise of Lady Gaga, Justin Bieber and Glee began signalling at around 2pm London time, i.e. when the US comes online. By subjecting myself to the social algorithm I was able to see when Mumbai came online at lunch, and a couple of times I even predicted terms which trended the next day. But knowing I was only getting the peaks of the noise levels, I wanted to interrogate it further to get to the interesting stories that were being drowned out. More on this later, but for some of the tools I used for social searching check out this page.

Social Media Signal – Noisy

2) Loud – this is when there is a big breaking news story such as an earthquake, volcanic eruption, tsunami or giant ash cloud. Rather than a lot of peaks you get one large saturated term. I say saturated because if you’re using a client like TweetDeck, or a website that uses the Twitter API like TrendsMap, then there’s a limit to how many times it can hit the Twitter API (so as not to crash Twitter). So you never see everything once the tweets start pouring out faster than your application’s API calls can handle. At this point you become a social media curator, and what you are looking to curate is a link to an image, video or local news/blog post. To verify these you need three pieces of information: place, person and time.

Social Media Signal – Loud

When the Social Media Signal is Loud:

Now the first social instinct for a loud breaking news story is to spread it geographically, i.e. news spreads away from the source. The best way to look at tweets coming from a particular location is TrendsMap. Not only can you see the big trending terms by location, but you can also search for specific terms. By clicking on a term at a location you get the tweets coming in, but also a thumbnail of the pictures and videos attached to that term and some of the top links. The brilliant part of this is that the pictures appear and the videos play out in a frame on the website, so you’re not constantly redirected. Any application that keeps redirecting you is of little use when the social media signal is loud. TrendsMap also has a list of Twitter terms breaking globally at the bottom of the page and can drill down to locations. This is a brilliant, brilliant tool.

One tip for when the strength of the social media signal is maxing out your API: when you’ve found an interesting picture or video, your first reaction should be to scroll down in the tweet stream. If the scroll bar is at the top, the application will keep calling the Twitter API, so you’ll see a flash of hundreds of tweets and then it’ll disappear as the next lot comes in. If you scroll down, the application will stop calling the API and stay on the list where you found your picture/video of interest, i.e. you won’t lose the original tweet the link came with. You will need this for verification.
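That scroll tip is really just a rule in the client’s refresh loop: fetch from the API only while the user sits at the top of the stream. Here is a minimal sketch in Python; all the names are my own inventions for illustration, not TweetDeck’s actual code:

```python
def poll_once(stream, fetch_new_tweets, scrolled_to_top):
    """One tick of a client's refresh loop. Only hit the API while
    the user is at the top of the stream, so a tweet they are
    inspecting further down never scrolls away."""
    if scrolled_to_top:
        return fetch_new_tweets() + stream  # newest tweets go on top
    return stream  # paused: no API call, nothing moves


stream = ["old tweet with the photo link"]
stream = poll_once(stream, lambda: ["flood of new tweets"], scrolled_to_top=True)
stream = poll_once(stream, lambda: ["even more"], scrolled_to_top=False)
# The second call never fetched: the old tweet is still in the list.
print(stream)
```

The same pause is what keeps your found tweet on screen long enough to note the place, person and time.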

Another thing you should be aware of is how tweets appear on Twitter in a language which does not use the Roman alphabet, e.g. Arabic. Writing in these scripts is enabled by something called Unicode, but it is effectively write-only: no trending algorithm (at least at the time) could read it. This is bad in one way, in that no language which doesn’t use the Roman alphabet can produce a trending term. But it’s good in another: all links and hashtags have to use the Roman alphabet, and that’s all you need (since you can throw the body of the tweet into Google Translate). For instance, the Iranians know how to use social media. The underground youth movement that arose during the Iranian election of 2009 used #iranelection to get social information out, and they still use it. It is always a trending term coming out of Tehran. They use it like a beacon to amalgamate all the dissent coming from different areas, whether written in Persian, Arabic or English. For instance, in late 2010 I spotted a wire saying a terrorist threat had been called in to embassies in Greece by an Iranian group. I had checked #iranelection that day and knew that a video of Iranian asylum seekers on hunger strike in Athens being forced to the ground by the Greek police was being broadcast under #iranelection. No doubt the threat was called in to keep the police busy and away from the asylum seekers.
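Because hashtags and links stay in the Roman alphabet even when the tweet body does not, pulling them out is trivial. A sketch in Python (the tweet here is invented for illustration):

```python
import re

# An invented example: a Persian-language tweet carrying a
# Roman-alphabet hashtag and a shortened link.
tweet = "تجمع در میدان آزادی #iranelection http://bit.ly/example"

# With re.ASCII, \w matches only Roman letters, digits and '_',
# which is exactly what hashtags are made of.
hashtags = re.findall(r"#\w+", tweet, re.ASCII)
links = re.findall(r"https?://\S+", tweet)

print(hashtags)  # ['#iranelection']
print(links)     # ['http://bit.ly/example']
```

The rest of the body can then go into Google Translate, as described above.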

The next best tool to sort out the social media chaos is Twitter search. This is not the search bar in Twitter but a well-hidden URL – search.twitter.com – which has an advanced search where you can pare down to place, hashtag, person, date, etc. There are lots of websites that use Twitter in different ways, some of which I’ve listed and explained on my Social Searching page. The only one I find of real value when the social media signal is loud is Topsy. It has a nifty sidebar that marks how many times your search term has appeared in the last hour, day, month and all time. Here I will use an example from my time at CNN. Two months into my 3-month contract a volcano erupted in Indonesia – Mt. Merapi, in fact. After the US and the UK, Indonesia is the third largest country by users on the social web. This meant the TweetDeck and TrendsMap APIs were maxing out. I was looking for an image of the erupting volcano. I saw from the tweets of the news organisations I was following that an old image of the volcano erupting in 2005 was going around social media, so I knew I had to be careful. By looking at #Merapi coming out of Yogyakarta on TrendsMap, I spotted one image that looked different. But it was a retweet. I needed to find the original and verify the location of the user (note: TrendsMap will use the mobile location when a user has allowed it, but will also use the location input by the user, which is usually their home town but may not be their location at the time of tweeting).

I plugged the shortened link into Topsy and saw it hadn’t appeared on Twitter earlier than an hour ago, i.e. the link was fresh. I found the original user by opening up the photo in Twitpic. I pulled up her profile in TweetDeck. Her location was not given in latitude-longitude (when the location is given as a string of numbers with a comma, that’s latitude-longitude: just plug it into Google Maps and you get the location). I added her to my ‘sources’ list and in the TweetDeck column filter box I typed ‘4sq’. This is a Foursquare check-in link on a map. She had checked in to the Atma Jaya University in Yogyakarta 30 minutes before she posted the photo. I looked at it on Google Maps using the terrain layer to confirm the angle of the picture looked correct. On her Twitter profile she had a link to her blog. From the about page I found out she was a Science Communication student and got her real name (she used a nickname on Twitter). I called up the university and asked to be put through to the Science Department, then to her dorm room. She answered the phone, and we got her onto the roof reporting what she saw and followed her throughout the evacuation. A tsunami also struck soon afterwards, and I got a skipper whose boat had capsized and exploded to talk on CNN’s prime time show. He had survived by swimming to an island and climbing to the top of a tree for 3 hours. For me, the best form of verification was to find the source soon after publishing (less than an hour) and track them down by phone or on Skype.
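The latitude-longitude tip is easy to automate. A small helper, sketched in Python with made-up coordinates (roughly Yogyakarta) – the function name and URL shape are my own, not any official API:

```python
def maps_link(location_field):
    """If a Twitter location field is a 'lat, lon' pair of numbers,
    return a Google Maps URL for it; otherwise return None."""
    parts = [p.strip() for p in location_field.split(",")]
    if len(parts) != 2:
        return None
    try:
        lat, lon = float(parts[0]), float(parts[1])
    except ValueError:
        return None  # a town name, not coordinates
    return "https://maps.google.com/?q=%s,%s" % (lat, lon)


print(maps_link("-7.78, 110.36"))  # a Google Maps URL for the coordinates
print(maps_link("Yogyakarta"))     # None - just a place name
```

Anything that fails the numeric test is a free-text home town and needs the slower verification route described above.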

Upwardly Mobile:

Towards the end of my contract I got confident enough in my social media sorting to send information to correspondents on the ground about what was happening in and around them. So when preparing for my talk I began thinking about how to get this information to field producers and reporters so they can act on it. I was thinking mobile phone apps. TrendsMap does not have an app, but what can be used is the Buzz layer on Google Maps. This finds your location and lists the buzzes around you at the top. There’s also a Tweeps Around layer on the augmented reality app Layar. The good thing about this is there’s a radar at the top right that pinpoints the tweets, so if you’re on the ground during a major event it could help you decide where the action is coming from. My experience is that the internal communication chaos of a large media organisation needs sorting just as much as the social media chaos.

From left: Buzz layer on Google Maps, Tweeps Around layar on Layar and search by tags in Diigo apps

So here’s what I would do if I were a social media editor. Firstly, I would use a Mac, a good Mac, for two reasons. i) The command-shift-3 function takes a screen grab, saves it to the desktop and names it by date and time. As a curator I would save not just links but screenshots, as things go down and go missing from the social web, and I have found that people respond to an embedded image in an email more than to a link. (On a PC the ‘print screen’ button saves to the clipboard, so you have to open an application, paste it in and save it, i.e. too time-consuming.) ii) Macs are less prone to viruses. The trend I have been seeing is phishing viruses in social media, as it’s a more effective way to get people to click on a link. If these viruses get your passwords within a news organisation they can spread to internal systems. That being said, I would also run my social media searching off a virtual machine to be safe – expect a big exposure to malware if you’re doing what I’ve listed above. I would set this Mac up with a minimum of three screens: one with TweetDeck, one with the website tools such as TrendsMap and Topsy, and the third with the three Google sites: Translate, Maps and YouTube. Make sure YouTube is logged in with your organisation’s username, as you’ll want to contact people on YouTube via this account.

I would open my website tools in Chrome, because I would have the Diigo extension installed. Diigo is a bookmark-sharing application that also supports sticky notes and highlighted areas on a web page. I would have one account for the whole organisation so everyone can access the bookmarks from a web page, but also because the extension means that if you search in Google for a term covered by a bookmark in Diigo, it will be flagged up in your results. What could make this of the utmost value is that you can tag a bookmark. So for each loud event I would tag the picture/video/blog link with the hashtag (the Chrome widget remembers your last tag so you don’t have to type it out each time) and add notes about the place, person and time. The newsdesk can look at these links and, once verified, add the tag ‘verified’. There’s also a Diigo mobile app where you can search for bookmarks by tag. All the field producer/reporter has to do is set up the app to search by tags (it’s part of the settings) and put the hashtag of the event they’re heading out to cover, plus ‘verified’, into the search bar. Not only will this bring up the links to the images and videos coming from the ground, but you can download them and look at them offline (e.g. when on a flight or away from a 3G signal). It’s something you can pull up on your phone, show to a fixer/translator and ask, ‘Do you know where this is and how far away are we?’

When the Social Media Signal is Noisy:

This is the part I’m really interested in; I’ve come to the area of social investigation after cultivating this blog through social media. Here one can use some of the same tools as when the social media signal is loud. (Indeed, I notified the ground crew of the Nigeria Independence Day bomb before it went off, i.e. I caught the warning call by the terrorist group, and I even caught a conversation between two people working in the pathology offices in Durban and Cape Town suggesting recent car-jacking killings were a disguise for a voodoo cult ritual, as a pattern of organ removal was appearing.) But what you’re mainly trying to do is build a social niche through which you sort the noise. You want to use this to lead you to the social spheres that will provide you with information on the niche. To get started you need to subscribe to the mantra: ‘The best place to make a name for yourself is the web, but you’ll never make your name on the web’.

This may appear contradictory, but the key to unlocking its meaning is in the prepositions. No one would search for Nicola Hughes, and if they did they’d find a theatre actress and a Miss Ireland wannabe. Nicola Hughes means nothing on Twitter. It may mean something to the people who know me in real life, but they are not an online niche. DataMinerUK has meaning and context. DataMinerUK implies a niche, implies functionality and, more so, implies a social service. I only blog and tweet about media-related data and tools. To all those who ask me about non-data-related issues on Twitter I will only reply by DM, as my Twitter stream functions as a filter for data-related news for my followers. I don’t want to generate noise. I want to work as their semantic web, bringing relevancy and linking all the data-journalism-related social spheres (that’s the mission, anyhow!).

Do not create noise; be the filter, and the social spheres of interest will find you. If you are a known journalist from a known institution you generate noise. Be something else and you’ll see a completely different social web. You need to gain valuable followers to lead you into newsworthy social spheres, and your followers need to be embedded in the niche. In that sense, follow your retweets, check out your lists and find the aggregator services like Paper.li. Also, get out into the real world using Meetup, and check out where your followers are meeting using Lanyrd. You can penetrate interesting social spheres by figuring out who the key players are and then letting the social algorithms do the work for you. There are some ‘unintended purpose’ tools I use for these investigative purposes which I cannot blog about, as I don’t want the general public knowing, but this is my main area of interest, and if you find me feel free to ask.

Make Social Media Work for You:

Rather than trying to filter the noise, I am putting out a steady signal on social media which I want to use to generate stories that might not otherwise surface on it. In my previous post I wrote about how I am tweeting out all the Ministers’, Special Advisers’ and Permanent Secretaries’ meetings, gifts and hospitality at No. 10. I am seeding out information to spark a leak of knowledge by someone who knows something of interest (but who might not be aware they know something of interest). I am tweeting three tweets every three hours with a hashtag I have pre-assigned. The theory is that if the hashtag trends, it means one of the tweets is creating a lot of buzz, and I can follow the hashtag trail to find out who and why. A point I have been making all along is that the journalistic platform does not handle data; it does not find stories in data. The big ‘data’ stories like the War Logs and MPs’ expenses were stories by the sheer existence of the data. They were big data dumps that no organisation looked into in detail (I found an interesting log which I want to chase down). The next time a dump hits the headlines, let’s start tweeting out each and every row and find out if anyone anywhere knows something the journalists do not.
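The drip-feed routine can be sketched as a simple scheduler. Everything here is a stand-in – the posting function, the hashtag and the sleep hook are invented for illustration, not my actual setup:

```python
import time

HASHTAG = "#no10data"  # an invented stand-in for the pre-assigned tag

def drip_feed(rows, post, sleep=time.sleep, batch_size=3, interval_hours=3):
    """Tweet the rows out in batches of three, three hours apart,
    each tagged so any resulting buzz can be traced back."""
    for i in range(0, len(rows), batch_size):
        for row in rows[i:i + batch_size]:
            post("%s %s" % (row, HASHTAG))
        if i + batch_size < len(rows):  # no wait after the last batch
            sleep(interval_hours * 3600)
```

In practice `post` would be a call to a real Twitter client and the rows would come straight from the released dataset.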
