Organisation: Eyewitness News (South Africa)
Publication Date: 05/31/2018
Description: The game is on - and everyone’s going to be running the live score, right? Everyone is going to be offering blow-by-blow coverage and stats and player profiles and analysis. All that gets pretty boring, pretty quickly, if you’re not a hardcore fan... But even if you don’t understand what’s happening on the field, you want to be part of the fun. So… how do we as journalists keep non-super-fans up to speed with everything else that’s happening in real time? The buzz off the field, the endless trolling - the crazy in-jokes? And how do we surface those amazing stories, especially if we’re not super-fans ourselves? So many memes, so little time.
The answer: Wild Card. It’s a machine learning platform that surfs social media for you while a sports match is underway - and finds all the weird and wonderful conversations happening around it. Wild Card can sift through hundreds of thousands of public social media posts and alert you to the really interesting stuff. We trained this prototype on more than 250,000 tweets from the UEFA Champions League final, but it can use data from any public source (so think Insta, Facebook, Wikipedia, even Google autocomplete). Once it gets the data, Wild Card performs a quick text clean-up, looking for synonyms and repetitions. It also converts emoji into text so that we can get an idea of how people are feeling as well. And here’s where things really start to get interesting. Wild Card cuts out the noise because it disregards the most overused phrases (that’s what trending is for, right?). It also turfs the most obscure phrases (i.e. the mumbo jumbo). Instead, it searches the sweet spot in the middle, surfacing underground trends and sorting popular phrases into lists that you can scan for interesting tip-offs.
That means you don’t have to read through hundreds of thousands of posts; you can quickly get a bird’s-eye view of what the super-fan community is talking about - and you can jump on anything that piques your interest. As it learns, Wild Card will in future be able to automatically create content cards based on the phrases it surfaces, which a journalist can then edit or approve for publishing. These cards can be pushed onto anything from a custom second-screen mobile app to a Twitter bot or even a messaging app. Eventually, as we train Wild Card to tell good cards from bad ones, it will hopefully become self-sufficient, able to publish the cool stuff that it finds almost instantly, while you get on with the rest of your work. The best part is that Wild Card can be applied to all sorts of stories - from sports, to elections, to big developing news events. If it’s weird and on social, Wild Card will find it for you.
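To make the "sweet spot in the middle" idea concrete, here is a minimal sketch in plain Python. It is not the prototype's code: the project lists gensim's TfidfModel for this step, whereas this sketch swaps in a simple document-frequency band. The function name mid_band_terms, the low and high thresholds and the sample posts are all hypothetical.

```python
from collections import Counter


def mid_band_terms(posts, low=0.05, high=0.5):
    """Return terms that appear in a 'middle band' of posts.

    low/high are hypothetical thresholds: the share of posts a term
    must appear in (inclusive) to be kept. Terms above `high` are the
    overused ones; terms below `low` are the obscure ones.
    """
    n = len(posts)
    doc_freq = Counter()
    for post in posts:
        # Count each term once per post (document frequency, not raw count).
        doc_freq.update(set(post.lower().split()))
    band = {t: c / n for t, c in doc_freq.items() if low <= c / n <= high}
    # Most widely used first; ties broken alphabetically.
    return sorted(band, key=lambda t: (-band[t], t))


# Invented sample posts, for illustration only.
posts = [
    "final tonight goal",
    "final tonight keeper",
    "final keeper gloves",
    "final xqzt",
]
print(mid_band_terms(posts, low=0.5, high=0.75))  # ['keeper', 'tonight']
```

In this toy data "final" appears in every post and is dropped as overused, while "xqzt" appears once and is dropped as obscure. Real thresholds would need tuning against the volume of posts collected.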
Technologies used for this project:
Tweet collection: Python [TwitterAPI] + MySQL + pandas
Text cleanup: Python - nltk WordNetLemmatizer + nltk SnowballStemmer + emoji
Phrase detection + high/low frequency term exclusion: Python - gensim TfidfModel
Similar word detection [currently unused in prototype]: Python - gensim Word2Vec
Frontend: jQuery + slick slide + animate.css
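As a rough illustration of the text cleanup step, the sketch below uses only the Python standard library. It is an assumption-laden stand-in, not the project's pipeline: the prototype lists nltk (lemmatizing and stemming) and the emoji package, neither of which is used here, and synonym handling is left out. Emoji are approximated as characters in the Unicode "So" (Symbol, other) category and named via unicodedata; the function name clean_post is hypothetical.

```python
import re
import unicodedata


def clean_post(text):
    """Turn one raw post into a list of lowercase tokens.

    Emoji become :name: tokens so that sentiment-bearing symbols
    survive as text.
    """
    # Drop URLs, which carry little conversational signal.
    text = re.sub(r"https?://\S+", " ", text)
    pieces = []
    for ch in text:
        if unicodedata.category(ch) == "So":
            # Approximation: treat 'Symbol, other' characters as emoji.
            name = unicodedata.name(ch, "")
            pieces.append(" :" + name.lower().replace(" ", "_") + ": " if name else " ")
        else:
            pieces.append(ch)
    text = "".join(pieces).lower()
    # Collapse runs of three or more identical characters to two.
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)
    # Keep :emoji_name: tokens, plus words, hashtags and mentions.
    return re.findall(r":[a-z0-9_\-]+:|[a-z0-9#@']+", text)


print(clean_post("GOOOOAL 🔥 https://t.co/x"))  # ['gooal', ':fire:']
```

Collapsing repeated letters is one simple way to treat "GOOOOAL" and "GOOOOOOAL" as the same token; a stemmer or lemmatizer, as listed above, would go further by also merging inflected forms of a word.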