Ukweli: Dropping truth bombs on dark social

Organisation: Eyewitness News (EWN) (South Africa)

Publication Date: 11/22/2017

Description

Dark social is the black box of sharing. On private messaging platforms, we can’t track what’s being shared, where it comes from or whether it’s true. So how do we interrupt patterns we can’t track? How do we call bullshit on what we can’t see?

Enter… Ukweli (Swahili for “truth”). It’s a messaging-focused machine learning AI that detects mis- and disinformation on dark social. It’s like antivirus software for your messaging circle: Ukweli detects suspect info and flags it before you pass it on.

We’ve built it as a Chrome extension as a proof of concept, and it works on any text in your browser, but the intention is not for it to live only, or even mainly, on desktop. For now, you download the Chrome extension, select a piece of text and right-click. There’s an option to ask Ukweli to fact-check the text. Wait a moment and Ukweli gets back to you with the probabilities that the message is misinformation, spam or a hoax chain text.

This tool can be developed into a native app that sits on someone’s mobile with a local repository, allowing them to check text messages on their phone without leaving the app they’re using. We plan for Ukweli to work on SMS, Messenger and Telegram, using bots to forward messages to our AI. Hopefully, it’ll even work on the elusive WhatsApp on iOS, using the forward functionality to send a message outside of WhatsApp itself. Down the line, we can build in functionality to check screenshots as well.

So all of that sounds pretty cool, right? But it’s what’s happening behind the scenes that’s really the most important part of this project. What sets Ukweli apart is that it doesn’t need to match exact text to make a call on whether something is misinformation - it uses a database of spam messages and hoax stories to build an understanding of the language of mis- and disinformation.

EWN already runs a daily WhatsApp news service with more than 16,000 active and highly responsive users who regularly pass hoax messages on to us. We’ll leverage this service and our other social media accounts to continuously build up the repository. The AI can be trained on different languages, meaning it can be used throughout Africa and the world, wherever a repository exists in a given language.

If you’ve ever forwarded a message that your gut tells you may be fake, this tool is for you. It saves you from yourself and, with its lightning speed, interrupts the pattern of sharing misinformation. We all know the old adage: “A lie can travel halfway around the world before the truth can get its boots on.” The Ukweli team would argue... not any more.

Technologies used for this project:

We used machine learning to train two models: a neural network classifier built with TensorFlow (for fake news) and a Naive Bayes classifier (for SMS spam). TensorFlow can be used on mobile devices as well as in server-side applications.

The AI works on submitted texts, learns to pick up on the language of mis- and disinformation, and warns you if that message you’re about to forward to your soccer moms group is a big, steaming pile of… misinformation. It performs a similarity check and can warn you that a message you’re about to forward is potentially a hoax, spam or misinformation. Any message that the AI can’t definitively categorise can be passed on to real human beings, in the form of journalists, for investigation via a verification workflow. The result of that manual verification can then feed back into the machine learning model, improving future accuracy.

Existing tech and code that we used:
1. A machine learning algorithm built on the back of open source code.
2. Existing online repositories of spam, hoax messages and misinformation.

What we created during the hack:
1. A Chrome extension, built from scratch.
2. Logic to train the machine learning model using sample data.
3. An API to interface with the machine learning model, to allow for quick similarity checks.

Python was used for the collection and processing of data, the training of the machine learning models and the development of the web application. A number of libraries were used for the machine learning part, including scikit-learn, Keras, TensorFlow and Gensim.
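
The Naive Bayes spam check and the hand-off to human verification described above can be illustrated with a minimal sketch. This is not the team's code: the project used scikit-learn, while this version implements a small multinomial Naive Bayes in plain Python so that it runs without dependencies. The training messages, the `ham` label for legitimate messages, the 0.8 confidence threshold and all function names are assumptions made purely for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical toy training data; the project's real repository of spam
# and hoax messages is not reproduced here.
TRAINING = [
    ("spam", "win a free prize now click this link"),
    ("spam", "free airtime forward this message to ten friends now"),
    ("spam", "urgent claim your free cash prize today"),
    ("ham", "are we still meeting for lunch today"),
    ("ham", "please send me the match report before five"),
    ("ham", "running late see you at the office"),
]


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(examples):
    """Count words per label for a multinomial Naive Bayes model."""
    word_counts = {}
    doc_counts = Counter()
    vocab = set()
    for label, text in examples:
        doc_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for token in tokenize(text):
            counts[token] += 1
            vocab.add(token)
    return word_counts, doc_counts, vocab


def predict_proba(model, text):
    """Return {label: probability} using Laplace (add-one) smoothing."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    tokens = [t for t in tokenize(text) if t in vocab]  # skip unseen words
    log_scores = {}
    for label in doc_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)  # class prior
        for token in tokens:
            likelihood = (word_counts[label][token] + 1) / (total_words + len(vocab))
            score += math.log(likelihood)
        log_scores[label] = score
    # Turn log scores into probabilities; subtracting the peak avoids underflow.
    peak = max(log_scores.values())
    exp_scores = {k: math.exp(v - peak) for k, v in log_scores.items()}
    norm = sum(exp_scores.values())
    return {k: v / norm for k, v in exp_scores.items()}


def triage(probabilities, threshold=0.8):
    """Return the top label, or 'needs_human_review' if it is not confident."""
    label, prob = max(probabilities.items(), key=lambda kv: kv[1])
    return label if prob >= threshold else "needs_human_review"


model = train(TRAINING)
spam_probs = predict_proba(model, "claim your free prize now")
unknown_probs = predict_proba(model, "zebra crossing")
print(triage(spam_probs), triage(unknown_probs))
```

A message whose top probability falls below the threshold is routed to journalists rather than labelled automatically. In the workflow described above, the journalists' verdict would then be appended to the training data before the model is retrained.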