Unitj - unify Google Sheets
Organisation: SRF Data (Switzerland)
Publication Date: 10/27/2016
Description

With our add-on for Google Sheets, we make Google Sheets journalism-compatible.

Let's think of a typical newsroom / investigative journalism workflow: A journalist wants to investigate payments from pharma companies to doctors (cf. "Dollars for Docs"). The journalist defines a data model for these payments. Each pharma company releases data that roughly fits this model, but in very different structures (different order of doctors' names, different column names, different address formats, etc.). Because the collected source data is not sensitive, the journalist decides to store all of it in Google Sheets, so everyone in the newsroom can see the current status of the investigation and correct things manually. So he/she asks co-workers to collect the source spreadsheets from the pharma companies and store them on Google Drive. With our tool, these journalists can then import their collected source sheets into one consistent master sheet, which greatly helps them clean their data and enforce a certain data model.

Why all of this? With a raw dataset that already has a consistent structure, it is much easier later on to do further (automated) preprocessing, e.g. deduplicating companies' names, georeferencing addresses, and so forth. Trust us: We've been there and done that, many times. And we've lost many nerves. For example in our year-long investigation on vested interests of Swiss universities (http://srf.ch/uni) and others (http://srf.ch/data). We know that you can (theoretically) save a lot of time when you enforce a certain structure during data collection. And that's why we are building this tool.

Features:
- Define a master sheet with a data model (column names, types, restrictions, etc.) (already implemented)
- Select a source sheet on your Drive and import it into the master sheet (already implemented)
- Warning console that helps you identify problems in your source sheets (already implemented)
- Split, merge, map source columns to master columns (partially implemented)
- Graphical drag-and-drop toolbox which makes the above operations easy & fun (partially implemented)
- Enforce column types & semantic consistency (will be implemented)
- Maintain uniqueness & traceability (IDs, timestamps, authors) (will be implemented)
- Save a lot of headaches & nervous breakdowns later on (priceless)

Monetization plan? Hell no, this is all going to be F(ree as in free beer)OSS.
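As a rough illustration of the import feature listed above (mapping source columns to master columns, with warnings for problem rows), here is a minimal Python sketch. It is not the add-on's actual code. The master columns, the `import_source` function, the column mapping and the sample rows are all hypothetical and made up for this example.

```python
from dataclasses import dataclass, field

# Hypothetical master data model: master column name -> expected type.
MASTER_COLUMNS = {"doctor_name": str, "company": str, "amount_chf": float}


@dataclass
class ImportResult:
    rows: list = field(default_factory=list)      # rows in the master layout
    warnings: list = field(default_factory=list)  # human-readable problems


def import_source(source_rows, column_map, source_name):
    """Map rows of one source sheet onto the master columns.

    source_rows: list of dicts keyed by the source sheet's own column names.
    column_map:  master column -> source column (assumed to be chosen by the user).
    """
    result = ImportResult()
    for index, row in enumerate(source_rows, start=1):
        master_row = {"source": source_name}  # keep track of where a row came from
        for master_col, expected_type in MASTER_COLUMNS.items():
            raw = row.get(column_map.get(master_col))
            if raw is None or raw == "":
                result.warnings.append(
                    f"{source_name} row {index}: missing {master_col}"
                )
                master_row[master_col] = None
                continue
            try:
                master_row[master_col] = expected_type(raw)
            except (TypeError, ValueError):
                result.warnings.append(
                    f"{source_name} row {index}: {master_col}={raw!r} "
                    f"is not a valid {expected_type.__name__}"
                )
                master_row[master_col] = None
        result.rows.append(master_row)
    return result


# Made-up sample data in the layout of one (fictional) source sheet.
pharma_a = [
    {"Name": "Dr. A. Muster", "Firma": "Pharma A", "Betrag": "1200.50"},
    {"Name": "Dr. B. Beispiel", "Firma": "Pharma A", "Betrag": "n/a"},
]
mapping = {"doctor_name": "Name", "company": "Firma", "amount_chf": "Betrag"}

result = import_source(pharma_a, mapping, "pharma_a")
for warning in result.warnings:
    print(warning)
```

In this sketch a bad value does not abort the import. The row is kept with an empty cell and a warning is recorded, so that someone in the newsroom can correct it by hand in the master sheet.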