Argentina's Senate Expenses 2004-2013
Organisation: La Nacion (Argentina)
Publication Date: 06/04/2014
Description: Data Driven Investigation (Large Newsroom). After finding out that the Senate had published its expenses since 2004 as raw PDFs, some of them image-only and completely unstructured, the LA NACION data team managed to scrape, transform, normalize and structure three datasets into one. The team then began a data interrogation process that led to front-page stories, replies from current and former vice presidents of Argentina (who also serve as presidents of the Senate), and the reopening of a judicial investigation into Vice President Amado Boudou regarding these expenses.
Technologies used for this project:
Project phases and tools:
1) An application based on Excel macros (VBA) to download the files (it searches 4 different sections of the site).
2) Remove the PDFs' protection against printing and copying.
3) Convert the PDFs with OmniPage 18.
4) Analyze and parse the data.
5) Generate a 33,000-row dataset in Excel.
6) A macro that analyzes the 33,000 rows, searching for "SECURITY AGENTS", the name of the Senator, the number of bodyguards, the destination, the dates and the money requested.
7) Microsoft Project to generate a Gantt chart that showed, on a timeline, the distribution of the trips and their overlaps.
8) Tableau Public for interactive data visualizations.
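The keyword search in phase 6 and the overlap check behind the Gantt chart in phase 7 can be illustrated with a short Python sketch. This is not the team's Excel/VBA macro or their Microsoft Project workflow. The `Trip` fields, the function names, and the choice to compare only trips recorded under the same senator are all assumptions made for illustration, and the original dataset layout is not described in this entry.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date
from itertools import combinations

KEYWORD = "SECURITY AGENTS"  # the phrase the phase 6 macro searched for


@dataclass(frozen=True)
class Trip:
    # Hypothetical field names; the real dataset columns are not documented here.
    senator: str
    destination: str
    start: date
    end: date  # treated as inclusive


def security_rows(rows: list[str]) -> list[str]:
    """Keep only the rows that mention the keyword, ignoring case."""
    return [row for row in rows if KEYWORD in row.upper()]


def overlaps(a: Trip, b: Trip) -> bool:
    """Two inclusive date ranges overlap when each starts on or before the other ends."""
    return a.start <= b.end and b.start <= a.end


def overlapping_trips(trips: list[Trip]) -> list[tuple[Trip, Trip]]:
    """Return pairs of trips under the same senator whose dates overlap.

    Grouping by senator is an assumption of this sketch.
    """
    by_senator: dict[str, list[Trip]] = {}
    for trip in trips:
        by_senator.setdefault(trip.senator, []).append(trip)

    pairs: list[tuple[Trip, Trip]] = []
    for group in by_senator.values():
        ordered = sorted(group, key=lambda t: t.start)
        for a, b in combinations(ordered, 2):
            if overlaps(a, b):
                pairs.append((a, b))
    return pairs
```

Each pair returned by `overlapping_trips` corresponds to two bars that would overlap on the timeline, which is the kind of pattern a Gantt chart makes visible at a glance.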