The Norwegian Victims of World War II

Organisation: VG (Norway)

Publication Date: 04/04/2016

Size of team/newsroom: large

Description

Norway was under Nazi occupation from 1940 to 1945. Although the Wehrmacht’s behavior was less brutal in Norway than in many other European countries, we still had our share of victims. The year 2015 marked the 75th anniversary of the invasion and the 70th anniversary of the liberation. During the winter, an ambitious idea emerged: would it be possible to map every Norwegian who perished as a victim of the Second World War? The main goal was to create a fully fledged digital presentation of the victims: Who were they? What did they do? How did they die?

Initially we thought that most stories about Norwegian war history had already been told. VG is a newspaper created by the Norwegian resistance shortly after the German defeat in 1945, and we have a tradition of telling World War II stories. After asking around, we learned that the Norwegian victims had not been mapped since 1948. It quickly became clear that our main source would be “Våre falne” (“Our Fallen”), a four-volume book series released by the Norwegian government in 1949, with information about 11,724 Norwegian victims of the war. The books included short biographies and pictures of (almost) every person.

The data was not digitized in any way. Luckily for us, the Genealogy Society of Norway (DIS Norge) had digitized parts of the volumes, and we quickly reached an agreement on sharing the data. Their database contained names and key information, but not the biographies or images. We then asked the National Library of Norway for a digital copy. At first we were told that the biographies were copyrighted to the authors of the volumes. We argued that the books as a whole should be in the public domain under the Norwegian freedom of information act (offentleglova), as they were planned and published by the Norwegian government and financed by the Norwegian municipalities. The Library’s legal department spent one hour considering our claim, and then basically answered: “Yes, you’re right, no one thought of it that way before.”

We then combined these data with additional data from the Norwegian Warsailors association, which gave us permission to use images and the stories of each ship that was sunk during the war (around 1000 ships). To complete our map, we combined these data in turn with information from the excellent site Warsailors (http://warsailors.com). German submarine captains neatly logged the positions of the ships they sank. This information wasn’t available to the authors of “Our Fallen” in 1948, but it is now. Most of it was added through geolocation in Mapbox and scripts that extracted the positions from plain text.

Our finished product became an instant hit. We started out with 11,724 names, and within a short while we had added 169 more, mostly forgotten Jewish families and Norwegians who served with Allied forces in other countries. Our digital feature became the most read article in 2015 and created a new interest in war history among Norwegians.
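The step of extracting positions from plain text can be sketched roughly as follows. This is a minimal illustration, not the script VG used: the coordinate notation (degrees, optional minutes, hemisphere letter), the function names and the sample sentence in the usage note are all assumptions made for the example.

```python
import re

# Assumed notation: degrees, optional minutes, then a hemisphere letter,
# for example "58° 30'N, 03° 15'W". Real source texts may vary.
COORD = re.compile(r"(\d{1,3})\s*°\s*(?:(\d{1,2})\s*['′])?\s*([NSEW])")


def to_decimal(deg, minutes, hemi):
    """Convert degrees and minutes to signed decimal degrees."""
    value = int(deg) + (int(minutes) / 60 if minutes else 0.0)
    # South and west are negative by convention.
    return -value if hemi in "SW" else value


def extract_position(text):
    """Return (lat, lon) for the first latitude and longitude found, else None."""
    lat = lon = None
    for deg, minutes, hemi in COORD.findall(text):
        value = to_decimal(deg, minutes, hemi)
        if hemi in "NS" and lat is None:
            lat = value
        elif hemi in "EW" and lon is None:
            lon = value
    if lat is None or lon is None:
        return None
    return lat, lon
```

For an invented sentence such as "Torpedoed and sunk at 58° 30'N, 03° 15'W.", `extract_position` returns `(58.5, -3.25)`, a pair that can be passed on to a mapping library. Texts without a recognizable coordinate pair yield `None` and would need manual geolocation.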

What makes this project innovative? What was its impact?

This is a digital feature that presents and organizes historical data in a new way. Until now, the data was available almost exclusively through the book volumes. The Genealogy Society of Norway had some key data in an Excel spreadsheet, which became our Rosetta stone. After getting access to the digital copy of the volumes, we had all the pages as JPGs and, as a bonus, an XML version of each page, since every page had been run through the National Library’s text-recognition software. Suddenly we had a digital version of the biographies and a JPG of each page. Combining the XML data with our spreadsheet made all the biographies available to us.

A core developer at VG got hold of the JPGs, and within 24 hours he had written a Python script that detected the portrait images on each JPG page. Using open-source text recognition (Tesseract from Google), we also managed to link 70 % of the names below the images directly to a person. For the remaining persons, whom we could not identify algorithmically, we built a tool for manually linking an image to a person. Through scraping and some clever regular expressions we also managed to locate most ship positions (1000 ships), and could create a visualization of all locations where a Norwegian died during the Second World War (http://www.vg.no/spesial/2015/vaare_falne/map_slider.php).

We also made it easy for our readers to review the dataset and give feedback. Our readers have embraced the feature and helped us complete the database. We have made more than 2000 additions and changes based on reader feedback, and answered 3000 inquiries about the dataset. Lastly, we have open-sourced the dataset and given all the additional data back to the Genealogy Society of Norway.
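Linking an OCR'ed caption to a person in the spreadsheet is essentially an approximate string-matching problem, because text recognition often misreads single characters. The sketch below uses Python's standard `difflib` as a stand-in; the source does not say how the VG script did the matching, and the names, ids and the 0.8 cutoff are invented for illustration.

```python
import difflib


def normalize(name):
    """Lowercase and collapse whitespace so OCR spacing noise matters less."""
    return " ".join(name.lower().split())


def match_caption(ocr_text, persons, cutoff=0.8):
    """Return the id of the person whose name best matches the OCR text.

    persons maps a hypothetical person id to a name. Returns None when no
    name is similar enough, so the image can be queued for manual linking.
    Note: people sharing an identical name collapse to one entry here; a
    real dataset would need extra keys (page, volume) to tell them apart.
    """
    by_name = {normalize(name): pid for pid, name in persons.items()}
    hits = difflib.get_close_matches(
        normalize(ocr_text), list(by_name), n=1, cutoff=cutoff
    )
    return by_name[hits[0]] if hits else None
```

With `persons = {1: "Ole Hansen", 2: "Kari Nilsen"}`, the misread caption `"0le Hansen"` still resolves to id `1`, while unrelated text returns `None`. Raising the cutoff trades fewer wrong links for more manual work.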

Technologies used for this project:

We had one top priority when publishing this project: it should be easy to find a relative, or someone from your local area, in our database. We indexed everything in Elasticsearch and created dedicated keys for birthplace, the ship that was sunk (almost half of the Norwegian victims died at sea), the incident that caused each death and the location where it happened. We also indexed other data, such as date of birth and date of death. Before publication, MySQL was also used as a research tool during our investigation. To detect images and connect each image to a name, we used a Python script combined with Tesseract from Google to OCR each image. Our backend was mostly written in PHP and JavaScript. We also made extensive use of jQuery, Bootstrap and Mapbox to build our visualizations in an easy way.
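A search of the kind described above could be expressed as an Elasticsearch query body along these lines. This is only a sketch: the field names (`name`, `birthplace`, `ship`) and the helper function are hypothetical, the actual VG index layout is not documented here, and the `term` filters assume those fields are indexed as exact keywords.

```python
def build_query(name, birthplace=None, ship=None):
    """Build an Elasticsearch query body: fuzzy name match plus exact filters.

    Field names are assumptions made for this example.
    """
    filters = []
    if birthplace:
        filters.append({"term": {"birthplace": birthplace}})
    if ship:
        filters.append({"term": {"ship": ship}})
    return {
        "query": {
            "bool": {
                # "fuzziness": "AUTO" tolerates small spelling differences.
                "must": [{"match": {"name": {"query": name, "fuzziness": "AUTO"}}}],
                # Filters narrow the result set without affecting scoring.
                "filter": filters,
            }
        }
    }
```

The resulting dictionary can be serialized to JSON and sent to the search endpoint of an index. Keeping birthplace and ship as filters rather than scored clauses means a reader's local-area restriction never outranks the name match itself.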
