Venice Time Machine

The Venice Time Machine is a large international project launched by the Swiss Federal Institute of Technology in Lausanne (EPFL) and the Ca’ Foscari University of Venice in 2012. It aims to build a collaborative multidimensional model of Venice by creating an open digital archive of the city’s cultural heritage covering more than 1,000 years of evolution. [1] The project aims to trace the circulation of news, money, commercial goods, migration, and artistic and architectural patterns, among others, to create a “Big Data of the Past”. Its fulfillment would represent the largest database ever created on Venetian documents. [2] The project is an example of the new area of scholarly activity that has emerged in the Digital Age: Digital Humanities.

The project’s widespread critical acclaim led to the submission of a European counterpart proposal to the European Commission in April 2016. [3] The Venice Time Machine is the basis of the European Time Machine. [4]

Organization and funding

The Venice Time Machine project was launched by EPFL and the Ca’ Foscari University of Venice in 2012. It includes collaboration from major Venetian heritage institutions: the State Archive in Venice, the Marciana Library, the Istituto Veneto and the Cini Foundation. The project is currently supported by the READ (Recognition and Enrichment of Archival Documents) European eInfrastructure project, the SNF project Linked Books and the ANR-SNF project GAWS. The international board includes renowned scholars from Stanford, Columbia, Princeton, and Oxford. In 2014, the Lombard Odier Foundation joined the Venice Time Machine project as a financial partner. [5]

Technology and tools

The State Archives of Venice contain a massive amount of handwritten documentation in languages evolving from medieval times to the 20th century. An estimated 80 km of shelves are filled with over a thousand years of administrative documents, from birth registrations, death certificates, and tax statements all the way to maps and urban planning designs. These documents are often very delicate and are occasionally in a fragile state of conservation. The diversity, amount and accuracy of the Venetian administrative documents are unique in Western history. By combining this mass of information, it is possible to reconstruct large segments of the city’s past: complete biographies, political dynamics, or even the appearance of buildings and entire neighborhoods.

Scanning

Paper documents are turned into high-resolution digital images with the help of scanning machines. Different types of documents impose different constraints on the type of scanning machine that can be used. In partnership with industry, EPFL is working on a semi-automatic robotic scanning unit capable of digitizing about 1000 pages per hour. Multiple units of this kind will be built to create an efficient digitization pipeline adapted to ancient documents. Another solution currently being explored at EPFL involves scanning books without turning their pages. This technique uses X-ray synchrotron radiation produced by a particle accelerator. [6]

Transcription

The graphical complexity and diversity of handwritten documents make transcription a daunting task. For the Venice Time Machine, scientists are currently developing novel algorithms that can transform images into probable words. The images are automatically broken down into sub-images that represent possible words. Each sub-image is compared to other sub-images and classified according to the shape of the word. Each time a new word is transcribed, it allows millions of other word transcripts to be recognized in the database.
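The idea of grouping word sub-images by shape and then spreading one transcription to the whole group can be illustrated with a minimal sketch. It is not the project's actual algorithm: the two-dimensional feature vectors, the distance threshold, the greedy clustering rule and the function names cluster_by_shape and propagate_labels are all assumptions made for illustration.

```python
from math import dist


def cluster_by_shape(features, threshold):
    """Greedy clustering of shape descriptors (illustrative only).

    Each vector joins the first cluster whose representative lies within
    `threshold` (Euclidean distance); otherwise it starts a new cluster.
    Returns the cluster index of every input vector.
    """
    representatives = []  # one feature vector per cluster
    assignments = []      # cluster index for each input vector
    for vec in features:
        for idx, rep in enumerate(representatives):
            if dist(vec, rep) <= threshold:
                assignments.append(idx)
                break
        else:
            representatives.append(vec)
            assignments.append(len(representatives) - 1)
    return assignments


def propagate_labels(assignments, known):
    """Spread each known transcription to every sub-image in its cluster.

    `known` maps the position of a transcribed sub-image to its word.
    Sub-images in clusters without any transcription get None.
    """
    cluster_label = {assignments[i]: word for i, word in known.items()}
    return [cluster_label.get(cluster) for cluster in assignments]


# Hypothetical shape descriptors for five word sub-images.
features = [(0.10, 0.90), (0.12, 0.88), (0.80, 0.20), (0.11, 0.91), (0.79, 0.22)]
assignments = cluster_by_shape(features, threshold=0.1)

# A single manual transcription labels every similar-looking sub-image.
labels = propagate_labels(assignments, known={0: "ducato"})
print(assignments)
print(labels)
```

In this toy example, transcribing the first sub-image also labels the second and fourth, which fall into the same shape cluster; the remaining two stay unlabeled until one member of their cluster is transcribed.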

Text processing

The strings of probable words are then turned into possible sentences by a text processor. This step relies on, among other tools, algorithms that can identify recurring patterns.
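One simple way to picture how recurring patterns can help choose between candidate sentences is to count word pairs that repeat in already transcribed lines and prefer the candidate that contains the most of them. The sketch below is an assumption-laden illustration, not the project's text processor: the sample lines, the bigram scoring and the function names recurring_ngrams and best_sentence are hypothetical.

```python
from collections import Counter


def recurring_ngrams(lines, n=2, min_count=2):
    """Count word n-grams and keep those seen at least `min_count` times."""
    counts = Counter()
    for line in lines:
        words = line.split()
        counts.update(tuple(words[i:i + n]) for i in range(len(words) - n + 1))
    return {gram: c for gram, c in counts.items() if c >= min_count}


def best_sentence(candidates, patterns):
    """Pick the candidate whose adjacent word pairs match the most recurring patterns."""
    def score(sentence):
        words = sentence.split()
        return sum(patterns.get(pair, 0) for pair in zip(words, words[1:]))
    return max(candidates, key=score)


# Hypothetical, already transcribed lines used to learn recurring patterns.
transcribed = [
    "ser Marco Polo mercante",
    "ser Marco Zorzi notaio",
    "ser Piero Polo mercante",
]
patterns = recurring_ngrams(transcribed)

# Two competing readings of the same line of probable words.
candidates = ["sei Marco Pelo mercante", "ser Marco Polo mercante"]
print(best_sentence(candidates, patterns))
```

Here the pairs "ser Marco" and "Polo mercante" each recur twice, so the second reading is preferred over the one containing unlikely word combinations.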

Connecting data

The real wealth of the Venetian archives lies in the connectedness of their documentation. Shared keywords link different types of documents, which makes the data searchable. This cross-referencing organizes imposing amounts of information into giant graphs of interconnected data, allowing new aspects of the information to emerge.
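A small sketch can show how shared keywords turn separate documents into a searchable graph. The document identifiers, keywords and function names below are hypothetical, and the approach (an inverted index plus a breadth of linked documents) is only one plausible way to model the cross-referencing described above.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(doc_keywords):
    """Build a keyword index and a document graph from shared keywords.

    Returns (index, graph): `index` maps each keyword to the documents
    mentioning it, and `graph` links two documents when they share a keyword.
    """
    index = defaultdict(set)
    for doc, keywords in doc_keywords.items():
        for keyword in keywords:
            index[keyword].add(doc)

    graph = {doc: set() for doc in doc_keywords}
    for docs in index.values():
        for a, b in combinations(sorted(docs), 2):
            graph[a].add(b)
            graph[b].add(a)
    return index, graph


def related(graph, start):
    """Return every document reachable from `start` through shared keywords."""
    seen = {start}
    frontier = [start]
    while frontier:
        node = frontier.pop()
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append(neighbour)
    return seen - {start}


# Hypothetical documents and the keywords extracted from them.
doc_keywords = {
    "birth_register_1740": {"Marco Zorzi", "San Polo"},
    "tax_statement_1762": {"Marco Zorzi", "Rialto"},
    "map_1770": {"Rialto"},
    "death_certificate_1801": {"Anna Venier"},
}
index, graph = build_graph(doc_keywords)

print(sorted(index["Rialto"]))
print(sorted(related(graph, "birth_register_1740")))
```

In this example the birth register is linked to the tax statement through a shared name, and the tax statement to the map through a shared place, so a search starting from a person can surface a map that never mentions that person directly.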

On 1 March 2016, the Digital Humanities Laboratory of EPFL announced the development of REPLICA, a new search engine for the study and enhanced use of Venetian cultural heritage, planned for the end of 2016. [7]

Reception

Praise

  • Interdisciplinarity and internationalism. Major Venetian heritage institutions, academic institutions and professors from different disciplines and institutions across the world are collaborating in this collective effort. According to the Venice Time Machine website, three hundred researchers and students from different disciplines (Natural Sciences, Engineering, Computer Science, Architecture, History and History of the Arts) have collaborated on the project.
  • Development of technology. The program faces multiple technical challenges associated with building a digital archive. Mass digitization requires not only the systematic scanning of ancient manuscripts, but also the automatic processing of different handwriting styles and the analysis of Latin and several other languages as they evolve through time. Researchers at EPFL working on the Venice Time Machine project, for instance, presented a methodology to analyze linguistic change by studying 200 years of Swiss newspaper archives. [8]
  • Democratization of knowledge and culture. The project seeks to open up knowledge and history to a wider audience, accessible to anyone, thus strengthening the link between the public and scholars. Conversely, Digital Humanities aims to reduce barriers to contributing and sharing knowledge and data by enabling the public to take part in collecting data. An elite group of scholars and professionals should no longer be the only ones able to contribute to this effort.

Criticism

  • Skewed audience. The whole project, along with the technology it develops, seems to be aimed at a Western audience. Both the Venice Time Machine and the resulting European Time Machine are centered on European history, culture and heritage. Nothing has been done so far to include the cultural history of other regions (although the project and Digital Humanities are still in their early stages).
  • Content selection. The scientists and researchers working on the project decide what enters the dataset, which runs counter to the initiative’s goal of knowledge democratization. The scientists involved are in a position of power to curate the content and educational information of the Venetian database.
  • Business opportunity in disguise. Previous similar initiatives suggest that creating a link between the public and knowledge represents a business opportunity for those who control such a data platform. For instance, Google Books and Google Scholar served Google’s long-term strategy of changing users’ habits of searching for books, both scholarly and popular, and of making the digital world the key to finding knowledge, information and the historical past. [9]
  • Ethical issues regarding Big Data. Although the data is collected primarily from past populations, the project inherits the ethical concerns of Big Data. Data collection is not always guaranteed to be anonymous; for instance, “if an individual’s patterns are unique enough, outside information can be used to link the data back to an individual”. [10] According to Joshua Fairfield, researchers may find that requiring consent is cost-ineffective. [11]

Other consequences

  • The program seeks to develop tools and technologies that question and challenge the role of historians and humanists altogether. In their contribution “Humanities in the Digital Age” [12], Alan Liu and William G. Thomas III identify a paradigm shift in which technological tools become indispensable, and argue that humanists should proactively shape the long-term future of the humanities rather than have the digital infrastructure built for them.

References

  1. http://vtm.epfl.ch/page-109836-en.html
  2. Kaplan, Frédéric (2015). “The Venice Time Machine”. Proceedings of the 2015 ACM Symposium on Document Engineering: 73. doi:10.1145/2682571.2797071. ISBN 9781450333078.
  3. Kaplan, Frederic. “Venice Time Machine Flagship”. European Commission. Retrieved 9 May 2017.
  4. Kaplan, Frédéric (2015). “The Venice Time Machine”. Proceedings of the 2015 ACM Symposium on Document Engineering: 73. doi:10.1145/2682571.2797071. ISBN 9781450333078.
  5. http://vtm.epfl.ch/page-116088-en.html
  6. Margaritondo, Giorgio; Kaplan, Frederic; Hwu, Yeukuang; Peccenini, Eva; Stampanoni, Marco; Albertin, Fauzia (2015). “X-Ray Spectrometry and Imaging for Ancient Administrative Handwritten Documents”. X-Ray Spectrometry. 44 (3): 93-98. doi:10.1002/xrs.2581.
  7. https://actu.epfl.ch/news/replica/
  8. Kaplan, Frederic; Bornet, Cyril; Buntinx, Vincent (2017). “Studying Linguistic Changes Over 200 Years of Newspapers Through Resilient Words Analysis”. Frontiers in Digital Humanities. 4: 2. doi:10.3389/fdigh.2017.00002.
  9. Gardiner, Eileen; Musto, Ronald G. (2015). The digital humanities: a primer for students and scholars. New York, NY: Cambridge University Press. p. 149. ISBN 978-1-107-01319-3.
  10. de Montjoye, Yves-Alexandre; Hidalgo, Cesar A.; Verleysen, Michel; Blondel, Vincent D. (2013). “Unique in the crowd: The privacy bounds of human mobility”. Scientific Reports. 3: 1-5. doi:10.1038/srep01376.
  11. Fairfield, Joshua; Stein, Hannah (2014). “Big Data, Big Problems: Emerging Issues in the Ethics of Data Science and Journalism”. Journal of Mass Media Ethics. 29: 38-51. doi:10.1080/08900523.2014.863126.
  12. Liu, Alan; Thomas III, William G. (2012). “Humanities in the Digital Age”. Inside Higher Ed.