Finding similar images in a photo collection using IPTC

Martijn Kleppe
Erasmus University Rotterdam

In recent historiography, both popular and scholarly, similar-looking photographs of (mostly symbolic) events are increasingly used as an instrument to tell a story about important events or structural trends. However, finding similar images in a large set of books or photographs can be a challenge, given the high level of subjectivity involved in interpreting photographs (Finnegan 2006; Rose 2007) and the lack of standardized thesauri for describing photographs (Kleppe 2012). This poster presents an approach to finding the most frequently recurring images in a set of 5,000 photographs by adding information about the content of each photo to IPTC fields that are embedded within the digital file of the photograph.

The International Press Telecommunications Council (IPTC) [1] develops technical standards for news organisations. By applying the same standards, a photographer can embed information about a photo inside the file and send it to an editor at a newspaper, who can load that information into the local ICT infrastructure. This technique not only facilitates the exchange of files between journalists; cultural heritage institutions can also use IPTC to include information about their objects in the digital files (Grijsen 2012; Reser 2012).

For our research on the recurring use of photographs in history textbooks (Kleppe 2013a) we analyzed over 5,000 photographs that were included in Dutch history textbooks published in the period 1970-2000. All photos were digitized and analyzed by assigning 41 variables (such as topic, person and year). We used the software program Fotostation Pro [2] to store the values of each variable in the IPTC fields. This allowed us to run full-text searches across all assigned values and to share our research data with other researchers, who could 'read' the information in the IPTC fields in their own photo editing and viewing software.
Moreover, we were able to export all values to CSV files that could be imported into statistical software such as SPSS. By making frequency tables we calculated which topics occurred most often in the set of photographs, and then manually went through these topics to find the images that were used most often.

The results show that a 1912 photograph of the socialist politician Pieter Jelles Troelstra is used most often in the analyzed textbooks. In the photo, Troelstra gives a speech in which he pleads for universal suffrage. Since our database recorded how this particular photo was used in each textbook, we could then return to it and examine the context in which the photo appeared. We found that in one-third of all history textbooks this photo is incorrectly dated, since it is used to illustrate Troelstra's failed attempt to start a revolution in 1918. This outcome gives grounds for several follow-up historiographical research questions, focusing both on the afterlife of photographs (Kroes 2007) and on the selection processes of historical gatekeepers (Kleppe 2013a).

Even though our database is relatively small, the case study of the Troelstra photo shows that by adding data to the IPTC fields we were able to quickly track down all the textbooks in which the photo is used and determine the context of each use. Studying this afterlife can be taken a step further when databases that follow the same approach are linked, e.g. collections that are described with the ICONCLASS system (Brandhorst 2012) or the GTAA (Oomen 2010).

We have therefore also made our database available to future researchers (Kleppe 2013b). All images and metadata can be downloaded by humanities scholars as well as by computational researchers who want to further explore the possibilities of data enriched with IPTC information or use the images to train image recognition software.

[1] http://www.iptc.org
[2] http://www.fotoware.com/en/Products/FotoStation/
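The frequency-table step described above was carried out in SPSS; purely as an illustration, a comparable analysis of an exported CSV file could be sketched in Python as follows. The column names (`photo_id`, `textbook`, `topic`, `caption_year`) and all sample rows are invented for this sketch and do not reflect the actual 41 variables or the contents of the dataset.

```python
import csv
import io
from collections import Counter

# Invented stand-in for a CSV export of IPTC values; the real export has
# different columns and far more rows.
SAMPLE_EXPORT = """photo_id,textbook,topic,caption_year
P001,Book A,parliament,1950
P001,Book B,parliament,1956
P002,Book A,harbour,1930
P001,Book C,parliament,1950
P003,Book B,harbour,1961
"""


def frequency_table(rows, column):
    """Count how often each value of `column` occurs, most frequent first."""
    return Counter(row[column] for row in rows).most_common()


def contexts_for(rows, photo_id):
    """List (textbook, topic, caption_year) for every use of one photo."""
    return [
        (row["textbook"], row["topic"], row["caption_year"])
        for row in rows
        if row["photo_id"] == photo_id
    ]


rows = list(csv.DictReader(io.StringIO(SAMPLE_EXPORT)))

# Which topics and which photos occur most often?
topic_counts = frequency_table(rows, "topic")
photo_counts = frequency_table(rows, "photo_id")
most_used_photo, uses = photo_counts[0]

print(topic_counts)
print(most_used_photo, uses)

# Inspect every context in which the most-used photo appears, which is
# where a dating discrepancy between textbooks would become visible.
for context in contexts_for(rows, most_used_photo):
    print(context)
```

In this sketch the manual inspection of the most frequent topics is replaced by a direct count per photo identifier, which assumes that identical photographs have already been given the same identifier during annotation.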
The photo of Pieter Jelles Troelstra (top) and a screenshot of a menu in Fotostation Pro through which information about the photo is entered into the IPTC fields of the file.

Literature

Brandhorst, H. (2012). The Iconography of the Pleasures and Problems of Drink: Thoughts on the Opportunities and Challenges for Access and Collaboration in the Digital Age. Visual Resources, 28(4), 384-390.
Finnegan, C.A. (2006). What is this a picture of? Some Thoughts on Images and Archives. Rhetoric & Public Affairs, 116-123.
Grijsen, C. (2012). In perspectief: behoud en beheer van born-digital fotoarchieven. Fotografisch Geheugen, 75, 24-26.
Kleppe, M. (2012). Wat is het onderwerp op een foto? De kansen en problemen bij het opzetten van een eigen fotodatabase. Tijdschrift voor Mediageschiedenis, 93-107.
Kleppe, M. (2013a). Canonieke Icoonfoto's. De rol van (pers)foto's in de Nederlandse geschiedschrijving (Delft).
Kleppe, M. (2013b). Foto's in Nederlandse Geschiedenisschoolboeken (FiNGS). http://www.persistent-identifier.nl/?identifier=urn:nbn:nl:ui:13-l37n-bi
Kroes, R. (2007). Photographic memories: Private pictures, public images, and American history. Dartmouth College.
Oomen, J., & Brugman, H. (2010). Thesauri gekoppeld. Digitale Bibliotheek, 2(5), 18-21.
Reser, G., & Bauman, J. (2012). The Past, Present, and Future of Embedded Metadata for the Long-Term Maintenance of and Access to Digital Image Files. International Journal of Digital Library Systems (IJDLS), 3(1), 53-64.
Rose, G. (2007). Visual Methodologies: An Introduction to the Interpretation of Visual Materials (London).