Thursday September 26, 2024 9:00am - 10:30am CEST

This session consists of three presentations and a joint Q&A with the presenters:

➺ Jonas Kucharsky - (Un)Like Shazam: Towards an AI Model of Sound Recognition in Archival Context (Short presentation)

➺ Sami Meddeb, Chaouki Ben Alia, Chedly Ganouchi - Preserving the Authenticity of Radio Archives Amidst the Evolution of Artificial Intelligence and Synthetic Media (Short presentation)

➺ Stefan Kaltseis - Can you read what you hear? A Report on an ongoing project implementing speech recognition at the Austrian Mediathek (Short presentation)


**Abstracts:**


➺ (Un)Like Shazam: Towards an AI Model of Sound Recognition in Archival Context
--
Jonas Kucharsky (Short presentation)
--
One of the crucial issues of sound archiving in a film archive context is content identification, as period film sound can be stored in various formats on various media (optical soundtracks, sepmag, commag, tape, mono, stereo, etc.). Without proper identification, historical sound is never truly preserved. However, analysis by ear and meticulous comparison of preserved sound archivalia with film works can be a lengthy and complex process. This presentation focuses on possible means of automatic sound recognition and categorization, using a case study of production magnetic tapes and foley recordings preserved in the Národní filmový archiv in Prague.

The initiative delves into the complexities of sound recognition. By examining algorithmic methods of sound analysis (e.g. spectrogram cross-correlation and audio fingerprinting), the paper presents multiple approaches to identifying and categorizing foley sounds. These techniques not only facilitate the identification of sound from deteriorating magnetic tapes but also serve as the foundation for creating an AI model that can recognize and classify archival sound recordings.

This research sheds light on the significance of advancing sound archiving practices through algorithmic methods and emerging AI technologies. By offering insights about our work in progress, the project also aims to create an international dialogue about possible means of collaboration on open datasets and methods of sound categorization and identification. The broader implications for sound archiving underscore the potential of algorithmic methods and AI models to safeguard and revitalize our audiovisual heritage, ensuring that the rich tapestry of foley sounds remains an enduring legacy for future generations.

➺ Preserving the Authenticity of Radio Archives Amidst the Evolution of Artificial Intelligence and Synthetic Media
--
Sami Meddeb, Chaouki Ben Alia, Chedly Ganouchi (Short presentation)
--
Over the past year, artificial intelligence has made remarkable progress, leading to the emergence of deepfakes and synthetic media. This development poses a significant threat to the authenticity of digital archives, particularly those of Tunisian radio stations.
These archives are essential resources not only for understanding history but also for reporting and investigations. Therefore, it is crucial to ensure their long-term reliability and integrity.
Hence, the following question arises: How can we ensure the reliability and integrity of digital media archives, especially those of Tunisian radio?


➺ Can you read what you hear? A Report on an ongoing project implementing speech recognition at the Austrian Mediathek
--
Stefan Kaltseis (Short presentation)
--
A comparison with the digital turn of the late 20th century comes to mind. In the 1990s, the invention of digital formats such as WAVE was all about the possibilities of permanently preserving analogue A/V media and making it accessible. With the establishment of AI systems, another digital turn is now on the horizon. Speech recognition and other indexing of digital audiovisual media through AI-based methods is the next major intervention into the core of audiovisual archives. Compared to digitization, however, speech recognition does not serve as a means of safeguarding, but rather as a meaningful enrichment.
The Austrian Mediathek is currently in the process of integrating the open-source-based speech recognition tool Whisper, which was published by OpenAI in 2022, into the archive's digitization workflow. The resulting metadata – in the form of automatically created txt- and timecoded srt-files – should not only serve the desired accessibility, but above all significantly improve and simplify searchability and findability within the catalogue system.
The lecture will give an overview of the development of this ongoing project and will answer questions about the advantages and disadvantages of textualization of acoustic sources.
Easier tagging and more precise indexing of audiovisual content will be discussed, as well as the possibilities of better findability: can hidden treasures be found this way? But the question of what happens when we make the audiovisual archive "readable" should not remain unasked. After all, listening to the sources should not be replaced.

Speakers
Jonas Kucharsky

Curator of Music and Sound, Národní filmový archiv, Prague
Jonas Kucharsky is an alumnus of the Department of Musicology at Masaryk University, Czech Republic. He studied at Humboldt University in Berlin and Cardiff University. He is a curator of music and sound at the National Film Archive in Prague. His research topics are experimental music...
Sami Meddeb

Digital Cooperation Association
Sami Meddeb has been working as a digital audiovisual archivist for 8 years and is the president of a digital cooperation association focused on cultural heritage preservation, specifically digital audiovisual archives. Throughout his career, he has developed multiple projects in...
Chaouki Ben Alia

Chaouki Ben Alia, who holds a PhD in Physics and a Master's degree in Computer Science, is currently dedicated to the detection of deepfakes through audio spectrogram analysis. His multidisciplinary expertise allows him to explore advanced signal processing techniques to identify alterations...
Chedly Ganouchi

Chedly Ganouchi is a student of computer science at the Higher Institute of Communication. He is working in collaboration with a team on a project for the detection of audio deepfakes, aiming to preserve the authenticity of Tunisian radio heritage.
Stefan Kaltseis

Audiovisual Archivist, Österreichische Mediathek
Stefan Kaltseis is head of A/V-digitization at the Austrian Mediathek. He studied cultural anthropology and philosophy at the University of Vienna. His work as a media archivist focuses on the implementation of numerous digitization projects and cooperations, including with the Viennese...
Aula Magna
