Introduction to the Annif automated indexing tool

Time: 14:00 – 17:00
Room: SB3-L112
Chair: Osma Suominen (National Library of Finland), Mona Lehtinen (National Library of Finland)

This tutorial introduces participants to Annif, a multilingual automated subject indexing tool that can enhance, for example, a library’s metadata generation system. Better metadata makes resources easier to discover, so Annif, and this tutorial, also bear on information accessibility and retrieval. Through hands-on exercises and demos, attendees gain practical experience in setting up Annif, training algorithms with sample data, and using the tool to generate subject suggestions for new documents. The tutorial covers basic and advanced usage scenarios and allows participants to deepen their understanding of, and proficiency in, implementing Annif for efficient metadata production in library systems. We will also demonstrate the use of Annif in institutional repositories. The tutorial is aimed at anyone with an interest in library automation, including developers, repository managers and metadata librarians.

Audience

The intended audience consists of developers, repository managers and metadata librarians with an interest in library automation.

Content

In response to the growing need for efficient metadata production in libraries and related institutions, there is a concerted effort to explore automation solutions; better metadata coverage in turn contributes to improved discoverability. This hands-on tutorial introduces participants to Annif, a multilingual automated subject indexing tool designed to augment library metadata generation systems. The tutorial’s primary objective is to equip participants with practical skills, from setting up Annif and trying it out to using it to generate subject suggestions for newly acquired documents.

Through a series of exercises available online, attendees work with Annif in practice, gaining proficiency in tasks such as training algorithms on example datasets. The exercises cover both fundamental and advanced scenarios. To support self-paced learning, participants have access to a curated set of instructional videos and written exercises to complete independently before the live tutorial event.

During the tutorial event, a significant portion of the time is dedicated to collaborative problem-solving, answering questions, and in-depth discussion. This interactive session gives participants a chance to share their experiences, troubleshoot problems, and, through live demonstrations, see how Annif is implemented in different settings. By combining pre-event resources with a live collaborative session, the tutorial aims to deepen participants’ understanding of, and proficiency in, implementing Annif for efficient metadata production in library systems. We will also demonstrate the use of Annif in institutional repositories.

Learning outcomes

By the end of this tutorial, participants will know what automated subject indexing with Annif makes possible. They will also be familiar with the basic functionality, use cases and capabilities of the Annif tool.

Requirements

To complete the exercises, participants need a laptop with at least 8 GB of RAM and at least 20 GB of free disk space. The organizers will provide the software as a preconfigured VirtualBox virtual machine. Alternatively, Docker images and a native Linux install option are provided. No prior experience with the Annif tool is required, but participants are expected to be familiar with subject vocabularies (e.g. thesauri, subject headings or classification systems) and with subject metadata that references those vocabularies.
