Here you can find more information about the research projects currently being undertaken in collaboration with KB.
The Astrid Lindgren Code:
Accessing Astrid Lindgren’s shorthand manuscripts through handwritten text recognition, media history, and genetic criticism
Grant: 2,874 Mkr
Call: Riksbankens jubileumsfond, Projects within the humanities and social sciences
Administrator: The Swedish institute for children’s books/Uppsala university
Project leader: Dr Malin Nauwerck
Astrid Lindgren holds a unique position within world literature, yet her enigmatic creative process has for many years been hidden in her original drafts and manuscripts, written in Melin shorthand/stenography. These manuscripts have been considered “impossible” to access and have to date never been subject to research.
The purpose of “The Astrid Lindgren Code” is twofold: 1) To access Lindgren’s original stenographed drafts through adaptation of algorithms for handwritten text recognition (HTR), and to refine and develop this digital method further through crowdsourcing. 2) To study the implications of deletions, alterations, and revisions in the stenographic drafts of Lindgren’s work, with particular emphasis on Bröderna Lejonhjärta/The Brothers Lionheart (1973), from the perspectives of genetic criticism, sociological editing, and media history. This part of the study is structured around the three simultaneous roles of author, secretary, and editor that Lindgren took in her own creative process.
This three-year project utilises the joint competences of literary scholars, computer scientists, and professional stenographers to unlock the potential in the original drafts to produce new knowledge of world author Lindgren, enable a starting point for full digitalisation and transliteration of Lindgren’s original manuscripts in the future, and provide a general vehicle for methodological development for analysis of handwritten documents.
For updates on this project, see “About the Astrid Lindgren Code” (The Swedish institute for children’s books).
EODOPEN - eBooks-On-Demand-Network Opening Publications for European Netizens
Grant: EUR 4M
Call: EACEA 34/2018
Administrator: Innsbruck university
Project contact at KB: Kate Parson
Libraries all over Europe face the difficulty of managing tremendous amounts of 20th and 21st century textual materials which have not yet been digitised because of the complex copyright situation. These works cannot be accessed by the general public and are slumbering deep in library stacks, as they are often out of print or have never been in print at all, and reprints or facsimiles are not in sight.
The EODOPEN project, as proposed by 15 libraries from 11 European countries, focuses on bringing these digitally-hidden works to the public forefront by digitising and making them available on a pan-European level whilst fully respecting current copyright regimes.
To achieve this aim, the project will – by focusing on the “demand side” rather than merely on the “supply side” – directly engage with national, regional, and local communities in the selection of material as well as the digitisation and dissemination process, finally enhancing intercultural dialogue with the help of the digitized objects. In addition, alternative delivery formats, in particular for mobile devices, as well as for blind or visually impaired users, will allow reaching a broader audience for digitised content.
Furthermore, hands-on-workshops, guidelines and special tools made available to all European libraries shall build confidence amongst library staff in dealing adequately with rights clearance, and therefore contribute to the objective of reinforcing the ability of library staff to operate transnationally.
Finally, the digitised items will be made available to the broad European public on the project participants’ well established digital libraries as well as a common portal developed during the project lifetime, ensuring transnational circulation and access to cultural works in the aftermath of the project. Creativity across Europe will be sparked as new readers and content creators discover works previously unavailable to them.
For more information visit: kb.se/eodopen
Evaluation and refinement of an enhanced OCR-process for mass digitisation
Grant: 1,7 Mkr
Call: Riksbankens jubileumsfond, Infrastructure for research
Administrator: KB/The Swedish language bank, Gothenburg university
Project leader: Dr. Lars Björk
Great expectations are placed on the capacity of heritage institutions to make their collections available in digital format. Data driven research is becoming a key concept within the humanities and social sciences. The National Library of Sweden's collections of digitised newspapers can thus be regarded as unique cultural data sets with information that is rarely conveyed in other media types. The digital format makes it possible to explore these resources in ways not feasible while in print format.
As texts are no longer only read but also subjected to computer based analysis, the demands on reliability increase. Technologies for converting images to machine-readable text (OCR) play a fundamental part in making these resources available, but their effectiveness varies with the type of document being processed. This is evident in the digitisation of newspapers, where factors relating to their production, layout and paper quality often impair the OCR-production.
In order to improve the machine readable text, especially in relation to the digitisation of newspapers, KB initiated the development of an OCR-module where key parameters can be adjusted according to the characteristics of the material being processed. The purpose of this project is to carry out a formal evaluation of this OCR-module and to improve it through systematic text analyses, dictionaries and word lists, with the aim of implementing it in the mass digitisation process.
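As a rough illustration of how a word list can be used to assess OCR output, the sketch below computes the share of tokens in a text that occur in a lexicon. This is not the project's actual evaluation method, which is not described here; the function name `lexicon_hit_rate`, the tiny lexicon and the sample strings are all hypothetical.

```python
import re


def lexicon_hit_rate(ocr_text: str, lexicon: set[str]) -> float:
    """Return the share of alphabetic tokens in ocr_text found in lexicon.

    A low value suggests many misrecognised words. This is only a proxy:
    names and rare words are missing from any word list.
    """
    # Runs of Unicode letters; digits and punctuation split tokens.
    tokens = re.findall(r"[^\W\d_]+", ocr_text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in lexicon)
    return hits / len(tokens)


# Hypothetical word list and sample lines (assumptions for illustration).
lexicon = {"tidningen", "kom", "ut", "i", "dag"}
clean_rate = lexicon_hit_rate("Tidningen kom ut i dag", lexicon)
noisy_rate = lexicon_hit_rate("Tidn1ngen kom ut i clag", lexicon)
```

Comparing such a rate before and after adjusting OCR parameters gives a simple, if coarse, indication of whether a setting helps for a given type of material.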
Expansion and Diversity:
Digitally mapping and exploring independent performance in Gothenburg 1965–2000
Grant: 13 Mkr
Call: Swedish research council, DIGARV
Administrator: Gothenburg university
Project leader: Docent Astrid von Rosen
The project aims to take on the urgent challenge of accounting for diversity in late 20th century performance history. Using data from digitized newspapers, the project will explore the following research question: How can a more inclusive history of independent performance be created by combining historiographical and urban analysis with the capabilities of new information technologies?
Drawing on previous research on independent performance in Gothenburg 1965–2000, the city will be used as a case study to investigate the unresolved historiographical problem of accounting for diversity within the expanded performance field. During the project’s first year the National Archives will digitise Gothenburg newspapers and an online database combining scholarly and digital expertise will be constructed. During the second year, we will digitally explore reviews and often devalued source materials such as adverts, captions and photographs, feeding new empirical results into the database. For example, carnivals employing migrants, outreach performances for children, and queer events will be mapped. In the third year, with the completion of the online database and its accompanying studies, it will be possible to re-conceptualize how a very diverse performance heritage is historically and spatially represented. Consequently, new research questions as well as digital methods and models will emerge to help include and make accessible a cultural heritage engaging many different people.
KB’s specific contribution to this project involves digitising the following publications for the period from 1965 to 2000: Göteborgsposten, Göteborgstidningen, Göteborgs handels- och sjöfartstidning and Arbetet väst.
Mining for meaning: The dynamics of public discourse on migration
Grant: 17,2 Mkr
Call: Swedish research council, Research environment grant within migration and integration
Administrator: Linköping university
Project leader: Docent Marc Keuschnigg
Understanding public discourse is essential for managing reactions to modern migration and integrating the newly arrived. In the wake of digitalisation, much of the public sentiment, agenda setting, and political framing of societal developments has moved to social-media platforms, publicly-accessible newspaper repositories, and machine-readable archives of parliamentary speeches and party manifestos.
The project studies the discourse dynamics between politicians, media, and the public to understand meaning making, explore shifts in sentiment, and follow how collectively agreed narratives arise. The project combines research on migration and integration with the development of machine-learning applications for the sociological analysis of digitised text.
- How do the public discourses on immigration form and change over time?
- What part do social media, traditional media, and political parties play and how do they interact with each other?
- Is there a geographical variation in the content, meaning, and sentiment contained in social text?
- How does opinion polarization occur online and how does misinformation spread?
- How do discourse dynamics in local contexts (e.g. neighbourhoods) interact with real-world events (e.g. crime) and outcomes (e.g. residential segregation)?
To answer these and related questions, the research team combines quantitative and qualitative methods for the computational analysis of text with temporal analyses for causal inference.
As many of the studied mechanisms defy disciplinary boundaries, the project approach is interdisciplinary and considers theoretical concepts from several social-science disciplines. The project brings together methodologically-oriented specialists of computational text analysis with social scientists attuned to the political process and the study of social dynamics to enable a novel combination of social-science theorizing and computational text analysis.
For more information, visit "Computational text analysis", Linköping university's website.
The Order of Criticism Revisited: A mixed-methods study of a century and a half of literary criticism in Sweden
Grant: 6,8 Mkr
Call: Riksbankens Jubileumsfond, Mixed methods
Administrator: Gothenburg university
Project leader: Docent Jonas Ingvarsson
In Lina Samuelsson’s 2013 study The Order of Critique: Swedish Book Reviews 1906, 1956, 2006, a key feature was the study of the critical text's normative discourses: What constitutes a review? Which key concepts permeate the critical text? Despite several chosen limitations, the work gathered a significant quantity of transcribed material.
In the project The Order of Criticism Revisited: A mixed-methods study of a century and a half of literary criticism in Sweden we now intend to conduct a methodological and epistemological study of corresponding materials using digital tools. Samuelsson’s discourse analysis will be compared to a language technological analysis. By "corresponding material" we mean partly Samuelsson’s own transcriptions, but above all (the privilege of big data) the ambition is to make as exhaustive an analysis as possible of the critical material that is available for the chosen years. A new study, of the year 2016, will be implemented, and the project will also launch an interactive interface for visualising, and operating, the symbiosis of text mining and discourse analysis.
The project will be able to compare, in detail, analog and digital analyses of the same historical moments. The study will thus encourage several creative challenges for the technical language analysis, and highlight several methodological meta-perspectives at the intersection of traditional humanities, language technology analysis and epistemological aspects of digitization.
QUEERLIT database: Metadata Development and Searchability for LGBTQI Literary Heritage
Grant: 6,723 Mkr
Call: Riksbankens jubileumsfond, Infrastructure for Research
Administrator: University of Gothenburg
Project leader: Associate Professor Jenny Bergenmar
Queerlit is focused on the question of how the history of sexual and gender minorities can be made accessible through metadata and indexation, and how the historical conditions and current development in LGBTQI literature can be captured with these methods. The purpose of the project is to create a database of Swedish LGBTQI (lesbian, gay, trans, queer, and intersex) literature, and to develop the indexation of LGBTQI literature, serving to enable further research within the field.
The project has four aims:
1) to develop a thesaurus for indexing LGBTQI literature and to map it to existing subject headings in collaboration with KvinnSam (National Resource Library for Gender Studies) and LIBRIS (the National Union Catalogue)
2) to identify LGBTQI literature in collaboration with an advisory board consisting of experts in Swedish LGBTQI literature
3) to construct a sub-database in LIBRIS containing bibliographic records of LGBTQI literature in collaboration with the National Library and KvinnSam
4) to make the database available through a separate interface allowing for more specialized searches than LIBRIS does, and to link the records to other open data sets
Identifying and capturing LGBTQI themes through indexing is important for both research and the public, as such literature reflects societal values, and can reflect individual experiences of sexuality and gender identity. As LGBTQI fiction is not always explicit, questions regularly arise concerning what to include and how to describe this literature in bibliographical metadata. In such a perspective, increased indexing capabilities are essential. It is urgent, in particular for scholars, that such indexing is reliable and to the point.
A quality-controlled subject-specific bibliographic database enables research on the distribution and development of LGBTQI motifs and themes through time and within different genres. It facilitates diachronic studies and can show larger patterns regarding specific concepts, metaphors, and images used to describe LGBTQI, as well as their presence in, or exclusion from, the canon. The database can thus be used to trace different and changing understandings of LGBTQI over a long time span.
The project is a collaboration between the Centre for Digital Humanities at the University of Gothenburg, KvinnSam, National Resource Library for Gender Studies, the National Library, and the Archives and Library of the Queer Movement in Gothenburg. It involves scholars in literary studies and library- and information studies from Swedish Universities, as well as librarians with expertise in LGBTQI.
The Sámi audiovisual collection: Films and TV programs in archives and online
Grant: 13,1 Mkr
Call: Swedish research council, DIGARV
Administrator: Umeå university/Stockholm university/KB
Project leader: Professor Patrik Lantto
The aim of the project is to create an infrastructure for the Sámi audiovisual cultural heritage. This will be achieved by making an inventory of films and TV programs, quality-assuring the material’s metadata, and creating a searchable digital sub-database to Svensk mediedatabas (SMDB), the Sámi audiovisual collection, at the National Library of Sweden (KB). It will be made available primarily for research, but material will also be published for the general public online (on Filmarkivet.se and Öppet arkiv).
The project is divided into four parts: 1) Creating an inventory; 2) Metadata; 3) Promoting availability; and 4) Ethical issues. Making the Sámi audiovisual heritage available for researchers creates a foundation for research from various academic disciplines. Moreover, the project is of importance in a wider societal context, since it will make it easier for the Sámi to control, protect and develop their own cultural heritage, as well as promote the majority society’s understanding of and knowledge about the Sámi and their culture as part of the Swedish cultural heritage.
The project is a collaboration between two academic institutions (at Umeå and Stockholm universities), two institutions with responsibility for Sámi culture (the Sámi Parliament, and Ájtte) and two institutions with responsibility for the Swedish audio-visual cultural heritage (KB and SFI).
Sharing the Visual Heritage:
Metadata, reuse and interdisciplinary research
Grant: 8,2 Mkr
Call: Swedish research council, DIGARV
Administrator: Stockholm university/Royal institute of technology (KTH)
Project leader: Professor Anna Dahlgren
Metadata Culture is the overarching name for this research group. We are looking at different “metadata cultures” from an interdisciplinary perspective, focusing on the visual heritage. Methodologically we are also developing our sharing processes aiming for a more inclusive research culture inspired by participatory design practices.
The design and use of metadata are always culturally and ideologically inflected. Accordingly, the practice and policy of tagging images in cultural heritage institutions are not only fundamental for our understanding of the past but vital in navigating the present. Especially when it comes to big data and data-driven research, we have to pay particular attention to the consequences of the interfaces that curate our common history. We are living in a metadata culture, where tagging data has become an important literacy. Metadata and the archiving practices that produce it are increasingly important for cultural heritage institutions, and for contemporary culture at large, as a means to navigate the rapidly growing volume of data, situating them historically, socially and, not least, locally.
Crowdsourcing, social media platforms for community engagement, linked open data, and other participatory and open science practices create new challenges for archiving institutions due to the character of the networked publics involved and the established structures between and within institutions, but also new opportunities and practices when it comes to understanding and defining our shared culture.
For updates on this project, see its website Metadata culture.
Speech technology based methods for increasing the availability of the audiovisual collections of the Swedish National Library
Grant: 8,6 Mkr
Call: Riksbankens Jubileumsfond, Infrastructure for research
Administrator: Royal institute of technology (KTH)/KB
Project leader: Docent Jens Edlund
The audiovisual collections at the National Library of Sweden (KB) currently contain in excess of 10 million hours (more than 1000 years) of recorded audio and video. The collections are available to verified researchers, authors and journalists, but due to strict legal requirements, only within KB’s premises. The collections make up an unfathomable resource. They can provide invaluable and incomparable insights into Swedish culture, history, literature, art, society and politics, to mention but a few fields. They are equally valuable for meta-studies of fields in which the materials were produced, from broadcasting in general to specific genres, and for technical research areas geared towards (semi-)automated analysis of multimodal materials. This inconceivably large data set constitutes a primary as well as a secondary source in dozens of areas.
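The "more than 1000 years" figure can be checked with a short calculation. The sketch below assumes continuous, round-the-clock playback and an average year of 365.25 days; both are assumptions made for illustration, as is the helper name `hours_to_years`.

```python
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365.25  # average year length, an assumption


def hours_to_years(hours: float) -> float:
    """Convert a duration in hours to years of continuous playback."""
    return hours / (HOURS_PER_DAY * DAYS_PER_YEAR)


# 10 million hours, the lower bound stated for the collections.
years_of_playback = hours_to_years(10_000_000)
```

Under these assumptions the result is roughly 1,140 years, which is consistent with the stated figure.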
In practice, the collections are not used. The reason is precisely the size of the material – it takes too long to investigate. This is doubly wasteful: a great resource potential is left dormant, while KB spends substantial resources to maintain and add materials. Speech technology is a key to these collections, as it provides a means to make them searchable.
The project develops speech technology methods that can be applied to the audiovisual collections at KB, but also to other, similar materials. The methods and their reference implementations are made freely available on the national research infrastructure Språkbanken Tal.
Swedish Cinema and Everyday Life: A study of cinema-going in its peak and decline
Grant: 3,8 Mkr
Call: Swedish research council, Projects within the humanities and social sciences
Administrator: Örebro university
Project leader: Dr. Åsa Jernudd
The purpose of this project is to further our knowledge of how cinema was present in the lives of people in 1950s and 1960s Sweden and to deepen our understanding of how cinema-going is remembered as woven into the fabric of everyday life. A first objective of the study is to understand what is specific to experiences of cinema-going in the region of Bergslagen in relation to Sweden at large and in comparison with other regional or national cases. Another objective involves collaboration with Kungliga Biblioteket (The National Library): digitising and making accessible visual and textual evidence of cinema exhibition. A third objective is to collect video recordings of audiences’ memories of cinema-going at this time in Bergslagen and to perform a triangulated analysis to understand the gendered experience of cinema audiences when cinema-going was routine and ordinary as well as during its period of decline. The final objective concerns how this project can contribute to the further understanding of the nature of cinema memory.
The study will record and make available an important cultural heritage that will soon be gone. It will offer a fresh perspective on canonized film history by providing a history of cinema-going from below. By combining methods from ethnography, human geography and historical economics, the project will open up new perspectives on the relationship between the institutional contexts of film consumption and the remembered experience of cinema-going.
Swedish Post-medieval Manuscripts at the National Library of Sweden and Uppsala University Library – a cataloguing and digitisation project
Grant: 4,2 Mkr
Call: Riksbankens jubileumsfond, Infrastructure for research
Administrator: KB/Uppsala university
Project leader: Docent Patrik Åström
Historical epochs are arbitrary and govern our thoughts. Consequently, an important body of texts, produced after 1530 but organically connected to medieval text production, has been neglected. Research related to the manuscripts concerns legal history, language, book history, and art history, but is hampered by the lack of adequate catalogues. The better known medieval manuscripts have been attended to within the TTT project, "Text till tiden".
The purpose of this project is to create a complete digital catalogue of the approximately 235 post-medieval manuscripts at the National Library of Sweden and Uppsala University Library, and to fully digitise c. 50 representative manuscripts. These younger manuscripts have been neglected by research for a long time, not least because they were produced in a period of transition when book printing was gaining wide-scale momentum. But since the two media continued to co-exist for a long time in parallel discourses, it is important to pay attention to the handwritten production of the 16th and 17th centuries, thereby opening the way for comparative studies of both media types, their distinctiveness and mutual impact.
The catalogue will use standard metadata formats and linked open data: TEI for the manuscript descriptions and IIIF for the digitised manuscripts. It will be made available through “Manuscripta – a Digital Catalogue of Manuscripts in Sweden”, a national infrastructure maintained by the National Library of Sweden.
Welfare State Analytics: Text mining and modelling Swedish politics, media & culture, 1945-1989
Grant: 22,6 Mkr
Call: Swedish research council, DIGARV
Administrator: Umeå university/Uppsala university
Project leader: Professor Pelle Snickars
Welfare State Analytics: Text Mining and Modelling Swedish Politics, Media & Culture, 1945-1989 (WeStAc) is a digital humanities research project with five co-operating partners: Umeå University, Uppsala University, Aalto University (Finland) and the National Library of Sweden. The project will digitise literature, curate already digitised collections, and perform research via probabilistic methods and text mining models. WeStAc will both digitise and curate three massive textual datasets (in all, Big Data of almost four billion tokens) from the domains of Swedish politics, news media and literary culture during the second half of the 20th century. The dataset of “Politics” contains already digitised Swedish Governmental Reports (SOU) and material from the Swedish Parliament, “Media” contains two digitised Swedish newspapers, Aftonbladet and Dagens Nyheter, and “Culture”, which will be digitised, contains a literary journal, Bonniers Litterära magasin, and all Swedish novels from 1945 to 1989.
WeStAc will establish a scholarly ecosystem of digitisation, curation and research with a twofold objective: (A.) to develop digital curation work, including the preparation of massive datasets for research at the library, and (B.) to develop digital history scholarship and perform DH-inspired textual research. WeStAc will trace discursive changes on a scale hitherto unexplored by Swedish scholars. Considering the possibility to process large amounts of data through methods such as probabilistic topic models, NER or word embeddings, WeStAc will analyse how societal transformations can be empirically measured, for example by distant reading the notion of globalisation, or data modelling ideas of emancipation and individualisation.
The project design of WeStAc is organised into three distinct but parallel work packages: WP1 Digitisation & data curation, WP2 Text mining & modelling, and WP3 Welfare state analytics & research. Data driven humanistic research, assisted by an open notebook environment, is often explorative. Datasets ingested into and worked upon by different computational methods usually result in a scholarly practice where researchers learn about, and gradually familiarise themselves with, the data at hand. As a data driven research proposal, WeStAc will indeed lead to explorative scholarly practices. However, research within WP3 will also examine more specific matters. Accordingly, the work package is organised into six research tasks, including a number of subtasks, where the first three focus on general tendencies that cut across all three datasets, and the latter three on particular research issues within each dataset.
Tasks 3.1-3.3 will scrutinize three broad Swedish post-war themes central in all three datasets: globalisation, emancipation and individualisation. Tasks 3.4-3.6 will approach the three macro spheres and datasets of politics, media and culture with more specific questions and methods, designed to meet the characteristics of each dataset. Given WeStAc’s combination of general and particular research questions, the project will give new perspectives on well-researched topics, and explore novel ways of analysing historical Big Data.