SwePub
SwePub is a service that harvests scientific publication metadata from the publication databases of most Swedish universities.
The SwePub metadata is accessible to end users via the SwePub search service, and the data is free to harvest or to access through the OAI-PMH and SRU protocols. We have also developed a lightweight API, called Xsearch, that provides access to the data over HTTP in a simple and straightforward manner.
The metadata is available as data dumps as well.
The publications are categorized according to a subject category scheme called SVEP. About 70-80% of the records have a SVEP category. When using a SVEP category, a category ID is mandatory. A list of the categories is available in RDF/XML or Excel.
List of data providers (Swedish Universities)
OAI-PMH
The data is exactly as we receive it from the data providers without any data manipulation.
About OAI-PMH: Open Archives Initiative
Supported metadata formats:
Base URL: http://api.libris.kb.se/swepub/oaipmh/SWEPUB
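Update frequency: every night. Some of the data providers support selective harvesting, while others perform a total reload every night. As a consequence, datestamps do not reliably indicate when a record last changed: records from providers that reload everything get new datestamps every night.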
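As a rough illustration of harvesting from the base URL above, the following Python sketch builds OAI-PMH ListRecords request URLs. The argument names (verb, metadataPrefix, from, resumptionToken) come from the OAI-PMH protocol itself. The prefix "oai_dc" is an assumption used for illustration only, since the list of supported metadata formats is not reproduced here. The function name list_records_url is hypothetical.

```python
from urllib.parse import urlencode

# Base URL as given in this document.
OAI_BASE_URL = "http://api.libris.kb.se/swepub/oaipmh/SWEPUB"


def list_records_url(metadata_prefix, from_date=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper)."""
    if resumption_token is not None:
        # In OAI-PMH, resumptionToken is an exclusive argument: it must not
        # be combined with metadataPrefix or from.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date is not None:
            # Selective harvesting: only records with a datestamp on or
            # after this date (YYYY-MM-DD).
            params["from"] = from_date
    return OAI_BASE_URL + "?" + urlencode(params)


# "oai_dc" is an assumed prefix, not confirmed by this document.
print(list_records_url("oai_dc", from_date="2010-05-01"))
```

Because some providers reload all their records every night, a request with a from date may still return a large share of the repository.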
SRU
About SRU: Library of Congress standards
Supported metadata formats:
Base URL: http://libris.kb.se/sru/swepub
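A search against the SRU base URL above might be constructed as in the following Python sketch. The parameter names (version, operation, query, startRecord, maximumRecords) are those defined by the SRU standard. The version value "1.1" and the example query are assumptions, and the function name search_retrieve_url is hypothetical.

```python
from urllib.parse import urlencode

# Base URL as given in this document.
SRU_BASE_URL = "http://libris.kb.se/sru/swepub"


def search_retrieve_url(cql_query, start_record=1, maximum_records=10):
    """Build an SRU searchRetrieve request URL (hypothetical helper)."""
    params = {
        "version": "1.1",  # assumed; check which SRU version the server accepts
        "operation": "searchRetrieve",
        "query": cql_query,  # a CQL query string
        "startRecord": start_record,
        "maximumRecords": maximum_records,
    }
    return SRU_BASE_URL + "?" + urlencode(params)


print(search_retrieve_url("climate change", maximum_records=5))
```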
Xsearch lightweight API
About Xsearch: http://librishelp.libris.kb.se/help/xsearch_eng.jsp
Supported metadata formats: http://librishelp.libris.kb.se/help/xsearch_eng.jsp
Base URL: http://libris.kb.se/xsearch?database=swepub
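An Xsearch request is an ordinary HTTP GET against the base URL above with further query parameters appended. The Python sketch below assumes the parameter names query, format and n (number of hits); these are not stated in this document and should be checked against the Xsearch help page linked above. The function name xsearch_url is hypothetical.

```python
from urllib.parse import urlencode

# Base URL as given in this document; it already selects the SwePub database.
XSEARCH_BASE_URL = "http://libris.kb.se/xsearch?database=swepub"


def xsearch_url(query, fmt="json", n=10):
    """Build an Xsearch request URL (hypothetical helper).

    The parameter names "query", "format" and "n" are assumptions.
    """
    extra = urlencode({"query": query, "format": fmt, "n": n})
    # The base URL already contains a query string, so append with "&".
    return XSEARCH_BASE_URL + "&" + extra


print(xsearch_url("open access"))
```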
Data dumps
The data is available via FTP as nightly dumps.
There are two dumps. The first contains the data as provided by the local repositories (i.e. the same data that is available via OAI-PMH). In the second, duplicate records have been merged (this is the data used for loading SwePub.kb.se).
The dumps are formatted as OAI-PMH "ListRecords" responses with SwePub-modified MODS as the metadata format (see the OAI-PMH section above). For the deduplicated data, a repeatable non-standard element, <identifier2>, in the OAI-PMH record header lists the identifiers that make up a duplicate tuple.
Update frequency: every night.
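To show how the <identifier2> elements in the deduplicated dump could be read, the following Python sketch collects them per record header. The sample XML is invented for illustration and is not taken from a real dump. The namespace of <identifier2> is not specified here, so the code matches elements by local name only. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a "{namespace}" prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def duplicate_tuples(list_records_xml):
    """Map each record's <identifier> to its list of <identifier2> values."""
    root = ET.fromstring(list_records_xml)
    tuples = {}
    for element in root.iter():
        if local_name(element.tag) != "header":
            continue
        primary = None
        merged = []
        for child in element:
            name = local_name(child.tag)
            text = (child.text or "").strip()
            if name == "identifier":
                primary = text
            elif name == "identifier2":
                merged.append(text)
        if primary is not None:
            tuples[primary] = merged
    return tuples


# Invented sample; real identifiers and element placement may differ.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:1</identifier>
        <datestamp>2010-05-18</datestamp>
        <identifier2>oai:example.org:1</identifier2>
        <identifier2>oai:example.net:42</identifier2>
      </header>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(duplicate_tuples(SAMPLE))
```

For a full nightly dump, a streaming parser such as xml.etree.ElementTree.iterparse would probably be a better fit than loading the whole file into memory.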
Last updated:
2010-05-18
Responsible for content:
Marja Haapalainen, e-mail: fornamn.efternamn@kb.se