I have written a wrapper that exposes openlibrary.org API data as RDF. It is written in Python and deployed on Google App Engine. It is only an illustration of Linked Data publication, put online to gather early feedback, so do not rely on it in your application, since the URL or the content will change in the future.
How to use it :
- The wrapper is deployed at http://olrdf.appspot.com
- There are 2 ways to call it :
- with « /isbn/<an_isbn> », give the ISBN (ISBN 10 or ISBN 13) of the book you want to get data for. Example : http://olrdf.appspot.com/isbn/2070394433
- with « /key/<openlibrary_key> », give the Open Library key of the book you want to get data for. Example : http://olrdf.appspot.com/key/b/OL5218098M
- The service uses content negotiation, so if your browser does not accept the « application/rdf+xml » content type, you will be redirected to the corresponding Open Library page for the book. To avoid that :
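For instance, a non-browser client can ask for RDF explicitly by sending an « Accept: application/rdf+xml » header. The following is a minimal sketch using only the Python standard library. The helper names `build_request` and `fetch_rdf` are illustrative, and the sketch assumes the service honours a plain Accept header as described above.

```python
import urllib.request

BASE_URL = "http://olrdf.appspot.com"  # wrapper root, as given above


def build_request(path):
    """Build a GET request for a path such as '/isbn/2070394433', asking for RDF/XML."""
    url = BASE_URL + "/" + path.lstrip("/")
    # The Accept header is what drives content negotiation on the server side.
    return urllib.request.Request(url, headers={"Accept": "application/rdf+xml"})


def fetch_rdf(path, timeout=10):
    """Send the request over the network and return the response body as text."""
    with urllib.request.urlopen(build_request(path), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Under that assumption, `fetch_rdf("/isbn/2070394433")` or `fetch_rdf("/key/b/OL5218098M")` should return the RDF/XML description instead of following the redirect to the Open Library page.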
Technical information :
- It is written in Python;
- It is deployed on Google App Engine;
- It uses and depends on the following libraries :
Linked Data Sets :
The following data sets are referenced in the data, with the following heuristics :
- The wrapper references itself, in properties such as « authors », or « rdf:type ».
- heuristic : the author or type open library key is appended to the root URL of the wrapper : « http://olrdf.appspot.com/key/ »
- The lingvoj dataset, for languages
- heuristic : the Open Library language key (« /l/eng » for example) is parsed, the 3-letter language code is extracted, and an attempt is made to find the corresponding 2-letter language code, using a hardcoded mapping table. The 2-letter code is then appended to the root URI « http://www.lingvoj.org/lang/ ». If no 2-letter code is found, the 3-letter code is appended to the same root URI instead.
- The RDF book mashup, in a « owl:sameAs » property
- heuristic : append the ISBN10 to the URI prefix « http://www4.wiwiss.fu-berlin.de/bookmashup/books/ »
- (In addition, the wrapper used to call the lcsh.info SPARQL endpoint to try to match the « subject » property to a corresponding Library of Congress Subject Heading. However, this service has recently been shut down…)
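The three URI heuristics above can be sketched as follows. This is an illustration, not the wrapper's actual code: the function names are hypothetical, and the mapping table shows only a few sample entries of what a real hardcoded table would contain.

```python
WRAPPER_ROOT = "http://olrdf.appspot.com/key/"
LINGVOJ_ROOT = "http://www.lingvoj.org/lang/"
BOOKMASHUP_ROOT = "http://www4.wiwiss.fu-berlin.de/bookmashup/books/"

# Illustrative subset of a 3-letter to 2-letter language code table (assumption).
THREE_TO_TWO = {"eng": "en", "fre": "fr", "ger": "de", "spa": "es"}


def wrapper_uri(ol_key):
    """Append an Open Library key (e.g. '/b/OL5218098M') to the wrapper root."""
    return WRAPPER_ROOT + ol_key.lstrip("/")


def lingvoj_uri(language_key):
    """Turn a language key such as '/l/eng' into a lingvoj URI."""
    code3 = language_key.rstrip("/").split("/")[-1]
    # Prefer the 2-letter code; fall back to the 3-letter code when unmapped.
    return LINGVOJ_ROOT + THREE_TO_TWO.get(code3, code3)


def bookmashup_uri(isbn10):
    """Append an ISBN-10 to the RDF book mashup URI prefix."""
    return BOOKMASHUP_ROOT + isbn10
```

With this sketch, « /l/eng » maps to « http://www.lingvoj.org/lang/en », while a code absent from the table is appended unchanged.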
Ontology and properties used :
The following properties are used in the generated RDF :
- Dublin Core (namespace « http://purl.org/dc/elements/1.1/ ») : publisher, subject, etc.
- Dublin Core terms (namespace « http://purl.org/dc/terms/ ») : alternative, tableOfContents, format, modified, etc.
- Bibliographic ontology (namespace « http://purl.org/ontology/bibo/ ») : edition, authorList, oclcnum, isbn10, isbn13, etc.
- Plus of course, rdf, rdfs and owl.
- For all properties that do not fall into one of those vocabularies, a property in the wrapper namespace (« http://olrdf.appspot.com/key/ ») is generated. No formal description of this ontology is provided.
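The namespace choice described above amounts to a lookup with a fallback, which can be sketched as below. The table lists only some of the properties named in this post, the function name `property_uri` is hypothetical, and « some_other_field » is a made-up name standing in for any unmapped property. The complete mapping is not reproduced here.

```python
DC = "http://purl.org/dc/elements/1.1/"
DCTERMS = "http://purl.org/dc/terms/"
BIBO = "http://purl.org/ontology/bibo/"
WRAPPER = "http://olrdf.appspot.com/key/"

# Partial table: property local name -> namespace it belongs to.
KNOWN_PROPERTIES = {
    "publisher": DC,
    "subject": DC,
    "alternative": DCTERMS,
    "tableOfContents": DCTERMS,
    "format": DCTERMS,
    "modified": DCTERMS,
    "edition": BIBO,
    "authorList": BIBO,
    "oclcnum": BIBO,
}


def property_uri(name):
    """Return the full property URI, falling back to the wrapper namespace."""
    return KNOWN_PROPERTIES.get(name, WRAPPER) + name


print(property_uri("publisher"))
print(property_uri("some_other_field"))
```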
An exhaustive description of how the Open Library data is mapped to RDF properties can be found on this Google spreadsheet.
Feedback on this work is more than welcome. The Python code is available on request. Potential improvements are :
- linking subjects to another controlled vocabulary now that lcsh.info is dead;
- linking authors to DBPedia;
- linking countries to geonames;
- defining a clear ontology of properties.