This document specifies the general format of nanopublications.

Introduction

Good descriptions of data are essential to finding, understanding, and ultimately reusing data. Here, we describe nanopublications [1,2,3], a community-driven approach to representing structured data along with its provenance as small, self-contained, and citable entities. A nanopublication consists of an assertion, the provenance of the assertion (simply called "provenance"), and the provenance of the whole nanopublication (called "publication info"). Nanopublications are implemented in and aligned with Semantic Web technologies, such as RDF, OWL, and SPARQL. The different levels of provenance enable users to assess the trustworthiness of data and provide a mechanism by which authors and institutions may be acknowledged for their contribution to the global knowledge graph. Nanopublications may be used to expose quantitative and qualitative data, as well as hypotheses, claims, negative results, and opinions that usually go unpublished. With nanopublications, it is possible to disseminate individual data as independent publications, with or without an accompanying research article.

This document describes the structure of nanopublications and offers guidelines on their composition, implementation, and use. Related code can be found on GitHub.

Basic Elements

Each of the three parts of a nanopublication (i.e. the assertion, provenance, and publication info) is represented as an RDF graph [4]. An RDF graph is a collection of RDF triples, each of which is a subject-predicate-object tuple. We recommend using TriG syntax for writing nanopublications. The assertion graph contains the statements that form the main claim of the nanopublication. Examples of assertions include:

:assertion {
  ex:trastuzumab ex:is-indicated-for ex:breast-cancer .
}

:assertion {
  ex:BRCA1-gene ex:is-involved-in ex:breast-cancer .
  ex:BRCA1-gene ex:encodes ex:BRCA1-protein .
  ex:BRCA1-protein ex:is-expressed-in ex:breast .
}

The provenance graph of the nanopublication contains one or more RDF triples that provide information about the assertion. The provenance graph of a nanopublication MUST contain a link to the assertion graph identifier. Provenance means "how this came to be" and may include any statement about how the assertion was generated, who generated it, when it was generated, where it was obtained from, and any other similar information. Examples of assertion provenance include:

:provenance {
  :assertion prov:wasDerivedFrom :experiment . 
  :assertion prov:wasAttributedTo orcid:0000-0003-3934-0072 .
}

The publication information graph contains one or more RDF triples that offer provenance information regarding the nanopublication itself. The subject of the triples in the publication info graph MUST be the nanopublication URI, and the graph SHOULD contain attribution and a timestamp. Examples of nanopublication provenance include:

:pubinfo {
  : dct:creator orcid:0000-0003-0183-6910 .
  : dct:created "2020-07-10T10:20:22.382+02:00"^^xsd:dateTime  .
}
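
As a small illustration of working with the timestamp, the following Python sketch parses the dct:created literal from the example above. It is not part of the specification. It relies on datetime.fromisoformat, which accepts only a subset of the xsd:dateTime lexical forms (for instance, a trailing "Z" is rejected before Python 3.11), so treating it as an xsd:dateTime parser is an assumption made for the example.

```python
from datetime import datetime, timedelta

# Lexical value of the dct:created literal in the example pubinfo graph.
created_literal = "2020-07-10T10:20:22.382+02:00"

# Parse the literal; this form (milliseconds plus numeric offset) is accepted
# by datetime.fromisoformat in Python 3.7 and later.
created = datetime.fromisoformat(created_literal)

# A timestamp with an explicit offset can be compared across time zones.
offset = created.utcoffset()
has_timezone = offset is not None
```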

The nanopublication itself has its own identifier (here denoted by :) and is linked to its parts via triples in an additional head graph:

:Head {
  : a np:Nanopublication .
  : np:hasAssertion :assertion .
  : np:hasProvenance :provenance .
  : np:hasPublicationInfo :pubinfo .
}
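
The links between the head graph and the three parts can be checked mechanically. The following is a minimal Python sketch, not part of the specification. It models each named graph as a set of (subject, predicate, object) string tuples; the helper names find_parts and check_links, the abbreviated predicate strings, and the base URI are hypothetical. It covers only the two MUST statements given above for the provenance and publication info graphs.

```python
NP = "http://www.nanopub.org/nschema#"
BASE = "http://example.org/pub1/"  # hypothetical nanopublication URI

# Each named graph is modelled as a set of (subject, predicate, object) strings.
nanopub = {
    BASE + "Head": {
        (BASE, "rdf:type", NP + "Nanopublication"),
        (BASE, NP + "hasAssertion", BASE + "assertion"),
        (BASE, NP + "hasProvenance", BASE + "provenance"),
        (BASE, NP + "hasPublicationInfo", BASE + "pubinfo"),
    },
    BASE + "assertion": {
        ("ex:trastuzumab", "ex:is-indicated-for", "ex:breast-cancer"),
    },
    BASE + "provenance": {
        (BASE + "assertion", "prov:wasDerivedFrom", BASE + "experiment"),
    },
    BASE + "pubinfo": {
        (BASE, "dct:creator", "orcid:0000-0003-0183-6910"),
    },
}


def find_parts(head, np_uri):
    """Return the graph identifiers that the head graph links to np_uri."""
    parts = {}
    for s, p, o in head:
        if s == np_uri and p.startswith(NP + "has"):
            parts[p[len(NP):]] = o
    return parts


def check_links(graphs, head_id, np_uri):
    """Check that provenance mentions the assertion and pubinfo describes np_uri."""
    parts = find_parts(graphs[head_id], np_uri)
    assertion_id = parts["hasAssertion"]
    provenance = graphs[parts["hasProvenance"]]
    pubinfo = graphs[parts["hasPublicationInfo"]]
    # The provenance graph must contain a link to the assertion graph identifier.
    prov_ok = any(assertion_id in (s, o) for s, p, o in provenance)
    # The pubinfo graph must be non-empty and its triples must have np_uri as subject.
    pubinfo_ok = len(pubinfo) > 0 and all(s == np_uri for s, p, o in pubinfo)
    return prov_ok and pubinfo_ok


links_ok = check_links(nanopub, BASE + "Head", BASE)
```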

Nanopublication Ontology

The structure of a nanopublication is defined by the following ontology using the Web Ontology Language (OWL). The namespace is http://www.nanopub.org/nschema# .

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix rdfg: <http://www.w3.org/2004/03/trix/rdfg-1/>.
@prefix owl: <http://www.w3.org/2002/07/owl#>.
@prefix np: <http://www.nanopub.org/nschema#>.

np:Nanopublication rdf:type owl:Class.
np:Assertion rdfs:subClassOf rdfg:Graph.
np:Provenance rdfs:subClassOf rdfg:Graph.
np:PublicationInfo rdfs:subClassOf rdfg:Graph.

np:hasAssertion a owl:FunctionalProperty.
np:hasAssertion rdfs:domain np:Nanopublication.
np:hasAssertion rdfs:range np:Assertion.

np:hasProvenance a owl:FunctionalProperty.
np:hasProvenance rdfs:domain np:Nanopublication.
np:hasProvenance rdfs:range np:Provenance.

np:hasPublicationInfo a owl:FunctionalProperty.
np:hasPublicationInfo rdfs:domain np:Nanopublication.
np:hasPublicationInfo rdfs:range np:PublicationInfo. 
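
Because the three linking properties are declared as owl:FunctionalProperty, a nanopublication has at most one value for each of them. As a rough illustration, the following Python sketch counts the distinct values per property in a head graph. It treats functionality as a closed-world check, which is an assumption made for the example (an OWL reasoner would instead infer that multiple values denote the same resource). The function name non_functional_links and the ex: identifiers are hypothetical.

```python
from collections import defaultdict

NP = "http://www.nanopub.org/nschema#"
FUNCTIONAL = {NP + "hasAssertion", NP + "hasProvenance", NP + "hasPublicationInfo"}


def non_functional_links(head_triples):
    """Return {(subject, property): values} where a functional property has several values."""
    values = defaultdict(set)
    for s, p, o in head_triples:
        if p in FUNCTIONAL:
            values[(s, p)].add(o)
    return {key: objs for key, objs in values.items() if len(objs) > 1}


# Hypothetical head graphs, written as (subject, predicate, object) strings.
good_head = [
    ("ex:pub1", NP + "hasAssertion", "ex:pub1/assertion"),
    ("ex:pub1", NP + "hasProvenance", "ex:pub1/provenance"),
    ("ex:pub1", NP + "hasPublicationInfo", "ex:pub1/pubinfo"),
]
bad_head = good_head + [("ex:pub1", NP + "hasAssertion", "ex:pub1/other-assertion")]

good_result = non_functional_links(good_head)  # no property has two values
bad_result = non_functional_links(bad_head)    # hasAssertion has two values
```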

Well-formed Nanopublications

A nanopublication MUST comply with all of the following criteria to be considered well-formed:

This is an example of a well-formed nanopublication in TriG notation:

@prefix : <http://example.org/pub1/> .
@prefix ex: <http://example.org/> .
@prefix np: <http://www.nanopub.org/nschema#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix orcid: <https://orcid.org/> .

:Head {
  : a np:Nanopublication .
  : np:hasAssertion :assertion .
  : np:hasProvenance :provenance .
  : np:hasPublicationInfo :pubinfo .
}

:assertion {
  ex:trastuzumab ex:is-indicated-for ex:breast-cancer .
}

:provenance {
  :assertion prov:wasDerivedFrom :experiment . 
  :assertion prov:wasAttributedTo orcid:0000-0003-3934-0072 .
}

:pubinfo {
  : dct:creator orcid:0000-0003-0183-6910 .
  : dct:created "2020-07-10T10:20:22.382+02:00"^^xsd:dateTime  .
}

Query Template

To extract an entire nanopublication from a triple store, the following SPARQL query template can be used:

prefix np: <http://www.nanopub.org/nschema#>
prefix : <...>
select ?G ?S ?P ?O where {
  {graph ?G {: a np:Nanopublication}} union
  {graph ?H {: a np:Nanopublication {: np:hasAssertion ?G} union {: np:hasProvenance ?G} union {: np:hasPublicationInfo ?G}}}
  graph ?G {?S ?P ?O}
}
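
The "prefix : <...>" line is the only part of the template that changes between nanopublications. A small Python sketch of filling it in follows. The query text is copied from the template above; the function name nanopub_query, the character check, and the URI in the usage line are hypothetical choices made for the example.

```python
QUERY_TEMPLATE = """\
prefix np: <http://www.nanopub.org/nschema#>
prefix : <...>
select ?G ?S ?P ?O where {
  {graph ?G {: a np:Nanopublication}} union
  {graph ?H {: a np:Nanopublication {: np:hasAssertion ?G} union {: np:hasProvenance ?G} union {: np:hasPublicationInfo ?G}}}
  graph ?G {?S ?P ?O}
}
"""

# Characters that may not appear inside a SPARQL IRI reference (<...>).
FORBIDDEN = set('<>"{}|^`\\')


def nanopub_query(uri):
    """Return the query template with the empty prefix bound to the given URI."""
    if any(ch in FORBIDDEN or ch.isspace() for ch in uri):
        raise ValueError("not usable as an IRI in a SPARQL prefix: %r" % uri)
    return QUERY_TEMPLATE.replace("<...>", "<" + uri + ">")


query = nanopub_query("http://example.org/pub1/")
```

The resulting string can be sent to any SPARQL endpoint that holds the nanopublication's named graphs.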

Integrity Key

The goal of the integrity key is to establish an identifier that can be used to check whether a nanopublication has changed, thus enforcing the immutability of nanopublications. Trusty URIs [5] are the recommended way of assigning integrity keys to nanopublications. This is the example nanopublication from the section on well-formed nanopublications after generating and attaching a trusty URI:

@prefix this: <http://example.org/pub1/RA-0Yc_l8rK3_Ts8y7kPuZvg6FqzaOSSq0yMSS9Sg4R9I> .
@prefix sub: <http://example.org/pub1/RA-0Yc_l8rK3_Ts8y7kPuZvg6FqzaOSSq0yMSS9Sg4R9I#> .
@prefix ex: <http://example.org/> .
@prefix np: <http://www.nanopub.org/nschema#> .
@prefix prov: <http://www.w3.org/ns/prov#> . 
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix orcid: <https://orcid.org/> .

sub:Head {
  this: np:hasAssertion sub:assertion;
    np:hasProvenance sub:provenance;
    np:hasPublicationInfo sub:pubinfo;
    a np:Nanopublication .
}

sub:assertion {
  ex:trastuzumab ex:is-indicated-for ex:breast-cancer .
}

sub:provenance {
  sub:assertion prov:wasAttributedTo orcid:0000-0003-3934-0072;
    prov:wasDerivedFrom sub:experiment .
}

sub:pubinfo {
  this: dct:created "2020-07-10T10:20:22.382+02:00"^^xsd:dateTime;
    dct:creator orcid:0000-0003-0183-6910 .
}
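
To illustrate why a content-derived key makes changes detectable, the following Python sketch hashes a sorted serialization of a nanopublication's quads and compares the key before and after a modification. This is a simplified stand-in and not the Trusty URI algorithm [5], which defines its own normalization and encoding. The function name content_key, the tab-separated serialization, and the modified triple are assumptions made for the example.

```python
import base64
import hashlib


def content_key(quads):
    """Hash a set of (graph, subject, predicate, object) strings, independent of order."""
    lines = sorted("\t".join(q) for q in quads)
    digest = hashlib.sha256("\n".join(lines).encode("utf-8")).digest()
    # URL-safe Base64 without padding, so the key can be embedded in a URI.
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


original = {
    ("assertion", "ex:trastuzumab", "ex:is-indicated-for", "ex:breast-cancer"),
    ("pubinfo", "this:", "dct:creator", "orcid:0000-0003-0183-6910"),
}
# Hypothetical tampered copy: the object of the assertion triple was changed.
modified = {
    ("assertion", "ex:trastuzumab", "ex:is-indicated-for", "ex:lung-cancer"),
    ("pubinfo", "this:", "dct:creator", "orcid:0000-0003-0183-6910"),
}

key = content_key(original)
same = content_key(set(original)) == key   # unchanged content gives the same key
changed = content_key(modified) != key     # any edit gives a different key
```

Anyone holding the key can recompute it from the content they retrieved and reject the content if the two values differ.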

Further Information

Please check http://www.nanopub.org for further guides and community information.

References