IRI Strategy

Author: Bart Kleijngeld (Alliander)
Editors: Ritger Teunissen (Alliander)
Robbert Hardin (Alliander)
Version: v0.draft
Feedback: Issue on GitHub (Netbeheer-Nederland/doc-iri-strategy)

Abstract

This document lays out a standardized way for assigning IRIs (Internationalized Resource Identifiers) to resources for the power and utilities industry in the Netherlands.

Assigning IRIs to resources (e.g. models, terms and real-world objects) enables their global identification on the web. This is a significant part of making data FAIR, specifically by adhering to the Linked Data principles, which make it possible to link data and models easily and reliably.

Background

A well-defined IRI strategy is crucial to ensuring that resources on the web are uniquely and consistently identified, enabling the reliable linking, discovery, and integration of data. As the industry generates and shares large volumes of data across various stakeholders, a standardized IRI strategy is essential to achieving data interoperability, scalability, and long-term sustainability.

A core challenge within the industry is making data FAIR: Findable, Accessible, Interoperable and Reusable. The use of Linked Data enables the seamless connection of diverse data sources, ensuring that data is discoverable, accessible, and machine-readable. The adoption of Linked Data within the power and utilities industry delivers the following key benefits:

interoperability: ensures that data can be exchanged across different systems, independent of technology, enhancing integration within the power and utilities industry.
discoverability: facilitates better data discovery by creating structured IRIs, making it easier for stakeholders to identify and access relevant information.
machine-readability: enables machines to interpret and process data automatically, supporting advanced querying, analysis, and decision-making, which improves operational efficiency.

This IRI strategy is essential to ensuring the consistent identification and linkage of resources across the industry, advancing the broader goal of making data FAIR. This will contribute to a more connected, efficient, and sustainable power and utilities data ecosystem.

Conformance

The key words MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL in this document are to be interpreted as described in [RFC2119].

Scope

This document is normative for — and only for — creating IRIs for resources defined or maintained by Netbeheer Nederland (NBNL).

Naming conventions for the variety of categories of resources are not part of the scope of this document, and neither are versioning strategies.

Resources

A resource can be anything we wish to state something about on the web: a certain web page, a company or the color "red". To make statements about a resource, one can refer to it by its IRI.

Resource kinds

Following Cool URIs for the Semantic Web [COOL-URIS] (among others), we distinguish between three kinds of resources: documentation resources, data resources, and real-world entities or concepts.

Each of these will be explained in the following sections.

Information resources

Traditionally, all resources were web documents of some kind or another, and the URLs locating them can serve as their identifying IRIs. More formally, those kinds of resources are called information resources (colloquially referred to as web documents). In this document we further distinguish between two types of information resources:

documentation resources, colloquially referred to as web pages, indicated by the www subdomain (e.g. https://www.nbnl.info/data-product/netburen)
data resources indicated by the data subdomain (e.g. https://data.nbnl.info/register/substation/001231)

Real-world entities or concepts

Since the dawn of the Semantic Web, however, we want (anyone) to be able to say anything about anything. This includes the ability to make statements about entities or concepts in the real world. Such resources, however, are not information resources and are not on the web. To be able to state things about them, they need to have IRIs, but since they are not information resources, they are recognized as their own resource kind with a dedicated subdomain:

real-world entities or concepts (indicated by the id subdomain)

If you get confused, just remember:

Use id if and only if the resource refers to a real-world concept or entity.

Ontology terms represent real-world concepts and are therefore of the id kind. A validation class or SHACL shape, on the other hand, is a mere information resource and has no connection to the real world.

Identity and representations

The IRI that identifies a resource will sometimes be referred to as its identity IRI, to distinguish it explicitly from IRIs that identify representations of the resource.

We must be careful about what an IRI identifies: the thing itself, or a document describing the thing? Conflating the two leads to ambiguous statements. The following example might help grasp this point.

Example: Cats are not descriptions of cats

Suppose I want to say things about cats.

First, the real-world concept of a cat needs an IRI so that it becomes a resource on the web I can refer to:

Cat	`https://id.example.com/animals/Cat`

The animals part of the IRI reflects the ontology in which Cat is defined. The data resource describing this concept can be found at:

Model of cats	`https://data.example.com/animals/Cat`

Finally, there could exist a web page representation with information about cats as well:

Web page about cats	`https://www.example.com/animals/Cat`

Other models and web pages can make statements about cats and use the definition of the ontology above simply by using the https://id.example.com/animals/Cat IRI to refer to it.

Takeaway point

Things and documents providing descriptions of those things are different (kinds of) resources, each with their own IRI.
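
The relationship between the three IRIs in the cat example can be sketched in code. The snippet below is only an illustration. It assumes, as in the example, that the representations of a resource live at the same path under the `data` and `www` subdomains; the function name `representation_iris` is hypothetical and not part of this strategy.

```python
from urllib.parse import urlsplit, urlunsplit


def representation_iris(identity_iri: str) -> dict[str, str]:
    """Derive the data and web page IRIs that share a path with an identity IRI.

    Assumption for illustration: the host has the form '{kind}.{domain}' and
    representations live at the same path under 'data' and 'www'.
    """
    parts = urlsplit(identity_iri)
    kind, _, domain = parts.netloc.partition(".")
    if kind != "id":
        raise ValueError(f"expected an 'id' IRI, got kind {kind!r}")
    # Swap only the leading host label; scheme and path stay untouched.
    return {
        rep_kind: urlunsplit(parts._replace(netloc=f"{rep_kind}.{domain}"))
        for rep_kind in ("data", "www")
    }


cat = "https://id.example.com/animals/Cat"
for rep_kind, iri in representation_iris(cat).items():
    print(rep_kind, iri)
```

The identity IRI stays the only IRI that other models need to know; the two derived IRIs identify documents about the cat, never the cat itself.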

Content negotiation

Content negotiation (or conneg) makes it possible to negotiate what representation to obtain when retrieving a resource.

This becomes a powerful mechanism when retrieving resources using their identity IRI. Based on the request headers provided by the user agent, the server negotiates the appropriate representation and answers with a 303 redirect to the URL at which that representation is served. Typical parameters for negotiating content are language, media type or format, and version.

Figure 1. Example content negotiation for a real-world entity resource

Note that information resources can have several representations too, even though they are already informational in essence. As in the previous example, the identity IRI is used for retrieval:

Figure 2. Example content negotiation for an information resource

It is especially common for data resources such as ontologies and data products to also have documentation for humans.

Of course, if one knows the IRI of some desired representation, this IRI can be used directly instead of using conneg.
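
The redirect decision described in this section could look roughly like the sketch below. It is not a prescribed implementation. The mapping from media types to resource kinds (`KIND_BY_MEDIA_TYPE`) and the function names are assumptions made for illustration, and only the media type is negotiated here, not language or version.

```python
# Assumed mapping from media type to representation kind (illustrative only).
KIND_BY_MEDIA_TYPE = {
    "text/html": "www",
    "text/turtle": "data",
    "application/ld+json": "data",
}


def parse_accept(header: str) -> list[str]:
    """Return the media types of an Accept header, highest q-value first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        media_type, *params = (p.strip() for p in item.split(";"))
        if not media_type:
            continue
        q = 1.0
        for param in params:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        if q > 0:  # q=0 means "not acceptable"
            weighted.append((-q, position, media_type.lower()))
    return [media_type for _, _, media_type in sorted(weighted)]


def negotiate(identity_iri: str, accept: str) -> tuple[int, str]:
    """Pick a representation and answer with a 303 redirect to its IRI."""
    scheme, _, rest = identity_iri.partition("://")
    _, _, tail = rest.partition(".")  # drop the leading kind label, e.g. 'id'
    for media_type in parse_accept(accept):
        kind = KIND_BY_MEDIA_TYPE.get(media_type)
        if kind is not None:
            return 303, f"{scheme}://{kind}.{tail}"
    return 406, ""  # Not Acceptable: no known representation matches


print(negotiate("https://id.nbnl.info/register/substation/001231", "text/html"))
```

Wildcards such as `*/*` are deliberately left out to keep the sketch short; a real server would also need a default representation for them.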

IRI syntax

Resources and their representations are identified by IRIs, each of which MUST be of the following syntax:

Base IRI syntax

https://{kind}.nbnl.info/{category}/{namespace}[/{version}][/{reference}]

`{kind}`	Resource kind. MUST be one of: `data` \| `id` \| `www`
`{category}`	Resource category. SHOULD be one of: `data-product` \| `documentation` \| `ontology` \| `register` \| `schema` \| `thesaurus`
`{namespace}`	Path which encodes the namespace of the resource. This can be nested as deeply as necessary, and has no formal (nor machine-readable) meaning.
`{version}`	Version specifier (if applicable).
`{reference}`	Local name of some referent in the namespace (if applicable).
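
As an illustration of the base syntax, the following sketch assembles an IRI from its components and enforces the MUST on `{kind}`, while only warning about the SHOULD on `{category}`. The function name `build_iri` and its parameters are hypothetical; the category-specific sections below refine which components apply.

```python
import warnings
from typing import Optional

KINDS = {"data", "id", "www"}  # {kind} MUST be one of these
RECOMMENDED_CATEGORIES = {     # {category} SHOULD be one of these
    "data-product", "documentation", "ontology",
    "register", "schema", "thesaurus",
}


def build_iri(kind: str, category: str, namespace: str,
              version: Optional[str] = None,
              reference: Optional[str] = None) -> str:
    """Assemble https://{kind}.nbnl.info/{category}/{namespace}[/{version}][/{reference}]."""
    if kind not in KINDS:
        raise ValueError(f"kind MUST be one of {sorted(KINDS)}, got {kind!r}")
    if category not in RECOMMENDED_CATEGORIES:
        warnings.warn(f"category {category!r} is not a recommended category")
    # The namespace may be nested; empty segments are dropped.
    segments = [category] + [s for s in namespace.split("/") if s]
    if version:
        segments.append(version)
    if reference:
        segments.append(reference)
    return f"https://{kind}.nbnl.info/" + "/".join(segments)


print(build_iri("id", "register", "substation", reference="001231"))
print(build_iri("www", "documentation", "modeling-guidelines",
                version="1.0.0", reference="cim-profiling"))
```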

Categories

Data product

Special category which contains data products.

This is much like a dedicated register, but data products are information resources, whereas register entities are not. Therefore, to avoid confusion, and because data products are important, this special category has been introduced.

IRI syntax for a data product

https://data.nbnl.info/data-product/{reference}[/{version}]

Resource kind	(`data`) data resource
Category	`data-product`
Namespace	n/a
Version (optional)	Data product version
Reference	Name of the data product

Example: Netburen

Table 1. Generic (version-less) data product IRIs
Data representation	`https://data.nbnl.info/data-product/netburen`
Identity	`https://data.nbnl.info/data-product/netburen`
Documentation representation	`https://www.nbnl.info/data-product/netburen`

Table 2. Versioned data product IRIs
Data representation	`https://data.nbnl.info/data-product/netburen/2.1.1`
Identity	`https://data.nbnl.info/data-product/netburen/2.1.1`
Documentation representation	`https://www.nbnl.info/data-product/netburen/2.1.1`

Documentation

Documentation intended for reading by humans, not machines. A documentation project can be a single-page document, but it can also be a complex nested structure containing many pages and potentially many layers of organisation.

Do not confuse the category documentation with documentation representations as obtained through using www IRIs. See also: ^[1]

IRI syntax for a project

https://www.nbnl.info/documentation/{namespace}[/{version}]

IRI syntax for a part

https://www.nbnl.info/documentation/{namespace}[/{version}]/{reference}

	Project	Part
Resource kind	(`www`) documentation resource
Category	`documentation` ^[1]
Namespace	Namespace identifying the project
Version (optional)	Project version	n/a
Reference	n/a	Name (local to the project) of the part (e.g. page)

Example: Modeling Guidelines

Table 3. Generic (version-less) documentation IRIs
Documentation representation	`https://www.nbnl.info/documentation/modeling-guidelines`
Identity	`https://www.nbnl.info/documentation/modeling-guidelines`
Part identity	`https://www.nbnl.info/documentation/modeling-guidelines/cim-profiling`
Documentation part representation

Table 4. Versioned documentation IRIs
Documentation representation
Identity	`https://www.nbnl.info/documentation/modeling-guidelines/1.0.0`
Part identity	`https://www.nbnl.info/documentation/modeling-guidelines/1.0.0/cim-profiling`
Documentation part representation

Models

IRI syntax for a model

https://data.nbnl.info/{category}/{namespace}[/{version}]

IRI syntax for a model element

https://id.nbnl.info/{category}/{namespace}[/{version}]/{reference}

	Model	Element
Resource kind	(`data`) data resource	(`id`) real-world concept if category is `ontology` or `thesaurus`; (`data`) data resource if category is `schema`
Category	`ontology` \| `schema` \| `thesaurus`
Namespace	Namespace identifying the model
Version (optional)	Model version	n/a
Reference	n/a	Name (local to the model) of the element

Generic (version-less) models are information resources too. A generic model can be completely described by information such as its name, purpose and owner, and which versions of it exist (this is one of the ways DCAT recommends managing versions).

Never specify versions in the IRIs of model elements which represent a real-world concept, not even the model version.
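
The rules for model elements can be summarised in a small sketch: the resource kind follows from the category, and a version is refused whenever the element represents a real-world concept. The names `ELEMENT_KIND` and `element_iri` are hypothetical, and the element name used in the schema example is only illustrative.

```python
from typing import Optional

# Resource kind of a model element per category, following the table above.
ELEMENT_KIND = {"ontology": "id", "thesaurus": "id", "schema": "data"}


def element_iri(category: str, namespace: str, reference: str,
                version: Optional[str] = None) -> str:
    """Build the IRI of a model element, enforcing the no-version rule for 'id'."""
    kind = ELEMENT_KIND[category]  # KeyError for categories that are not models
    if kind == "id" and version is not None:
        raise ValueError("IRIs of real-world concepts must not carry a version")
    path = [category, namespace]
    if version is not None:
        path.append(version)
    path.append(reference)
    return f"https://{kind}.nbnl.info/" + "/".join(path)


print(element_iri("ontology", "cim-nl", "EAN18Code"))
print(element_iri("schema", "data-product/netburen",
                  "MarketEvaluationPoint", version="1.0.1"))
```

Raising an error instead of silently dropping the version makes accidental versioned concept IRIs visible early.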

Example: Data product Netburen schema

Table 5. Generic (version-less) data product schema IRIs
Data representation	`https://data.nbnl.info/schema/data-product/netburen`
Identity	`https://data.nbnl.info/schema/data-product/netburen`
Schema documentation	`https://www.nbnl.info/schema/data-product/netburen`

Table 6. Versioned data product schema IRIs
Data representation	`https://data.nbnl.info/schema/data-product/netburen/2.1.1`
Identity	`https://data.nbnl.info/schema/data-product/netburen/2.1.1`
Documentation representation	`https://www.nbnl.info/schema/data-product/netburen/2.1.1`

Suppose we are composing a schema for the data product Netburen.

LinkML schema

id: https://data.nbnl.info/schema/data-product/netburen  # (1)
version: 1.0.1  # (2)
name: netburen  # (3)
prefixes:  # (4)
  nbnl: https://id.nbnl.info/thesaurus/energiesysteembeheer/
  cimnl: https://id.nbnl.info/ontology/cim-nl/
  cim: https://cim.ucaiug.io/ns#
classes:
  MarketEvaluationPoint:
    class_uri: cim:MarketEvaluationPoint  # (5)
    description: The identification of an entity where energy products are measured
      or computed.
    from_schema: https://cim.ucaiug.io/ns#TC57CIM.IEC62325.MarketManagement
    exact_mappings:
    - nbnl:aansluiting  # (6)
    is_a: UsagePoint

1	The generic IRI of the schema, i.e. not a specific version. Note that we have encoded the fact that this schema is tightly coupled to a data product in the namespace of the IRI.
2	The version of the schema this document represents.
3	Technical name of the schema. This has local scope, but it is good practice to keep it equal to the model name in the IRI.
4	Prefixes for easily referring to terms from other models. Note the use of the `id` subdomain, because ontology and thesaurus terms represent real-world concepts. Also: never use versions in IRIs which represent real-world concepts.
5	Referring to a term from an ontology.
6	Referring to a term from a thesaurus.

Example: CIM NL ontology extension

Table 7. Generic (version-less) ontology IRIs
	Ontology	Term
Identity	`https://data.nbnl.info/ontology/cim-nl`	`https://id.nbnl.info/ontology/cim-nl/EAN18Code`
Data representation	`https://data.nbnl.info/ontology/cim-nl`	`https://data.nbnl.info/ontology/cim-nl/EAN18Code`
Documentation representation	`https://www.nbnl.info/ontology/cim-nl`	`https://www.nbnl.info/ontology/cim-nl/EAN18Code`

Registers

A register is a logical container of resources which represent entities from the real-world domain. Each register contains one type of entity.

IRI syntax for a register

https://data.nbnl.info/register/{namespace}

IRI syntax for a register entity

https://id.nbnl.info/register/{namespace}/{reference}

	Register	Entity
Resource kind	(`data`) data resource	(`id`) real-world entity
Category	`register`
Namespace	Namespace identifying the register
Version	n/a
Reference	n/a	Reference name (local to the register) of the contained entity

Example: Substation register

Table 8. Register IRIs
Data representation	`https://data.nbnl.info/register/substation`
Identity	`https://data.nbnl.info/register/substation`
Documentation representation	`https://www.nbnl.info/register/substation`

Table 9. Entity IRIs
Identity	`https://id.nbnl.info/register/substation/001231`
Data representation	`https://data.nbnl.info/register/substation/001231`
Documentation representation	`https://www.nbnl.info/register/substation/001231`
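
Going the other way, an entity IRI can be split back into its register and its local reference. The sketch below assumes that the reference is the last path segment and contains no slash, which this document does not state explicitly. The names `RegisterEntity`, `parse_entity_iri` and `register_iri` are hypothetical.

```python
from typing import NamedTuple

ENTITY_PREFIX = "https://id.nbnl.info/register/"


class RegisterEntity(NamedTuple):
    register: str   # namespace identifying the register (may be nested)
    reference: str  # name of the entity, local to the register


def parse_entity_iri(iri: str) -> RegisterEntity:
    """Split a register entity IRI into register namespace and reference."""
    if not iri.startswith(ENTITY_PREFIX):
        raise ValueError(f"not a register entity IRI: {iri!r}")
    # Assumption: everything before the last '/' is the register namespace.
    register, _, reference = iri[len(ENTITY_PREFIX):].rpartition("/")
    if not register or not reference:
        raise ValueError(f"missing register or reference in {iri!r}")
    return RegisterEntity(register, reference)


def register_iri(entity: RegisterEntity) -> str:
    """IRI of the register (a data resource) that contains the entity."""
    return f"https://data.nbnl.info/register/{entity.register}"


entity = parse_entity_iri("https://id.nbnl.info/register/substation/001231")
print(entity.register, entity.reference)
print(register_iri(entity))
```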

Terms and definitions

resource: The term resource is used in a general sense for whatever might be identified by a URI. See: [AWWW].
resource description: A machine-readable representation of the resource, typically in some serialization of RDF.
information resource: A resource which has the property that all of its essential characteristics can be conveyed in a message. See also: [AWWW].

References

[COOL-URIS] W3C. 2008. Cool URIs for the Semantic Web.
[PLDN-URI] PLDN. Aanzet tot een nationale URI-Strategie voor Linked Data van de Nederlandse overheid. Work in progress.
[AWWW] W3C. 2004. Architecture of the World Wide Web, Volume One.
[HAWKE] Sandro Hawke. 2002. Disambiguating RDF Identifiers.
[LOGIUS] Ministerie van Binnenlandse Zaken en Koninkrijksrelaties. Linked Data structuur | Logius Stelselcatalogus.
[WORKING-ONTOLOGIST] Dean Allemang, Jim Hendler, and Fabien Gandon. 2020. Semantic Web for the Working Ontologist: Effective Modeling for Linked Data, RDFS, and OWL (3rd. ed.). Association for Computing Machinery, New York, NY, USA.
[LD-BP] W3C. 2014. Best Practices for Publishing Linked Data. Section 5.
[HALPIN] Harry Halpin. Semantic Insecurity: Security and the Semantic Web. PrivOn 2017 - Workshop Society, Privacy and the Semantic Web - Policy and Technology, Oct 2017, Vienna, Austria. pp. 1-10. hal-01673291.
[RFC2119] S. Bradner. 1997. Key words for use in RFCs to Indicate Requirement Levels.
[DODDS-DAVIS] Leigh Dodds, Ian Davis. 2022. Linked Data Patterns.
[BOOTH] David Booth. 2003. Four Uses of a URL: Name, Concept, Web Location and Document Instance.
[HTTPRANGE-14] Roy Fielding. 2005. [httpRange-14] Resolved.
[TBL-GENERIC] Tim Berners-Lee. 1996. Generic Resources.
[OWL2-PRIMER] W3C. 2012. OWL 2 Web Ontology Language Primer (Second Edition).
[OWL-SKOS] W3C. 2008. Using OWL and SKOS.
[LD] W3C. 2006. Linked Data.
[RFC3987] M. Duerst, M. Suignard. 2005. Internationalized Resource Identifiers (IRIs).
[FAIR] GO FAIR. FAIR Principles.
[USER-AGENT] W3C. 2011. Definition of User Agent.

1. For those who wonder whether it is really necessary to specify both the www resource kind and the documentation category: yes, it is. The www resource kind indicates that we are requesting a web page representation of the resource. The documentation category tells us humans that the resource is a document. Contrast this, for example, with a web page representation of an ontology term, which would have the www resource kind but the ontology category.