IRI Strategy

Author

Bart Kleijngeld (Alliander)

Editors

Ritger Teunissen (Alliander)
Robbert Hardin (Alliander)

Version

v0.draft

Feedback

Issue on GitHub (Netbeheer-Nederland/doc-iri-strategy)

Abstract

This document lays out a standardized way of assigning IRIs (Internationalized Resource Identifiers) to resources for the power and utilities industry in the Netherlands.

Assigning IRIs to resources (e.g. models, terms and real-world objects) enables their global identification on the web. This is a significant part of making data FAIR: by adhering to the Linked Data principles, data and models can be linked easily and reliably.

Background

A well-defined IRI strategy is crucial to ensuring that resources on the web are uniquely and consistently identified, enabling the reliable linking, discovery, and integration of data. As the industry generates and shares large volumes of data across various stakeholders, a standardized IRI strategy is essential to achieving data interoperability, scalability, and long-term sustainability.

A core challenge within the industry is making data FAIR: Findable, Accessible, Interoperable and Reusable. The use of Linked Data enables the seamless connection of diverse data sources, ensuring that data is discoverable, accessible, and machine-readable. The adoption of Linked Data within the power and utilities industry delivers the following key benefits:

  • interoperability: ensures that data can be exchanged across different systems, independent of technology, enhancing integration within the power and utilities industry.

  • discoverability: facilitates better data discovery by creating structured IRIs, making it easier for stakeholders to identify and access relevant information.

  • machine-readability: enables machines to interpret and process data automatically, supporting advanced querying, analysis, and decision-making, which improves operational efficiency.

This IRI strategy is essential to ensuring the consistent identification and linkage of resources across the industry, advancing the broader goal of making data FAIR. This will contribute to a more connected, efficient, and sustainable power and utilities data ecosystem.

Conformance

The key words MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL in this document are to be interpreted as described in [RFC2119].

Scope

This document is normative for — and only for — creating IRIs for resources defined or maintained by Netbeheer Nederland (NBNL).

Naming conventions for the variety of categories of resources are not part of the scope of this document, and neither are versioning strategies.

Resources

A resource can be anything we wish to state something about on the web: a certain web page, a company or the color "red". To make statements about a resource, one can refer to it by its IRI.

Resource kinds

Following Cool URIs for the Semantic Web (among others), we distinguish between three kinds of resources: documentation resources, data resources, and real-world entities or concepts.

Each of these will be explained in the following sections.

Information resources

Traditionally, all resources were web documents of some kind or another, and the URLs locating them can serve as their identifying IRIs. More formally, those kinds of resources are called information resources (colloquially referred to as web documents). In this document we further distinguish between two types of information resources:

  • documentation resources, colloquially referred to as web pages indicated by the www subdomain (e.g. https://www.nbnl.info/data-product/netburen)

  • data resources indicated by the data subdomain (e.g. https://data.nbnl.info/register/substation/001231)

Real-world entities or concepts

With the Semantic Web, however, anyone should be able to say anything about anything. This includes making statements about entities or concepts in the real world. Such resources are not information resources and are not on the web. To be able to state things about them, they need to have IRIs, and since they are not information resources, they are recognized as their own resource kind with a dedicated subdomain:

  • real-world entities or concepts (indicated by the id subdomain)

If you get confused, just remember:

Use id if and only if the resource refers to a real-world concept or entity.

Ontology terms represent real-world concepts and are therefore of the id kind. A validation class or SHACL shape, on the other hand, is a mere information resource and has no connection to the real world.

Identity and representations

The IRI that identifies a resource will sometimes be referred to as its identity IRI, to distinguish it explicitly from IRIs that identify representations of the resource.

We must be careful about what an IRI identifies: is it the thing itself, or a document describing the thing? The following example might help grasp this point.

Example: Cats are not descriptions of cats

Suppose I want to say things about cats.

First, the real-world concept of a cat needs an IRI so that it becomes a resource on the web I can refer to:

Cat

https://id.example.com/animals/Cat

The animals part of the IRI reflects the ontology in which Cat is defined. The data resource that models Cat can be found at:

Model of cats

https://data.example.com/animals/Cat

Finally, there could exist a web page representation with information about cats as well:

Web page about cats

https://www.example.com/animals/Cat

Other models and web pages can make statements about cats and use the definition of the ontology above simply by using the https://id.example.com/animals/Cat IRI to refer to it.

Takeaway point
Things and documents providing descriptions of those things are different (kinds of) resources, each with their own IRI.

Content negotiation

Content negotiation (or conneg) makes it possible to negotiate what representation to obtain when retrieving a resource.

This becomes a powerful mechanism when retrieving resources using their identity IRI. Based on the request headers provided by the user agent, the server responds with a 303 redirect to the appropriate representation, each of which is served at its own URL. Typical parameters for negotiating content are language, media type or format, and version.

Figure 1. Example content negotiation for a real-world entity resource

Note that information resources can have several representations too, even though they are already informational in essence. As in the previous example, the identity IRI is used for retrieval:

Figure 2. Example content negotiation for an information resource

It is especially common for data resources such as ontologies and data products to also have documentation for humans.

Of course, if one knows the IRI of some desired representation, this IRI can be used directly instead of using conneg.
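
The redirect decision described in this section can be sketched as a pure function. This is a minimal illustration, not a description of an actual NBNL resolver. It assumes that representation IRIs are obtained by swapping the subdomain of the identity IRI (as in the examples in this document) and that only the Accept header is negotiated. The media type mapping and the function names are hypothetical, and wildcards such as */* are not handled.

```python
from typing import List, Tuple
from urllib.parse import urlsplit, urlunsplit

# Assumed mapping from media type to the subdomain serving that representation.
MEDIA_TYPE_TO_KIND = {
    "text/html": "www",
    "text/turtle": "data",
    "application/ld+json": "data",
}


def parse_accept(header: str) -> List[str]:
    """Return the media types of an Accept header, most preferred first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        quality = 1.0  # default weight when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality > 0:
            # Sort by descending quality, then by order of appearance.
            weighted.append((-quality, position, media_type))
    return [media_type for _, _, media_type in sorted(weighted)]


def negotiate(identity_iri: str, accept: str) -> Tuple[int, str]:
    """Return (status, location): 303 plus a representation IRI, or 406."""
    parts = urlsplit(identity_iri)
    _, _, domain = parts.netloc.partition(".")  # drop the kind subdomain
    for media_type in parse_accept(accept):
        kind = MEDIA_TYPE_TO_KIND.get(media_type)
        if kind is not None:
            location = urlunsplit(
                (parts.scheme, f"{kind}.{domain}", parts.path, "", "")
            )
            return 303, location
    return 406, ""  # nothing acceptable could be offered
```

For example, negotiate("https://id.nbnl.info/register/substation/001231", "text/html") yields a 303 redirect to the www IRI of the same substation.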

IRI syntax

Resources and their representations are identified by IRIs, each of which MUST conform to the following syntax:

Base IRI syntax
https://{kind}.nbnl.info/{category}/{namespace}[/{version}][/{reference}]
{kind}

Resource kind. MUST be one of:
data | id | www

{category}

Resource category. SHOULD be one of:
data-product | documentation | ontology | register | schema | thesaurus

{namespace}

Path which encodes the namespace of the resource. This can be nested as deeply as necessary, and has no formal (nor machine-readable) meaning.

{version}

Version specifier (if applicable).

{reference}

Local name of some referent in the namespace (if applicable).
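
As an illustration, the base syntax can be assembled mechanically. The sketch below is a minimal, non-normative Python helper. The names build_iri, is_recommended_category, KINDS and RECOMMENDED_CATEGORIES are hypothetical and not part of this strategy. Because the category list is only a SHOULD, unknown categories are not rejected; a separate check is offered instead. Data products use their own syntax (without a namespace) and are not covered here.

```python
from typing import Optional

# Allowed resource kinds (MUST).
KINDS = {"data", "id", "www"}

# Recommended categories (SHOULD), so these are checked separately.
RECOMMENDED_CATEGORIES = {
    "data-product", "documentation", "ontology",
    "register", "schema", "thesaurus",
}


def build_iri(kind: str, category: str, namespace: str,
              version: Optional[str] = None,
              reference: Optional[str] = None) -> str:
    """Assemble an IRI following the base IRI syntax."""
    if kind not in KINDS:
        raise ValueError(f"kind MUST be one of {sorted(KINDS)}, got {kind!r}")
    # The namespace may be nested, e.g. "data-product/netburen".
    segments = [category, *namespace.strip("/").split("/")]
    if version is not None:
        segments.append(version)
    if reference is not None:
        segments.append(reference)
    if any(not segment for segment in segments):
        raise ValueError("path segments must not be empty")
    return f"https://{kind}.nbnl.info/" + "/".join(segments)


def is_recommended_category(category: str) -> bool:
    """Tell whether a category is one of the recommended ones."""
    return category in RECOMMENDED_CATEGORIES
```

For example, build_iri("id", "register", "substation", reference="001231") produces the substation entity IRI used elsewhere in this document.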

Categories

Data product

Special category which contains data products.

This is much like a dedicated register, but data products are information resources, whereas register entities are not. Therefore, to avoid confusion, and because data products are important, this special category has been introduced.
IRI syntax for a data product
https://data.nbnl.info/data-product/{reference}[/{version}]

Resource kind

(data) data resource

Category

data-product

Namespace

n/a

Version
(optional)

Data product version

Reference

Name of the data product

Example: Netburen
Table 1. Generic (version-less) data product IRIs

Identity and data representation

https://data.nbnl.info/data-product/netburen

Documentation representation

https://www.nbnl.info/data-product/netburen

Table 2. Versioned data product IRIs

Identity and data representation

https://data.nbnl.info/data-product/netburen/2.1.1

Documentation representation

https://www.nbnl.info/data-product/netburen/2.1.1
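
The data product syntax differs from the base syntax in that the reference precedes the optional version. A minimal sketch of how the IRIs in the tables above could be derived is given below. The function name data_product_iris and the dictionary keys are hypothetical.

```python
from typing import Dict, Optional


def data_product_iris(name: str, version: Optional[str] = None) -> Dict[str, str]:
    """Return the identity and documentation IRIs of a data product."""
    # The reference (name) comes first, followed by the optional version.
    path = f"data-product/{name}"
    if version is not None:
        path = f"{path}/{version}"
    return {
        "identity": f"https://data.nbnl.info/{path}",
        "documentation": f"https://www.nbnl.info/{path}",
    }
```

Calling data_product_iris("netburen") gives the generic IRIs, and data_product_iris("netburen", "2.1.1") gives the versioned ones.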

Documentation

Documentation intended for reading by humans, not machines. A documentation project can consist of a single-page document, but can also comprise a complex nested structure with many pages and potentially many layers of organisation.

Do not confuse the category documentation with documentation representations as obtained through using www IRIs. See also: [1]
IRI syntax for a project
https://www.nbnl.info/documentation/{namespace}[/{version}]
IRI syntax for a part
https://www.nbnl.info/documentation/{namespace}[/{version}]/{reference}
Resource kind (project and part)

(www) documentation resource

Category (project and part)

documentation [1]

Namespace (project and part)

Namespace identifying the project

Version (optional)

  • Project: project version

  • Part: n/a

Reference

  • Project: n/a

  • Part: name (local to the project) of the part (e.g. page)

Example: Modeling Guidelines
Table 3. Generic (version-less) documentation IRIs

Identity and documentation representation

https://www.nbnl.info/documentation/modeling-guidelines

Part identity and documentation part representation

https://www.nbnl.info/documentation/modeling-guidelines/cim-profiling

Table 4. Versioned documentation IRIs

Identity and documentation representation

https://www.nbnl.info/documentation/modeling-guidelines/1.0.0

Part identity and documentation part representation

https://www.nbnl.info/documentation/modeling-guidelines/1.0.0/cim-profiling

Models

IRI syntax for a model
https://data.nbnl.info/{category}/{namespace}[/{version}]
IRI syntax for a model element
https://id.nbnl.info/{category}/{namespace}[/{version}]/{reference}
Resource kind

  • Model: (data) data resource

  • Element: (id) real-world concept if category is ontology or thesaurus

  • Element: (data) data resource if category is schema

Category (model and element)

ontology | schema | thesaurus

Namespace (model and element)

Namespace identifying the model

Version (optional)

  • Model: model version

  • Element: n/a

Reference

  • Model: n/a

  • Element: name (local to the model) of the element

Generic (version-less) models are information resources too. They can be completely described by information such as their name, purpose and owner, and which versions of them exist (this is one of the ways DCAT recommends managing versions).
Never specify versions in the IRIs of model elements which represent a real-world concept, not even the model version.
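
The rules for models and model elements can be sketched in code. The following is a minimal, non-normative illustration with hypothetical function names (model_iri, element_iri). It assumes, following the element syntax above, that a schema element may carry the model version in its IRI, whereas elements of ontologies and thesauri never do.

```python
from typing import Optional

# Categories whose elements represent real-world concepts (id kind).
CONCEPT_CATEGORIES = {"ontology", "thesaurus"}
MODEL_CATEGORIES = CONCEPT_CATEGORIES | {"schema"}


def model_iri(category: str, namespace: str,
              version: Optional[str] = None) -> str:
    """Return the IRI of a model, which is always a data resource."""
    if category not in MODEL_CATEGORIES:
        raise ValueError(f"not a model category: {category!r}")
    iri = f"https://data.nbnl.info/{category}/{namespace}"
    return f"{iri}/{version}" if version is not None else iri


def element_iri(category: str, namespace: str, reference: str,
                version: Optional[str] = None) -> str:
    """Return the IRI of a model element, choosing the kind by category."""
    if category not in MODEL_CATEGORIES:
        raise ValueError(f"not a model category: {category!r}")
    if category in CONCEPT_CATEGORIES:
        # Real-world concepts never carry a version, not even the model's.
        if version is not None:
            raise ValueError("real-world concept IRIs must not contain a version")
        return f"https://id.nbnl.info/{category}/{namespace}/{reference}"
    # Schema elements are data resources (assumed to allow a version).
    return f"{model_iri(category, namespace, version)}/{reference}"
```

For example, element_iri("ontology", "cim-nl", "EAN18Code") yields an id IRI, while passing a version for that same element raises an error.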
Example: Data product Netburen schema
Table 5. Generic (version-less) data product schema IRIs

Identity and data representation

https://data.nbnl.info/schema/data-product/netburen

Schema documentation

https://www.nbnl.info/schema/data-product/netburen

Table 6. Versioned data product schema IRIs

Identity and data representation

https://data.nbnl.info/schema/data-product/netburen/2.1.1

Documentation representation

https://www.nbnl.info/schema/data-product/netburen/2.1.1


Suppose we are composing a schema for the data product Netburen.

LinkML schema
id: https://data.nbnl.info/schema/data-product/netburen  # (1)
version: 1.0.1  # (2)
name: netburen  # (3)
prefixes:  # (4)
  nbnl: https://id.nbnl.info/thesaurus/energiesysteembeheer/
  cimnl: https://id.nbnl.info/ontology/cim-nl/
  # Assumed CIM namespace, inferred from from_schema below, so that class_uri resolves.
  cim: https://cim.ucaiug.io/ns#
classes:
  MarketEvaluationPoint:
    class_uri: cim:MarketEvaluationPoint  # (5)
    description: The identification of an entity where energy products are measured
      or computed.
    from_schema: https://cim.ucaiug.io/ns#TC57CIM.IEC62325.MarketManagement
    exact_mappings:
    - nbnl:aansluiting  # (6)
    is_a: UsagePoint
1 The generic IRI of the schema, i.e. not a specific version. Note that we have encoded the fact that this schema is tightly coupled to a data product in the namespace of the IRI.
2 The version of the schema this document represents.
3 Technical name of the schema. This has local scope, but it is good practice to keep it equal to the model name in the IRI.
4 Prefixes for easily referring to terms from other models. Note the use of the id subdomain, because ontology and thesaurus terms represent real-world concepts. Also: never use versions in IRIs which represent real-world concepts.
5 Referring to a term from an ontology.
6 Referring to a term from a thesaurus.
Example: CIM NL ontology extension
Table 7. Generic (version-less) ontology IRIs
Identity

  • Ontology: https://data.nbnl.info/ontology/cim-nl

  • Term: https://id.nbnl.info/ontology/cim-nl/EAN18Code

Data representation

Documentation representation

  • Ontology: https://www.nbnl.info/ontology/cim-nl

  • Term: https://www.nbnl.info/ontology/cim-nl/EAN18Code

Registers

A register is a logical container of resources which represent entities from the real-world domain. Each register contains one type of entity.

IRI syntax for a register
https://data.nbnl.info/register/{namespace}
IRI syntax for a register entity
https://id.nbnl.info/register/{namespace}/{reference}
Resource kind

  • Register: (data) data resource

  • Entity: (id) real-world entity

Category (register and entity)

register

Namespace (register and entity)

Namespace identifying the register

Version (register and entity)

n/a

Reference

  • Register: n/a

  • Entity: reference name (local to the register) of the contained entity

Example: Substation register
Table 8. Register IRIs

Identity and data representation

https://data.nbnl.info/register/substation

Documentation representation

https://www.nbnl.info/register/substation

Table 9. Entity IRIs

Identity

https://id.nbnl.info/register/substation/001231

Data representation

https://data.nbnl.info/register/substation/001231

Documentation representation

https://www.nbnl.info/register/substation/001231
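
Because register IRIs never contain a version, an entity IRI can be split unambiguously into its register and its local reference. The sketch below illustrates this under that assumption. The function name split_entity_iri is hypothetical.

```python
from typing import Tuple
from urllib.parse import urlsplit


def split_entity_iri(entity_iri: str) -> Tuple[str, str]:
    """Split a register entity IRI into (register IRI, local reference)."""
    parts = urlsplit(entity_iri)
    segments = parts.path.strip("/").split("/")
    # Expect: id subdomain, "register", at least one namespace segment,
    # and a final reference segment.
    if (parts.netloc != "id.nbnl.info"
            or len(segments) < 3
            or segments[0] != "register"):
        raise ValueError(f"not a register entity IRI: {entity_iri!r}")
    *namespace, reference = segments[1:]
    register_iri = "https://data.nbnl.info/register/" + "/".join(namespace)
    return register_iri, reference
```

For the substation example, this returns the register IRI https://data.nbnl.info/register/substation together with the reference 001231.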

Terms and definitions

resource

The term resource is used in a general sense for whatever might be identified by a URI. See: [AWWW].

resource description

A machine-readable representation of the resource, typically in some serialization of RDF.

information resource

A resource which has the property that all of its essential characteristics can be conveyed in a message. See also: [AWWW].

References


1. For those who wonder whether it is really necessary to specify both the www resource kind and the documentation category: yes, it is. The www kind indicates that we are requesting a web page representation of the resource. The documentation category tells us humans that the resource is a document. Contrast this, for example, with a web page representation of an ontology term, which would have the www kind but the ontology category.