IRI Strategy
- Author
-
Bart Kleijngeld (Alliander)
- Editors
-
Ritger Teunissen (Alliander)
Robbert Hardin (Alliander)
- Version
-
v0.draft
- Feedback
Abstract
This document lays out a standardized way for assigning IRIs (Internationalized Resource Identifiers) to resources for the power and utilities industry in the Netherlands.
Assigning IRIs to resources (e.g. models, terms and real-world objects) enables their global identification on the web, which is a significant part of making data FAIR. Specifically, it allows adherence to the Linked Data principles, which make it possible to link data and models easily and reliably.
Background
A well-defined IRI strategy is crucial to ensuring that resources on the web are uniquely and consistently identified, enabling the reliable linking, discovery, and integration of data. As the industry generates and shares large volumes of data across various stakeholders, a standardized IRI strategy is essential to achieving data interoperability, scalability, and long-term sustainability.
A core challenge within the industry is making data FAIR: Findable, Accessible, Interoperable and Reusable. The use of Linked Data enables the seamless connection of diverse data sources, ensuring that data is discoverable, accessible, and machine-readable. The adoption of Linked Data within the power and utilities industry delivers the following key benefits:
-
interoperability: ensures that data can be exchanged across different systems, independent of technology, enhancing integration within the power and utilities industry.
-
discoverability: facilitates better data discovery by creating structured IRIs, making it easier for stakeholders to identify and access relevant information.
-
machine-readability: enables machines to interpret and process data automatically, supporting advanced querying, analysis, and decision-making, which improves operational efficiency.
This IRI strategy is essential to ensuring the consistent identification and linkage of resources across the industry, advancing the broader goal of making data FAIR. This will contribute to a more connected, efficient, and sustainable power and utilities data ecosystem.
Conformance
The key words MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL in this document are to be interpreted as described in [RFC2119].
Scope
This document is normative for — and only for — creating IRIs for resources defined or maintained by Netbeheer Nederland (NBNL).
Naming conventions for the variety of categories of resources are not part of the scope of this document, and neither are versioning strategies.
Resources
A resource can be anything we wish to state something about on the web: a certain web page, a company or the color "red". To make statements about a resource, one can refer to it by its IRI.
Resource kinds
Following Cool URIs for the Semantic Web [COOL-URIS] (among others), we distinguish between three kinds of resources: documentation resources, data resources and real-world entities or concepts.
Each of these will be explained in the following sections.
Information resources
Traditionally, all resources were web documents of some kind or another, and the URLs locating them can serve as their identifying IRIs. More formally, those kinds of resources are called information resources (colloquially referred to as web documents). In this document we further distinguish between two types of information resources:
-
documentation resources, colloquially referred to as web pages, indicated by the www subdomain (e.g. https://www.nbnl.info/data-product/netburen)
-
data resources, indicated by the data subdomain (e.g. https://data.nbnl.info/register/substation/001231)
Real-world entities or concepts
Since the dawn of the Semantic Web, however, the aim has been for anyone to be able to say anything about anything. This includes the ability to make statements about entities or concepts in the real world. Such resources are not information resources and are not on the web. To be able to state things about them, they need to have IRIs, but since they are not information resources, they are recognized as their own resource kind with a dedicated subdomain:
-
real-world entities or concepts (indicated by the id subdomain)
If you get confused, just remember: Use |
Ontology terms represent real-world concepts and are therefore of the |
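To make the three resource kinds concrete, the following is a minimal sketch that reads the kind off the subdomain of an IRI. The function name `resource_kind` and the lookup table are illustrative assumptions, not part of this strategy; only the three subdomains and the example IRIs come from the text above.

```python
from urllib.parse import urlsplit

# Subdomain -> resource kind, as described in the sections above.
KIND_BY_SUBDOMAIN = {
    "www": "documentation resource",
    "data": "data resource",
    "id": "real-world entity or concept",
}


def resource_kind(iri: str) -> str:
    """Return the resource kind encoded in the subdomain of an nbnl.info IRI."""
    host = urlsplit(iri).hostname or ""
    subdomain, _, domain = host.partition(".")
    if domain != "nbnl.info" or subdomain not in KIND_BY_SUBDOMAIN:
        raise ValueError(f"not a recognised nbnl.info IRI: {iri!r}")
    return KIND_BY_SUBDOMAIN[subdomain]


print(resource_kind("https://www.nbnl.info/data-product/netburen"))
# documentation resource
print(resource_kind("https://data.nbnl.info/register/substation/001231"))
# data resource
```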
Identity and representations
The IRI that identifies a resource will sometimes be referred to as its identity IRI, to distinguish it explicitly from IRIs that identify representations of the resource.
We must be careful not to conflate the different things an IRI could identify. Does it identify the thing itself, or a document describing the thing? The following example might help grasp this point.
Example: Cats are not descriptions of cats
Suppose I want to say things about cats.
First, the real-world concept of a cat needs an IRI so that it becomes a resource on the web I can refer to:
Cat |
|
---|
The animals part of the IRI reflects the ontology in which Cat is defined. This data resource can be found at:
Model of cats |
|
---|
Finally, there could exist a web page representation with information about cats as well:
Web page about cats |
|
---|
Other models and web pages can make statements about cats and use the definition of the ontology above simply by using the |
Takeaway point
Things and documents providing descriptions of those things are different (kinds of) resources, each with their own IRI.
|
Content negotiation
Content negotiation (or conneg) makes it possible to negotiate what representation to obtain when retrieving a resource.
This becomes a powerful mechanism when retrieving resources by their identity IRI. Based on the request headers provided by the user agent, the server negotiates the appropriate representation and answers with a 303 redirect to the URL at which that representation is served. Typical parameters for negotiating content are language, media type or format, and version.
Note that information resources too can have several representations, even though they are already informational in essence. No different from the previous example, here too, the identity IRI is used for retrieval:
It is especially common for data resources such as ontologies and data products to also have documentation for humans.
Of course, if one knows the IRI of some desired representation, this IRI can be used directly instead of using conneg.
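The redirect behaviour described in this section can be sketched as follows. This is a simplified illustration, not a server implementation: it assumes that a representation lives at the same path as the identity IRI with only the subdomain changed, it ignores q-values in the Accept header, and the set of media types treated as data is a hypothetical choice.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: media types that should lead to the data representation.
DATA_MEDIA_TYPES = {"text/turtle", "application/ld+json", "application/rdf+xml"}


def negotiate(identity_iri: str, accept: str) -> tuple[int, str]:
    """Return (status code, Location) for a request to an identity IRI."""
    parts = urlsplit(identity_iri)
    subdomain, _, domain = parts.netloc.partition(".")
    if subdomain != "id":
        raise ValueError(f"expected an identity IRI on the id subdomain: {identity_iri!r}")

    # Naive Accept parsing: take the listed media types in order, drop parameters.
    requested = [item.split(";")[0].strip() for item in accept.split(",")]

    target = "www"  # default to the documentation representation
    for media_type in requested:
        if media_type in DATA_MEDIA_TYPES:
            target = "data"
            break
        if media_type == "text/html":
            break

    # Assumption: the path is kept and only the subdomain is swapped.
    location = urlunsplit(parts._replace(netloc=f"{target}.{domain}"))
    return 303, location


print(negotiate("https://id.nbnl.info/register/substation/001231", "text/turtle"))
# (303, 'https://data.nbnl.info/register/substation/001231')
print(negotiate("https://id.nbnl.info/register/substation/001231", "text/html"))
# (303, 'https://www.nbnl.info/register/substation/001231')
```

A production setup would normally leave this to the web server or framework, which also handles q-values, language and version negotiation.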
IRI syntax
Resources and their representations are identified by IRIs, each of which MUST be of the following syntax:
https://{kind}.nbnl.info/{category}/{namespace}[/{version}][/{reference}]
{kind} | Resource kind. MUST be one of: |
{category} | Resource category. SHOULD be one of: |
{namespace} | Path which encodes the namespace of the resource. This can be nested as deeply as necessary, and has no formal (or machine-readable) meaning. |
{version} | Version specifier (if applicable). |
{reference} | Local name of some referent in the namespace (if applicable). |
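As an illustration of the syntax above, the following sketch composes an IRI from its components. The helper `build_iri` is hypothetical. The allowed kinds (www, data, id) are taken from the resource kinds described earlier; categories are not validated here, and no percent-encoding is applied, so the components are assumed to be IRI-safe already.

```python
from typing import Optional

# Resource kinds, taken from the subdomains described under "Resource kinds".
KINDS = {"www", "data", "id"}


def build_iri(
    kind: str,
    category: str,
    namespace: str,
    version: Optional[str] = None,
    reference: Optional[str] = None,
) -> str:
    """Compose https://{kind}.nbnl.info/{category}/{namespace}[/{version}][/{reference}]."""
    if kind not in KINDS:
        raise ValueError(f"unknown resource kind: {kind!r}")
    segments = [category, namespace]  # namespace may itself contain slashes
    if version is not None:
        segments.append(version)
    if reference is not None:
        segments.append(reference)
    return f"https://{kind}.nbnl.info/" + "/".join(segments)


print(build_iri("data", "schema", "data-product/netburen"))
# https://data.nbnl.info/schema/data-product/netburen
print(build_iri("data", "register", "substation", reference="001231"))
# https://data.nbnl.info/register/substation/001231
```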
Categories
Data product
Special category which contains data products.
This is much like a dedicated register, but data products are information resources, whereas register entities are not. Therefore, to avoid confusion, and because data products are important, this special category has been introduced.
https://data.nbnl.info/data-product/{reference}[/{version}]
Resource kind | |
---|---|
Category | |
Namespace | n/a |
Version | Data product version |
Reference | Name of the data product |
Example: Netburen
Identity |
|
---|---|
Data representation |
|
Documentation representation |
|
Identity |
|
---|---|
Data representation |
|
Documentation representation |
|
Documentation
Documentation intended for reading by humans, not machines. A documentation project can consist of a single-page document, but it can also comprise a complex nested structure containing many pages and potentially many layers of organisation.
Do not confuse the category documentation with documentation representations as obtained through using www IRIs. See also: [1]
https://www.nbnl.info/documentation/{namespace}[/{version}]
https://www.nbnl.info/documentation/{namespace}[/{version}]/{reference}
 | Project | Part
---|---|---
Resource kind | |
Category | |
Namespace | Namespace identifying the project |
Version | Project version | n/a
Reference | n/a | Name (local to the project) of the part (e.g. page)
Example: Modeling Guidelines
Identity |
|
---|---|
Documentation representation |
|
Part identity |
|
Documentation part representation |
Identity |
|
---|---|
Documentation representation |
|
Part identity |
|
Documentation part representation |
Models
https://data.nbnl.info/{category}/{namespace}[/{version}]
https://id.nbnl.info/{category}/{namespace}[/{version}]/{reference}
 | Model | Element
---|---|---
Resource kind | |
Category | |
Namespace | Namespace identifying the model |
Version | Model version | n/a
Reference | n/a | Name (local to the model) of the element
Generic (version-less) models are information resources too. Such a model can be completely described by information such as its name, purpose and owner, and which versions of it exist (this is one way DCAT recommends managing versions).
Never specify versions in the IRIs of model elements which represent a real-world concept, not even the model version.
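The difference between model IRIs and element IRIs can be sketched with two small helpers. Both function names are hypothetical. The element helper deliberately has no version parameter, which is one possible way of making the rule about version-less element IRIs hard to violate by accident.

```python
from typing import Optional

BASE = "nbnl.info"


def model_iri(category: str, namespace: str, version: Optional[str] = None) -> str:
    """IRI of a model (an information resource), optionally for a specific version."""
    iri = f"https://data.{BASE}/{category}/{namespace}"
    return f"{iri}/{version}" if version else iri


def element_iri(category: str, namespace: str, reference: str) -> str:
    """IRI of a model element that represents a real-world concept.

    There is no version parameter on purpose: such IRIs never carry a version.
    """
    return f"https://id.{BASE}/{category}/{namespace}/{reference}"


print(model_iri("schema", "data-product/netburen"))
# https://data.nbnl.info/schema/data-product/netburen
print(model_iri("schema", "data-product/netburen", "1.0.1"))
# https://data.nbnl.info/schema/data-product/netburen/1.0.1
print(element_iri("taxonomy", "energiesysteembeheer", "aansluiting"))
# https://id.nbnl.info/taxonomy/energiesysteembeheer/aansluiting
```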
Example: Data product Netburen schema
Identity |
|
---|---|
Data representation |
|
Schema documentation |
|
Identity |
|
---|---|
Data representation |
|
Documentation representation |
|
Suppose we are composing a schema for the data product Netburen.
id: https://data.nbnl.info/schema/data-product/netburen  # (1)
version: 1.0.1  # (2)
name: netburen  # (3)
prefixes:  # (4)
  nbnl: https://id.nbnl.info/taxonomy/energiesysteembeheer/
  cimnl: https://id.nbnl.info/ont/cim-nl/
classes:
  MarketEvaluationPoint:
    class_uri: cim:MarketEvaluationPoint  # (5)
    description: The identification of an entity where energy products are measured
      or computed.
    from_schema: https://cim.ucaiug.io/ns#TC57CIM.IEC62325.MarketManagement
    exact_mappings:
      - nbnl:aansluiting  # (6)
    is_a: UsagePoint
1 | The generic IRI of the schema, i.e. not a specific version. Note that we have encoded the fact that this schema is tightly coupled to a data product in the namespace of the IRI. |
2 | The version of the schema this document represents. |
3 | Technical name of the schema. This has local scope, but it is good practice to keep it equal to the model name in the IRI. |
4 | Prefixes for easily referring to terms from other models. Note the use of the id subdomain, because ontology and thesaurus terms represent real-world concepts. Also: never use versions in IRIs which represent real-world concepts. |
5 | Referring to a term from an ontology. |
6 | Referring to a term from a thesaurus. |
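To show how the prefixes in the schema above turn a compact reference into a full IRI, here is a minimal sketch of prefix expansion. The function `expand_curie` is an illustrative helper, not part of any schema tooling; the prefix map is copied from the example.

```python
# Prefix map copied from the schema example above.
PREFIXES = {
    "nbnl": "https://id.nbnl.info/taxonomy/energiesysteembeheer/",
    "cimnl": "https://id.nbnl.info/ont/cim-nl/",
}


def expand_curie(curie: str, prefixes: dict[str, str]) -> str:
    """Expand a prefixed name such as 'nbnl:aansluiting' to a full IRI."""
    prefix, sep, local_name = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"unknown or missing prefix in {curie!r}")
    return prefixes[prefix] + local_name


print(expand_curie("nbnl:aansluiting", PREFIXES))
# https://id.nbnl.info/taxonomy/energiesysteembeheer/aansluiting
```

Note that the `cim` prefix used at callout 5 is not among the prefixes listed in the example, so expanding `cim:MarketEvaluationPoint` against this map would raise a `KeyError`.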
Example: CIM NL ontology extension
Ontology | Term | |
---|---|---|
Identity |
|
|
Data representation |
||
Documentation representation |
|
|
Registers
A register is a logical container of resources which represent entities from the real-world domain. Each register contains one type of entity.
https://data.nbnl.info/register/{namespace}
https://id.nbnl.info/register/{namespace}/{reference}
 | Register | Entity
---|---|---
Resource kind | |
Category | |
Namespace | Namespace identifying the register |
Version | n/a |
Reference | n/a | Reference name (local to the register) of the contained entity
Example: Substation register
Identity |
|
---|---|
Data representation |
|
Documentation part representation |
|
Identity |
|
---|---|
Data representation |
|
Documentation representation |
|
Terms and definitions
- resource
-
The term resource is used in a general sense for whatever might be identified by a URI. See: [AWWW].
- resource description
-
A machine-readable representation of the resource, typically in some serialization of RDF.
- information resource
-
A resource which has the property that all of its essential characteristics can be conveyed in a message. See also: [AWWW].
References
-
[COOL-URIS] W3C. 2008. Cool URIs for the Semantic Web.
-
[PLDN-URI] PLDN. Aanzet tot een nationale URI-Strategie voor Linked Data van de Nederlandse overheid. Work in progress.
-
[AWWW] W3C. 2004. Architecture of the World Wide Web, Volume One.
-
[HAWKE] Sandro Hawke. 2002. Disambiguating RDF Identifiers.
-
[LOGIUS] Ministerie van Binnenlandse Zaken en Koninkrijksrelaties. Linked Data structuur | Logius Stelselcatalogus.
-
[WORKING-ONTOLOGIST] Dean Allemang, Jim Hendler, and Fabien Gandon. 2020. Semantic Web for the Working Ontologist: Effective Modeling for Linked Data, RDFS, and OWL (3rd. ed.). Association for Computing Machinery, New York, NY, USA.
-
[LD-BP] W3C. 2014. Best Practices for Publishing Linked Data. Section 5.
-
[HALPIN] Harry Halpin. 2017. Semantic Insecurity: Security and the Semantic Web. PrivOn 2017 - Workshop Society, Privacy and the Semantic Web - Policy and Technology, Oct 2017, Vienna, Austria. pp. 1-10. hal-01673291.
-
[RFC2119] S. Bradner. 1997. Key words for use in RFCs to Indicate Requirement Levels.
-
[DODDS-DAVIS] Leigh Dodds, Ian Davis. 2022. Linked Data Patterns.
-
[BOOTH] David Booth. 2003. Four Uses of a URL: Name, Concept, Web Location and Document Instance.
-
[HTTPRANGE-14] Roy Fielding. 2005. [httpRange-14] Resolved.
-
[TBL-GENERIC] Tim Berners-Lee. 1996. Generic Resources.
-
[OWL2-PRIMER] W3C. 2012. OWL 2 Web Ontology Language Primer (Second Edition).
-
[OWL-SKOS] W3C. 2008. Using OWL and SKOS.
-
[LD] W3C. 2006. Linked Data.
-
[RFC3987] M. Duerst, M. Suignard. 2005. Internationalized Resource Identifiers (IRIs).
-
[FAIR] GO FAIR. FAIR Principles.
-
[USER-AGENT] W3C. 2011. Definition of User Agent.
[1] www IRI type and the documentation category: yes it is. The www IRI type indicates we are requesting a web page representation of the resource. The documentation category tells us humans that the resource is a document. Contrast this, for example, with a web page representation of an ontology term, which would have a www IRI type but an ontology container type.