Profiling and Extending the CIM with LinkML
Use cases may require extensions to the CIM, or simply the use of other vocabularies and schemas besides those from or based on the CIM. This can be achieved in a variety of ways, each with its own pros and cons. Most notably, modeling choices might impact compliance with the CIM standard.
In this document we will touch on:
- What it means to be compliant with the CIM standard, and what is required in order to achieve it.
- How to extend the CIM if what it provides does not suffice for a certain use case.
- How to choose between the various ways in which the CIM can be extended, discussing in particular the trade-off between semantic expressiveness and CIM compliance.
This guide closely follows the [CIM-PRIMER] where applicable, but sets out to enhance it in two ways by:
CIM compliance
We need to conform to the standard.
Plenty of people understand the importance of standardization, but what exactly is required in order to be compliant is not always clear. This could be because the standard itself lacks formality, or because people misunderstand or do not know how to conform.
The CIM has defined custom data types ("CIM data types") which are modeled as classes in the official UML model. However, many vendors choose not to implement these data types as classes or some other compound structure, but represent them as simple primitives.
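To make the two representations concrete, the sketch below renders one CIM data type both ways in LinkML. It assumes the CIM data type ActivePower with its value, unit and multiplier attributes; the ranges UnitSymbol and UnitMultiplier stand for the corresponding CIM enumerations and are assumed to be defined elsewhere in the schema. The two YAML documents are alternatives for comparison, not parts of one schema.

```yaml
# Option A: the CIM data type kept as a compound structure (a class).
classes:
  ActivePower:
    attributes:
      value:
        range: float
      unit:
        range: UnitSymbol        # assumed to be defined elsewhere as an enum
      multiplier:
        range: UnitMultiplier    # assumed to be defined elsewhere as an enum
---
# Option B: the same data type flattened to a simple primitive.
# Unit and multiplier are no longer carried by the data itself.
types:
  ActivePower:
    typeof: float
```

Option B is more compact, but whatever unit and multiplier information Option A carried explicitly now has to be agreed upon outside the data.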
Is this deviating from the standard? It seems to be, but then again, some technical languages might not even be able to represent compound structures at all, in which case TODO
In the case of the
Dealing with non-compliance
Having the flexibility of extending profiles beyond the use of just the canonical CIM is great, but in many cases it causes non-conformity and as a consequence also non-interoperability with CIM-compliant software.
Merely creating a subclass from some CIM class will introduce a new name that is unknown to all software that knows only CIM.
What’s the point of a standard if the smallest deviations — in many cases necessary ones to make it useful — cause non-conformity?
Profile variants
One way of dealing with the issue is to keep two variants of the profile: one that conforms strictly to the CIM, and one that contains the extensions that (might) break conformity.
This is not always possible, since in some cases the extensions add necessary expressiveness for modeling the use case.
Where it is possible, the conformant variant loses information compared to the extended one. Examples of this might be:
- ignored optional attributes or relations
- the use of some CIM ancestor class instead of the more semantically precise extension class
Manual maintenance
The most straightforward way of maintaining these variants of the profile is to do this manually. This becomes cumbersome very quickly though, and is very error-prone and time-consuming.
Thanks to the formal, machine-readable nature of our models, we can rely (at least in part) on automating this process.
Generating conformant profile variants
As long as we make sure we model extensions in a machine-readable way, we can benefit from automation.
The idea is as follows:
- The extended profile is the most accurate and semantically rich model, so this could be maintained as the main profile.
- In the main profile, relations and (machine-readable) annotations are added to the extensions to convey as much information as possible about how the extensions relate to the canonical model elements they interact with.
- The CIM-compliant profile variant can then be generated from the main profile using tooling.
Suppose an extension class SuperConductingACLineSegment is introduced, which is a subclass of ACLineSegment. In LinkML, this would look as follows:
classes:
  SuperConductingACLineSegment:
    is_a: ACLineSegment
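A slightly fuller sketch of the same extension, placed in a schema of its own, might look like the fragment below. The schema id and name, the ext prefix and its URI, the imported schema name cim and the annotation tag conformant_substitute are all hypothetical, chosen only to illustrate how an extension can carry machine-readable hints for tooling. The cim namespace URI is likewise an assumption and depends on the CIM version in use.

```yaml
id: https://example.org/profiles/extended-lines   # hypothetical schema id
name: extended_lines                              # hypothetical schema name
prefixes:
  linkml: https://w3id.org/linkml/
  cim: http://iec.ch/TC57/CIM100#   # assumed; depends on the CIM version
  ext: https://example.org/ext#     # hypothetical extension namespace
default_prefix: ext
imports:
  - linkml:types
  - cim                             # hypothetical schema that defines ACLineSegment

classes:
  SuperConductingACLineSegment:
    is_a: ACLineSegment
    class_uri: ext:SuperConductingACLineSegment
    annotations:
      # Hypothetical tag: names the canonical class a generator may fall back to.
      conformant_substitute: ACLineSegment
```

For a plain subclass the is_a relation already carries the fallback information, so the annotation is redundant here; it is shown for extensions whose relation to the canonical model is not expressed by the modeling language itself.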
Software that expects CIM-compliant data will not be able to handle data containing line segments of the type SuperConductingACLineSegment. However, every mention of SuperConductingACLineSegment
The specialization is encoded into the model relying on the semantics of the modeling language, ensuring
Note that LinkML also supports specialization of attributes and slots this way. For example, the relation isFatherOf could be expressed as a subslot of isRelativeOf, completely analogously to the subclass case.
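As a minimal sketch of the subslot case, the fragment below declares isFatherOf as a specialization of isRelativeOf using is_a on slots. The Person class and the descriptions are hypothetical and serve only to give the slots a domain and range.

```yaml
slots:
  isRelativeOf:
    description: Relates a person to any of their relatives.
    range: Person
    multivalued: true
  isFatherOf:
    is_a: isRelativeOf      # subslot: every isFatherOf link is also an isRelativeOf link
    description: Relates a father to his children.
    range: Person
    multivalued: true

classes:
  Person:                   # hypothetical class, for illustration only
    slots:
      - isRelativeOf
      - isFatherOf
```

Tooling that only knows isRelativeOf could, in principle, treat isFatherOf statements as isRelativeOf statements, just as an extension class can be treated as its CIM ancestor.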
Automated generation may only get you so far, and some manual edits may still be needed. If this is the case, make sure to have proper documentation and checks in place for how to deal with this in a deterministic way. Leverage machine-readability where possible.
Example
In the Netherlands, DSOs distinguish between primary and secondary usage points, and between physical and virtual ones. How can we model this in the CIM?
Canonical CIM
In the CIM, equipment (cim:Equipment) can be related to zero or more usage points (cim:UsagePoint) through the cim:Equipment.UsagePoints relation. The cim:UsagePoint.isVirtual attribute can be used to designate whether a usage point is virtual or physical. However, the CIM does not support distinguishing between primary and secondary usage points, so the best you can do is model it like this:
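In LinkML terms, the canonical elements named above could be sketched as follows. This is an illustrative fragment, not an official rendering of the CIM: the cim namespace URI is an assumption that depends on the CIM version, and all other attributes of both classes are left out.

```yaml
prefixes:
  cim: http://iec.ch/TC57/CIM100#   # assumed; depends on the CIM version

classes:
  Equipment:
    class_uri: cim:Equipment
    attributes:
      UsagePoints:
        slot_uri: cim:Equipment.UsagePoints
        range: UsagePoint
        multivalued: true           # zero or more usage points
  UsagePoint:
    class_uri: cim:UsagePoint
    attributes:
      isVirtual:
        slot_uri: cim:UsagePoint.isVirtual
        range: boolean              # distinguishes virtual from physical usage points
```

Nothing in this fragment says whether a usage point is primary or secondary, which is the gap the extensions discussed next try to fill.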
We need to extend the CIM to be able to model our use case properly. Specifically, the challenge is to decide where to encode the information that distinguishes a primary usage point from a secondary one.
For illustrative purposes we will consider several methods, some of which are clearly not a great fit for this particular situation. They will still be covered, though, since they might be useful in other scenarios.
Class specialization
One way to extend the CIM is to consider the primary and secondary usage points to be specializations of a usage point, i.e. we could define two new subclasses:
- cim:PrimaryUsagePoint is a subclass of cim:UsagePoint;
- cim:SecondaryUsagePoint is a subclass of cim:UsagePoint.
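Following the same is_a pattern used for subclasses earlier in this guide, the two subclasses could be sketched in LinkML as below. The descriptions are illustrative, and the choice of namespace for the new classes (and hence any class_uri) is deliberately left open here.

```yaml
classes:
  PrimaryUsagePoint:
    is_a: UsagePoint
    description: A usage point that is designated as primary.
  SecondaryUsagePoint:
    is_a: UsagePoint
    description: A usage point that is designated as secondary.
```

Both classes inherit every attribute and relation of UsagePoint, so data using them can be reduced to plain UsagePoint instances at the cost of losing the primary or secondary distinction.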
Considerations
- This method can only be used if all possible specializations are known beforehand, i.e. the set of specializations is not dynamic and does not depend on any other detail.
- If additional attributes and relations apply to the more specialized forms of usage points, subclasses are the way to go.
- Encoding the specialization kind into the name of the class implies it is an inherent characteristic of the concept the class represents. This means for example that being a "primary" usage point tells you something about that usage point, regardless of the context in which it is used. Caution is advised, since subtle semantic impurity can come back to bite you later.
Glossary
- contextual model: See: (application) profile.
- (application) profile: Subset of an information model with application-specific constraints. Also known as: contextual model.
- CIM profile: A profile which is derived from the (full) canonical CIM.
- information model: A technology-agnostic representation of concepts, relationships, constraints, rules and operations to specify data semantics for a chosen domain of discourse.
- CIM extension: Whenever the