Disclaimer

This proposal is just that: a proposal. Contributions are very welcome. If any ideas or assertions presented in this draft are incorrect, do not hesitate to share your own view by commenting on the page or contacting the author. The principal intention of sharing these ideas is to find synergies and reach agreement on the best next steps for the data modeling group. To that end, I have tried to point out where I believe there is room for improvement, keeping the criticism as constructive as possible.

This page describes the proposal to restructure the data modeling activities within COVESA. First, the current state is presented to provide the necessary context. Then, the proposal is introduced.

Current setup (summary)

As of today (05.2023), COVESA's data modelling activities focus mainly on VSS.
VSS is the conceptual model that describes vehicle properties at a high level of abstraction (i.e., friendly for non-experts).
VSS-tools complement VSS by providing the scripts that parse the specification into a few standard formats.
Pros
- Easy to understand and contribute to.
- Simple YAML files with custom constructs.
- The VSS model is constantly maintained.
- Naming convention.
Cons
- The specification of the data model is not represented in a standard data exchange format (only after parsing it using the tools).
- One tree covers only one hierarchy.
- No clear definition of the scope. That is, where does VSS end and something else start?
- No clear distinction between conceptual and application areas.
- Most of the attention is given to the leaves of the tree, while no rule-set is in place to control how the information is classified in different branches.
- A few misleading terms are used. For example:
  - Using "Signal" to refer to dynamic data properties of the vehicle. A signal is the information carrier, which can have one or multiple properties, and it exists at a lower layer of abstraction than what VSS defines.
  - Calling VSS a "Taxonomy" when the implicit relationship between the concepts in the hierarchy is not one of subclassification.

Current setup (description)

So far, the data modeling activities in COVESA have been primarily centered on the continuous development and maintenance of the Vehicle Signal Specification (VSS) and the tools that parse VSS into different formats. In the current setup, there is no clear description of which requirements (functional and non-functional) drive the design of the data model. It seems that the primary purpose of VSS is to serve as a naming convention for the properties of the vehicle. Nevertheless, little attention is given to the separation of concerns:

On the one hand, there is the "conceptual area," where the controlled vocabulary has to be described and adequately documented. Data models belong to this area because they define the entities of interest in a particular domain and the possible relationships between them.
On the other hand, we have the "application area," where a data model is used in specific implementations (e.g., databases, applications, etc.).

Current data modelling setup

The figure above shows how VSS modeling belongs to the conceptual area. To use the specification described in VSS (i.e., a "vspec" file), one has to parse it into a specific format (e.g., JSON) by using the VSS tools. The tools are the mechanism that makes the VSS data model usable in the application area. From a practical point of view, the application area needs a specific schema that determines the structure in which the data is to be stored. Here, storage means either long-term storage (e.g., a database) or short-term storage (e.g., RAM and variable allocation during application execution).

In the current setup, the whole data model is taken one-to-one and parsed as the schema for the application area. It is then up to each implementation to use custom mechanisms to ignore the overhead when only some of the concepts defined in the data model are required or used. Although this has not been a significant limitation so far, it becomes relevant when multiple domains are involved. Therefore, with the increasing interest in adding domains other than vehicle-specific data, it is crucial to define a data modeling strategy that can scale beyond tree hierarchies and vehicle-specific data.

Proposal overview (summary)

Unique and standard definitions in the conceptual area, and arbitrary selection in the application area
Suggested tasks to address:
- Generalize the VSS approach to model one tree-like hierarchy, so that another modelling group can reuse it when convenient (inside or outside COVESA).
- Keep using the YAML format for the further specification of the domain hierarchy, but extend the tools to first translate the YAML into the RDF standard data exchange format (i.e., publishing the specification in RDF + SKOS). Some advantages include:
  - Unique identification of resources
  - Interoperability
  - Machine readable format
  - RDF libraries available in multiple programming languages
  - RDF data can be queried with SPARQL
  - Easier integration of multiple hierarchies: OWL ontologies are built on RDF, so the tree hierarchy can feed into them.
  - Taxonomy editors and other tools can import RDF data with SKOS concepts in it directly.
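
As a rough illustration of the translation step suggested above, the following sketch turns a small tree into RDF + SKOS, serialized as Turtle text. It is not the behavior of the existing VSS tools. The tree contents, the `https://example.org/vss#` namespace, and the function name `to_skos_turtle` are assumptions made for this example, and so is the choice of `skos:broader` as the generic parent link. Only the Python standard library is used; a real implementation would more likely read the YAML files and rely on an RDF library.

```python
# Hypothetical miniature tree; branch and leaf names are illustrative only.
TREE = {
    "Vehicle": {
        "Speed": {},
        "Cabin": {
            "Door": {},
        },
    },
}

# Assumed base namespace for the example, not an official COVESA IRI.
BASE = "https://example.org/vss#"
SKOS = "http://www.w3.org/2004/02/skos/core#"


def to_skos_turtle(tree, base=BASE):
    """Emit one skos:Concept per node, linked to its parent via skos:broader."""
    lines = [f"@prefix skos: <{SKOS}> .", ""]

    def walk(node, parent_path):
        for name, children in node.items():
            path = parent_path + [name]
            # The dotted path doubles as the local name, so every IRI is unique.
            subject = "<" + base + ".".join(path) + ">"
            statements = ["a skos:Concept", f'skos:prefLabel "{name}"@en']
            if parent_path:
                parent = "<" + base + ".".join(parent_path) + ">"
                statements.append(f"skos:broader {parent}")
            lines.append(subject + " " + " ;\n    ".join(statements) + " .")
            walk(children, path)

    walk(tree, [])
    return "\n".join(lines)


print(to_skos_turtle(TREE))
```

Each node ends up with its own IRI, which is what makes the "unique identification of resources" advantage concrete: any other model can point at `https://example.org/vss#Vehicle.Cabin.Door` without copying the definition.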

Proposal overview (description)

The idea is to define the data modelling workflow for COVESA in terms of the conceptual area (i.e., development and maintenance of the controlled vocabulary) and the application area (i.e., tools that construct a usable schema out of the data models for a particular use case).

In the conceptual area, there should be a clear step-by-step guide on how to work with two different levels of expressiveness. This is needed because a tree hierarchical model (what VSS has been so far) is not the best model type for some upcoming needs, such as data integration. Hence, the model type can be selected according to the user's needs:
- (less expressive) Tree hierarchical model, good for:
  - Information classification according to a given criterion (e.g., taxonomy, meronymy, custom tree)
  - Naming convention following a dotted notation (i.e., concatenation of the branches)
- (more expressive) Ontology, good for:
  - Data integration
  - Concept reusability
  - Reasoning
  - Multiple hierarchies
  - Knowledge representation
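
The dotted naming convention mentioned for the tree hierarchical model can be sketched as follows. This is a minimal illustration, assuming the tree is available as nested dictionaries; the example tree and the function name `dotted_names` are hypothetical.

```python
def dotted_names(tree, prefix=""):
    """Yield the dotted name of every node by concatenating its branch names."""
    for name, children in tree.items():
        full = f"{prefix}.{name}" if prefix else name
        yield full
        yield from dotted_names(children, full)


# Hypothetical miniature tree; names are illustrative only.
TREE = {"Vehicle": {"Speed": {}, "Cabin": {"Door": {}}}}

print(list(dotted_names(TREE)))
# → ['Vehicle', 'Vehicle.Speed', 'Vehicle.Cabin', 'Vehicle.Cabin.Door']
```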
In the application area, the tools have to provide a mechanism to arbitrarily select concepts of interest from one or multiple domains, including the context and pointers to the uniquely identified definition of the concepts.
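
One possible shape for such a selection mechanism is sketched below. It assumes that each domain publishes a catalog mapping concept names to the IRI of their definition. The catalogs, the "charging" domain, the IRIs, and the function name `build_schema` are all hypothetical and only serve to show the idea of picking a subset while keeping pointers to the unique definitions.

```python
# Hypothetical catalogs: dotted concept name -> IRI of its unique definition.
CATALOGS = {
    "vehicle": {
        "Vehicle.Speed": "https://example.org/vehicle#Vehicle.Speed",
        "Vehicle.Cabin.Door": "https://example.org/vehicle#Vehicle.Cabin.Door",
    },
    "charging": {
        "Station.Power": "https://example.org/charging#Station.Power",
    },
}


def build_schema(selection, catalogs=CATALOGS):
    """Return only the selected concepts, each with its domain and definition IRI."""
    schema = []
    for domain, names in selection.items():
        for name in names:
            # A KeyError here signals a concept that no catalog defines.
            iri = catalogs[domain][name]
            schema.append({"domain": domain, "name": name, "definition": iri})
    return schema


schema = build_schema({"vehicle": ["Vehicle.Speed"], "charging": ["Station.Power"]})
for entry in schema:
    print(entry)
```

The resulting schema contains two entries instead of the full content of both catalogs, and every entry still points back to the place where the concept is defined.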

In other words, the proposal is not to enforce the use of ontologies. Rather, it aims to provide a common agreement on how and when to use which data model. The idea is better explained with the following diagram:

- will be responsible for the standard definition of the controlled vocabulary. Here

The specific aspects of the proposal are illustrated in the following figure, and explained below:

1a - Generic domain tree modelling approach

The simplicity of VSS has proven to be a successful approach for the continuous contribution of Subject Matter Experts (SMEs): they only have to modify a text file, discuss the changes, and create a pull request. The approach itself should be generalized to serve as a guideline for describing and maintaining one hierarchy. The idea here is to abstract the modelling approach used in VSS and describe it in generic terms that can be reused for other domains.

1b - Tree model suitability rule-set

Having an established approach to model a tree hierarchy does not mean that COVESA should encourage the arbitrary creation of multiple trees. Therefore, it is essential to define a simple set of rules that must be satisfied before starting a new data tree that is to be developed and maintained by COVESA. Such a rule set can include, for example:

- There must exist the proven need for at least 3 branches.
- Each branch must have at least 2 leaves.
- The implicit relationship between consecutive branches must be documented.
- There must be no cross references to other existing data models.
- Branch names must use this XYZ format.
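
The two numeric rules in this example rule set lend themselves to an automated check. The sketch below is one possible reading, under stated assumptions: a "branch" is any node with children, a "leaf" is any node without children, and leaves are counted anywhere below a branch rather than only among its direct children. The function names and the candidate tree are hypothetical, and the remaining rules (documentation, cross references, name format) are not covered.

```python
def count_leaves(children):
    """Count leaf nodes (nodes without children) anywhere below a node."""
    total = 0
    for sub in children.values():
        total += 1 if not sub else count_leaves(sub)
    return total


def iter_branches(tree, prefix=""):
    """Yield (dotted name, children) for every node that has children."""
    for name, children in tree.items():
        full = f"{prefix}.{name}" if prefix else name
        if children:
            yield full, children
            yield from iter_branches(children, full)


def check_tree(tree, min_branches=3, min_leaves=2):
    """Return a list of rule violations; an empty list means both rules hold."""
    problems = []
    branches = list(iter_branches(tree))
    if len(branches) < min_branches:
        problems.append(f"only {len(branches)} branches, need {min_branches}")
    for name, children in branches:
        leaves = count_leaves(children)
        if leaves < min_leaves:
            problems.append(f"{name} has {leaves} leaves, need {min_leaves}")
    return problems


# Hypothetical candidate tree that violates both rules.
CANDIDATE = {"Vehicle": {"Speed": {}, "Cabin": {"Door": {}}}}
print(check_tree(CANDIDATE))
# → ['only 2 branches, need 3', 'Vehicle.Cabin has 1 leaves, need 2']
```

The thresholds are parameters, so the group can tighten or relax them once the actual rule set is agreed.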


Child pages

Current setup (summary)

Current setup (description)

Proposal overview (summary)

Proposal overview (description)

1a - Generic domain tree modelling approach

1b - Tree model suitability rule-set

2 - Tree-to-ontology sync mechanism

3a - Domain ontology modeling approach

3b - Covesa core ontology

3c - Other data model(s)

4 - Custom data schema construction mechanism


Defining the COVESA data modeling strategy and its associated artifacts
