InteroperAble Descriptions of Observable Property Terminologies (I-ADOPT) WG Outputs and Recommendations


02 Feb 2022

By Barbara Magagna


InteroperAble Descriptions of Observable Property Terminologies (I-ADOPT) WG

Group co-chairs: Barbara Magagna, Anusuriya Devaraju, Gwenaëlle Moncoiffé, Maria Stoica

Recommendation: InteroperAble Descriptions of Observable Property Terminologies (I-ADOPT) WG Outputs and Recommendations

Authors: Barbara Magagna; Gwenaëlle Moncoiffé; Anusuriya Devaraju; Maria Stoica; Sirko Schindler; Alison Pamment; Environment Agency Austria, Austria/University of Twente, NL; National Oceanography Centre/British Oceanographic Data Centre, UK; Terrestrial Ecosystem Research Network (TERN), University of Queensland, Australia; University of Colorado, Boulder, USA; Institute of Data Science, German Aerospace Center (DLR), Germany; National Centre for Atmospheric Science/UKRI, UK.

Impact: 

By proposing a systematic way of decomposing variable descriptions into their essential atomic components (for example the property measured, the chemical substance targeted, and the medium in which the observation was made), the I-ADOPT Interoperability Framework addresses not only the I (Interoperability) of the FAIR data principles but also benefits the Findability and Reusability of data. The output from the I-ADOPT WG will help research communities by enabling domain scientists, ontology engineers, data providers, and data curators to create and reuse complex, unambiguous variable descriptions in a machine- and human-readable format. Well-defined variables make data easier to find and reuse. Users and developers of the software needed to create, search, manipulate, aggregate, and publish data will thus be able to design and build tools based on this common framework, which facilitates the design process, fosters collaboration, and improves the user experience. When broadly adopted, the I-ADOPT Framework and Recommendations will improve the efficiency and specificity of cross-catalogue federated searches. They will also facilitate the automation or semi-automation of data workflows and enable faster, more reliable, and more reproducible data aggregation steps. In addition, data curators could use the components of the I-ADOPT Framework to standardise the ‘mandatory’ metadata elements that data authors should submit in order to identify the variables held in datasets.

Publishing the I-ADOPT Interoperability Framework and the terminology catalogue via GitHub enables community engagement and provides a means to continue building on this initial work.

Recommendation package DOI: 10.15497/RDA00071

Citation: Magagna, B., Moncoiffé, G., Devaraju, A., Stoica, M., Schindler, S., Pamment, A., Environment Agency Austria, Austria/University of Twente, NL, National Oceanography Centre/British Oceanographic Data Centre, UK, Terrestrial Ecosystem Research Network (TERN), University of Queensland, Australia, University of Colorado, Boulder, USA, Institute of Data Science, German Aerospace Centre (DLR), Germany, & National Centre for Atmospheric Science/UKRI, UK. (2022). InteroperAble Descriptions of Observable Property Terminologies (I-ADOPT) WG Outputs and Recommendations. Research Data Alliance. https://doi.org/10.15497/RDA00071

Abstract:

The InteroperAble Descriptions of Observable Property Terminologies Working Group (I-ADOPT WG) was formed in June 2019 under the auspices of the Research Data Alliance’s Vocabulary and Semantic Services Interest Group. Its objective was to develop a framework to harmonise the way observable properties are named and conceptualised in various communities, within and across scientific domains. It was recognised that the rapidly growing demand for controlled vocabularies specialised in describing observed properties (i.e. measured, simulated, or counted quantities, or qualitative observations) risked a proliferation of poorly aligned semantic resources. This, in turn, was becoming a source of confusion for end-users and a hindrance to data interoperability.

The development of the I-ADOPT Framework proceeded in multiple phases. Following the initial phase dedicated to the collection of user stories, the identification of key requirements, and an in-depth analysis of existing semantic representations of scientific variables and of terminologies in use, the group focused on identifying the essential components of the conceptual framework, reusing as much as possible concepts that were common to existing operational resources. The proposed framework was then tested against a variety of examples to ensure that it could be used as a sound basis for the creation of new variable names as needed. The results were formalised into the I-ADOPT ontology and subsequently extended with usage guidelines to form the I-ADOPT Framework presented in this document. The output can now be used to facilitate interoperability between existing semantic resources and to support the provision of machine-readable variable descriptions whose components are mapped to FAIR vocabulary concepts. The group also issued the following six key recommendations:

  1. Data creators, curators or publishers should describe the variable(s) held in datasets in both a human- and a machine-readable format.
  2. The variable’s description should enable data reuse with minimum reliance on externally held free-text documentation.
  3. The machine-readable description should make use of FAIR terminologies (e.g., controlled vocabularies, ontological relationships) adhering to Linked Data principles.
  4. The translation from human-readable to machine-readable form should follow a decomposition approach that is compatible with the classes and relations defined in the I-ADOPT ontology (https://w3id.org/iadopt/).
  5. Users should preferably reuse terminologies that are already aligned with the I-ADOPT Framework by either reusing existing concepts or extending collections, or by creating new concepts based on the I-ADOPT Framework.
  6. For variables based on a different schema, a mapping to the I-ADOPT Framework should be provided.
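
As an illustration of recommendations 3 and 4, the following is a minimal sketch of a decomposed, machine-readable variable description written as a JSON-LD-style structure. It is not taken from the WG outputs. The namespace prefix and the term names used here (iop:Variable, iop:hasProperty, iop:hasObjectOfInterest, iop:hasMatrix) are assumptions about the I-ADOPT ontology and should be checked against https://w3id.org/iadopt/. The example.org concept URIs and the describe_variable helper are hypothetical placeholders.

```python
import json

# Assumed namespace for I-ADOPT ontology terms; verify at https://w3id.org/iadopt/
IOP = "https://w3id.org/iadopt/ont/"


def describe_variable(label, prop, object_of_interest, matrix=None):
    """Build a JSON-LD-style dict for one variable split into its components.

    Hypothetical helper: each component is referenced by the URI of a concept
    that would, in practice, come from a FAIR terminology.
    """
    doc = {
        "@context": {
            "iop": IOP,
            "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
        },
        "@type": "iop:Variable",
        "rdfs:label": label,  # human-readable form
        "iop:hasProperty": {"@id": prop},
        "iop:hasObjectOfInterest": {"@id": object_of_interest},
    }
    if matrix is not None:
        # The matrix (medium) is optional and only added when supplied.
        doc["iop:hasMatrix"] = {"@id": matrix}
    return doc


# Placeholder concept URIs; real descriptions would point to published terms.
variable = describe_variable(
    label="Concentration of nitrate in sea water",
    prop="https://example.org/vocab/concentration",
    object_of_interest="https://example.org/vocab/nitrate",
    matrix="https://example.org/vocab/sea_water",
)

print(json.dumps(variable, indent=2))
```

The label carries the human-readable form, while each component points to a concept URI that a machine can resolve and match across catalogues.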

The group also set up public repositories to continue open collaboration and give access to resources that will be maintained and/or developed beyond the lifetime of the official RDA working group: 1) a catalogue of terminologies relevant to observable properties; 2) a repository of design patterns; 3) a step-by-step guide for minting new variables; 4) use-case specific guidelines on implementing the framework; 5) a repository of applications and user implementation stories; 6) additional materials including a list of alignments with other ontological resources.

 

Output Status: RDA Endorsed Recommendations
Review period: Thursday, 3 February 2022 to Thursday, 3 March 2022
Primary Domain/Field of Expertise: Natural Sciences, Natural Sciences/Earth and related environmental sciences, Natural Sciences/Biological sciences
  • Author: Rowland Mosbergen

    Date: 02 Mar, 2022

    Just an example.

    In the past, macrophages were divided into two types, M1 and M2.

    But it is now recognised that this is more of a spectrum than two separate subtypes.

    I think this explains it: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3944738/

    So imagine we have two datasets: one from when M1 and M2 were the standard way to subtype, and one from after the M1/M2 standard was no longer considered relevant.

    How would this look in the same schema?

    How do we know when we need to identify out-of-date scientific methods and change the metadata?

    Or do we always have to re-curate datasets before we analyse them for our research question?

    I guess I would like to see how this works across time as opposed to at a single point in time.

  • Author: Barbara Magagna

    Date: 10 Mar, 2022

    Answer by the I-ADOPT core group:
    What you seem to be asking for is a way to support the evolution of terms within the framework. However, the scope of the framework is to provide a mechanism for expressing the variable, not a mechanism for documenting how the variable attached to a particular resource has evolved over time. Providing both the original and the updated variable representations using the same framework helps document which components have changed and which have remained the same. The terms used for the components should follow FAIR principles and have approved governance practices for pointing to a newer concept when a term is deprecated. The responsibility for documenting the provenance of terms therefore lies with the underlying terminology for each component and not with the I-ADOPT Framework. Happy to discuss further if this doesn't answer your question.
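
The point about comparing an original and an updated representation can be sketched as follows. This is an editorial illustration only, not code from the I-ADOPT core group. The component roles, the macrophage values, and the compare_components helper are all hypothetical and chosen purely to mirror the question above.

```python
def compare_components(original, updated):
    """Return (changed, unchanged) components of two variable decompositions.

    Each decomposition is a plain dict mapping a component role to a term.
    'changed' maps a role to an (old, new) pair, where None means absent.
    """
    roles = sorted(set(original) | set(updated))
    changed = {
        role: (original.get(role), updated.get(role))
        for role in roles
        if original.get(role) != updated.get(role)
    }
    unchanged = {
        role: original[role]
        for role in roles
        if role in original and original.get(role) == updated.get(role)
    }
    return changed, unchanged


# Hypothetical decompositions; the terms are illustrative, not curated concepts.
original = {
    "property": "count",
    "object_of_interest": "M1 macrophage",
    "matrix": "tissue sample",
}
updated = {
    "property": "count",
    "object_of_interest": "macrophage",
    "matrix": "tissue sample",
    "constraint": "pro-inflammatory activation state",
}

changed, unchanged = compare_components(original, updated)
print("changed:", changed)
print("unchanged:", unchanged)
```

Because both descriptions share the same component roles, the comparison is a simple role-by-role check. How a deprecated term points to its successor would still be handled by the terminology that publishes it.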

  • Author: Dieter Maier

    Date: 02 Mar, 2022

    Thank you very much for this very well-structured recommendation and the associated tools. The variable design pattern in particular will, I believe, be very helpful.

    The proposed I-ADOPT ontology integrates the matrix of a measurement (e.g. water) and some of the properties of the measurement (e.g. molar concentration per unit mass) into a single "variable" object. This will lead to a multiplication of variable instances, a typical issue of pre-coordination approaches. Have you considered and discussed a more atomistic option that would assign matrix and unit separately to the measurement, using the "variable" only as a semantic representation of the measured property? Such an approach may also better address the issues of increasing knowledge and differentiation raised by Rowland's comment.

  • Author: Barbara Magagna

    Date: 10 Mar, 2022

    Answer by the I-ADOPT core group: 

    If we understood you correctly, you propose not to use the variable as a unifying concept but to use only the descriptive components (property, matrix, etc.) to annotate the measurement result itself. You say that the variable approach would otherwise lead to an explosion of instances. This would be true if all possible combinations were provided in advance, but in reality only those combinations that the respective community actually needs will be created. In addition, we want to build reusable semantic concepts. The semantic (decomposition) work and the publication of variables with their components should be done by semantic experts (in cooperation with domain experts) and not by the scientists who want to annotate their measurements. The scientists should simply be able to pick the right variable and, only if the variable is missing, propose a new one. We want to provide supporting services for both workflows.

    We propose using variables because this is the current practice in many data provider repositories. Our goal is to find interoperable solutions for existing approaches, that is, variable descriptions that only need a semantic decomposition to become more transparent, precise, matchable and thus reusable.

  • Author: Johannes Peterseil

    Date: 04 Mar, 2022

    The proposed I-ADOPT framework and ontology provide a good basis for integrating variable names across different data sources. This will be one of the applications within eLTER's work. In addition, eLTER will adopt and apply the I-ADOPT recommendations to further develop EnvThes, focusing on the variable names used. The inclusion of the matrix, e.g. soil water, is important for specifying the variable in detail when using it in, for example, SensorML and OGC SOS.
