Advice on new controlled vocabulary

16 Feb 2018

Hi all,

I'm involved in a project to establish a global network for long-term agricultural experiments (experiments running for >10 years). One aim of the project is to improve access to and interoperability of datasets from these experiments. In most cases I think existing ontologies/vocabularies can be used, but experiment names, fields and plots are not represented by any controlled vocabulary. Creating a concept ID for each name would mean an experiment, field or plot can be unambiguously referenced, e.g. in DOI metadata.

I think SKOS is an appropriate model since it can capture:

  • Preferred and alternative names for experiments, experiment sites/fields and plots
  • Relations between a field/site and the experiments conducted on it
  • Relations between experiment plots, their experiment and their field, and relations between plots (where they've been merged or split into sub-plots).
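
As an illustration of the three points above, here is a minimal sketch in Python (standard library only) that writes a few SKOS concepts as Turtle. Everything specific in it is an assumption made for the example: the `ex:` namespace, the identifiers `field-f`, `experiment-x` and `plot-a`, the label "Expt X", and the choice of `skos:related` for field/experiment links and `skos:broader` for plot/experiment links. It is not a proposed design.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical namespace; a real vocabulary would use its own resolvable base URI.
PREFIXES = (
    "@prefix ex: <http://example.org/lte/> .\n"
    "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .\n"
)


@dataclass
class Concept:
    local_id: str                                       # made-up identifier
    pref_label: str
    alt_labels: List[str] = field(default_factory=list)
    broader: List[str] = field(default_factory=list)    # local ids of parent concepts
    related: List[str] = field(default_factory=list)    # local ids of associated concepts

    def to_turtle(self) -> str:
        """Serialise this concept as a single Turtle statement."""
        pairs = [("a", "skos:Concept"),
                 ("skos:prefLabel", f'"{self.pref_label}"@en')]
        pairs += [("skos:altLabel", f'"{label}"@en') for label in self.alt_labels]
        pairs += [("skos:broader", f"ex:{target}") for target in self.broader]
        pairs += [("skos:related", f"ex:{target}") for target in self.related]
        body = " ;\n    ".join(f"{pred} {obj}" for pred, obj in pairs)
        return f"ex:{self.local_id} {body} ."


# Illustrative records only; labels containing quotes would need escaping.
field_f = Concept("field-f", "Field F")
experiment_x = Concept("experiment-x", "Experiment X",
                       alt_labels=["Expt X"], related=["field-f"])
plot_a = Concept("plot-a", "Plot A", broader=["experiment-x"])

document = PREFIXES + "\n" + "\n\n".join(
    c.to_turtle() for c in (field_f, experiment_x, plot_a)
) + "\n"
print(document)
```

SKOS itself has no dedicated relation for plots that have been split or merged, so that part would need either `skos:related` with a note or a custom property.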

I would appreciate this group's thoughts on the following questions:

  1. Is it better to create a new vocabulary specific for long-term agricultural experiments or add to an existing agriculture domain vocabulary (e.g. Agrovoc, GACS)?
  2. If we build a new vocabulary where is the best place to host it? Agroportal?
  3. If successful, the vocabulary would have information contributed by many institutes. Any advice on collecting and curating this data?

I will be at the Berlin meeting so happy to discuss this and the network further there.

thanks

Richard

    Author: Armando Stellato

    Date: 16 Feb, 2018

    Dear Richard,
    My two cents on the matter…
    I will address first the three points about your rationale for SKOS:
    * Preferred/alternative labels: you can still use SKOS labels with classes/properties and instances of an OWL ontology so this should not influence your choice on the model
    * Capture etc… : SKOS is an OWL vocabulary. Everything you can do in SKOS can be done in OWL as well
    So, this is not at all to advise against using SKOS; it is just to point out that none of the above necessarily argues in favor of SKOS.
    Actually, you first need to discover all the entities and relations that you need to model, and then you can take the best shot at the modeling vocabulary(es) to use.
    From the picture I got from the first requirements outlined in your email, it might well be better to describe an ontology of agricultural experiments rather than a thesaurus. A combination of both is also worth considering; it’s just a matter of what you need to represent.
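
    To make the ontology option concrete, here is a minimal Python sketch (standard library only) that writes three OWL classes, two object properties and a few typed individuals that still carry SKOS labels. The `ex:` namespace, the class names, the properties `ex:conductedOn` and `ex:partOf`, and all identifiers and labels are assumptions invented for the example, not part of any existing vocabulary.

```python
# Hypothetical namespace and names throughout; only owl:, rdfs: and skos: are real.
PREFIXES = """\
@prefix ex: <http://example.org/lte/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
"""


def statement(subject, pairs):
    """Join predicate/object pairs into a single Turtle statement."""
    body = " ;\n    ".join(f"{pred} {obj}" for pred, obj in pairs)
    return f"{subject} {body} ."


def owl_class(name):
    return statement(f"ex:{name}", [("a", "owl:Class")])


def object_property(name, domain, range_):
    return statement(f"ex:{name}", [
        ("a", "owl:ObjectProperty"),
        ("rdfs:domain", f"ex:{domain}"),
        ("rdfs:range", f"ex:{range_}"),
    ])


def individual(name, cls, pref_label, alt_labels=(), links=()):
    """An instance of an OWL class that is labelled with SKOS label properties."""
    pairs = [("a", f"ex:{cls}"), ("skos:prefLabel", f'"{pref_label}"@en')]
    pairs += [("skos:altLabel", f'"{label}"@en') for label in alt_labels]
    pairs += [(f"ex:{prop}", f"ex:{target}") for prop, target in links]
    return statement(f"ex:{name}", pairs)


blocks = [
    owl_class("Field"), owl_class("Experiment"), owl_class("Plot"),
    object_property("conductedOn", "Experiment", "Field"),
    object_property("partOf", "Plot", "Experiment"),
    individual("field-f", "Field", "Field F"),
    individual("experiment-x", "Experiment", "Experiment X",
               alt_labels=["Expt X"], links=[("conductedOn", "field-f")]),
    individual("plot-a", "Plot", "Plot A", links=[("partOf", "experiment-x")]),
]
ontology = PREFIXES + "\n" + "\n\n".join(blocks) + "\n"
print(ontology)
```

    Compared with a pure SKOS thesaurus, the relations here are named and typed, which is the extra expressivity an ontology buys; the labels are handled the same way in both cases.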
    Concerning your specific questions:
    1. Again, it depends on what you need to model. I would suggest properly modelling your central entities (experiment, properties of an experiment, etc.) and then using Agrovoc/GACS/whatever as a source of “topics” for these experiments (this is what thesauri are good for).
    2. First of all, the web is meant to be decentralized :-) so any place is fine provided that there is a URL for it and (quality points added) that you publish good metadata for it. That said, aggregators always do a good job and help to advertise content! So +1 for Agroportal!
    * Bonus (rhetorical) question with the answer: are you interested in publishing only the vocabulary (as you said in the question) or also the data (which seems to be the case given your project)? If you have plenty of data (as you said in question 3, you will also have plenty of contributions) then this means datasets, not just vocabularies. While for the latter a URL is fine, the former would benefit from SPARQL endpoints, services for HTTP resolution of each single entity both for machines and for humans, etc.
    3. This is a very broad question. I hate to say “it depends” but... it depends ;-) There are approaches in which an organization gets results from partners all over the world, translates them into a single model and takes care of the publication in a very centralized manner (e.g. FAO with AGRIS [1,2]), and cases where the production of the resource is more horizontal and collaborative (e.g. Agrovoc, collaboratively maintained through VocBench [3]), though if it has to be an authoritative resource, some validating entity is still necessary.
    In any case, good you will be in Berlin, meet you there!
    Cheers,
    Armando
    [1] https://content.iospress.com/articles/semantic-web/sw128
    [2] http://agris.fao.org/
    [3] http://vocbench.uniroma2.it

    Author: Richard Ostler

    Date: 16 Feb, 2018

    Hi Armando,
    Thanks for your reply.
    My thinking is that SKOS provides a structure for capturing names of things and the relatively simple relationships between them. However, if, for example, we want to define classes of experiment types (e.g. rotation, continuous cropping, organic, fertiliser, tillage etc.), then yes, a more fully developed ontology would be needed. So the decision is whether it is more appropriate to have a SKOS vocabulary for naming experiments, sites and plots, or an ontology which can describe relationships between these concepts as well.
    For populating the SKOS/Ontology I think it has to be the horizontal & collaborative approach – the definitive sources for the experiments are the Research Institutes hosting them.
    The intention is to publish this as a public resource, so for us it is where to publish to get maximum impact!
    As far as experiment metadata goes, I like the idea of a minimum information (MI) schema/checklist. Existing ontologies like the Crop Ontology could be used for this, and something like MIAPPE has potential as an MI schema. However, we also have site and plot metadata to deal with for these experiments, and this may need a different approach. Since we want to provide a semantic framework for capturing metadata, presumably we could expose a SPARQL endpoint. Ideally we would like to develop a discovery portal for long-term experiments, so that would make sense.
    For info, Rothamsted Research is hosting a 3 day conference in May on the future of long-term experiments. A key theme will be the Global Long-term (Agricultural) Experiments Network – we have a colleague from IRRI working with us who’ll present. The second day will include a data workshop and it would be great to see some Ag data people there. https://www.rothamsted.ac.uk/events/future-long-term-experiments-agricul....
    Thanks
    Richard
    Rothamsted Research is a company limited by guarantee, registered in England at Harpenden, Hertfordshire, AL5 2JQ under the registration number 2393175 and a not for profit charity number 802038.

    Author: Clement Jonquet

    Date: 20 Feb, 2018

    Dear Richard,
    All
    I will not get into the SKOS or OWL debate. Depending on what you want to model, SKOS is a good start; ultimately you might need to move to OWL for more expressivity.
    Could you tell us more about:
    > but experiment names, fields and plots are not represented by any controlled vocabulary.
    Can you list exactly the terms/concepts that you are interested in modeling? Is the only specificity of these experiments the fact that they are long-term?
    Because I have the feeling you can certainly attach this to other existing vocabularies out there…
    You might consider using the AgroPortal Recommender, which could help identify which ontologies have good coverage of a given list of terms.
    In case you decide to create your own new URIs (because you really have to say something specific about them), we can help you map your new vocabulary to existing ones in AgroPortal (we now have someone in the team, Elcio, who recently joined us to support users in this process). And of course, host it in AgroPortal.
    Will be in Berlin too, to discuss.
    Clement
    -------------------------------------------------------------------------------------------
    Dr. Clement JONQUET - PhD in Informatics - Assistant Professor
    University of Montpellier
    http://www.lirmm.fr/~jonquet
    ------------------------------------------------------------------------------------------

    Author: Richard Ostler

    Date: 20 Feb, 2018

    Hi Clement,
    Thanks for your reply. I think more background would be useful for explaining my motivation.
    At Rothamsted we have several agriculture long-term experiments (e.g. Broadbalk winter wheat, Park grass, Hoosfield spring barley and more). Globally, with other agricultural research institutes, we’re starting to identify other long-term experiments and develop a better network for utilising them – there is growing interest in meta-analyses across diverse LTEs for questions on yield sustainability/sustainable intensification and natural capital.
    From our initial work it is apparent that no metadata standards or recommendations are widely used in this community.
    Some defining characteristics of agriculture long-term experiments are:
    1. They run for many years (decades) and an experimental plot will receive the same treatment regime over that time
    2. The experiments and their plots have names or identifiers which persist over time; however, they can be known by different names in the literature.
    3. The experiment design can be modified (e.g. plots split or renamed, treatments modified) – this is a particular problem for older experiments with less robust/modern statistical designs
    4. Experiments can be used for research beyond their original purpose and new, but related, datasets generated. For example, our primary datasets are yield, but additional datasets on soils, biodiversity and phenotypic traits have been generated. These can all be linked to an experiment plot but currently there isn’t a recommendation on persistent identifiers for doing this.
    For point 2 above, a gazetteer-like controlled vocabulary of long-term experiment, field and plot names could provide a resource for consistently naming these things and semantically tagging resources that reference them. It could also capture alternative names and basic relationships between different classes, e.g. Plot A is part of Experiment X. The main classes would be site/field, experiment and plot. Potentially you could expand this into an ontology to capture other relationships, for example: Experiment X is a Rotation Experiment.
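
    The gazetteer idea can be sketched as a small name index: every preferred or alternative name resolves to one identifier, and a "part of" link ties plots to experiments. This is only an illustration in Python, assuming made-up identifiers (`field-f`, `experiment-x`, `plot-a`), made-up alternative names and a simple case- and whitespace-insensitive matching rule; none of these come from an existing standard.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Entry:
    identifier: str                    # made-up id standing in for a persistent identifier
    kind: str                          # "site", "experiment" or "plot"
    pref_name: str
    alt_names: Tuple[str, ...] = ()
    part_of: Optional[str] = None      # identifier of the parent entry, if any


def normalise(name: str) -> str:
    """Collapse whitespace and ignore case so name variants compare equal."""
    return " ".join(name.split()).casefold()


class Gazetteer:
    def __init__(self) -> None:
        self._entries: Dict[str, Entry] = {}
        self._by_name: Dict[str, str] = {}

    def add(self, entry: Entry) -> None:
        if entry.identifier in self._entries:
            raise ValueError(f"duplicate identifier: {entry.identifier}")
        names = [normalise(n) for n in (entry.pref_name, *entry.alt_names)]
        # Refuse a name that already points at a different entry.
        for key in names:
            owner = self._by_name.get(key)
            if owner is not None:
                raise ValueError(f"name {key!r} already refers to {owner}")
        self._entries[entry.identifier] = entry
        for key in names:
            self._by_name[key] = entry.identifier

    def resolve(self, name: str) -> Optional[Entry]:
        """Return the entry known by this preferred or alternative name."""
        identifier = self._by_name.get(normalise(name))
        return self._entries[identifier] if identifier is not None else None

    def parts_of(self, identifier: str) -> List[Entry]:
        """Return the entries recorded as part of the given entry."""
        return [e for e in self._entries.values() if e.part_of == identifier]


gazetteer = Gazetteer()
gazetteer.add(Entry("field-f", "site", "Field F"))
gazetteer.add(Entry("experiment-x", "experiment", "Experiment X",
                    alt_names=("Expt X", "X trial")))
gazetteer.add(Entry("plot-a", "plot", "Plot A", part_of="experiment-x"))

print(gazetteer.resolve("  expt   x ").identifier)
print([e.identifier for e in gazetteer.parts_of("experiment-x")])
```

    A real gazetteer would probably need to scope plot names by experiment, since names like "Plot 1" are unlikely to be unique across experiments; the global name index here is a simplification.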
    In fact I think the experiments, plots and fields should be assigned persistent identifiers and have relevant metadata captured. e.g. location, geographic & climate for the site; area, soil characteristics, treatment group for a plot; treatments; standard management & cropping for the experiment.
    I’ve looked closely at the DEIMS-SDR metadata models and MIAPPE and I don’t think either is quite sufficient to represent metadata for agricultural LTE datasets. I like the MI checklist approach and recommended ontologies taken by MIAPPE and I think existing ontologies can be used for much of the checklist detail – I think what is needed is an MI checklist relevant to agriculture LTEs.
    Thanks
    Richard

    Author: Valeria Pesce

    Date: 20 Feb, 2018

    Dear Richard,
    Interesting task.
    I would second Armando’s recommendation that for your use case you might need an ontology / model / schema for agricultural experiments rather than a thesaurus (you can have a thesaurus for the controlled values of specific properties).
    It’s also true that I’ve seen cases in which people use SKOS to create a list of properties that they need to describe something, but I don’t think it’s the best use of SKOS.
    And I would also highlight what Armando says about distinguishing between vocabularies and datasets: you will probably need vocabularies for the experiment description structure (a schema or an ontology) and for controlled values, but the data that you expect to be contributed would be the actual instances of the experiment data, which would ideally go into a dataset, with records structured according to the vocabularies you designed.
    The only suggestion I would add is to look at what has already been done in a few projects (which you may already know):
    - The ICASA data standards for field experiments: https://dssat.net/data/standards_v2. I think the initial standard was a specification with possible serializations in CSV and XML. It provides a model and a list of variables. From what I understand there were plans to render ICASA in RDF in the TERRA-REF project (http://terraref.org/about/) and the ICASA data dictionary is also being mapped to various ontologies as part of the Agronomy Ontology project (http://www.obofoundry.org/ontology/agro.html).
    - Projects on crop modelling and sharing experiment data (no RDF yet, but useful models to base your RDF upon): AgMIP (in which they’re reusing the ICASA variables: http://research.agmip.org/display/dev/ICASA+Master+Variable+List), APSIM: http://www.apsim.info/, CGIAR AgTrials (http://www.agtrials.org/).
    - Useful RDF vocabularies to build upon or reuse: the Crop Research Ontology: http://agroportal.lirmm.fr/ontologies/CO_715; AgroRDF: http://data.igreen-services.com/agrordf.
    Hope this helps a little.
    Best regards,
    Valeria
    From: richard.ostler=***@***.***-groups.org [mailto:***@***.***-groups.org] On Behalf Of RichardOstler
    Sent: Tuesday, February 20, 2018 12:45 PM
    To: Clement Jonquet <***@***.***>; Agricultural Data Interest Group (IGAD) <***@***.***-groups.org>
    Subject: Re: [rda-agrdatainterop-ig] Advice on new controlled vocabulary
    Hi Clement,
    Thanks for your reply. I think more background would be useful for explaining my motivation.
    At Rothamsted we have several agricultural long-term experiments (e.g. Broadbalk winter wheat, Park Grass, Hoosfield spring barley and more). Globally, with other agricultural research institutes, we’re starting to identify other long-term experiments (LTEs) and develop a better network for utilising them; there is growing interest in meta-analyses across diverse LTEs for questions on yield sustainability/sustainable intensification and natural capital.
    From our initial work it is apparent that no metadata standards or recommendations are widely used in this community.
    Some defining characteristics of agriculture long-term experiments are:
    1. They run for many years (decades) and an experimental plot will receive the same treatment regime over that time
    2. The experiments and their plots have names or identifiers which persist over time; however, they can be known by different names in the literature.
    3. The experiment design can be modified (e.g. plots split or renamed, treatments modified); this is a particular problem for older experiments with less robust/modern statistical designs.
    4. Experiments can be used for research beyond their original purpose and new, but related, datasets generated. For example, our primary datasets are yield, but additional datasets on soils, biodiversity and phenotypic traits have been generated. These can all be linked to an experiment plot but currently there isn’t a recommendation on persistent identifiers for doing this.
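    To make point 3 concrete, here is a minimal Python sketch of how plot splits and renames could be recorded so that an old plot identifier can still be traced to the plots that exist today. The class name, the plot identifiers and the years are hypothetical and only serve the illustration.

```python
from collections import defaultdict


class PlotLineage:
    """Records which plots replaced which (splits and renames)."""

    def __init__(self):
        # old plot id -> list of (year of change, new plot id)
        self._successors = defaultdict(list)

    def record_change(self, old_id, new_ids, year):
        """A split has several new_ids; a rename has exactly one."""
        for new_id in new_ids:
            self._successors[old_id].append((year, new_id))

    def current_descendants(self, plot_id):
        """Follow successor links until reaching plots with no successor."""
        successors = self._successors.get(plot_id)
        if not successors:
            return [plot_id]
        result = []
        for _year, new_id in sorted(successors):
            result.extend(self.current_descendants(new_id))
        return result


lineage = PlotLineage()
lineage.record_change("plot-a", ["plot-a1", "plot-a2"], year=1968)  # hypothetical split
lineage.record_change("plot-a2", ["plot-b"], year=1990)             # hypothetical rename
```

    With these invented changes, lineage.current_descendants("plot-a") gives ["plot-a1", "plot-b"], so a dataset that cites the original plot can be linked to its present-day successors.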
    For point 2 above, a gazetteer-like controlled vocabulary of long-term experiment, field and plot names could provide a resource for consistently naming these things and semantically tagging resources that reference them. It could also capture alternative names and basic relationships between different classes, e.g. Plot A is part of Experiment X. The main classes would be site/field, experiment and plot. Potentially you could expand this into an ontology to capture other relationships, for example, Experiment X is a Rotation Experiment.
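    A minimal Python sketch of such a gazetteer follows, assuming one identifier per concept, a preferred label, optional alternative labels and a single parent link. The base URI, the identifiers and the labels "Field F" and "Expt X" are hypothetical; a real vocabulary would mint its own persistent identifiers and would probably need to scope plot names by experiment, since plot names are unlikely to be globally unique.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical base URI, used only for this illustration.
BASE = "https://example.org/lte/"


@dataclass
class Concept:
    local_id: str                     # hypothetical identifier
    kind: str                         # "site", "experiment" or "plot"
    pref_label: str
    alt_labels: List[str] = field(default_factory=list)
    parent: Optional[str] = None      # e.g. plot -> experiment -> site

    @property
    def uri(self) -> str:
        return BASE + self.local_id


class Gazetteer:
    def __init__(self):
        self._by_id = {}
        self._by_name = {}

    def add(self, concept):
        self._by_id[concept.local_id] = concept
        # Index preferred and alternative labels, ignoring case.
        for name in [concept.pref_label, *concept.alt_labels]:
            self._by_name[name.casefold()] = concept

    def resolve(self, name):
        """Return the concept known by this name, or None."""
        return self._by_name.get(name.strip().casefold())

    def ancestors(self, concept):
        """Walk parent links upwards, nearest parent first."""
        chain = []
        while concept.parent is not None:
            concept = self._by_id[concept.parent]
            chain.append(concept)
        return chain


gaz = Gazetteer()
gaz.add(Concept("field-f", "site", "Field F"))
gaz.add(Concept("experiment-x", "experiment", "Experiment X",
                alt_labels=["Expt X"], parent="field-f"))
gaz.add(Concept("plot-a", "plot", "Plot A", parent="experiment-x"))
```

    Here gaz.resolve("expt x") returns the same concept as the preferred name, and gaz.ancestors(gaz.resolve("Plot A")) walks up to the experiment and then the field. The same labels and links could equally be expressed in SKOS or OWL; the sketch only shows the shape of the data.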
    In fact I think the experiments, plots and fields should be assigned persistent identifiers and have relevant metadata captured, e.g. location, geography & climate for the site; area, soil characteristics, treatment group for a plot; treatments; standard management & cropping for the experiment.
    I’ve looked closely at the DEIMS-SDR metadata models and MIAPPE and I don’t think either is quite sufficient to represent metadata for agricultural LTE datasets. I like the MI checklist approach and recommended ontologies taken by MIAPPE and I think existing ontologies can be used for much of the checklist detail – I think what is needed is an MI checklist relevant to agriculture LTEs.
    Thanks
    Richard
    From: Clement Jonquet [mailto:***@***.***] On Behalf Of Clement Jonquet
    Sent: 20 February 2018 06:13
    To: Richard Ostler <***@***.***>
    Cc: Agricultural Data Interest Group (IGAD) <***@***.***-groups.org>
    Subject: Re: [rda-agrdatainterop-ig] Advice on new controlled vocabulary
    Dear Richard,
    All
    I will not get into the SKOS or OWL debate. Depending on what you want to model, SKOS is a good starting point, though ultimately you might need to move to OWL for more expressivity.
    Could you tell us more about:
    "but experiment names, fields and plots are not represented by any controlled vocabulary."
    Can you list exactly the terms/concepts that you are interested in modeling? Is the only specificity of these experiments the fact that they are long-term?
    Because I have the feeling you can certainly attach this to other existing vocabularies out there…
    You might consider using the AgroPortal Recommender, which could help identify which ontologies have good coverage of a given list of terms.
    In case you decide to create your own new URIs (because you really have to say something specific about them), we can help you map your new vocabulary to existing ones in AgroPortal (we have someone in the team, Elcio, who recently joined us to support users through this process). And of course, we can host it in AgroPortal.
    Will be in Berlin too, to discuss.
    Clement
    -------------------------------------------------------------------------------------------
    Dr. Clement JONQUET - PhD in Informatics - Assistant Professor
    University of Montpellier
    http://www.lirmm.fr/~jonquet
    ------------------------------------------------------------------------------------------
    Rothamsted Research is a company limited by guarantee, registered in England at Harpenden, Hertfordshire, AL5 2JQ under the registration number 2393175 and a not for profit charity number 802038.

  • Richard Ostler's picture

    Author: Richard Ostler

    Date: 20 Feb, 2018

    Thanks Valeria,
    I’ve looked at Crop Research Ontology and I think it has a generally good fit. I haven’t seen AgroRDF before, but from a quick look it seems to capture a lot of farm management data. I had a look at ICASA a few months ago, but from the new projects you mention it sounds like it should be worth another look.
    Thanks
    Richard

  • Simon Cox's picture

    Author: Simon Cox

    Date: 21 Feb, 2018

    Also consider the W3C SSN (Semantic Sensor Network) ontology for sampling and observations.
    w3.org/TR/vocab-ssn
    - Show quoted text -From: valeria.pesce=***@***.***-groups.org <***@***.***-groups.org> on behalf of valeriapesce <***@***.***>
    Sent: Tuesday, 20 February 2018 11:50:40 AM
    To: RichardOstler; Clement Jonquet; Agricultural Data Interest Group (IGAD)
    Subject: Re: [rda-agrdatainterop-ig] Advice on new controlled vocabulary
    Dear Richard,
    Interesting task.
    I would second Armando’s recommendation that for your use case you might need an ontology / model / schema for agricultural experiments rather than a thesaurus (you can have a thesaurus for the controlled values of specific properties).
    It’s also true that I’ve seen cases in which people use SKOS to create a list of properties that they need to describe something, but I don’t think it’s the best use of SKOS.
    And I would also highlight what Armando says about distinguishing between vocabularies and datasets: you will probably need vocabularies for the experiment description structure (a schema or an ontology) and for controlled values, but the data that you expect will be contributed would be the actual instances of the experiments data, which would ideally go into a dataset, with records structured according to the vocabularies you designed.
    The only suggestion I would add is to look at what has already been done in a few projects (which you may already know):
    - The ICASA data standards for field experiments: https://dssat.net/data/standards_v2. I think the initial standard was a specification with possible serializations in CSV and XML. It provides a model and a list of variables. From what I understand there were plans to render ICASA in RDF in the TERRA-REF project (http://terraref.org/about/) and the ICASA data dictionary is also being mapped to various ontologies as part of the Agronomy Ontology project (http://www.obofoundry.org/ontology/agro.html).
    - Projects on crop modelling and sharing experiment data (no RDF yet, but useful models to base your RDF upon): AgMIP (in which they’re reusing the ICASA variables: http://research.agmip.org/display/dev/ICASA+Master+Variable+List), APSIM: http://www.apsim.info/, CGIAR AgTrials (http://www.agtrials.org/).
    - Useful RDF vocabularies to build upon or reuse: the Crop Research Ontology: http://agroportal.lirmm.fr/ontologies/CO_715; AgroRDF: http://data.igreen-services.com/agrordf.
    Hope this helps a little.
    Best regards,
    Valeria
    From: richard.ostler=***@***.***-groups.org [mailto:***@***.***-groups.org] On Behalf Of RichardOstler
    Sent: Tuesday, February 20, 2018 12:45 PM
    To: Clement Jonquet <***@***.***>; Agricultural Data Interest Group (IGAD) <***@***.***-groups.org>
    Subject: Re: [rda-agrdatainterop-ig] Advice on new controlled vocabulary
    Hi Clement,
    Thanks for your reply. I think more background would be useful for explaining my motivation.
    At Rothamsted we have several agriculture long-term experiments (e.g. Broadbalk winter wheat, Park grass, Hoosfield spring barley and more). Globally, with other agricultural research institutes, we’re starting to identify other long-term experiments and develop a better network for utilising them – there is growing interest in meta-analyses across diverse LTEs for questions on yield sustainability/sustainable intensification and natural capital.
    From our initial work it is apparent there are not any metadata standards or recommendations being widely used in his community.
    Some defining characteristics of agriculture long-term experiments are:
    1. They run for many years (decades) and an experimental plot will receive the same treatment regime over that time
    2. The experiments and their plots have names or identifiers which persist over time, however they can be known by different names in the literature.
    3. The experiment design can be modified (e.g. plots split or renamed, treatments modified)– this is a particular problem for older experiments with less robust/modern statistical designs
    4. Experiments can be used for research beyond their original purpose and new, but related, datasets generated. For example, our primary datasets are yield, but additional datasets on soils, biodiversity and phenotypic traits have been generated. These can all be linked to an experiment plot but currently there isn’t a recommendation on persistent identifiers for doing this.
    For point 2 above, a gazetteer-like controlled vocabulary of Long-term experiment, field and plot names could provide a resource for consistently naming these things and semantically tagging resources referencing them. It could also capture alternative names and basic relationships between different classes. – e.g. Plot A is part of Experiment X. The main classes would be site/field, experiment, plot. Potentially you could expand this into an ontology to capture other relationships for example, Experiment X is a Rotation Experiment.
    In fact I think the experiments, plots and fields should be assigned persistent identifiers and have relevant metadata captured, e.g. location, geography and climate for the site; area, soil characteristics and treatment group for a plot; treatments, standard management and cropping for the experiment.
    I’ve looked closely at the DEIMS-SDR metadata models and MIAPPE and I don’t think either is quite sufficient to represent metadata for agricultural LTE datasets. I like the MI checklist approach and recommended ontologies taken by MIAPPE and I think existing ontologies can be used for much of the checklist detail – I think what is needed is an MI checklist relevant to agriculture LTEs.
    Thanks
    Richard
    From: Clement Jonquet [mailto:***@***.***] On Behalf Of Clement Jonquet
    Sent: 20 February 2018 06:13
    To: Richard Ostler <***@***.***>
    Cc: Agricultural Data Interest Group (IGAD) <***@***.***-groups.org>
    Subject: Re: [rda-agrdatainterop-ig] Advice on new controlled vocabulary
    Dear Richard,
    All
    I will not get into the SKOS or OWL debate. Depending on what you want to model, SKOS is a good starting point, though ultimately you might need to move to OWL for more expressivity.
    Could you tell us more about this statement: "but experiment names, fields and plots are not represented by any controlled vocabulary"?
    Can you list exactly the terms/concepts that you are interested in modelling? Is the only thing specific to these experiments the fact that they are long-term?
    I ask because I have the feeling you can certainly attach this to other existing vocabularies out there…
    You might consider using the AgroPortal Recommender, which could help identify which ontologies have good coverage of a given list of terms.
    If you decide to create your own new URIs (because you really have something specific to say about them), we can help you map your new vocabulary to existing ones in AgroPortal (we now have someone in the team, Elcio, who recently joined us to support users in this process). And of course, we can host it in AgroPortal.
    I will be in Berlin too, to discuss.
    Clement
    -------------------------------------------------------------------------------------------
    Dr. Clement JONQUET - PhD in Informatics - Assistant Professor
    University of Montpellier
    http://www.lirmm.fr/~jonquet
    ------------------------------------------------------------------------------------------
    Also consider the W3C SSN ontology for sampling and observations.
    w3.org/TR/vocab-ssn
    ________________________________
    From: valeria.pesce=***@***.***-groups.org <***@***.***-groups.org> on behalf of valeriapesce <***@***.***>
    Sent: Tuesday, 20 February 2018 11:50:40 AM
    To: RichardOstler; Clement Jonquet; Agricultural Data Interest Group (IGAD)
    Subject: Re: [rda-agrdatainterop-ig] Advice on new controlled vocabulary
    Dear Richard,
    Interesting task.
    I would second Armando’s recommendation that for your use case you might need an ontology / model / schema for agricultural experiments rather than a thesaurus (you can have a thesaurus for the controlled values of specific properties).
    It’s also true that I’ve seen cases in which people use SKOS to create a list of properties that they need to describe something, but I don’t think it’s the best use of SKOS.
    And I would also highlight what Armando says about distinguishing between vocabularies and datasets: you will probably need vocabularies for the experiment description structure (a schema or an ontology) and for controlled values. The data that you expect to be contributed, however, would be the actual instances of the experiment data, which would ideally go into a dataset, with records structured according to the vocabularies you designed.
    The only suggestion I would add is to look at what has already been done in a few projects (which you may already know):
    - The ICASA data standards for field experiments: https://dssat.net/data/standards_v2. I think the initial standard was a specification with possible serializations in CSV and XML. It provides a model and a list of variables. From what I understand there were plans to render ICASA in RDF in the TERRA-REF project (http://terraref.org/about/) and the ICASA data dictionary is also being mapped to various ontologies as part of the Agronomy Ontology project (http://www.obofoundry.org/ontology/agro.html).
    - Projects on crop modelling and sharing experiment data (no RDF yet, but useful models to base your RDF upon): AgMIP (in which they’re reusing the ICASA variables: http://research.agmip.org/display/dev/ICASA+Master+Variable+List), APSIM: http://www.apsim.info/, CGIAR AgTrials (http://www.agtrials.org/).
    - Useful RDF vocabularies to build upon or reuse: the Crop Research Ontology: http://agroportal.lirmm.fr/ontologies/CO_715; AgroRDF: http://data.igreen-services.com/agrordf.
    Hope this helps a little.
    Best regards,
    Valeria

  • Richard Ostler's picture

    Author: Richard Ostler

    Date: 21 Feb, 2018

    Thanks for the tips.
    So far I’ve mostly been using the Ontology Lookup Service (OLS) and AgroPortal, and have been selecting measurement concepts from the Unit Ontology (https://github.com/bio-ontology-research-group/unit-ontology) and the Crop Ontology. However, I have found a few gaps with older imperial measures (e.g. hundredweight), which the Wageningen unit ontology does include, so it’s a shame it isn’t listed/searchable on OLS and AgroPortal. I’ve also checked BioPortal and FAIRsharing and it isn’t listed on those either.
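    As an illustration of why the older imperial units need an unambiguous concept, here is a small sketch of converting a historical yield to metric. It assumes the yield was recorded in imperial (long) hundredweight per acre, which is an assumption for the example rather than something stated above; the function names are hypothetical. The conversion factors themselves are the exact defined values for the international pound and acre.

```python
LB_TO_KG = 0.45359237        # international avoirdupois pound, exact by definition
LONG_CWT_LB = 112            # imperial (long) hundredweight in pounds
SHORT_CWT_LB = 100           # US (short) hundredweight in pounds
ACRE_TO_HA = 0.40468564224   # international acre in hectares, exact


def cwt_to_kg(value: float, cwt_lb: int = LONG_CWT_LB) -> float:
    """Convert hundredweight to kilograms (long cwt unless told otherwise)."""
    return value * cwt_lb * LB_TO_KG


def cwt_per_acre_to_t_per_ha(value: float, cwt_lb: int = LONG_CWT_LB) -> float:
    """Convert a yield in cwt/acre to tonnes per hectare."""
    kg_per_ha = cwt_to_kg(value, cwt_lb) / ACRE_TO_HA
    return kg_per_ha / 1000.0


# Hypothetical recorded yield of 20 cwt/acre, interpreted both ways.
long_yield = cwt_per_acre_to_t_per_ha(20)
short_yield = cwt_per_acre_to_t_per_ha(20, cwt_lb=SHORT_CWT_LB)
```

    The long and short hundredweight differ by 12%, so a dataset that says only "cwt" is ambiguous; tagging the value with a unit concept that distinguishes the two removes that ambiguity.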
    Thanks
    Richard
    From: ***@***.*** [mailto:***@***.***]
    Sent: 21 February 2018 08:20
    To: simon.cox <***@***.***>
    Cc: valeria pesce <***@***.***>; Richard Ostler <***@***.***>; ***@***.***; ***@***.***-groups.org
    Subject: Re: [rda-agrdatainterop-ig] Advice on new controlled vocabulary
    You may also be interested in ontologies for units of measurement, like http://www.wurvoc.org/vocabularies/om-1.6/ .
    ________________________________
    From: "simon.cox" <***@***.***>
    To: "valeria pesce" <***@***.***>, "richard ostler" <***@***.***>, ***@***.***, ***@***.***-groups.org
    Sent: Wednesday, 21 February, 2018 4:07:17 AM
    Subject: Re: [rda-agrdatainterop-ig] Advice on new controlled vocabulary
    Also consider the w3c SSN ontology for sampling and observations.
    w3.org/TR/vocab-ssn
    Thanks for the tips.
    So far I’ve mostly been using ontology lookup service and agroportal, and have been selecting measurement concepts from Unit Ontology (https://github.com/bio-ontology-research-group/unit-ontology) and Crop Ontology. However, I have found a few gaps with older imperial measures (e.g. hundredweight) which the Wageningen unit ontology does include, so it’s a shame it isn’t listed/searchable on OLS and Agroportal - I’ve also checked bioportal and fairsharing and it isn’t listed on those either.
    Thanks
    Richard
    From: ***@***.*** [mailto:***@***.***]
    Sent: 21 February 2018 08:20
    To: simon.cox <***@***.***>
    Cc: valeria pesce <***@***.***>; Richard Ostler <***@***.***>; ***@***.***; ***@***.***-groups.org
    Subject: Re: [rda-agrdatainterop-ig] Advice on new controlled vocabulary
    You may also be interested in ontologies for units of measurement, like http://www.wurvoc.org/vocabularies/om-1.6/ .
    Sent via Webmail interface
    ________________________________
    From: "simon.cox" <***@***.***>
    To: "valeria pesce" <***@***.***>, "richard ostler" <***@***.***>, ***@***.***, ***@***.***-groups.org
    Sent: Wednesday, 21 February, 2018 4:07:17 AM
    Subject: Re: [rda-agrdatainterop-ig] Advice on new controlled vocabulary
    Also consider the w3c SSN ontology for sampling and observations.
    w3.org/TR/vocab-ssn
    ________________________________
    From: valeria.pesce=***@***.***-groups.org <***@***.***-groups.org> on behalf of valeriapesce <***@***.***>
    Sent: Tuesday, 20 February 2018 11:50:40 AM
    To: RichardOstler; Clement Jonquet; Agricultural Data Interest Group (IGAD)
    Subject: Re: [rda-agrdatainterop-ig] Advice on new controlled vocabulary
    Dear Richard,
    Interesting task.
    I would second Armando’s recommendation that for your use case you might need an ontology / model / schema for agricultural experiments rather than a thesaurus (you can have a thesaurus for the controlled values of specific properties).
    It’s also true that I’ve seen cases in which people use SKOS to create a list of properties that they need to describe something, but I don’t think it’s the best use of SKOS.
    And I would also highlight what Armando says about distinguishing between vocabularies and datasets: you will probably need vocabularies for the experiment description structure (a schema or an ontology) and for controlled values, but the data that you expect will be contributed would be the actual instances of the experiments data, which would ideally go into a dataset, with records structured according to the vocabularies you designed.
    The only suggestion I would add is to look at what has already been done in a few projects (which you may already know):
    - The ICASA data standards for field experiments: https://dssat.net/data/standards_v2. I think the initial standard was a specification with possible serializations in CSV and XML. It provides a model and a list of variables. From what I understand there were plans to render ICASA in RDF in the TERRA-REF project (http://terraref.org/about/) and the ICASA data dictionary is also being mapped to various ontologies as part of the Agronomy Ontology project (http://www.obofoundry.org/ontology/agro.html).
    - Projects on crop modelling and sharing experiment data (no RDF yet, but useful models to base your RDF upon): AgMIP (in which they’re reusing the ICASA variables: http://research.agmip.org/display/dev/ICASA+Master+Variable+List), APSIM: http://www.apsim.info/, CGIAR AgTrials (http://www.agtrials.org/).
    - Useful RDF vocabularies to build upon or reuse: the Crop Research Ontology: http://agroportal.lirmm.fr/ontologies/CO_715; AgroRDF: http://data.igreen-services.com/agrordf.
    Hope this helps a little.
    Best regards,
    Valeria
    - Show quoted text -From: richard.ostler=***@***.***-groups.org [mailto:***@***.***-groups.org] On Behalf Of RichardOstler
    Sent: Tuesday, February 20, 2018 12:45 PM
    To: Clement Jonquet <***@***.***>; Agricultural Data Interest Group (IGAD) <***@***.***-groups.org>
    Subject: Re: [rda-agrdatainterop-ig] Advice on new controlled vocabulary
    Hi Clement,
    Thanks for your reply. I think more background would be useful for explaining my motivation.
    At Rothamsted we have several agriculture long-term experiments (e.g. Broadbalk winter wheat, Park grass, Hoosfield spring barley and more). Globally, with other agricultural research institutes, we’re starting to identify other long-term experiments and develop a better network for utilising them – there is growing interest in meta-analyses across diverse LTEs for questions on yield sustainability/sustainable intensification and natural capital.
    From our initial work it is apparent there are not any metadata standards or recommendations being widely used in his community.
    Some defining characteristics of agriculture long-term experiments are:
    1. They run for many years (decades) and an experimental plot will receive the same treatment regime over that time
    2. The experiments and their plots have names or identifiers which persist over time, however they can be known by different names in the literature.
    3. The experiment design can be modified (e.g. plots split or renamed, treatments modified)– this is a particular problem for older experiments with less robust/modern statistical designs
    4. Experiments can be used for research beyond their original purpose and new, but related, datasets generated. For example, our primary datasets are yield, but additional datasets on soils, biodiversity and phenotypic traits have been generated. These can all be linked to an experiment plot but currently there isn’t a recommendation on persistent identifiers for doing this.
    For point 2 above, a gazetteer-like controlled vocabulary of Long-term experiment, field and plot names could provide a resource for consistently naming these things and semantically tagging resources referencing them. It could also capture alternative names and basic relationships between different classes. – e.g. Plot A is part of Experiment X. The main classes would be site/field, experiment, plot. Potentially you could expand this into an ontology to capture other relationships for example, Experiment X is a Rotation Experiment.
    In fact I think the experiments, plots and fields should be assigned persistent identifiers and have relevant metadata captured. e.g. location, geographic & climate for the site; area, soil characteristics, treatment group for a plot; treatments; standard management & cropping for the experiment.
    I’ve looked closely at the DEIMS-SDR metadata models and MIAPPE and I don’t think either is quite sufficient to represent metadata for agricultural LTE datasets. I like the minimum information (MI) checklist approach and recommended ontologies taken by MIAPPE, and I think existing ontologies can be used for much of the checklist detail. What I think is needed is an MI checklist relevant to agricultural LTEs.
    Thanks
    Richard
    From: Clement Jonquet [mailto:***@***.***] On Behalf Of Clement Jonquet
    Sent: 20 February 2018 06:13
    To: Richard Ostler <***@***.***>
    Cc: Agricultural Data Interest Group (IGAD) <***@***.***-groups.org>
    Subject: Re: [rda-agrdatainterop-ig] Advice on new controlled vocabulary
    Dear Richard,
    All
    I will not get into the SKOS or OWL debate. Depending on what you want to model, SKOS is a good starting point, though ultimately you might need to move to OWL for more expressivity.
    Could you tell us more about this statement:
    "but experiment names, fields and plots are not represented by any controlled vocabulary."
    Can you list exactly the terms/concepts that you are interested in modeling? Is the specificity of these experiments only the fact that they are long-term?
    I ask because I have the feeling you can certainly attach this to other existing vocabularies out there.
    You might consider using the AgroPortal Recommender, which could help identify which ontologies have good coverage over a certain list of terms.
    In case you decide to create your own new URIs (because you really have to say something specific about them), we can help you map your new vocabulary to existing ones in AgroPortal (Elcio recently joined the team to support users in this process) and, of course, host it in AgroPortal.
    Will be in Berlin too, to discuss.
    Clement
    -------------------------------------------------------------------------------------------
    Dr. Clement JONQUET - PhD in Informatics - Assistant Professor
    University of Montpellier
    http://www.lirmm.fr/~jonquet
    ------------------------------------------------------------------------------------------

  • Elizabeth Arnaud's picture

    Author: Elizabeth Arnaud

    Date: 21 Feb, 2018

    Just a quick naïve remark:
    Why not submit to the reference Unit Ontology (UO) all the missing concepts that are in the WUR Ontology of Unit of Measurement (UM)? UM terms would then get a UO ID. Additionally, the Unit Ontology is available on OBO Foundry (http://www.obofoundry.org/ontology/uo.html), which is the reference source of ontologies used by OLS through the API, so the integrated concepts would automatically appear in OLS.
    Elizabeth
    From: <***@***.***-groups.org> on behalf of RichardOstler <***@***.***>
    Date: Wednesday 21 February 2018 12:33
    To: "***@***.***" <***@***.***>, "simon.cox" <***@***.***>, "Agricultural Data Interest Group (IGAD)" <***@***.***-groups.org>
    Cc: "***@***.***" <***@***.***>, "***@***.***" <***@***.***>
    Subject: Re: [rda-agrdatainterop-ig] Advice on new controlled vocabulary
    Thanks for the tips.
    So far I’ve mostly been using the Ontology Lookup Service (OLS) and AgroPortal, and have been selecting measurement concepts from the Unit Ontology (https://github.com/bio-ontology-research-group/unit-ontology) and the Crop Ontology. However, I have found a few gaps with older imperial measures (e.g. hundredweight), which the Wageningen unit ontology does include, so it’s a shame it isn’t listed/searchable on OLS and AgroPortal. I’ve also checked BioPortal and FAIRsharing and it isn’t listed on those either.
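As an illustration of why an unambiguous unit concept matters for older imperial measures, the following Python sketch converts a yield recorded in hundredweight per acre to tonnes per hectare. The function name and unit keys are hypothetical. Whether a given historic dataset uses the imperial (112 lb) or the US (100 lb) hundredweight is an assumption that would have to be checked against the source records.

```python
# Exact definitions: 1 lb = 0.45359237 kg and 1 acre = 4046.8564224 m^2.
KG_PER_LB = 0.45359237
HA_PER_ACRE = 0.40468564224

# Hypothetical unit keys; a real system would use unit concept URIs instead.
KG_PER_UNIT = {
    "hundredweight_long": 112 * KG_PER_LB,   # imperial (UK) hundredweight
    "hundredweight_short": 100 * KG_PER_LB,  # US hundredweight
}


def per_acre_to_tonnes_per_hectare(value: float, mass_unit: str) -> float:
    """Convert a yield in <mass_unit> per acre to tonnes per hectare."""
    kg_per_acre = value * KG_PER_UNIT[mass_unit]
    kg_per_hectare = kg_per_acre / HA_PER_ACRE
    return kg_per_hectare / 1000.0


# The same recorded number gives different results depending on which
# hundredweight is meant, which is why the unit needs its own identifier.
long_cwt = per_acre_to_tonnes_per_hectare(20.0, "hundredweight_long")
short_cwt = per_acre_to_tonnes_per_hectare(20.0, "hundredweight_short")
```

The two results differ by 12%, so a bare "cwt" label without a unit identifier leaves the value ambiguous.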
    Thanks
    Richard
    From: ***@***.*** [mailto:***@***.***]
    Sent: 21 February 2018 08:20
    To: simon.cox <***@***.***>
    Cc: valeria pesce <***@***.***>; Richard Ostler <***@***.***>; ***@***.***; ***@***.***-groups.org
    Subject: Re: [rda-agrdatainterop-ig] Advice on new controlled vocabulary
    You may also be interested in ontologies for units of measurement, like http://www.wurvoc.org/vocabularies/om-1.6/ .
    Sent via Webmail interface
    ________________________________
    From: "simon.cox" <***@***.***>
    To: "valeria pesce" <***@***.***>, "richard ostler" <***@***.***>, ***@***.***, ***@***.***-groups.org
    Sent: Wednesday, 21 February, 2018 4:07:17 AM
    Subject: Re: [rda-agrdatainterop-ig] Advice on new controlled vocabulary
    Also consider the W3C SSN (Semantic Sensor Network) ontology for sampling and observations.
    w3.org/TR/vocab-ssn
    From: valeria.pesce=***@***.***-groups.org <***@***.***-groups.org> on behalf of valeriapesce <***@***.***>
    Sent: Tuesday, 20 February 2018 11:50:40 AM
    To: RichardOstler; Clement Jonquet; Agricultural Data Interest Group (IGAD)
    Subject: Re: [rda-agrdatainterop-ig] Advice on new controlled vocabulary
    Dear Richard,
    Interesting task.
    I would second Armando’s recommendation that for your use case you might need an ontology / model / schema for agricultural experiments rather than a thesaurus (you can have a thesaurus for the controlled values of specific properties).
    It’s also true that I’ve seen cases in which people use SKOS to create a list of properties that they need to describe something, but I don’t think it’s the best use of SKOS.
    And I would also highlight what Armando says about distinguishing between vocabularies and datasets: you will probably need vocabularies for the experiment description structure (a schema or an ontology) and for controlled values, but the data that you expect to be contributed would be the actual instances of experiment data, which would ideally go into a dataset, with records structured according to the vocabularies you designed.
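The distinction above between a schema, controlled values and the contributed dataset can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record fields, the list of experiment types, the identifiers and the years are all hypothetical and are not taken from ICASA, MIAPPE or any other existing standard.

```python
from dataclasses import dataclass
from typing import List

# Controlled values (thesaurus-like): the allowed terms for one property.
# Hypothetical list for illustration.
EXPERIMENT_TYPES = {"rotation", "continuous cropping"}


@dataclass(frozen=True)
class ExperimentRecord:
    """Schema: the structure every contributed experiment record follows."""
    experiment_id: str     # hypothetical persistent identifier
    name: str
    experiment_type: str   # must be one of EXPERIMENT_TYPES
    start_year: int


def validate(record: ExperimentRecord) -> List[str]:
    """Return a list of problems; an empty list means the record conforms."""
    problems: List[str] = []
    if not record.experiment_id:
        problems.append("missing experiment_id")
    if record.experiment_type not in EXPERIMENT_TYPES:
        problems.append(f"unknown experiment_type: {record.experiment_type!r}")
    return problems


# Dataset: instances contributed by institutes, kept separate from the
# schema and the controlled values they are checked against.
dataset = [
    ExperimentRecord("lte:exp-x", "Experiment X", "rotation", 1900),
    ExperimentRecord("lte:exp-y", "Experiment Y", "rotational", 1950),
]

report = {r.experiment_id: validate(r) for r in dataset}
```

Here `report` flags only the second record, whose `experiment_type` is not in the controlled list. The curation question raised in the original post largely comes down to checks of this kind on contributed records.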
    The only suggestion I would add is to look at what has already been done in a few projects (which you may already know):
    - The ICASA data standards for field experiments: https://dssat.net/data/standards_v2. I think the initial standard was a specification with possible serializations in CSV and XML. It provides a model and a list of variables. From what I understand there were plans to render ICASA in RDF in the TERRA-REF project (http://terraref.org/about/) and the ICASA data dictionary is also being mapped to various ontologies as part of the Agronomy Ontology project (http://www.obofoundry.org/ontology/agro.html).
    - Projects on crop modelling and sharing experiment data (no RDF yet, but useful models to base your RDF upon): AgMIP (in which they’re reusing the ICASA variables: http://research.agmip.org/display/dev/ICASA+Master+Variable+List), APSIM: http://www.apsim.info/, CGIAR AgTrials (http://www.agtrials.org/).
    - Useful RDF vocabularies to build upon or reuse: the Crop Research Ontology: http://agroportal.lirmm.fr/ontologies/CO_715; AgroRDF: http://data.igreen-services.com/agrordf.
    Hope this helps a little.
    Best regards,
    Valeria
