Survey on hierarchy scenarios for GACS - help requested by Nov 20

09 Nov 2016

Dear all,
I'm writing to request your attention to a survey [1,2] that will help us
decide among alternative ways to structure the hierarchy of the new Global
Agricultural Concept Scheme (GACS) [3,4]. If you can do this, we'd appreciate
your responses by Sunday, November 20, in order to prepare for a face-to-face
meeting on November 22.
GACS was created by mapping frequently used concepts from three large thesauri
about agriculture to a smaller set of core concepts.
The source concepts were merged into GACS concepts along with their
relationships to broader and narrower concepts, with the result that many
concepts in GACS are embedded in a "polyhierarchy" -- that is, they are
situated in more than one chain of hierarchically related concepts.
For example, "plant fats and oils" [5] is situated at the end of three separate
hierarchical chains:
  products > products and commodities > agricultural products > plant products > plant fats and oils
  substances > materials > oils > plant fats and oils
  products > products and commodities > oils > plant fats and oils
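
As a minimal sketch of what "polyhierarchy" means here, the broader links implied by the three chains listed above can be written as a small mapping and walked from the leaf up to every top concept. The mapping and the function name `chains_to_top` are illustrative assumptions, not part of GACS or its tooling, and the walk assumes the links contain no cycles.

```python
from typing import Dict, List

# Broader links transcribed from the three example chains (illustrative only).
BROADER: Dict[str, List[str]] = {
    "plant fats and oils": ["plant products", "oils"],
    "plant products": ["agricultural products"],
    "agricultural products": ["products and commodities"],
    "oils": ["materials", "products and commodities"],
    "materials": ["substances"],
    "products and commodities": ["products"],
}


def chains_to_top(concept: str, broader: Dict[str, List[str]]) -> List[List[str]]:
    """Return every chain from a top concept down to `concept`.

    Assumes the broader links are acyclic.
    """
    parents = broader.get(concept, [])
    if not parents:
        # No broader concept: this is a top concept, so the chain starts here.
        return [[concept]]
    return [
        chain + [concept]
        for parent in parents
        for chain in chains_to_top(parent, broader)
    ]


chains = chains_to_top("plant fats and oils", BROADER)
for chain in chains:
    print(" > ".join(chain))
```

Run against this mapping, the walk yields the same three chains as in the message, which is one way to see that a single concept with two broader links can sit at the end of more than two chains.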
In order to fix this situation, we need to establish principles for the GACS
hierarchy. Osma Suominen has outlined three alternative scenarios for the GACS
hierarchy in a blog post [1], which links to the survey [2].
Please feel free to forward my note to any colleagues who may be interested.
Many thanks,
Tom
[1] http://aims.fao.org/activity/blog/gacs-structural-survey-and-hierarchy-s...
[2] https://docs.google.com/forms/d/e/1FAIpQLSelyXbrva4xdQtq2d6-9vaRgU68cQ-y...
[3] http://browser.agrisemantics.org/gacs/en/
[4] http://agrisemantics.org/gacs/
[5] http://browser.agrisemantics.org/gacs/en/page/C7011
--
Tom Baker <***@***.***>

    Author: Simon Cox

    Date: 09 Nov, 2016

    Hi Tom -
    I've already completed the survey, and have been in a brief correspondence with Osma (more to come).
    But want to just draw attention in the broader group to the assumption here:
    > In order to fix this situation
    I'm not convinced there is a situation that needs fixing. There are multiple hierarchies reflecting different approaches or points of view, each of which must make sense to some part of the community. There can be multiple routes walking a graph to successfully find the leaf you are seeking. The three examples below all make sense to me. If you remove any of these, then some points of view are implicitly suppressed.
    Now I understand that there would be risks in overwhelming the user if multiple organizing principles are presented at the top of the hierarchy, but even at this level I don't see why there should be only one (I think this is the focus of the survey). And then further down in the tree, I see no problem at all with poly-hierarchies.
    I find it easier taking the set view (which I think is equivalent to facetted search), rather than the tree/hierarchy view.
    Thus
    :broader ,
    .
    :broader ,
    .
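
One possible reading of the set view, as a minimal sketch: each concept stands for the set of concepts reachable beneath it, and a faceted search is an intersection of such sets. The broader links are transcribed from the three chains quoted in this thread; the helper names `ancestors` and `members` are illustrative assumptions and do not come from GACS or any SKOS tool.

```python
from typing import Dict, List, Set

# Broader links transcribed from the example chains in this thread (illustrative only).
BROADER: Dict[str, List[str]] = {
    "plant fats and oils": ["plant products", "oils"],
    "plant products": ["agricultural products"],
    "agricultural products": ["products and commodities"],
    "oils": ["materials", "products and commodities"],
    "materials": ["substances"],
    "products and commodities": ["products"],
}


def ancestors(concept: str, broader: Dict[str, List[str]]) -> Set[str]:
    """All concepts reachable from `concept` by following broader links."""
    seen: Set[str] = set()
    stack = list(broader.get(concept, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(broader.get(current, ()))
    return seen


def members(facet: str, broader: Dict[str, List[str]]) -> Set[str]:
    """The set view of `facet`: every concept that has it as an ancestor."""
    concepts = set(broader) | {p for parents in broader.values() for p in parents}
    return {c for c in concepts if facet in ancestors(c, broader)}


# Faceted search as set intersection: concepts filed under both facets.
both = members("products and commodities", BROADER) & members("substances", BROADER)
print(sorted(both))
```

In this reading, a concept with several broader links simply belongs to several sets, and no single tree has to be chosen.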
    I'm sure there are some improvements that could be made to the hierarchies, but not at all convinced they should be coerced into just one.
    Perhaps what is needed is work on the UI, rather than focussing on the data?
    Simon

    Author: Thomas Baker

    Date: 10 Nov, 2016

    On Wed, Nov 09, 2016 at 11:08:05PM +0000, Simon Cox wrote:
    > I've already completed the survey, and have been in a brief
    > correspondence with Osma (more to come). But want to just draw
    > attention in the broader group to the assumption here:
    >
    > > In order to fix this situation
    >
    > I'm not convinced there is a situation that needs fixing. There are
    > multiple hierarchies reflecting different approaches or points of
    > view, each of which must make sense to some part of the community.
    > There can be multiple routes walking a graph to successfully find the
    > leaf you are seeking. The three examples below all make sense to me.
    > If you remove any of these, then some points of view are implicitly
    > suppressed.
    Hi Simon,
    An excellent point!
    I agree with you, in principle, from a user point of view -- multiple
    routes to a concept can be useful (caveat: as long as they don't
    actually contradict each other in confusing ways).
    > Now I understand that there would be risks in overwhelming the user if
    > multiple organizing principles are presented at the top of the
    > hierarchy, but even at this level I don't see why there should be only
    > one (I think this is the focus of the survey). And then further down
    > in the tree, I see no problem at all with poly-hierarchies.
    I do not think we are assuming that all polyhierarchy is bad. Clear
    feedback from the survey [2] on this issue would be very helpful!
    To me, the issues are consistency and maintainability.
    In the example given, "plant fats and oils" [1] is situated at the end
    of three separate hierarchical chains inherited from the source
    thesauri.
    products > products and commodities > agricultural products > plant products > plant fats and oils
    substances > materials > oils > plant fats and oils
    products > products and commodities > oils > plant fats and oils
    If we value consistency of application -- i.e., that users should
    reasonably expect to find a complete set of relevant concepts at the end
    of the chain -- then in the short term, to ensure quality, we might want
    to check the existing chains for completeness. This would be quite a
    big task.
    In the medium term, we would need to ensure, somehow, that when a new
    concept were created, it would be linked to all applicable broader
    concepts. I'm trying to picture the detailed guidelines that would need
    to be written. In my opinion, it is a Good Thing to keep the core
    maintenance principles as simple as possible, both to limit the
    cognitive burden on maintainers and to reduce the risk of errors.
    In this case, if someone were creating "plant fats and oils" for the
    first time, they would need to know the hierarchies supported by GACS
    well enough to link both to "oils" and to "plant products". Were this
    not done, then the set of concepts under "plant products" would be
    incomplete and, were users to notice, GACS would appear inconsistent.
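
The kind of check this would require can be sketched as follows, under the assumption that maintainers could write down, per concept, the broader concepts it is expected to have. The `EXPECTED` and `ACTUAL` data and the function name `missing_links` are hypothetical and only illustrate the "plant fats and oils" case described above.

```python
from typing import Dict, Iterable, List


def missing_links(
    expected: Dict[str, Iterable[str]],
    actual: Dict[str, Iterable[str]],
) -> Dict[str, List[str]]:
    """Map each concept to the expected broader concepts it is not yet linked to."""
    gaps: Dict[str, List[str]] = {}
    for concept, parents in expected.items():
        missing = set(parents) - set(actual.get(concept, ()))
        if missing:
            gaps[concept] = sorted(missing)
    return gaps


# Hypothetical editorial expectation: the concept belongs under both hierarchies.
EXPECTED = {"plant fats and oils": {"oils", "plant products"}}
# Hypothetical state after a maintainer linked the new concept only to "oils".
ACTUAL = {"plant fats and oils": ["oils"]}

gaps = missing_links(EXPECTED, ACTUAL)
print(gaps)
```

The check itself is trivial; the cost lies in writing and maintaining the expectations, which is the guideline burden discussed here.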
    There is an additional wrinkle to this particular example inasmuch as
    GACS, as currently defined, already has a set of five concept types, one
    of which happens to be Product (the full set is: Chemical, Geographical,
    Organism, Product, Topic). In this specific case, having a top concept
    for Product would seem confusingly redundant.
    Also: if polyhierarchy can be good, it does not follow that hierarchies
    must always lead to top concepts (see Scenario C in the survey). Let's
    say someone in the GACS community had a specific need for a
    well-maintained list of plant products. The concept "plant fats and
    oils" is already of type Product, so there might be no need to link
    plant products to higher-level concepts of product. The hierarchy, in
    this case, could be flat: just BT relations from specific plant products
    all pointing to the concept "plant products".
    In principle, a member of the GACS community could serve as the
    maintainer of a plant product subvocabulary within GACS. Such
    subvocabularies could either be hard-wired into the core GACS
    definitions or defined as an external link set overlaid onto GACS. This is
    a question both of policy and of practicality. For example, we might
    want to make it easy for people in the community to edit link sets in a
    wiki-like manner.
    For issues like this, I personally think we should strive to keep GACS
    Core -- the part of GACS under a long-term maintenance commitment -- as
    simple as possible so as to ensure maintainability with low resources.
    Each additional construct taken into the core incurs a long-term
    maintenance commitment. In the case of a community-maintained list of
    plant products, the question would be whether maintenance of that list
    would end up on the plate of the core editorial board when its
    maintainers move on, as they inevitably will. By setting clear
    policies on such things up-front, we can hopefully avoid problems like
    this down the line.
    Tom
    [1] http://browser.agrisemantics.org/gacs/en/page/C7011
    [2] http://aims.fao.org/activity/blog/gacs-structural-survey-and-hierarchy-s...
    --
    Tom Baker <***@***.***>

    Author: catherine roussey

    Date: 10 Nov, 2016

    The main problem for me is not the polyhierarchy but the fact that the
    term "and" actually means a logical OR.
    It is not easy to understand the logic of the hierarchy when "and" is
    written instead of "or".
    The position of some elements in the hierarchy should be changed:
    + products or commodities
    ++ products
    +++ agricultural products
    ++++ plant fat or oil
    +++++ "agricultural" oil
    + substances
    ++ materials
    ++++ oil
    +++++ "agricultural" oil
    The two hierarchies express different points of view on what an
    "agricultural oil" is: the first is the usage of the product, the
    second is its composition.
    Best regards
    Catherine Roussey
