Can we move towards a working definition of metadata

30 Apr 2014

As part of the DFT group, I'm wondering if the MIG has any definition(s) of metadata to propose that would be useful for our efforts to define things and for your interests.

 

As context, we have had some working definitions, a bit of which was discussed at P3.

We start very generally with "Metadata is a type of data object that contains attributes describing properties of an associated data or digital object. The association between a data object and metadata is that the content of the metadata describes the data object."
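A minimal Python sketch of this starting definition may help make it concrete. The class and field names (`DigitalObject`, `MetadataRecord`, `object_id`) and the identifier value are assumptions made for illustration, not part of the DFT wording.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    object_id: str    # hypothetical identifier for the object
    content: bytes


@dataclass
class MetadataRecord:
    # Identifier of the object this record is associated with.
    object_id: str
    # Attributes describing properties of the associated object.
    attributes: dict = field(default_factory=dict)
    # Roles the record serves (discovery, provenance, ...).
    roles: set = field(default_factory=set)


def describes(md: MetadataRecord, obj: DigitalObject) -> bool:
    """True when the record is associated with the given object."""
    return md.object_id == obj.object_id


obj = DigitalObject(object_id="example:object-1", content=b"raw bytes")
md = MetadataRecord(
    object_id=obj.object_id,
    attributes={"title": "Example dataset", "created": "2014-04-30"},
    roles={"discovery", "provenance"},
)
```

Note that the metadata record is itself just another data object; only the association and its descriptive content make it metadata.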

Since metadata plays different roles for different functions we added some additional ideas such as a role in PIDs.

"It may contain as key the persistent identifier of that associated object." But things get a little complicated as we note that metadata (MD) may serve different purposes, such as helping people to find data of relevance (discovery; Michener 2006) or bringing data together (a federation role).

And we can note many other roles:

(aiding in) Discovery, Access, providing context, Selection, Licensing, authorization, Quality, suitability and Provenance (such as describing how data were gathered, reproducibility or summarizing terms for reuse).

 

Because of the different roles we may distinguish many different types of MD such as:

Administrative Metadata "Provides information about how to manage a resource, such as rights data, intellectual property info, date of creation and editing – can be very important for legal reasons." or

 

Structural Metadata

The structural organization of a data object, such as chapters in a book, sentences in a chapter, etc., that allows us to figure out how an object should be put together. Also refers to the underlying structural metadata of digital objects that tells computers how to assemble them.

To this one may add a list of MD attributes such as:

 Quality &  Provenance (such as describing how data were gathered, reproducibility or summarizing terms for reuse.)

When all this is taken into account, and it is just part of the story, we end up with less of a definition and more of an encyclopedia entry. But perhaps this is still useful.

  • Quentin Reul's picture

    Author: Quentin Reul

    Date: 30 Apr, 2014

    Hi Gary,
    I tend to make a distinction between "*descriptive*" and "*functional*"
    metadata. The first type of metadata describes the digital object (e.g.
    topic, creation date, etc.), while the second type encodes information that
    drives functionality on a platform (e.g. sort date, etc.). I realize that
    the distinction can be quite artificial.
    With regard to "*structural*" metadata, are you referring to the dynamic
    combination of digital objects? Or is it about how digital objects are
    managed? Although I don't disagree that we want to represent the relations
    between digital objects, I'm not sure that relations between them are really
    metadata.
    Kind regards,
    Quentin

  • Gary Berg-Cross's picture

    Author: Gary Berg-Cross

    Date: 30 Apr, 2014

    Quentin,
    Hello, seems like we were just together at the Ontology Summit.....
    >"*descriptive*" and "*functional*" metadata... I realize that the
    distinction can be quite artificial. "
    To me metadata is a role that some information, represented as data, plays.
    So we can create as many roles for it as we think useful. That's why we
    need to take a community view of what people find useful.
    Types like "structural" make a good deal of sense from our experience and
    we might be saying that this data object that we are describing is made up
    of the following things or that it is part of a data collection.
    I can imagine that such structural info might be useful for some functions
    too, although I'm not clear if you are just talking about data management
    functions or more broadly into analytic things.
    I can imagine situations in which the 2 categories may not be orthogonal. (But
    we might have some agreement on types of metadata and what attributes, like
    a Dublin Core, should be populated.)
    As an example of non-orthogonality, if we know certain structural relations
    it might help search or query functions.
    One other thought. There is RDA WG activity on defining Data Types, so
    this would be corresponding work to define MD types.
    >Although I don't disagree that we want to represent the relation between
    digital object, I'm not sure that relations between them is really metadata.
    Yes, but then how should we classify or think about this info that
    describes relations between digital objects? If the DOs make up a
    collection then it might be thought of as metadata about the collection or
    some of its structural parts.
    Gary Berg-Cross, Ph.D.
    ***@***.***
    http://ontolog.cim3.net/cgi-bin/wiki.pl?GaryBergCross
    NSF INTEROP Project
    http://www.nsf.gov/awardsearch/showAward.do?AwardNumber=0955816
    SOCoP Executive Secretary
    Independent Consultant
    Potomac, MD
    240-426-0770

  • Chris Taylor 's picture

    Author: Chris Taylor

    Date: 30 Apr, 2014

    Hi,
    I think a little taxonomy of metadata would be extremely useful. As you
    say, the broad categories of administrative and structural (I'd maybe
    separate 'descriptive' or something as a third broad category distinct from
    structural) can be broken down further, so this is certainly non-trivial,
    but we need this I think. Ideally as an XSD somewhere.
    Could we hack something together somewhere? Just start with your list...
    Chris.

  • Chris Taylor 's picture

    Author: Chris Taylor

    Date: 30 Apr, 2014

    Hi,
    On metadata describing interrelationships: the ISA structure relies on that
    sort of information; basically, an 'investigation' describes the
    relationship between studies and assays in a body of work (
    http://www.isa-tools.org/). #justsaying :)
    Chris.

  • Keith Jeffery's picture

    Author: Keith Jeffery

    Date: 01 May, 2014

    All –
    Really good to see this discussion. Can I try to inject some more structuring into it?
    1. First can we throw out the commonly used definition that metadata is ‘data about data’ ?
    2. Can we agree that there is no difference between data and metadata – except the purpose for which it is used? Example, a library catalog card is metadata for the researcher finding the book on the shelf but data for the librarian counting books on ‘biochemistry’.
    3. Can we agree metadata is multidimensional? Most published classifications rely on intrinsic properties or functional usage. Some (DC, DCAT) just relate to the dataset, some (e.g. ISA) provide some context.
    4. Just to get the discussion rolling, how about this:
    a. Dimension 1: ‘level of detail of metadata’ : descriptive | contextual | detailed/specific. Example: descriptive: DC; contextual (project, person, organisation, funding, facility, equipment, publications….) CERIF, ISA; detailed/specific: schema level connecting dataset to software;
    b. Dimension 2: Purpose: re-use | interoperation. Example: re-use: using dataset for a repeat or different purpose; interoperation: using the dataset along with others for some purpose so that the user has a homogeneous view over heterogeneous (distributed) datasets;
    c. Dimension 3: Intrinsic Properties: Description | Location | Contextualisation | Preservation | Provenance | Schema; Examples: Description: DC, CKAN; Location: URL/URI; Contextualisation: (used to assess relevance/quality for purpose) CERIF, ISA; Preservation: OAIS architecture but it needs populating; Provenance: versioning of datasets and relationships expressed semantically including relationships to software, persons, organisations etc.
    As you can see in the above there is some overlap between the dimensions in terms of what is recorded or used but the different dimensions do have different aspects or modes of usage.
    5. Can we agree that metadata needs formal syntax (structure) and declared semantics (terms, meanings, relationships) ?
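As a rough sketch, the three dimensions proposed in point 4 could be written down as enumerations. The placement of DC, CERIF and ISA follows the examples given above; the Python names (`Detail`, `Purpose`, `Intrinsic`, `Classification`, `standards_with`) are assumptions introduced only for this illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Detail(Enum):          # Dimension 1: level of detail
    DESCRIPTIVE = "descriptive"
    CONTEXTUAL = "contextual"
    DETAILED = "detailed/specific"


class Purpose(Enum):         # Dimension 2: purpose
    REUSE = "re-use"
    INTEROPERATION = "interoperation"


class Intrinsic(Enum):       # Dimension 3: intrinsic properties
    DESCRIPTION = "description"
    LOCATION = "location"
    CONTEXTUALISATION = "contextualisation"
    PRESERVATION = "preservation"
    PROVENANCE = "provenance"
    SCHEMA = "schema"


@dataclass(frozen=True)
class Classification:
    detail: Detail
    intrinsic: frozenset     # a standard may cover several properties


# Purpose describes a particular use of a dataset rather than a standard,
# so it is not assigned to the standards below.
STANDARDS = {
    "DC": Classification(Detail.DESCRIPTIVE,
                         frozenset({Intrinsic.DESCRIPTION})),
    "CERIF": Classification(Detail.CONTEXTUAL,
                            frozenset({Intrinsic.CONTEXTUALISATION})),
    "ISA": Classification(Detail.CONTEXTUAL,
                          frozenset({Intrinsic.CONTEXTUALISATION})),
}


def standards_with(prop):
    """Names of the standards that cover a given intrinsic property."""
    return sorted(n for n, c in STANDARDS.items() if prop in c.intrinsic)
```

Keeping the dimensions as separate enumerations reflects the point that they overlap in what is recorded but differ in mode of usage.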
    And just to add to the fun here is a list of characteristics (maybe entities/objects or attributes) that I think need to be available concerning a dataset:
    • Unique Identifier (for later use including citation)
    • Location (URL)
    • Description
    • Keywords (terms)
    • Temporal coordinates
    • Geospatial coordinates
    • Originator (organisation(s) / person(s))
    • Project
    • Facility / equipment
    • Quality
    • Availability (licence, persistence)
    • Provenance
    • Citations
    • Related publications (white or grey)
    • Related software
    • Schema
    • Medium / format
    Some of these may be simple values, others (e.g. Quality) have a whole substructure. If we can agree such a list it forms a basis for characterising / classifying metadata standards and leads towards recommended usage for various purposes.
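One way to picture such a list is as a record type with one field per characteristic. This is only a sketch: the field names are assumptions, and the `Quality` substructure is hypothetical, since the post says only that Quality has a substructure, not what it contains.

```python
from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class Quality:
    # Hypothetical substructure, for illustration only.
    assessment: Optional[str] = None
    method: Optional[str] = None


@dataclass
class DatasetDescription:
    identifier: Optional[str] = None
    location: Optional[str] = None
    description: Optional[str] = None
    keywords: list = field(default_factory=list)
    temporal_coordinates: Optional[str] = None
    geospatial_coordinates: Optional[str] = None
    originators: list = field(default_factory=list)
    project: Optional[str] = None
    facility_equipment: Optional[str] = None
    quality: Optional[Quality] = None
    availability: Optional[str] = None
    provenance: Optional[str] = None
    citations: list = field(default_factory=list)
    related_publications: list = field(default_factory=list)
    related_software: list = field(default_factory=list)
    schema: Optional[str] = None
    medium_format: Optional[str] = None


def populated(record):
    """Names of the characteristics that carry a value."""
    names = []
    for f in fields(record):
        value = getattr(record, f.name)
        if value is not None and value != []:
            names.append(f.name)
    return names


record = DatasetDescription(
    identifier="example-id-1",
    location="http://example.org/data/1",
    keywords=["seismology"],
    quality=Quality(assessment="unchecked"),
)
print(populated(record))
```

A function like `populated` gives a crude basis for comparing how much of the list a given metadata standard or record covers.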
    I hope this stimulates the discussion!
    Best
    Keith
    Keith G Jeffery Consultants
    Prof Keith G Jeffery
    E: ***@***.***
    T: +44 7768 446088
    S: keithgjeffery
    Past President ERCIM www.ercim.eu (***@***.***)
    Past President euroCRIS www.eurocris.org
    Past Vice President VLDB www.vldb.org
    Fellow (CITP, CEng) BCS www.bcs.org
    Co-chair RDA MIG https://rd-alliance.org/internal-groups/metadata-ig.html
    Co-chair RDA MSDWG https://rd-alliance.org/working-groups/metadata-standards-directory-work...
    Co-chair RDA DICIG https://rd-alliance.org/internal-groups/data-context-ig.html
    ----------------------------------------------------------------------------------------------------------------------------------
    The contents of this email are sent in confidence for the use of the
    intended recipient only. If you are not one of the intended
    recipients do not take action on it or show it to anyone else, but
    return this email to the sender and delete your copy of it.
    ----------------------------------------------------------------------------------------------------------------------------------

  • Nikos Houssos's picture

    Author: Nikos Houssos

    Date: 01 May, 2014

    Dear all,
    Really interesting discussion. My 2 cents:
    1. Indeed the distinction between metadata and data (or more accurately,
    identifying which data is metadata and which is not) is fuzzy and the
    "data about data" definition does not seem satisfactory in general.
    Broader definitions might be used, for example "data about data and the
    processes and environment involved in the generation of data" but I
    doubt whether this is helpful (probably it is even more ambiguous!).
    2. Metadata is normally used and designed (upfront) for exchange and
    interoperability, i.e. information foreseen to be consumed only by a
    particular system/application can hardly be treated as metadata.
    3. An important feature of metadata, compared with "plain" data is open
    structure, genericity and extensibility. This does not mean that formal
    syntax and semantics are not necessary (on the contrary - both are
    indispensable and some defined structure is absolutely required for any
    practical use and interoperability). However, the metadata
    representation method and technology should allow the continuous
    definition of new data elements, albeit conforming to the original
    structure and with clear semantics. For instance, the metadata structure
    used in a system or a community should not require that all data element
    types and related semantics (e.g. vocabularies) be rigidly defined a
    priori, with the consequence that any addition of data elements would
    require a modification of a community data model/standard. In other words,
    metadata is data that inherently supports evolution of data and related
    semantics (albeit in a structured and standard-conformant way) and,
    consequently, involves less "hard-coding" at the data model level (which
    BTW leads also to less "hard-coding" in technical implementations).
    4. Relationships/associations are key in metadata and are certainly
    first-class citizens (while this might not hold for all types of data).
    In fact, this (representation of most data elements as associations -
    with clear semantics - between entity instances) is an important enabler
    to achieve data structure evolution (as in point 3 above).
    5. Another aspect of metadata is the ability to manage different
    versions of the same data element value (e.g. in different languages -
    multi-linguality or different encodings) and/or different versions of
    the described object/entity.
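Points 3 to 5 can be sketched together: a fixed statement structure, element types declared at run time, associations between entities, and language-tagged values. The names (`Statement`, `MetadataStore`) and all identifiers below are assumptions for illustration and do not refer to any particular standard or system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Statement:
    subject: str                 # identifier of the described entity
    predicate: str               # element type, declared in the vocabulary
    value: str                   # literal, or identifier of another entity
    lang: Optional[str] = None   # language tag for multilingual values


class MetadataStore:
    def __init__(self):
        self.vocabulary = {}     # predicate -> declared meaning
        self.statements = []

    def declare(self, predicate, meaning):
        """Add a new element type without changing the data model."""
        self.vocabulary[predicate] = meaning

    def add(self, statement):
        # The structure is fixed and semantics must be declared first.
        if statement.predicate not in self.vocabulary:
            raise ValueError(f"undeclared element type: {statement.predicate}")
        self.statements.append(statement)

    def values(self, subject, predicate, lang=None):
        return [s.value for s in self.statements
                if s.subject == subject and s.predicate == predicate
                and (lang is None or s.lang == lang)]


store = MetadataStore()
store.declare("title", "name given to the resource")
store.add(Statement("dataset:1", "title", "Seismic survey", lang="en"))
store.add(Statement("dataset:1", "title", "Seismische Untersuchung", lang="de"))

# A later addition: a new association type, declared without a model change.
store.declare("producedBy", "association between a dataset and a project")
store.add(Statement("dataset:1", "producedBy", "project:7"))
```

Because every element is a statement of the same shape, adding `producedBy` required a vocabulary entry but no change to `Statement` or `MetadataStore`.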
    Looking forward to further discussion on this issue!
    Best regards,
    Nikos
    --------------------------------------------
    Nikos Houssos, Ph.D.
    Head, Software Development Unit
    National Documentation Centre / N.H.R.F.
    48, Vas. Constantinou Av.
    116 35 Athens, Greece
    phone: +30 210 7273949 fax: +30 210 7252223
    email: ***@***.***
    http://www.ekt.gr
    --------------------------------------------

  • Keith Jeffery's picture

    Author: Keith Jeffery

    Date: 01 May, 2014

    Nikos -
    Thanks for your comments; unsurprisingly I agree broadly and especially with your points 3,4,5 which cover things I failed to mention.
    Unusually, I’d like to disagree with your point 2; even if a dataset is never shared or interoperated, I think that, for example, provenance and preservation metadata for that one dataset is valid metadata. I would also argue that a database schema is valid metadata.
    I am sure your comments will stimulate further discussion, thanks again.
    Best
    Keith

  • Chris Taylor 's picture

    Author: Chris Taylor

    Date: 01 May, 2014

    Hi,
    First a serious question: what are instrument settings (or worse,
    'set-ups')? They are controlled parameters, not measurements, so are they
    the last level of data, or the first level of metadata? An analogy might be
    interviewer questions.
    Second, less seriously (perhaps); I like 'data about data'. It allows both
    that one person's metadata is another person's data and that the whole
    thing is recursive (and that essentially everything is, er, are? data).
    Simple but hard to sink anyway :)
    Chris.

  • Andrea Perego's picture

    Author: Andrea Perego

    Date: 06 May, 2014

    My two cents...
    1. IMHO, it would be important to clearly state why we need to
    classify metadata, and what the purpose of such a classification is.
    Personally, I find such an exercise very difficult, since the same
    metadata element can be classified in different groups depending on
    the context. Just to give an example, INSPIRE [1] metadata elements
    are grouped into three main classes: discovery, evaluation and use.
    Whether a metadata element is in one class or another does not depend
    on its intrinsic characteristics, but simply on the role it
    plays in the context of INSPIRE.
    2. +1 from me to Keith's points (1) (metadata = "data about data") and
    (2) ("there is no difference between data and metadata"). It's again
    the context determining whether given data can be considered as
    metadata. I also think that (1) highlights two important points. The
    first is recursion (as noted by Chris) - BTW, in INSPIRE we do have
    metadata about metadata (more precisely, who created the metadata,
    when and in which language). The second is that, after all, metadata
    are all descriptive - the difference is what they are describing, and
    for which purpose. On this, it may be worth noting that the de facto
    standard way on the Web to link a resource to its metadata is by using
    the "describedby" relation - see:
    - http://www.iana.org/assignments/link-relations/link-relations.xhtml
    - http://www.w3.org/TR/powder-dr/#assoc-linking
    (I must also say I have a conflict of interest here, since I
    contributed to the definition of such relation)
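The recursion mentioned in point 2 can be sketched as a chain of resources, each optionally pointing to the resource that describes it. The `described_by` field loosely mirrors the "describedby" relation; the class name, identifiers and property values are assumptions for illustration, and the sketch assumes the chain has no cycles.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Resource:
    identifier: str
    properties: dict = field(default_factory=dict)
    # The resource describing this one, if any.
    described_by: Optional["Resource"] = None


def description_chain(resource):
    """Identifiers of the metadata, the metadata about it, and so on."""
    chain = []
    current = resource.described_by
    while current is not None:   # assumes no cycles
        chain.append(current.identifier)
        current = current.described_by
    return chain


# Metadata about metadata: who created the record, when, in which language.
md_about_md = Resource(
    "md-of-md:1",
    {"creator": "example creator", "created": "2014-05-06", "language": "en"},
)
md = Resource("md:1", {"title": "Example dataset"}, described_by=md_about_md)
dataset = Resource("dataset:1", described_by=md)
```

Every element of the chain is the same kind of thing; whether it counts as data or metadata depends only on which end of the relation it sits on.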
    Cheers,
    Andrea
    ----
    [1] http://inspire.ec.europa.eu/

  • Gary Berg-Cross's picture

    Author: Gary Berg-Cross

    Date: 07 May, 2014

    Andrea noted:
    >it would be important to clearly state why we need to
    classify metadata, and which is the purpose of such classification.
    The earlier discussion includes descriptive vs. functional categories and
    then some additional subclasses within each. The descriptive
    classification seems useful and is related to talking about data types.
    When we understand what type of data we are describing we are talking
    about descriptive MD. Different processes may be applied to data of
    different types in part because they have different data structures that
    get described for processing purposes etc.
    We need some other MD besides this. Data is often held in particular
    disciplinary repositories and effective use of these collections requires
    disciplinary expertise, including knowledge of assumptions and models used
    in creating and interpreting the data, but also the purposes for which the
    data was collected. Again it seems useful to try to come up with some way
    of capturing these purposes.
    On this issue of MD about MD, I agree that there is MD about MD. After all MD
    is just data that plays a role of describing other data. And MD gets
    stored in repositories and we have data/MD about this such as who provided
    it, when it was last updated etc. After all it is just data playing a MD
    role.

  • Gary Berg-Cross's picture

    Author: Gary Berg-Cross

    Date: 07 May, 2014

    A small addition to my last posting.
    When we are trying to justify typing MD, we might think of a type like
    Provenance, which has some special needs to track data over its lifetime.
    At any point data may be attached to a different data set than its original
    set. We want this type of information as well as still wanting to know
    what particular file the instance data came from, along with information
    about the organization responsible for generating the original data.
    Then there is temporal info - the range of time over which the data
    applies. When we are talking about dynamic data sets that range can change
    over time and this needs to be tracked.

  • Keith Jeffery's picture

    Author: Keith Jeffery

    Date: 08 May, 2014

    Gary –
    Perhaps this is stretching the concept of ‘type’ too far; you are describing temporally-bound role-based relationships.
    Example: dataset Y was derived by summarisation from dataset X between date-time 1 and date-time 2 (i.e. a transformation – provenance). A different example: dataset A was generated by equipment E between date-time 1 and date-time 2 (some time intervals, e.g. in astronomy or neutrino science, can be long). A third example is: dataset S was collected by seismic array A between date-time 1 and date-time 2, where typically the interval is some days.
    You could also express project P produced dataset D which relates to/covers time interval date-time 1 to date-time 2.
    The representation of relationships can be done by extended entity-relationship modelling, by object-relational modelling, by LOD/RDF, by OWL/RDF etc.
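Independently of which of those technologies is chosen, a temporally-bound role-based relationship can be sketched in a few lines of Python. The class name, role labels and dates below are assumptions for illustration; only the shape (subject, role, object, time interval) comes from the examples above.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Relationship:
    subject: str
    role: str
    obj: str
    start: datetime      # date-time 1
    end: datetime        # date-time 2

    def holds_at(self, moment):
        """True when the relationship is valid at the given moment."""
        return self.start <= moment <= self.end


relationships = [
    Relationship("dataset:Y", "derived-by-summarisation-from", "dataset:X",
                 datetime(2014, 1, 1), datetime(2014, 1, 2)),
    Relationship("dataset:A", "generated-by", "equipment:E",
                 datetime(2013, 1, 1), datetime(2014, 1, 1)),
]


def related(subject, moment):
    """Roles and objects linked to a subject at a point in time."""
    return [(r.role, r.obj) for r in relationships
            if r.subject == subject and r.holds_at(moment)]
```

The role and the interval live on the relationship itself, so neither the datasets nor the equipment need a dedicated provenance "type".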
    Best
    Keith
    Keith G Jeffery Consultants
    Prof Keith G Jeffery
    E: ***@***.***
    T: +44 7768 446088
    S: keithgjeffery
    Past President ERCIM www.ercim.eu (***@***.***)
    Past President euroCRIS www.eurocris.org
    Past Vice President VLDB www.vldb.org
    Fellow (CITP, CEng) BCS www.bcs.org
    Co-chair RDA MIG https://rd-alliance.org/internal-groups/metadata-ig.html
    Co-chair RDA MSDWG https://rd-alliance.org/working-groups/metadata-standards-directory-work...
    Co-chair RDA DICIG https://rd-alliance.org/internal-groups/data-context-ig.html
    From: gbergcross=***@***.***-groups.org [mailto:***@***.***-groups.org] On Behalf Of Gary
    Sent: 08 May 2014 00:11
    To: ***@***.***-groups.org
    Subject: Re: [rda-metadata-ig] Can we move towards a working definition of metadata
    A small addition to my last posting.
    When we are trying to justify typing MD we might think of a type like Provenance, which has
    some special needs to track data over its lifetime.
    At any point data may be attached to a different data set than its original set. We want this information, while still knowing which particular file the instance data came
    from, along with information about the organization responsible for generating the original data.
    Then there is temporal info - the range of time over which the data applies. When we are talking about dynamic data sets that range can change over time and this needs to be tracked.
    Gary Berg-Cross, Ph.D.
    On Wed, May 7, 2014 at 1:45 PM, Gary Berg-Cross <***@***.***> wrote:
    Andrea noted:
    >it would be important to clearly state why we need to
    >classify metadata, and which is the purpose of such classification.
    The earlier discussion includes descriptive vs. functional categories and then some additional subclasses within each. The descriptive classification seems useful and is related to talking about data types. When we understand what type of data we are describing we are talking about descriptive MD. Different processes may be applied to data of different types in part because they have different data structures that get described for processing purposes etc.
    We need some other MD besides this. Data is often held in particular disciplinary repositories and effective use of these collections requires disciplinary expertise, including knowledge of assumptions and models used in creating and interpreting the data, but also the purposes for which the data was collected. Again it seems useful to try to come up with some way of capturing these purposes.
    On this issue of MD about MD, I agree that there is MD about MD. After all, MD is just data that plays the role of describing other data. MD gets stored in repositories, and we have data/MD about it, such as who provided it, when it was last updated etc. After all, it is just data playing an MD role.
    Gary Berg-Cross, Ph.D.
    On Tue, May 6, 2014 at 6:22 AM, andrea.perego <***@***.***> wrote:
    My two cents...
    1. IMHO, it would be important to clearly state why we need to
    classify metadata, and which is the purpose of such classification.
    Personally, I find such exercise very difficult, since the same
    metadata element can be classified in different groups depending on
    the context. Just to give an example, INSPIRE [1] metadata elements
    are grouped into three main classes: discovery, evaluation and use.
    Whether a metadata element is in one class or another does not depend
    on its intrinsic characteristics, but simply on the role it
    plays in the context of INSPIRE.
    2. +1 from me to Keith's points (1) (metadata "data about data") and
    (2) ("there is no difference between data and metadata"). It's again
    the context determining whether given data can be considered as
    metadata. I also think that (1) highlights two important points. The
    first is recursion (as noted by Chris) - BTW, in INSPIRE we do have
    metadata about metadata (more precisely, who created the metadata,
    when and in which language). The other is that, after all, metadata
    are all descriptive - the difference is what they are describing, and
    for which purpose. On this, it may be worth noting that the de facto
    standard way on the Web to link a resource to its metadata is by using
    the "describedby" relation - see:
    - http://www.iana.org/assignments/link-relations/link-relations.xhtml
    - http://www.w3.org/TR/powder-dr/#assoc-linking
    (I must also say I have a conflict of interest here, since I
    contributed to the definition of such relation)
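    As a rough illustration of the "describedby" relation, the sketch below builds and reads an HTTP Link header in Python. It is an assumption-laden example, not a full implementation of the Web Linking syntax: the function names describedby_header and find_describedby and the example.org URLs are hypothetical, and the simple regular expressions do not handle quoted parameter values that themselves contain commas or semicolons.

```python
import re


def describedby_header(metadata_url: str, media_type: str = "") -> str:
    """Build a Link header value pointing from a resource to its metadata."""
    value = f'<{metadata_url}>; rel="describedby"'
    if media_type:
        value += f'; type="{media_type}"'
    return value


# One link: a <target> followed by zero or more ";name=value" parameters.
_LINK = re.compile(r'<([^>]*)>((?:\s*;\s*[^;,]+)*)')
# One parameter: ;name="value" or ;name=value (simplified, see lead-in).
_PARAM = re.compile(r';\s*([\w-]+)\s*=\s*"?([^";,]*)"?')


def find_describedby(link_header: str) -> list:
    """Return the target URLs of all links whose rel includes 'describedby'."""
    targets = []
    for url, params in _LINK.findall(link_header):
        attrs = {name.lower(): val for name, val in _PARAM.findall(params)}
        # rel may hold several space-separated relation types.
        if "describedby" in attrs.get("rel", "").lower().split():
            targets.append(url)
    return targets


# Hypothetical header carrying a metadata link and an unrelated "next" link.
header = (
    describedby_header("http://example.org/data.rdf", "application/rdf+xml")
    + ', <http://example.org/next>; rel="next"'
)
metadata_links = find_describedby(header)
```

    A client that receives such a header with a data resource can follow the returned URL to fetch the description, whatever that description happens to contain.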
    Cheers,
    Andrea
    ----
    [1]http://inspire.ec.europa.eu/

  • Author: Andrea Perego

    Date: 10 Jun, 2014

    Dear colleagues,
    Just to mention that a similar issue is under discussion in the W3C
    Data on the Web Best Practices WG. See, e.g.:
    http://lists.w3.org/Archives/Public/public-dwbp-wg/2014Jun/0068.html
    Cheers,
    Andrea
