Case Statement: Metadata Working Group

29 May 2013


Posted: Sat Dec 22, 2012 2:01 pm
by HermanStehouwer
Dear Community,

We have received the official draft case statement for "Metadata Working Group".
Please provide your comments.

Re: Case Statement: Metadata Working Group

Posted: Fri Dec 28, 2012 11:39 am
by dgiaretta

A few points:

  • as mentioned in the e-IRG/RDA meeting in Amsterdam in Dec 2012, the use of the word metadata can cause great confusion because it means different things in different communities. The case statement does say that a start will be made with descriptive metadata, but it is unclear what that means; my initial assumption was that it was metadata for the discovery of data, but there are other definitions. Therefore it may be sensible to define or adopt a finer grained terminology. My suggestion is to adopt as a start the finer grained terminology of the OAIS (ISO 14721, available from http://public.ccsds.org/publications/archive/650x0m2.pdf) Information Model, which includes Representation Information, Provenance, Access Rights, etc, with associated definitions and examples.
  • it is not clear to me how this relates to the work of the proposed Data Type Registry WG
  • I assume "standards" will include international standards, community standards etc, but I know that there are many local conventions built on top of such standards, especially in smaller communities. I note that one of the impacts hoped for is the reduction in ad-hoc metadata formats, but the reality is that they already exist.



Hope this helps



..David Giaretta

director@alliancepermanentaccess.org


Re: Case Statement: Metadata Working Group

Posted: Thu Jan 03, 2013 5:20 pm
by janeg
Hello David,



Thank you very much for your feedback. 



- point 1, the label "descriptive" metadata as a start is loose at best, given the diversity of the way metadata schemes have been developed. Schemes (and even properties) often include multi-functional aspects, even if the stated "objects or goals" target a single function (e.g., discovery, preservation, authentication, etc.). Your point about giving context to the definition is excellent! Agreed, it will strengthen this proposed WG, and we can and should reference OAIS, Dublin Core, and other relevant definitions to help the community converge on definitions of metadata, and the WG's focus. I believe this will be a learning process, over time, and your recommendation here will definitely help.

- re: type registry... we'll potentially need some liaising with this group. They have not yet proposed a case statement, but I have the same question... so you are prompting us to check in and make sure they are also viewing our proposal, and we should be communicating with them too.

- re: the "reality" of multiple schemes already existing (global, national, local, etc..). I can't speak for the group as a whole, although I suspect most folks engaged at this point agree that there is always a way to improve our metadata ecology. (me .. i'm a metadata optimist!). RDA has asked us to consider what can be done in 12 to 18 months, and some basic steps will help provide a positive step toward reducing redundancy, a foundation for future and better aligned work, etc... 



Thanks again for the thoughtful comments; they are much appreciated. best wishes, jane

Re: Case Statement: Metadata Working Group

Posted: Mon Jan 07, 2013 3:07 pm
by maxcraglia
Dear Colleagues,



thank you for sharing this draft. I have two main observations/contributions to make. 



1) As part of the environmental scanning activities it may be important for the WG to be aware that in Europe there is a legal requirement on all 27 member states of the European Union to document spatial environmental datasets and services with a specific set of minimum metadata elements. This requirement comes from the Directive establishing an Infrastructure for Spatial Information in Europe (INSPIRE) (see http://inspire.jrc.ec.europa.eu). Although this legal framework is not specific to scientific data (it also covers observational data for policy and administration), it is nevertheless relevant in Europe because a) it does include data of scientific value and b) European governments investing heavily in documenting their environmental resources with the INSPIRE metadata wish to minimise the potential duplication of effort resulting from the use of other standards.



With this in mind, I would recommend that the INSPIRE metadata elements for both datasets and services are considered.



2) The INSPIRE metadata could be considered as specific to a particular community bound by geography (the EU 27 member states) and theme (environmental and geospatial information). There are multiple other standards (de facto or de jure) that are in use by other communities and disciplines across the physical, environmental, social sciences and the humanities. Whilst the proposal in the WG document to “seek consensus on a minimum metadata set supporting the exchange and interoperability of scientific data resources” (post month 18) is valuable, it should also be recognized that it may be difficult enough to find consensus across multiple communities/disciplines even on a minimum set for discovery of resources, let alone for evaluation and use, if that is what is implied by “exchange and interoperability”. We may therefore need alternative/complementary approaches to ensure interoperability across the variety of standards that each disciplinary community is most likely to want to maintain. This is the multidisciplinary interoperability challenge. To address this challenge some recent research projects in Europe (e.g. http://www.eurogeoss.eu, http://www.geowow.eu, http://www.envirogrids.eu) have adopted a “brokering” approach to build bridges across the standards and practices of different communities. The NSF EarthCube initiative is also considering this approach to link existing cyberinfrastructures. The brokering approach is described in Nativi et al. 2012 (http://ijsdir.jrc.ec.europa.eu/index.ph ... ew/281/319) and Vaccari et al. 2012 (http://ieeexplore.ieee.org/xpl/articleD ... 6383184%29).



If the aim of the RDA WG on Metadata is to foster interoperability and use of information resources across multiple scientific disciplines and audiences (science, policy, public) then it may be already worth thinking of a Metadata Directory that links the different metadata standards and practices to the different disciplines that use them, evaluate evolution and possible convergence towards a few, more widely used solutions, and plan a strategy for achieving interoperability including cross-walks, brokering frameworks, and so on.
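
To make the directory idea a little more concrete, here is a minimal sketch in Python of what one entry and a lookup by discipline might look like. Everything in it is an assumption made for illustration: the class names, the fields, and in particular the crosswalk listed for INSPIRE, which is placeholder data and not a statement that such a crosswalk has been published.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class StandardEntry:
    """One hypothetical directory entry: a standard and who uses it."""
    name: str
    disciplines: Set[str]
    # Names of other standards a crosswalk is recorded for (illustrative only).
    crosswalks_to: Set[str] = field(default_factory=set)


class MetadataDirectory:
    """Hypothetical in-memory directory linking standards to disciplines."""

    def __init__(self) -> None:
        self._entries: Dict[str, StandardEntry] = {}

    def register(self, entry: StandardEntry) -> None:
        self._entries[entry.name] = entry

    def standards_for(self, discipline: str) -> List[str]:
        """Return the names of all standards registered for a discipline."""
        return sorted(
            e.name for e in self._entries.values() if discipline in e.disciplines
        )

    def crosswalk_exists(self, source: str, target: str) -> bool:
        """True if a crosswalk from source to target has been recorded."""
        entry = self._entries.get(source)
        return entry is not None and target in entry.crosswalks_to


directory = MetadataDirectory()
# Placeholder data: the crosswalk below is assumed purely for the example.
directory.register(
    StandardEntry("INSPIRE", {"environmental", "geospatial"}, {"Dublin Core"})
)
directory.register(StandardEntry("Dublin Core", {"general"}))

print(directory.standards_for("environmental"))
print(directory.crosswalk_exists("INSPIRE", "Dublin Core"))
```

A real directory would also need provenance for each entry and some measure of adoption, but even this flat structure shows how a discipline dimension and recorded crosswalks could be queried together.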

Re: Case Statement: Metadata Working Group

Posted: Fri Jan 18, 2013 10:16 pm
by schmitzd

I agree with Max Craglia in that at the current point in time, the interdisciplinary issue is not yet the most important one. Certainly, this is the ultimate goal – people from different disciplines can get a grasp on data from other communities.



But as we try to build up support for coping with research data at RWTH Aachen University, the current experience with researchers is that they struggle to describe their data at all. And if they describe them, they rarely check for existing standards. Thus, our current focus is to address the needs of these people within their own community, that is, discipline-specific issues. INSPIRE is a good example: the researchers unfortunately were not really aware of this standard, the library staff found it and they accepted it happily, similarly for GEMET (a multilingual thesaurus in the field of environmental science). As the case statement puts it in Sect. 2.2: "Reduce the proliferation of new and ad hoc metadata formats, and duplicative standards development efforts."



Thus for the time being, we firstly have to encourage researchers to describe their data at all in order to better cope with it, understand it, reuse it. For this purpose, the description needs to be tailored to a discipline. In order to support this effort, advisors need to be aware of the many national and international standards (as well as their degree of adoption, e.g. DICOM standards are great for some parts of medicine but not (yet) for others). Keeping track of all these issues cannot be performed by a single institution and would in fact lead itself to redundant work. By preventing an uncontrolled growth of standards within a discipline we alleviate addressing interdisciplinary issues later on. Thus the collection of discipline-specific standards possibly triggers a harmonization of these standards preferably moderated by metadata experts (as a pre-step to interdisciplinary issues).



Thus, I would like to emphasize that the RD Alliance is to me the perfect place to locate a directory of discipline-specific metadata standards. Note that such a directory has an immediate impact since it might prevent duplicative standards development efforts outright. The short term goals (Sect. 1.1) should pick up on this by explicitly mentioning that the repository needs a discipline dimension as also suggested by Max. And for the “operational plan […] for sustainable growth and maintenance” we need to get professional societies involved in order to look after the discipline-specific standards.

It might become hard to run the metadata directory as a Wiki then due to size issues. But that is a rather technical issue that can be addressed once we actually run into this problem.

In regard to the long-term goals (1.2), we agree on the expansion of the directory to content value vocabularies but would add classifications as well. For the "[promotion of] metadata solutions" we should possibly consider providing input to the guidelines for setting up data management plans as required by the NSF for example.



Minor issues:

  • The list in 2.1 needs to be extended by “libraries” as some of them might (hopefully) be in the position to advise (local) researchers while being well trained to keep an eye on normative issues.
  • The German network of university libraries, university computing centers, and professional societies (DINI, http://www.dini.de) has a working group on metadata, since recently with a specialization in metadata for research data (https://wiki.d-nb.de/display/DINIAGKIM/Forschungsdaten, unfortunately only available in German so far). That can serve as a national partner for 4.2 MWG operation, “broader community engagement and participation”.
  • Sect. 5, Gantt chart, component 1 has the wrong title.



Dominik Schmitz, IT Department of the university library at RWTH Aachen University, Germany


Re: Case Statement: Metadata Working Group

Posted: Mon Jan 21, 2013 11:06 am
by KeithJ
Hi all -



just a few points:



1. as well as INSPIRE (and the use of vocabularies or code-lists such as NUTS and NACE) there is an EU Recommendation to Member States named CERIF (Common European Research Information Format) which is maintained for the EU by euroCRIS (www.eurocris.org) - euroCRIS has members in 42 countries;



2. in some large projects a 'layered' structure is being adopted with:



(a) discovery metadata (simple, flat metadata like a library catalog card or Dublin Core) that is common across multiple disciplines; 

(b) contextual metadata (using CERIF) describing persons, organisations, projects, funding, facilities, equipment, publications, patents, products etc etc; 

(c) detailed metadata which is domain specific or even individual experiment or observation-specific.



The idea is that (b) is the lowest level of commonality so the user provides (b); (a) is generated from it (from CERIF one can generate not only DC but also MARC, MODS etc and also the e-government standards like CKAN expressed as RDF etc) and (b) 'points to' (c). (c) is more like schema level - i.e. used to connect the dataset to software. With such a structure automated interoperation becomes possible.



3. CERIF is not only designed to be multilingual but manages any declared semantics over a formal syntax.



4. euroCRIS has a strategic partnership with VIVO in the US on converging metadata for research information (based on generation of RDF (semantic web/linked open data) from CERIF);



5. euroCRIS also has strategic relationships with EARMA, ALLEA, JISC, APA, COAR, ICSU/CODATA, CASRAI etc etc
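
The layered structure in point 2 can be sketched roughly as follows. This is only an illustration under stated assumptions: the record class and its field names are hypothetical and are not CERIF entities, the output keys merely borrow Dublin Core element names, and the mapping between them (for example organisation to publisher) is an arbitrary choice made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ContextualRecord:
    """Layer (b): hypothetical contextual metadata supplied by the user."""
    title: str
    persons: List[str]
    organisation: str
    project: str
    detailed_metadata_ref: str  # pointer to the layer (c) description


def to_discovery(record: ContextualRecord) -> Dict[str, object]:
    """Generate a flat layer (a) record from a layer (b) record.

    The keys reuse Dublin Core element names; the mapping is assumed.
    """
    return {
        "title": record.title,
        "creator": list(record.persons),
        "publisher": record.organisation,
        # The discovery record keeps the pointer so users can reach layer (c).
        "relation": record.detailed_metadata_ref,
    }


record = ContextualRecord(
    title="Example dataset",
    persons=["A. Researcher"],
    organisation="Example Institute",
    project="Example Project",
    detailed_metadata_ref="https://example.org/detailed/123",
)
print(to_discovery(record))
```

The point of the sketch is the direction of flow: the richer (b) record is entered once, the flat (a) record is derived from it, and the reference to (c) travels along.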



I am only too aware of the rich diversity of metadata that can occur under (c). David in an earlier post mentions OAIS; this could best be considered at the detailed (c) level since the implementations vary across disciplines. Similarly the detailed metadata on different kinds of environmental data, astronomical data, physics data etc all fit under (c).



As we all know the problem with standards in any field (and it is especially true of metadata) is that there are so many of them! I am concerned that one metadata working group may be too large a circus tent - there are so many aspects of metadata that I suspect some internal structure will be necessary. One might perhaps envisage a metadata plenary group and subgroups (SIGs, BoFs) on discovery, contextual and detailed metadata (the latter maybe split by discipline). Whether the subgroups are Working Groups in the RDA structure is - I guess - up for debate.



Keith

http://www.stfc.ac.uk/Staff/5656.aspx

Re: Case Statement: Metadata Working Group

Posted: Fri Jan 25, 2013 3:11 pm
by pwittenburg
Dear Jane and metadata Colleagues,



let me also add my personal impressions about the suggested Case Statement.



Metadata is part of almost any discussion when people speak about data management and re-usage, so it is utterly important for RDA to have activities in this area to help overcome roadblocks still in the way. Let me first make a few points of critique and then add some possible suggestions.



• To me creating a wiki with survey information of metadata initiatives, schemas etc and maintaining the wiki leads to an infrastructure project which requires maintenance etc, but it is not really an RDA topic as I see it. It would be an RDA topic if you guys would specify a registry of MD standards which would have a clear focus with a clear chance of getting this done in 12 months. 

• Developing a minimal set of MD for data description is another valuable and focused work program, but this is intended to start after the first WG finished. I see the sequence in your work and see the motivation of doing this overview first. So I don't have a good solution how to get this done without the wiki component.

• Work which would start after the first 18 months should not be proposed in the CS (which is more a formal aspect).

• The participating people seem to be mainly US - so an international participation is missing or accidental, although there are so many initiatives in AU and EU and beyond.



Let me add here from my bad memory that there were 3 other concrete topics mentioned in the Washington meeting:

- establish basic metadata principles (register schemas and semantics etc.), 

- how to achieve faster interoperability, 

- clarifying terminology and make suggestions for building blocks (preservation metadata, provenance metadata, description MD, system MD etc - so many different terms out there confusing all of us again and again)



So I hope that this helps to define the focused topics removing roadblocks within a short time frame.



best

Peter Wittenburg

Re: Case Statement: Metadata Working Group

Posted: Mon Jan 28, 2013 10:06 pm
by Gary
Is there any goal in the first 18 months to compare and contrast the MD standards and perhaps do a gap analysis looking at what needs to be added to improve on standards and bring them together? This would seem to be a useful thing to do and perhaps could be done as standards are looked at for storing in a registry.

Re: Case Statement: Metadata Working Group

Posted: Sat Feb 02, 2013 2:38 am
by janeg
A brief note to thank folks for providing thoughtful feedback. Please note that case statement 1 was prepared following the guidelines set forth to consider low-hanging fruit (what could be accomplished in 12-18 months) while leveraging existing resources. Keep the comments and ideas coming, please; they will guide the revision for case statement 2. best wishes and thanks again, jane