Hi, all,
I've entered the terms/definitions from our paper
(http://dx.doi.org/10.5281/zenodo.34542) into the RDA Data Foundations &
Terminology TeD-T terms tool
(http://smw-rda.esc.rzg.mpg.de/index.php/Main_Page). This can be listed as
part of the implementation & dissemination plan of action.
Claire, I wasn't sure which terms cross-linked to which definitions in the
CASRAI research data domain glossary, but if you can let me know, I will
add those as references.
Terms entered in the RDA DFT TeD-T tool:
1. Data publishing
2. Data publishing workflows
3. Data journal (nb: format issues for internal links)
4. Data article
5. Data review
6. Data repository entry (nb: format issues for internal links)
Best,
Amy
_________________________________
Amy Nurnberger, Research Data Manager
Center for Digital Research and Scholarship
Columbia University / 212.851.2827
E-mail: ***@***.*** <***@***.***>
ORCID: 0000-0002-5931-072X
Twitter: @DataAtCU
Author: Leonardo Candela
Date: 20 Jan, 2016
One of the aspects puzzling me is the fact that all these definitions are not connected to any existing piece of work. I'm confident that there are other names that are potential "synonyms" of the selected ones, e.g. "data paper" seems to me more frequent than "data article" or "data descriptor". Evidence of this is in Candela, L., Castelli, D., Manghi, P. and Tani, A. (2015), Data journals: A survey. Journal of the Association for Information Science and Technology, 66: 1747–1762. doi: 10.1002/asi.23358
By citing / referring to existing pieces of work the given definitions might be less "out of the blue" than they appear now.
In addition to that, I do not know whether the authors believe in Wikipedia or not. BTW, it might be a good idea to target this source to disseminate the terminology (although it is not shared yet).
Some pages are already there, e.g.
Author: Amy Nurnberger
Date: 20 Jan, 2016
Hi, Leonardo,
Thank you for your thoughts on this. Given the MediaWiki platform of the
RDA DFT TeD-T term tool, I encourage you to edit, enhance, open for
discussion, and add references to the entries for these terms. Most
certainly, please do add the related terms for "data paper" that are
detailed in your article.
Regarding your idea for editing (& adding) Wikipedia entries, that's great!
It would be especially nice to add details/images regarding the data
publishing workflow to the Data Publishing entry.
Are there any Wikipedians that would like to volunteer for this?
Best,
Amy
On Wed, Jan 20, 2016 at 8:52 AM, leonardo.candela <
***@***.***> wrote:
Author: Gary Berg-Cross
Date: 20 Jan, 2016
Leonardo,
> One of the aspects puzzling me is the fact that all these definitions are
> not connected to any existing piece of work. I'm confident that there are
> other names that are potential "synonyms" of the selected ones,
One may put alternative definitions in the DFT term tool and discuss them.
In earlier work on the core DFT definitions we had a synthesis document
that discussed some of the variant definitions for things like digital
objects. Below is that section of the Synthesis report showing what this
consideration of alternatives feels like. You may want to use the tool to
carry on such a discussion,
1 Digital Object (DO)
*A. Definition*
*A digital object (DO) is represented by a bitstream, is referenced and
identified[1]
by a persistent identifier and has properties being characterized by
metadata. *
*Note: As indicated we only talk about registered DOs in the context of
this document. *
*Note: Properties included in metadata include discovery, contextual,
schema, rights, curation and provenance information. *
*Note: A DO is said to be dynamic when the information content represented
in a DO is changing for some period of time or even for indefinite
duration.*
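As a rough illustration only, the definition above (a bitstream, a persistent identifier, and metadata-described properties) could be sketched as a small data structure. This is not part of the Synthesis report or of any RDA tool: the class name, field names, helper methods, and the example identifier are all hypothetical, chosen just to make the three parts of the definition concrete.

```python
from dataclasses import dataclass, field
import hashlib

# Metadata categories named in the note above; the tuple name is an assumption.
METADATA_CATEGORIES = (
    "discovery", "contextual", "schema", "rights", "curation", "provenance",
)


@dataclass
class DigitalObject:
    """Hypothetical sketch of a registered DO: bitstream + PID + metadata."""
    pid: str                      # persistent identifier (any scheme)
    bitstream: bytes              # the content the DO is represented by
    metadata: dict = field(default_factory=dict)
    dynamic: bool = False         # True if the content is still changing

    def is_registered(self) -> bool:
        # In this sketch, "registered" simply means a non-empty PID is present.
        return bool(self.pid.strip())

    def checksum(self) -> str:
        # SHA-256 hex digest of the bitstream, e.g. for fixity checks.
        return hashlib.sha256(self.bitstream).hexdigest()

    def missing_metadata(self) -> list:
        # Categories from the note that have no entry in the metadata.
        return [c for c in METADATA_CATEGORIES if c not in self.metadata]


# Example with a made-up identifier and content.
example = DigitalObject(
    pid="hdl:0000/example",
    bitstream=b"abc",
    metadata={"rights": "CC0"},
)
```

A structure like this makes it easy to see where the variants below differ: some drop the requirement for `pid`, while others say more about the internal structure of `bitstream` or `metadata`.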
*B. Elaboration*
There are many alternative views and definitions out there; we just want to
mention 7 of them:
*Variant 1*
Digital objects (or digital materials) refer to any item that is available
digitally. (Wikipedia)
*Variant 2*
A digital object is composed of structured sequence of bits/bytes. As an
object it is named. The bit sequence realizing the object can be identified
& accessed by a unique and persistent identifier or by use of referencing
attributes describing its properties. (in DFT Term Tool and from the
Practical Policy WG).
*Variant 3*
Digital Object is also called a Digital Entity defined as
“machine-independent data structure consisting of one or more elements in
digital form that can be parsed by different information systems; the
structure helps to enable interoperability among diverse information
systems in the Internet.” (in DFT Term Tool)
*Variant 4*
The Fedora Commons architecture defines a generic digital object model that
can be used to persist and deliver the essential characteristics for many
kinds of digital content including documents, images, electronic books,
multi-media learning objects, datasets, metadata and many others. This
digital object model is a fundamental building block of the Content Model
Architecture and all other Fedora-provided functionality. A Fedora object
contains a persistent identifier. (Fedora Commons)
*Variant 5*
Digital objects are marked by a limited set of variable yet generic
attributes such as editability, interactivity, openness and
distributedness. As digital objects diffuse throughout the institutional
fabric, these attributes and the information–based operations and
procedures out of which they are sustained install themselves at the heart
of social practice. (Kallinikos et al.)
*Variant 6*
Digital objects consist of multiple elements, each of which consists of a
type-value pair. Each of the types is represented by identifier and can
thereby be interrogated individually. Identifying the data structure
itself, instead of a specific file or folder that may contain it, or
perhaps the machine on which it was first made available, enables
persistent information access that is decoupled from most aspects of the
underlying technology. (Robert E. Kahn: http://hdl.handle.net/4263537/5044)
*Variant 7*
A Digital Object is an entity consisting of a sequence of bits, or a set of
sequences of bits, having an associated unique and persistent identifier. A
DO may be static or dynamic, or some combination thereof. An entity is then
defined as: An entity is anything that has a separate and distinct
existence that can be uniquely identified. (Kahn et al.)
It is important to note that not all communities insist on a PID and
registration as definitional of a DO. But (a) in this document we are only
making statements on the sphere of registered data and (b) many do and the
value of that has been a theme in some of this work and that of other RDA
WGs[2].
Fedora Commons, as cited in variant 4, offers an implementation of a
specific DO model which is central to its architecture and allows users to
bundle a number of content streams, to give it an identifier and to
associate metadata descriptions with the bundle and its components which
are themselves typed streams. So it fits with the definition we have
chosen.
The first variant is a highly condensed view of DOs and may lack enough
detail to support automating some aspects of DO management. The second
variant specifies some additional metadata for a DO and also takes a more
flexible approach to IDs recognizing the use of local IDs. The third
variant is a process view in part, since it focuses on DOs’ construction
principle and its process characteristics, and does not tell us what about
the structure is needed to enable interoperability. More information may be
needed to help automate interoperability. The construction principle is
reflected in the definition in an abstract way and the attribute
descriptions within ID or metadata records will enable processing. The
second variant makes use of the words “is composed of” instead of “is
a”. Since we find “is a” simpler and more direct, we opt for these words.
Having a name, as is suggested in variant 2, is one of its properties that
can be found in the ID and/or metadata records; therefore it does not need to be
specified in the definition. Variant 2 does not require an ID but also
allows using “referencing attributes” as identification basis. Here,
however, we would clearly like to speak of externally registered persistent
and unique identifiers, since this will be the only way to register DOs as
an explicit step as it is necessarily required in the domain of registered
digital data. Variant 5 adds abstract and useful requirements which are
essential for accessibility, but is neutral as to the role of an ID. It
does not tell us how to document a DO to do this. Variant 6 refers to the
internal structure of a DO and the importance of types that describe the
elements of the DO independent of its creation contexts. In this respect it comes
close to the Fedora object model in Variant 4. Variant 7 includes the
possibility of changing content, introduces the term "entity" and
emphasizes the importance of being uniquely identified.
*C. Conclusion*
*We can conclude that the above definition is in agreement with the 7
variants except for the explicit registration which will be essential in
our growing domain of DOs. *
------------------------------
[1]
Some repositories include passport like information with the PID which goes
beyond pure referencing.
[2]
This can be seen by analogy with the Internet. There are many nodes out there
that do not have an IP address, but they cannot participate in the Internet
exchange.
Gary Berg-Cross, Ph.D.
***@***.***
http://ontolog.cim3.net/cgi-bin/wiki.pl?GaryBergCross
Member, Ontolog Board of Trustees
Independent Consultant
Potomac, MD
240-426-0770