Case Statement: Preservation e-Infrastructure

29 May 2013

Posted: Tue Jan 15, 2013 8:48 am
by HermanStehouwer
Dear all,

we received a case statement on Preservation e-Infrastructure.

Here it is.

Re: Case Statement: Preservation e-Infrastructure

Posted: Tue Jan 15, 2013 10:28 pm
by DFFlanders
Should this group also be considering the policy, and potentially the APIs, for how 'weeding' of data might occur? Unlike LOCKSS, which works with publications, datasets/objects are too big for us to keep everything. So rather than doing bit-level preservation of every dataset/object, won't repositories that replicate data also need a method for knowing whether data should be discarded so that other things can be preserved? In short, is this WG taking a 'let's preserve everything' approach to its API design and build, or will there be a function for querying "is it time for us to delete this dataset/object?" to make room for other data which we should be keeping?

Also, ANDS is cited but no one from ANDS is sitting on the WG. 

Kind Regards,

David F. Flanders

Senior Analyst

ANDS.org.au

Re: Case Statement: Preservation e-Infrastructure

Posted: Wed Jan 16, 2013 9:01 am
by dgiaretta
Hi David

the approach in the case statement is agnostic about what is preserved in the sense that if a repository has decided to preserve something, i.e. keeping it understandable and usable, then the e-Infrastructure services we define should help.

I think I am right in saying that, although appraisal is an important topic, it would not be something we would work on, unless someone already has a good way of doing this as an e-Infrastructure service. To make significant progress in the timescale available, my view is that we need to be led by what is already out there, or by what we believe is reasonably doable. At the moment I don't know of any such service, but that probably reflects my ignorance and I look forward to further discussions!

This is my view but I'd appreciate views of others in the working group and beyond.

By the way, I'll be emailing invitations to people, including ANDS, again - probably got buried in inboxes over the break - if you or anyone else would like to take part please let me know (director@alliancepermanentaccess.org). We'd certainly like you on board and playing an active part.

Regards

..David

Re: Case Statement: Preservation e-Infrastructure

Posted: Fri Jan 25, 2013 3:25 pm
by pwittenburg
Dear David and colleagues,

let me add my personal comments to your case statement.

• At this moment the cs sounds more like a project proposal aiming at an infrastructure (like a registry of services) while RDA should be about specifications etc. So I don't see which concrete barrier you want to remove within the required short time frame.

• As a consequence the claims in the value proposition are very generic and the adoption statement is not clear.

• There was no debate in a group and no forum interaction, i.e. engagement etc. is completely obscure.

• To me the term “service” is very generic and does not say what kind of services etc. you have in mind. Several projects are mentioned without pointing to specific types of services. So even if the work is "RDA compliant" I would have no idea whether your goals can be achieved.

As it stands I would suggest to have a BoF session to a) make the goals more specific and b) see who is interested. There is also some overlap with other groups where a BoF session as suggested would be good to synchronize goals and the work.

best

Peter

Re: Case Statement: Preservation e-Infrastructure

Posted: Sun Jan 27, 2013 5:21 pm
by stotzka
Dear Peter, dear David, and colleagues,



I agree with Peter that the title "Preservation e-Infrastructure" is a bit misleading, because we are not aiming at building up an infrastructure.

We definitely need to define the term Service clearly. For me as a software engineer a service is a piece of software with well defined interfaces, e.g. REST, and well defined functionality.

As an example of a service I choose "Bit Preservation": a service that receives files, stores and replicates them, and checks their integrity at regular intervals.
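To make the three functions concrete, here is a minimal Python sketch of such a service. It is an illustration only, not the API of any project mentioned in this thread: the class name `BitPreservationStore`, its methods, and the use of local directories as stand-ins for replica sites are all assumptions.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


class BitPreservationStore:
    """Hypothetical store: keeps one copy of each file per replica directory."""

    def __init__(self, replica_dirs: list[Path]) -> None:
        self.replica_dirs = replica_dirs
        self.checksums: dict[str, str] = {}  # file name -> digest at ingest
        for directory in replica_dirs:
            directory.mkdir(parents=True, exist_ok=True)

    def ingest(self, source: Path) -> str:
        """Store the file in every replica and record its checksum."""
        digest = sha256_of(source)
        for directory in self.replica_dirs:
            shutil.copy2(source, directory / source.name)
        self.checksums[source.name] = digest
        return digest

    def verify(self, name: str) -> list[Path]:
        """Return the replica paths that are missing or no longer match."""
        expected = self.checksums[name]
        damaged = []
        for directory in self.replica_dirs:
            copy = directory / name
            if not copy.exists() or sha256_of(copy) != expected:
                damaged.append(copy)
        return damaged
```

A real service would run `verify` on a schedule and repair damaged copies from a healthy replica; both are left out here to keep the sketch short.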

Currently many different bit preservation services have been or are being defined in various data projects, e.g. DARIAH Bit preservation API, DPIF, and others. Due to their differences interoperability hardly exists, e.g. users and applications using these services are not able to switch from one data centre to another without adapting their software. The missing interoperability is the "barrier" I see in the current state of the art.
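The interoperability point can be sketched in Python as follows. Everything here is invented for illustration: `CentreA` and `CentreB` stand in for two data centres with different native APIs, and `PreservationService` is a hypothetical common interface, not a specification from DARIAH, DPIF, or RDA.

```python
from typing import Protocol


class PreservationService(Protocol):
    """Hypothetical common interface that client software would target."""

    def deposit(self, name: str, content: bytes) -> str: ...
    def retrieve(self, identifier: str) -> bytes: ...


class CentreA:
    """Invented native API of one data centre: keyed objects."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def put_object(self, key: str, blob: bytes) -> None:
        self._objects[key] = blob

    def get_object(self, key: str) -> bytes:
        return self._objects[key]


class CentreB:
    """Invented native API of a second centre: numbered upload tickets."""

    def __init__(self) -> None:
        self._files: list[bytes] = []

    def upload(self, blob: bytes) -> int:
        self._files.append(blob)
        return len(self._files) - 1

    def download(self, ticket: int) -> bytes:
        return self._files[ticket]


class CentreAAdapter:
    """Maps the common interface onto CentreA's calls."""

    def __init__(self, centre: CentreA) -> None:
        self._centre = centre

    def deposit(self, name: str, content: bytes) -> str:
        self._centre.put_object(name, content)
        return name

    def retrieve(self, identifier: str) -> bytes:
        return self._centre.get_object(identifier)


class CentreBAdapter:
    """Maps the common interface onto CentreB's calls."""

    def __init__(self, centre: CentreB) -> None:
        self._centre = centre

    def deposit(self, name: str, content: bytes) -> str:
        return str(self._centre.upload(content))

    def retrieve(self, identifier: str) -> bytes:
        return self._centre.download(int(identifier))


def archive_and_check(service: PreservationService, name: str, content: bytes) -> bool:
    """Client code written once against the common interface."""
    identifier = service.deposit(name, content)
    return service.retrieve(identifier) == content
```

With a shared interface, `archive_and_check` runs unchanged against either centre; without it, the application would have to be rewritten for each native API, which is the barrier described above.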

From my point of view the major goals are to collect the state of the art in preservation services and APIs and to extract specifications/recommendations for a common set of services/APIs.

Definitely there is a need for more communication within the group. Having regular phone conferences and a session at Gothenburg are good ideas. Overlaps with other RDA groups are partly identified, e.g. with the WG "Practical Policies".

@David: Do you think we could organise a phone conference before meeting in Gothenburg?

Best regards.

Rainer

Re: Case Statement: Preservation e-Infrastructure

Posted: Tue Jan 29, 2013 9:43 am
by dgiaretta
Dear Peter

I have put some responses interspersed below:

> At this moment the cs sounds more like a project proposal aiming at an infrastructure (like a registry of services) while RDA should be about specifications etc. So I don't see which concrete barrier you want to remove within the required short time frame.

In the draft charter we say: "The long-term vision is a standardization of preservation services and their application programming interfaces (APIs)."

so it should be clear we are aligned with what the RDA is about. The point is that there are already at least some examples of services out there and we should not ignore them. Hence we also say "identify options for service interoperability where similar service offerings are available". The barrier is that the services, especially about usability, are not in place, although they are needed, while those that do exist are not interoperable, which makes it difficult for application and solution developers.

Clearly these points are not stated well enough in the draft case statement so we can try to clarify these in the update.

> • As a consequence the claims in the value proposition are very generic and the adoption statement is not clear.

> • there was no debate in a group and no forum interaction, i.e. engagement etc is completely obscure

Since we started slightly after the initial group of candidate groups, our email discussions took place outside the forum, but there have been a number of iterations of the case statement.

> To me the term “service” is very generic and does not say what kind of services etc. you have in mind. Several projects are mentioned without pointing to specific type of services. So even if the work is "RDA compliant" I would have no idea whether your goals can be achieved.

We will add some more specific examples.

> As it stands I would suggest to have a BoF session to a) make the goals more specific and b) see who is interested. There is also some overlap with other groups where a BoF session as suggested would be good to synchronize goals and the work.

A face to face meeting at Gothenburg will certainly be very valuable but we'll be holding virtual meetings before then. A good number of people are registered and coming for the WG meeting so I believe that we are more advanced than you suggest.

I think there are potential overlaps with other CWG but the difference will revolve around whether there are specific preservation aspects.

Re: Case Statement: Preservation e-Infrastructure

Posted: Thu Feb 14, 2013 11:04 pm
by dgiaretta

Updated case statement following review comments:

 Preservation-e-Infrastructure-CaseStatementv4.docx
Updated case statement after reviews

Case Statement initial video conference?

Posted: Mon Feb 18, 2013 5:21 am
by stotzka
Dear David,

At the end of January you proposed that we should organise an initial video meeting to start the activities of the working group.

Do we already have a date set?

Best regards

Rainer
 

 

 

  • Author: Reagan Moore

    Date: 22 Jul, 2013

    The Practical Policy Working Group is developing a consensus on computer actionable preservation policies for ensuring and validating authenticity, integrity, chain of custody, and arrangement.  We encourage submission of preservation policies used in production systems.

    We note that the policies are also needed by other data management applications (data sharing, data publication).  An example is a replication policy for creating and ensuring that multiple copies of each file are maintained on separate storage systems.  The Practical Policy Working Group has initiated a group discussion on replication policies with the goal of a unifying discussion at the September RDA meeting in Washington DC.
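As a rough illustration of what a computer-actionable replication policy could look like, here is a small Python sketch. It is not the Practical Policy Working Group's formulation: the `ReplicationPolicy` class, the function names, and the storage system names in the usage note are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicationPolicy:
    """Hypothetical policy: each file needs this many copies on distinct systems."""

    min_copies: int


def missing_replicas(
    policy: ReplicationPolicy, locations: dict[str, set[str]]
) -> dict[str, int]:
    """Map each under-replicated file to the number of copies still needed.

    `locations` maps a file name to the set of storage systems holding a copy.
    """
    return {
        name: policy.min_copies - len(systems)
        for name, systems in locations.items()
        if len(systems) < policy.min_copies
    }


def plan_repairs(
    policy: ReplicationPolicy,
    locations: dict[str, set[str]],
    available_systems: list[str],
) -> list[tuple[str, str]]:
    """Propose (file, target system) copies that would satisfy the policy."""
    plan = []
    for name, shortfall in missing_replicas(policy, locations).items():
        # Only systems that do not already hold the file count as new copies.
        candidates = [s for s in available_systems if s not in locations[name]]
        plan.extend((name, target) for target in candidates[:shortfall])
    return plan
```

For example, with `min_copies=2`, a file held only on a system called `tape1` would be reported as one copy short, and `plan_repairs` would propose copying it to the first other system in `available_systems`.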

     

    Reagan Moore

  • Author: Jamie Shiers

    Date: 23 Jul, 2013

    I would like to mention 2 areas of work that will hopefully be of general interest, and where you can find more information:

    1. "Exa-scale" bit preservation: a presentation on the experience and plans for "bit preservation" at the scale of the Large Hadron Collider (LHC) is being planned for the RDA Europe meeting, with the working title "Bit Integrity for Data Archiving, Sharing and Exchange". The data volume is 100 PB today, growing to ~1 EB during the active lifetime of the LHC and its successors.
    2. Some hard numbers of the associated costs - and the plan for sustaining this for several decades - will be presented at the 4C workshop following iPRES in Lisbon in September.

    There is a 3rd area which is still being discussed, which concerns a sustainable model for long-term collaborative data storage.

    None of these is domain-specific, and I hope they will be of use to others in their data preservation activities.

    Cheers, Jamie

     
