An Interview with Lina Sitz: applying FAIR principles in IPCC activities and the RDA Ambassadorship
In mid-August, I was contacted by Anna Rzhevkina from Science Business, who expressed interest in my experience applying FAIR principles within the context of the AR6 WGI IPCC activities. Within that framework, she also showed interest in my perspective on how colleagues in my field approach activities and key challenges related to open science. Given my recent role as an RDA/EOSC Future Ambassador, I pondered how communities such as RDA or EOSC could contribute to enhancing outcomes.
Anna's inquiries gave me the opportunity to take a few moments to contemplate these matters in depth. I appreciate RDA for granting me this space to articulate these reflections. I am also immensely grateful to Alex Delipalta for helping me make this text more fluid and readable, and to Matti Heikkurinen and Najla Rettberg for encouraging me to publish it. If you're interested in reading Anna's article, in which she shows with concrete examples how RDA plays a crucial role in promoting cross-disciplinary global data sharing, you can find it at this link.
1. Could you provide more details about your project aimed at implementing FAIR data principles in the Intergovernmental Panel on Climate Change (IPCC) and explain its significance?
Note: It is important to clarify that I am speaking as a member of the data team of the Working Group I Technical Support Unit, not on behalf of the IPCC. I also wanted to point out that it's not a project per se, but rather a collaborative activity.
It is the first time that the IPCC has recommended and supported the implementation of FAIR data principles as part of the assessment process.
This has effectively been a pilot activity, through which we have been able to make data and code available for over 200 figures, about a third of the figures in the main report. We have also been able to promote a new policy across the whole IPCC to make available all the plotted data of the Summary for Policymakers figures; this summary is a product of the assessment that is agreed line-by-line with governments. By implementing FAIR principles in a comprehensive manner, we have been able to include with the WGI assessment a novel product: an Interactive Atlas (IA) where users can explore, download and plot many of the datasets used in the report. The IA provides full provenance of the plotted images that it produces.
Implementing FAIR principles in this context matters because it makes the data used in the report accessible to all: users can easily find the data, find out what it is, and also find the code that is used to generate the figures. It makes transparent how the visual information has been prepared, and it complements the IPCC principles that the assessment be transparent and robust. Through the use of the data and code, errors in the figures can be found, reported through the IPCC Error Protocol and addressed. With the data and scripts available, the figures are reusable, increasing the usefulness and uptake of the report's information and findings by students, researchers and other users.
2. What are the primary challenges that arise when applying FAIR data principles in science related to climate change? From your perspective, how can these challenges be effectively addressed?
In general, within disciplines related to earth sciences, awareness regarding the significance of high-quality data is longstanding. In that sense, there's an advantage compared to other fields because I believe most of my colleagues value the production of open and globally accessible resources. However, it's still challenging to establish standardized approaches that can be widely implemented, as there isn't a one-size-fits-all solution when it comes to adopting FAIR practices. Once someone embarks on a particular path, it's difficult to ask them to adapt to a new option.
On the other hand, I think many of my colleagues, even though they are aware of the importance of applying good data management practices, feel that they lack sufficient knowledge or time, or perhaps that they are not the ones who should be responsible for these tasks, as they weren't academically trained for them. In this regard, I believe one way to address these challenges is by incorporating these methodologies as a fundamental part of academic training, thus making them a natural part of the scientific process.
For those of us who are already at more advanced stages of our careers, as a general piece of advice, I would say just start somewhere: for example, by opening a GitHub account and storing your code there, sharing data in trustworthy repositories, or publishing via Zenodo the specific methodology you used to apply FAIR practices in your particular activity. Gradually, with feedback and suggestions from colleagues, we learn which practices truly facilitate our work and find out who can assist us throughout the process. Above all, we should not forget that we are not alone in this process, and that there are likely many people who share our doubts and probably also our needs. At the same time, there are colleagues who are highly motivated to promote these activities and are happy to provide support.
3. Please describe the structure of the project and the methods employed to gather information and outline best practices for FAIR data implementation.
We didn't actually undertake a specific 'project.' Instead, in collaboration with the WGI Bureau, we decided to implement the FAIR data principles as part of the WGI report preparation.
We gathered information on how to implement these principles through meetings with Data Distribution Centre (DDC) managers, a meeting with chapter scientists from various chapters, and later coordination with the Task Group on Data (TG-Data).
During the first TG-Data meeting in 2019, Anna Pirani, Head of the WGI TSU, advocated for the inclusion of FAIR principles on the agenda. A sub-group was formed and we began developing guidance for the IPCC in general. In the meantime, we started establishing the data team within the WGI TSU and working with DDC managers to develop protocols, guidance for the authors, templates for metadata, and workflows with a focus on quality control.
Initially, the process faced some challenges due to the absence of author instructions, timelines, and other essential elements, and consequently the initial adoption rate was moderate. However, a notable upswing occurred with the onset of the Final Government Distribution Review of the SPM, coinciding with the near-completion of the chapters.
A key product of this endeavour was the creation of the IPCC WGI Interactive Atlas. The tool was developed by authors of the Atlas chapter and includes many of the assessed datasets from other chapters. Including a digital, interactive product with the report was possible thanks to the full implementation of FAIR principles, including full provenance. Technical support was provided by CSIC, through the Instituto de Física de Cantabria and Predictia. With the support of EU DIGITAL, CSIC continues to develop this open source tool through a project aligned with the Copernicus Climate Change Service (C3S), driving the future evolution of the Interactive Atlas. The project will continue to grow with the aspiration of becoming a cornerstone resource that is relevant for the new IPCC cycle (AR7).
4. According to the RDA website, WGI has made available the data and code for nearly 200 figures coming from all report components, applying best practices in data sharing and data management. What kind of challenges occurred during this process?
In a few words, time constraints and standardisation. It has been a challenge to implement such a process during the assessment while it was already underway. We had to develop guidance material, protocols for documenting, saving, quality checking and curating new types of data, as well as code, while authors were busy undertaking the assessment.
This placed a burden on the authors, TSU and DDC managers, who were asked to do more tasks than were expected. Often, these tasks were not prioritised, which led to bottlenecks and therefore delays in assembling the information. Another challenge was defining the aggregated information for the digital resources to make them useful for potential users. On the one hand, there is an extensive amount of diverse data stemming from various disciplines and from different sources (satellites, direct observations, models, etc.). On the other hand, potential users have varying backgrounds and may use the resources differently. They range from scientists who will employ this information to validate their research, to teachers, students, journalists, science communicators, and policymakers.
Our two primary allies in addressing these challenges were flexibility and simplicity. Striving to keep things straightforward and clear, and acknowledging that not everything can fit within an overly specific structure, allowed us to navigate obstacles smoothly and attain results that are practical and useful.
5. As an RDA Ambassador, how do you perceive your role at RDA? Could you share some of your most notable achievements thus far and your aspirations for the future?
Being an RDA Ambassador has provided me with a unique opportunity to bridge two worlds: those who work on climate change and those who are dedicated to working on and enhancing aspects of open science. It has also allowed me to establish connections with individuals from different disciplines who share a similar vision of science. Serving as a communication channel, I've been able to showcase the IPCC's activities and achievements in the realm of open science. This has also served as inspiration for other scientists, demonstrating that even in complex contexts involving highly sensitive and globally significant data, it's possible to implement good data management practices. Finally, the role has deepened my understanding of available tools and groups that can greatly aid the implementation of open science best practices in my future projects.
I believe that this new open communication channel could serve as an interesting starting point for addressing a need we've been noticing with our colleagues lately. In order to achieve a higher level of standardisation that facilitates reuse and interoperability, it's important to organise vocabularies, ontologies, and similar resources for describing climate products, taking into account the data's origin as well as the entire post-processing journey to reach the final product. While there are specific instances such as METACLIP aiming for this objective, it would be immensely helpful to organise such activities under the umbrella of RDA to ensure that the implementation of these standards becomes widespread and integrated throughout the community.
I'm eager to continue collaborating and fostering synergies, consistently promoting the message that I strongly believe in – one that I consider essential for a fair and healthy development of science.
6. You mentioned the importance of sharing best practices among scientists so that they do not have to “reinvent the wheel”. How, in your view, can collaboration and knowledge exchange be encouraged?
I believe a key aspect is working towards dispelling the sense of being alone in this challenge. What has particularly helped me is knowing that there's a community out there ready to find solutions that are beneficial for all, while also adapting to my specific problem. Hence, I think both virtual and in-person gatherings are pivotal in fostering collaboration. Community forums or chats, where users can post questions and offer one another alternative solutions, are tools that should be considered within the context of promoting good open science practices. This way, it becomes easier to see that there are more people striving towards the same objectives, and potential difficulties are resolved more quickly.