Caring for Data: Toward Reusability and Knowledge Sharing


Published on: January 29, 2019, Submitted by Valerio Graziano on: January 27, 2019


ICARDA and ILRI focus on data curation as a cornerstone for the accuracy of research and the usability of its results.



The International Center for Agricultural Research in the Dry Areas (ICARDA) and the International Livestock Research Institute (ILRI) led the operative workshop on "Making Dataset Interoperable and Machine Readable" on 17 January 2019 in Amman (Jordan), where scientists from both organizations, experienced in data management for research, shared their dataset curation practices. The central subject of discussion was the presentation of the first General Dataset Curation Guide (GDCG), an open collection of formatting practices for datasets compiled by ICARDA and ILRI in partnership with the CRP on Grain Legumes and Dryland Cereals (GLDC), the CRP on Livestock, the Platform on Big Data and ICARDA GeoAgro.

The workshop objectives were to:

  • Collect and discuss feedback from scientists on data collection constraints and elaborate possible solutions;
  • Learn and share experiences of dataset curation to increase the overall quality of Open Access research publications;
  • Share knowledge among scientists about data collection and discuss plans for promoting the research results.


Following a welcome by Mr. Enrico Bonaiuti (ICARDA Monitoring, Evaluation & Learning Specialist), Dr. Jane Poole (Research Leader for the ILRI Research Method Group) presented the CRP on Livestock Open Access and Open Data approaches, as informed by the CGIAR Open Access and Data Management Policy and the CGIAR Open Access and Data Management Implementation Guidelines. Dr. Poole highlighted how publishing Open Access powers science delivery and encourages development actors to share, re-use and take up knowledge. Knowledge sharing plays a major role in raising global awareness of research activities and their ultimate goals, such as the Sustainable Development Goals set by the UN, and in equipping students and scientists with the latest information products, tools and practices developed worldwide.

 

"Open Access is possible: it needs proper skills, incentives and recognition." - Dr. Jane Poole

 

Following the introductory presentations, Mr. Francesco Bonechi (ICARDA) introduced the General Dataset Curation Guide and related solutions as a way to foster standards for compiling and curating datasets so that they are machine readable, reusable and accessible to peers and the general public. In a later session, also moderated by Mr. Bonechi, the participants tested the Guide live. He supported the scientists in applying the data practices and in discussing methodologies, comparing their real operational constraints with the dataset curation solutions highlighted during the workshop.

 

"When we follow our needs we often do not follow the standards; machine readability is likely to act as baseline for both needs and standards." - Mr. Francesco Bonechi

 

The core part of the workshop was dedicated to a critical analysis of the datasets' potential to become Open Access go-to resources for other scientists, which would also add value to the related publications. Mr. Enrico Bonaiuti moderated the session and framed the participants' individual flash talks with the following guiding questions:

 

  • Why was my dataset collected, and what was the knowledge gap?
  • Where and when was it collected?
  • How was the data collected, and by whom?
  • Have I already developed a paper based on my data? Am I planning to publish one in 2019?
  • Have I received requests for my data? How am I planning to promote it?
  • Am I planning to collect new data in 2019?

 

The first talk, by Dr. Dhehibi Boubaker (ICARDA), noted that dataset formatting standards should account for the combination of multidisciplinary data. It prompted an early discussion in which the scientists broadly agreed on the need to strengthen support for national partners in the field, such as the National Agricultural Research Systems (NARS) currently engaged with ICARDA and the CRP on Livestock in Tunisia. Building on this, Dr. Peter Hloniphani (ICARDA) stressed the importance of planning data collection activities as deliverables per se, with adequate time and resources to produce datasets in line with the best formatting practices for reusability.

 

"National partners are responsible for more than 80% of our data collection; they play a pivotal role which is important to acknowledge and support." - Dr. Dhehibi Boubaker

 

The subsequent talks looked at the promotional potential of datasets from several perspectives. Dr. Mourad Rekik (ICARDA) suggested adopting an adaptive and mindful sharing strategy for datasets, as they represent a snapshot of ongoing research efforts. Dr. Mira Haddad (ICARDA) underlined that promoting evidence through datasets can be very effective in showing real measures of the improvements that research activities can deliver, and that the many types of datasets offer an opportunity to highlight collaborations and partnerships. Dr. Mounir Louhaichi (ICARDA) noted that promoting datasets along with journal articles and other information products is a very transparent way to conduct research, and one that calls for more institutional activity to support both scientists and partners in choosing the best ways to share their results.

 

The General Dataset Curation Guide has provided the ground for future discussion and more in-depth analysis of data collection dynamics and dataset promotion strategies as a boost to the reusability of research results. ICARDA and ILRI will build on this foundation, starting by finalizing the Guide for institutional dissemination and Open Access sharing in early 2019.

Acknowledgement

This operative workshop is the result of the joint efforts of the International Center for Agricultural Research in the Dry Areas (ICARDA), the International Livestock Research Institute (ILRI), the CRP on Grain Legumes and Dryland Cereals (GLDC), the CRP on Livestock and the Platform on Big Data.

Projects

  • CGIAR Platform for Big Data in Agriculture

Keywords

data-set data management data quality fair dataset data management plans data interoperability data curation machine readable data fairness

About the author

Valerio Graziano is at the International Center for Agricultural Research in the Dry Areas (ICARDA).