Being the student fellow this year for the Data Stewardship Committee has been an amazing learning experience for me.  Through working with the Committee, not only did I get to know more Committee members and learn more about the Committee’s key collaborative areas, but I was also able to participate in some of the key projects.

Among the various activities that I was able to contribute to (and reported on in previous blog posts), data management training continues to be one of my favorite topics.  Even though the concept of data management is becoming more familiar to researchers and their information partners, such as librarians and data curators, resources like ESIP’s Data Management Short Course for Scientists will remain crucial for promoting, explaining, and propagating best practices for data management across diverse communities.  Additionally, as data management practices evolve, I know it will be vital for the Committee to provide leadership, engage in the discussions, and support the changes through its knowledge and expertise.

Besides data management training, the ESIP Summer Meeting 2015 gave me the opportunity to be introduced to the Committee’s Information Quality Cluster.  Building on the work that the Cluster has been developing, I am looking forward to exploring with the Cluster how we could help establish different levels of information quality based on the points of view of the various stakeholders involved, including users and data centers, and provide recommendations on how to achieve and uphold these levels.

While I am looking forward to the opportunities to help out with the activities relating to both data management training and the Information Quality Cluster, I know there is still so much more that I can learn from the Committee.  This is why I am excited and honored that I will be able to continue to work with the Committee as its student fellow for another year.

As a final note, I would like to provide a summary of the upcoming events at which I will be presenting on behalf of the Committee.  I look forward to having another great year supporting and collaborating with the Committee and the ESIP Community as a whole.

1) Title: Assessing Information Quality: Use Cases for the Data Stewardship Maturity Matrix

    Event: 2015 AGU Fall Meeting – Poster


    Co-Authors: Sophie Hou, Matthew Mayernik, Ge Peng, Ruth Duerr, Antonia Rosati

    Abstract: Information Quality (IQ) is an important characteristic of a data repository. Being recognized for providing “good” or “high” quality information enables trust to be built between the data repository and its communities and therefore fosters collaborations and potentially improves the utility of its data holdings. However, a common standard or framework does not currently exist to allow IQ to be assessed consistently across different data repositories.

There are several aspects that need to be considered when evaluating IQ. In particular, the data stewardship practices applied to datasets during the curation process can have a significant impact on the accessibility, usability, understandability, and integrity of the datasets over time. The Data Stewardship Maturity Matrix (DSMM) provides a framework for the evaluation of a dataset’s quality based on nine distinct categories. For each of the categories, the DSMM provides criteria that can be used to apply a 5-level rating to an individual dataset, ranging from Ad Hoc to Optimal.

This presentation gives an overview of the DSMM and the recommended process for using the DSMM to evaluate the quality of a dataset. The presentation will also provide the key findings from applying the DSMM to several datasets, including those from the Advanced Cooperative Arctic Data and Information Service, the National Center for Atmospheric Research, and the Long Term Ecological Research’s Santa Barbara Coastal site. The presentation concludes by summarizing the crucial lessons learned and the potential benefits when a data repository uses the DSMM to assess and convey the quality of its datasets.

2) Title: Data Management Training Modules: An Initial Survey and Comparison Result (Funding Friday)

    Event: ESIP Federation Winter Meeting 2016 – Poster


    Co-Authors: Sophie Hou, Matthew Mayernik

    Abstract: Focusing on the need for basic yet broad-perspective training resources, the Earth Science Information Partners (ESIP) collaborated with the National Oceanic and Atmospheric Administration and the Data Conservancy to produce the “ESIP Data Management for Scientists Short Course” (Short Course) from 2011 to 2013.  The current Short Course syllabus consists of a collection of 35 independent informational modules that cover a wide range of data management topics.  As ESIP considers updates to and new development of the Short Course, it is assessing the training resources that other organizations are generating in parallel for overlaps and complementarities.

This poster presents the results of an initial survey assessing the extant data management training resources in the earth and geosciences.  Using the ESIP Short Course as the baseline, a comparison with different training modules/courses created by other types of organizations was performed in order to understand the current needs for data management training and to build effective training resources.  Consequently, the poster summarizes the common topics presented in the training modules/courses as well as the advantages and disadvantages of each training module/course with respect to its content and its presentation style.  In conclusion, the poster recommends a list of missing topics and desirable features or characteristics that could be considered when designing ESIP’s next-generation training resource in order to improve the effectiveness of data management training.

3) Title: Data Management Training Resources Survey and Clearinghouse Project Report

    Event: ESIP Federation Winter Meeting 2016 – Session


    Co-Session Leads: Sophie Hou, Matthew Mayernik, Nancy Hoebelheinrich

    Abstract: From 2011 to 2013, ESIP partnered with NOAA and the Data Conservancy to produce the current version of the “ESIP Data Management for Scientists Short Course” (ESIP Short Course).  The members of the ESIP Data Stewardship Committee (the Committee) were able to create 35 unique modules addressing a variety of topics pertaining to data management guidelines and best practices. The Committee is interested in obtaining further funding and beginning to develop the next groups of training modules. However, the Committee also recognizes that there are many training resources that are being generated in parallel by other organizations.

In order to understand the current needs for data management training and to build effective training resources, the Committee would like to assess the current landscape of Data Management Training (DMT) resources.  This session presents the findings of the initial survey and the comparison of the DMT resources, conducted using the ESIP Short Course as the baseline.  The session will also review and discuss the following results, which contribute to the Committee’s roadmap for the Short Course and a clearinghouse project.

  1. The introduction and the selection rationale for the DMT resources that were used in the comparison.
  2. A summary of the gaps and the overlaps in the training topics when the different DMT resources were compared against the ESIP Short Course.
  3. The identification of potential collaborators.
  4. The Collaborative Data Management Clearinghouse proposal.