Arranging the Orchestra of Data

ESIP Community Fellow and PhD student, Zachary Robbins, shares his takeaways from the 2019 ESIP Winter Meeting.

During the opening plenary of the 2019 ESIP Winter Meeting, Dr. Lesley Wyborn described the Earth and Environmental Sciences as an orchestra, each section having its own instruments, notes, and intonation. The diverse community of ESIP members extends from industry to government to academia, each with a vast array of experiences, all of which can be used like the many pitches of an orchestra. The work for ESIP members, as Dr. Wyborn frames it, is to ensure that the orchestration and musical notation they share are the same, allowing the symphony to play together. The advent of digital labs and cloud computing has provided a way to extend this orchestra to encompass people internationally and across broad disciplines. To construct a full arrangement of this data symphony, we need to integrate standards across these boundaries.

While the western musical scale has 12 notes (and a musically inclined reader might note that from this there are 266 possible signatures of these notes), the number of unique data types, formats, and machines that ESIP members orchestrate is many orders of magnitude greater. We will need a truly high-caliber notation system to synthesize the tenors and basses of the many sensors that traverse and surround the Earth. Semantics and Semantic technology form the basis of this notation and can provide the structure of this orchestration.

While connecting every piece of data is a bold aspiration, in practice Semantics is simply about ensuring that a wide group of people maintains consensus on what certain data refers to. A single language or set of data may have many conflicting synonyms and homonyms that a machine (or other researchers) would not understand. A diverse collection of researchers might have different definitions for the measurements they take or may refer to the same thing in varied ways. Semantics gives us the tools to resolve these differences in a way that allows for a wider range of cooperation and more efficient collaboration. This transdisciplinary research requires a common language so that researchers from widely varied disciplines can not just work concurrently but integrate their work. As Dr. Wyborn said, the interoperability of your standard defines the community of your collaboration.
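As a rough illustration of what resolving synonyms can look like in code, here is a minimal sketch of a shared vocabulary that maps the labels different groups use onto one agreed term. The terms and labels are hypothetical, chosen only for illustration, and are not drawn from any ESIP vocabulary.

```python
# Hypothetical shared vocabulary: one agreed term -> labels seen in the wild.
CANONICAL_TERMS = {
    "sea_surface_temperature": {"sst", "sea surface temp", "ocean surface temperature"},
    "soil_moisture": {"sm", "soil water content", "volumetric water content"},
}

# Invert the vocabulary so any known label can be looked up directly.
LABEL_TO_TERM = {
    label: term
    for term, labels in CANONICAL_TERMS.items()
    for label in labels
}


def harmonize(label):
    """Return the agreed term for a label, or None if the label is unknown."""
    key = label.strip().lower()
    if key in CANONICAL_TERMS:
        return key
    return LABEL_TO_TERM.get(key)


print(harmonize("SST"))
print(harmonize("Soil Water Content"))
print(harmonize("albedo"))
```

Returning None for an unknown label, rather than guessing, leaves the decision about a new term to the people who maintain the vocabulary.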

Maintaining a core conceptualization and language is the key to what Semantics provides the Earth science community. The Semantic Technologies Committee works to encourage and promote research and development of Semantic technologies in support of the Earth Sciences. It further works to foster collaboration with other committees, clusters, and member organizations around the use of semantic technologies. By hosting the community ontology repository, the committee gives the community a place to host and store semantic vocabularies. The semantics seminars at the winter meeting provided many examples of expanding these technologies.

Data discovery relies on standards-based descriptions to allow researchers to access the data they need. The amount of work that we can accomplish is often determined more by the means to acquire and integrate the data than by the time spent in analysis. By shortening this window of discovery and data wrangling, we can increase the amount of scientific work that can be done. I saw many examples of Semantics work accomplishing this at the winter meeting. Dr. Doug Newman spoke on using Google Dataset Search to better optimize the landing pages for NASA data sites, which provides a way to better reach the researchers who use the data. Dr. Lewis McGibbney, current chair of the ESIP Semantic Technologies Committee, spoke on converting geospatial data into Resource Description Framework (RDF) linked data for the upcoming Surface Water and Ocean Topography (SWOT) mission, allowing data products to be linked spatially and queried through those spatial links. Currently, many of these sources are only spatially located through geographic information systems (GIS), limiting the capability to use them. Both efforts improve smart handoffs, making it easier for the next member in the data chain to integrate a greater amount of research into their workflows.
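To make the idea of a standards-based description concrete, here is a minimal sketch of the kind of markup a dataset landing page can embed so that dataset search engines can index it. It uses the schema.org Dataset vocabulary, which Google's dataset search reads. The function name, dataset name, URL, and bounding box are all hypothetical, and this is not the markup from either talk.

```python
import json


def dataset_landing_page_markup(name, description, url, bounding_box):
    """Build schema.org Dataset JSON-LD for a landing page.

    bounding_box is (south, west, north, east) in decimal degrees.
    """
    south, west, north, east = bounding_box
    record = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "spatialCoverage": {
            "@type": "Place",
            "geo": {
                "@type": "GeoShape",
                # schema.org writes a box as two space-separated lat/lon points.
                "box": f"{south} {west} {north} {east}",
            },
        },
    }
    return json.dumps(record, indent=2)


# Hypothetical dataset, used only to show the shape of the output.
markup = dataset_landing_page_markup(
    name="Example river height product",
    description="Illustrative water surface elevation data for one river basin.",
    url="https://example.org/datasets/river-height",
    bounding_box=(35.0, -80.0, 36.0, -79.0),
)
print(markup)
```

On a real page, the resulting JSON would sit inside a script element of type application/ld+json, which lets a crawler read the dataset's name and spatial extent without parsing the surrounding prose.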

We are now entering an age where we are not just collaborating with people across disciplines; we are also trying to find ways to better collaborate with computers. Machine learning provides us with tools to find more complex relationships between Earth Science and Biological data, but it hinges on machines having a precise understanding of how data is structured. Ontologies provide the mechanism to express the difference between an iceberg and an ice core in a way that can be understood by machines. At the ESIP Winter Meeting, I saw that there is no magic bullet for making these ontologies. They rely on expert knowledge, both in the field of interest and in semantics, to meticulously build connections between terms. During the working session to harmonize the SWEET and ENVO ontologies, Ruth Duerr and Pier Buttigieg worked with the larger semantics community present to evaluate disparities between the two ontologies and come to an understanding of how to make them interoperable. This is the labor at the heart of collaboration. This is the work of linking the disciplines and the type of work ESIP brings people together to do.
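A toy sketch can show how an ontology lets a machine tell an iceberg from an ice core even though both names contain the word "ice". The class hierarchy below is hypothetical and greatly simplified. It is not taken from SWEET or ENVO, whose actual structures are far richer.

```python
# Hypothetical "is a kind of" links, invented for illustration only.
SUBCLASS_OF = {
    "iceberg": "ice mass",
    "ice mass": "environmental feature",
    "ice core": "material sample",
    "material sample": "research artifact",
}


def ancestors(term):
    """Walk up the hierarchy and list every broader class of a term."""
    chain = []
    while term in SUBCLASS_OF:
        term = SUBCLASS_OF[term]
        chain.append(term)
    return chain


def share_ancestor(a, b):
    """True if two terms have at least one broader class in common."""
    return bool(set(ancestors(a)) & set(ancestors(b)))


print(ancestors("iceberg"))
print(ancestors("ice core"))
print(share_ancestor("iceberg", "ice core"))
```

A plain text match would group the two terms together. The explicit links are what tell a program that one is a feature of the environment and the other is a sample taken for study, and writing those links is the expert work described above.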

The ESIP Winter Meeting gave me much more insight into the important role Semantics plays in Earth Science research. After attending the meeting, I am even more excited for this next year of being an ESIP Community Fellow and for helping with the work of the Semantic Technologies Committee. The work done by ESIP members enables the type of research I do as a Ph.D. student, and I am excited for the opportunity to help further ESIP’s mission. Together I feel that we can build a truly great symphony. I would like to thank everyone at the meeting for being so open to me and all the community fellows. Many conferences can leave junior researchers feeling adrift, but the other fellows and I were amazed at how many people came to talk to us, heard our ideas, and offered us help with our research.

If you are interested in what the Semantic Technologies Committee is working on, please join our email list and find out how to attend our regular meetings (currently on the 4th Tuesday of the month at 4 pm ET):

Watch Lesley Wyborn's full talk at the ESIP Winter Meeting here (starts 30 minutes in).

More about Zachary:
For his PhD at North Carolina State University, Zachary plans to develop models to forecast insect disturbance and forest biogeochemical states. This work relies on creating data science tools to integrate climate models, soils inventories, atmospheric deposition data, and forest imputations to generate a multivariate analysis of when and where outbreaks are likely to occur. The data that drives these models comes from many of the organizational members of ESIP. Zachary is working with the Semantic Technologies Committee.