Metamojis: metadata with personality

Why metamojis?

Data flags are commonly used throughout the earth sciences to communicate data quality and provide other helpful information to users.

SeaDataNet, for instance, uses numeric codes to convey information about data quality:

0 – No QC
1 – Good value
2 – Probably good
4 – Bad value
6 – Below Detection

Similarly, Andrews LTER uses simple letter-based codes:

A – Accepted
E – Estimated
M – Missing
Q – Questionable
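
As a rough illustration of how such codes are used in practice, the two lists above can be held as lookup tables. This is only a sketch: the dictionaries contain just the codes listed here, and the `describe` helper and sample readings are made up for the example.

```python
# Flag dictionaries copied from the two lists above (only the codes shown there).
SEADATANET_FLAGS = {
    0: "No QC",
    1: "Good value",
    2: "Probably good",
    4: "Bad value",
    6: "Below Detection",
}
ANDREWS_LTER_FLAGS = {
    "A": "Accepted",
    "E": "Estimated",
    "M": "Missing",
    "Q": "Questionable",
}


def describe(flag, dictionary):
    """Look up a flag; without the dictionary the code itself says nothing."""
    return dictionary.get(flag, "unknown flag")


# Hypothetical (value, flag) pairs, invented for illustration.
readings = [(12.1, 1), (11.9, 2), (99.9, 4)]
for value, flag in readings:
    print(value, flag, describe(flag, SEADATANET_FLAGS))
```

Every consumer of the data needs a copy of the dictionary to make sense of a bare `4` or `Q`.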

But what if we could make things a little more exciting? What if instead of using numbers and letters, we used … emojis ⁉

While it may seem silly at first, emojis provide a number of attractive properties.

Self-describing:

Unlike letter- and number-based codes, emojis can often be interpreted at a glance. Even without a data dictionary, users can get the gist of what an emoji-based flag means. A user who sees one emoji flag, for instance, might guess that it means good data quality, whereas another suggests poor data quality, and so on.

Unicode:

Each emoji is defined in the Unicode standard, either as a single code point or as a short sequence of code points. This means that emojis can be used in any context where Unicode is supported (basically everywhere on the internet). Moreover, this makes emojis easy to search, parse, or index.
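
A minimal Python sketch of what "easy to search, parse or index" can look like. The two emojis used here (U+2705 and U+274C) are placeholder choices for illustration, not assignments taken from the registry, and the records are invented.

```python
import unicodedata

# Placeholder flag choices, written as escapes so the code points are explicit.
GOOD = "\u2705"  # WHITE HEAVY CHECK MARK
BAD = "\u274C"   # CROSS MARK

# Hypothetical records, each carrying a single emoji flag.
records = [
    {"value": 12.1, "flag": GOOD},
    {"value": 99.9, "flag": BAD},
    {"value": 11.8, "flag": GOOD},
]

# Searching by flag is an ordinary string comparison.
good_values = [r["value"] for r in records if r["flag"] == GOOD]


def codepoint_label(char):
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(char):04X} {unicodedata.name(char)}"


print(good_values)
print(codepoint_label(GOOD))
```

Because each flag has a standard name in the Unicode character database, a program can produce a readable label even without a project-specific dictionary.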

Lightweight:

Emojis can describe relatively complex concepts using a single character. At no more than 4 bytes per code point in UTF-8, an emoji is significantly more compact than a full textual description. For instance, using a single 4-byte emoji instead of “below detection” (15 bytes) would cut the number of bytes required to roughly a quarter.
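
The byte counts can be checked directly in Python. The emoji below (U+1F53D) is only a stand-in for whichever emoji the registry assigns to "below detection".

```python
text_flag = "below detection"
emoji_flag = "\U0001F53D"  # stand-in emoji, chosen for illustration only

# Number of bytes each flag occupies when encoded as UTF-8.
text_bytes = len(text_flag.encode("utf-8"))
emoji_bytes = len(emoji_flag.encode("utf-8"))

print(text_bytes, emoji_bytes, text_bytes / emoji_bytes)
```

Emojis built from several code points (for example, those that carry a variation selector) take more than 4 bytes, so the saving depends on which emoji is chosen.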

Accessible:

Let's face it, using emojis makes the task of flagging and annotating data a lot more fun! And at the risk of sounding old, it could also help to get a new generation of earth scientists excited about metadata.


Registry of metamojis

A list of some proposed metamojis is included below. To see the full registry, or to propose changes or additions, see the GitHub repo.


Data quality flags


Subjective data quality assessment

Quality assessment based on user’s judgment (e.g. manual inspection)

Description Emoji
Good

Probably good

Questionable

Bad


Objective data quality assessment

Quality assessment determined by objective criteria (e.g. automated scripts)

Description Emoji
Good

Probably good

Questionable

Bad


Data quality indicators

Metadata indicators related to data quality

Description Emoji
Raw Data

Processed Data

No Quality Control

Missing Data

Duplicate Data

Sensitive Data

Below Detection

Above Detection

Outlier

Spike

Sensor drift

Frozen value

Impossible value


Transformative processes

Metadata related to processes that change the data

Description Emoji
Manually edited

Rolling filter (convolution)

Interpolated


Non-transformative processes

Metadata related to processes that do not change the data

Description Emoji
Manually verified

Manually rejected

Manually flagged


Pending actions

Actions to be completed in the future

Description Emoji
To be categorized

To be archived

To be deleted

To be inspected

To be edited


Modifiers

Flags to qualify or expand upon other flags

Description Emoji
See annotation

Uncertainty about classification
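
One way modifiers might be combined with other flags is simple concatenation in the same field. The sketch below assumes that convention, which is not specified here, and uses placeholder emojis rather than the registry's actual assignments.

```python
# Hypothetical assignments for illustration; not taken from the registry.
FLAGS = {
    "\u2753": "Questionable",  # BLACK QUESTION MARK ORNAMENT
    "\u274C": "Bad",           # CROSS MARK
}
MODIFIERS = {
    "\U0001F4DD": "See annotation",  # MEMO
}


def parse_flags(cell):
    """Split a flag string into (flags, modifiers) lists of descriptions.

    Iterates code point by code point, so it only handles emojis that are
    a single code point. Unrecognized characters are ignored.
    """
    flags, modifiers = [], []
    for char in cell:
        if char in FLAGS:
            flags.append(FLAGS[char])
        elif char in MODIFIERS:
            modifiers.append(MODIFIERS[char])
    return flags, modifiers


# A questionable value that also points the reader to an annotation.
print(parse_flags("\u2753\U0001F4DD"))
```

Emojis made of several code points would need proper grapheme segmentation instead of a plain character loop.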


Data descriptors


Machine data types

Data type (for interpretation by computer programs, etc.)

Description Emoji
Integer

Floating point

String (ASCII)

String (Unicode)

Boolean


Observable properties ⚖


SI Base properties

Properties described by SI base units

Description Emoji
Length

Mass

Time

Temperature

Electric current

Luminous Intensity

Amount of substance