Why metamojis?

Data flags are commonly used throughout the earth sciences to communicate data quality and provide other helpful information to users.

SeaDataNet, for instance, uses numeric codes to convey information about data quality:

0 – No QC
1 – Good value
2 – Probably good
4 – Bad value
6 – Below Detection

Similarly, Andrews LTER uses simple letter-based codes.

A – Accepted
E – Estimated
M – Missing
Q – Questionable

But what if we could make things a little more exciting? What if, instead of using numbers and letters, we used … emojis 😛⁉

While it may seem silly at first, emojis provide a number of attractive properties.

Self-describing:

Unlike letter- and number-based codes, emojis are immediately interpretable. Even without a data dictionary, users can get the gist of what an emoji-based flag means. A user who sees a 😀 flag, for instance, might guess that it means good data quality, whereas 🙁 means poor data quality, and so on.

Unicode:

Each emoji is defined in the Unicode standard, and every metamoji listed below is a single Unicode character. This means that emojis can be used in any context where Unicode is supported (basically everywhere on the internet). Moreover, this feature makes emojis easy to search, parse or index.
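As a rough illustration of the search and indexing point, the sketch below builds a lookup from flag to row numbers. The `records` list and the `build_index` helper are hypothetical and not part of any metamoji tooling, and the sketch assumes every flag is a single code point, which holds for the flags used here.

```python
import unicodedata
from collections import defaultdict

# Hypothetical observations as (value, flag string) pairs.
records = [
    (12.1, "\u2705"),      # ✅ good
    (0.02, "\U0001F52C"),  # 🔬 below detection
    (98.7, "\u274C"),      # ❌ bad
    (11.8, "\u2705"),      # ✅ good
]

def build_index(rows):
    """Map each flag character to the row numbers where it appears."""
    index = defaultdict(list)
    for i, (_, flags) in enumerate(rows):
        for ch in flags:  # assumes one code point per flag
            index[ch].append(i)
    return dict(index)

index = build_index(records)
good_rows = index["\u2705"]

# Each flag also carries an official Unicode name that can be looked up.
microscope_name = unicodedata.name("\U0001F52C")
print(good_rows, microscope_name)
```

Because each flag is an ordinary character, plain string operations are enough; no special parsing library is needed.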

Lightweight:

Emojis can describe relatively complex concepts using a single character. UTF-8 encodes each character in 1 to 4 bytes (most emojis take 4), so an emoji is significantly more compact than a full textual description. For instance, using 🔬 instead of “below detection” cuts the flag from 15 bytes to 4, roughly a quarter of the size.
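The byte counts can be checked with a few lines of Python. This is only a quick sanity check; the variable names are illustrative.

```python
# Compare the UTF-8 size of a textual flag with its emoji equivalent.
text_flag = "below detection"
emoji_flag = "\U0001F52C"  # 🔬 microscope

text_bytes = len(text_flag.encode("utf-8"))
emoji_bytes = len(emoji_flag.encode("utf-8"))

print(text_bytes, emoji_bytes)  # 15 4
```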

Accessible:

Let’s face it, using emojis makes the task of flagging and annotating data a lot more fun! And at the risk of sounding old, it could also help to get a new generation of earth scientists excited about metadata.


Registry of metamojis

A list of some proposed metamojis is included below. To see the full registry, or to propose changes or additions, see the GitHub repo here.


Data quality flags 🚩


Subjective data quality assessment

Quality assessment based on the user’s judgment (e.g. manual inspection)

Description      Emoji
Good             😀
Probably good    😐
Questionable     🤔
Bad              🙁


Objective data quality assessment

Quality assessment determined by objective criteria (e.g. automated scripts)

Description      Emoji
Good             ✅
Probably good    🆗
Questionable     ❓
Bad              ❌


Data quality indicators

Metadata indicators related to data quality

Description          Emoji
Raw Data             🥩
Processed Data       🌭
No Quality Control   🚧
Missing Data         🔍
Duplicate Data       👯
Sensitive Data       🤐
Below Detection      🔬
Above Detection      🔭
Outlier              🛸
Spike                🦔
Sensor drift         🎈
Frozen value         🍦
Impossible value     🦄


Transformative processes

Metadata related to processes that change the data

Description                    Emoji
Manually edited                ✍
Rolling filter (convolution)   ✳
Interpolated                   〽


Non-transformative processes

Metadata related to processes that do not change the data

Description         Emoji
Manually verified   👍
Manually rejected   👎
Manually flagged    🤚


Pending actions

Actions to be completed in the future

Description         Emoji
To be categorized   🗃
To be archived      🗄
To be deleted       🗑
To be inspected     🛂
To be edited        🛃


Modifiers

Flags to qualify or expand upon other flags

Description                        Emoji
See annotation                     💬
Uncertainty about classification   🤷
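One way a modifier might be combined with another flag is to place both in the same flag string, for example ❓ followed by 💬. The sketch below assumes that convention, which this page does not define; the `MODIFIERS` mapping and the `split_flags` helper are hypothetical names used only for illustration.

```python
# Modifier flags from the table above; any other character is treated as a base flag.
MODIFIERS = {
    "\U0001F4AC": "see annotation",                    # 💬
    "\U0001F937": "uncertainty about classification",  # 🤷
}

def split_flags(flag_string):
    """Split a flag string into (base flags, modifier descriptions).

    Assumes one code point per flag.
    """
    base = [ch for ch in flag_string if ch not in MODIFIERS]
    mods = [MODIFIERS[ch] for ch in flag_string if ch in MODIFIERS]
    return base, mods

# "Questionable" plus "see annotation": ❓💬
base, mods = split_flags("\u2753\U0001F4AC")
print(base, mods)
```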


Data descriptors 🏷


Machine data types

Data type (for interpretation by computer programs, etc.)

Description        Emoji
Integer            🔢
Floating point     🕴
String (ASCII)     🔠
String (Unicode)   🔣
Boolean            🔟


Observable properties ⚖


SI Base properties

Properties described by SI base units

Description           Emoji
Length                📏
Mass                  🏋
Time                  ⏳
Temperature           🌡
Electric current      ⚡
Luminous Intensity    💡
Amount of substance   ⚗