Metamojis: metadata with personality

Matt Bartos

Jun 17, 2018

Community Fellows

Why metamojis?

Data flags are commonly used throughout the earth sciences to communicate data quality and provide other helpful information to users.

SeaDataNet, for instance, uses numeric codes to convey information about data quality:

0 – No QC
1 – Good value
2 – Probably good
4 – Bad value
6 – Below Detection

Similarly, Andrews LTER uses simple letter-based codes:

A – Accepted
E – Estimated
M – Missing
Q – Questionable

But what if we could make things a little more exciting? What if instead of using numbers and letters, we used … emojis ⁉

While it may seem silly at first, emojis have several attractive properties.

Self-describing:

Unlike letter- and number-based codes, emojis are immediately interpretable. Even without a data dictionary, users can get the gist of what an emoji-based flag means. A user who sees a ✅ flag, for instance, might guess that it means good data quality, whereas ❌ means poor data quality, and so on.

Unicode:

Emojis are part of the Unicode standard, and most are encoded as a single code point. This means that emojis can be used in any context where Unicode is supported (basically everywhere on the internet). It also makes emoji flags easy to search, parse, or index.
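As a rough illustration of the searching and indexing point, here is a minimal Python sketch. The record layout (timestamp, value, flag string) and the sample values are assumptions made up for this example; only the three flag emojis come from the registry below.

```python
import unicodedata
from collections import Counter

# Hypothetical records: (timestamp, value, flags). The layout is an assumption.
records = [
    ("2018-06-01T00:00", 1.2, "✅"),
    ("2018-06-01T01:00", 9.9, "❓"),
    ("2018-06-01T02:00", -5.0, "❌"),
    ("2018-06-01T03:00", 1.3, "✅"),
]

# Index: count how often each flag occurs across all records.
flag_counts = Counter(flag for _, _, flags in records for flag in flags)

# Search: ordinary substring matching is enough to filter on a flag.
bad_rows = [row for row in records if "❌" in row[2]]

# Parse: every flag maps to a code point and an official Unicode name.
for flag, count in flag_counts.items():
    print(flag, f"U+{ord(flag):04X}", unicodedata.name(flag), count)
```

Iterating over the flag string character by character assumes that each flag is a single code point. Emojis built from several code points (for example, with a variation selector or a skin-tone modifier) would need a dedicated segmentation step.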

Lightweight:

Emojis can describe relatively complex concepts using a single character. Encoded in UTF-8, an emoji typically takes 3 to 4 bytes (more for multi-code-point sequences), which is significantly more compact than a full textual description. For instance, replacing the 15-byte string “below detection” with a 4-byte emoji would cut the number of bytes required to roughly a quarter.
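The byte counts are easy to check. The sketch below is only an illustration: U+1F600 is used as a stand-in for any 4-byte emoji and is not claimed to be a registry flag, while ❌ is taken from the registry below.

```python
def utf8_size(text: str) -> int:
    """Return the number of bytes needed to store text as UTF-8."""
    return len(text.encode("utf-8"))

text_label = "below detection"
four_byte_emoji = "\U0001F600"   # stand-in for an emoji outside the Basic Multilingual Plane
three_byte_emoji = "❌"          # U+274C, inside the Basic Multilingual Plane

print(utf8_size(text_label))        # 15
print(utf8_size(four_byte_emoji))   # 4
print(utf8_size(three_byte_emoji))  # 3

# Ratio of the text label to a 4-byte emoji: 15 / 4 = 3.75
ratio = utf8_size(text_label) / utf8_size(four_byte_emoji)
print(ratio)
```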

Accessible:

Let's face it, using emojis makes the task of flagging and annotating data a lot more fun! And at the risk of sounding old, it could also help to get a new generation of earth scientists excited about metadata.

Registry of metamojis

A list of some proposed metamojis is included below. To see the full registry, or to propose changes or additions, see the project's GitHub repo.

Data quality flags

Subjective data quality assessment

Quality assessment based on user’s judgment (e.g. manual inspection)

Description	Emoji
Good
Probably good
Questionable
Bad

Objective data quality assessment

Quality assessment determined by objective criteria (e.g. automated scripts)

Description	Emoji
Good	✅
Probably good
Questionable	❓
Bad	❌
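To make the idea of objective criteria concrete, here is a minimal sketch of an automated range check in Python. The function name `assess`, the valid range, and the margin are all hypothetical and chosen purely for illustration; they are not part of the registry.

```python
GOOD, QUESTIONABLE, BAD = "✅", "❓", "❌"

def assess(value: float, lower: float, upper: float, margin: float) -> str:
    """Flag a reading using a hypothetical range check.

    Inside [lower, upper] is good, within `margin` of the range is
    questionable, and anything further out is bad.
    """
    if lower <= value <= upper:
        return GOOD
    if lower - margin <= value <= upper + margin:
        return QUESTIONABLE
    return BAD

# Made-up readings checked against an assumed valid range of 0 to 30.
readings = [12.5, 31.0, 55.0]
flags = [assess(v, lower=0.0, upper=30.0, margin=5.0) for v in readings]
print(flags)  # ['✅', '❓', '❌']
```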

Data quality indicators

Metadata indicators related to data quality

Description	Emoji
Raw data
Processed data
No quality control
Missing data
Duplicate data
Sensitive data
Below detection
Above detection
Outlier
Spike
Sensor drift
Frozen value
Impossible value

Transformative processes

Metadata related to processes that change the data

Description	Emoji
Manually edited	✍
Rolling filter (convolution)	✳
Interpolated	〽
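As one possible way to apply a transformative flag, the sketch below fills interior gaps by linear interpolation and marks each filled value with 〽. The function name `interpolate_gaps` and the convention of keeping flags in a list parallel to the data are assumptions for this example.

```python
INTERPOLATED = "〽"

def interpolate_gaps(values):
    """Linearly fill interior None gaps; return (filled, flags).

    Gaps at the start or end of the series are left as None.
    """
    filled = list(values)
    flags = ["" for _ in values]
    known = [i for i, v in enumerate(values) if v is not None]
    # Walk over each pair of neighbouring known points and fill between them.
    for left, right in zip(known, known[1:]):
        span = right - left
        for i in range(left + 1, right):
            frac = (i - left) / span
            filled[i] = values[left] + frac * (values[right] - values[left])
            flags[i] = INTERPOLATED
    return filled, flags

filled, flags = interpolate_gaps([1.0, None, 3.0, None, 5.0])
print(filled)  # [1.0, 2.0, 3.0, 4.0, 5.0]
print(flags)   # ['', '〽', '', '〽', '']
```

Keeping the flag next to each value means a downstream user can tell measured values from filled ones without consulting a processing log.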

Non-transformative processes

Metadata related to processes that do not change the data

Description	Emoji
Manually verified
Manually rejected
Manually flagged

Pending actions

Actions to be completed in the future

Description	Emoji
To be categorized
To be archived
To be deleted
To be inspected
To be edited

Modifiers

Flags to qualify or expand upon other flags

Description	Emoji
See annotation
Uncertainty about classification

Data descriptors

Machine data types

Data type (for interpretation by computer programs, etc.)

Description	Emoji
Integer
Floating point
String (ASCII)
String (Unicode)
Boolean

Observable properties ⚖

SI base properties

Properties described by SI base units

Description	Emoji
Length
Mass
Time	⏳
Temperature
Electric current	⚡
Luminous intensity
Amount of substance	⚗
