WeatherXM QoD (Quality of Data), Simplified

Panayotis Vryonis
Published in WeatherXM
3 min read · Sep 28, 2023


Our mission at WeatherXM is to collect high-quality, hyperlocal weather data for every corner of the world and make it universally accessible and useful.

To make the data collected from our weather stations around the world useful for a wider range of applications, we need a reliable way to gauge its quality. Data quality can be affected by various factors, ranging from less-than-ideal installations and technical glitches to individuals attempting to manipulate the system for their own ends.

That’s where QoD (Quality of Data) comes into play: It’s the method we use to assess the quality of each data point we receive from WeatherXM stations. QoD involves a series of techniques and processes designed to help us distinguish between expected and unexpected data behaviors.

Version 1 (v1) of our QoD algorithm, set to go live in the coming days, handles time series data with varying intervals and can process data from weather stations with different capabilities and specifications. It includes two types of checks:

  • Out-of-Bounds Checks (OBC): These identify values that fall outside the limits specified by the manufacturer of each sensor.
  • Self-Quality Checks (SQC): These are on the lookout for unnaturally flat values, spikes or drops in reported measurements, and lack of sufficient data to compute an average.
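To make the two check types a little more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the actual QoD v1 code: the function names, the temperature limits, and the thresholds (`tolerance`, `max_step`, `min_fraction`) are all hypothetical values chosen for the example.

```python
# Hypothetical manufacturer limits for a temperature sensor, in degrees Celsius.
TEMP_MIN_C, TEMP_MAX_C = -40.0, 60.0


def out_of_bounds(values, lower=TEMP_MIN_C, upper=TEMP_MAX_C):
    """OBC sketch: one flag per value, True when it lies outside [lower, upper]."""
    return [not (lower <= v <= upper) for v in values]


def is_flat(values, min_points=3, tolerance=0.0):
    """SQC sketch: a window is 'flat' when it has enough points and they vary
    by no more than `tolerance` (an assumed threshold)."""
    if len(values) < min_points:
        return False
    return max(values) - min(values) <= tolerance


def jumps(values, max_step):
    """SQC sketch: one flag per value, True when it differs from the previous
    value by more than `max_step` (covers both spikes and drops)."""
    if not values:
        return []
    flags = [False]  # the first value has no predecessor to compare against
    for prev, cur in zip(values, values[1:]):
        flags.append(abs(cur - prev) > max_step)
    return flags


def enough_data(values, expected_count, min_fraction=0.5):
    """SQC sketch: True when at least `min_fraction` of the expected points
    arrived, so an average is worth computing (assumed rule)."""
    return len(values) >= min_fraction * expected_count


readings = [21.4, 21.5, 85.0, 21.6]
print(out_of_bounds(readings))        # the 85.0 reading is out of bounds
print(jumps(readings, max_step=5.0))  # the jump up and the drop back are flagged
```

Note that a simple step-based check like `jumps` flags both the spike and the return to normal, so a real implementation would need extra logic to decide which of the two points is actually faulty.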

The outcome of this process is an hourly report for each station, offering two key pieces of information:

  • the percentage of valid data points reported
  • text annotations per meteorological variable (temperature, relative humidity, wind speed, wind direction, atmospheric pressure, and illuminance) that describe issues in data quality.
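As a rough picture of what such an hourly report could look like in code, here is a short Python sketch. The `HourlyReport` structure, the `build_hourly_report` helper, the station ID, and the issue labels are hypothetical names made up for this example; the real report format may differ.

```python
from dataclasses import dataclass, field


@dataclass
class HourlyReport:
    station_id: str
    valid_percent: float  # share of data points that passed every check
    annotations: dict = field(default_factory=dict)  # variable -> issue labels


def build_hourly_report(station_id, labels_by_variable):
    """Summarize one hour of checks.

    `labels_by_variable` maps a variable name to a list with one entry per
    data point: None for a valid point, or a short issue label otherwise.
    """
    total = 0
    valid = 0
    annotations = {}
    for variable, labels in labels_by_variable.items():
        total += len(labels)
        valid += sum(1 for label in labels if label is None)
        issues = sorted({label for label in labels if label is not None})
        if issues:
            annotations[variable] = issues
    valid_percent = 100.0 * valid / total if total else 0.0
    return HourlyReport(station_id, valid_percent, annotations)


report = build_hourly_report(
    "station-001",  # hypothetical station ID
    {
        "temperature": [None, None, "spike", None],
        "relative_humidity": [None, None, None, None],
    },
)
print(report.valid_percent)  # 87.5
print(report.annotations)    # {'temperature': ['spike']}
```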

QoD v1 represents a major milestone for us, serving a dual purpose.

Firstly, it empowers us to build an “enhanced dataset” composed exclusively of data that meet specific quality criteria. This enhanced dataset is incredibly valuable for the commercial applications of WeatherXM data.

Secondly, it provides a transparent and verifiable mechanism for assigning a daily QoD score to each station. This score plays a key role in our station rewards system: Stations providing high-quality data receive full rewards, while those with lower QoD scores get less.
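The exact scoring and reward formulas are beyond the scope of this overview, but the idea can be sketched in a few lines of Python. Everything here is an assumption made for illustration: averaging the hourly percentages into a daily score, and scaling rewards linearly with that score, are simple stand-ins and not necessarily the rules WeatherXM applies.

```python
def daily_qod_score(hourly_valid_percents):
    """Average the hourly valid percentages into a score between 0.0 and 1.0.

    Averaging is an assumed aggregation, used here only for illustration.
    """
    if not hourly_valid_percents:
        return 0.0
    return sum(hourly_valid_percents) / len(hourly_valid_percents) / 100.0


def scaled_reward(base_reward, qod_score):
    """Scale a base reward linearly by the score (an assumed rule)."""
    clamped = min(max(qod_score, 0.0), 1.0)
    return base_reward * clamped


score = daily_qod_score([100.0, 50.0])
print(score)                       # 0.75
print(scaled_reward(10.0, score))  # 7.5
```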

This is a high-level, simplified overview of the process. If you’re eager to dive into the nitty-gritty details, including how we handle varying data rates or the out-of-bounds and self-quality checks, we have a comprehensive technical description of the process available.

Coming up in the next versions:

  • Indoors Station Detector (ISD): Detects stations that have not been deployed outdoors by analysing and comparing observed and theoretical solar irradiance.
  • Solar Obstacle Detector (SOD): Uses solar irradiance to detect obstacles that may affect irradiance readings, as well as sub-optimal station placement.
  • Comparative Quality Check (CQC): Evaluates a weather station’s accuracy by comparing its observations with those of a neighbouring reference WeatherXM station or a third-party network.

We have great things coming up in the next few months! Now is the time to get your WeatherXM station!

Join us! Website / Discord / Twitter / Facebook / LinkedIn
