
Use median instead of mean #61

Open
wants to merge 1 commit into
base: master
from

Conversation

joshmh

@joshmh joshmh commented May 21, 2018

Oftentimes, some cells on the map turn orange or red just because of one faulty device. This gives an unrealistic view of air quality in the region. The median is a more robust statistic than the mean: it is largely unaffected by a few extreme values, while still representing the consensus of sensors in the region.
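
To illustrate the effect described above, here is a minimal sketch in Python (not the project's actual code). The readings are made-up PM10 values for a single map cell, with the last one standing in for a faulty device; the variable names are hypothetical.

```python
from statistics import mean, median

# Hypothetical PM10 readings (µg/m³) from the sensors in one map cell.
# The last value stands in for a single faulty device.
readings = [12.0, 14.0, 13.0, 15.0, 11.0, 480.0]

cell_mean = mean(readings)      # pulled far upward by the one extreme value
cell_median = median(readings)  # stays close to what most sensors report

print(f"mean:   {cell_mean:.1f}")
print(f"median: {cell_median:.1f}")
```

With these made-up numbers, the mean lands around 90 while the median stays at 13.5, which is why a single bad sensor can recolor a whole cell under the mean but not under the median.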

@joshmh
Author

joshmh commented May 21, 2018

Here is a before and after comparison, top is with mean, bottom is with median.

[Screenshot: map rendered with mean]

[Screenshot: map rendered with median]

@joshmh
Author

joshmh commented May 21, 2018

And here is an example of the offending sensor (Dresden area):

[Screenshot: the offending sensor in the Dresden area]

@BrunoKestemont

Not "Use median instead of mean". For air quality, extreme values are the most important ones (if they aren't outliers).
1) Instead, there should be options to display the mean, the max, and percentiles (25-50-75-90-95), which would include the median.
2) If a max value seems unlikely (sensor problem), the sensor should be verified and the data deleted. If the sensor is correct, it is extremely important to keep the value (and to keep including it in the mean).
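
The first suggestion above could look roughly like the following Python sketch. It is only an illustration under stated assumptions: `percentile` and `summarize_cell` are hypothetical helpers, the readings are made up, and linear interpolation is just one of several common percentile definitions.

```python
from statistics import mean

def percentile(sorted_values, p):
    """Linear-interpolation percentile (p in 0..100) of a pre-sorted list."""
    if not sorted_values:
        raise ValueError("no readings")
    rank = (len(sorted_values) - 1) * p / 100
    lower = int(rank)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = rank - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction

def summarize_cell(readings):
    """Return mean, max and the 25/50/75/90/95 percentiles for one map cell."""
    ordered = sorted(readings)
    summary = {"mean": mean(ordered), "max": ordered[-1]}
    for p in (25, 50, 75, 90, 95):
        summary[f"p{p}"] = percentile(ordered, p)
    return summary

# Hypothetical readings for one cell; a display option could then pick any key.
summary = summarize_cell([12.0, 14.0, 13.0, 15.0, 11.0, 480.0])
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Exposing all of these values keeps the extreme reading visible (as `max` and in the upper percentiles) while still offering the median (`p50`) as one display choice.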


2 participants