MOS-X

MOS-X is a machine learning-based forecasting model, written in Python, that produces output tailored for the WxChallenge weather forecasting competition. It uses an external executable to download and process time-height profiles of model data from the National Centers for Environmental Prediction (NCEP) Global Forecast System (GFS) and North American Mesoscale (NAM) models. These data, along with surface observations from MesoWest, are used to train any of scikit-learn's ML algorithms to predict tomorrow's high temperature, low temperature, peak 2-minute sustained wind speed, and rain total.
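
As a rough illustration of the idea (not MOS-X's actual code), the sketch below trains a scikit-learn random forest to predict four targets at once. The data are random stand-ins, and the array shapes, variable names, and number of predictor columns are assumptions made up for this example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic stand-ins (assumption): 200 training days, 12 predictor columns
# representing flattened model profiles and surface observations.
X_train = rng.normal(size=(200, 12))

# Four targets per day: high temp, low temp, peak wind speed, rain total.
y_train = rng.normal(size=(200, 4))

# RandomForestRegressor accepts a 2-D target array and fits all four
# outputs in a single model.
model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(X_train, y_train)

# Predict tomorrow's four values from one new row of predictors.
X_today = rng.normal(size=(1, 12))
prediction = model.predict(X_today)
print(prediction.shape)
```

Any other scikit-learn regressor that supports multi-output targets could be swapped in for the random forest here.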

Installing

Requirements

Python packages - easier with conda

Installation

Nothing to do really. Just make sure the scripts in the main directory (build, run, verify, validate, and performance) are executable, for example:

chmod +x build run verify validate performance

Building a model

  1. The first thing to do is to set up the config file for the site you want to forecast. The default.config file has a good number of comments describing how to do that. Any parameter that is not marked 'optional' and has no default value must be specified.
    • The parameter climo_station_id is now automatically generated!
    • Using the upper-air sounding data option is not recommended. In my testing, adding sounding data made no difference to the skill of the models, but YMMV. Use with caution. I don't test it.
  2. Once the config is set up, build the model using build <config>. The config reader will automatically look for <config>.config too, so if you're like me and like to call your config files KSEA.config, it's handy to just pass KSEA.
    • Depending on how much training data is requested, it may take several hours for BUFRgruven to download everything.
    • Actually building the scikit-learn model, however, takes only 10 minutes for a 1000-tree random forest on a 16-core machine.
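
The config name lookup described in step 2 might work along these lines. This is a minimal sketch, not the MOS-X config reader: the function name `resolve_config`, the `search_dir` parameter, and the order in which the two candidates are tried are all assumptions for illustration.

```python
from pathlib import Path


def resolve_config(name, search_dir="."):
    """Return the path of an existing config file for `name`.

    Hypothetical helper: tries `name` exactly as given, then falls back
    to `name` with '.config' appended.
    """
    base = Path(search_dir)
    for candidate in (base / name, base / f"{name}.config"):
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"no config file found for {name!r}")
```

With a file named KSEA.config in the working directory, both `resolve_config("KSEA")` and `resolve_config("KSEA.config")` would return the same path.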

Running the model

Some notes on advanced model configurations