Machine learning detection of high-voltage grid (ml-hv-grid)

Scope & overall goal (from proposal)

Development Seed proposed to use machine-assisted tracing to generate a high-voltage (HV) grid map for three countries: Pakistan, Nigeria and Zambia. A full report is available online. The methodology that we proposed is based on a pilot R&D effort described in the report Machine Learning for Africa’s Grid.

The goal here is to develop a cost-effective HV grid mapping approach that can be replicated in countries across the globe. At the end of this project, Development Seed is to publish and deliver the following to the energy team at the World Bank:

  1. a complete and accurate (within 10 meters) map of the HV transmission network in three priority countries in Africa and Asia, delivered in GeoJSON format and added directly to OSM;
  2. the training data and ML models;
  3. a thoroughly documented approach that would allow the Bank or other organizations to replicate this in other countries.

All data is stored within OpenStreetMap.

Machine learning approach

The machine learning (ML) goal here is to detect HV towers as accurately as possible in satellite imagery. We use zoom 18 tiles because the towers are large enough to be clearly visible at that resolution. The specific ML task is therefore a basic form of classification -- take any zoom 18 RGB tile and calculate the probability (on the interval [0, 1]) that it contains an HV tower. By default, we use a threshold of 0.5 as the cutoff, but this can be modified to adjust the false-positive rate. For training, several thousand images (from the DigitalGlobe standard layer) spanning all three target countries were manually checked by the Peru Data Team.
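The effect of the cutoff on the false-positive rate can be illustrated with a minimal sketch. The function names, the probabilities, and the labels below are hypothetical and are not taken from this repository; the sketch also assumes that a tile is flagged when its probability is greater than or equal to the threshold.

```python
def classify_tiles(probabilities, threshold=0.5):
    """Flag each tile whose predicted tower probability meets the threshold."""
    return [p >= threshold for p in probabilities]


def false_positive_rate(labels, predictions):
    """Fraction of tower-free tiles (label False) that were flagged anyway."""
    flagged_negatives = [pred for label, pred in zip(labels, predictions) if not label]
    if not flagged_negatives:
        return 0.0
    return sum(flagged_negatives) / len(flagged_negatives)


# Hypothetical model outputs and hand-checked labels for six tiles.
probs = [0.05, 0.40, 0.55, 0.62, 0.91, 0.97]
labels = [False, False, False, True, True, True]

# Raising the cutoff can only keep or reduce the number of flagged tiles.
for threshold in (0.5, 0.6):
    preds = classify_tiles(probs, threshold)
    print(threshold, false_positive_rate(labels, preds))
```

With this toy data, moving the cutoff from 0.5 to 0.6 removes the one tower-free tile that was flagged. On real data, a higher cutoff would typically also cost some true detections.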

We used an ML model called the "Xception" network that was pre-trained on ImageNet -- a large corpus of natural images used as a benchmark by the ML community. The model is retrained using satellite imagery (i.e., we use a "transfer learning" approach) to detect HV towers. We evaluate performance based on raw detection accuracy, false- and true-positive rates, and the ROC curve.
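The evaluation metrics named above can be sketched in plain Python. This is an illustrative sketch only, not the evaluation code used in this project: the function names and the toy data are hypothetical, and a real workflow would more likely rely on an established metrics library.

```python
def confusion_counts(labels, probs, threshold):
    """Count true/false positives and negatives at a given cutoff."""
    tp = fp = tn = fn = 0
    for label, p in zip(labels, probs):
        predicted = p >= threshold
        if predicted and label:
            tp += 1
        elif predicted and not label:
            fp += 1
        elif label:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def evaluate(labels, probs, threshold=0.5):
    """Raw accuracy plus true- and false-positive rates at one threshold."""
    tp, fp, tn, fn = confusion_counts(labels, probs, threshold)
    return {
        "accuracy": (tp + tn) / len(labels),
        "tpr": tp / (tp + fn) if tp + fn else 0.0,
        "fpr": fp / (fp + tn) if fp + tn else 0.0,
    }


def roc_points(labels, probs):
    """(FPR, TPR) pairs obtained by sweeping the cutoff over every distinct score."""
    # An infinite threshold flags nothing, so the curve starts at (0, 0).
    thresholds = [float("inf")] + sorted(set(probs), reverse=True)
    points = []
    for t in thresholds:
        metrics = evaluate(labels, probs, t)
        points.append((metrics["fpr"], metrics["tpr"]))
    return points


def area_under_curve(points):
    """Trapezoidal area under the ROC curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical hand-checked labels and model probabilities for six tiles.
labels = [False, False, True, False, True, True]
probs = [0.05, 0.40, 0.45, 0.55, 0.91, 0.97]

print(evaluate(labels, probs))
print(area_under_curve(roc_points(labels, probs)))
```

Each point on the ROC curve corresponds to one choice of cutoff, so the curve summarizes the trade-off between true and false positives across all thresholds rather than at 0.5 alone.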

With a trained model, the final task is to run predictions over entire countries (tiled to zoom 18). With this prediction map, the data team can prioritize its time and focus only on areas with a high predicted probability of containing an HV tower.
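As a rough illustration of this step, the sketch below lists the zoom 18 tiles covering a bounding box using the standard Web Mercator ("slippy map") tile formula, then ranks tiles by predicted probability. The function names, the bounding box, and the probabilities are hypothetical; the project's actual tiling and prediction code may differ.

```python
import math


def lonlat_to_tile(lon, lat, zoom=18):
    """Convert a longitude/latitude pair to slippy-map tile indices (x, y)."""
    n = 2 ** zoom
    x = math.floor((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = math.floor((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y


def tiles_in_bbox(west, south, east, north, zoom=18):
    """Yield every (x, y, zoom) tile that overlaps the bounding box."""
    # Tile y indices grow southward, so the north-west corner gives the minima.
    x_min, y_min = lonlat_to_tile(west, north, zoom)
    x_max, y_max = lonlat_to_tile(east, south, zoom)
    for x in range(x_min, x_max + 1):
        for y in range(y_min, y_max + 1):
            yield x, y, zoom


def prioritize(tile_probs, threshold=0.5):
    """Keep tiles at or above the cutoff, most probable first."""
    hits = [(tile, p) for tile, p in tile_probs.items() if p >= threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Hypothetical small area near (0, 0); a whole country yields far more tiles.
tiles = list(tiles_in_bbox(0.0, -0.003, 0.003, 0.0))
print(len(tiles))

# Hypothetical predictions keyed by tile.
tile_probs = {(1, 1, 18): 0.2, (1, 2, 18): 0.9, (2, 1, 18): 0.6}
print(prioritize(tile_probs))
```

Sorting by probability is one simple way to order the review queue; grouping neighboring high-probability tiles into larger areas would be another option.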

Files and workflow