Offline-MapMatching

A QGIS-plugin for matching a trajectory with a network using a Hidden Markov Model and Viterbi algorithm.

The source code is still under developement and does not show the version available within QGIS.

Goal of the project

Matching a trajectory with a network, e.g. a road network, is a classical task in geoinformatics. It can be differentiated into online, i.e. during the measurement of the trajectory like in car navigation, and offline, i.e. after the measurement of the trajectory, map matching. This plugin addresses only the offline map mathing.
Because of inaccuracies of the GNSS signal and/or a low data quality of the network, the positions of the trajectory and the network are differing. In many cases a simple snapping of the trajectory on the network will not work. Reasons for that can be e.g. outliers in the trajectory or crossings of edges in the network.
This plugin provides a statistical approach to solve the problem of offline map matching using the principles of Hidden Markov Models (HMM) and the Viterbi algorithm.

Installation

Usage

Match Trajectory


Clip Network


Reduce Trajectory Density


Hints for usage of the map matching

Hints for developement

Description of the computation of the map matching

First, the plugin calculates possible candidate points for each point of the trajectory (observations) (Budig 2012: 10). The candidates will be arranged in a candidate graph with n layers, n = count of observation points (Budig 2012: 13). After calculating different posibilities, the path with the highest posibility will be found using the Viterbi algorithm. Below the different parts of the computation in a nutshell, followed by explicit descriptions of each step:

Read the input layers

The network and the trajectory layers will be imported into the plugin and the internal structur of objects will be initialised. The trajectory cannot be imported, if the layer has no attribute field. Each position of the trajectory is called observation and one part of the five tupel of a Hidden Markov Model (HMM).

Calculate candidate points/create candidate graph

A candidate point is a possible position of an observation on the network. The distance to each linestring of the network will be calculated for every observation. If the distance is less or equal the maximum search distance, a candidate as the nearest point on the linestring to the observation will be extracted and added to the candidate graph (Budig 2012: 14):


"A candidate graph is a directed acyclic graph (DAG) which represents all paths that are considered possible final matching results for the given GPS trajectory" (Budig 2012: 13). The graph will be used as a stochastic matrix for the HMM. The graph has n layers with n as the count of observations. Each column contains all candidates of the same/corresponding observation (Budig 2012: 13):


The calculation of candidates has a huge impact of the computation time. A higher search distance can result in more candidates per observation and more possible paths, which will be calculated in the next steps. Also a high segmentation of the network can cause a lot of candidates per observation with the same problem for the computation time.

Calculate emission probabilities

A GNSS-measured position is not the real position because of the inaccuracy of the GNSS signal. The GNSS-errors are following almost, but not strictly, a normal distribution (Newson, Krumm 2009: 4). Also the quality of the network geometry can be very low. Therefore the candidates are more or less away from the observations. To calculate the probability, that an observation was emitted by a candidate, a normal distribution is postulated, i.e. the probability is pending on the distance between observation and candidate, the standard deviation of the GNSS-error and the expected value of the difference (Newson, Krumm 2009: 4). The emission probabilities for each candidate are one part of the five tupel of a Hidden Markov Model (Haenelt 2007: 4-5) and the emission probabilities of the candidates of the first observation will be used as initial state probabilities in the HMM and represent another part of the five tupel (Raymond et al. 2012: 2244). The expected value and the standard deviation can be set by the user. The expected value represents the best possible or expected distance between a candidate and an observer. In most cases, this should be a value of zero (Newson, Krumm 2009: 3). If the standard deviation of the GNSS-receiver is unknown, it has to be estimated e.g. with this equation (Newson, Krumm 2009: 6):


The term (z - x) means the distance between an observation and its candidate. I reached good results to set the standard deviation to around 2 * (maximum search distance). A larger value represents less trust in the measurement. A too high or too low standard deviation can overemphasise or underemphasise the emission probability (compared to the transition probability) and maybe results in an incorrect map matching.

Calculate transition probabilities

Another and the last tupel of the HMM are the transition probabilities whereat transitions are connections between two candidates in the graph, i.e. the edges of the graph. Two characteristics of the transition between two candidates and the transition between the two corresponding observations will be compared:

  1. the distances
  2. the directions

    The difference between the euklidean distance of the two observations and the distance on the network of two correpsonding candidates can be described by an exponential probability distribution (Newson, Krumm 2009: 4):


    The distance on the network will be calculated using the analysis framework of QGIS and the Dijkstra algorithm with the distance between nodes as edge weight. The difference of the two distances will be used in the following equation, in which the parameter d is the difference of the distances (Newson, Krumm 2009: 4):


    The parameter beta (named "Transition weight" in the GUI of the plugin) can be estimated as around 1.5 (median of all distances) (Newson, Krumm 2009: 6). I get good results setting beta = 1.5 (maximum search distance). A larger value of beta represents more tolerance of non-direct routes and vice-versa.

    The directions will be calculated using the slope of the line through both observations and the product of the slops of each subline, i.e. each line between two following vertices, of the shortest way from one candidate to the other candidate on the network. The angle is the arcustanges of the slope and will have a value between 0 and 180 (the python function atan is used and the result will be summed up by 90 to avoid negative values). The difference of this two angles will be normalised to receive a value between 0 and 1 (Budig 2012: 17):

Viterbi algorithm

All probabilities (initial, emission and transition) along each path will be multiplied in an efficient way by using the Viterbi algorithm. The Viterbi algorithm uses results of former calculations again to avoid multiple calculations of same sequences. Also just the highest probabilities to reach a candidate will be stored and reused (Haenelt 2007: 12). The most likely path through the HMM, i.e. the path through the candidate graph with the highest probability, is the result of this algorithm.

Store the path in a new layer

The founded Viterbi path represents the vertices of the resulting line. A routing between following points of the Viterbi path returns linestrings, which will be written to a new memory layer. If you start the plugin with its GUI, i.e. not from the toolbox, the plugin provides a style file, so that the new layer will be displayed as a red line.

Attribute Table of the resulting layer

id

an integer value as counter of the linestrings

observation_id_start

the id of the trajectory feature, which corresponds to the start vertex

observation_id_end

the id of the trajectory feature, which corresponds to the end vertex

emission_probability_start

the emission probability of the start vertex

emission_probability_end

the emission probability of the end vertex

transition_probability

the transition probability of the current linestring

total_probability_start

the total probability of the start vertex of the current linestring, i.e. the product of the emission and transition probability of the start vertex and the total probability of the parent vertex/candidate

total_probability_end

the total probability of the end vertex of the current linestring, i.e. the product of the emission and transition probability of the end vertex and the total probability of the parent vertex/candidate (should be the start vertex of the current linestring)

Screenshots of matched and unmatched trajectories

not successful matched successful matched successful matched legend
st. deviation = 50, max. search dist. = 20, transition weight = 30 st. deviation = 50, max. search dist. = 20, transition weight = 35 st. deviation = 60, max. search dist. = 50, transition weight = 30

Further Description

More information about the calculations can be found in the following paper (in german language with an english abstract):
Jung, C. (2019): Offline-MapMatching – A QGIS Plugin for Matching a Trajectory with a Network. In: AGIT ‒ Journal für Angewandte Geoinformatik 5-2019 (pp. 156-163). Berlin/Offenbach: Wichmann Verlag. doi: 10.14627/537669014

Sources

Newson, P., & Krumm, J. (2009). Hidden Markov map matching through noise and sparseness. In Proceedings of the 17th ACM SIGSPATIAL international conference on advances in geographic information systems (pp. 336-343). ACM. Retrieved Oct 10, 2018 from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.187.5145&rep=rep1&type=pdf

Budig, B. (2012). An algorithm for map matching on incomplete road databases. Retrieved Jul 31, 2018 from http://www1.pub.informatik.uni-wuerzburg.de/pub/theses/2012-budig-bachelor.pdf

Raymond, R., Morimura, T., Osogami, T., Hirosue, N. (2012). Map Matching with Hidden Markov Model on Sampled Road Network. 21st International Conference on Pattern Recognition (ICPR 2012). Retrieved Jul 31, 2018 from https://www.computer.org/csdl/proceedings/icpr/2012/2216/00/06460610.pdf

Haenelt, K. (2007). Der Viterbi-Algorithmus. Eine Erl\ufffduterung der formalen Spezifikation am Beispiel des Part-of-Speech Tagging. Retrieved Jul 31, 2018 from http://kontext.fraunhofer.de/haenelt/kurs/folien/Haenelt_Viterbi-Tutor.pdf

Other

The icons are based on products from icons8.com