A QGIS-plugin for matching a trajectory with a network using a Hidden Markov Model and Viterbi algorithm.
The source code is still under developement and does not show the version available within QGIS.
Matching a trajectory with a network, e.g. a road network, is a classical task in geoinformatics.
It can be differentiated into online, i.e. during the measurement of the trajectory like in car navigation,
and offline, i.e. after the measurement of the trajectory, map matching. This plugin addresses only the offline map mathing.
Because of inaccuracies of the GNSS signal and/or a low data quality of the network, the positions of the trajectory
and the network are differing. In many cases a simple snapping of the trajectory on the network will not work. Reasons for that
can be e.g. outliers in the trajectory or crossings of edges in the network.
This plugin provides a statistical approach to solve the problem of offline map matching using the principles of
Hidden Markov Models (HMM) and the Viterbi algorithm.
processing.run('omm:match_trajectory', {'NETWORK': 'network_layer',
'TRAJECTORY': 'trajectory_layer',
'TRAJECTORY_ID': 'id',
'CRS': '25833',
'SIGMA': 50.0,
'MY': 0.0,
'BETA': 30.0,
'MAX_SEARCH_DISTANCE': 20.0,
'OUTPUT': 'destination_in_filesystem'})
{'OUTPUT': 'destination_in_filesystem',
'ERROR_CODE': 0,
'COMPUTATION_TIME': 123.45}
processing.run('omm:clip_network', {'NETWORK': 'network_layer',
'TRAJECTORY': 'trajectory_layer',
'ORDER_FIELD': 'field_name',
'BUFFER_RADIUS': 20.0,
'OUTPUT': 'destination_in_filesystem'})
processing.run('omm:reduce_trajectory_density', {'TRAJECTORY': 'trajectory_layer',
'DISTANCE': 100.0,
'OUTPUT': ':memory'})
processing.run('omm:reduce_trajectory_density', {'TRAJECTORY': 'trajectory_layer',
'DISTANCE': 100.0,
'KEEP_LAST_FEATURE': True,
'OUTPUT': ':memory'})
First, the plugin calculates possible candidate points for each point of the trajectory (observations) (Budig 2012: 10). The candidates will be arranged in a candidate graph with n layers, n = count of observation points (Budig 2012: 13). After calculating different posibilities, the path with the highest posibility will be found using the Viterbi algorithm. Below the different parts of the computation in a nutshell, followed by explicit descriptions of each step:
The network and the trajectory layers will be imported into the plugin and the internal structur of objects will be initialised. The trajectory cannot be imported, if the layer has no attribute field. Each position of the trajectory is called observation and one part of the five tupel of a Hidden Markov Model (HMM).
A candidate point is a possible position of an observation on the network. The distance to each linestring of the network will be calculated for every observation. If the distance is less or equal the maximum search distance, a candidate as the nearest point on the linestring to the observation will be extracted and added to the candidate graph (Budig 2012: 14):
"A candidate graph is a directed acyclic graph (DAG) which represents all paths that are
considered possible final matching results for the given GPS trajectory" (Budig 2012: 13). The graph will be used as a stochastic matrix
for the HMM. The graph has n layers with n as the count of observations.
Each column contains all candidates of the same/corresponding observation (Budig 2012: 13):
The calculation of candidates has a huge impact of the computation time. A higher search distance can result in more candidates per
observation and more possible paths, which will be calculated in the next steps. Also a high segmentation of the network can cause
a lot of candidates per observation with the same problem for the computation time.
A GNSS-measured position is not the real position because of the inaccuracy of the GNSS signal. The GNSS-errors are following
almost, but not strictly, a normal distribution (Newson, Krumm 2009: 4). Also the quality of the network geometry
can be very low. Therefore the candidates are more or less away from the observations.
To calculate the probability, that an observation was emitted by a candidate, a normal distribution is postulated, i.e.
the probability is pending on the distance between observation and candidate, the standard deviation of the GNSS-error and
the expected value of the difference (Newson, Krumm 2009: 4).
The emission probabilities for each candidate are one part of the five tupel of a Hidden Markov Model (Haenelt 2007: 4-5) and the emission probabilities
of the candidates of the first observation will be used as initial state
probabilities in the HMM and represent another part of the five tupel (Raymond et al. 2012: 2244).
The expected value and the standard deviation can be set by the user. The expected value represents the best possible or expected distance between
a candidate and an observer. In most cases, this should be a value of zero (Newson, Krumm 2009: 3).
If the standard deviation of the GNSS-receiver is unknown, it has to be estimated e.g. with this equation (Newson, Krumm 2009: 6):
The term (z - x) means the distance between an observation and its candidate. I reached good results to set the standard deviation to around 2 * (maximum search distance). A larger value represents less trust in the measurement.
A too high or too low standard deviation can overemphasise or underemphasise the emission probability (compared to the transition
probability) and maybe results in an incorrect map matching.
Another and the last tupel of the HMM are the transition probabilities whereat transitions are connections between two candidates in the graph, i.e. the edges of the graph. Two characteristics of the transition between two candidates and the transition between the two corresponding observations will be compared:
The distance on the network will be calculated using the analysis framework of QGIS and the Dijkstra algorithm with the distance between nodes as
edge weight. The difference of the two distances will be used in the following equation, in which the parameter d is the difference of the distances (Newson, Krumm 2009: 4):
The parameter beta (named "Transition weight" in the GUI of the plugin) can be estimated as around 1.5 (median of all distances) (Newson, Krumm 2009: 6). I get good results setting beta = 1.5 (maximum search distance).
A larger value of beta represents more tolerance of non-direct routes and vice-versa.
The directions will be calculated using the slope of the line through both observations and the product of the
slops of each subline, i.e. each line between two following vertices, of the shortest way from one candidate
to the other candidate on the network. The angle is the arcustanges of the slope and will have a value between 0 and 180 (the python function atan is used and the
result will be summed up by 90 to avoid negative values). The difference of this two angles will be normalised to receive
a value between 0 and 1 (Budig 2012: 17):
All probabilities (initial, emission and transition) along each path will be multiplied in an efficient way by using the Viterbi algorithm. The Viterbi algorithm uses results of former calculations again to avoid multiple calculations of same sequences. Also just the highest probabilities to reach a candidate will be stored and reused (Haenelt 2007: 12). The most likely path through the HMM, i.e. the path through the candidate graph with the highest probability, is the result of this algorithm.
The founded Viterbi path represents the vertices of the resulting line. A routing between following points of the Viterbi path returns linestrings, which will be written to a new memory layer. If you start the plugin with its GUI, i.e. not from the toolbox, the plugin provides a style file, so that the new layer will be displayed as a red line.
an integer value as counter of the linestrings
the id of the trajectory feature, which corresponds to the start vertex
the id of the trajectory feature, which corresponds to the end vertex
the emission probability of the start vertex
the emission probability of the end vertex
the transition probability of the current linestring
the total probability of the start vertex of the current linestring, i.e. the product of the emission and transition probability of the start vertex and the total probability of the parent vertex/candidate
the total probability of the end vertex of the current linestring, i.e. the product of the emission and transition probability of the end vertex and the total probability of the parent vertex/candidate (should be the start vertex of the current linestring)
not successful matched | successful matched | successful matched | legend |
---|---|---|---|
st. deviation = 50, max. search dist. = 20, transition weight = 30 | st. deviation = 50, max. search dist. = 20, transition weight = 35 | st. deviation = 60, max. search dist. = 50, transition weight = 30 |
More information about the calculations can be found in the following paper (in german language with an english abstract):
Jung, C. (2019): Offline-MapMatching – A QGIS Plugin for Matching a Trajectory with a Network. In: AGIT ‒ Journal für Angewandte Geoinformatik 5-2019 (pp. 156-163). Berlin/Offenbach: Wichmann Verlag. doi: 10.14627/537669014
Newson, P., & Krumm, J. (2009). Hidden Markov map matching through noise and sparseness.
In Proceedings of the 17th ACM SIGSPATIAL international conference on advances in geographic information systems (pp. 336-343). ACM.
Retrieved Oct 10, 2018 from
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.187.5145&rep=rep1&type=pdf
Budig, B. (2012). An algorithm for map matching on incomplete road databases.
Retrieved Jul 31, 2018 from
http://www1.pub.informatik.uni-wuerzburg.de/pub/theses/2012-budig-bachelor.pdf
Raymond, R., Morimura, T., Osogami, T., Hirosue, N. (2012). Map Matching with Hidden Markov Model on Sampled Road Network.
21st International Conference on Pattern Recognition (ICPR 2012).
Retrieved Jul 31, 2018 from
https://www.computer.org/csdl/proceedings/icpr/2012/2216/00/06460610.pdf
Haenelt, K. (2007). Der Viterbi-Algorithmus. Eine Erl\ufffduterung der formalen Spezifikation am Beispiel des Part-of-Speech Tagging.
Retrieved Jul 31, 2018 from
http://kontext.fraunhofer.de/haenelt/kurs/folien/Haenelt_Viterbi-Tutor.pdf
The icons are based on products from icons8.com