We have a system that periodically captures the latitude and longitude of each vehicle's location. The interval between two consecutive samples ranges from 5 seconds to 2 minutes. The data for all 100 vehicles in the system is stored in a database.
We want to identify the various routes taken by the vehicles over a period of a few months and also identify the most frequented routes.
We can identify when the vehicle starts and when it shuts down. The series of lat-lon points between a start and the following stop is the route taken by the vehicle on that trip.
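To make the start/stop segmentation concrete, here is a minimal Python sketch. It assumes each stored sample is a tuple `(event, lat, lon)` where `event` is one of "start", "fix" or "stop"; that record layout and the function name `split_into_trips` are hypothetical, not part of our actual schema.

```python
from typing import Iterable, List, Tuple

# Hypothetical record layout: (event, lat, lon), ordered by time for ONE vehicle.
Sample = Tuple[str, float, float]
Route = List[Tuple[float, float]]

def split_into_trips(samples: Iterable[Sample]) -> List[Route]:
    """Split one vehicle's time-ordered samples into routes (start..stop)."""
    trips: List[Route] = []
    current = None  # points of the trip in progress, or None when parked
    for event, lat, lon in samples:
        if event == "start":
            # A new trip begins; any unfinished trip is discarded.
            current = [(lat, lon)]
        elif current is not None:
            current.append((lat, lon))
            if event == "stop":
                trips.append(current)
                current = None
        # Samples seen while no trip is open are ignored.
    return trips
```

Each returned route is then one "array of lat-lon" in the sense used below.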
For any given vehicle, if we take all such series, is it possible to identify similar routes among all the series? Is there an algorithm that can take multiple arrays of lat-lon and separate out the similar routes?
For example, a car travels A->B->C->D at 9 am.
At 3 pm it travels A1->B1->C->D. A1 has a lat-lon very near to A. B1 is very near to B.
At 6 pm it travels B1->E->F->G.
The algorithm should identify the first 2 routes as 'similar' and the third route as different.
The complication is that the lat-lon coordinates of two trips may not match one-to-one: the vehicle may have taken the same road but transmitted/recorded its position at different points along it.
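One possible way to compare routes without requiring matching sample points is a Hausdorff-style distance: for every recorded point of one route, measure how far it lies from the polyline of the other route, and take the worst case in both directions. The Python sketch below illustrates that idea under stated assumptions; it is not a known-good solution to our problem. The function names, the 200 m threshold and the greedy grouping are all hypothetical choices, each route is assumed to contain at least one point, and a flat-earth projection is used, which is only reasonable for city-scale trips.

```python
import math
from typing import List, Tuple

Route = List[Tuple[float, float]]  # [(lat, lon), ...]
EARTH_RADIUS_M = 6371000.0

def to_xy(lat, lon, lat0, lon0):
    """Equirectangular projection to metres around (lat0, lon0)."""
    x = math.radians(lon - lon0) * math.cos(math.radians(lat0)) * EARTH_RADIUS_M
    y = math.radians(lat - lat0) * EARTH_RADIUS_M
    return x, y

def point_segment_dist(p, a, b):
    """Distance in metres from point p to the segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0.0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    # Clamp the projection of p onto the line so it stays on the segment.
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg_len2
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))

def directed_dist(points, polyline):
    """Largest distance from any of `points` to the nearest part of `polyline`."""
    segments = list(zip(polyline, polyline[1:])) or [(polyline[0], polyline[0])]
    return max(min(point_segment_dist(p, a, b) for a, b in segments)
               for p in points)

def route_distance(r1: Route, r2: Route) -> float:
    """Symmetric point-to-polyline Hausdorff distance in metres."""
    lat0, lon0 = r1[0]
    a = [to_xy(lat, lon, lat0, lon0) for lat, lon in r1]
    b = [to_xy(lat, lon, lat0, lon0) for lat, lon in r2]
    return max(directed_dist(a, b), directed_dist(b, a))

def group_routes(routes: List[Route], threshold_m: float = 200.0) -> List[List[int]]:
    """Greedy grouping: a route joins the first group whose first member is close."""
    groups: List[List[int]] = []
    for i, route in enumerate(routes):
        for group in groups:
            if route_distance(routes[group[0]], route) <= threshold_m:
                group.append(i)
                break
        else:
            groups.append([i])
    return groups
```

With groups in hand, the largest group would correspond to the most frequented route. Note that this measure ignores travel direction and point order, so A->B->C->D and D->C->B->A would be treated as the same route; an order-aware measure such as the Fréchet distance or dynamic time warping would be needed to tell them apart.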
Is there an existing algorithm/tool to achieve this?