Google is developing AI for all kinds of purposes, including tackling the challenging Chinese game of Go. Now the company has revealed its latest deep-learning program, PlaNet, which can recognize where an image was taken even when the image carries no geotag.
Google researchers trained PlaNet on 90 million geotagged images from around the globe, all taken from the internet. As a result, PlaNet can easily identify locations with obvious landmarks, such as the Eiffel Tower in Paris or Big Ben in London, a simple task for any human who knows those landmarks. PlaNet sets itself apart by going further: it can estimate the location of a picture that lacks any clear landmark, and its deep-learning techniques let it place even pictures of roads and houses with a reasonable level of accuracy.
The PlaNet team, led by software engineer Tobias Weyand, tested the accuracy of the software on 2.3 million geotagged images taken from Flickr, withholding the geotag data from the system. It achieved street-level accuracy 3.6% of the time, city-level accuracy 10.1% of the time, country-level accuracy 28.4% of the time and continent-level accuracy 48% of the time. This may not sound too impressive, but when Weyand and his team challenged 10 “well-travelled humans” to face off against PlaNet, the program came out ahead, winning 28 of the 50 games played with a median localization error of 1131.7 km compared to the humans’ 2320.75 km. Weyand reported that PlaNet’s ability to outmatch its human opponents was due to the larger range of locations it had “visited” as part of its training.
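For readers curious how figures such as a "median localization error" or "city-level accuracy" can be computed, here is a minimal Python sketch. It is not the PlaNet team's evaluation code. It assumes the error for each photo is the great-circle (haversine) distance between the predicted and the true coordinates, and that an accuracy level means the share of photos whose error falls within some radius. The function names, the Earth radius value and the example radius are illustrative assumptions, not details given in this article.

```python
import math
from statistics import median

# Mean Earth radius in kilometres (an assumed constant for this sketch).
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against tiny floating-point overshoot.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def localization_errors_km(predictions, truths):
    """Per-photo error: distance between each predicted and true (lat, lon) pair."""
    return [haversine_km(p[0], p[1], t[0], t[1])
            for p, t in zip(predictions, truths)]


def median_localization_error_km(predictions, truths):
    """Median of the per-photo errors, in km."""
    return median(localization_errors_km(predictions, truths))


def accuracy_within(predictions, truths, radius_km):
    """Fraction of photos whose error is at most radius_km."""
    errors = localization_errors_km(predictions, truths)
    return sum(e <= radius_km for e in errors) / len(errors)


if __name__ == "__main__":
    # Hypothetical guesses and ground truth, purely for illustration.
    guesses = [(48.8566, 2.3522), (40.7128, -74.0060), (35.6762, 139.6503)]
    actual = [(51.5074, -0.1278), (40.7128, -74.0060), (-33.8688, 151.2093)]
    print(median_localization_error_km(guesses, actual))
    # 25 km is an assumed stand-in for a "city-level" radius.
    print(accuracy_within(guesses, actual, radius_km=25))
```

The median is used rather than the mean because a handful of guesses on the wrong continent would otherwise dominate the score.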
What Google plans for PlaNet going forward is anyone’s guess, although one potential application is locating pictures that were not geotagged when they were taken. Either way, it will be interesting to see how the techniques that bring machine learning closer and closer to human ability advance in the future.