We have been out at conferences recently, and we are seeing a
major shift in the investigative world toward harnessing geo-location
data. This data can provide valuable insight into the whereabouts and patterns
of groups and individuals. It is also vital when gathering intelligence and
evidence about major events. However, there are many issues analysts must
consider when they move to a geo-location-based system and conduct
location-based investigations and monitoring.
Privacy Settings and Restrictions
No matter how great a tool is, it cannot circumvent privacy
settings. These settings can be instituted at both the user and site level.
Ultimately, many of these privacy settings are implemented to protect users
from harm. These privacy settings may scrub geo-location data from posts and/or
restrict the flow of location-based information to real-time streams. Previously,
Foursquare allowed applications to pull user check-in data without permission.
Within the last year, Foursquare removed this feature and now requires users to
grant an application permission before it can pull check-in data. As social media sites and
applications make changes like these to protect the privacy of individuals, it
will become increasingly difficult to base searches and monitors on
geo-location data.
Savvy Users
An example of a Facebook user who posted from his couch in Florida but tagged himself in Mali.
As we discussed last week, savvy social media users bring their own challenges to the table. These users can
remove geo-location information from their content at several levels. First,
they can turn off GPS tracking in their phones’ settings to prevent the device from acquiring their location. They can also
disable geotagging for their photos, which prevents location data from being embedded in the photos’ EXIF data. Finally,
they can remove geo-location data from their posts at the
application level.
Any user who disables geo-location tagging prevents locations
from being embedded in metadata, which means those posts cannot be found through
location-based searches. In addition, users can enter false location data into
their profiles and posts, creating inaccurate geo-location data.
Defining the Location
Every social media site and geo-location tool defines areas
differently. One tool may use specific coordinates to map exact locations of
posts. Another may use geofencing to draw a specific area or radius around an
exact location. Both approaches are problematic. Depending on the quality
of the geo-location data collected from the device, a user’s marked location can
be miles away from where they actually were at the time of the post.
Sometimes the location is taken from the user’s profile instead of the device,
which means a person who lives in New York City but posts from St. Augustine,
Florida may show up as posting from New York City.
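To make the radius-style geofence concrete, here is a minimal sketch in Python. It assumes a circular fence and uses the haversine great-circle formula; the function names and the coordinates are illustrative and approximate, and real tools may define their areas quite differently.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def in_geofence(post_lat, post_lon, center_lat, center_lon, radius_km):
    """True if the post's coordinates fall inside a circular geofence."""
    return haversine_km(post_lat, post_lon, center_lat, center_lon) <= radius_km


# Approximate coordinates, for illustration only.
NYC = (40.7128, -74.0060)
ST_AUGUSTINE = (29.9012, -81.3124)

# A post whose reported position has drifted about 1.1 km north of the center
# is kept by a 2 km fence but silently dropped by a 1 km fence.
drifted = (40.7228, -74.0060)
print(in_geofence(*drifted, *NYC, radius_km=2.0))
print(in_geofence(*drifted, *NYC, radius_km=1.0))
```

The same check shows why profile-derived locations mislead: a post actually made in St. Augustine but stamped with the user's New York City profile location would pass a New York geofence even though the two cities are well over a thousand kilometers apart.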
A further problem with defining locations arises from the
use of language. Some tools allow users to use words to describe their area of
interest. For instance, if we used New York City, that might encompass anything
within the areas of the Bronx, Queens, Brooklyn, Staten Island, or Manhattan.
However, some tools will only match the literal phrase New York City. They will not
recognize synonyms such as NYC or Manhattan, or neighborhoods within New York City
such as Harlem and the Lower East Side. These gaps can
exclude relevant data points from your area of interest.
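One way to work around a tool that only matches the literal place name is to expand the name into its aliases before searching. The sketch below is an assumption about how that could be done in Python; the alias table is hypothetical and would need to be curated by the analyst for each area of interest.

```python
import re

# Hypothetical alias table: synonyms, boroughs, and neighborhoods.
PLACE_ALIASES = {
    "new york city": [
        "new york city", "nyc", "manhattan", "brooklyn", "queens",
        "bronx", "staten island", "harlem", "lower east side",
    ],
}


def build_place_pattern(place):
    """Compile a case-insensitive, whole-word pattern for a place and its aliases."""
    terms = PLACE_ALIASES.get(place.lower(), [place.lower()])
    # Longest terms first so multi-word names are tried before shorter ones.
    ordered = sorted(terms, key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in ordered)
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)


def mentions_place(text, place):
    """True if the text mentions the place or any of its known aliases."""
    return build_place_pattern(place).search(text) is not None


print(mentions_place("Flying out of NYC tonight", "New York City"))
print(mentions_place("Stuck in traffic in Harlem", "New York City"))
print(mentions_place("Sunny day in St. Augustine", "New York City"))
```

Alias matching trades missed posts for false positives: a post about Manhattan, Kansas would match here, so results still need an analyst's review.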
Conquering the Problem
Investigators and analysts currently face many challenges when parsing
geo-location data, and many of these issues
arise from technology. Until social media sites perfect the collection and
dissemination of geo-location data, all tools will be deficient in displaying
the information. Additionally, until developers build more comprehensive ways
to categorize and disseminate geotagged data into tools, the information we can
extract from them will be limited. However, there are a few things we can do to
make our lives easier.
First, we can search using the language associated with the person, event, or
topic of interest. For instance, if we are monitoring activity in a specific
neighborhood, we can use slang terms for the area, area codes, street names,
popular businesses in the area, and any other terms that may describe the
region. This captures data that does not rely on geotagged metadata.
Another way to maximize the capture of geo-location data is to use a variety of
tools. Since every tool has issues collecting and displaying the applicable
geographic data, harnessing the power of a multitude of sources allows us to
build a more comprehensive data set to analyze.
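Both suggestions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each tool is assumed to export posts as dictionaries with an "id" and a "text" field, and the search terms and sample posts are made up for the example.

```python
import re


def matches_terms(text, terms):
    """True if the text contains any term as a whole word, ignoring case."""
    alternation = "|".join(re.escape(term) for term in terms)
    return re.search(r"\b(?:" + alternation + r")\b", text, re.IGNORECASE) is not None


def merge_results(*result_sets):
    """Combine posts from several tools, keeping the first copy of each post ID."""
    merged = {}
    for results in result_sets:
        for post in results:
            merged.setdefault(post["id"], post)
    return list(merged.values())


# Hypothetical neighborhood terms: a name, a street, and an area code.
terms = ["Lower East Side", "Delancey", "212"]

# Made-up exports from two different tools; post "a1" appears in both.
tool_a = [
    {"id": "a1", "text": "Block party on Delancey tonight"},
    {"id": "a2", "text": "Great coffee this morning"},
]
tool_b = [
    {"id": "a1", "text": "Block party on Delancey tonight"},
    {"id": "b7", "text": "New phone, still a 212 number"},
]

combined = merge_results(tool_a, tool_b)
hits = [post["id"] for post in combined if matches_terms(post["text"], terms)]
print(len(combined), hits)
```

None of the matched posts needs a geotag, which is the point of the keyword approach. Short terms such as area codes will also match unrelated numbers, so the hits are leads to verify, not confirmed locations.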