Monday, December 23, 2013

Issues Gathering Information Using Geo-location Parameters

At the conferences we have attended recently, we have seen a major shift in the investigative world toward harnessing the power of geo-location data. This data can provide valuable insight into the whereabouts and patterns of groups and individuals. It is also vital when gathering intelligence and evidence about major events. However, there are many issues analysts must consider when they move to a geo-location-based system and conduct location-based investigations and monitoring.

Privacy Settings and Restrictions

No matter how great a tool is, it cannot circumvent privacy settings. These settings can be instituted at both the user and site level. Ultimately, many of these privacy settings are implemented to protect users from harm. These privacy settings may scrub geo-location data from posts and/or restrict the flow of location-based information to real-time streams. Previously, Foursquare allowed applications to pull user check-in data without permission. Within the last year, Foursquare removed this feature and now requires users to first grant permission to applications to pull check-in data. As social media sites and applications make changes like these to protect the privacy of individuals, it will become increasingly difficult to base searches and monitors on geo-location data.

Savvy Users

[Image: an example of a Facebook user who posted from his couch in Florida but tagged himself in Mali.]

As we discussed last week, savvy social media users bring their own challenges to the table. These users can remove geo-location information from their content at several levels. First, they can turn off GPS tracking in their phones’ settings to prevent the device from acquiring their locations. Second, they can disable geotagging for their photos, which prevents the coordinates from being embedded in the photos’ EXIF data. Third, they can remove geo-location data from their posts at the application level. Any user who disables geo-location tagging prevents locations from being embedded in metadata, meaning the posts cannot be found through location-based searches. In addition, users can enter false location data into their profiles and posts, creating inaccurate geo-location data.

Defining the Location

Every social media site and geo-location tool defines areas differently. One tool may use specific coordinates to map the exact locations of posts. Another may use geofencing to draw a specific area or radius around an exact location. Both approaches are problematic. Depending on the quality of the geo-location data collected from the device, users can be marked miles away from where they actually were at the time of the post. Sometimes the location is taken from the user’s profile instead, which means a person who lives in New York City but posts from St. Augustine, Florida, may show up as posting from New York City.
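To make the geofencing problem concrete, here is a minimal sketch of a radius check in Python. It is not how any particular tool works; the function names, the 25 km radius, and the `accuracy_km` allowance are our own assumptions, and the coordinates are approximate. It uses the standard haversine formula for great-circle distance.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def in_geofence(post_lat, post_lon, center_lat, center_lon, radius_km, accuracy_km=0.0):
    """True if the post could fall inside the circle once the reported
    location accuracy (a hypothetical field) is allowed for."""
    distance = haversine_km(post_lat, post_lon, center_lat, center_lon)
    return distance <= radius_km + accuracy_km


# Approximate coordinates, used only for illustration.
ST_AUGUSTINE = (29.90, -81.31)
NEW_YORK_CITY = (40.71, -74.01)

# A post written in St. Augustine but located from a New York City profile
# lands far outside a 25 km fence drawn around St. Augustine.
missed = not in_geofence(*NEW_YORK_CITY, *ST_AUGUSTINE, radius_km=25)
print("Post missed by the geofence:", missed)
```

Widening the radius or adding an accuracy allowance catches more poorly located posts, but it also pulls in more posts that have nothing to do with the area of interest.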

A further problem with defining locations arises from the use of language. Some tools allow users to describe their area of interest in words. For instance, if we used New York City, we might expect that to encompass anything within the Bronx, Queens, Brooklyn, Staten Island, or Manhattan. However, some tools will only pick up the literal phrase New York City. They will not recognize that synonyms include NYC or Manhattan, or that neighborhoods within New York City include areas such as Harlem and the Lower East Side. These differences can exclude data points from your area of interest.
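One way to work around this on our own side is to expand a place name into its known aliases before searching. The sketch below assumes a small, hand-maintained alias list (`PLACE_ALIASES` and the helper names are hypothetical, and a real list would be much longer) and matches any alias as a whole word or phrase, ignoring case.

```python
import re

# Hypothetical, hand-maintained alias list for illustration only.
PLACE_ALIASES = {
    "New York City": [
        "New York City", "NYC", "Manhattan", "Brooklyn", "Queens",
        "Bronx", "Staten Island", "Harlem", "Lower East Side",
    ],
}


def build_place_pattern(place):
    """Compile one case-insensitive pattern that matches any alias of the place."""
    # Longer aliases first, so multi-word names are tried before shorter ones.
    aliases = sorted(PLACE_ALIASES[place], key=len, reverse=True)
    alternatives = "|".join(re.escape(alias) for alias in aliases)
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)


def mentions_place(text, place):
    """True if the text mentions the place or any of its aliases."""
    return build_place_pattern(place).search(text) is not None


posts = [
    "Stuck in traffic in Harlem again",
    "Best pizza in nyc, no contest",
    "Sunset over the beach in St. Augustine",
]
matches = [p for p in posts if mentions_place(p, "New York City")]
print(matches)
```

The word-boundary check keeps "Queens" from matching "Queensland", but an alias list like this still needs local knowledge to build and regular upkeep to stay useful.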

Conquering the Problem

Currently, there are many challenges investigators and analysts face when parsing through geo-location data. Many of these issues arise from technology. Until social media sites perfect the collection and dissemination of geo-location data, all tools will be deficient in displaying the information. Additionally, until developers code more comprehensive means to categorize and disseminate geotagged data in tools, the information we can extract from them will be limited. However, there are a few things we can do to make our lives easier.

First and foremost, we can use the language of the person, event, or topic of interest. For instance, if we are monitoring activity in a specific neighborhood, we can search on slang terms for the area, area codes, street names, popular businesses in the area, and any other terms which may describe the region. This can capture data that does not rely on geotagged metadata. Another way to maximize the capture of geo-location data is to use a variety of tools. Since every tool has issues collecting and displaying the applicable geographic data, drawing on multiple sources allows us to build a more comprehensive data set to analyze.
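When we pull results from several tools, the same post often shows up more than once. Here is a minimal sketch of merging the results into one data set. It assumes each tool's export can be reduced to records with a `site` and a `post_id` field; those field names and the sample records are made up for illustration.

```python
def merge_results(*result_sets):
    """Combine post lists from several tools, keeping the first copy of each post."""
    merged = {}
    for results in result_sets:
        for post in results:
            # A post is identified by the site it came from plus its ID on that site.
            key = (post["site"], post["post_id"])
            merged.setdefault(key, post)
    return list(merged.values())


# Made-up sample exports from two hypothetical tools.
tool_a = [
    {"site": "twitter", "post_id": "101", "text": "Road closed on Main St"},
    {"site": "twitter", "post_id": "102", "text": "Crowd gathering downtown"},
]
tool_b = [
    {"site": "twitter", "post_id": "102", "text": "Crowd gathering downtown"},
    {"site": "instagram", "post_id": "9", "text": "Downtown right now"},
]

combined = merge_results(tool_a, tool_b)
print(len(combined), "unique posts")
```

Keying on the site and post ID, instead of the text, keeps reposts and near-identical messages from different users as separate data points.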

About CES PRISM Blog

The CES PRISM blog is the place where CES shares the newest developments in social media sites and tools, data analytics, eDiscovery, investigations, and intelligence. We will also share workflow tips and tricks, case studies, and the developmental progress of our open source social media research and analysis tool, PRISM. Our goal is to open a dialogue with the community which allows all of us to learn together.