Last year Mapbox launched global Movement Data for customers across all industries. This data set quickly became an important tool in a number of them: retail companies are using it to manage COVID impact and the re-opening of store locations, public health officials have used it to model the correlation between viral replication and mobility, the hospitality industry is tracking post-COVID trends and forecasting the impact on global tourism, and EV companies and city planners are measuring vehicle counts to understand the “new normal” for traffic patterns.
This week Mapbox launched a major data version upgrade to Movement Data. This enhanced data set delivers even further geospatial insights with 5 major enhancements, resulting in 10-100x more coverage across all major markets, more powerful and reliable analysis throughout historical time periods, and 13% higher correlation with leading ground truth indicators of mobility and commercial activity.
Challenges of Dynamic Location Data
By definition, location data is constantly changing. Building a high-quality privacy-forward data set from 30+ billion daily location updates streaming from 600M monthly active users is a never-ending journey in pursuit of perfection. Mobile device activity is a powerful proxy to measure changes in on-the-ground human mobility patterns. However, it can be subject to bias and anomalies:
- A small number of mobile apps contributing a large amount of location activity may skew activity levels.
- Inherent noise in iPhone and Android on-device GPS, together with system-wide changes introduced by mobile OS updates, materially changes the type and amount of location data available.
- Mobile device adoption varies widely within countries (urban vs. suburban vs. rural) and is vastly different across geographies (e.g. 60%+ iPhone adoption in Japan vs. 97% Android adoption in India).
These factors often result in inconsistencies that reduce data quality and reliability.
Enhanced V2 Movement Data
The Mapbox Movement data science team constantly builds, evaluates, and rebuilds complex models that take into account more than 20 different facets of Mapbox telemetry data to power the Movement data set. The latest upgrade to Movement Data bundles together 5 significant enhancements.
Localized App Adjustments
Numerous studies have shown that mobile device adoption and usage patterns vary meaningfully across urban, suburban, and rural residents. In addition, many mobile apps are designed for use exclusively in urban areas (e.g. ridesharing, food delivery, community/neighborhood activism apps). This generates a significant imbalance in mobile device activity levels measured in urban vs. rural counties within the US and similar markets, artificially inflating urban activity levels and depressing rural activity patterns. To address this inconsistency, Mapbox implemented an auto-scaling adaptive model that identifies and measures the urban/rural imbalance and applies a set of “corrections” at the city level on a daily basis. This self-adapting system is fully automated and operates in every production market without need for manual intervention.
Full support for updated Mapbox Boundaries
Movement Data now includes full support for over 4 million curated administrative, statistical, and postal boundary polygons as part of premium integration with Mapbox Boundaries. Movement data is delivered to customers with activity data aggregated to precise polygons consistent with the customer’s use case and worldview, including multiple supported variations on disputed territories and borders.
Activity Index re-scaled based on tile size
Movement Data is aggregated to zoom-level 18 tiles, which are approximately 100m across but vary in size with the latitude of the individual tile. For larger countries like Brazil and Canada, this can cause measurable differences in activity index values for tiles at different latitudes. Movement Data implements programmatic re-scaling based on the exact area of each tile so that activity levels in northern and southern regions of the same country can be compared more accurately.
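The latitude effect can be sketched with the standard Web Mercator tile formula. This is a minimal illustration and not Mapbox's implementation. It assumes a spherical Web Mercator tiling with an equatorial circumference of 40,075,016.686 m and treats each tile as a square whose side is its ground width at the given latitude. The function names and the choice of a reference latitude for normalization are hypothetical.

```python
import math

# Equatorial circumference used by spherical Web Mercator (assumption of this sketch).
EARTH_CIRCUMFERENCE_M = 40_075_016.686
ZOOM = 18


def tile_width_m(lat_deg: float, zoom: int = ZOOM) -> float:
    """Approximate ground width in meters of a Web Mercator tile at a latitude."""
    return EARTH_CIRCUMFERENCE_M * math.cos(math.radians(lat_deg)) / (2 ** zoom)


def tile_area_m2(lat_deg: float, zoom: int = ZOOM) -> float:
    """Approximate tile area, treating the tile as a square on the ground."""
    return tile_width_m(lat_deg, zoom) ** 2


def rescale_activity(activity: float, lat_deg: float,
                     ref_lat_deg: float = 0.0, zoom: int = ZOOM) -> float:
    """Hypothetical normalization: express activity per reference-tile area.

    Tiles far from the equator cover less ground, so their raw values are
    scaled up by the ratio of the reference tile area to their own area.
    """
    return activity * tile_area_m2(ref_lat_deg, zoom) / tile_area_m2(lat_deg, zoom)


if __name__ == "__main__":
    for lat in (0, 30, 49, 60):
        print(f"lat {lat:>2}: width {tile_width_m(lat):6.1f} m, "
              f"area {tile_area_m2(lat):8.0f} m^2")
```

Under these assumptions a zoom-level 18 tile is roughly 153 m wide at the equator and roughly 100 m wide near 49° latitude, so a tile at 60° covers about a quarter of the ground area of an equatorial tile.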
Intelligent privacy thresholds yield massive improvements in coverage
Even fully anonymous data can be re-identified if the location signals are analyzed at a fine enough granularity without aggregation. To minimize the risk of data misuse, Mapbox applies minimum privacy thresholds and small amounts of random noise to low-activity tiles. This intelligent privacy mechanism maximizes data coverage, minimizes risk of re-identifying activity patterns, and guarantees no statistical impact on the resulting analysis.
The number of tiles with consistent activity coverage increased 10x in the US, 50x in Mexico, and 250x in the UK and Canada.
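The general idea of combining a minimum threshold with random noise can be sketched as follows. This is an illustrative toy only: the actual thresholds, noise distribution, and logic Mapbox uses are not described here, so `min_count`, `low_activity_max`, `noise_scale`, and the function name are all hypothetical.

```python
import random


def apply_privacy(tile_counts, min_count=5, low_activity_max=20,
                  noise_scale=1.0, seed=None):
    """Suppress very small tiles and add small noise to low-activity tiles.

    tile_counts: dict mapping a tile id to a raw activity count.
    All parameter values are hypothetical placeholders for this sketch.
    """
    rng = random.Random(seed)
    published = {}
    for tile_id, count in tile_counts.items():
        if count < min_count:
            # Below the minimum threshold: the tile is not published at all.
            continue
        if count <= low_activity_max:
            # Low-activity tile: perturb with zero-mean Gaussian noise,
            # clamped so the published value is never negative.
            published[tile_id] = max(count + rng.gauss(0.0, noise_scale), 0.0)
        else:
            # High-activity tile: published unchanged.
            published[tile_id] = float(count)
    return published


if __name__ == "__main__":
    raw = {"tile_a": 2, "tile_b": 10, "tile_c": 500}
    print(apply_privacy(raw, seed=0))
```

In this toy version, `tile_a` is dropped, `tile_b` is published with a small perturbation, and `tile_c` passes through unchanged.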
Data delivery to any cloud storage service
Movement Data is now available as a flat file data set for delivery to customers in any cloud environment. Amazon Web Services S3 buckets and Google Cloud Storage buckets are natively supported. Movement Data is available in the Snowflake Data Marketplace and can be securely shared with Snowflake customers in any cloud (AWS, GCP, Azure) and any global region.
Better Data, Better Results
Movement data is benchmarked against a variety of real-world indicators including The Economist Normalcy Index. The Normalcy Index is one of the most comprehensive publicly available metrics, consisting of 8 indicators of ground truth measurements across three domains: travel and mobility, commercial revenue generation, and retail and work activities.
The benchmark results show a remarkably strong correlation between Movement Data and the Normalcy Index: for the United Kingdom, for example, the Pearson correlation coefficient is 0.84 (13% higher than previous data versions). Mapbox Movement shows a stronger correlation than both Google and Apple mobility data (neither of which is available at the same 100m granularity as Movement Data).
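For readers who want to run a similar comparison on their own data, the Pearson correlation coefficient between two aligned time series can be computed as in the sketch below. The function name is hypothetical and the sample values are made up purely to show the call. They are not Movement Data or Normalcy Index figures.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the sums of squared deviations.
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (dev_x * dev_y)


if __name__ == "__main__":
    # Made-up weekly values, for illustration only.
    activity_index = [0.42, 0.47, 0.55, 0.61, 0.66, 0.70]
    benchmark_index = [40.0, 46.0, 52.0, 63.0, 64.0, 72.0]
    print(round(pearson_r(activity_index, benchmark_index), 3))
```

A value near 1 means the two series rise and fall together, while a value near 0 means there is little linear relationship between them.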
Upgrade your analysis with Mapbox Movement data available in over 140 countries. Download your sample of enhanced Movement Data to get started.