Dataset Library


| # | Dataset Title | Description | Relevant Links |
|---|---|---|---|
| 1 | MLB Baseball Pitch Data (2015-2018) | MLB released a dataset containing detailed information about every pitch thrown during the 2015 to 2018 seasons. Details include whether the pitch was a strike or a ball, the type of pitch, the names of the batter and pitcher, the result of the at-bat, and more. | Link to Dataset; Company Blog Post |
| 2 | Ford GoBike Data | Anonymized, timestamped data about the start and end stations of each bike trip, the user type (subscriber or casual rider), and some customer-reported attributes such as birth year and gender. | Link to Dataset; Company Blog Post |
| 3 | NYC 311 Service Calls Data | All NYC 311 Service Requests from 2010 to present. | Link to Dataset; Company Blog Post |
| 4 | Probe Vehicle Data | The dataset is publicly available from the USDOT website and is based on field testing conducted in Fairfax County, Virginia, which used a fleet of 10 vehicles and a finalized test matrix for the Advanced Messaging Concept Development (AMCD) project. | Link to Dataset; Company Blog Post |
| 5 | Spotify's Worldwide Daily Song Ranking Data | Contains the daily ranking of the 200 most streamed songs by Spotify users in 53 countries during 2017 and 2018. It has over 3 million rows covering 6,629 artists and 18,598 songs, for a total of 105 billion streams. | Link to Dataset; Company Blog Post |
| 6 | Part D Prescriber Data CY 2015 Data | Provides information on prescription drugs prescribed by individual physicians and other health care providers and paid for under the Medicare Part D Prescription Drug Program. | Link to Dataset; Company Blog Post |
| 7 | Caltrans Traffic Data | Multiple telemetry data series are available, ranging from speed to traffic incident information. The data is recorded by stations located throughout the freeways, which collect a variety of measurements including speed (in miles/hr) and occupancy (percent that the lane is full). | Link to Dataset; Company Blog Post |
| 8 | Big Data Bowl Data | Player tracking data for one 2017 game, plus the player-, play-, and game-level data that correspond to the tracking data. | Link to Dataset; Company Blog Post |
| 9 | Uber Movement Data + San Francisco TAZ Data | The Uber Movement data contains one month (plus a few days from the subsequent month) of rides around the San Francisco area. TAZ stands for Traffic Area Zones; each zone has an ID that the Uber dataset uses. The TAZ dataset contains the polygon for each traffic zone and can be joined to the Uber data on that ID to create a choropleth of the movement data. | Link to Uber Dataset; Link to TAZ Dataset; Company Blog Post |
| 10 | Jekyll Island LIDAR Data | This data helps in understanding the impact of sea-level rise on buildings, providing valuable insights for both city planners and insurance companies. | Link to Dataset; Company Blog Post |
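
The description of dataset 9 mentions joining the Uber Movement rides to the TAZ polygons on the zone ID to build a choropleth. The sketch below shows one way that join might look in plain Python. The field names (`sourceid`, `mean_travel_time`, `MOVEMENT_ID`), the choice of mean travel time as the choropleth value, and the sample records are all assumptions made for illustration; check the actual files for their real column and property names.

```python
from collections import defaultdict
from statistics import mean


def mean_time_by_zone(rides):
    """Average the travel time of all rides that start in each zone."""
    times = defaultdict(list)
    for ride in rides:
        # "sourceid" and "mean_travel_time" are assumed column names.
        times[ride["sourceid"]].append(ride["mean_travel_time"])
    return {zone: mean(values) for zone, values in times.items()}


def join_to_zones(zone_features, value_by_zone, id_key="MOVEMENT_ID"):
    """Left-join per-zone values onto GeoJSON-style polygon features.

    Zones with no rides get None, so they can be drawn as "no data".
    The input features are not modified.
    """
    joined = []
    for feature in zone_features:
        zone_id = int(feature["properties"][id_key])  # assumed ID property
        properties = {
            **feature["properties"],
            "mean_travel_time": value_by_zone.get(zone_id),
        }
        joined.append({**feature, "properties": properties})
    return joined


# Made-up sample records, not taken from the real datasets.
rides = [
    {"sourceid": 1, "dstid": 2, "mean_travel_time": 300.0},
    {"sourceid": 1, "dstid": 3, "mean_travel_time": 500.0},
    {"sourceid": 2, "dstid": 1, "mean_travel_time": 240.0},
]
square = {"type": "Polygon", "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]]}
zones = [
    {"type": "Feature", "properties": {"MOVEMENT_ID": "1"}, "geometry": square},
    {"type": "Feature", "properties": {"MOVEMENT_ID": "2"}, "geometry": square},
    {"type": "Feature", "properties": {"MOVEMENT_ID": "3"}, "geometry": square},
]

choropleth_features = join_to_zones(zones, mean_time_by_zone(rides))
for feature in choropleth_features:
    print(feature["properties"])
```

Each joined feature keeps its polygon and gains one numeric property, which is the shape most mapping tools expect for coloring zones by value.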