The Baseball Cube is an aggregator of baseball-related data. All of the data on the site comes from other sources, many official and some not. That data is copied, reorganized, stored in a database and presented. TBC does not scrape programmatically, nor do we have direct feeds from other sites. All of the data is sourced and entered into a database manually, with some automation tools to speed up certain steps.

There are many different data elements on the site and we often get asked where the data comes from. Here are many of the different sites we use as reference with some notes about each. Most of the datasets we accumulate come from multiple sources.

  • Baseball Reference - Various datasets are taken from BR, but not as much as you would think. Full respect for the work they do over there.
  • Baseball America - We LOVE Baseball America. They cover baseball from head to toe with high quality at all levels. Their rankings drive the Prospects section and they do a great job with signing bonuses.
  • Retrosheet - A tremendous MLB web site that provides event logs and the tools to parse them. This allows TBC to provide a lot of the lower-level MLB data, including Extended Stats, Splits, boxscores, game logs and many research tools. The site is designed in pure text and is very fast, with no ads and no intention of making any money. They are the backbone of much of the MLB data.
  • MLB.Com - MLB stats and transactions come from here.
  • MLB Pipeline - MLB's prospect site is also a source of rankings for prospects and scouting grades.
  • The official site of Minor League Baseball - One of our greatest sources. We get transactions, stats, player biographies and many other minor league attributes from here.
  • MLB Trade Rumors - Probably the only blog that also acts as a reference tool outside of Baseball America posts. They do a great job of capturing player transactions and contracts.
  • - Also a great resource for contracts. We use them from time to time to fill in contract blanks.
  • Baseball Savant - Manages the MLB StatCast data and presents it for public consumption at a granular level.
  • - Another reference tool
  • MaxPreps.Com - High school stats for some teams.
  • NCAA.Com - Some college stats
  • College stats are sourced from school and conference web sites for more recent seasons. Older seasons are pulled from web archives and media guides. TBC is actively searching for historical NCAA Division I statistics. If you have any old media guides or access to pre-2002 stats, especially in bulk, please let us know.
  • Summer league data is pulled from league web sites
  • International stats have been pulled from various official web sites.
  • Prospect rankings are from Baseball Prospectus, FanGraphs, MLB Pipeline and Baseball America.
  • Service Time data comes from multiple sources, including BR, FanGraphs, Baseball Prospectus and Cot's Baseball Contracts.

The list above is certainly not complete, as we use many other web sites, blogs, online archives and offline publications. Please let us know if you have any questions on data sources.