Lahman

Lahman’s baseball database contains complete batting and pitching statistics from 1871 to 2014, plus fielding statistics, standings, team stats, managerial records, post-season data, and more.

Original source: www.seanlahman.com

Versions

  • Lahman_2014 (by Jan Motl)

    • added foreign key constraints and deleted records that did not conform to them

Dataset details

Associated task: Regression
Domain: Sport
Data types:
Size: 74.1 MB
Count of tables: 25
Count of rows: 470,225
Count of columns: 353
Missing values: Yes
Compound keys: No
Loops: Yes
Type: Real
Instance count: 23,111
Target table: salaries
Target column: salary
Target ID: teamID, playerID, lgID
Target timestamp: yearID
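
The target fields listed above can be turned into a query that pulls one row per prediction instance. The following is a minimal sketch, not part of the original dataset page: the function name `build_target_query` and the `limit` parameter are hypothetical, and it only builds the SQL string, so it needs no database connection.

```python
# Target definition copied from the dataset details above.
TARGET_TABLE = "salaries"
TARGET_COLUMN = "salary"
TARGET_ID = ["teamID", "playerID", "lgID"]
TARGET_TIMESTAMP = "yearID"


def build_target_query(limit=None):
    """Build a SELECT over the target table (hypothetical helper).

    Returns the ID columns, the timestamp, and the target column,
    in that order. `limit` optionally caps the number of rows.
    """
    columns = TARGET_ID + [TARGET_TIMESTAMP, TARGET_COLUMN]
    # Backticks are MySQL's identifier quoting.
    quoted = ", ".join(f"`{name}`" for name in columns)
    query = f"SELECT {quoted} FROM `{TARGET_TABLE}`"
    if limit is not None:
        query += f" LIMIT {int(limit)}"
    return query
```

Calling `build_target_query(limit=5)` gives a statement that can be pasted into any MySQL client to preview the regression target.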

How to download the dataset

The dataset is publicly available directly from a MySQL database.

  1. Open your favourite MySQL client (for example, MySQL Workbench).
  2. Use the following credentials:
    • hostname: relational.fit.cvut.cz
    • port: 3306
    • username: guest
    • password: relational
  3. Export the "lahman_2014" database (or another version of the dataset, if available) in your favourite format (e.g. CSV or SQL dump).
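
For a scripted alternative to the steps above, the export can be sketched with the `mysqldump` command-line tool driven from Python. This is a hedged example rather than the site's documented procedure: it assumes `mysqldump` is installed locally, the helper names `build_dump_command` and `export_database` and the default output file name are hypothetical, and `--skip-lock-tables` is included on the assumption that the guest account may not be allowed to lock tables.

```python
import subprocess

# Connection details copied from the steps above.
HOST = "relational.fit.cvut.cz"
PORT = 3306
USER = "guest"
PASSWORD = "relational"


def build_dump_command(database="lahman_2014", result_file="lahman_2014.sql"):
    """Return the mysqldump argument list for a SQL dump (hypothetical helper)."""
    return [
        "mysqldump",
        f"--host={HOST}",
        f"--port={PORT}",
        f"--user={USER}",
        f"--password={PASSWORD}",
        # Assumption: the guest account may lack the LOCK TABLES privilege.
        "--skip-lock-tables",
        f"--result-file={result_file}",
        database,
    ]


def export_database(database="lahman_2014", result_file="lahman_2014.sql"):
    """Run mysqldump; requires network access and the MySQL client tools."""
    subprocess.run(build_dump_command(database, result_file), check=True)
```

Calling `export_database()` would write the dump to `lahman_2014.sql` in the current directory; `build_dump_command()` alone can be used to inspect the command before running it.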