Accidents
The traffic accident database covers all accidents that happened in Ljubljana, Slovenia’s capital city, between 1995 and 2005.
AdventureWorks
Adventure Works 2014 (OLTP version) is a sample database for Microsoft SQL Server, which has replaced the Northwind and Pubs sample databases that were shipped earlier. The database is about a fictitious multinational bicycle manufacturer called Adventure Works Cycles.
BasketballMen
The task is to predict the rank of teams.
BasketballWomen
The task is to predict whether a team makes the playoffs.
ClassicModels
The schema is for Classic Models, a retailer of scale models of classic cars. The database contains typical business data such as customers, orders, order line items, products and so on.
ConsumerExpenditures
The Consumer Expenditure Survey (CE) collects data on expenditures, income, and demographics in the United States. The public-use microdata (PUMD) files provide this information for individual respondents without any information that could identify respondents. PUMD fi…
Countries
The task is to predict "Forest area (% of land area)" for 247 countries in 2012 based on values from previous years.
Credit
A slightly more complex artificial database with loops.
Dunur
Dunur is a relationship between two people through marriage: A is a dunur of B if a child of A is married to a child of B.
Financial
The PKDD'99 Financial dataset contains 606 successful and 76 unsuccessful loans along with their details and transactions.
FTP
PAKDD'15 Data Mining Competition: The task is to infer a user’s gender from product viewing logs. The data were obtained from simulations of product viewing activities of users with known gender. The data closely follow the real-life distribut…
Hockey
The Hockey Database follows the same general design as the Lahman Baseball Database. In addition to the NHL, the Hockey DB covers the following early and alternative leagues: NHA, PCHA, WCHL and WHA. It contains individual and team statistics from 1909-10 through the 2…
IMDb
The IMDb database: moderately large, real database of movies.
Lahman
Lahman’s baseball database contains complete batting and pitching statistics from 1871 to 2014, plus fielding statistics, standings, team stats, managerial records, post-season data, and more.
LegalActs
Bulgarian court decision metadata.
Thrombosis
The PKDD'99 Medical dataset describes 41 patients with thrombosis.
Mondial
A geography dataset from the University of Göttingen that describes 114 Christian countries and 71 non-Christian countries.
MooneyFamily
The dataset describes a family composed of 86 people across 5 generations. The family dataset includes 744 positive instances and 1488 randomly generated negative instances.
NCAA
2015 NCAA Basketball Tournament.
Northwind
The Northwind database contains the sales data for a fictitious company called Northwind Traders, which imports and exports specialty foods from around the world.
Pubs
The pubs sample database is modeled after a book publishing company.
Sakila
The venerable sakila test database: small, fake database of movies.
SalesDB
A simple artificial database in star schema.
Stats
An anonymized dump of all user-contributed content on the Stats Stack Exchange network.
PTC
The Predictive Toxicology Challenge (2000) dataset consists of more than three hundred organic molecules labeled according to their carcinogenicity in male and female mice and rats.
TPCC
TPC-C is a benchmark for Online Transaction Processing (OLTP) published by the Transaction Processing Performance Council (TPC).
TPCDS
TPC-DS is the new decision support benchmark that models several generally applicable aspects of a decision support system, including queries and data maintenance. Although the underlying business model of TPC-DS is a retail product supplier, the database schema, data …
VOC
The VOC database provides a glimpse into the administrative system of an early multinational company, the Vereenigde geoctrooieerde Oostindische Compagnie (VOC for short, the Dutch East India Company), established on March 20, 1602.
World
A database of 239 states and their cities.