Alternative names: PTE
The task is to predict whether a given molecule is carcinogenic. The dataset contains 182 positive carcinogenicity tests and 148 negative tests.
Carcinogenesis (by Janez Kranjc)
- Foreign key constraints are violated. Specifically, the "atom" table contains a drug "d115" that is missing from the "canc" table. The dataset contains just 329 instances; the expected number is 330.
- Associated task:
- Data types:
- Size: 21 MB
- Count of tables:
- Count of rows:
- Count of columns:
- Missing values:
- Compound keys:
- Instance count:
- Target table:
- Target column:
- Target ID:
- Target timestamp:
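The foreign key violation noted in the list above (a drug in "atom" with no matching row in "canc") can be detected with an anti-join. The following is a minimal sketch, not the site's own check: it uses an in-memory SQLite database as a stand-in for the MariaDB server, and the column name `drug_id`, the `atom_id` values and the toy rows are assumptions made for illustration only.

```python
import sqlite3


def find_orphans(conn, child, child_col, parent, parent_col):
    """Return distinct child keys that have no matching row in the parent table."""
    sql = (
        f"SELECT DISTINCT c.{child_col} "
        f"FROM {child} AS c "
        f"LEFT JOIN {parent} AS p ON c.{child_col} = p.{parent_col} "
        f"WHERE c.{child_col} IS NOT NULL AND p.{parent_col} IS NULL "
        f"ORDER BY c.{child_col}"
    )
    cur = conn.cursor()
    cur.execute(sql)
    return [row[0] for row in cur.fetchall()]


# Toy data (hypothetical): "d115" appears in atom but not in canc.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE canc (drug_id TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE atom (atom_id TEXT PRIMARY KEY, drug_id TEXT)")
conn.executemany("INSERT INTO canc VALUES (?)", [("d1",), ("d2",)])
conn.executemany(
    "INSERT INTO atom VALUES (?, ?)",
    [("d1_1", "d1"), ("d2_1", "d2"), ("d115_1", "d115"), ("d115_2", "d115")],
)

orphans = find_orphans(conn, "atom", "drug_id", "canc", "drug_id")
print(orphans)
```

The same query text should also run against the real database once the actual key column names are substituted.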
|Dataset version||Target||Algorithm||Reference||Measure||Value|
|Carcinogenesis||class||Aleph||Wordification: Propositionalization by unfolding relational data into bags of words||Accuracy||0.5532|
|Carcinogenesis||class||Predictor Factory||Predictor Factory||Accuracy||0.6689|
|Carcinogenesis||class||RelF||Wordification: Propositionalization by unfolding relational data into bags of words||Accuracy||0.6018|
|Carcinogenesis||class||RSD||Wordification: Propositionalization by unfolding relational data into bags of words||Accuracy||0.6049|
|Carcinogenesis||class||Wordification||Wordification: Propositionalization by unfolding relational data into bags of words||Accuracy||0.6231|
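To put the accuracies above in context, it can help to compare them with a majority-class baseline computed from the class counts stated earlier (182 positive, 148 negative). This is only a rough sketch: it assumes all 330 expected instances, whereas the database holds 329, and the evaluation protocols behind the reported values may differ.

```python
# Class counts as stated in the dataset description.
positives, negatives = 182, 148
total = positives + negatives

# Accuracy of always predicting the more frequent class.
majority_baseline = max(positives, negatives) / total

# Accuracies copied from the results table.
reported = {
    "Aleph": 0.5532,
    "Predictor Factory": 0.6689,
    "RelF": 0.6018,
    "RSD": 0.6049,
    "Wordification": 0.6231,
}

# Margin of each algorithm over the baseline.
margins = {name: acc - majority_baseline for name, acc in reported.items()}

print(f"majority baseline: {majority_baseline:.4f}")
for name, margin in sorted(margins.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {margin:+.4f}")
```

Under these assumptions the baseline is about 0.5515, so every listed result lies above it, some only marginally.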
How to download the dataset
The datasets are publicly available directly from a MariaDB database.
- Open your favourite MariaDB client (MySQL Workbench works, but see FAQ)
- Use the following credentials:
  - hostname: relational.fit.cvut.cz
  - port: 3306
  - username: guest
  - password: relational
- Export the "Carcinogenesis" database (or another version of the dataset, if available) in your preferred format (e.g. CSV or an SQL dump).
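The steps above can also be scripted instead of using a graphical client. The sketch below is one possible approach, not an official tool: it assumes the third-party PyMySQL driver is installed (any DB-API compatible driver should work similarly), and the helper names `connect_guest` and `export_table` are hypothetical. The host, port, credentials and database name are the ones listed above.

```python
import csv
import re

HOST = "relational.fit.cvut.cz"
PORT = 3306
USER = "guest"
PASSWORD = "relational"
DATABASE = "Carcinogenesis"


def connect_guest():
    """Open a connection with the public guest credentials (needs network access)."""
    import pymysql  # third-party driver; imported lazily so the module loads without it

    return pymysql.connect(
        host=HOST, port=PORT, user=USER, password=PASSWORD, database=DATABASE
    )


def export_table(conn, table, out):
    """Write every row of `table` to the file-like object `out` as CSV.

    Returns the number of data rows written (the header is not counted).
    """
    # Table names cannot be bound as query parameters, so validate before interpolating.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", table):
        raise ValueError(f"unsafe table name: {table!r}")
    cur = conn.cursor()
    cur.execute(f"SELECT * FROM {table}")
    writer = csv.writer(out)
    writer.writerow([col[0] for col in cur.description])  # header from column names
    rows = cur.fetchall()
    writer.writerows(rows)
    return len(rows)


# Usage (requires network access and PyMySQL):
# conn = connect_guest()
# with open("canc.csv", "w", newline="") as f:
#     export_table(conn, "canc", f)
```

Because `export_table` only relies on the generic DB-API cursor interface, it can be tried out against a local database before pointing it at the public server.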