Page Blocks Classification

Dataset

Page Blocks Classification Data Set

Data Set Information

The 5473 examples come from 54 distinct documents. Each observation concerns one block. All attributes are numeric. The data are in a format readable by C4.5.

Abstract

Data Set Characteristics: Multivariate
Attribute Characteristics: Integer, Real
Number of Attributes: 10
Number of Instances: 5473
Associated Tasks: Classification

Source

  • Original Owner:

    Donato Malerba, Dipartimento di Informatica, University of Bari

  • Donor:

    Donato Malerba

Result

Accuracy is measured on the test subset (30% of instances).

| Model | Accuracy | Training Time (mm:ss.ms) |
|---|---|---|
| Decision Tree (scikit-learn) | 0.9622 | 00:00.035 |
| Decision Tree (from scratch) | 0.9608 | 05:58.906 |
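
The measurement above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual script: `MajorityClassifier`, `train_test_split`, and `evaluate` are hypothetical helpers written for this example, and the data is a synthetic placeholder rather than the real Page Blocks file. It only assumes the model exposes `fit` and `predict`, which scikit-learn's `DecisionTreeClassifier` also does.

```python
import random
import time
from collections import Counter


class MajorityClassifier:
    """Hypothetical stand-in model: always predicts the most common training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def train_test_split(X, y, test_fraction=0.3, seed=0):
    """Shuffle row indices and hold out `test_fraction` of the rows for testing."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(X) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return ([X[i] for i in train_idx], [y[i] for i in train_idx],
            [X[i] for i in test_idx], [y[i] for i in test_idx])


def evaluate(model, X, y, test_fraction=0.3, seed=0):
    """Return (test accuracy, training time in seconds) for one hold-out split."""
    X_train, y_train, X_test, y_test = train_test_split(X, y, test_fraction, seed)
    start = time.perf_counter()
    model.fit(X_train, y_train)  # only fitting is timed, not prediction
    elapsed = time.perf_counter() - start
    predictions = model.predict(X_test)
    correct = sum(p == t for p, t in zip(predictions, y_test))
    return correct / len(y_test), elapsed


# Synthetic placeholder data: 10 numeric attributes per row, as in the data set.
rng = random.Random(42)
X = [[rng.random() for _ in range(10)] for _ in range(200)]
y = [1 if row[0] > 0.2 else 2 for row in X]

accuracy, seconds = evaluate(MajorityClassifier(), X, y)
print(f"accuracy={accuracy:.4f} training_time={seconds:.3f}s")
```

Timing only the `fit` call keeps the "Training Time" column comparable between models, since prediction cost is excluded for both.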

Possible improvements (Decision Tree Algorithm)

  • Add support for missing (or unseen) attributes
  • Prune the tree to prevent overfitting
  • Add support for regression
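
As a rough sketch of the first two items, the toy tree below shows one possible approach. It is not the repository's implementation, and every function name and the tiny data set are illustrative assumptions. Pruning is approximated here by pre-pruning (the hypothetical `max_depth` and `min_samples` limits) rather than post-pruning, and a missing attribute value (represented as `None`) is handled by stopping at the current node and returning its majority class.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def best_split(X, y):
    """Return the (feature, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(y)
    for f in range(len(X[0])):
        # Skip the largest value so both sides of the split are non-empty.
        for t in sorted({row[f] for row in X})[:-1]:
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if score < best_score:
                best, best_score = (f, t), score
    return best


def build(X, y, depth=0, max_depth=3, min_samples=2):
    """Grow a tree; max_depth and min_samples act as pre-pruning limits."""
    node = {"majority": Counter(y).most_common(1)[0][0]}
    if depth >= max_depth or len(y) < min_samples:
        return node
    split = best_split(X, y)
    if split is None:  # no split improves impurity, so keep this node as a leaf
        return node
    f, t = split
    node["feature"], node["threshold"] = f, t
    left = [(r, lab) for r, lab in zip(X, y) if r[f] <= t]
    right = [(r, lab) for r, lab in zip(X, y) if r[f] > t]
    node["left"] = build([r for r, _ in left], [lab for _, lab in left],
                         depth + 1, max_depth, min_samples)
    node["right"] = build([r for r, _ in right], [lab for _, lab in right],
                          depth + 1, max_depth, min_samples)
    return node


def predict(node, row):
    """Walk the tree; if the needed value is missing, answer with the node majority."""
    while "feature" in node:
        value = row[node["feature"]]
        if value is None:
            break
        node = node["left"] if value <= node["threshold"] else node["right"]
    return node["majority"]


# Illustrative toy data with two numeric attributes and made-up labels.
X = [[1.0, 5.0], [2.0, 4.0], [3.0, 5.0], [8.0, 4.0],
     [9.0, 5.0], [10.0, 4.0], [11.0, 5.0]]
y = ["text", "text", "text", "picture", "picture", "picture", "picture"]

tree = build(X, y)
print(predict(tree, [2.5, 4.0]))   # falls in the left branch
print(predict(tree, [None, 4.0]))  # missing value: root majority is used
```

Storing the majority class on every node, not just on leaves, is what makes the missing-value fallback cheap. The same stored value would also give a post-pruning pass something to collapse a subtree into.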