Classification Homework


last updated 23-Jun-2021

Objectives

Become comfortable using Weka, a data mining tool with a graphical user interface (download link). Do not use Python or R.

Create decision trees with some relatively simple, but clean, datasets.

Experiment with the different hyperparameters of the classification algorithm to minimize the error rates of the trees.

Procedure

Be sure you have Weka installed on your computer.

Download the following two data files:

These .csv or .arff files can be loaded into Weka.

Develop decision trees using the C4.5 algorithm (Weka's implementation is called J48). This will be demonstrated in class.

Experiment with different parameters to find a reasonably simple decision tree with the best accuracy you can achieve.

 

Write Up

In a single Word document with your name:

For each data set:

  1. capture several (4-5) different trees produced by a variety of algorithm parameter settings you chose
    • try to capture the graphical representation
    • capture the error analysis as well
  2. explain, as best you can, why you chose the "best" tree and how the parameters led you to that choice.
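
When reading the error analysis, it can help to see how an error rate follows from a confusion matrix. Weka reports these numbers for you in its classifier output, so nothing below is needed to complete the homework. It is only a minimal sketch of the arithmetic, written in Java because Weka itself is a Java tool. The class name, method names, and counts are all invented for illustration and do not come from the homework datasets.

```java
// Hypothetical illustration only: the counts below are made up,
// not results from the homework datasets.
class ConfusionMatrixDemo {

    /** Sum of the diagonal: instances whose predicted class equals the actual class. */
    static int correct(int[][] matrix) {
        int sum = 0;
        for (int i = 0; i < matrix.length; i++) {
            sum += matrix[i][i];
        }
        return sum;
    }

    /** Sum of every cell: all classified instances. */
    static int total(int[][] matrix) {
        int sum = 0;
        for (int[] row : matrix) {
            for (int count : row) {
                sum += count;
            }
        }
        return sum;
    }

    /** Fraction of instances that were misclassified (off-diagonal count / total). */
    static double errorRate(int[][] matrix) {
        int n = total(matrix);
        return (double) (n - correct(matrix)) / n;
    }

    public static void main(String[] args) {
        // Assumed layout: rows = actual class, columns = predicted class.
        int[][] matrix = { {50, 10}, {5, 35} };
        System.out.printf("correct=%d of %d, error rate=%.2f%n",
                correct(matrix), total(matrix), errorRate(matrix));
    }
}
```

Doing the same calculation by hand for each of your 4-5 trees gives you one comparable number per tree to cite when you justify your choice.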

Upload your Word document to Moodle.