DBDT
Distance-based
Decision Tree Learning
Copyright
2004-2005 The MIP Group:
Estruch-Gregori, Vicent; Ferri-Ramírez, Cèsar; Hernandez-Orallo, José; Martínez-Plumed, Fernando;
Ramirez-Quintana, M. José
Presentation:
DBDT is a machine learning algorithm that integrates decision tree learning and center splitting.
Roughly speaking, the inferred classifier can be viewed as a tree of attribute
prototypes: the value distribution of an attribute is represented by a set of
prototypes, and an instance is linked to one prototype or another depending on its
proximity.
ProbDBDT is a variation of DBDT that uses probability-based distances.
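The idea of linking an instance to its closest prototype can be illustrated with a minimal sketch. This is not the DBDT source code: the class name CenterSplitSketch, its methods, and the use of the Euclidean distance are assumptions made for illustration only, and the prototypes are simply given as input rather than learned.

```java
import java.util.Arrays;

/** Hypothetical sketch: route attribute values to the nearest prototype. */
final class CenterSplitSketch {

    /** Euclidean distance; an assumed stand-in for the attribute's own metric. */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Index of the prototype closest to the value (ties go to the lowest index). */
    static int nearestPrototype(double[] value, double[][] prototypes) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int p = 0; p < prototypes.length; p++) {
            double d = distance(value, prototypes[p]);
            if (d < bestDist) {
                bestDist = d;
                best = p;
            }
        }
        return best;
    }

    /** Child (prototype) index for every value of the attribute. */
    static int[] split(double[][] values, double[][] prototypes) {
        int[] child = new int[values.length];
        for (int i = 0; i < values.length; i++) {
            child[i] = nearestPrototype(values[i], prototypes);
        }
        return child;
    }

    public static void main(String[] args) {
        double[][] prototypes = {{0.0, 0.0}, {10.0, 10.0}};
        double[][] values = {{1.0, 2.0}, {9.0, 8.0}, {4.0, 4.0}};
        // Each value is sent to the child of its closest prototype.
        System.out.println(Arrays.toString(split(values, prototypes))); // prints [0, 1, 0]
    }
}
```

In a tree built this way, each prototype would correspond to one child of a node, so the number of prototypes per attribute determines the branching factor.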
The System:
You can download the whole
system package for academic use under the following conditions:
DISCLAIMER & COPYRIGHT: The software has been checked on several Intel-based
machines (PCs) under different versions of MS Windows (2000, XP). You can make
any modification to the software, provided you always
make the changes explicit and refer to its original authors. We are
not responsible for any damage caused by the use or misuse of this software. If
you find any bug, please contact the authors. For commercial use, *do* contact
the authors.
Source and Executable Code:
- Windows stable version: DBDT & ProbDBDT: DBDT has been implemented in JBuilder using the WEKA
libraries. The GUI uses a non-standard Java layout, so compatibility
with other platforms is not guaranteed. A JBuilder release version is usually
available here.
Getting started
- Select the Load Files panel to load a new data set (see figure below). Two
files are required: an *.arff file and metric_space.txt, which
stores the distance information. If the latter does not
exist, the system can compute it automatically: click on the second File
menu entry (Save Ms File from arff).
- Once the sample file is loaded, run the DBDT algorithm: select the Experimenter panel and
click on the Run button. Remember to set up the parameters you are interested in
(Accuracy and/or AUC).
- To save the results to a file, press the Save As
button. The Clear button clears the editing area.
Note that the
Weka classes are required as well. The application links them from a default path
(c:/weka3-4). The latest version of the library can be obtained from the
Weka project.
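If the application fails to start, it may help to confirm that the Weka classes can actually be found. The following is only a sketch of such a check and is not part of the DBDT package: the class WekaCheckSketch and its method are hypothetical, the directory is the default path mentioned above, and weka.core.Instances is used merely as a representative Weka class.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical start-up check; not part of the DBDT distribution. */
final class WekaCheckSketch {

    /** True if the named class can be located (without initializing it). */
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, WekaCheckSketch.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Default location from which the application is said to link Weka.
        Path defaultDir = Paths.get("c:/weka3-4");
        System.out.println("Default Weka directory present: " + Files.isDirectory(defaultDir));
        System.out.println("weka.core.Instances loadable: " + isOnClasspath("weka.core.Instances"));
    }
}
```

The second line reports true only when the Weka classes are on the classpath used to run the check, which need not be the same classpath the DBDT application uses.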
SAMPLE DATASETS:
Many example datasets in DBDT (and ProbDBDT) format (*.arff file + metric_space.txt) can be found here. If you have no
examples in DBDT format, please download them, because they will be required.
What should it look like?
After loading and running the
Java project, the application should look as follows.

Current Features
Experiment Settings
- Several cross-validation runs.
- Detailed performance achieved in each fold (mistakes, ratio, etc.).
- Accuracy and AUC (only for two-class problems) evaluation.
- The ID3 and C4.5 methods (pruning option disabled) can also be
invoked for non-structured problems.
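For reference, the two evaluation measures listed above can be computed as in the following sketch. It uses the standard definitions (accuracy as the fraction of correct predictions, AUC as the fraction of positive/negative pairs ranked correctly, with ties counted as one half) and is not necessarily how DBDT computes them; the class EvaluationSketch and its method names are hypothetical.

```java
/** Hypothetical sketch of accuracy and two-class AUC; not the DBDT implementation. */
final class EvaluationSketch {

    /** Fraction of instances whose predicted class equals the actual class. */
    static double accuracy(int[] actual, int[] predicted) {
        if (actual.length != predicted.length || actual.length == 0) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
        int correct = 0;
        for (int i = 0; i < actual.length; i++) {
            if (actual[i] == predicted[i]) {
                correct++;
            }
        }
        return (double) correct / actual.length;
    }

    /**
     * AUC for a two-class problem: probability that a random positive instance
     * receives a higher score than a random negative one (ties count 0.5).
     * Returns NaN if either class is absent.
     */
    static double auc(boolean[] positive, double[] score) {
        double wins = 0.0;
        long pairs = 0;
        for (int i = 0; i < positive.length; i++) {
            if (!positive[i]) continue;
            for (int j = 0; j < positive.length; j++) {
                if (positive[j]) continue;
                pairs++;
                if (score[i] > score[j]) {
                    wins += 1.0;
                } else if (score[i] == score[j]) {
                    wins += 0.5;
                }
            }
        }
        return pairs == 0 ? Double.NaN : wins / pairs;
    }

    public static void main(String[] args) {
        int[] actual = {1, 0, 1, 0};
        int[] predicted = {1, 1, 1, 0};
        boolean[] positive = {true, false, true, false};
        double[] score = {0.9, 0.8, 0.3, 0.1};
        System.out.println("Accuracy: " + accuracy(actual, predicted)); // prints Accuracy: 0.75
        System.out.println("AUC: " + auc(positive, score));             // prints AUC: 0.75
    }
}
```

The pairwise AUC computation is quadratic in the number of instances; it is chosen here for clarity rather than speed.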
DBDT Settings
- Distance-based criterion for instance classification (proximity).
- Naive density-based criterion for instance classification
(density).
- Number of children for each node.
Future Features:
- Enhance the comprehensibility of the model.
- Include pruning techniques.
- Develop heuristic functions based on MML/MDL criteria.
- Access to databases.
- Program a distributed version of the DBDT algorithm.
© 2004-2005 Vicent Estruch-Gregori.