The talk will introduce the most important issues in data mining, as
well as the context and reasons for its emergence and increasing
popularity. The motivation for this new discipline is introduced by
relating it to other database technologies, such as data warehousing
and OLAP, and by clearly stating the key differences between data
mining and the other database exploitation tools that, taken together,
are usually referred to as business intelligence. Some assorted
examples will pave the way to a clearer view and definition of data
mining and of its role in the context of knowledge discovery from
databases (KDD). The KDD process will be illustrated through its main
stages: data integration, data preparation, data mining, model
evaluation and interpretation, knowledge deployment and model
monitoring. Some predictive and descriptive data mining techniques
will be briefly discussed: decision trees, rule learning, neural
networks, linear and non-linear regression, Bayesian methods, frequent
itemset algorithms, clustering techniques, etc. Finally, some ideas on
data mining methodologies, such as CRISP-DM, as well as demos and
examples of data mining packages, will give a more direct flavour of
what applying data mining really means.
PART I (PDF)
PART II (PDF)
PART III (PDF)
PART IV (PDF)