- Data mining is more accurately called “knowledge mining from data”. Related terms include knowledge extraction and data/pattern analysis.
- Knowledge extraction/discovery is a sequence of steps:
1. Data cleaning (remove noise and inconsistent data)
2. Data integration (from multiple sources)
3. Data selection
4. Data transformation (into a specific form)
5. Data mining (extract data patterns)
6. Pattern evaluation
7. Knowledge presentation
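
The steps above can be sketched on toy data. This is only an illustration, not a standard implementation: the two sources, their field names, and the "top city by spend" pattern are all hypothetical.

```python
# Two hypothetical sources describing the same customers (made-up data).
source_a = [
    {"id": 1, "age": 34, "city": "Oslo"},
    {"id": 2, "age": None, "city": "Bergen"},  # inconsistent record: missing age
    {"id": 3, "age": 51, "city": "Oslo"},
    {"id": 4, "age": 45, "city": "Bergen"},
]
source_b = [
    {"id": 1, "spend": 120.0},
    {"id": 2, "spend": 80.0},
    {"id": 3, "spend": 340.0},
    {"id": 4, "spend": 200.0},
]

# 1. Data cleaning: drop records that contain missing values.
cleaned = [r for r in source_a if all(v is not None for v in r.values())]

# 2. Data integration: join the two sources on the shared "id" field.
spend_by_id = {r["id"]: r["spend"] for r in source_b}
integrated = [
    {**r, "spend": spend_by_id[r["id"]]} for r in cleaned if r["id"] in spend_by_id
]

# 3. Data selection: keep only the attributes relevant to the task.
selected = [{"city": r["city"], "spend": r["spend"]} for r in integrated]

# 4. Data transformation: aggregate spend per city.
totals = {}
for r in selected:
    totals[r["city"]] = totals.get(r["city"], 0.0) + r["spend"]

# 5. Data mining: a deliberately trivial "pattern", the city with the highest total.
top_city = max(totals, key=totals.get)

# 6-7. Pattern evaluation and knowledge presentation: report the finding.
print(f"Highest total spend: {top_city} ({totals[top_city]:.2f})")
```

Real pipelines usually iterate over these steps rather than running them once in order.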
- Classification is the process of finding a model/function that describes and distinguishes data classes/concepts, so that the model can be used to predict the class of objects whose class label is unknown. The derived model is based on the analysis of a set of training data whose class labels are known.
- Cluster analysis. Unlike classification and prediction, which analyze class-labeled data, clustering analyzes data objects without consulting a known class label.
- Data preparation, such as data normalization.
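
To illustrate the classification note, here is a minimal sketch of a nearest-centroid classifier. It is one simple model among many, chosen for brevity. The function names `train_centroids` and `predict` and the toy data are assumptions for this example.

```python
import math
from collections import defaultdict


def train_centroids(samples, labels):
    """Derive the model from labeled training data: one mean point per class."""
    sums = {}
    counts = defaultdict(int)
    for x, y in zip(samples, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
        for i, v in enumerate(x):
            sums[y][i] += v
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict(centroids, x):
    """Predict the class of an unlabeled object: the class with the closest centroid."""
    return min(centroids, key=lambda y: math.dist(centroids[y], x))


# Training data whose class labels are known (made-up values).
train_x = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5)]
train_y = ["low", "low", "high", "high"]

model = train_centroids(train_x, train_y)
print(predict(model, (2.0, 1.0)))
print(predict(model, (7.5, 9.0)))
```

The training step uses the labels and the prediction step does not. That split is the distinction the note draws.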
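
For cluster analysis, a small k-means sketch shows objects being grouped with no class labels at all. k-means is only one clustering method. The naive initialization (the first `k` points) and the fixed iteration count are simplifying assumptions, and the data is made up.

```python
import math


def k_means(points, k, iterations=10):
    """Group points into k clusters without using any class labels."""
    # Naive initialization (an assumption for brevity): the first k points.
    centroids = [list(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignments, centroids


points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.0), (1.0, 0.5), (9.0, 9.5)]
assignments, centroids = k_means(points, k=2)
print(assignments)
```

The cluster indices carry no meaning by themselves. Interpreting what each group represents is left to the analyst.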
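
As an example of normalization, min-max scaling maps each value linearly into a target range, by default [0, 1]. This is a minimal sketch, with `min_max_normalize` as a hypothetical helper name and made-up sample values.

```python
def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values so the smallest maps to new_min and the largest to new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical: avoid dividing by zero.
        return [new_min for _ in values]
    return [new_min + (v - lo) / (hi - lo) * (new_max - new_min) for v in values]


incomes = [12000, 73600, 98000]
normalized = min_max_normalize(incomes)
print(normalized)
```

Scaling attributes to a common range keeps an attribute with large raw values, such as income, from dominating distance-based methods.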