Contents
1. Motivation:
A. Explosive growth of data:
Sources of abundant data: business, science, society, and everyone.
B. Turn Data into Value and Knowledge:
User Opinions: blogs, social networks, query logs
Health Status: body temperature, body weight, age, gender
System Diagnosis: network traffic, software logs, CPU usage, power consumption
diagnosis [ˌdaɪəɡˈnəʊsɪs]: identifying the cause of a problem
consumption [kənˈsʌmpʃn]: the using up of a resource; spending
2. Definition and Procedure:
A. Definition:
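Non-trivial extraction of implicit, previously unknown, and potentially useful information from data.
Definition [ˌdefɪˈnɪʃn]: a statement of what a term means
Trivial [ˈtrɪviəl]: minor, unimportant
Non-trivial: not easy to achieve; involving some complexity
Extraction [ɪkˈstrækʃn]: the act of pulling something out
Implicit [ɪmˈplɪsɪt]: contained within, not stated directly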
B. Procedure:
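Data sources -> Data preprocessing -> Data exploration -> Data mining -> Data visualization -> Decision making
Integration: combining data from several sources into one
Data Warehouse: a repository that stores integrated data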
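
The stages of the procedure above can be sketched as a chain of small functions. This is only an illustration under stated assumptions: the CPU-usage readings, the function names, and the choice of a simple "more than two standard deviations from the mean" rule for the mining stage are all hypothetical and do not come from the notes.

```python
from statistics import mean, stdev

# Hypothetical readings from two data sources (CPU usage in percent).
# None marks a missing value.
source_a = [41.0, 39.5, None, 40.2, 42.1]
source_b = [40.8, 95.0, 38.9, None, 41.3]

def preprocess(*sources):
    """Preprocessing: integrate the sources and drop missing values."""
    return [x for src in sources for x in src if x is not None]

def explore(values):
    """Exploration: summarize the data before mining it."""
    return {"count": len(values), "mean": mean(values), "stdev": stdev(values)}

def mine(values, summary, threshold=2.0):
    """Mining (stand-in): flag values far from the mean."""
    limit = threshold * summary["stdev"]
    return [x for x in values if abs(x - summary["mean"]) > limit]

def visualize(values, flagged):
    """Visualization: a text bar chart that marks flagged values."""
    lines = []
    for x in values:
        mark = " <- unusual" if x in flagged else ""
        lines.append(f"{x:6.1f} {'#' * int(x // 5)}{mark}")
    return "\n".join(lines)

def decide(flagged):
    """Decision: a trivial rule based on the mining result."""
    return "investigate" if flagged else "no action"

values = preprocess(source_a, source_b)
summary = explore(values)
flagged = mine(values, summary)
print(visualize(values, flagged))
print(decide(flagged))
```

Each function maps to one arrow in the chain, so any stage can be replaced (for example, swapping the mining rule for clustering or classification) without touching the others.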
3. What we are going to learn:
A. Simple Introduction to Data Exploration:
B. Association Rule Mining:
C. Clustering:
D. Classification:
E. Anomaly Detection:
F. Link Analysis:
G. Recommendation Systems:
H. Decision Support:
I. Evaluation of Knowledge:
Anomaly [əˈnɒməli]: something abnormal or unexpected
Link Analysis: the analysis of links (connections) between items
Evaluation [ɪˌvæljuˈeɪʃn]: appraisal, assessment