Unit 1: Components of the decision-making process
Business intelligence, decision support systems, data warehousing.
Unit 2: Data analysis and exploration
Mathematical models for decision making, data mining, data preparation, data exploration.
Unit 3: Introduction to Big Data and the Hadoop Ecosystem
Big data definition, elements of big data, big data analytics, big data stack, virtualization and
big data, virtualization approaches, Hadoop ecosystem, Hadoop Distributed File System (HDFS),
MapReduce, Hadoop YARN, HBase, Hive, Pig and Pig Latin, Sqoop, ZooKeeper, Flume, Oozie.
Unit 4: Data mining tasks
Regression and association rules - structure of regression models, simple linear regression, and
multiple linear regression.
Classification - classification problems, classification models, classification trees, Bayesian
methods.
Unit 5: Association rules and clustering
Structure of association rules, single-dimension association rules, Apriori algorithm, general
association rules. Clustering - clustering methods, partition methods, hierarchical methods.
Unit 6: Exploring R
Basic features of R, exploring RGui, working with vectors, handling data in the R workspace.
Reading datasets and exporting data from R, Manipulating and processing data in R.