htw saar

Data Science

Module name (EN): Data Science
Degree programme: Computer Science and Communication Systems, Master, ASPO 01.10.2017
Module code: KIM-DS
Hours per semester week / Teaching method: 3V+1U (3 hours of lectures and 1 hour of exercises; 4 hours per week)
ECTS credits: 6
Semester: according to optional course list
Mandatory course: no
Language of instruction:
German
Assessment:
Written exam
Curricular relevance:
KIM-DS Computer Science and Communication Systems, Master, ASPO 01.10.2017, optional course, informatics specific
PIM-DS Applied Informatics, Master, ASPO 01.10.2017, semester 1, mandatory course
Workload:
60 class hours (= 45 clock hours) over a 15-week period.
The total student study time is 180 hours (equivalent to 6 ECTS credits).
There are therefore 135 hours available for class preparation and follow-up work and exam preparation.
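The workload figures above can be reproduced with a short calculation. The following is a minimal sketch, not part of the official module description: it assumes that one class hour lasts 45 minutes and that one ECTS credit corresponds to 30 hours of student work, both inferred from the numbers given here. The function name `workload` is hypothetical.

```python
# Workload check for a 6-ECTS module taught 4 hours per week over 15 weeks.
# Assumptions (inferred from the figures in the module description, not
# stated there explicitly): a class hour is 45 minutes long and one ECTS
# credit corresponds to 30 hours of student work.
CLASS_HOUR_MINUTES = 45
HOURS_PER_ECTS = 30


def workload(hours_per_week, weeks, ects):
    """Return (class hours, clock hours in class, total hours, self-study hours)."""
    class_hours = hours_per_week * weeks
    contact_hours = class_hours * CLASS_HOUR_MINUTES / 60
    total_hours = ects * HOURS_PER_ECTS
    self_study_hours = total_hours - contact_hours
    return class_hours, contact_hours, total_hours, self_study_hours


print(workload(4, 15, 6))
```

With 4 hours per week, 15 weeks and 6 ECTS credits, this yields 60 class hours, 45 clock hours in class, 180 hours in total and 135 hours of self-study, matching the figures stated above.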
Recommended prerequisites (modules):
None.
Recommended as prerequisite for:
Module coordinator:
Prof. Dr. Klaus Berberich
Lecturer: Prof. Dr. Klaus Berberich

[updated 05.12.2019]
Learning outcomes:
After successfully completing this module, students will be able to use suitable data analysis methods to gain insights that support decision-making on practical questions. They will be familiar with important data analysis methods and with different types of features (e.g. nominal, ordinal, metric), and will be able to preprocess data appropriately (e.g. by normalization or standardization). Students will be able to select an appropriate method (e.g. regression or classification) for a specific problem. They will be able to implement the methods they have learned in a suitable programming language (e.g. Python or R) or use an available implementation. Students will be able to systematically determine the parameters of the applied methods on the basis of available data and critically assess the quality of their results. They will be able to present the insights gained from the data appropriately (e.g. as visualizations) so that they are understandable to both technical and non-technical audiences (e.g. decision-makers in a company).

[updated 24.02.2018]
Module content:
1. Introduction
 
2. Regression
2.1 Linear regression
2.2 Feature transformation
2.3 Regularization
 
3. Classification
3.1 Logistic regression
3.2 Decision trees
3.3 Naive Bayes
3.4 Support vector machines
 
4. Cluster analysis
4.1 Representative method (k-means and k-medoids)
4.2 Hierarchical method
4.3 Density-based method
 
5. Neural networks
5.1 Perceptron
5.2 Multi-layer neural networks (MLPs)
5.3 Convolutional neural networks (CNNs)
5.4 Recurrent neural networks (RNNs)
 
6. Association rule learning
6.1 Finding frequent item sets (Apriori and FP-Growth)
6.2 Determining association rules
6.3 Finding frequent sequences (GSP and PrefixSpan)
6.4 Finding frequent strings
6.5 Finding frequent subgraphs
 
7. Data visualization


[updated 24.02.2018]
Teaching methods/Media:
Slides, practical and theoretical exercises

[updated 24.02.2018]
Recommended or required reading:
Aggarwal C.: Data Mining - The Textbook, Springer, 2015
 
Harrington P.: Machine Learning in Action, Manning, 2012
 
Kelleher J., Mac Namee B. and D'Arcy A.: Fundamentals of Machine Learning for Predictive Data Analytics, MIT Press, 2015
 
Provost F. and Fawcett T.: Data Science for Business, O'Reilly, 2013
 
Raschka S.: Machine Learning mit Python, mitp, 2017
 
Zaki M. J. and Meira W. Jr.: Data Mining and Analysis: Fundamental Concepts and Algorithms, Cambridge University Press, 2014

[updated 24.02.2018]
Module offered in:
WS 2020/21 (expected), WS 2019/20