
Machine Learning

Module name (EN):
Name of module in study programme. It should be precise and clear.
Machine Learning
Degree programme:
Study Programme with validity of corresponding study regulations containing this module.
Applied Informatics, Bachelor, ASPO 01.10.2011
Module code: PIBWI19
SAP-Submodule-No.:
The exam administration creates a SAP-Submodule-No for every exam type in every module. The SAP-Submodule-No is equal for the same module in different study programs.
P221-0085, P610-0536
Hours per semester week / Teaching method:
The number of hours per week is a combination of lecture (V for German Vorlesung), exercise (U for Übung), practice (P) or project (PA). For example, a course of the form 2V+2U has 2 hours of lecture and 2 hours of exercise per week.
2V+2U (4 hours per week)
ECTS credits:
European Credit Transfer System. Points for successful completion of a course. Each ECTS point represents a workload of 30 hours.
5
Semester: 6
Mandatory course: no
Language of instruction:
English
Assessment:
Written exam

[updated 26.02.2018]
Applicability / Curricular relevance:
All study programs (with year of the version of study regulations) containing the course.

KI575 (P221-0085) Computer Science and Communication Systems, Bachelor, ASPO 01.10.2014 , semester 6, optional course, technical
KIB-MLRN (P222-0119) Computer Science and Communication Systems, Bachelor, ASPO 01.10.2017 , semester 6, optional course, technical
PIBWI19 (P221-0085, P610-0536) Applied Informatics, Bachelor, ASPO 01.10.2011 , semester 6, optional course, informatics specific
PIB-MLRN (P221-0085) Applied Informatics, Bachelor, ASPO 01.10.2017 , semester 6, optional course, informatics specific

Suitable for exchange students (learning agreement)
Workload:
Workload of student for successfully completing the course. Each ECTS credit represents 30 working hours. These are the combined effort of face-to-face time, post-processing the subject of the lecture, exercises and preparation for the exam.

The total workload is distributed over the semester (01.04.-30.09. during the summer term, 01.10.-31.03. during the winter term).
60 class hours (= 45 clock hours) over a 15-week period.
The total student study time is 150 hours (equivalent to 5 ECTS credits).
There are therefore 105 hours available for class preparation, follow-up work, and exam preparation.
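
The workload figures above can be checked with a few lines of arithmetic. The sketch below only restates the numbers given in this description; the 45-minute length of a class hour is an assumption inferred from "60 class hours (= 45 clock hours)", and all variable names are illustrative.

```python
# Workload arithmetic for this module, using the figures stated above.
ECTS_CREDITS = 5
HOURS_PER_ECTS = 30           # each ECTS credit represents 30 working hours
CLASS_HOURS_PER_WEEK = 4      # 2V+2U: 2 lecture hours + 2 exercise hours
WEEKS = 15
MINUTES_PER_CLASS_HOUR = 45   # assumption: inferred from 60 class hours = 45 clock hours

total_hours = ECTS_CREDITS * HOURS_PER_ECTS                       # total student study time
class_hours = CLASS_HOURS_PER_WEEK * WEEKS                        # class hours per semester
contact_clock_hours = class_hours * MINUTES_PER_CLASS_HOUR / 60   # face-to-face clock hours
self_study_hours = total_hours - contact_clock_hours              # preparation, follow-up, exam

print(total_hours, class_hours, contact_clock_hours, self_study_hours)
```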
Recommended prerequisites (modules):
PIB115 Fundamentals of Informatics
PIB120 Programming 1
PIB125 Mathematics 1
PIB215 Mathematics 2
PIB315 Mathematics 3
PIB330 Databases


[updated 02.03.2017]
Recommended as prerequisite for:
Module coordinator:
Prof. Dr. Klaus Berberich
Lecturer: Prof. Dr. Klaus Berberich

[updated 10.02.2017]
Learning outcomes:
After successfully completing this module, students will know about fundamental supervised and unsupervised methods from machine learning. This includes methods for regression, classification, and clustering. Students will understand how these methods work and know how to use existing implementations (e.g., in libraries such as scikit-learn). Given a practical problem setting, they will be able to choose a suitable method, apply it to the dataset at hand, and assess the quality of the determined model. In addition, students will be aware of typical data-quality issues and know how to resolve them.

[updated 26.02.2018]
Module content:
Machine learning plays an increasingly important role with applications ranging from recognizing handwritten digits, via filtering out unwanted spam e-mails, to the ranking of results in modern search engines. After successfully completing this module, students will know about fundamental supervised and unsupervised methods of machine learning. We will look into how these methods are defined formally, including the mathematics behind them. Moreover, we will apply all methods to concrete datasets to solve practical problems. To do so, we will rely on existing libraries (e.g., scikit-learn) that provide efficient implementations of the methods. This course will be accompanied by theoretical exercises and project assignments. The exercises will help students to deepen their understanding of the methods, while the project assignments will encourage students to solve practical problems by applying their knowledge to real-world datasets.
 
1. Introduction
- What is Machine Learning?
- Applications
- Libraries
- Literature
 
2. Working with data
- Typical data formats (e.g., CSV, spreadsheets, databases)
- Data quality issues (e.g., outliers, duplicates)
- Scales of measurement (i.e., nominal, ordinal, numerical)
- Data pre-processing (in Python and using UNIX command line tools)
 
3. Regression
- Ordinary least squares
- Multiple linear regression
- Non-linear regression
- Evaluation
 
4. Classification
- Logistic regression
- k-nearest neighbors
- Naive Bayes
- Decision trees
- Neural networks
- Evaluation
 
5. Clustering
- k-means and k-medoids
- Hierarchical agglomerative/divisive clustering
- Evaluation
 
6. Outlook
- Ongoing research
- Competitions (e.g., Kaggle and KDD Cup)
- Other resources (e.g., KDnuggets)
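
As an illustration of the "Ordinary least squares" and "Evaluation" topics in the outline above, the following is a minimal sketch of fitting a model and assessing it on held-out data. It is not course material: the data is a made-up toy set, the function names are hypothetical, and it uses plain Python for self-containment, whereas the course itself relies on existing libraries such as scikit-learn.

```python
def fit_simple_ols(xs, ys):
    """Return (intercept, slope) of the line minimising the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form OLS solution for one predictor: slope = S_xy / S_xx
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def mean_squared_error(ys_true, ys_pred):
    """Average squared difference between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(ys_true, ys_pred)) / len(ys_true)


# Toy data (invented for illustration): points lying exactly on y = 2x + 1
train_x = [0.0, 1.0, 2.0, 3.0]
train_y = [1.0, 3.0, 5.0, 7.0]
test_x = [4.0, 5.0]
test_y = [9.0, 11.0]

# Fit on the training split, then assess the model on the held-out split
intercept, slope = fit_simple_ols(train_x, train_y)
predictions = [intercept + slope * x for x in test_x]
test_mse = mean_squared_error(test_y, predictions)

print(intercept, slope, test_mse)
```

Evaluating on points that were not used for fitting is the design choice that matters here: an error measured on the training data alone would not show how well the model generalises.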


[updated 26.02.2018]
Recommended or required reading:
P. Harrington: Machine Learning in Action, Manning, 2012
 
G. James, D. Witten, T. Hastie, R. Tibshirani: An Introduction to Statistical Learning - with Applications in R, Springer, 2015
 
A. C. Müller and S. Guido: Introduction to Machine Learning with Python, O'Reilly, 2017
 
M. J. Zaki and W. Meira Jr.: Data Mining and Analysis: Fundamental Concepts and Algorithms, Cambridge University Press, 2014

[updated 26.02.2018]
Module offered in:
SS 2022, SS 2021, SS 2020, SS 2019, SS 2018, ...