Semester: 1
ECTS: 5
Lectures: 30
Practice sessions: 30
Independent work: 90
Module Code: 23-02-506

Module title:


Data engineering

Lecturers and associates:


Mirko Talajić, Lecturer

Module overview:


High-quality data analysis requires careful preparation of the input data. The aim of the course is to demonstrate basic methods of data preparation, including cleaning, transforming, integrating, normalizing, and aggregating data, transforming time series, and handling missing values, as well as basic data reduction methods such as feature reduction, sample reduction, and discretization.

Literature:


Essential reading:
1. Crickard, P. (2020) Data Engineering with Python: Work with massive datasets to design data models and automate data pipelines using Python, Birmingham: Packt Publishing
2. Algebra University College (2020), Data Engineering Handbook, Zagreb: Algebra University College
3. Garcia, S., Luengo, J., Herrera, F. (2016) Data Preprocessing in Data Mining, Cham: Springer International Publishing
4. Balamurugan, A.S., Christopher, A.B. (2012) Insight into Data Preprocessing: Theory and Practice: Data Mining Perspective, Chisinau: LAP Lambert Academic Publishing

Further reading:
1. Chakrabarti, S., Cox, E., Eibe, F., Hartmut, R.G., Han, J., Jiang, X., Kamber, M., Lightstone, S.S. (2009) Data Mining: Know It All, Massachusetts: Morgan Kaufmann