Data Engineering (5 cr)

Code: R504D163

Credits

5 cr

Teaching language

  • English

Objective

You understand the goals and optimal balance of a dataset in machine learning
You can use common advanced dataset evaluation tools
You can perform common dataset distribution optimization operations
You can perform common feature engineering optimization operations for a dataset
You are aware of the advanced dataset optimization and analysis methods

Content

The role and practices of dataset optimization for machine learning models
Dataset evaluation tools and their usage
Distribution management
Feature engineering
Advanced tools and methods for dataset optimization and analysis

Qualifications

Basics of programming, basics of common Python data analytics modules/libraries, basics of conventional machine learning algorithms, basics of statistics

Assessment criteria, satisfactory (1)

You can assess a suitable amount of optimization for a dataset
You can use some of the common dataset evaluation tools
You can perform the most crucial distribution optimization operations
You can perform the most crucial feature engineering optimization operations
You are aware of the advanced dataset optimization and analysis tools

Assessment criteria, good (3)

You can assess a suitable amount of optimization for a dataset, and use this knowledge to guide your selection of tools and operations for a given dataset
You can use most of the common dataset evaluation tools
You can perform many of the common distribution optimization operations
You can perform many of the common feature engineering optimization operations
You can apply some of the advanced dataset optimization and analysis tools to your datasets

Assessment criteria, excellent (5)

You can assess a suitable amount of optimization for a dataset, and use this knowledge to guide your selection of tools and operations for a given dataset
You can use most of the common dataset evaluation tools, and some of the advanced tools as well
You can perform many of the common distribution optimization operations, and some of the advanced operations as well
You can perform many of the common feature engineering optimization operations, and some of the advanced operations as well
You can apply many of the advanced dataset optimization and analysis tools to your datasets