What does a Data Scientist do? - Facts

From the post about CRISP-DM, we know which phases a data science project consists of. This post analyzes the time spent on the different...

Association Analysis

The typical question behind Association Analysis, often also called Basket Analysis, is: Which products are bought together? This question...

Anomaly detection

A classic Data Science problem is to identify outliers, meaning anomalous behaviours or unexpectedly high or low values. These unusual...

Principal Component Analysis

Typical Data Science problems have a huge number of input features. They are often deciding factors for model performance and make it difficult...

Data Scientists - Skills

It is obvious that a data scientist should have an interest in data, a strong analytical and mathematical background, and a good intuition on...

The Data Mining Process

The Cross Industry Standard Process for Data Mining (CRISP-DM) is a process model that describes the different steps data scientists use...

k-Means - how to choose k

Once you have understood how the basics of the k-Means algorithm work, you will be wondering: Into how many clusters should I divide the...

k-Means - How to initialize

When you study the k-Means algorithm and understand how it works, a natural question that arises is: How can you choose the starting points...

The k-Means Algorithm - Basics

Unsupervised Learning tries to find structures in datasets; one method to do so is clustering the data. The most popular and widely used...