Use Case HR - Reduce employee attrition and make talents stay longer (Part 1: Data Analysis)

by Rupert Schiessl


Retaining key employees is a major challenge for every organization. But are there reliable ways to figure out if and why the best and most experienced employees are leaving prematurely?

Most marketing departments already use their data to understand which customers are most likely to churn, and they act on that information with targeted retention efforts. HR departments still have some progress to make to reach that level of analytics.

If you want to dive a little deeper into predictive analytics for HR, we recently dedicated a post to this topic.

In this case study we show how you can use Verteego Data Science Suite to understand why key people are leaving and to send regular messages to your management teams that help them take preventive action.

Data set

Our data set represents 15,000 employees and includes both employees with currently running contracts and people who have already left the company.

You can download the whole data set here.

HR data 1

Data discovery

We used the Cleaning module of Verteego Data Science Suite to dig into the different columns of the data set and check out what it's made of.

Fields in the data set include:

  • name
    The last name of the employee.
  • satisfaction_level
    Employee satisfaction level. Ranges between 0 and 1.
  • last_evaluation
    The grade the employee got at their last evaluation. Ranges between 0 and 1.
  • number_projects
    The number of projects the employee is currently working on.
  • average_monthly_hours
    The number of monthly hours the employee is working.
  • time_spent_company
    The number of years the employee has been working for the company.
  • work_accident
    Whether the employee has already had a work accident in the past (1 for yes, 0 for no)
  • promotion_last_5_years
    Whether the employee has been promoted during the last 5 years (1 for yes, 0 for no)
  • department
    The department the employee is working for
  • salary
    The current level of salary of the employee (3 categories : high, medium, low)
  • left
    Whether the employee has left. 0 if the employee is still working for the company, 1 if not.
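
The field descriptions above can be turned into a quick sanity check before any analysis. The following is only a sketch in pandas: the three sample rows are invented for illustration, and the helper name check_schema is hypothetical and not part of Verteego Data Science Suite.

```python
import pandas as pd

# Invented sample rows that follow the field descriptions above (not real data).
employees = pd.DataFrame({
    "name": ["Smith", "Jones", "Garcia"],
    "satisfaction_level": [0.38, 0.80, 0.11],
    "last_evaluation": [0.53, 0.86, 0.88],
    "number_projects": [2, 5, 7],
    "average_monthly_hours": [157, 262, 272],
    "time_spent_company": [3, 6, 4],
    "work_accident": [0, 0, 0],
    "promotion_last_5_years": [0, 0, 0],
    "department": ["sales", "sales", "support"],
    "salary": ["low", "medium", "medium"],
    "left": [1, 1, 0],
})


def check_schema(df):
    """Return a list of problems; an empty list means the frame matches the field list."""
    problems = []
    # Scores are documented as ranging between 0 and 1.
    for col in ("satisfaction_level", "last_evaluation"):
        if not df[col].between(0, 1).all():
            problems.append(f"{col} must lie between 0 and 1")
    # Flags are documented as 1 for yes, 0 for no.
    for col in ("work_accident", "promotion_last_5_years", "left"):
        if not df[col].isin([0, 1]).all():
            problems.append(f"{col} must be 0 or 1")
    # Salary has exactly three categories.
    if not df["salary"].isin(["low", "medium", "high"]).all():
        problems.append("salary must be low, medium or high")
    return problems


print(check_schema(employees))
```

An empty list means every column respects the documented ranges; any violation is reported by column name.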

Data exploration

Prior to doing any specific analytics it is interesting to dive a bit deeper into the values of the data set and understand how they might be correlated to each other.

We used the "Predict" module as well as the notebook in the "File" module to obtain some basic statistical insights.

If you want to run the whole analysis by yourself or hack some scripts, you can download the notebook here.

HR data 2
HR data 3

Analyze correlations

Calculating the correlations between all pairs of fields gives us first hints on why people leave and helps us steer the analysis in the right direction.

Red fields mean negative correlations, blue fields indicate positive correlations.

Example: The field at the intersection of "left" and "satisfaction_level" is dark red, which means that when the satisfaction level of employees goes down, the value of "left" goes up (in other words, employees are leaving, as "left" can only be 0 or 1).
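
For readers who want to reproduce this kind of matrix outside the suite, here is a minimal sketch in pandas. The six rows are invented toy data, and encoding salary as 0, 1, 2 is an assumption made so that the category can enter a Pearson correlation; the original notebook may handle it differently.

```python
import pandas as pd

# Invented toy data (not the real data set).
df = pd.DataFrame({
    "satisfaction_level": [0.9, 0.8, 0.7, 0.4, 0.2, 0.1],
    "number_projects": [3, 4, 3, 5, 6, 7],
    "salary": ["high", "medium", "medium", "low", "low", "low"],
    "left": [0, 0, 0, 1, 1, 1],
})

# Assumption: treat salary as an ordered scale so it can be correlated.
df["salary_level"] = df["salary"].map({"low": 0, "medium": 1, "high": 2})

# Pearson correlation between every pair of numeric columns.
corr = df.select_dtypes("number").corr()

# Negative values correspond to red fields, positive values to blue fields.
print(corr["left"].sort_values())
```

In this toy example the coefficient between "left" and "satisfaction_level" is negative, which is the pattern the dark red field describes.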

HR data 4

We can see very clearly that the satisfaction level of employees is strongly related to whether they leave the company.
Other significant factors associated with leaving are the salary level, work accidents and whether the employee was promoted during the last 5 years.
Looking at the correlations between the satisfaction level and the other dimensions, we can see that satisfaction mainly decreases when the number of projects and the time spent in the company increase.

Focus on employee satisfaction

Let's have a closer look at what employee satisfaction looks like for the different departments.
On the left we see the charts for employees still in the company (left=0), on the right we find the employees who have already left the company (left=1).

HR data 5
HR data 6
HR data 7

It is interesting to observe that employees who have left can be split up into 3 distinct groups: those who were unsatisfied, those who were very satisfied and those in between. There is no smooth transition between those groups like there is for employees still in the company. It is fairly clear why unsatisfied people leave the company, but it is worth exploring why satisfied employees left.
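
One way to make the three groups visible as numbers rather than charts is to bin the satisfaction level and count leavers and stayers per bin. This is a sketch on invented data; the cut point 0.7 mirrors the threshold used for satisfied employees in this article, while 0.3 is purely an assumption.

```python
import pandas as pd

# Invented toy data (not the real data set).
df = pd.DataFrame({
    "satisfaction_level": [0.10, 0.09, 0.40, 0.45, 0.80, 0.85, 0.60, 0.95],
    "left": [1, 1, 1, 1, 1, 1, 0, 0],
})

# Assumed cut points: (0, 0.3], (0.3, 0.7], (0.7, 1.0].
bins = [0.0, 0.3, 0.7, 1.0]
labels = ["unsatisfied", "in between", "very satisfied"]
df["group"] = pd.cut(
    df["satisfaction_level"], bins=bins, labels=labels, include_lowest=True
)

# Rows: satisfaction group, columns: left (0 = still employed, 1 = left).
counts = pd.crosstab(df["group"], df["left"])
print(counts)
```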

So, let's have a look at the correlation chart including satisfied employees only (satisfaction_level > 0.7).

HR data 8

This chart shows that satisfied employees leave the company when they work on a high number of projects simultaneously or a high number of hours each month, and when they have already spent a long time in the company. The decision to leave is also influenced by a low salary level and by the absence of a promotion during the last 5 years.
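
The filtered analysis described above can be sketched as follows: keep only rows with satisfaction_level > 0.7, then correlate the remaining fields with "left". The data is invented, so the coefficients only illustrate the mechanics and say nothing about the real data set.

```python
import pandas as pd

# Invented toy data (not the real data set).
df = pd.DataFrame({
    "satisfaction_level": [0.75, 0.82, 0.90, 0.78, 0.85, 0.30, 0.10],
    "number_projects": [3, 3, 4, 5, 6, 2, 7],
    "average_monthly_hours": [160, 170, 180, 250, 260, 140, 290],
    "left": [0, 0, 0, 1, 1, 1, 1],
})

# Keep satisfied employees only, using the threshold from the text.
satisfied = df[df["satisfaction_level"] > 0.7]

# Correlation of each remaining field with the leave flag.
corr_with_left = satisfied.corr()["left"].drop("left")
print(corr_with_left)
```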

A closer look at leaving employees

Let's now have a closer look at the other factors that describe leaving employees.

HR data 9

We see that leaving employees tend to have lower salaries, a higher number of projects, higher monthly working hours and fewer promotions. All of this is consistent: the satisfaction analysis leads to the same conclusions, and satisfaction is closely related to the decision to leave.
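
A comparison like this boils down to averaging each factor separately for stayers and leavers. The sketch below uses invented rows and again assumes an ordinal encoding of salary (low = 0, medium = 1, high = 2), which is an illustration choice rather than the method used for the chart.

```python
import pandas as pd

# Invented toy data (not the real data set).
df = pd.DataFrame({
    "number_projects": [3, 4, 3, 6, 7, 6],
    "average_monthly_hours": [160, 180, 170, 250, 280, 260],
    "promotion_last_5_years": [1, 0, 1, 0, 0, 0],
    "salary": ["high", "medium", "medium", "low", "low", "medium"],
    "left": [0, 0, 0, 1, 1, 1],
})

# Assumption: ordinal salary scale so that a mean is meaningful.
df["salary_level"] = df["salary"].map({"low": 0, "medium": 1, "high": 2})

factors = [
    "number_projects",
    "average_monthly_hours",
    "promotion_last_5_years",
    "salary_level",
]

# One row per value of "left" (0 = stayed, 1 = left), one column per factor.
profile = df.groupby("left")[factors].mean()
print(profile)
```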

But there is another key insight in this chart: a significant share of the employees leaving the company are people with a high evaluation and several years spent in the company. These employees are highly valuable assets that should not be lost.

Of the 6,123 employees who have spent more than 3 years in the company and have an evaluation higher than 0.7, 30.44% (1,864 employees) have left the company!
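
The share quoted above comes from a simple filter followed by a ratio. Here is a sketch of that computation; the function name high_value_attrition and the toy rows are hypothetical, while the two thresholds (more than 3 years, evaluation above 0.7) come from the text. The last line only re-checks the arithmetic of the published figures.

```python
import pandas as pd


def high_value_attrition(df, min_years=3, min_evaluation=0.7):
    """Return (group size, number who left, share who left) for experienced, well-rated staff."""
    group = df[
        (df["time_spent_company"] > min_years)
        & (df["last_evaluation"] > min_evaluation)
    ]
    n_total = len(group)
    n_left = int(group["left"].sum())
    share = n_left / n_total if n_total else float("nan")
    return n_total, n_left, share


# Invented toy data (not the real data set).
toy = pd.DataFrame({
    "time_spent_company": [5, 4, 6, 2, 4, 3],
    "last_evaluation": [0.9, 0.8, 0.75, 0.95, 0.5, 0.9],
    "left": [1, 0, 0, 1, 1, 1],
})
print(high_value_attrition(toy))

# Cross-check of the figures quoted in the article: 1,864 out of 6,123.
print(f"{1864 / 6123:.2%}")
```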

Why do good employees leave?

Before trying to predict which people are most likely to leave the company, let's have a quick look at what makes high performers leave.

HR data 10

This last chart shows very clearly that good employees have left mainly because of a high number of simultaneous projects and a high number of working hours.

Predict which employees are going to leave

Now let's go one step further and predict which employees are going to leave the company.

To be continued...
