HR GDPR: Cloning, Scrambling and Anonymization of Employee Data in SAP-HR

This blog deals with cloning, scrambling and anonymization in SAP-HR. It builds on the experience Adessa has gained in managing HR GDPR on SAP-HR. HR GDPR covers GDPR compliance in managing the data of current and past employees and externals. Although the technical solutions are written for SAP-HR, a large part of the blog is platform-agnostic.

To avoid confusion, it helps to define a few terms up front:

  • Cloning is making an exact copy of SAP-HR employee data from the productive to the development and test environment.
  • Scrambling means replacing the current value by a random value.
  • Anonymizing means changing the data such that it can no longer be used to identify a natural person.

This blog covers the following topics:

  • Changes in cloning behaviour due to GDPR legislation
  • Six main questions to ask before cloning
  • Why and how to determine the employee data we store?
  • Data and interfacing challenges when scrambling
  • Features to look for when selecting a cloning tool
  • Scrambling types in a cloning tool


Note: this blog is part of a series on HR GDPR.

 

Changes in cloning behaviour due to GDPR legislation

Cloning is making an exact copy of SAP-HR employee data from the productive to the development and test environment. There are multiple (good) reasons for cloning:

  • For testing and correction purposes:
    • Developers need personnel data in the development environment that is as close as possible to the data in the production environment, so that they can study the root cause of an error and develop a fix.
    • Once the fix is made, it must be tested whether the error has been corrected (“positive testing”), but also whether the fix causes any ‘collateral damage’, such as messing up calculations or outcomes of other employees or groups (“negative testing”). The effect of the fix should be limited to the employee (or group) it was built for.
  • As part of new developments:
    • Developers and testers rely heavily on cloning to check that a new development works correctly for a given population. The risk, though, is that a clone of the productive environment does not exercise every functionality.
    • Here too, negative testing remains key.

However, …

Whereas HR data in a productive environment is accessible only to employees of the HR department (restricted authorisation based on roles or function), cloning suddenly makes that data available and ‘visible’ to developers and testers, who are rarely part of the HR department.

This is where several conflicts with GDPR compliance arise:

  • Without employee consent, personal data can only be used for the purpose for which it was originally requested. Developing and testing software is not the purpose for which employees gave their personal data to HR.
  • The employee has the right to data privacy. This means that identifying and sensitive data cannot be revealed to “others” without contractual or legal justification.
    • Identifying Data is data which can lead to identification such as personnel number, user-id, name, address, bank account, social security number, license plate, …
    • Sensitive Data is e.g. ethnic origin, union membership, religion, biometric and genetic data, sexual orientation, …

In fact, it is even worse.

When sensitive personal data is made accessible to persons (such as developers or testers) who have no legitimate interest based on contract or law, GDPR considers this a data breach: the unlawful disclosure of data. This could expose the company to very large fines.

So, if we still want to use productive data outside the productive environment, we have to:

  • Anonymize the identifying data, which means changing the data so that it can no longer be used to identify a natural person, whether through a single field or a combination of several fields. A common way to anonymize data is scrambling.
  • Scramble sensitive data such as ethnic origin, union membership, religion, biometric and genetic data, sexual orientation, … In the unlikely event that anonymization is not fully achieved, the sensitive data is still “hidden” and cannot be disclosed.
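
As a rough illustration of these two steps, the following Python sketch scrambles a single employee record. The field names, the split into identifying and sensitive fields, and the function names are assumptions made for this example; they do not come from any SAP-HR tool. Each letter or digit is replaced by a random one of the same kind, so the scrambled value keeps the length and format of the original.

```python
import random
import string

# Hypothetical classification; in practice it follows from the field inventory.
IDENTIFYING_FIELDS = {"personnel_number", "name", "bank_account"}
SENSITIVE_FIELDS = {"union_membership", "religion"}


def scramble_value(value: str, rng: random.Random) -> str:
    """Replace each letter or digit by a random one of the same kind."""
    scrambled = []
    for ch in value:
        if ch.isdigit():
            scrambled.append(rng.choice(string.digits))
        elif ch.isalpha():
            pool = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            scrambled.append(rng.choice(pool))
        else:
            scrambled.append(ch)  # keep spaces, dashes and other separators
    return "".join(scrambled)


def scramble_record(record: dict, rng: random.Random) -> dict:
    """Scramble identifying and sensitive fields; leave all other fields as they are."""
    to_scramble = IDENTIFYING_FIELDS | SENSITIVE_FIELDS
    return {
        field: scramble_value(value, rng) if field in to_scramble else value
        for field, value in record.items()
    }


# Made-up example record; all values are strings.
employee = {
    "personnel_number": "00012345",
    "name": "Jan Janssens",
    "bank_account": "BE00 1234 5678 9012",
    "religion": "RC",
    "pay_scale_group": "A3",
}
scrambled = scramble_record(employee, random.Random())
```

Purely random replacement can, by chance, return the original value or give two employees the same value. A real tool would have to check for both cases; the sketch leaves that out.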

To keep things simple, from here on we refer to both as scrambling.

 

Six main questions to ask before cloning HR data

Each time I sit with a customer to define the policy and procedures on cloning or scrambling, I use the same approach.  I have developed a framework (or set of questions) to assess where sensitive or identifying data is stored, how it is used and for what purpose. It is simple and very easy to apply.

The final goal of this exercise is to know exactly and in great detail in which tables and fields we store identifying and sensitive data, such that this data can be scrambled.

  1. Do we use modules outside SAP-HR (e.g. SAP Finance, CATS, …)?

First, this question determines the scope of the cloning project. To which modules does SAP-HR data propagate? Second, if a cloning tool has not yet been bought, one selection criterion could be which SAP modules the tool can copy data for (see my next blog under “Features to look for when selecting a cloning tool”). Not all tools work for all modules. This question needs to be discussed with the functional leads / process owners of the different streams inside the company.

 

  2. Which HR-processes do we have?

A global, not too detailed insight into the processes of the company should reveal:

  • Which other modules of SAP are using which SAP-HR data?
  • Which submodules of SAP-HR is your company using, and in which countries?
  • The HR system landscape: Which other HR-systems are interfacing with SAP-HR (inbound, outbound, internal, external)?

To help you understand what I want to achieve with these questions: think of the SAP-HR data as a drop of oil on water. We need to know every place the oil spreads to in order to clean the water of identifying and sensitive data. This question needs to be discussed with the HR functional lead / process owner.

 

  3. Which other modules of SAP are using which SAP-HR data?

When we scramble SAP-HR data, we have to be sure we scramble it everywhere, and preferably in the same way. We also need to understand the impact this has on the other modules. For example, if we scramble the personnel number or the position, what impact does this have on CATS, logistics or planning? Another example: when you use SAP Travel Management, there are scenarios where every employee is also a creditor. This means that if you scramble employee data, you should also check whether you need to scramble your creditors.
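
One way to scramble a value “in the same way” everywhere is to build a single translation table and apply it to every table that holds the value. The Python sketch below does this for the personnel number. The row layouts, the field name pernr and the function names are hypothetical, and an 8-digit personnel number is assumed.

```python
import random


def build_pernr_mapping(pernrs, rng: random.Random) -> dict:
    """Map every original personnel number to a distinct scrambled one."""
    originals = sorted(set(pernrs))
    # sample() draws without replacement, so two employees never collide.
    replacements = rng.sample(range(10_000_000, 100_000_000), k=len(originals))
    return {old: f"{new:08d}" for old, new in zip(originals, replacements)}


def apply_mapping(rows, mapping: dict) -> list:
    """Return copies of the rows with the personnel number translated."""
    return [{**row, "pernr": mapping[row["pernr"]]} for row in rows]


# Made-up rows standing in for an HR table and a time-sheet (CATS) table.
hr_rows = [
    {"pernr": "00001001", "position": "50000010"},
    {"pernr": "00001002", "position": "50000011"},
]
cats_rows = [
    {"pernr": "00001001", "hours": 8.0},
    {"pernr": "00001001", "hours": 7.5},
    {"pernr": "00001002", "hours": 8.0},
]

# Build the mapping once, from all places where the number occurs ...
all_pernrs = [row["pernr"] for row in hr_rows + cats_rows]
mapping = build_pernr_mapping(all_pernrs, random.Random())

# ... and apply the same mapping to every table.
hr_scrambled = apply_mapping(hr_rows, mapping)
cats_scrambled = apply_mapping(cats_rows, mapping)
```

Because both tables go through the same mapping, the time-sheet rows still belong to the same (scrambled) employee after the run. In the Travel Management scenario, the same translation table would also have to cover the creditor records.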

 

  4. Which submodules of SAP-HR is your company using, and in which countries?

Firstly, to reach our final goal of knowing exactly which tables / fields contain identifying and sensitive data, we need to know which submodules of SAP-HR the company is using. After this we can determine the database tables to investigate in more detail.
Secondly, if you have not bought a cloning tool yet, keep in mind that not every cloning tool works for every submodule or every country of SAP-HR (see my next blog under “Features to look for when selecting a cloning tool”).

 

  5. The HR system landscape: Which other HR-systems are interfacing with SAP-HR (inbound, outbound, internal, external)?

When you scramble your identifying data in SAP-HR, the link between SAP-HR and the other system could be lost. For example, if you scramble the user-id, the personnel record in SAP-HR may afterwards be linked to a different personnel record in SuccessFactors (see the chapter “Interfacing challenges when scrambling” in my next blog).
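
The effect can be made visible with a small check that joins both systems on the user-id before and after scrambling. The Python sketch below uses made-up rows and a deliberately unlucky mapping in which the scrambled user-ids happen to be existing ones; all field and function names are assumptions for this example.

```python
def linked_pairs(hr_rows, ext_rows) -> set:
    """Join both systems on user-id; return (SAP-HR row index, external row index) pairs."""
    ext_index_by_user = {row["user_id"]: i for i, row in enumerate(ext_rows)}
    return {
        (i, ext_index_by_user[row["user_id"]])
        for i, row in enumerate(hr_rows)
        if row["user_id"] in ext_index_by_user
    }


def scramble_user_ids(rows, mapping: dict) -> list:
    """Return copies of the rows with the user-id replaced through the mapping."""
    return [{**row, "user_id": mapping[row["user_id"]]} for row in rows]


hr_rows = [
    {"user_id": "JDOE", "pernr": "00001001"},
    {"user_id": "ASMITH", "pernr": "00001002"},
]
ext_rows = [
    {"user_id": "JDOE", "talent_profile": "P-17"},
    {"user_id": "ASMITH", "talent_profile": "P-42"},
]

# Worst case for illustration: the scrambled ids are existing ids, swapped.
mapping = {"JDOE": "ASMITH", "ASMITH": "JDOE"}

links_before = linked_pairs(hr_rows, ext_rows)
links_hr_only = linked_pairs(scramble_user_ids(hr_rows, mapping), ext_rows)
links_both = linked_pairs(
    scramble_user_ids(hr_rows, mapping), scramble_user_ids(ext_rows, mapping)
)

print(links_before == links_hr_only)  # False: records now point to the wrong person
print(links_before == links_both)     # True: same mapping on both sides keeps the links
```

Comparing the pairs before and after a scrambling run is a cheap way to spot broken or crossed links, provided the same rows can be identified in both snapshots.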

 

  6. Where do we store our employee data in detail (which table, which fields)?

This question addresses the final goal of the six questions: to know exactly which fields, identifying or sensitive, we have to scramble. First, we have to know all tables and fields where we store personnel data. Only then can we be sure that we also know all fields containing identifying and sensitive data. The first step is to ask the functional analysts “where do we store what?”. As a second step, their answer is checked with some technical tools (see my next blog under “Which employee data do we store?” for the details).
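
The two steps can be pictured as a field inventory that is compared with the result of a technical scan. The Python sketch below is only an illustration: the table and field names are examples that should be verified against your own system, the scan result is made up, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldEntry:
    table: str
    field: str
    category: str  # "identifying", "sensitive" or "other"


# Step 1: inventory as provided by the functional analysts (illustrative entries).
inventory = [
    FieldEntry("PA0002", "NACHN", "identifying"),  # last name
    FieldEntry("PA0002", "VORNA", "identifying"),  # first name
    FieldEntry("PA0002", "KONFE", "sensitive"),    # religious denomination
    FieldEntry("PA0009", "BANKN", "identifying"),  # bank account number
    FieldEntry("PA0008", "TRFGR", "other"),        # pay scale group
]

# Step 2: fields found by a technical scan (made-up result for this example).
scanned_fields = {
    ("PA0002", "NACHN"), ("PA0002", "VORNA"), ("PA0002", "KONFE"),
    ("PA0002", "GBDAT"), ("PA0006", "STRAS"),
    ("PA0008", "TRFGR"), ("PA0009", "BANKN"),
}


def unclassified_fields(inventory, scanned_fields) -> list:
    """Fields the scan found that the inventory does not mention yet."""
    known = {(entry.table, entry.field) for entry in inventory}
    return sorted(scanned_fields - known)


def fields_to_scramble(inventory) -> list:
    """Fields classified as identifying or sensitive, in inventory order."""
    return [
        (entry.table, entry.field)
        for entry in inventory
        if entry.category in {"identifying", "sensitive"}
    ]


missing = unclassified_fields(inventory, scanned_fields)
scramble_list = fields_to_scramble(inventory)
```

Every field reported as missing goes back to the functional analysts for classification before the list of fields to scramble is considered complete.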

 

In my next blog, I will delve further into the challenges that data cloning poses for employee data storage and for interfacing between HR systems. Stay tuned or subscribe to receive the updates.

The Adessa Group was founded in 2005 as a specialized, pan-European Human Resources service provider. The company was founded with the vision of supplying sustainable computer solutions through the development of an international network of subsidiaries, close to their customers and with the aim of growing organically. This vision was translated through the values that shaped Adessa’s corporate culture.