BE_BENCHMARK

Updated by Kevin Cardoen


Overview

The BE_BENCHMARK data mart contains benchmark data on the Belgian market. It is calculated on top of the BE_PAYROLL data mart and contains a set of aggregated datasets based on a reference population of employers and employees. Two different levels of granularity exist for these datasets: month (detailed) and window (moving time-window aggregation, e.g. over a period of 12 months).
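
The difference between the two granularities can be illustrated with a minimal Python sketch. It assumes the monthly rows are sorted and cover consecutive months, and that a window metric is obtained by summing numerators and denominators over the window rather than averaging monthly rates. The field names (month, absence_days, scheduled_days) are hypothetical and not taken from the data mart.

```python
def window_rate(monthly_rows, window=12):
    """Aggregate month-level rows into moving-window rows.

    monthly_rows: list of dicts sorted by month, one per consecutive month,
    with hypothetical keys 'month', 'absence_days' and 'scheduled_days'.
    Only complete windows are returned.
    """
    results = []
    for end in range(window - 1, len(monthly_rows)):
        span = monthly_rows[end - window + 1 : end + 1]
        numerator = sum(r["absence_days"] for r in span)
        denominator = sum(r["scheduled_days"] for r in span)
        results.append({
            "window_end": span[-1]["month"],
            # Assumption: the rate is a ratio of sums over the window.
            "absence_rate": numerator / denominator if denominator else None,
        })
    return results
```

With 12 months of input and window=12, this yields a single row whose window ends in the last month; each additional month adds one more window row.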


Metrics

The benchmark metrics fall into 3 domains:

  • Absenteeism: short-term and long-term illness and work accidents. Calculated from the absenteeism metrics in FACT_PAYROLL_METRICS at contract level
  • Turnover: employees joining and leaving (ins and outs). Calculated from the turnover metrics in FACT_EMPLOYEE_COUNT at company career level
  • Diversity: counts based on attributes from DIM_EMPLOYEE at contract level


Benchmark filters

The benchmark population excludes employers and employees whose data is considered not to be representative. This exclusion is materialized as flags in the fact tables, which are then used to filter out records when the aggregated benchmark datasets are created.
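
As a rough illustration of the flag-based filtering, the sketch below keeps only fact records where both the employer and the employee are flagged as part of the benchmark population. The flag names (bench_employer_fl, bench_employee_fl) and their meaning (true = representative) are assumptions made for this example; the actual columns in the fact tables may differ.

```python
def benchmark_population(fact_rows):
    """Keep only the records that belong to the benchmark population.

    Assumption: each record carries two hypothetical boolean flags,
    'bench_employer_fl' and 'bench_employee_fl', where True means the
    employer or employee is considered representative.
    """
    return [
        row for row in fact_rows
        if row["bench_employer_fl"] and row["bench_employee_fl"]
    ]
```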

Benchmark (aggregated) datasets

The metrics are computed per dimension and stored in separate target datasets. For each dataset, the benchmark data is computed several times, once per value of an additional dimension called segmentation, and the results are concatenated. Five segmentations are currently used:

  • All BE (bench_segmentation_cd='all'): overall Belgian data (no extra segmentation)
  • Joint commission group (jc3): first 3 positions of the joint commission
  • Joint commission (jc5): 5 positions of the joint commission
  • Nacebel subsector (nc2): first 2 positions of the Nacebel
  • Nacebel group (nc3): first 3 positions of the Nacebel

Row-level security will apply to the last four segmentations, so that a customer can see only the all BE data and the segments that concern its own employees.
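
The "compute several times and concatenate" approach, together with the intended visibility rule, could be sketched as follows. This is only an illustration: it assumes that joint commission and Nacebel codes are stored as plain strings without separators, uses a simple headcount as the metric, and models row-level security as an in-memory filter, whereas in practice it would be enforced by the database or reporting layer. The field names joint_commission, nacebel and segment, and the segment value 'BE', are hypothetical.

```python
from collections import Counter

# bench_segmentation_cd -> function deriving the segment value of a record.
SEGMENTATIONS = {
    "all": lambda r: "BE",  # hypothetical single segment for all BE
    "jc3": lambda r: r["joint_commission"][:3],
    "jc5": lambda r: r["joint_commission"][:5],
    "nc2": lambda r: r["nacebel"][:2],
    "nc3": lambda r: r["nacebel"][:3],
}

def headcount_by_segmentation(rows):
    """Compute the metric once per segmentation and concatenate the results."""
    out = []
    for seg_cd, segment_of in SEGMENTATIONS.items():
        counts = Counter(segment_of(r) for r in rows)
        for segment, n in sorted(counts.items()):
            out.append({"bench_segmentation_cd": seg_cd,
                        "segment": segment,
                        "headcount": n})
    return out

def visible_rows(benchmark_rows, customer_rows):
    """Keep the all BE rows plus the segments found among the customer's own records."""
    own = {(seg_cd, segment_of(r))
           for seg_cd, segment_of in SEGMENTATIONS.items()
           for r in customer_rows}
    return [b for b in benchmark_rows
            if b["bench_segmentation_cd"] == "all"
            or (b["bench_segmentation_cd"], b["segment"]) in own]
```

Keeping the segmentation code as a column in one concatenated dataset means a single filter on (bench_segmentation_cd, segment) is enough to express the visibility rule.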

The benchmark flags are used to filter out employers and employees that should not be included in the calculation of the benchmark metrics.

A separate dataset is created for each extra aggregation dimension, or combination of dimensions. The 'global' dataset has no extra dimension.
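
One way to picture the set of datasets is as a mapping from dataset name to the extra dimensions it is grouped by, with 'global' mapped to no dimension at all. The sketch below follows that idea with a simple headcount; apart from 'global', the dataset names and the dimensions (gender, age_band) are hypothetical examples, not the actual list.

```python
from collections import defaultdict

# Dataset name -> extra aggregation dimensions (hypothetical except 'global').
DATASET_DIMENSIONS = {
    "global": (),
    "gender": ("gender",),
    "gender_age": ("gender", "age_band"),
}

def build_datasets(rows):
    """Aggregate the same records once per dataset definition."""
    datasets = {}
    for name, dims in DATASET_DIMENSIONS.items():
        counts = defaultdict(int)
        for r in rows:
            # An empty tuple of dimensions puts every record in one group.
            counts[tuple(r[d] for d in dims)] += 1
        datasets[name] = [
            dict(zip(dims, key), headcount=n)
            for key, n in sorted(counts.items())
        ]
    return datasets
```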

