Data cleansing

Issue

Cleansing data helps you control and improve data quality, measured against the integrity and management rules of your current information system or of the system you wish to migrate to.
Data quality is often overestimated, and data can become corrupted for many reasons, some of them perfectly legitimate.

  • An obsolescent system with:
    • Missing controls over certain standardized structures, e.g. postal addresses that often do not comply with post office regulations and need to be standardized
    • A data model lacking referential integrity
    • Missing application controls
  • Duplicates: Duplicates are quite common for natural persons or legal entities. Users often create them unintentionally, but some are deliberately keyed in as a way to make up for the failings of an application
  • Users repurposing parts of the information system to manage new information
  • Data discrepancies caused by application bugs that were corrected late
  • Incomplete information, forced values, bypassed controls, etc.
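
As an illustration of the duplicate problem, the sketch below groups natural persons whose identity fields match once case, accents and spacing are ignored. It is a minimal Python example and not a description of any specific tool; the field names (`id`, `last_name`, `first_name`, `birth_date`) and the matching key are assumptions chosen for the example.

```python
import unicodedata
from collections import defaultdict


def normalize(text):
    """Lowercase the text, strip accents and collapse repeated whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(no_accents.lower().split())


def find_duplicates(records):
    """Return lists of record ids that share the same normalized identity key.

    The key (last name, first name, birth date) is a hypothetical choice;
    a real project would derive it from its own business rules.
    """
    groups = defaultdict(list)
    for rec in records:
        key = (
            normalize(rec["last_name"]),
            normalize(rec["first_name"]),
            rec["birth_date"],
        )
        groups[key].append(rec["id"])
    return [ids for ids in groups.values() if len(ids) > 1]
```

Exact matching on a normalized key only catches the simplest cases. Deliberately created duplicates often differ in more than spelling and usually need rules specific to the application.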

Poor data quality eventually leads to substantial costs.

Direct costs:

  • Higher postage costs due to poor address quality, or multiple mailings sent to duplicates
  • Application crashes
  • Approximate or even incorrect statistics
  • Inability to consolidate information, including information that is sometimes required by regulation
  • Cleansing work that becomes unavoidable when implementing a new application or system
  • Etc.

Indirect costs:

  • Loss of image
  • Loss of productivity
  • Etc.

Data cleansing can be planned as a separate project or carried out during the migration to a new platform.

In the first case, you should check data quality against the business and integrity rules of the system the data currently runs on. To be efficient, we advise you to integrate the controls you develop into an iterative process, so that data quality can be measured again after each round of corrections.
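
The iterative measurement described above can be sketched as a set of named controls that are re-run after each round of corrections. This is a minimal Python illustration under stated assumptions: the two rules, the record fields (`postcode`, `birth_date`) and the rule names are hypothetical, not rules taken from any real system.

```python
def is_five_digit_postcode(record):
    """Hypothetical rule: the postcode must be exactly five digits."""
    postcode = str(record.get("postcode") or "")
    return len(postcode) == 5 and postcode.isdigit()


def has_birth_date(record):
    """Hypothetical rule: the birth date must be filled in."""
    return bool(record.get("birth_date"))


RULES = {
    "five_digit_postcode": is_five_digit_postcode,
    "birth_date_present": has_birth_date,
}


def measure_quality(records, rules):
    """Return, for each rule name, the share of records that violate it."""
    total = len(records)
    report = {}
    for name, check in rules.items():
        failures = sum(1 for rec in records if not check(rec))
        report[name] = failures / total if total else 0.0
    return report
```

Running `measure_quality` on the same data set before and after each cleansing pass gives a simple indicator of whether the violation rates are actually going down.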

If you migrate data to a new system, you should check the source data against the integrity rules of the target platform and start cleansing as early as possible. This sub-project is critical: some lengthy operations may have to be performed manually, which can heavily affect the overall schedule.
In either case, we prefer automated data cleansing, which keeps operations cost-effective.
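
To illustrate checking source data against the rules of a target platform, the sketch below separates accepted records from rejected ones and records the reason for each rejection. It is a minimal Python example; the record layout (`id`, `person_id`), the list of mandatory fields and the reference check are assumptions standing in for whatever rules the real target system enforces.

```python
def check_against_target(contracts, known_person_ids, required_fields):
    """Split source contracts into accepted ids and rejected (id, reasons) pairs.

    Two hypothetical target rules are applied:
      * every field in required_fields must be filled in;
      * person_id must reference an existing person (referential integrity).
    """
    accepted, rejected = [], []
    for contract in contracts:
        reasons = [
            f"missing {field}"
            for field in required_fields
            if not contract.get(field)
        ]
        if contract.get("person_id") not in known_person_ids:
            reasons.append("unknown person_id")
        if reasons:
            rejected.append((contract["id"], reasons))
        else:
            accepted.append(contract["id"])
    return accepted, rejected
```

Keeping the reasons alongside each rejected record makes it possible to decide which corrections can be automated and which need manual work.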

Methodology and tools

Our tools allow us to automate many operations. Our ‘Recode’ system analysis tools can generate control modules from:

  • the physical data model
  • programs
  • real data
  • use cases

You will be able to run these control modules on a regular basis.
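
To give an idea of what a control derived from a physical data model can look like, the sketch below builds a row checker from a simple column description. It is a generic Python illustration and does not describe how the ‘Recode’ tools work; the schema format (`type`, `nullable`, `max_length`) and the function name `build_control` are assumptions made for the example.

```python
def build_control(schema):
    """Build a function that lists the violations of one row against a schema.

    schema maps a column name to a dict with a Python "type", an optional
    "nullable" flag (default True) and an optional "max_length" for strings.
    """
    def control(row):
        problems = []
        for column, spec in schema.items():
            value = row.get(column)
            if value is None:
                if not spec.get("nullable", True):
                    problems.append(f"{column}: null not allowed")
                continue
            if not isinstance(value, spec["type"]):
                problems.append(f"{column}: expected {spec['type'].__name__}")
            elif (
                isinstance(value, str)
                and spec.get("max_length") is not None
                and len(value) > spec["max_length"]
            ):
                problems.append(f"{column}: longer than {spec['max_length']}")
        return problems

    return control
```

Because the checker is built from the schema description rather than written by hand, the same control can be regenerated whenever the data model changes.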

Reporting sessions include project-wide dashboards that show how the cleansing sub-project is progressing. They also include business-oriented reports that summarize and analyse rejected items, classified by service and volume.
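
A summary of rejected items by service and volume can be sketched as a simple count, sorted so that the largest volumes come first. This is a minimal Python illustration; the reject layout (`service`, `reason`) is an assumption, and a real report would carry more detail.

```python
from collections import Counter


def summarize_rejects(rejects):
    """Count rejected items per (service, reason), largest volume first.

    Ties are ordered alphabetically so that the report is stable.
    """
    counts = Counter((r["service"], r["reason"]) for r in rejects)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```

Each entry in the result pairs a `(service, reason)` key with its volume, which is enough to feed a dashboard or a per-service work list.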

Detailed lists of anomalies are enriched with the functional identifying details of each file, so that users can locate it in both the source and target applications.

Environments

We can work in all IT environments with our technology:

OS: MVS, DOS VSE, VM, GCOS 7, GCOS 8, VMS, ICL, UNIX, AS400, WINDOWS, HP3000, etc.

DBMS: DB2, ORACLE, SYBASE, SQL Server, SQL, INFORMIX, DL1, IDMS, IDS2, TOTAL, ADABAS, DATACOM, IMAGE, etc.

We have strong functional skills and many references in the following areas: banking, insurance, pension plans, life insurance, mass retail, human resources, etc.

References

Healthcare and insurance companies, banks, etc.

A close-up view of Move Solutions

Move Solutions is the reference for companies planning to migrate to the ‘Usine Retraite’ pension software*.

The expected data quality requires a thorough cleansing stage that should be started as early as possible in order to avoid affecting the deployment schedule. That is why we created an industrial infrastructure to analyse source data by comparing it to the management and integrity rules of the target platform, the Usine Retraite.

Our customers benefit from a high value-added offer and keep risk under control thanks to our tools for capturing functional knowledge and our powerful generation software.

* The new shared platform for AGIRC & ARRCO complementary pension plans in France