Tuesday, March 1, 2016

What do you mean by Data Cleansing and Data Enhancement?



Data decay is a costly problem that no business can escape, and the cost of fixing a broken database rises the longer its records are left unchecked. Around 20 to 30% of business data decays every year, which means that, without robust data quality initiatives, most business data will eventually become unreliable.
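
To get a feel for these figures, here is a rough sketch. It assumes the yearly decay rate compounds, meaning that each year the same share of the still-accurate records goes stale. That compounding assumption and the function name `fraction_still_accurate` are illustrative and not taken from the source.

```python
def fraction_still_accurate(annual_decay_rate: float, years: int) -> float:
    """Share of records still accurate after `years`, assuming compounding decay."""
    if not 0.0 <= annual_decay_rate <= 1.0:
        raise ValueError("annual_decay_rate must be between 0 and 1")
    if years < 0:
        raise ValueError("years must not be negative")
    # Each year, (1 - rate) of the records that were still accurate remain so.
    return (1.0 - annual_decay_rate) ** years


# Compare the low and high end of the quoted range over a few years.
for rate in (0.20, 0.30):
    for years in (1, 2, 3):
        remaining = fraction_still_accurate(rate, years)
        print(f"decay {rate:.0%}/year, after {years} year(s): {remaining:.0%} still accurate")
```

Under this assumption, a 30% yearly rate leaves just under half of the records accurate after two years (0.7 x 0.7 = 0.49).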


When tackling data decay, there are two key processes that heal the dataset and help us make better-quality decisions. The first is to cleanse the data. The second is to enhance it, which also helps to verify its accuracy.

How Data Cleansing Works

Data cleansing is one of the methods that can be used to heal a database.
Using cleansing techniques, the database is scanned for imperfections and then corrected using a combination of manual and automated processes.

Data cleansing can be used to strip duplicate records from a database; obvious matches are easy to find, and sophisticated data quality software can locate matches that may initially go undetected. Data quality software also presents sections of the database for manual review, ensuring that the data cleansing operation does not accidentally remove records that were wrongly flagged as duplicates (false positives).
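
The split between obvious matches and matches that need a human eye can be sketched as follows. This is a minimal illustration using Python's standard `difflib`, not how any particular data quality product works. The field names, the sample records and both thresholds are assumptions chosen for the example.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalise(record: dict) -> str:
    """Lower-case and collapse whitespace so trivial differences do not hide matches."""
    text = f"{record['name']} {record['postcode']}"
    return " ".join(text.lower().split())


def classify_pairs(records, auto_threshold=1.0, review_threshold=0.85):
    """Split candidate pairs into exact duplicates and pairs needing manual review."""
    duplicates, review = [], []
    for a, b in combinations(records, 2):
        score = SequenceMatcher(None, normalise(a), normalise(b)).ratio()
        if score >= auto_threshold:
            # Identical after normalisation: safe to treat as a duplicate.
            duplicates.append((a["id"], b["id"]))
        elif score >= review_threshold:
            # Similar but not identical: a person should decide.
            review.append((a["id"], b["id"], round(score, 2)))
    return duplicates, review


# Made-up sample records for illustration only.
records = [
    {"id": 1, "name": "Jane Smith", "postcode": "AB1 2CD"},
    {"id": 2, "name": "jane  smith", "postcode": "AB1 2CD"},
    {"id": 3, "name": "Jane Smyth", "postcode": "AB1 2CD"},
    {"id": 4, "name": "Robert Brown", "postcode": "ZZ9 9ZZ"},
]
duplicates, review = classify_pairs(records)
print("duplicates:", duplicates)
print("for manual review:", review)
```

Records 1 and 2 differ only in case and spacing, so they are treated as duplicates. "Smith" versus "Smyth" is close but not identical, so those pairs are queued for review rather than removed automatically. Comparing every pair is only practical for small datasets; it is used here to keep the sketch short.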



Since data is continually decaying, data cleansing must be carried out frequently. If the data is also checked when it is added to the database, a large number of dirty data problems can be remedied before the client becomes aware of any mistakes. As a result, the entire business functions more efficiently and bases its decision-making on quality information that is timely, relevant and accurate.

If the business keeps cleansing its database, it is far less likely to run into problems such as customer complaints, returned marketing communications and even legal claims. Additionally, once the data is clean, the business can move on to enhancing it to make it even more useful and relevant.
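
Checking data at the moment it is added, as mentioned above, might look something like the sketch below. The field names (`name`, `email`, `signup_date`) and the deliberately simple email pattern are assumptions for illustration; real validation rules are usually stricter.

```python
import re
from datetime import date

# Simplified illustrative pattern; it only checks for "something@something.something".
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate_record(record: dict) -> list:
    """Return a list of problems; an empty list means the record may be stored."""
    problems = []
    if not record.get("name", "").strip():
        problems.append("name is missing")
    if not EMAIL_RE.match(record.get("email", "")):
        problems.append("email is malformed")
    try:
        # Accept only unambiguous ISO dates so day and month cannot be confused.
        date.fromisoformat(record.get("signup_date", ""))
    except ValueError:
        problems.append("signup_date is not an ISO date (YYYY-MM-DD)")
    return problems


good = {"name": "Jane Smith", "email": "jane@example.com", "signup_date": "2016-03-01"}
bad = {"name: ".strip(": "): " ", "email": "jane@", "signup_date": "01/03/2016"}
print(validate_record(good))
print(validate_record(bad))
```

A record that returns any problems can be rejected or routed for correction before it ever reaches the database.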


How Data Enhancement Works

Data enhancement is an additional process that improves clean data by scanning it for gaps and supplementing it. By drawing on alternative sources of information, your database can be scanned for potential additions and extended to incorporate new fields.

This allows you to combine your records with information you may not have known you needed when the database was first created, and it allows you to verify information beyond simple names, addresses and postcodes.

During data enhancement, existing clean data is augmented with up-to-date information from third-party providers. This gives the data quality project a source of external intelligence, and it allows the information to be enriched with data that is in the public domain. Once a record has been enhanced, it is effectively supplemented with data from dozens of other databases, each maintained by its originator.
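
As a minimal sketch of this idea, the function below fills gaps in a clean record from an external source without overwriting values the business already holds. The `reference` dictionary stands in for a third-party dataset, and every field name and value in it is hypothetical.

```python
def enhance(record: dict, external: dict, allowed_fields: set) -> dict:
    """Return a copy of `record` with missing fields filled from `external`."""
    enhanced = dict(record)
    for field in allowed_fields:
        value = external.get(field)
        # Only fill gaps; never overwrite data that is already present.
        if value is not None and enhanced.get(field) in (None, ""):
            enhanced[field] = value
    return enhanced


# Hypothetical third-party reference data keyed by postcode.
reference = {
    "AB1 2CD": {"region": "North", "industry": "Retail", "name": "J. Smith Ltd"},
}

customer = {"name": "Jane Smith", "postcode": "AB1 2CD", "region": ""}
external = reference.get(customer["postcode"], {})
enhanced = enhance(customer, external, {"region", "industry"})
print(enhanced)
```

Restricting the merge to `allowed_fields` is a design choice: it lets the database gain new fields such as `industry` while keeping the external source from silently replacing the cleansed `name`.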


The Enhancement Challenges:

Enhancement presents a challenge for data quality operatives. In the past, enhancement could only be carried out in a separate system, and this usually meant that the business's cleansed data had to be exported and then re-imported once it had been enhanced.

Whenever data leaves its source system, there are myriad opportunities for corruption. Subtle inaccuracies, formatting errors and inconsistencies can be introduced; the systems may encode data in different ways or interpret values differently. This can cause problems with formatting (for example, in date fields), and with special characters.
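
Two of these failure modes are easy to reproduce. The sketch below uses made-up values: a name with accented characters that is written in one encoding and read in another, and a date string that two systems interpret differently.

```python
from datetime import datetime

# 1. Encoding mismatch: text exported as UTF-8 but imported as Latin-1.
original = "Zoë Müller"
round_tripped = original.encode("utf-8").decode("latin-1")
print(original, "->", round_tripped)  # the accented characters are garbled

# 2. Ambiguous date: the same string read as day-first or month-first.
exported = "03/04/2016"
as_day_first = datetime.strptime(exported, "%d/%m/%Y").date()
as_month_first = datetime.strptime(exported, "%m/%d/%Y").date()
print(as_day_first.isoformat(), "vs", as_month_first.isoformat())
```

Neither step raises an error, which is what makes this kind of corruption hard to spot: both systems accept the value, they simply disagree about what it means.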

The last thing we want is pristine, clean data to be corrupted during the enhancement operation. As such, it is best to keep the data in the system, without export and import operations, if at all possible. This is made possible using web services and APIs that communicate directly without any manual intervention.

Thanks to modern integration techniques, it’s possible to link up our systems in real time without the data ever needing to be exported, imported or manually edited. This drastically reduces the chance that errors will be introduced.
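
A rough sketch of enhancement without export and import is shown below. It uses an in-memory SQLite database so that it can run anywhere, and `fake_region_service` is a hypothetical stub standing in for a real web service call. The table layout, column names and lookup values are all assumptions for illustration.

```python
import sqlite3
from typing import Callable, Optional


def enhance_in_place(conn: sqlite3.Connection,
                     lookup: Callable[[str], Optional[str]]) -> int:
    """Fill empty `region` values by calling `lookup`; no export file is created."""
    rows = conn.execute(
        "SELECT id, postcode FROM customers WHERE region IS NULL"
    ).fetchall()
    updated = 0
    for row_id, postcode in rows:
        region = lookup(postcode)
        if region is not None:
            # Write the enhancement straight back to the source system.
            conn.execute(
                "UPDATE customers SET region = ? WHERE id = ?", (region, row_id)
            )
            updated += 1
    conn.commit()
    return updated


def fake_region_service(postcode: str) -> Optional[str]:
    """Hypothetical stand-in for a real enhancement web service."""
    return {"AB1 2CD": "North"}.get(postcode)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, postcode TEXT, region TEXT)"
)
conn.executemany(
    "INSERT INTO customers (postcode, region) VALUES (?, ?)",
    [("AB1 2CD", None), ("ZZ9 9ZZ", None)],
)
updated = enhance_in_place(conn, fake_region_service)
print("records enhanced:", updated)
```

Passing the lookup in as a function keeps the sketch independent of any particular provider; in practice it would wrap whatever API the chosen service exposes.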

The Perfect Partnership:

Clean data helps every customer remain a viable lifetime prospect. With the right care and management, dirty databases can become functional, accurate datasets that boost performance across the enterprise. Data enhancement is a logical next step in any data quality initiative, and is yet another tool businesses can use to ensure their data is fit for purpose.


Source: www.dqglobal.com