Saturday 14 September 2013

Bad data cleanup activity steps in data warehousing

Step-1:Analysis
Step-2:Backup of bad data and the data which you are deleting from the original table.
Step-3:prepare the script for data deletion
Step-4:Run the delete script
Step-5:Reload the data again which ever u have delete

Step:1 

Analysis:

we need to do analysis on the data i.e we need to find out that where that bad data exists based on the business rules,like missing the relationship,unwanted data

step:2

once analysis is done then take a bad data into one table and compare that records with another table data,so that we can find if there is any bad data exist in other tables also.

Ex:

assume in person(parent table),person_inst_assignment table we found the bad data i.e there are some orphan records are exist in the table(parent is not there but chaild record is exist) for a particular source system.

tables: person,person_inst_assignment....etc

create table bad_person as select *from person where source_system_name='ABC'

checking in the remaining the tables where the same bad data exists.

select *from person_inst_assignment b where person_id in(select *from bad_person)a
where a.person_id= b.person_id

it will give the data if any matching data is there in the person_inst_assignment.like this you  need to check all the tables.

Step-3: Take a backup into new tables which the data you are deleting from the original table.

ex: i am taking a backup of person_inst_assignment

create table 
BKP_person_inst_assignment 
as 
(select *
from person_inst_assignment b 
where person_id in
(select *from bad_person)a
 where a.person_id= b.person_id)

the above query will create the backup table with the data which we want to deleting (i.e bad data)

if any other tables also having the data we need to follow the same steps like above query.

Step-4: Prepare the script for data  deletion 

Ex:

SELECT *
  FROM person_inst_assignment b
 WHERE EXISTS
          (SELECT 'X'
             FROM BKP_person_inst_assignment a
            WHERE a.person_id = b.person_id);

  
DELETE
  FROM person_inst_assignment b
 WHERE EXISTS
          (SELECT 'X'
             FROM BKP_person_inst_assignment a
            WHERE a.person_id = b.person_id);


the above queries counts need to march then only our query is correct.like this we need to prepare all the scripts for the tables which are containing the bad data.

Step-5: from informatica u can reload the data based on the Source Qualifier Query 


Friday 13 September 2013

ERROR:ORA-02292: integrity constraint () violated - child record found in informatica

ERROR:ORA-02292: integrity constraint (FDS.SYS_C00740002) violated - child record found


Reason: you are trying to delete the parent record which is having the chaild record

solution:

execute the below query with the constraint name :

select * from all_constraints where constraint_name='constraint name'

ex:

 select * from all_constraints where constraint_name='SYS_C00740002'

you can find the table name in the result set.for that table you need to disable the constraint or if that table is not needed you can drop  the table.