Specific Aggregation Market Risk System - data volume challenge - part 5

5) Mapping Table

I am reviewing the mapping table that is used to enrich the staging tables. The main structure of the mapping table is quite straightforward, can be simplified as follows:

 

SRC_FIELD

SRC_VALUE

DEST_FIELD

DEST_VALUE

AMENDED_BY

LAST_UPDATE_TIME

 

Here is a simple example of how the mapping tables will be populated

 

SRC_FIELD

SRC_VALUE

DEST_FIELD

DEST_VALUE

CNTRY

GB

COUNTRY

UK

CNTRY

England

COUNTRY

UK

 

The update for enrichment will be done as follows:

 

UPDATE STAGING

SET  COUNTRY = (select dest_value from mapping_table m

                  where m.src_field_name = ‘CNTRY’

                and m.dest_field_name=’COUNTRY’

                and rownum < 2)

where data_feed_id=12345;

 

What I am questioning here is why we need the rownum < 2?

Surely the data should be unique.

 

And unfortunately it is not.

 

select count(*), src_field, src_value, dest_field

from mapping

group by src_field, src_value, dest_field

having count(*) > 1;

 

The above query should return no rows, and unfortunately in this case, it returns quite a lot of rows.

 

In this case, developers were having a problem with the update statement, and decided to work around it by using rownum < 2. This is unfortunately is a lazy workaround, and is not helpful. This needs to be fixed, even though it could be painful and requires interaction with users to find out which rows are actually correct.

 

Considering there are AMENDED_BY and LAST_UPDATE_TIME fields, it seemed to me there was a half hearted attempt to perform an audit trail of changes in the mapping table, however it is incomplete. An IS_ACTIVE flag would help here, there the update can simply be modified as follows.

 

UPDATE STAGING

SET  COUNTRY = (select dest_value from mapping_table m

                  where m.src_field_name = ‘CNTRY’

                and m.dest_field_name=’COUNTRY’

                and isactive=’Y’)

where data_feed_id=12345;

 

Not only this means that we can now create a useful unique index against the mapping table, which would help efficiency, this also means that we will actually be enriching correctly!

 

The be continued …

 

No comments yet.

Leave a comment

XHTML: You can use these tags: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>