Specific Aggregation Market Risk System - data volume challenge - part 5
5) Mapping Table
I am reviewing the mapping table that is used to enrich the staging tables. The main structure of the mapping table is quite straightforward, can be simplified as follows:
SRC_FIELD
SRC_VALUE
DEST_FIELD
DEST_VALUE
AMENDED_BY
LAST_UPDATE_TIME
Here is a simple example of how the mapping tables will be populated
|
SRC_FIELD |
SRC_VALUE |
DEST_FIELD |
DEST_VALUE |
|
CNTRY |
GB |
COUNTRY |
UK |
|
CNTRY |
England |
COUNTRY |
UK |
The update for enrichment will be done as follows:
UPDATE STAGING
SET COUNTRY = (select dest_value from mapping_table m
where m.src_field_name = ‘CNTRY’
and m.dest_field_name=’COUNTRY’
and rownum < 2)
where data_feed_id=12345;
What I am questioning here is why we need the rownum < 2?
Surely the data should be unique.
And unfortunately it is not.
select count(*), src_field, src_value, dest_field
from mapping
group by src_field, src_value, dest_field
having count(*) > 1;
The above query should return no rows, and unfortunately in this case, it returns quite a lot of rows.
In this case, developers were having a problem with the update statement, and decided to work around it by using rownum < 2. This is unfortunately is a lazy workaround, and is not helpful. This needs to be fixed, even though it could be painful and requires interaction with users to find out which rows are actually correct.
Considering there are AMENDED_BY and LAST_UPDATE_TIME fields, it seemed to me there was a half hearted attempt to perform an audit trail of changes in the mapping table, however it is incomplete. An IS_ACTIVE flag would help here, there the update can simply be modified as follows.
UPDATE STAGING
SET COUNTRY = (select dest_value from mapping_table m
where m.src_field_name = ‘CNTRY’
and m.dest_field_name=’COUNTRY’
and isactive=’Y’)
where data_feed_id=12345;
Not only this means that we can now create a useful unique index against the mapping table, which would help efficiency, this also means that we will actually be enriching correctly!
The be continued …