The importance of raw, unfiltered collaboration

Challenges within an organisation, within reasons, should be made known to everyone within the organisation, without it being filtered to senior management’s perception of what the involved individuals need to know.

Instead of having just a group of select individuals owning a specific problem, companies should try to tap the skills and contributions from a much wider audience within the organisation. Break the exclusivity and the silos, and you should see increased productivity and quality.

Say, an issue arises, which needs immediate attention. Finance asset class needs to consolidate the old disperse legacy general ledger systems that they have in order to reduce cost of maintaining these disperse systems, as well as to have a consolidated, single truth view of their finances. Or credit crunch prompting the need for a new age of risk management, putting pressure on risk systems to be able to calculate risk in a near-time manner.

When issues arise within a department in a large organisation, it is common that each department would call on the same individuals time and time again to address the different problems. These individuals could be very experienced, highly respected in their fields, with broad knowledge. And they might even call on for external help, perhaps a trusted vendor which they have dealt with in the past, or a consultant.

When they need to engage expertise from other areas, it will be something like this.

 “We need to hire a new person with expertise implementing Oracle GL. Engage the DBA team, as we will need a new Oracle database. Engage the hardware team, as we need an application server running Java, a reporting server with WebSphere. Ask for support team to provide example of feeds from each of the old GL system that we are going to replace. Let’s engage the network team, as we will need a 10GB dedicated link between application and database server”

This is what I mean by filtered collaboration. It is following an old fashioned hierarchical line of authority. Information is delivered on “need to know” basis. Somehow, someone higher up the chain decides what piece of information is suitable for each time he/she is collaborating with.

During my various engagements as an Oracle consultant, I have often found it frustrating that information is being drip-fed, filtered, sometimes with genuine intent to save me from confusion; everything has been tailored as if I have no ability to understand anything else that is not related to a database world. In occasions this has led me to make decisions that I would not have made had I been given the bigger picture earlier. 

When companies want to tap to the brightest minds within their organisations, they need to provide them with the whole story, not tailored sub questions, which will only hold them back from contributing to their full potential.

 

Saturday, December 20th, 2008 at 22:46

A report which dynamically ignores all zero value columns

A risk reporting table contains exposure values by counterparty for the last 7 days.

 

The structure of the table is as follows:

 

COUNTERPARTY        VARCHAR2(50)

T                    NUMBER

T-1                  NUMBER

T-2                  NUMBER

T-3                  NUMBER

T-4                  NUMBER

T-5                  NUMBER

T-6                  NUMBER

 

Values for Saturdays and Sundays will always be zero.

T to T-6 will be a moving window, therefore we can not hardcode which days are Saturdays or Sundays.

 

The requirement is to load the data into another table, where the final product will be a CSV, but users do not want to include the Saturday/Sunday all zero columns.

 

How do we do this?

 

A quick way would be to write a code like this

 

 

 

 

 

declare
  v_t number;
  v_t1 number;
  v_t2 number;
  v_t3 number;
  v_t4 number;
  v_t5 number;
  v_t6 number;

  v_sql varchar2(2000);

begin
select sum(t), sum(t-1), sum(t-2), sum(t-3), sum(t-4), sum(t-5), sum(t-6)
into v_t, v_t1,v_t2,v_t3,v_t4,v_t5,v_t6
from table1 ;

v_sql := ‘insert into newtable as select ‘
if v_t <> 0 then append to v_sql
if v_t1 <> 0 then append to v_sql
if v_t2 <> 0 then append to v_sql
etc

v_sql := v_sql ||’ from table1 ;’

execute immediate v_sql;
commit;

end;

Thursday, December 18th, 2008 at 18:14

Scratching the surface of Oracle 11g new features - Part 1

PIVOTING DATA

 

select cust_id, state_code, times_purchased

from customer order by cust_id;

 

   CUST_ID STATE_ TIMES_PURCHASED

———- —— —————

         1 CT                   1

         2 NY                  10

         3 NY                   2

         4 NY                  15

         5 CA                   1

         6 OH                   0

         7 OH                   0

 

select state_code, times_purchased, count(*) as cnt from customer

group by state_code, times_purchased;

 

STATE_ TIMES_PURCHASED        CNT

—— ————— ———-

NY                  10          1

NY                   2          1

NY                  15          1

CA                   1          1

CT                   1          1

OH                   0          2

 

select * from (select times_purchased,state_code from customer t)

pivot

(count(state_code) for state_code in (’NY’,'CA’,'CT’,'OH’))

order by times_purchased

/

 

TIMES_PURCHASED       ‘NY’       ‘CA’       ‘CT’       ‘OH’

————— ———- ———- ———- ———-

              0          0          0          0          2

              1          0          1          1          0

              2          1          0          0          0

             10          1          0          0          0

             15          1          0          0          0

 

INCREMENTAL GLOBAL STATISTICS

Global stats for a partition table can now be updated incrementally based on the changed partition alone.

dbms_stats.set_table_prefs(‘SCHEMA_NAME’,’TABNAME’,’INCREMENTAL’,’TRUE’)

 

And combine this with granularity => ‘AUTO’

 

   dbms_stats.gather_table_stats (
           ownname            => ‘SCHEMA_NAME’,
           tabname         => ‘TABNAME’,
           partname        => 'P_FEED_12345’,
           granularity     => 'AUTO')

 

RESULT CACHING

This thing is very exciting (well, at least to me, anyway). I think this is going to be so useful for long running queries, or queries that involve large amount of data. This is not to be confused with materialized views.

 

Just imagine for a few second that this is a very complicated , resource intensive long running query…

 

explain plan set statement_id=’hilda1′ for select sum(times_purchased) from customer;

 

set pagesize 25

set linesize 100

set long 1000000

col xplan format a80

 

select * from table(dbms_xplan.display(’PLAN_TABLE’,'hilda1′,’TYPICAL’));

 

PLAN_TABLE_OUTPUT

—————————————————————————————————-

Plan hash value: 487422484

 

——————————————————————————-

| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |

——————————————————————————-

|   0 | SELECT STATEMENT   |          |     1 |    13 |     3   (0)| 00:00:01 |

|   1 |  SORT AGGREGATE    |          |     1 |    13 |            |          |

|   2 |   TABLE ACCESS FULL| CUSTOMER |     7 |    91 |     3   (0)| 00:00:01 |

——————————————————————————-

 

Note

—–

   - dynamic sampling used for this statement

 

13 rows selected.

 

Now we are going do it again, but this time with this /*+ result_cache */ hint

 

explain plan set statement_id=’hilda2′ for select /*+ result_cache */ sum(times_purchased) from customer;

 

select * from table(dbms_xplan.display(’PLAN_TABLE’,'hilda2′,’TYPICAL’));

 

PLAN_TABLE_OUTPUT

—————————————————————————————————-

Plan hash value: 487422484

 

————————————————————————————————–

| Id  | Operation           | Name                       | Rows  | Bytes | Cost (%CPU)| Time     |

————————————————————————————————–

|   0 | SELECT STATEMENT    |                            |     1 |    13 |     3   (0)| 00:00:01 |

|   1 |  RESULT CACHE       | 2rafyj4xfv3f725guctywn2n3t |       |       |            |          |

|   2 |   SORT AGGREGATE    |                            |     1 |    13 |            |          |

|   3 |    TABLE ACCESS FULL| CUSTOMER                   |     7 |    91 |     3   (0)| 00:00:01 |

————————————————————————————————–

 

Result Cache Information (identified by operation id):

——————————————————

 

   1 - column-count=1; dependencies=(HILDA.CUSTOMER); attributes=(single-row); name=”select /*+ resu

lt_cache */ sum(times_purchased) from customer”

 

 

Note

—–

   - dynamic sampling used for this statement

 

19 rows selected.

 

You can flush the cache by executing dbms_result_cache.flush

 

 

VIRTUAL COLUMNS

Normally things like this can only be achieved using a trigger or manual population.

 

create table employee (name varchar2(20), salary number, grade varchar2(50)

generated always as

 (case when salary <=10000 then ‘LOW’

       when salary <=20000 then ‘MEDIUM’

       when salary <=30000 then ‘HIGH’

       when salary <=40000 then ‘are you a CEO?’ end) virtual);

 

 

insert into employee (name,salary) values (’Jack’,10000);

insert into employee (name,salary) values (’Jill’,25000);

insert into employee (name,salary) values (’Tom’,5000);

insert into employee (name,salary) values (’Jane’,32000);

commit;

 

select * from employee;

 

NAME           SALARY GRADE

———- ———- ——————————————

Jack            10000 LOW

Jill            25000 HIGH

Tom              5000 LOW

Jane            32000 are you a CEO?

 

Although the name is “Virtual”, this virtual column will take on physical space in the database, just like the other columns.

 

 

INVISIBLE INDEXES

This feature will be useful when trying to determine an impact an index will have on a query plan.

 

Let’s just create a simple index on our existing table.

 

create unique index customer_ix01 on customer(cust_id,cust_name);

 

explain plan set statement_id=’hilda1′ for select * from customer where cust_id=4 and cust_name=’Customer4′;

 

select * from table(dbms_xplan.display(’PLAN_TABLE’,'hilda1′,’BASIC’));

—————————————————–

| Id  | Operation                   | Name          |

—————————————————–

|   0 | SELECT STATEMENT            |               |

|   1 |  TABLE ACCESS BY INDEX ROWID| CUSTOMER      |

|   2 |   INDEX UNIQUE SCAN         | CUSTOMER_IX01 |

—————————————————–

 

Now let’s make the index invisible

 

alter index customer_ix01 invisible;

 

 

explain plan set statement_id=’hilda2′ for select * from customer where cust_id=4 and cust_name=’Customer4′;

 

 

select * from table(dbms_xplan.display(’PLAN_TABLE’,'hilda2′,’BASIC’));

————————————–

| Id  | Operation         | Name     |

————————————–

|   0 | SELECT STATEMENT  |          |

|   1 |  TABLE ACCESS FULL| CUSTOMER |

————————————–

 

 

You now see that the index has now been made invisible, thus not used by the query.

 

To force a session to “see” invisible indexes you can change session parameter OPTIMIZER_USE_INVISIBLE_INDEXES

 

 

alter session set optimizer_use_invisible_indexes=true;

 

explain plan set statement_id=’hilda3′ for select * from customer where cust_id=4 and cust_name=’Customer4′;

 

 

select * from table(dbms_xplan.display(’PLAN_TABLE’,'hilda3′,’BASIC’));

 

—————————————————–

| Id  | Operation                   | Name          |

—————————————————–

|   0 | SELECT STATEMENT            |               |

|   1 |  TABLE ACCESS BY INDEX ROWID| CUSTOMER      |

|   2 |   INDEX UNIQUE SCAN         | CUSTOMER_IX01 |

—————————————————–

 

A new column has been added to the USER_INDEXES system view to show index visibility.

 

select visibility from user_indexes where index_name=’CUSTOMER_IX01′;

 

VISIBILITY

—————————

INVISIBLE

 

 

 

Monday, December 15th, 2008 at 15:19

Relinking an Oracle Home with No RAC option

I encountered this problem trying to install an ORACLE_HOME for a single node (non-rac) database on a 2 node cluster running RAC databases.

cd $ORACLE_HOME/rdbms/lib
make -f ins_rdbms.mk rac_off
make -f ins_rdbms.mk ioracle

Monday, December 15th, 2008 at 08:01

Sub ledger consolidation program using Microgen Aptitude

A few years ago, I worked in the past with a niche product called OST Business Rules, which empowers finance business users to develop their own accounting engine rules in a nice user friendly manner, therefore reducing the need for heavy development works from Finance IT.

OST was subsequently bought by Microgen, and where OST Business Rules have now evolved to a product called Microgen Aptitude. The concept is still fairly similar. You take various feeds from your upstream systems, either in the form of flat files, or xml messages. Aptitude via standard loader or MQ, will load the data into the database, and enrich them, prior to applying all the required business rules, to feed into various target, such as the general ledger, financial repository (data warehouse), for reporting purposes.

While Microgen Aptitude feels like a godsend to accountants, which allows them to create and implemented accounting business rules relatively quickly, like any other automatic wizzardy type applications, it can be misused, and create a maintenance nightmare.

Microgen Aptitude requires a database server and an application server.

The key point here is that data is processed row by row.

In order to achieve high throughput, the applications relies on sufficient bandwith between the database server and the application server, as well as relies heavily on concurrent process to be able to increase its throughput. So lets say you have 60k accounting events to process, and you have 1 hour to process it, that means will need to process 1000 events per minute. Depending on complexity of required enrichment, plus the complexity of the accounting rules, this could prove to be a lot of events to process within a minute. In this case you will perhaps be running 10 streams in parallel.

So yes, network traffic will be quite busy. You will be getting hot blocks on your tables, as multiple processes are likely to want to write to the same block at the same time.

 

To be continued …

Sunday, December 14th, 2008 at 16:28

Experience vs Intelligence

I have always wondered what factor is  more important in building a good team. Staff with years of experience, or a team full of intelligent people? I hear you say, you can’t choose one way or the other, you need both. Of course, but if you have to choose, which one do you think is more important?

 

A lot of IT departments in financial institutions advertise their jobs with something along these lines:

 “at least 5 years of experience in Oracle, with experience working in large financial institutions”

 Staff with years of experience (especially with the tag of “have worked in other banks before”) are normally the preferred candidates. This is normal, because employers would like to have these people hit the ground running. Give them minimum training and orientation, and off they go. Don’t forget though, that naturally these experienced candidates also carry higher price tags.

 Can you take a bunch of inexperienced people, and turn them into a valuable member of staff? Two requirements are needed:

  • Intelligent candidate
  • Good mentoring

 Recently I spent three days remotely mentoring a bright twenty-two year old in India. She is a good 10 years younger than the average age of London staff. Her name is Smita, and she works as an outsourced junior DBA, while at the same time still pursuing her master degree in Computer Science. Over the few months I have known her, she really stands out from the crowd. She picks up mundane DBA tasks, and is enthusiastic about it, she follow things up properly and thoroughly. If she/other outsourced staff can’t fix some of the more technically challenging tasks, the senior staff in London would normally take over. Smita would “pester” me relentlessly “please tell me how you fix it!”. Or “how do you know where to look?”

 

The last 3 days I showed her how to perform an Oracle 10g upgrade on a database. This has been postponed for ages by the London staff because it contains an old installation of HTML DB, which needs to be upgraded to Oracle Application Express. There is a large backlog of databases to upgrade, and not enough people in the London team who know how to do it.

 So, here I was, spending hours with her through instant messaging, showing her step by step what to do. Sometimes deliberately letting her make mistakes, and later pointing out how to troubleshoot the error messages and to fix the errors.

 Three short days later (due to the time difference between UK and India ), she did it. She now joins the rank of a few people in our team who knows how to do this particular task. She painstakingly took notes as we go along. “I want to be able to do it in my sleep”, she said. She was ecstatic that she managed to do it, and I am incredibly proud of her.

 On another note, I co-own a software company Claromentis, which provides customizable information management portal and intranet software solutions. Most of the development work is done in Russia, and in the UK we concentrate on project delivery, sales and technical support. Recently we decided to expand our UK staff with customer facing developers, and went through the process of hiring. We already have an experienced development and testing team in Russia, so we decided that we would choose inexperienced candidates, as long as they show intelligence and good attitude. And being based in Brighton definitely helps, as it is a young vibrant city, a university town, so we are not short of candidates. We sieved through tons of CVs, needing to get it right. We finally made our selections. Now this is only the beginning. We will only succeed if we put the effort and time to nurture these young talents - which in itself is a challenge as the experienced Claromentis developers are as I mentioned in Russia. And I know we will.

I also believe that given the opportunity, I would love to take on the challenge to nurture young bright talented people  and teach them as much as I can about Oracle, and other things that matter to shape someone to be a caring, problem aware, customer centric Oracle professional.

 

Saturday, December 13th, 2008 at 19:59

Function based index

I simply hate it when I see join conditions like this

 

WHERE table1.column1= UPPER(table2.column2)

 

If we build the systems ourselves, why can’t we simply ensure that the two columns have consistent data, especially if we are responsible in building application, from the ETL part, processing part all the way to the REPORTING part.

 

If table2.column2 is indexed, the indexed will be ignored in the above join.

 

To get around it, you need to create a function based index on table2

 

CREATE INDEX TABLE2_IX01 on TABLE2(UPPER(COLUMN2));

 

There is no such thing as free lunch though, there is an overhead to using function based index. It’s more costly to create than simple b-tree indexes.

 

While we are on function based indexes, my friend Jimbo has this great trick

 

 

Common problem - you’ve got a zillion row table of trades to process, with a status flag set to ‘PENDING’ in 10 cases and other values in the almost-a-zillion. You want the special rows.

 

Maintaining a convention B-tree index blows as the selectivity is so poor for N. So, how do you do it? I would have always gone for list partitioning with row movement until I devised this trick…

 

Use a CASE statement and a functional index to create an index in which the vast majority of entries are NULL - and so don’t appear. The index is thus tiny.

 

CREATE INDEX trade_trade_status_fidx ON trade ( CASE WHEN trade_status = ‘PENDING’ THEN trade_status ELSE NULL END );

 

Now select based on the condition that you indexed, it’ll use the index and it’ll be very fast.

 

SELECT * FROM trade

WHERE ( CASE WHEN trade_status = ‘PENDING’ THEN trade_status ELSE NULL END

) = ‘PENDING’;

 

 

 

Saturday, December 13th, 2008 at 17:46

Useful 10g AWR historical system views

DBA_HIST_SQLBIND: SQL Bind Information   

DBA_HIST_SQLSTAT: SQL Historical Statistics Information

DBA_HIST_SQLTEXT: SQL Text

DBA_HIST_SQL_BIND_METADATA: SQL Bind Metadata Information

DBA_HIST_SQL_PLAN: SQL Plan Information

DBA_HIST_SQL_SUMMARY: Summary of SQL Statistics

DBA_HIST_SQL_WORKAREA_HSTGRM: SQL Workarea Histogram History

DBA_HIST_STAT_NAME: Statistic Names

DBA_HIST_STREAMS_APPLY_SUM: STREAMS Apply Historical Statistics Information

DBA_HIST_STREAMS_CAPTURE: STREAMS Capture Historical Statistics Information

DBA_HIST_STREAMS_POOL_ADVICE: Streams Pool Advice History

DBA_HIST_SYSMETRIC_HISTORY: System Metrics History

DBA_HIST_SYSMETRIC_SUMMARY: System Metrics History

DBA_HIST_SYSSTAT: System Historical Statistics Information

DBA_HIST_SYSTEM_EVENT: System Event Historical Statistics Information

DBA_HIST_SYS_TIME_MODEL: System Time Model Historical Statistics Information

DBA_HIST_TABLESPACE_STAT: Tablespace Historical Statistics Information

DBA_HIST_TBSPC_SPACE_USAGE: Tablespace Usage Historical Statistics Information

DBA_HIST_TEMPFILE: Names of Temporary Datafiles

DBA_HIST_TEMPSTATXS: Temporary Datafile Historical Statistics Information

DBA_HIST_THREAD: Thread Historical Statistics Information

DBA_HIST_UNDOSTAT: Undo Historical Statistics Information

DBA_HIST_WAITCLASSMET_HISTORY: Wait Class Metric History

DBA_HIST_WAITSTAT: Wait Historical Statistics Information

DBA_HIST_WR_CONTROL: Workload Repository Control Information

 

 

Saturday, December 13th, 2008 at 17:46

AWR - sniffing bind variables - DBA_HIST_SQLBIND and DBA_HIST_SQL_BIND_METADATA

Sometimes when we are asked to review performance of an application, we come across a bunch of codes which uses bind variables, and sniffing the SGA is not all that useful since it will show you the statements with the bind variables (:1, :2), not actually showing what values are being passed on.

Thankfully in 10g, the detective work is made easier.

SELECT * FROM A,B,C WHERE A.COL1=B.COL1 and A.COL1=C.COL1 and A.COL3=:1 and B.COL2=:2 and C.COL3=:3

 

Let’s grab the DBID and SQL_ID

 

select DBID, SQL_ID, SQL_TEXT from dba_hist_sqltext where sql_text like ‘%A,B,C%’;

                                                                        

    DBID   |   SQL_ID    |         SQL_TEXT          |   

 ———-+————-+—————————+-

 1234567890|12345xxxxxxxx|<your statement here>      |                  

                                                                        

 

Then let’s dig in into the AWR to try to find what is being passed on

 

                                                                  select name, position, datatype_string

from dba_hist_sql_bind_metadata

where dbid = 1234567890 and sql_id = ‘12345xxxxxxxx’

                                       

  NAME | POSITION |  DATATYPE_STRING   

 ——+———-+——————–

 :1    |1         |DATE           

 ——+———-+——————–

 :2    |2         |VARCHAR           

 ——+———-+——————–

 :3    |3         |NUMBER            

                                       

 

This will highlight whether there are some implicit conversion done, lets say

col2=’1’ instead of col2=1 as it should

 

Another view that is useful would be

 

select *

from dba_hist_sqlbind

where dbid = 1234567890 and sql_id = ‘12345xxxxxxxx’

 

Saturday, December 13th, 2008 at 17:46

AWR - historical execution plan investigation

Sometimes you will receive complains from users that something that has been running ok for ages on a production server, suddenly is running very very slowly. In 10g, AWR gives you the ability to go and check and possibly compare the execution path of a query at different times in the past.

select DBID, SQL_ID, SQL_TEXT from dba_hist_sqltext where sql_text like ‘%your query%’;

DBID      | SQL_ID      | SQL_TEXT |

———-+————-+—————————+-

1234567890|12345xxxxxxxx|UPDATE STAGING SET BLAH    |

We know that yesterday it was running fine, but today it’s not, and users have confirmed that data volumes are identical between the two dates

select s.begin_interval_time, s.end_interval_time , q.snap_id, q.dbid, q.sql_id, q.plan_hash_value, q.optimizer_cost, q.optimizer_mode

from dba_hist_sqlstat q, dba_hist_snapshot s

where q.dbid = 1234567890 and q.sql_id = ‘12345xxxxxxxx’

and q.snap_id = s.snap_id

and s.begin_interval_time between sysdate-2 and sysdate

order by s.snap_id desc;

When you find this, take note of the PLAN_HASH_VALUE from the 2 different times that you want to compare

Actually the 3 parameters that you need are:

·         DBID

·         SQL_ID

·         PLAN_HASH_VALUE

 

And to check the execution plan, run the following query:

 

select id, operation, options, object_name, cost

from dba_hist_sql_plan

where dbid = 1234567890 and sql_id = ‘12345xxxxxxxx’

and plan_hash_value = <plan_hash_value1>

compare that with the second one

select id, operation, options, object_name, cost

from dba_hist_sql_plan

where dbid = 1234567890 and sql_id = ‘12345xxxxxxxx’

and plan_hash_value = <plan_hash_value2>

 

 

Saturday, December 13th, 2008 at 17:46