Monday, December 08, 2008

Stars and Snowflakes

PART 1

We like to talk about the weather in England, and today it is fine: sunny and cold. Here in the West you can see all the stars at night, while over in the East they are having light snowfalls.
Nature has not yet figured out how to combine a clear, starlit night with snowfall.
You can have one or the other, but not both at the same time.
But can OBIEE defy the laws of nature?

A recent debate got me rethinking why I design the way I do in OBIEE / Siebel Analytics.


Let’s look at the arguments:
There are two main contenders: Star Schema versus Normalised, Kimball vs Inmon. (I’ll not go into the details of these; I’ll assume you know what each looks like.)

Bert Scalzo says in his book on Warehousing "You should not have a star schema with snowflakes in", but he does say 'AVOID snowflakes' which suggests they are not banned completely.

However, many clients I have worked for effectively encourage snowflakes. They're not actually requesting them, but they can't see the point of spending lots of money waiting for the warehouse to be ready for reporting. "The data is in there, so just report on it", they say! This is not helped by salespeople telling clients how easy it is to develop on their existing structure. One client I worked for was convinced that the layers in the database, with their complex ETL, were purely intellectual stimulation for bored contractors (who also got paid by the day!).

Then there is the real killer, the POC (Proof of Concept). It certainly proves that you can report on their data without the hassle of ETL, schedules, downtime, etc.!

Another project I know of is full of snowflakes, and just like the real thing, each one is unique! Again the client is happy with the design because it works for the current reports.

Up until now I have been a convert to the Star Schema, slavishly making sure every table fitted the philosophy, but I need to keep the client happy, which effectively means developing more cheaply. When clients send their development offshore they also tend to abdicate significant design decisions; they just want the system up and running as quickly and cheaply as possible. But that’s life. This is the new game. Get over it. Compete.

Before I revisit my design principles: can Siebel Analytics/OBIEE really cope with snowflakes? Or is it smoke and mirrors? The answer depends upon whether you are looking at the Logical level or the Physical level, and how well you implement a snowflake.

At first glance, the Business Model and Mapping Layer looks like a purely logical level. You normally have logical tables, arranged in a logical star, and the training documents have always decreed that this logical layer must be a star.

But can you use another logical model? What would happen if you tried to organise your logical tables in a snowflake? Do you get an error when you check for consistency? No? Why not? Surely OBIEE does not understand snowflakes, and it should tell you if there is a bad design!?
Let's have a look at a snowflake, or normalised, logical layer.


Here we have one dimension joining to the fact, and two other dimensions joining to that dimension.









So, let's run a report.
First, the list from W_PERSON_D on its own:








Now add the Fact,










NO problem!

Including normal aggregation




Now we add one of the ‘flakes’







And still no problem.

But what if we only query the Flake (W_PERSON_DX) and the fact?







The result is ...





It works!!!


Here is the SQL being produced.


select T519."ATTRIB_11" as c1,
sum(T547."BLAH_VALUE") as c2
from
"W_PERSON_DX" T519,
"W_PERSON_D" T301,
"W_PERSON_F" T547
where ( T301."ROW_WID" = T519."ROW_WID" and T301."ROW_WID" = T547."CONTACT_WID" )
group by T519."ATTRIB_11"
order by 1
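
For anyone who wants to try the statement above outside OBIEE, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names are taken from the generated SQL; the rows are invented purely for illustration, and SQLite is only standing in for the real warehouse database.

```python
import sqlite3

# The statement exactly as OBIEE generated it (see above).
GENERATED_SQL = """
select T519."ATTRIB_11" as c1,
sum(T547."BLAH_VALUE") as c2
from
"W_PERSON_DX" T519,
"W_PERSON_D" T301,
"W_PERSON_F" T547
where ( T301."ROW_WID" = T519."ROW_WID" and T301."ROW_WID" = T547."CONTACT_WID" )
group by T519."ATTRIB_11"
order by 1
"""


def run_generated_sql():
    """Build three toy tables in memory and run the generated statement."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE W_PERSON_D  (ROW_WID INTEGER PRIMARY KEY);
        CREATE TABLE W_PERSON_DX (ROW_WID INTEGER PRIMARY KEY, ATTRIB_11 TEXT);
        CREATE TABLE W_PERSON_F  (CONTACT_WID INTEGER, BLAH_VALUE INTEGER);

        -- Invented sample rows: three people, two attribute values, four facts.
        INSERT INTO W_PERSON_D  VALUES (1), (2), (3);
        INSERT INTO W_PERSON_DX VALUES (1, 'Red'), (2, 'Blue'), (3, 'Red');
        INSERT INTO W_PERSON_F  VALUES (1, 10), (1, 5), (2, 7), (3, 3);
        """
    )
    rows = conn.execute(GENERATED_SQL).fetchall()
    conn.close()
    return rows


rows = run_generated_sql()
print(rows)
```

With these made-up rows the query returns one total per ATTRIB_11 value, even though W_PERSON_DX never joins to W_PERSON_F directly.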



OBIEE added the W_PERSON_D table all by itself.
Remember there is no join between W_PERSON_DX and W_PERSON_F, not physically and not logically.
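
One way to picture what happened: treat the joins as a graph and look for a route from the flake to the fact. The sketch below does that with a plain breadth-first search. It is only an illustration of the idea, not a description of how the BI Server works internally; the function name join_path and the JOINS list are made up for this example.

```python
from collections import deque

# The only joins that exist in the example: the flake joins to the
# dimension, and the dimension joins to the fact.
JOINS = [
    ("W_PERSON_DX", "W_PERSON_D"),
    ("W_PERSON_D", "W_PERSON_F"),
]


def join_path(start, goal, joins=JOINS):
    """Return the shortest chain of tables linking start to goal, or None."""
    # Joins can be followed in either direction.
    neighbours = {}
    for left, right in joins:
        neighbours.setdefault(left, set()).add(right)
        neighbours.setdefault(right, set()).add(left)

    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for table in sorted(neighbours.get(path[-1], ())):
            if table not in seen:
                seen.add(table)
                queue.append(path + [table])
    return None  # no route between the two tables


print(join_path("W_PERSON_DX", "W_PERSON_F"))
```

The route it finds passes through W_PERSON_D, the same extra table that turned up in the generated SQL.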

So, a normalised set of tables arranged with the joins in normal form does work.

So why bother with stars?

(Part 2 to follow)


