Friday, February 24, 2012

Browse using hierarchies of different dimensions together

Hi,

I have 2 hierarchies in 2 different dimensins (account and product). The dimensions are related to the same fact. The product dimension does not have any custom attribute relationships defined. (because there are none, all attributes are properties of each product in the dim. I have about 60 attributes!)

I have defined hierarchies in both the dimensions. The drill down required is:

Dim_Account's Hierarchy (level1,level2,level3...) then Dim product's hierarchy (level1,level2......)

Using this gives slow performance when the user reaches the drill down to the second hierarchy.

The second dimension (hierarchy) contains about 400,000 rows !

It take about 15-20 seconds.

I have tried in Visual studio, in Proclarity and now in OWC.

1. Will using two hierarchies together give correct results?

2. How to improve the performance?

3. Can I somehow create relationships between the two dimensions, so that I can bring the second hierarchy inside the first dimension and then make it a part of the first hierarchy?

I hope my last question is understandable :-)

Please do reply if any other info is needed.

Regards

Could you explain, with examples, what the Product drill-down hierarchy levels (level1,level2....) represent, since you say that all attributes are directly properties of Product (which is presumably the key attribute)?|||

Hi Deepak,

I will elaborate,

especially because it seems to many as a unreal business scenario.

I have been trying to improve this since months, hope you can help me.

Also, apologies if the post gets too long.

Here is the entire scenario:

Cube:

The cube is made of only one flatened fact table with about 100 measures. plus there are about 100 named calculations in the dsv.

The named calculations were used because to improve query performance, we did not use calculated members.

We have MOLAP partitions for each day (query binding). Each day consisting of about appx. half a million rows. (300000 - 500000).

We need to keep 2 months of data in a rolling window.

The data contained in the fact are stock and share trading numbers.

Dimensions:

There are 4 dimensions:

a. Product

b. account

c. currency

d. date

a. product dimension:

This contains the different products as stocks and shares, their prices, their Issue dates, ratings, description, rank, category, value..and many more which describe the product. There are about 80 attributes. All these are needed because the user might want any of the information while analyzing the information.

Threre is not hierarchy or levels that can be defined because all the attributes give some detail about the product. They all describe the product in different ways.

I know that this may sound weird, but that is how the database is, and that is what the user use and want.

infact the actual database has about 200 attributes which describe the product. we picked the most important ones that were needed.

From a point of a financial application needing real time updates, write-backs , this seems valid

and without which I believe the application would have no value.

I did create a hierarchy of about 4 attributes as per the navigation needs. but the actual data does not relate to different leves, but infact they are all on the same leve.

But I did this to try and improve the query performance. - It still did not improve.

Actually I cannot measure the query performance in this dim, when using hierarchies or only attributes for browsing,

because both take more than 30-40 seconds!

Q1) How to design this dimension? should I break up this dimension into multiple ones? how can I improve query performance.

Q2) while defining attribute relationships, There is a property called 'Cardinality'.

I have tried to define as follows: wherever each a attribute value can have multiple related values of another attribute, then I set it as 'many' otherwise as 'one'.

but I have read somewhere that this property has no effect, and is there for future use . Is this right?

if not how to define this property correctly among the attribute relationships being defined?

currently the 'product' dim contains about 450,000 dimension rows.

b. Account dimension:

This dimension contains account information of traders, stockist, companies, division info, Owner, Areas, multiple Account levels, account category.etc

There are about 50 attributes. Among these some of them have relations. example: each account level is related to its higher account level.

I have defined as many attribute relationships as possible,

Q3) but I have not been able to set the key column with multiple items (key collection). I dont know if this is needed, and how to determine this need. (like in the date, month, quarter, year > date dim example. we define the key collection to contain month and date both)

I could define about 8 Hierarchies according to user navigation needs.

But again I have a design problem here:

The lowest level, key attribute is the Account number.

The account levels, categories, trader accounts all drill down to the lowest account number.

till here is seems alright. but There are about 5 other attributes in this dim, which for each of their values have multiple account numbers.

for example there is an attribute called ManagementId. each managementId can have multiple accountnumbers associated with it.

(this is what I saw from the database er diagram and the model)

So it becomes like this:

multiple account levels, each having one account number.

and multiple managementids each having one account number.

I dont know if multiple account levels can have multiple account numbers!!!! ( so is it a many to many dimension model?)

Q4) what should i do about this? how should I model this? should I break up this dimension into two or more?

but the performance of this dimension is fairly good,

Every drill down takes about 2 seconds to 4 seconds.

currently the 'account' dim has about 200,000 dimension rows .

3. The other two dimensions are date and currency.

date is the usual date with 2 hierarchies (calendar and fiscal) and attribute relationships defined.

The currency dimension currently only contains codes for different currency types, like Yen, Usd, Rup, Eur. ....etc.

but we just use this as a filter. It contains about 200 data items.

Q5) hope this is correct, because there was only one attribute, I dont know if we could ahve

If we think logically, the account and product dimensions are related ,

but I dont know how to relate them.

meaning the users want to drill down from the (account) account levels to (account) accountNumber and then to product level (product dim). (productId, product category, value....etc.)

It would mean that among the dimensions, the product dimension is the lowest granularity dimension.

Q6) How do we model this? How to achieve best performance?

I believe that I am doing something wrong somewhere or in multiple places because,

half a million rows is not a very huge data set for ssas 2005.

I appreaciate your detailed comments and help.

off lately I have not been able to implement some of the replies/help I recieved in this forum.

from which I get a feeling that my questions are ignored. I apologise for confusion. my bad.

Regards

|||

You are correct that the cardinality property on AttributeRelationships currently has no impact on the server behavior.

Using two hierarchies together (in the same query) will give correct results.

You can't create relationships between dimensions in the way you suggest. However if they contents are truely related such that product is the key to account, then you can create a single dimension. Generally, you'll want to create a single dimension for each logical business entity and include in it all the logical properties that are of interest which belong to that entity. Note that you needn't include all such properties in hierarchies or even have them all as browsable.

The best practices document at http://www.microsoft.com/technet/prodtechnol/sql/bestpractice/olapdbpssas2005.mspx and the perfomance guide at http://download.microsoft.com/download/8/5/e/85eea4fa-b3bb-4426-97d0-7f7151b2011c/SSAS2005PerfGuide.doc should prove helpful to you in improving your performance.

|||

Hi,

Thanks for the reply.

Firstly, yes I have gone through some portion of the performance guide, and probably assimilated some of it. I will try to work on it.

Next some things given in the performance guide or any microsoft guide take some things for granted, for example the database design. In my case the backend is very complicated.

Data comes from multiple databases to sybase (as a flattened fact).

but they are not joined in SSAS!, they are joined in the Universal DB2 (federated system), also called as UDB. This data is not in the form of relational tables,

but as a huge flattened fact table.

we have then created different views on top of the flattened fact. (for dimensions, for the single fact table)

The reason we could not directly use the sybase but had to put in UDB was because,

the sybase driver had restrictions to the length of the fields being pulled to ssas. we had to rename the fields to short forms.

currently UDB (DB2) driver also gives a lot of problems, and the processing time has increased from 2 min to 5 min.

The reason I explained the backend was because, since the data is in a flattened table , which we are splitting into views we do not know much of the hidden relationships in data.

due to which I always get the feeling that the data design behind the cube is itself wrong. but I have not way to determine this.

I mean should a set of attributes become one dimension, or be split up into 2 or more.

coming back to my prev questions,

some more help on them please,

1.

<......I did create a hierarchy of about 4 attributes as per the navigation needs. but the actual data does not relate to different leves, but infact they are all on the same level ...... >.

The doubt I have here is: when we create a hierarchy, the different levels of the hierarchy need not be really depicting data at different levels, do they?

I can have 3 levels in an hierarchy, but all the three levels actually are on the same level in the logical view.

for example I might be showing the foll three: dim_car > car_color, car_size, car_weight.

since all the three are just properties of a car, and thus are actually at the same logical level.

but putting them in 3 levels of a hierarchy would actually define the navigation path for the user.

and also there is no relation between the size, color and weight (no attribute relatinships)

thus the hierarchy does not become a natural hierarchy.

2. can I break up a dimension into multiple ones? by moving the logically related data to different dimensions.

will this improve the query performance? if so what should I consider before doing it (like key attributes, their relations with fact)?

3. <........... but I have not been able to set the key column with multiple items (key collection). I dont know if this is needed, and how to determine this need. (like in the date, month, quarter, year > date dim example. we define the key collection to contain month and date both) ...............>

any ideas on this?

4. Determining a many-to-many dimension,

This is easier if they are all different dimensions.

but in my case they might be just attributes of the same dimension,

so I might have to seperate them and build a new dimension and then use the many-to-many dimension design.

so the question is how to indetify this from the underlying data?

and can this impact the performance?

sorry I know some of these might be answered in the performance doc,

but the scenario becomes different to relate to sometimes.

Thanks in advance for helping

Regards

No comments:

Post a Comment