Mathematical Interpretation of Dimensional Modeling
Venkata Prasanna Kumar Ganduri

Dimensional modeling is a key concept in a Business Intelligence project, and much of the project's efficiency and effectiveness rests on it. Working out the relations between facts and dimensions to get the data the end customer wants can be quite confusing. It is easy when you deal with small tables, but with terabytes of data it becomes complex. This mathematical interpretation of dimensional modeling is an attempt to reduce the complex understanding of measures and facts to simple mathematical equations.

This paper has two aims. The first is to help newcomers understand the technique of dimensional modeling by explaining how it can be done. The second is to ease the dimensional modeler's job of checking whether a desired measure can be achieved across any number of fact and dimension tables.

Introduction

A dimensional model usually consists of fact and dimension tables, irrespective of the relations existing in the data. Picking the right data from the OLTP tables and arranging it into the respective dimensions and facts is an obvious challenge for any data modeler. Dimensions and facts are defined in different ways in different books. Let us reduce the complexity of those definitions using simple vocabulary.

Facts: Facts are obtained by posing a simple question, "What?", to the existing raw data. A fact (or measure) is therefore defined simply by knowing what data we have to analyze and observe across different cases.

Note: This sticks to the rule that all facts are quantifiable, whether additive, semi-additive or non-additive.

Dimensions: Along the same lines, dimensions are obtained by asking "How?" of the "What" (the fact). How a fact can be achieved is a dimensional attribute. All the dimensional attributes that fall under the same category or group form a dimension table.
The figure above is a sample depicting a 'Student Academic Fact database', where dimension tables are built by category of use and facts such as end_of_year_status and degree_completed are measured across them. This concisely describes the general approach followed in designing a dimensional model.

Mathematical Interpretation

Interpreting a dimensional model mathematically is an unusual idea. It grew out of the word 'dimensions', which is close in meaning to 'axes' in a coordinate system. It is neither profound nor hard to apply. Dimension tables are similar to dimensions in elementary mathematics, and facts are the points located by the intercepts on the respective axes linked to them.

Consider a three-dimensional coordinate system in which a point P is located using the X, Y and Z axes. It is written P(x, y, z), where x, y and z are the coordinates along the X, Y and Z axes respectively. Unless all three coordinates are known, the point cannot be located uniquely in a 3D Cartesian coordinate system. As a mathematical equation it is represented as

(x)X + (y)Y + (z)Z = P    [Eq-1]

Returning to the 'Student Academic Fact database' figure, the model can be redrawn with its six dimensions as axes. Imagine it as a 6D coordinate system from which you can trace whatever fact (the 'what') you want. For each combination of dimensional attributes you get a unique value of the quantifiable facts end_of_year_status and degree_completed. Along the lines of the coordinate system, each fact is represented as E_O_Y_S(SDT,H,T,P,ST,SC) and D_C(SDT,H,T,P,ST,SC).
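The coordinate view can be made concrete with a minimal Python sketch. It assumes a fact table can be modeled as a dictionary keyed by a six-part coordinate. The field names follow the abbreviations SDT, H, T, P, ST and SC; every key and value in `FACTS`, and the helper names `e_o_y_s` and `candidates`, are invented purely for illustration.

```python
from typing import NamedTuple


class Coord(NamedTuple):
    """One point in the 6D space; field names follow the paper's abbreviations."""
    sdt: str  # Student dimension
    h: str    # Household dimension
    t: str    # Time dimension
    p: str    # Project dimension
    st: str   # Status dimension
    sc: str   # School dimension


# Hypothetical fact rows: coordinate -> (end_of_year_status, degree_completed).
# All keys and values are made up for this sketch.
FACTS = {
    Coord("stu1", "hh1", "2007", "projA", "active", "schX"): (1, 0),
    Coord("stu1", "hh1", "2008", "projA", "active", "schX"): (1, 1),
    Coord("stu2", "hh2", "2008", "projB", "active", "schX"): (0, 0),
}


def e_o_y_s(coord: Coord) -> int:
    """E_O_Y_S(SDT,H,T,P,ST,SC): the fact value at one fully specified point."""
    return FACTS[coord][0]


def candidates(**known: str) -> list:
    """Points matching a partial coordinate.

    More than one match means the given coordinates do not pin down a fact.
    """
    return [c for c in FACTS
            if all(getattr(c, name) == value for name, value in known.items())]
```

With only the student coordinate known, `candidates(sdt="stu1")` returns two points, which mirrors the remark that a point cannot be located until every coordinate is supplied.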
In mathematical equation form, the fact is written as

(a)SDT + (b)H + (c)T + (d)P + (e)ST + (f)SC = E_O_Y_S    [Eq-2]

where
a - a particular child of any dimensional attribute in the Student dimension
b - a particular child of any dimensional attribute in the Household dimension
c - a particular child of any dimensional attribute in the Time dimension
d - a particular child of any dimensional attribute in the Project dimension
e - a particular child of any dimensional attribute in the Status dimension
f - a particular child of any dimensional attribute in the School dimension

This way of representing a dimensional model may look odd, yet it can explain many concepts underlying dimension tables, answering why we model this way and how the model is worked out in a real-time scenario.

Applying to Schema

A dimensional model is usually constructed in one of two forms: the star schema or the snowflake schema. Both schemas can be interpreted mathematically as follows.

In a star schema, all the dimensions are linked directly to a central fact table, and its general mathematical equation resembles [Eq-2].

For a snowflake schema, refer to the model given below, which is a sales fact table.
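As a rough stand-in for such a model, the sketch below builds a tiny sales fact whose product dimension points on to a product_class table, and then flattens the two into one lookup. The rows, the column names `product_category` and `store_sales`, and the helper names are all assumptions made for illustration; they are not taken from the original model.

```python
# Hypothetical snowflaked dimension: product rows reference product_class rows.
product_class = {
    10: {"product_category": "Snacks"},
    20: {"product_category": "Dairy"},
}
product = {
    1: {"product_name": "Chips", "product_class_id": 10},
    2: {"product_name": "Milk", "product_class_id": 20},
}

# Hypothetical fact rows: one key per dimension plus a single measure.
sales_fact = [
    {"time_id": 1, "customer_id": 7, "promotion_id": 0, "store_id": 3,
     "product_id": 1, "store_sales": 2.5},
    {"time_id": 2, "customer_id": 8, "promotion_id": 0, "store_id": 3,
     "product_id": 1, "store_sales": 1.5},
    {"time_id": 2, "customer_id": 7, "promotion_id": 1, "store_id": 3,
     "product_id": 2, "store_sales": 3.0},
]


def integrated_product(product_id: int) -> dict:
    """Flatten product and product_class into one combined dimension row."""
    row = dict(product[product_id])                      # copy the product attributes
    row.update(product_class[row["product_class_id"]])   # add the class attributes
    return row


def sales_by_category() -> dict:
    """Total the measure per product category, reached through the flattened row."""
    totals = {}
    for fact in sales_fact:
        category = integrated_product(fact["product_id"])["product_category"]
        totals[category] = totals.get(category, 0.0) + fact["store_sales"]
    return totals
```

The point of the sketch is only that the fact table never references product_class directly; the class attributes are reached in two hops, or in one hop once the two tables are combined.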
The product dimension snowflakes into the product_class dimension, and all of this can be represented in mathematical form along the lines of the earlier representation as

(a)T_B_D + (b)C + (c)PMT + (d)S + (e)X = S_F_1997    [Eq-3(a)]
(l)PDT + (m)PDT_C = (e)X    [Eq-3(b)]

where
T_B_D - time_by_day dimension
C - customer dimension
PMT - promotion dimension
S - store dimension
X - integrated dimension
PDT - product dimension
PDT_C - product_class dimension
a - a particular child of any dimensional attribute in the time_by_day dimension
b - a particular child of any dimensional attribute in the customer dimension
c - a particular child of any dimensional attribute in the promotion dimension
d - a particular child of any dimensional attribute in the store dimension
e - the combined attribute value of the integrated dimension
l - a particular child of any dimensional attribute in the product dimension
m - a particular child of any dimensional attribute in the product_class dimension

The combination of these two equations can be interpreted as the desired output of such a snowflake design. The new dimension X is a combination of the product and product_class dimensions, which can be made into a single dimension in SSAS while constructing a cube.

Factless Fact Scenario

A factless fact table is a typical scenario in dimensional modeling where there is usually no measure to count on. Even such cases can be explained through mathematical expressions. Given below is a typical example of a factless fact table that checks the presence of students in a class course. Usually the attendance of a student is either 'yes' or 'no'. However, there is nothing to calculate across the mentioned dimensions except their presence; in other words, the count of the combinations. Despite its name, the factless fact table is not entirely devoid of facts. It has a column holding the count of the combinations that occur, i.e.
to count the number of students present on a day, it is enough to count the rows across a particular course, teacher and time, according to whether the student attended or not. This is the only approach used effectively in such contexts. The mathematical expression is

(a)S + (b)Te + (c)Ti + (d)C = 0    [Eq-4(a)]

But since we are saying there is at least one measure, a 'count' (here, 'attendance'), the mathematical expression becomes

(a)S + (b)Te + (c)Ti + (d)C = A    [Eq-4(b)]

where
A - the fact measure for counting attendance
a - a particular child of any dimensional attribute in the Student dimension
b - a particular child of any dimensional attribute in the Teacher dimension
c - a particular child of any dimensional attribute in the Time dimension
d - a particular child of any dimensional attribute in the Course dimension

Reducing fact and dimension tables to mathematical equations is not a shortcut, but it shortens the time needed to do the same work and avoids confusion. When the same approach is taken to a higher-end application, say one with terabytes of data, it is not easy to manage hundreds of tables and their relationships to other tables. Bringing all such tables under one data source view creates confusion for the modeler. Thus, every concept of dimensional modeling that looks intimidating in a data source view can be reduced to simple mathematical equations, either to fetch the required data or to find a way to link facts and dimensions across any number of tables.
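The row-counting idea behind [Eq-4(b)] can be illustrated with a short Python sketch. It assumes each row of the factless fact table records one student attending one class, so the measure A is simply the number of matching rows. The rows in `ROWS` and the names `Attendance`, `attendance_count` and `by_class` are hypothetical.

```python
from collections import Counter
from typing import NamedTuple


class Attendance(NamedTuple):
    """One row of the factless fact table: only dimension keys, no measure column."""
    student: str
    teacher: str
    time: str
    course: str


# Hypothetical rows; a row exists only when the student attended (an assumption).
ROWS = [
    Attendance("s1", "t1", "2008-01-07", "math"),
    Attendance("s2", "t1", "2008-01-07", "math"),
    Attendance("s1", "t2", "2008-01-07", "physics"),
    Attendance("s3", "t1", "2008-01-08", "math"),
]


def attendance_count(teacher: str, time: str, course: str) -> int:
    """A for a fixed teacher, time and course: count the rows that match."""
    return sum(1 for r in ROWS
               if (r.teacher, r.time, r.course) == (teacher, time, course))


# The same count for every class at once, keyed by (teacher, time, course).
by_class = Counter((r.teacher, r.time, r.course) for r in ROWS)
```

A combination that never occurs simply has no rows, so its count is zero, which corresponds to the right-hand side of [Eq-4(a)].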