In this chapter, we will create and deploy a multidimensional cube in SQL Server 2019. We will use the existing data in the WideWorldImportersDW database that we uploaded in Chapter 1, Analysis Services in SQL Server 2019. This data has already been organized using dimensional modeling techniques. As you work through this chapter, you will learn how to create the Analysis Services project and build out a functional cube. Once it has been built out, we will deploy it and review more advanced techniques that will automate cube processing.
In this chapter, we're going to cover the following main topics:
In this chapter, we will be using the WideWorldImportersDW database from Chapter 1, Analysis Services in SQL Server 2019. You should connect to the database with SQL Server Management Studio (SSMS). You will be using the schema and views you created in Chapter 3, Preparing Your Data for Multidimensional Models. If you are starting with this chapter, you will need to apply the view from Chapter 3, Preparing Your Data for Multidimensional Models, to the WideWorldImportersDW database before we start.
This chapter will also require the use of Visual Studio 2019 Community Edition to create the Analysis Services project.
Analysis Services databases are created in Visual Studio. In this section, we will create the project and connect to the database views we created previously. This will create a data model in Analysis Services that will serve as the basis for the rest of the work we'll do in this chapter. Let's get started:
a) Project Name: WideWorldImportersMD
b) Location: This will be the location where you want to store the project
c) Solution Name: WideWorldImportersSSAS
Naming your Visual Studio project and solution
There are a couple of comments to be made on the names. We can put this project and the tabular project we'll create later into the same solution. This will simplify the process as we can add a project in later sections.
Congratulations! With that, you have created the project. If you are new to Visual Studio, now is a good time to review the integrated development environment (IDE). When you have a new project, the two key features you need to know about are the design surface and the Solution Explorer. In the following screenshot, we have highlighted the design surface (1) and the solution explorer (2):
As we continue the process of getting the project ready so that we can build out our first cube, we will be working with the solution explorer, which will add tabs or pages to the design surface.
The next few sections will cover the base items that will support the entire project. First, we will create the connection to the WideWorldImportersDW database that we created. Then, we will create the data model, which will be the basis for cube development.
SQL Server Analysis Services (SSAS) supports connecting to many different data systems. For our hands-on example, we will be connecting to SQL Server 2019 and the WideWorldImportersDW database.
Database connections are created in the Data Sources section of Solution Explorer. Let's get started:
a) Provider: Choose Native OLEDBSQL Server Native Client 11.0.
b) Server Name: Enter your server name here.
c) Authentication: I am using Windows authentication. However, you may need to use SQL authentication and the username and password if you have issues with Windows authentication. You can use the username and password you created when installing SQL Server in Chapter 1, Analysis Services in SQL Server 2019. This is not an uncommon experience when performing local development with SSAS.
d) Connect to a Database: You can either select the WideWorldImportersDW database from the list or type it in. Either option works.
What if you have more connections in your list?
It is possible to have more connections show up in your Data Connections list. This usually occurs if you have other Visual Studio projects that have used similar connection processes. Be sure to connect the connection to the WideWorldImportersDW database before proceeding.
Congratulations! With that, you have created the data source. This is now in the Data Sources folder. Now, we are ready to create our data source view.
Data source views (DSVs) serve as the translation layer between the source of your data and the SSAS model. In Chapter 3, Preparing Your Data for Multidimensional Models, we created views in the database to serve in a similar capacity. Why both?
DSVs in SSAS serve as an abstraction layer from the underlying data source. This means that data not in a DSV cannot be accessed by the multidimensional model. If you want to use the data for design, it has to exist in the DSV.
Database schemas and security
In the previous chapter, we created a schema and added the views that will support our model build. It is a good practice to use a specific user or system account in production deployments. When combined, these two activities limit access to the schema and data for the model. If you implement this during design, you can create an Active Directory group and give it permissions to the schema. This will limit access to data for developers as well.
DSVs become more important when the multidimensional model designer has no access or influence on the underlying database. As an example, multidimensional models are often created to make data in data warehouses more accessible. This includes non-Microsoft database systems such as Oracle.
In situations where the data warehouse team and the business intelligence team are not the same, multidimensional model development can be significantly hindered if database views cannot be easily created. In a case such as this, a DSV can help with the process. It allows you to add the table and then make changes to support the multidimensional model design.
As we work through this section, we will be creating our core DSV based on the Cube schema we created in Chapter 3, Preparing Your Data for Multidimensional Models. However, we will also add a couple of tables using the DSV directly with the tables. We will then illustrate some changes that can be made in the DSV, such as using a relational view.
The following steps will help you create a DSV from the Cube.Customer view in the Cube schema. Let's get started:
Adding related tables to the Data Source View Wizard
The option to Add Related Tables is the button below the Included objects list. This button will add all the tables related to the table you add. For example, if we selected the Sales (Fact) table and then clicked the Add Related Tables button, it would add the Employee (Dimension), Customer (Dimension), Date (Dimension), Stock Item (Dimension), and City (Dimension) tables to the Included objects list. This does not work when using relational views. While it can be inconvenient to use relational views, using views is still a better practice in most cases. We do not recommend trading development convenience for maintainability.
Double-click the DSV you just created to open the Design view for the DSV. As shown in the following screenshot, we do not have any relationships between our tables. This is what we will look at next:
We will be adding diagrams for each fact table we included. The diagrams help us keep the views organized around themes. We will also add relationships to finalize our DSVs for the next steps.
Before we start the next section, let's do a quick refresher on our data model. We have created eight dimensions, two of which are current. We have also created two fact tables. The Sales fact view matches the Fact.Sale table in the data warehouse and the data is at the line item level. The Invoice Sales fact view aggregates the date in Fact.Sale to the invoice number level. Therefore, we will create two diagrams that have some dimension table overlap – Sales Diagram and Invoice Sales Diagram.
a) The first way is to drag the column from the Sales table and drop it on the matching column in the dimension table. For example, you can drag City Key from Sales to City Key on City.
b) The other option is to right-click the Sales table and choose New Relationship.
Both of these techniques will open the Specify Relationship dialog. If you drag and drop the column, the columns will be filled in. In that dialog, you can update the source and destination tables that will be used in the relationship. The Source (foreign key) table is the fact table. In our case, select Sales from the drop-down list. The Destination (primary key) table is the dimension table. In this case, select City from the drop-down list.
To complete this diagram, repeat the previous process and create the remaining relationships. Use the following list to create the remaining relationships. The first two lists are followed by the action to be taken after they are created. The list is organized as follows: Source table, key field > Destination table, key field. Here is the first list:
When creating this relationship, you may be prompted to create a Logical Primary Key in the Item-Current table. Agree to this change by clicking Yes in the dialog that opens as follows:
Here is the second list:
When creating this relationship, you will be prompted to create a Logical Primary Key in the Salesperson-Current table. Agree to this change by clicking Yes in the dialog that opens.
Here is the third list:
Role-playing dimensions
A role-playing dimension is a dimension that has multiple relationships with the fact table. From a data warehouse implementation, it is more space-efficient to create separate relationships from the same table. In our design, the date is used for Invoice Date and Delivery Date, which have distinct relationships but use the same dimensional data. This type of relationship happens often in a data warehouse.
Now that the relationships have been created, your Sales diagram is complete. It should look as follows:
Next, we'll create a diagram for Invoice Sales and its relationships.
Next, we need to create the Invoice Sales diagram. This process is similar to the one we followed for the Sales diagram. Here is the list of relationships you will need to complete. The following list contains Source table, key field > Destination table, key field:
The resulting Invoice Sales diagram should look as follows:
We will now review the DSVs.
Before we move on to building out the dimensions and measure groups, we need to review some additional capabilities available in our DSVs. We are going to update an existing table and add some additional fields to it. To do this, we will replace the table with a New Named Query. Let's get started:
SELECT [WWI Invoice ID]
, [Invoice Date Key] as [Invoice Date]
, CAST(DATEPART(year, [Invoice Date Key]) AS VARCHAR) + '-'
+ CAST(DATEPART(month, [Invoice Date Key]) AS VARCHAR) AS [Invoice Month]
, DATEPART(year, [Invoice Date Key]) AS [Invoice Year]
FROM Cube.Invoice
Now that we have created the Data Source View, we can start adding dimensions.
In SSAS, dimensions are comprised of a group of attributes and hierarchies that define the dimension. Before we get into how, we need to discuss three key topics; that is, dimensions, attributes, and hierarchies:
Now is a good time to remind you that multidimensional databases have been around for many years. As a result, they have been modified significantly and have a wealth of capabilities that are outside the scope of this book.
In the next few sections, we will create our dimensions through the use of the Dimension Wizard window. This is the most efficient method to do this. Once the dimensions have been created, we will create the hierarchies and attribute relationships. This is the most basic development task required to create dimensions. The remaining sections will call out specific, more advanced techniques that can be used improve the usability and performance of the dimensions. Let's get started.
Let's begin with creating dimensions with the dimension wizard with the help of the following steps:
Other dimension creation methods
As shown on the Select Creation Method screen of the Dimension Wizard window, you can create dimensions by generating tables. Two of the options are designed for situations when you need a Date dimension by generating a table in the DSV or underlying data source. While convenient, we typically recommend against this process. The reason for this is that most of the time the wizard does not generate the attributes you need in your model. Most data warehouses, such as ours, have a date dimension created to support this functionality. The third option uses a generic template to create dimension tables. For example, you can create a customer dimension using the Customer Template option. However, you will still need data to support this. If you are looking for design examples for various business needs, this may be a good way to experiment. Typically, you will follow the pattern we have been using and create a relational star schema to support multidimensional model design.
Choosing your dimension attributes
Every field can be chosen as an attribute. Here are some of the key considerations when choosing an attribute:
a) Does the business need the attribute? System values are typically not necessary for development. Examples commonly include lineage IDs and other fields designed to support the load process.
b) Is the field used at all? Just because the name sounds good does not mean it is a candidate for an attribute.
c) The primary consideration is that the attribute needs to have value to the business or is necessary for a calculation. Don't include the attribute just because it is there.
Congratulations! You have created the first dimension in your multidimensional model. You should now see the Design tab for your dimension open in Visual Studio. Now, we need to add our remaining dimensions. For each of the dimensions, you will go through similar steps to what we followed here. We have listed each dimension here, along with any changes you should consider while creating them:
Here is the completed dialog box for your reference. You will also find attribute types for the standard calendar dates, fiscal dates, and ISO:
You should see all your dimensions in the Dimensions folder in Solution Explorer. You should also have all the dimension Design tabs open in Visual Studio.
Using the Business Intelligence Wizard to set attribute types
We could have used the Business Intelligence Wizard on both the City and Date dimensions to set the attributes. The Business Intelligence Wizard is designed to support tasks such as attribute types. Because we did this while creating the dimension, it is not necessary. If you want to view the wizard, right-click the Date dimension and choose Add Business Intelligence. Select Define Dimension Intelligence to view the mapping you just completed. This works with the City dimension as well. This is just one other option you can use to set the attribute types for your dimension.
Before we start working on the hierarchies, we need to combine the dimension key with its name. This is important as we do not need the surrogate key available in the hierarchy or to users as it is meaningless. This process will result in the name or other meaningful attributes being visible to the users while the data is optimized using the key in the background. Let's begin:
This will set the City attribute as the key column for the City dimension. We have highlighted the areas you need to be concerned with in the following screenshot:
This process should be completed for each dimension, before you add the hierarchies. The following is the list of keys and names for the remaining dimensions that need to be modified (not all the dimensions need this update):
Reminder
Be sure to save your work on a regular basis.
Hierarchies are key to good design in all business intelligence models. However, in multidimensional models in SSAS, they are even more important. They enhance the user experience, improve performance, define aggregation levels, and refine storage patterns for the data.
Let's dig into some basics about hierarchies that you should know about. First, we need to differentiate between natural and unnatural or artificial hierarchies:
The natural hierarchy might contain Type (outdoor, indoor), Style, and Product or the item of clothing. However, the business may want to create an artificial hierarchy using different attributes, such as Color and Size. The business now wants a hierarchy that uses the following pattern: Season, Style, Color, Size. The following diagram illustrates the imbalance between natural and artificial hierarchies based on the data:
This is not ideal as it cannot be optimized for SSAS. While it is possible to create these in SSAS, it is typically not recommended. This is particularly true now that tools such as Power BI can handle those requirements easier in their design tools. We will be creating natural hierarchies in our dimensions in the next section.
Once a dimension has been created, every attribute is created as a two-level hierarchy. The levels are All and the attribute itself. As we discussed previously, this is to support aggregations and storage. When measures are added later, each attribute will be aggregated at the All level and the individual attribute level. In some cases, this is what we want. However, we typically do not deal with most attributes in isolation, which is why we create hierarchies.
We will start with the City dimension. Go to the Design tab for the City dimension. If you have closed your tabs or Visual Studio, reopen the project, expand the Dimensions folder, and double-click the City dimension to reopen the Design tab. You should be looking at a screen similar to the one shown here:
We will be working with the Attributes and Hierarchies panes to create our hierarchies. Typically, we should know the hierarchy options through data or business analysis. In this case, we will be creating two hierarchies:
We will create each hierarchy using the following steps:
Staying in the City dimension, you might have seen the warnings in the hierarchies letting you know that the attribute relationships are not in place. That is the next area of focus for our dimension build-out.
Attribute relationships in SSAS support additional query and storage optimizations. SSAS uses the defined relationships to consolidate processing operations, which makes loading data into a multidimensional model (processing) and querying data in the model more efficient. SSAS optimizes storage by using more compression when attribute relationships are defined.
When we created the hierarchies, SSAS assumed that the relationships could be optimized based on the relationships in the hierarchies. Let's add the relationships that support the hierarchies we have created.
When you open the Attribute Relationships tab on the City dimension's Design window, you will see that all the attributes that have been added to our hierarchies have been mapped to City. The remaining attributes are considered direct attributes that have a 1:1 relationship with City:
You can change the mapping in a couple of ways. The easiest and sometimes the most frustrating option is to drag and drop. You can create the relationship between Country and State Province by dragging State Province onto Country. The resulting relationship is State Province > Country. If you do this the other way around, you will need to fix the relationship. You can also adjust or create the relationship by right-clicking on the attribute relationship in the pane on the right, under the mapping window. This will open the following dialog, where you can add or create the relationship you need:
Whichever pattern you choose, you should end up with a set of relationships that match the hierarchies we created.
Flexible versus Rigid attribute relationships
The default type of relationship for attributes is Flexible, which assumes that the relationship could change over time. This is the default and most flexible option, as the name suggests. However, this is less efficient if Rigid is a valid option. Rigid assumes the relationship will not change over time. A great example of a Rigid relationship is in the Date dimension. Years have quarters, quarters have months, and months have days. This will not change. However, an employee dimension could see people promoted to managers. This would require the Flexible relationship type to support the movement of attributes and their relationship over time. Be aware that if you choose Rigid and a change does occur, the model load may fail as a result.
Here is what the attribute relationships should look like when they are mapped correctly:
You need to have followed the preceding steps for each of the dimensions we have created. Let's take a look at the hierarchy definitions for each of the remaining dimensions. Remember to create the hierarchy and then update the attribute relationships for each of these dimensions. We have included the attribute relationship image for each dimension as a reference.
Hierarchy and dimension names must be unique in the multidimensional model
When designing a multidimensional model, it is common to have names repeated in the design. When working with hierarchies and dimensions, your hierarchy names need to be unique within the model; otherwise, you will get a build error. For example, the Current Item and Item dimensions have the same structure, while the Brand hierarchy we are creating will need to be named differently to prevent conflicts. We will also use Customer Hierarchy in the Customer dimension to differentiate the hierarchy from the dimension.
Here is the attribute relationship for the Current Item and Item dimensions:
Hierarchy Name: Current Item Brand
Hierarchy Levels: Brand > Stock Item
Hierarchy Name: Item Brand
Hierarchy Levels: Brand > Stock Item:
Here is the attribute relationship for the Customer dimension:
Hierarchy Name: Customer Hierarchy
Hierarchy Levels: Category > Buying Group > Bill To Customer > Customer:
Here is the attribute relationship for the Date dimension:
Hierarchy Name: Calendar
Hierarchy Levels: Calendar Year (Calendar Year Label attribute renamed Calendar Year) > Calendar Month (Calendar Month Label attribute renamed Calendar Month in the hierarchy) > Date
Hierarchy Name: Fiscal
Hierarchy Levels: Fiscal Year (Fiscal Year Label) attribute renamed Fiscal Year in the hierarchy) > Fiscal Month (Fiscal Month Label attribute renamed Fiscal Month in the hierarchy) > Date
Hierarchy Name: ISO Week
Hierarchy Levels: ISO Week Number > Date:
Once you've created the hierarchies, you should set the relationships up like so:
Here is the attribute relationship for the Invoice dimension:
Hierarchy Name: Invoice Hierarchy
Hierarchy Levels: Invoice Year > Invoice Month > Invoice Date > Invoice Number:
Here is the attribute relationship for the Salesperson dimension:
Now that we have all the dimensions built, along with their hierarchies, we can load and preview the data. The next section describes how you can process your dimensions, which will load the data into SQL Server 2019 Analysis Services.
Processing in SSAS multidimensional models is the method of loading the data into the multidimensional database. At the end of this chapter, we will dive into processing techniques in more detail. The focus of this section is processing our dimensions. We will walk through processing the City dimension in this section. By doing this, you will be able to apply these steps once more so that you can process the rest of the dimensions. Some of the setup here is for the entire project and will be repeated in the processing section at the end as well.
The following steps only need to be done if you have not already set up the deployment properties for your project. For these steps to work, SSAS in multidimensional mode should be running:
Your project should now be ready to deploy to Analysis Services. Let's take a look:
One of the most common and annoying issues with SSAS in a development, non-enterprise network environment is due to impersonation. You may recall when we created the data source at the beginning of this chapter that we used Use the credentials of the current user for impersonation. This impersonation has served us well for development and my work for you as well.
In my setup, this did not work when processing because the current user is the SSAS NT Service account, which is running the service. This is a default service that was user created when my account was set up. To work around this issue, I added that user account to my WideWorldImportersDW database in the db_datareader role. If you have issues with processing, you can use this option. I would not recommend this for production use. A system account with the appropriate permissions should be used to manage this scenario. We will discuss this in detail in Chapter 11, Securing Your SSAS Models. If you want to implement the solution I used quickly, run the following scripts in SQL Server Management Studio when it's connected to your SQL Server Data Engine instance.
First, you will need get your service account name from our Services console or SQL Server 2019 Configuration Manager. In my case, the account name is NT ServiceMSOLAP$DOWSQL2019. Here are the scripts you need in order to add this user to your data warehouse:
USE [master]
GO
CREATE LOGIN [NT SERVICEMSOLAP$DOWSQL2019] FROM WINDOWS WITH DEFAULT_DATABASE=[master], DEFAULT_LANGUAGE=[us_english]
GO
USE [WideWorldImportersDW]
GO
CREATE USER [SSASMDSys] FOR LOGIN [NT SERVICEMSOLAP$DOWSQL2019]
GO
USE [WideWorldImportersDW]
GO
ALTER ROLE [db_datareader] ADD MEMBER [SSASMDSys]
GO
Depending on your environment setup, this option may not work. In some cases, using a Windows account may work. These issues typically affect development environments that are not connected to an Active Directory domain. If you are connected to a domain, service accounts will support a more cohesive solution.
Now that we have successfully processed the City dimension, you can choose to process everything we have created so far by right-clicking the project name and selecting Process All. This will confirm everything is working. If you have any errors, fix them and process everything again. You can also choose to process them one at a time so that you can deal with issues on a smaller scale. You must remember that the process queries the source database and replaces the data with new data. When working with larger multidimensional models, this can be a time issue. If your development environment is not very powerful, you could even experience issues with our project.
Whether you process the entire database or one object at a time, a processing log is displayed at the end. This provides the processing time and row counts for all the dimension attributes and hierarchies that have been processed. You should take a moment to explore the log to see the details of the work you have done. You should also browse the dimensions to see how the data will be presented in end user tools. This is an excellent way to handle issues early in the design process.
If you have taken the time to browse the data, you may have noticed some issues with our dimensions. The obvious issue is ordering dates. We would like them ordered by date, not name or label. We will fix that order in this section, as well as the order for the Salesperson dimension.
Another issue is with the Customer dimension. Bill To Customer and Customer have duplicate values in the Customer hierarchy for the head office of both Tailspin Toys and Wingtip Toys. These are the two customers that are currently available in our data. We will implement ragged hierarchy principles here to hide duplicate values. Let's get started with fixing the sort order in our dimensions.
By default, dimension and hierarchy attributes are sorted by the key in the attribute. You can sort attributes by other attributes in the dimension. Let's fix one so that you have an example to work with. We will start with the Date dimension. SSAS knows how to sort dates properly, so the lowest level, Date, is sorted correctly. However, we need to fix the Month levels. Both the Fiscal and Calendar hierarchies have issues with the Month level sorting. The Year and Date levels are fine. Let's get started:
We can also change our Item and Salesperson dimensions so that they use Name instead of Key for the sort order. Let's walk through changing the Salesperson dimension:
You can apply the same pattern if you have other attributes you would like to change the order of.
Ragged hierarchies are used when a hierarchy has levels that are skipped or end early. This happens often in geography dimensions, for example. If you have customers in Europe and Canada, you may have different levels of hierarchy.
Let's look at a Canadian hierarchy:
Now, let's look at a customer hierarchy in Norway:
How is this handled in the data? Often, we repeat the city name in StateOrProvince if it is not in the underlying data or structure. So, Olso would look as follows in the hierarchy:
The issue with this is that no one wants to see this in their report tools. We can remove the second Oslo in the hierarchy by selecting the StateOrProvince level in the hierarchy and changing the HideMemberIf property for the level to OnlyChildWithParentName. This will make the unused level invisible to the report tools.
Let's put this to use with our Customer dimension. We know that head office for both Wingtip Toys and Tailspin Toys is repeated in the Bill To Customer and Customer levels of Customer Hierarchy. Here is how we resolve this issue:
Impact of ragged hierarchies
In the geography example, we introduced an artificial level that will have no data associated with it. This means that Oslo levels will return the same values , no matter which level we viewed. However, with our example, data may exist at the Customer level for either head office. This means we will not see the specific order of data for the head offices if it exists. We will leave this in place in our model to demonstrate ragged hierarchies. However, you will need to evaluate the user experience to ensure you display the data as expected to your users.
We do not have an example of Parent-Child relationships in our model. However, as you continue to develop your skills with multidimensional design, you will find that implementing this type of relationship is fairly easy. Two of the most common examples are chart of accounts (general ledgers) and employee reporting structures. Often, that data has an unknown number of levels in it. This results in the underlying table having both a key and a parent key. The following is a partial screenshot of the properties window for a dimension with the various properties that support Parent-Child dimensions:
The two key properties to consider are RootMemberIf and MembersWithData. RootMemberIf helps SSAS determine where the top of the hierarchy is. For example, if the top level of your corporate hierarchy is the president and the parent key is null, then you would use ParentIsBlankSelfOrMissing.
MembersWithData helps determine if we display data at intermediate levels. In our previous example, you may have President > Vice President > Director in your hierarchy. You have the option to show data at intermediate levels, which assumes you have data at those levels. In some cases, the only data that matters is at the leaf level. We encourage you to experiment with this property to confirm you have the user experience you need.
One last callout that is unique to multidimensional models is UnaryOperatorColumn. When working with a chart of accounts, the aggregation and signs change as you traverse the hierarchy. For example, expenses are represented as positive numbers until they are at the same level as revenue. Unary operators specify how to aggregate and sign data in the model. This information will have to be part of the dimension table and can be specified here. This is a very powerful implementation and one of the reasons financial analytics are often easier to implement in multidimensional models.
We have done a lot of work with our dimensions already. However, there is one more cleanup task we need to implement to make the user experience better for our users. This task involves removing or hiding attributes that are not required in the dimensions.
For example, the Date dimension should only have the hierarchies exposed. As shown in the Dimension Browser tab, all the attributes that are not used in the hierarchy are exposed to the users. This is not a great user experience as those attributes are not very valuable when they're not contained in a hierarchy:
You can remove these hierarchies from the list by hiding them. Let's take a look at how to do this:
Complete this task for every attribute in the Date dimension you want to hide. We will plan to hide all of them, leaving only the hierarchies we explicitly made visible to users. Hiding these hierarchies only affects the user experience. We can still use these attributes when creating hierarchies or custom calculations.
This concludes our work on dimensions. Next, we will be adding measure groups to see how dimensions slice and dice our data.
In multidimensional models, cubes typically map to fact tables in our data source. They have relationships to all the relevant dimensions and contain measures that can typically be aggregated. Before we dig into creating our cubes and measure groups, let's talk databases and cubes.
In multidimensional models in SSAS, a database generally refers to the overall structure of the SSAS model. When SSAS was first introduced, only one measure group was supported, so the entire structure was referred to the common term cube. As the product matured, the structure became more complex. When we talk about the multidimensional model, we are usually referring to the database. The model or database is made up of dimensions, cubes (which contain multiple measure groups), the DSVs, and data sources. In SSMS, you will see a similar structure to Visual Studio. The project in Visual Studio is the equivalent of the database in SSMS. That being said, cube can also refer to the entire database, even though cubes are specific structures in the model and used by users and developers alike.
Measure groups in SSAS are typically organized around fact tables and share a common set of dimensions. This pattern follows the dimensional model paradigm and is why it is important to have star schemas to support multidimensional models.
In this section, we will create our cube so that it contains our measures. Follow these steps:
You should now see the Wide World Importers Cube's Design window. Congratulations – your first cube with two measure groups has been created! You should see something like the following in your Cube Design window:
The rest of this section will explore the tabs in the Cube Design window. While we may not change each section here, the goal is to make sure you understand their purpose and can implement what you need to deliver specific solutions in your business.
Before moving on to the next section, process the project and resolve any errors.
The specific panes in the Cube Structure tab are Measures, Dimensions, and Data Source View. You can add or modify measures in the Measures pane. If you click a measure, you can review the properties for that measure, including its aggregation, format, and source. In the Dimensions pane, you can add a dimension or go to the Design window for a dimension from that pane. If you need a new dimension or need to remove or add a dimension to the cube structure, you can do that here. The other cube pane is Data Source View. This is the visual diagram of the underlying structure that supports the cube and its measure groups.
Now, we are going to modify some measures. Cubes present measures based on the settings here. The primary focus will be on the aggregation functions and their formats. Let's modify the Quantity, Total Including Tax, and Tax Rate measures in the Sales measure group. You should update these for each measure in both measure groups before leaving this pane. These examples should help you understand the basics.
The Quantity measure is the quantity of items ordered. We will review the key attributes and add a format string in these steps:
With that, you have just formatted the Quantity measure. If you prefer to use parentheses for negative numbers, the format string would be #,##0;(#,##0).
The Total Including Tax measure is the total sales amount for the order, including tax, as the name suggests. We will review the key attributes and add a format string in these steps:
The Tax Rate measure is the tax rate that's applied to each line. We will review the key attributes and add a format string in these steps:
You should be able to use these patterns to update the remaining measures in both measure groups. There are other aggregations that can be used here, including Count, Min, Max, and even None.
Distinct counts in multidimensional models
Distinct count aggregations should only be used in their own measure groups. This is recommended to prevent adverse performance on queries that do not require this measure. If you need to add a distinct count, you should add a new measure group and select Distinct Count for the aggregation type. Then, you should select the attribute to perform the distinct count on. This will create a new measure group in the cube that will manage the distinct count measure. Refer to Microsoft's and other community documentation for additional information on managing distinct count aggregations in multidimensional models.
This tab highlights the importance of the Bus Matrix we referenced in Chapter 3, Preparing Your Data for Multidimensional Models. This tab visualizes the actual implementation of measures with the dimensions. Here is what you should expect to see on this page if your cube has been organized correctly, as per the steps outlined in this book:
Now that we have the dimensions set up, we can add more capabilities to our model.
The next two tabs are the focus of Chapter 5, Adding Measures and Calculations with MDX, where we'll dig into MDX and calculations. For now, you can skip these.
Actions are a great feature available in SSAS cubes. An action will allow you to drill through to details, open a report, or open a URL based on where you are in the data. It uses the data available to build out the action.
Note
Be aware that actions are not supported in all end user tools. For example, they are supported in Excel, but not in Power BI.
We will add a drillthrough action to our cube, as follows:
On the Partitions tab, you will see the default partitions that have been created for our measure groups. Partitions physically separate the data in our measure groups into different buckets. These can be used to reduce processing time and effort for measure groups. While our cube does not require these since processing performance is fine in single partitions, let's walk through adding year partitions to the Sales measure group.
Our Sales measure group contains data from 2013 through 2016. We will create four partitions:
We will use the Partition Wizard to help us create our partitions. In production environments, you need to plan on creating new partitions as new data comes in. Some of those techniques will be covered later in this book. Let's get started:
When you are done, your Partitions tab should have the following partitions for the Sales measure group:
We did not change the Storage Mode option. This option enables support for Relational Online Analytical Processing (ROLAP), which is used for direct relational querying from the cube. This is typically done to support real-time techniques so that we can view data changes as they occur. Multidimensional Online Analytical Processing (MOLAP) is typically the best option to choose for performance reasons. This is the default option and how multidimensional models are stored. Aggregations will be covered in the next section.
Process your model and resolve any errors. Next, we will add aggregations to our model.
Aggregations are used to improve the query performance of your model. A balance has to be struck between too many aggregations, which can bloat the size of the model and ultimately hurt your processing and query performance, and too few aggregations, which keeps the cube smaller but makes performance an issue. When first creating a cube, the best plan is to let the Aggregation Design Wizard help you design them. Once you have deployed the model and there has been a lot of usage, you can use the Usage Based Optimization wizard to target aggregations to improve the user experience directly. For our model, we will use the Aggregation Design Wizard to create our initial aggregations.
Let's add aggregations to our Sales measure group:
We can repeat this process for the Invoice Sales measure group. If you take a look at the Specify Object Count screen, you should notice that the counts are partially filled out based on the work we did with the Sales measure group. Now, process your project.
Perspectives are like views in SQL Server databases. They allow you to create analytic views, which can make browsing the cube or creating reports easier. Let's create a simplified view of our Invoice Sales data:
Translations allow you to supply a specific language for the names of dimensions, measures, and other viewable objects. You can also manage translations for dimension attributes on the Translations tab. You can do this for each dimension.
We have been building out a lot of functionality, and you have likely already used this to view your changes. We recommend processing the cube one more time to make sure all your changes are in place:
Here are some key things to be aware of in the browser:
This concludes the build portion of the multidimensional model.
At this point, you have successfully created an Analysis Services multidimensional project. You have added dimensions and cubes to the project. You have also deployed your project as an Analysis Services database and processed that database so that you can load it with data. You now have a cube that supports basic analytics, which means you can browse the cube right now using tools such as Excel or Power BI. We wrapped up our build and deployment by browsing the data we deployed to the cube.
In the next chapter, we will continue to expand the cube by adding calculations and KPIs to it. We will also explore our data using SSMS and MDX.