Checking and troubleshooting calculations
Now that we have our calculations seemingly correct, it is important to take a closer look so that we can discover any anomalies or instances that might throw off the calculations.
Boundary cases
A common method of performing checks is to look at boundary cases, or the maximums and minimums of our calculations. To do this, follow these steps:
- Start by clicking on the % Utilization column header in our table visualization. This sorts the table in descending fashion, where our highest % Utilization number is at the top of the list. We can see what some anomalies of very high utilization might be, such as % Utilization at the top, which is 175.00% for Cole, Camille.
- To check this number, click on the Data view in the Views bar.
- Click on the Hours table in the Fields pane.
- Find the EmployeeID field in the Hours table and click on the drop-down arrow in the column header.
- Uncheck the (Select all) checkbox to unselect all of the values and then scroll down and check the box next to CCOLE. You could also search for the value.
-
Click the OK button. We can see that 4 hours were billed on a Sunday and another
10 hours on a Friday. Therefore, this calculation is correct since there are only 8 potential billable hours between the two days since Sundays are considered non-working days.
- We can check the next highest case, which is for Perkins, Caitlen, by using the same process and deselecting CCOLE and selecting CPERKINS. Again, we can see multiple billable hours on Saturdays and Sundays. Hence, it appears that the maximums for our % Utilization calculation are actually correct.
-
Clicking back on the Report view, notice that while we have filtered the table in the Data view, we can still see all of the data in the Report view. This is because filtering data in the Data view does not impact the Report view in any way. The filtering
and sorting capabilities that are provided in the Data view are solely for the report author to use when verifying the data.
- Now, click the header for % Utilization in the Table visualization again. The data is now sorted in ascending order, with the lowest values for % Utilization shown at the top.
- There are numerous instances of 0.00% values for cases where Total Billable Hours is blank. However, there is one case for Irwin, Rachelle that shows 0.00% for % Utilization but Total Billable Hours as 6.00. We can filter for RIRWIN in the Data view of the Hours table by using the Search box just above the list of values that can be filtered.
- First, enter CPERKINS in the search field and uncheck CPERKINS from the list of search results that are displayed. Then, enter RIRWIN in the Search field and check the box next to RIRWIN. We can see that the 6 hours were billed on a Sunday. Hence, our calculation is technically correct, and it is a good thing that we used the DIVIDE function with its ability to return a value in the case of a divide-by-zero!
- Back in the Report view and continuing down the list of rows in the table visualization, the next potential anomaly is Beasley, Joni, who has only 2 Total Billable Hours entries but 368 Total Hours entries. Again, by checking the Data view for JBEASLEY, we can see in the footer that there are 71 rows of reported hours but that all of these rows except one are to a Category value other than Billable. A similar circumstance exists for BMACDONALD and SHESTER.
With these checks now complete, we can reasonably conclude that, overall, our calculation is returning valid results.
Slicing
Another common technique that’s used for troubleshooting calculations is slicing. To use this technique, you need to create multiple slicers to slice the data based on different criteria and then check the results and boundary cases for the sliced data. To employ this technique, follow these steps:
- In the Report view, click March on the Month slicer. Here, we can see the curious case of Wagner, Letitia. This employee has an Average of TotalBillableHours entry of 224 but a % Utilization entry of 0.00%.
- Check the data in the Data view by filtering the Hours table for LWAGNER and then right-clicking and sorting the Date column by Sort descending. We can see that LWAGNER had no billable hours in March but did have billable hours in February and January, hence explaining what we are seeing.
- With March still selected, sort % Utilization by descending and observe Cox, Rex at the top of the list with 151.17% for % Utilization. Checking the data for RCOX, we can see that this individual was regularly billing more than 8 hours per day in March and even billed on a Saturday, hence explaining the high utilization.
- Back on the Report view, resize the Month slicer if necessary to free up space on the canvas.
- Click on a blank area of the canvas, select the Slicer visualization, and this time, place the EmployeeType column from the People table into the slicer. This displays a slicer that includes ADMINISTRATION, ASSOCIATE, CONSULTANT, HOURLY, SALARY, and SUB-CONTRACTOR.
Continuing with the scenario we introduced in Chapter 2, Planning Projects with Power BI, Pam immediately notices a problem. ADMINISTRATION, ASSOCIATE, and CONSULTANT should not appear in the list. ADMINISTRATION is a designation for internal employees such as herself who never bill, while ASSOCIATE and
CONSULTANT are old designations for employees that have not been used in years. Pam doesn’t want these employees to be considered on any of the pages of the report. To fix this, follow these steps:
- In the Report view, expand the Filters pane.
- Drag EmployeeType from the People table to the Filters on all pages area of the report. An EmployeeType filter will be created.
- In this filter, select only HOURLY, SALARY, and SUB-CONTRACTOR.
-
Now, select HOURLY in the slicer. Again, Pam notices an immediate issue. Some of the values for % Utilization displayed are not 100.00%. Because HOURLY and
SUB-CONTRACTOR employees are paid hourly, their % Utilization values should always be 100.00% unless they have not billed any hours. This is because HOURLY and SUB-CONTRACTOR employees are only paid for the time they work. This can be fixed by tweaking our Total Hours measure, as follows:
Total Hours =
SWITCH(
MAX(‘People'[EmployeeType]),
“HOURLY”,[Total Billable Hours],
“SUB-CONTRACTOR”,[Total Billable Hours],
“SALARY”,
IF(
HASONEVALUE(People[Name]),
[Total Hours by Employee],
SUMX(
SUMMARIZE(
‘People’, People[Name],
” Hours”, [Total Hours by Employee]
),
[ Hours]
)
),
BLANK()
)
Here, our original calculation for Total Hours is listed on lines 7-17. What we have done is simply wrap this calculation in a SWITCH statement. The first parameter in a basic SWITCH function is an expression that returns a value. The SWITCH statement can then have as many conditions and return value pairs as necessary before ending in a return value that is essentially none of the aforementioned.
To see how this works, take a look at the first parameter, MAX(‘People'[EmployeeType]). This returns the EmployeeType value of the current row of the visualization. Remember that, within measures, we must use some aggregation function when referring to columns. Hence, we could have also used MIN instead of MAX since our expectation is that each employee is listed individually on a row of our visualization. The next line checks to see whether the value that’s returned by this expression is HOURLY and, if so, simply returns the measure for Total Billable Hours.
The next line checks whether the value that’s returned is SUB-CONTRACTOR and, if so, also returns Total Billable Hours. The next line under that checks to see whether the
value that’s returned is SALARY; if so, our original code is used. Finally, the last condition is the default condition if no other conditions are met. In our case, this returns nothing (BLANK) in the event that none of the other conditions are met.
Now, when we select either HOURLY or SUB-CONTRACTOR in our EmployeeType slicer and sort our table visualization in either direction according to % Utilization, we can see that all of the rows are 100.00% except for a few HOURLY employees that display 0.00%. These HOURLY employees are essentially unutilized, which is useful information to know. In addition, we can see that our Total line is also 100.00% for both HOURLY and SUB-CONTRACTOR employees, which is as expected.
Grouping
As a final check, Pam wants to make sure that the % Utilization values roll up by
Division. To do this, follow these steps:
- Clear all slicer selections.
- Select a blank area of the canvas and then, from the Hours table, select Division. This creates a new table with the different divisions listed, such as 1001 Technology, 2001 Accounting, and so on.
- With this new table visual selected, add % Utilization to the Values field well.
Again, it is easy to see that there is some kind of issue. % Utilization for each Division is either 100.00% or 0.00%, and the Total value is 100.00%. Worse, now that we are looking closer, when we don’t have any EmployeeType column selected in our slicer, the total in our original employee table visualization is also
100.00%, which is also not correct. Clearly, we have introduced the measure totals problem into our % Utilization measure as a result of the change we made to the Total Hours measure.
-
A change to our % Utilization measure is required, as follows:
% Utilization =
VAR utilization =
DIVIDE([Total Billable Hours], [Total Hours], 0)
VAR days = COUNTROWS(‘Hours’)
RETURN
SWITCH(TRUE(),
ISINSCOPE(‘People'[Name]) && ISBLANK( days),
-
Checking and troubleshooting calculations 151
BLANK(),
ISINSCOPE(‘People'[Name]), utilization + 0,
VAR tempTable =
SUMMARIZE(
‘People’,
‘People'[Name],
Hours],
” billableHours”, [Total Billable
” totalHours”, [Total Hours]
)
VAR
billableHours])
billedHours = SUMX(
tempTable, [
totalHours])
VAR
hours = SUMX(
tempTable, [
RETURN
DIVIDE( billedHours, hours,0)
)
In the preceding formula for % Utilization, we have taken the last iteration of our
% Utilization measure and added the measure totals problem logic to return the correct aggregated value whenever our visualization does not display data based on individual employee rows.
In this case, we have added a SWITCH statement to our RETURN block. This is a special version of the SWITCH statement that uses the TRUE function as the first argument or parameter. When using this version of the SWITCH statement, we can construct logical TRUE/FALSE statements as the conditions that are evaluated instead of individual values, as is the case in the current Total Hours measure.
The first two condition/value pairs use the ISINSCOPE function instead of HASONEVALUE for ‘People'[Name]. ISINSCOPE is a fairly recent function and was specifically designed to aid in measure calculations for instances of matrix hierarchies, as well as the measure totals problem. The ISINSCOPE function returns TRUE when the specified column is the current level in a hierarchy of levels. Hence, ‘People'[Name] is not in scope for Total lines within table and matrix visuals but is in scope for the individual rows of our table visualization that list % Utilization by employee.
The first condition checks if ‘People'[Name] is in scope as well as checking if the days variable is blank. If both of these conditions are TRUE, then BLANK is returned. The next condition only checks if ‘People'[Name] is in scope and, if so, returns our standard % Utilization calculation using the utilization variable plus 0.
The rest of the SWITCH statement covers the case when ‘People'[Name] is not in scope. In this case, we are in a Total line or some other aggregation/grouping, such as Division. Therefore, we create a table using SUMMARIZE and assign this to a variable called tempTable. This table contains a summary of our table
by employee Name and calculates the values of Total Billable Hours and Total Hours for each row in this table, assigning these values to columns named
billableHours and totalHours, respectively. We then use SUMX to calculate the sum of both the billableHours and totalHours columns within tempTable and assign the result to the billedHours and hours variables. Finally,
we return the value from dividing billedHours by hours.
-
With this change made to our % Utilization measure and by looking at both table visualizations, we can see that the Total lines now agree with one another and that we have correct % Utilization results by Division as well as by employee Name!
As demonstrated in the final calculation for % Utilization, VAR/RETURN pairs can be nested infinitely—well, as close to infinitely as you will likely ever need.
Summary
In this chapter, we took a tour of the Model view of Power BI and learned how to build a data model by creating relationships between separate tables. In doing so, we learned about the different types of relationships, as well as the concept of cross filter direction. Next, we explored the data model we created to understand how we can use fields from different tables in the same visualization to gain insights into our data. Then, we created calculations to fulfill our goal of reporting on utilization as well as building relationships between tables. First, we created utilization calculations using calculated columns and began to understand the limitations of calculated columns and when they should and should not be used. Then, we created utilization calculations using measures in order to enable truly dynamic calculations that respond to user interaction.
Finally, we troubleshot our measure calculations by using a variety of techniques such as boundary condition checking, slicing, and grouping. Now that we have built the data model and required calculations, we are ready to start analyzing our data in order to understand how the data is organized.
In the next chapter, we will explore the many powerful features within Power BI that allow us to break down, analyze, and unlock insights about our data.
Questions
- As an activity, try to answer the following questions on your own:
- What are the seven major areas of the Model view in Power BI?
- What are the four different types of relationship cardinalities that Power BI supports?
- What is cross filter direction and what are the two types of cross filter direction supported by Power BI?
- What are the two ways to create relationships in Power BI?
- What are calculated columns?
- When are calculated columns calculated?
- What are measures?
- When are measures calculated?
- What is the measure totals problem?
- What are three ways of troubleshooting calculations?
Further reading
To learn more about the topics that were covered in this chapter, please take a look at the following references:
- Create and manage relationships in Power BI Desktop: https://docs. microsoft.com/en-us/power-bi/desktop-create-and-manage-
relationships - Work with Data view in Power BI Desktop: https://docs.microsoft.com/ en-us/power-bi/desktop-data-view
- Create calculated columns in Power BI Desktop: https://docs.microsoft. com/en-us/power-bi/desktop-calculated-columns
- Create measures for data analysis in Power BI Desktop: https://docs. microsoft.com/en-us/power-bi/desktop-measures
- Apply many-many relationships in Power BI Desktop: https://docs. microsoft.com/en-us/power-bi/transform-model/desktop-many-
to-many-relationships