Categories
PowerBI Snowflake Uncategorized

Snowflake and Power BI

In the article I will try to give you a small introduction in how to create tables and views in SNOWFLAKE.

Snowflake is basically a Data Warehouse where you can have many databases.

For this article let’s assume that you have already a created user, and also you user has been added to one or many specific roles that have grant permissions of creating tables. When you enter the first time in you may have the following view:

We are going to click on the option “Classic Console” that is on the left pane of the page, as you can see above.

You’ll be redirected to sign into Snowflake again, that is normal, after you’ve signed the classic console view will have 7 tabs.

For this article only Databases and Worksheets will be described. The “Databases” section shows all the databases you have permissions to see, “Worksheets” section is where the queries can be executed, in this link of Snowflake Documentation there is a very detailed explanation of how it works.

Is SQL Supported in Snowflake?

Snowflake supports a common standardized version of SQL: ANSI. This means that most of the common operations needed in warehousing operations, like create, update, insert, select are supported.

CREATING TABLES

Here’s an example of creating a table:

Before you run the query, make sure you’ve chosen the right Data warehouse, database and schema where the table is going to be:

Then you can write your SQL statement:

Now that our table exists, we can load data from a csv file :

1- Go to the Databases section

2- Choose The database of the table:

3- Choose the table and click on “Load Table”

4- Follow the steps of the pop-up window:

  • Its important to set the property of the “File Format” step: “Header lines to skip” to 1 in order to avoid loading the columns headers of the csv.

5- Check that your data has been uploaded.

CREATING VIEWS

Let’s create a simple view in snowflake.

1- As same as tables, make sure to select the correct, data warehouse, database and schema

2- Insert the SQL statement in the worksheet console:

3- Refresh and check the table has been created:

4- Preview data to check the view is the desired one.

SNOFLAKE AND POWER BI

Now that we have our table and view let’s connect our source to Power BI.

1- Go to “Get Data” and then search for Snowflake source:

2- Need to know the server url and the name of the warehouse, also for advanced options you can insert a SQL statement as a part of the connection to the source, in our case we won’t use that.

3- We must enter the credentials and once the identity is verified we will have access to choose the table or view we need.

4- You can choose “Load” or “Transform Data” according to your convenience.

5- Finally you can use the data from Snowflake in Power BI

“So… this has been a very quick and practical guide to start familiarizing with Snowflake, hope this article can be useful.”

Categories
Dedicated SQL Pool (Data Warehouse) PowerBI

Data Transformation at Different Stages in PowerBI 

There are numerous ways to apply data transformations when creating a dashboard. One of the most important aspects is when to apply these transformations. This decision may impact not only the performance but also the complexity of the dashboard.  

Let us review 4 different options and analyze which one is better under certain conditions. 

Option 1Option 2Option 3Option 4
LanguageDAX (used in PBI presentation layer) M Language (used in PBI data layer) SQL (query database from PBI) SQL (View in database) 
Transformation needs user input YesNoNoNo
ComplexityLow to high LowHighHigh
ReusabilityNoNoNoYes
SampleAverage delivery time in user selected period Time to deliver a product. (Delivery date – order date) Time to deliver a product Time to deliver a product 

Option 1

This is the most frequent option. Transformations are applied at PBI presentation layer by using DAX in measures or calculated columns. These kinds of transformations are only recommended when they depend on some user input. Otherwise, since in previous steps the user input is unknown, every possible choice would need to be considered.  

Note that these kinds of transformations are recalculated every time the dashboard UI changes and therefore may negatively influence the report performance. 

For example, this approach would fit the need to calculate an average value according to a time period selected by the user. 

On the other hand, this option is not recommended for transformations that do not depend on the user context. For example, in a fact table about product deliveries the delivery time could be calculated by subtracting the order date minus the delivery date. 

Option 2

Transformations are applied at the PBI data layer by using the M Language in the Power Query editor. This is a very powerful tool since it has a wide range of transformation tools available with little to no code. This option is ideal when the given transformation does not involve user interaction and neither of the two following options are available. 

Note that these transformations are recalculated by the Power Query Engine every time the report’s dataset is refreshed. 

For example, referring to the example of bad use of the first option, the column “DeliveryTime” can be added with a Power Query step. This is a better approach than using the first option but still could be better with the next two options. 

Option 3

Transformations are applied by using a custom SQL query within PowerBI when the data source is being imported into the report. This option is ideal when the given transformation does not involve user interaction and we cannot create views in the database.  

Note that these queries are executed by the database engine itself every time the report’s dataset is refreshed. 

For example, to achieve the same result as in the previous example, the following import configuration could be used. 

Option 4

Transformations are made directly in the data source by creating a table view that contains the data with the desired transformations. This is the best option as long as the transformations do not involve user interaction and we have enough rights in the database. Also, once the view is created, it can be reused in multiple dashboards. 

Note that these queries are also executed by the database engine itself every time the view is accessed. 

For example, to achieve the same result as in the previous example, a “delivery_time” view should be created with the corresponding grouping. 

Conclusion

There is a key aspect to have in mind when deciding how or when to apply data transformations, the user interaction. If the transformations depend on any user interaction, then it must be performed as in the first option. Otherwise, it must be performed as in the other three options.  

Among these other options the fourth is the most preferable, but sometimes we do not have the required rights to create views in the database and therefore option three becomes useful. If option three is not available due to the type of data source, then the second option is the way to go. 

Categories
Power Apps PowerBI

PowerApps integration in PowerBI

The use of PowerApps in a dashboard is a great advantage as it offers a greater user experience when interacting with our app, and is also a method of reading, writing, and editing information directly to the data sources of the report.

To integrate a PowerApps app on a report, firstly you need to make sure all the data sources required for the app are also in the report and then insert a PowerApps visual. It can be inserted either in the ‘Insert’ ribbon and the ‘Power Platform’ section or in the quick visual menu.

Now we have an empty PowerApps visual that needs to have data assigned. In order to do it, just select the columns your app needs to run while the visual is being selected and it will change, letting you either create or choose an already existing app from one of your environments.

 We recommend that you use the option ‘Create new’, as this option is the only one which provides the possibility of passing values between the power app and power bi. As soon as you select to create a new app, a message will appear to redirect you to the PowerApps application editor. This new app will have a component called ‘PowerBIIntegration’, that will contain all the data you imported from the tables in the PowerBI report.

This new component allows the creator to use the data from the columns included in the visual of the report. The best way to use this feature is with a gallery, a list, or any equivalent way to represent information like this, if any field is filtered in the report, the item or items that meets that filter will be displayed. As an example, an app with a gallery and once in the report without filtering and filtered will look like the following images.

This is the report before applying any filter.

And this is how it looks after the filters.

As you can see when the row ‘Product 2’ is selected from the table, the gallery that contains all of its elements is filtered to only show that element. This can be also used with PowerBI slicers like a list or a dropdown the same way as selecting a record of the table, which will change the gallery from the app. Now there will be three example images with slicer filtering.

Even though the ‘PowerBIIntegration’ component sounds amazing, actually it only allows to read the data imported from PowerBI so if you want to add, delete, or edit any data, you will have to import the complete table where it comes from to the PowerApps app.

To conclude, an important thing to consider is that once you have selected the columns to include and create the new app which has the PowerBIIntegration component, you will not be able to modify the data inside this component. So be sure to have selected all the columns you need or be prepared to start from the beginning again after selecting your data correctly.

Categories
PowerBI

Privacy Levels in Data Sources

Privacy levels of a data source are a good way to increment our reports’ security as they specify the isolation level that defines the degree that one data source will be isolated from other data sources, in other words, how the different sources restrict their information to certain users or does not restrict, apart from how the sources interact with each other by being folded following a hierarchy. Although a restrictive isolation level blocks information from being exchanged between data sources, it may reduce functionality and impact performance.

DIFFERENT PRIVACY LEVELS

The different options to set the privacy level of a data source in PowerBI are None, Public, Organizational and Private levels, each one of them with their own characteristics and contributions to security. From this point we will go through every level explaining the benefits and the case of use.

A None privacylevel data source does not have any privacy which is a bad practice and should be avoided, for cases where no privacy is required, a good practice is to set the level to Public instead of None.

Public data source gives everyone visibility to the data contained in the data source. Only files, internet data sources, or workbook data can be marked Public, and its data may be freely folded to other any other source.

An Organizational data source limits the visibility of a data source to a trusted group of people. Data from an Organizational data source will not be folded to Public sources, but might be to other Organizational sources, as well as to Private sources.

Private data source contains sensitive or confidential information, and the visibility of the source may be restricted to authorized only users. Moreover, data from a private data source will not be folded to other sources, not even to other private sources.

To sum up and give some good practices, it is highly recommended to avoid None level and set the privacy to Public instead, which gives data visibility to every user. It should be configured a Private level only to those sources with sensitive information to avoid leaks or any unwanted changes, but it is not recommended to set every data source that requires any level of visibility restriction to Private, it is better to configure it to Organizational and include the people in which you are interested in having data visibility to the group you trust.

CONFIGURING PRIVACY LEVELS

You can configure the privacy level of the data sources of the report from PowerBI Desktop or Power Query, not from PowerBI Service.  If you are in PowerBI Desktop, open the ‘File’ menu on the top left corner, go to ‘Options and settings’ and then select the ‘Data source settings’ option.

While in the case of Power Query editor, look for and click on the ‘Data source settings’ option on the Home ribbon.

Once you get to the Data source settings, a new panel will appear with a list of all the data sources that are being used in the current report.

To configure the Privacy level of the data source you are interested in, select it and click on ‘Edit Permissions’ and another panel will open up.

In this menu you only need to select the Privacy Level you want from the dropdown marked in red and click on the OK button. Then just close the Data source settings panel and you have changed the Privacy Level of a data source!

Categories
PowerBI

Specify a type when creating new columns Power BI

When working with data in Power BI you often find the need to create a new column, which can be done from the transform data label “add column”. Once in there you choose which way to create the column is more useful in your case, to serve as an example I will be using the custom column option.

After selecting our option, a window will appear to let you enter whichever formula you want to use for your new custom column. Here you can choose the name and the data of the new column, but there is no button to choose the type so, what do we do?

Well, since there seems to be no other option, we click OK and we will find the new column does not have a proper type. We could add it on a new step, but that’s not what we want. Instead, we are going to specify the desired type on our formula bar as shown:

With this your data type should be successfully changed.

Categories
PowerBI

Composite keys in power BI

Data must be treated in a clear, organized way that makes it easy to create relationships between different data tables, or directly to make the visuals that use our data to show information.

In order to give a good structure to our tables, we must identify its primary key. Doing so will allow us a way to clearly refer to every row of data in our table, making it easy to create relationships with other tables using our table’s primary key as a foreign key in the new table to connect them.

This can be obvious in cases that have clearly identified data (for example a list of people with its identifier, even if two people have the same name, surname, and income, each one of them will have its own unique ID so you can tell they are not the same person). But in some other cases, it may get a bit tricky.

I have created fake data to create an example of this tricky situation[1]. Specifically, a cinema data of the expected income per week and film on one table and the track of the tickets sold in one concrete month. Please notice that the data may not be realistic, but that is not our current concern. The power BI data would look like this:

Expected income table
Month sessions track table

We’d probably have more tables (for example one that relates the movie ID and session with the price of the ticket), but we only need this two for the current explanation.

Imagine that with this data, the client asks for a way to see whether the expected income for each movie is being obtained in each week, you will need to use both columns (movie ID and week) to identify the expected income on one table, and you’d need to do the same to count how many tickets were sold in that week for each film (with which you could, later on, relate the tickets to their price). It is easy to notice that we’ll be working with composite keys.

Identify primary and foreign keys

In our “Expected Income” table the primary key would be made of the movie ID and the Week since there’s only one expected income per movie each week.

In the “Month sessions track” table the ticket ID would be the primary key since it is unique for each row of data, but we cannot use it to create the connection we are looking for. Knowing this we look for a key that we can use to create a connection with the “expected income” table.

The first thing that is clear is that we are looking for a one-to-many relationship, where the table with the income has only one row for each pair of Movie ID and Week but the Month sales track will have multiple instances for each pair.  To make it more visual, I’ve made a representative schema with just one pair (of course both tables have more fields as shown with the previous images, but just so you get the idea):   

In some working environments, this kind of relationship is detected by the app or web, and you are given the option to create the relationship between these two tables making use of this composite primary key, but power BI is not one of these environments.

Power BI procedure

For creating a relationship

In order to properly work with this table’s primary key in power BI, we must create a new column combining all the columns that compose the primary key.  This can be done by adding a custom column with the concatenate operator or by adding a column from examples.

If you choose the custom column option, the expression for the new column should look like the following.

There are two crucial details to notice here:

  • You must transform the number into text in order to use the concatenate operator, if you try to use it directly with a number Power BI will complain with errors.
  • The “- “does not add any information but will act as a delimiter for any human who works with this data to easily understand where one field begins and the other ends. I used – but you can use the one you prefer.

Then you should go to the other table and apply the same procedure.

Once done, you have a column that acts as a primary key for the “Expected Income” table and the same column acting as a foreign key in “Month sessions track “. This allows you to tell Power BI to create the one-to-many relationship that we know exist between both tables. To do so close and apply the changes you’ve made in power query editor and get into the power BI’s model environment, add a relationship connecting the new fabricated columns and you are done.

Your new relationship should look like this:

For merging the queries

Depending on your situation you may want to merge both queries into one instead of creating a relationship between them. If this is your case, then power BI does allow you to merge two queries by as many fields as you want.

To do so the first thing is clicking over the merge queries option on power query editor (I will merge them as a new query, but you can choose to do a normal merge if it suits you).

After clicking a helper window will appear, once there you select the tables you want to merge and then select as many columns of them as you need by holding “ctrl” while you click over them. Once the fields are selected, choose the kind of join that better suits your objective and click OK.

After clicking OK your new merged query will be ready for you to modify or visualize as it pleases you.

Merged query
Merged Query after expanding Month sessions track table

[1] To do so I used https://www.mockaroo.com/

Categories
PowerBI

Grouping your queries

When managing data on Power Query Editor we load it from our sources and modify it as we need to please our clients, but sometimes we get errors in the process or need someone to give us some advice, which is a lot easier if our queries are organized in a way that they are easier to go through. I will explain how I recommend organizing your queries to have a better working environment that allows you to find things faster.

Before we start I want to share the link to the page where I got my sample excel files to make the examples in the article: https://docs.microsoft.com/en-us/power-bi/create-reports/sample-datasets#download-sample-excel-files

What is a group?

A group is what Power Query Editor offers you to have your different queries stored in an organized way, you could say that they are similar to a folder. When you first upload your data and click over transform data, the Power Query Editor will start with just the query containing the source of data you have just uploaded.

As it is shown in the picture, there are no groups of any kind, just your query with the data you got from your source. This seems completely fine now, but when you have over 5, 10, 20, or more queries and parameters it becomes messy:

Even with the representative names that I tried to give to my queries, my environment looks unorganized. By grouping those queries, it’s more intuitive where to look for concrete data, either for further development or debugging:

 As you can see, every sheet is grouped depending on whether it has been modified or not, which source path it came from, and so on.

You could freely choose more or less grouping as you find it easier to locate yourself in the environment while working but creating at least two or three groups is definitely going to help your work look more organized.

Also, you can hide the queries you are not interested in working with by clicking over the black arrow, which makes the environment more compact even if you have a million queries (which is very extreme, but you get the idea).

How to create, delete and ungroup groups

To create a group, you have to right-click over the grey space in your Query tab, then press “new group”, after doing so, fill up the information of its name (the description is optional but can help you remember what the data in the folder was for):

After creating your first group, it will be empty, and all the queries will be stored in an automatically generated query. To get the queries you want into your group, just click over them and drag them to your group. After doing so, you will obtain something like: 

This automatically generated group is a feature of Power BI, and you cannot do anything about it. You are able to rename, ungroup and delete every group created by yourself, but none of the above actions can be performed towards the automatically generated group called “Other Queries”.

When working with any other group, to delete it will erase the folder and all the queries it contained, so be very careful about the groups you decide to delete. To ungroup on the other hand will make the folder disappear, but the queries will not disappear with it.

Apart from that, you can also create new groups inside a group if you find it fit.

To either delete, ungroup, rename, add a new group or query, and many other things, you should only right-click on the group you want to perform over as shown in the next image:

As you can see, there are multiple actions you can perform, which are intuitive enough as to not have to explain what they are for.

The most confusing one may be “Properties…” since it will only show you a pop up with the name and description the group was given when created.

Which groups should you create

Of course, this is up to you and the standards of the organizations you are working with, but I will give you the pattern I recommend.

In my opinion, you should always keep your sources just as they came in and make all the changes to them into a query duplicated or referenced to it. The reason for it is that if you notice you are missing data or your query is giving an error, you can check whether it was you who accidentally deleted a necessary column or row, or on the other hand, such information wasn’t included on the query from the beginning.

Therefore, I recommend having at least two groups, one named Sources to store the queries that are just as they were loaded, and a second one named Transformations where you can find the queries that have steps applied to them.

Apart from those two, sometimes you will need a third group to store parameters, this is so because parameters can be referred to either your sources or the transformations in your queries, so if you don’t create an extra group for them and have them mixed with the queries they relate to, it can end up getting messy, which we want to avoid at all costs.

This is the base structure that I’d recommend to everyone, leaving the other Queries table either empty or filled with queries created by yourself (meaning they do have steps but do not come from any source in your Sources group).

Categories
PowerBI

Workspace Access in Power BI

When you add a person to a Workspace the first thing that should be kept in mind is what level of permissions you want to grant. In Power Bi exists 4 roles of access that allow different types of modifications in the Workspace, being Admin the role with more permissions and Viewer the less.

How to add a user and assign a role in your workspace

In order to add a user, you need to be in your workspace and find the button “access”, click it.

After it, a window will open and there you can select the level of access you want to give. This is possible to select in the window below the email.

Once the role is assigned, you can change it later in the access window and clicking the more options icon of the user.

Best Practices:

  • Fewer Admins as possible.
  • Assign the lowest role permission possible for the task to do.
  • Assign Admin role if needed to have all the permissions in the workspace.
  • Assign Member role if needed to change permissions but not change the admin permissions.
  • Assign Contributor role if needed to upload or change data in the workspace.
  • Assign Viewer role if only needed to read data.
Categories
PowerBI

Row Level Security

In Power BI Service, roles can be assigned to users, filtering data they can access it. This doesn’t restrict the access to the Datasets.

To do this, firstly, you need to define roles and his appropriate rules in Power BI Desktop, specifying what data is going to be filtered to that type of user. In the modeling tab, click on Manage Roles -> Create

After creating the role, you need to click in every table you want to filter with a DAX expression (only can be TRUE/FALSE). In this example, I will create a role for Spanish users, filtering only sales made in Spain. Once we are done, in the bottom right corner, we click Save. Roles are assigned in the Power BI Service to each user.

This Row Level Security filter, by default, uses single-directional filters. To able bi-directional filters with RLS integrated, you only need to click on “Apply security filter in both directions”

To test it how will look, in the modeling tab, click on View As -> “Your role” (In this example case, Spain). Also, you can set several roles simultaneously, as a user can have more than one or set a specific user.

Assign roles to user it’s very simple. You can add members external to your organization, but you can’t add Groups created in Power BI. In the Power BI Service go to your Dataset -> Security and assign each user to his role.

To validate everything works fine and the role has been assigned properly, click more options right to the role and “Test as role” (pics)

Categories
PowerBI

Create Alerts

Alerts are a powerful tool which brings a very important feature, get notified when data changes. In order to be activated, these alerts will need to meet a certain condition, for example, an alert which will be triggered when total sales are more or equal than 500$.

To create alerts, we firstly need to have/create a dashboard where we will be putting our tiles with data that we want to notify us when changes. Alerts can be set in your personal workspace. Also, it can be set on workspaces with premium capacity or if you have the pro license. Other thing to have in mind, only KPI, Cards and Gauges can be alerts.

Creating a dashboard is very simple. First, you need to enter in your personal workspace (or a shared one), click on New -> Dashboard and create it with the name of your choice, for this article I will call it “Dashboard Tutorial”

Once the dashboard is created, we will be adding our tiles there with the data we want to notify us when changes.

First, we need to pin a tile in the dashboard. To do this, we need to go inside our report, choose the Card/KPI/Gauge we want and pin it to our dashboard clicking the “Pin” icon. 

If we, did it correctly, it will be in our dashboard as a tile

Now, with our tile in the dashboard, the next and most important step is creating the alert. If we hover the mouse over the tile, we can see in the top right corner three dots, click them. A menu will appear, go to “Manage Alerts”

Now, in the Manage Alerts tab, we need to add a rule for the alert, in my example, I want to get notified when my sales surpass 15,000 with a maximum notification of 1 per day. Once we set the rule how we want, our last step is clicking on Save and Close.

When the alert is triggered, a bell will appear above our alert in the dashboard (and if we clicked the “Send me email, too” option, an email will be sent to us too.)