#powerquery – How to handle different decimal separator when importing csv files

Recently I have been working on a project where the solution should import a csv file exported from a SQL server. For some reason sometimes the data comes with a , (comma) as the decimal separator and other times with . (dot) as the decimal separator.

This meant that when importing the different files, I had to find a way to dynamically change the culture code setting when importing the file.

Let’s try and open the SalesWithDot.csv file in Power BI Desktop

As my Power BI Desktop uses Danish settings and danes uses , as a decimal separator Power BI desktop will think the values is 30.000 instead of 300 etc. – and we have to tell Power Query that the source is from another Culture – so I click Edit

As we can see from the documentation Table.TransformColumnTypes – the function has an optional parameter called Culture –

And by adding “en-US” to our formula bar we can have the right value in the SalesValue column

But when we use this when importing a file where the sales value is formatted with a , as the decimal separator and we use the same culture value (“en-US”) then we have the problem.

And by changing the culture to da-DK it shows the right values

So how can we make Power Query dynamically determine which local to use ?

In this case I know that the column SalesValue will either contain a comma or a dot – and by checking the first value of the file imported I can select which culture to use – this can be done like this

The step before we “Changed Type” is called “Promoted Headers”

To check whether the Sales Value contains a comma – we can use the Text.Contains function

And we can refer to the first value in the SalesValue column like this

#”Promoted Headers”[SalesValue]{0}

Then

Text.Contains(#”Promoted Headers”[SalesValue]{0}, “,”)

Will give us true if the cell has a comma in the cell before its changed to a decimal value.

When we know this we can change the

= Table.TransformColumnTypes(#”Promoted Headers”,{{“Product”, type text}, {“SalesValue”, type number}, {“SalesUnits”, Int64.Type}}, “da-DK”)

To

= Table.TransformColumnTypes(#”Promoted Headers”,{{“Product”, type text}, {“SalesValue”, type number}, {“SalesUnits”, Int64.Type}}, if Text.Contains(#”Promoted Headers”[SalesValue]{0}, “,”) then “da-DK” else “en-US”)

The if statement will then use the da-DK culture if the SalesValue contains a comma.

And in the example file with the dot

You can download an example file here.

Hope you find this useful – Power On!

Time your Power Queries – #powerbi #powerquery

Sometimes your power queries can get quite heavy and you might need to optimize the steps in your query but how can you calculate the time it takes for your query to load.

Wouldn’t it be nice if you could have Power Query to do it for you

Use Power Query to time Power Query

Well – the Query dependencies window gave me the idea – what if I had a query with the start time of the refresh and then made the sales table dependent on that and then a duration table that where dependent on the sales table

Steps needed

First a query that calculates when the refresh is started

let

Source = DateTime.LocalNow()

in

Source

This will use the DateTime.LocalNow() to set the start time

Now in this example I am loading a Excel file on my local harddrive on 22,3 mb with appx. 365.000 rows.

After a navigation step and a promoted header step – I add a Custom Column where I refer to the Query “Start”

This will add the start date to all rows in a separate column and will make the sales table dependent on the Query “Start”.

Next, we need to calculate the duration in a query that is dependent on the “Sales” table.

Step 1

Create a calculation of Time now.

Step 2

Convert it into a table

Step 3

To make it dependent on Sales I add a calculated column that retrieves the MIN value of the values “Start” in the table “Sales”

Step 4

Rename the columns

Step 5

Now we can calculate the duration in seconds by using the function Duration.Seconds() and subtracting [End] and [Start]

Step 6

And finally convert it to a decimal value

The full query is now

let

Source = DateTime.LocalNow(),

#”Converted to Table” = #table(1, {{Source}}),

#”Added Custom1″ = Table.AddColumn(#”Converted to Table”, “Custom.1”, each List.Min(Sales[Start])),

#”Renamed Columns” = Table.RenameColumns(#”Added Custom1″,{{“Column1”, “End”}, {“Custom.1”, “Start”}}),

#”Added Custom2″ = Table.AddColumn(#”Renamed Columns”, “Query Duration”, each Duration.Seconds([End]-[Start])),

#”Changed Type” = Table.TransformColumnTypes(#”Added Custom2″,{{“Query Duration”, type number}})

in

#”Changed Type”

Then I disable the load of Sales table in order not to have Power Query read the file several times – (OBS be careful if you already have created measures on the table as the disable of load will remove these measures !!!!)

To time the refresh I click the Refresh button

And the card I have inserted in my report will show the number of seconds the query took.

Now let’s see what a conditional column cost

So, in the sales Query I add a conditional column that does evaluation on every row using the contains operator

And click refresh again.

Depending on your scenario you properly run the refresh several times in order to see the effect on your query.

Comments

Please let me know if you have comments or have solved how to time your power queries in another way.

Happy querying

#PowerQuery – Replace Values in one column with the value in another column

Just wanted to share a method in Power Query that might also help you. Today I had a table where the column contained information about both the parent element and child element

In the example the CategoryInfo column contains both the Category and sub Category information.

And I wanted it to be transformed into this

Let’s first add a conditional column that checks whether the row contains a Category and if so insert the text from the CategoryInfo Column

Next up is the Replace Values magic.

The Replace Values dialog doesn’t support referencing a column

In the example I just add a step where I search for C and replace it with “”.

This step is not good enough so I modify the function parameter to this

= Table.ReplaceValue(#”Changed Type”, each [SubCategory], null, Replacer.ReplaceValue, {“CategoryInfo”})

The second argument uses

each [SubCategory]

to use the value in SubCategory in each row at the “Value to Find”

The third argument uses

null

as the Replace With

The fourth argument is changed to

Replacer.ReplaceValue

Instead of Replacer.ReplaceText so it will replace the whole value of the column to search in and replace it with null

This will give us

And now we can use fill Down to insert the CategoryInfo in all the null cells

And finally filter all rows where SubCategory is null to end up the final result

You can download the example file here – link

Power Query On !!!

Extraction of number (or text) from a column with both text and number – #PowerQuery #PowerBI

When you are working with data in Excel or PowerBI the data often contains columns that is a combination of text and numbers.

One example could be like this

If you have this challenge you shouldn’t use Split Columns or Text.Range to do this but check out

Text.Select

Documentation here

And Chris Webb has good example using it for text – here.

My example demonstrates how to work with text but also works with numbers and capitals letters and symbols etc.

Here is how we can extract the House number and Zip Code – use the Custom Column from the Add Tab in the Query Editor window

= Table.AddColumn(Source, “Housenumber”, each Text.Select([Street], {“0”..”9″}))

= Table.AddColumn(#”Added Custom”, “Zip Code”, each Text.Select([Zip], {“0”..”9″}))

And now we have

And one other benefit is that the Function doesn’t return an error when there is no number in the string.

Here is an example file

Hope you find this useful

Power Query is awesome – dynamic table of dates – #powerbi

Many blogposts has been written about how you can create a generic date table in your Power Pivot / Power BI datamodels.

Here is a few examples

https://blog.crossjoin.co.uk/2013/11/19/generating-a-date-dimension-table-in-power-query/

http://www.mattmasson.com/2014/02/creating-a-date-dimension-with-a-power-query-script/

http://www.powerpivotpro.com/2015/02/create-a-custom-calendar-in-power-query/

https://www.powerquery.training/portfolio/dynamic-calendar-table/

But my challenge with these has always been that I had to specify from date and to date either in a worksheet, text file or directly in the Query Editor.

Normally your fact table actually contains the max and min date that you actually want your date table to contain – so in the last model I created I decided to see if I could make the from and to date dynamic based on the max and min date in the fact table.

By using a bit of M magic functions it turned out to be very easy.

Let’s begin

In my example I load the FactInternetSales table from the AdventureWorks database

Now as we can see in the query editor we have a column containing the Order date and the min and max of that column should be used to generate the table of dates.

Convert the OrderDate to Date instead of DateTime

Then add a blank query via the New Source button

In the formula bar you enter the following formula

= List.Dates(List.Min(FactInternetSales[OrderDate]),Duration.Days(List.Max(FactInternetSales[OrderDate])-List.Min(FactInternetSales[OrderDate])),#duration(1,0,0,0))

And you get a list of dates – 🙂

Now we can add columns with the date information we want in our date tableby using the add Column Date – From Date & Time

The date table now depends on the Query FactInternetSales – and with the September update of the Power BI Desktop we can see that via the Query dependencies window

Now Power Query apparently knows that it has to update FactInternetSales before it updates the Date table, so when our facttable is updated the Dates get updates afterwards.

You could by modifying the expression to get the list of dates calculate the first day of the min year and last date of the max year etc. – all this depends on how complete you want the datetable.

Power Query is AWESOME J

Inserting a picture/logo using the Power BI online designer #PowerBI

When you design reports in PowerBI using the online designer and not the Power BI Desktop designer you will notice that you cannot insert a picture via the interface.

With the development speed of PowerBI we will properly get it soon – but until that you might be find this workaround to do it via the datamodel interesting.

Let us switch to the Power BI Desktop designer and open a new report to create a datamodel we can publish to PowerBI.

Inspired by Chris Webb on how to create a table using M – (link) I have created a list of pictures using the #table statement

And this gives me a table with a number of Power BI celebrities J and a link to their twitter picture.

Now set the column ImageURL Category to Image URL

Now I can save the Desktop file and publish this to Power BI.

And now you can use the table or matrix and insert the ImageURL field on to your canvas.

Remember to use the Visual level filters to pick the image you want.

 

 

 

 

 

 

 

Annual accounts for Microsoft Denmark using #PowerBI and XBRL

The Danish Central Business Register -https://datacvr.virk.dk/data/?language=en-gb& contains a lot of data including the possibility to find the annual account for all companies in Denmark.

These are available in two formats – pdf and XBRL format – so for instance for Microsoft Denmark ApS we can find the latest Annual Accounts (financial statement) via this search

https://datacvr.virk.dk/data/visenhed?enhedstype=virksomhed&id=13612870&soeg=microsoft&type=Alle&language=en-gb

As you can see, we can get the document via the link XBRL and get the annual account in XML format like this

You can read more about the XBRL format here – https://www.xbrl.org/

So knowing that we can search for companies using their CVR number and find the latest account let’s try to retrieve the data using Power BI Desktop.

Find the company data

Open the Power BI Desktop and choose get data – and the source is Web

This will give you the HTML document

But we want to find the links to the XBRL accounts so we actually don’t want it opened as an HTML document – so click Edit and in the steps pane click the source to change it to open the document as a Text file

And delete the navigation step after Source.

Now we can find all the rows where XBRL is

This gives us the three rows with information about the link to the accounts in XBRL format

Then trim the column and Split the value by ” – gives us a column 1.8 containing the link to the file.

Now we can delete unwanted columns, add an index and filter by 0 in the index column to get the latest account

Now we have a table with a link to the latest XBRL account for a given company.

Retrieve the account information

To get specific data in the accounts we can use this XBRL link to do a new query retrieving the data

So click on Home tab and New source and select Web again

https://datacvr.virk.dk/data/offentliggorelse?dl_ref=ZG9rdW1lbnRsYWdlcjovLzAzLzQwLzg1LzJkL2ZhL2VjZTItNDViMC1iOTFjLWQzOGUzZTIzZTRhYQ

This gives us the navigator to the XML document

If you are interested in specific elements you can tick those – but click for instance Context and click OK and via the Query Settings click source

Click on table in the node – http://xbrl.dcca.dk/fsa

And you get all the account lines

Then expand the Table column and choose all elements

Now in order to get the current years data you can filter by the contextRef c1

Now after some filtering and removing of duplicate rows – we end up with the lines for the profit and loss as seen here

The values are all positive so in order to handle this – we can pivot the data and do multiplication with -1 and unpivot the columns again

Then we can load this data and visualize the Profit and loss in a waterfall chart

By default the waterfall sorts the data by A-Z which isn’t exactly what we want – luckily there is a workaround for this

Choose Edit Queries and add an Index column to the Context query.

Load the data and select the Attribute in the field pane – and in the Modeling select Sort by Column and choose to sort by Index

And after some formatting etc. we can illustrate the Profit & Loss for Microsoft Denmark like this

Now we can wait a few month to the next Account is done and just click refresh.

Using functions and/or parameters you can make this highly flexible – if this post gets more than 10 likes J – I promise to show you how that can be done.

Hope you liked this – if you want to play around with you can download the pbix here.