#PowerQuery – Get the population (or any other) data from Statistics Denmark into Power BI

It turns out that the blogpost I wrote 10 years ago about getting data from Statistics Denmark into Power BI with Power Query still is being used – 😊 – link.

But as the API has changed a bit since then I was asked to do an update of the blogpost – so here is how you can get the population of Denmark imported into Power BI with Power Query.

The source

Statistics Denmark provides you with a lot of information about Denmark. These data can be accessed via this site – Statistics Denmark (dst.dk). When you manually want to download the data you will find all the tables in Statistikbanken (statbank.dk)  – which provides an interface to setup a report that you then will be able to download in different formats.

In this example, I set up a simple report and you can see the export formats here.

As a Power BI developer we don’t want to do this – we want to automatically retrieve new data – API to the rescue

The Statistics Denmark does provide an API that we can use to extract the data automatically.

Its located here – Databank API – Statistics Denmark (dst.dk)

Let’s extract the Population for Denmark

When we click the button β€œSTART CONSOLE” we get this

We want to retrieve data so pick that under 1) Choose API call

Under 2) Choose settings for call we first need to select whether to do a get or a post call – we can do both but a post call requires a bit more manual code so I will stick to a get call. (If you want to know how to create a post call in Power Query – I can recommend this article from Chris Webb – Chris Webb’s BI Blog: Web Services And POST Requests In Power Query (crossjoin.co.uk) or my own here – Specifying JSON Query in Power Query – Example Statistics Sweden – Erik Svensen – Blog about Power BI, Power Apps, Power Query (wordpress.com).

Next is to specify the table name

In the Table id I have specified β€œFolk1a” – you can find the different table names in Statbank (id highlighted in green)

Next – select the Format β€œCSV” as this is the binary format the API will return to Power Query.

When you click more settings, you can change the delimiter if needed.

Next up we will click the Variable- and value codes and this will give us the opportunity to select what data we want to extract from the source.

You can click the individual items/codes or click on the * to get all elements from the variable.

In this example I have picked all KØN (gender), all ALDER (ages), total for CIVILSTAND (marital status), All – TID (time) and all OMRΓ…DE (regions and municipalities)

Under Other settings I leave it to default but you could change the language to English for instance

Under 3) Call the API with the chosen settings

You can see that a URL with the query is created for you.

You can then click Execute to see the result

OOPS – the API is limited to 1.000.000 cells

So, it’s telling me to extract the data either by using BULK or another streaming format instead.

Go back to 2) Choose setting for call

And select BULK under Streaming.

The URL under 3) will now be changed

And you could try and execute the query once again and the result will be shown after a while.

Copy the URL …. And then

Let’s do it in Power Query

Open Power BI Desktop and click get data and choose Web

Paste the URL into the dialog box and click OK

Very quickly you will get this based on the first 200 rows

If the Query is what you want, then click Load but you should really click Transform Data to control what data is imported to your model – so click Transform Data

And wait a couple of seconds for the preview data to load.

The data contains total lines, text after ages, not a proper date for the time and a few other transformations before we are ready to import the data in to our data model.

I won’t go through each of them but have a look at the file.

You can modify the URL to change what time is imported and make it dynamic depending on what you want imported instead of just all periods.

The API is not the fastest so the appx. 1.2 million rows this selection returns takes a couple of minutes to update.

And then we can start to create explicit DAX measures and visualize our data.

In the demo file I have created this page to create an overview of the change in the population between two different periods in top x municipalities in Denmark.

You can download the demo file here – and just click refresh to get the latest data 😊

Let me know what you think and if you like it give the post a like please.

Stay Queryious !!!

Document your #powerbi model with #chatGPT and a #tabulareditor script and browse it in your model

Inspired by Darrens Gosbell’s excellent blog post – Automatically generating measure descriptions for Power BI and Analysis Services with ChatGPT and Tabular Editor – Random Procrastination (gosbell.com) – and my session at SQLBits 2023 where I showed how to call the chatGPT from within Power Query and how you can use the New chat (openai.com) – to describe your M code.

I thought it would be nice if I could add this as a description to my tables in Power BI model.

How hard can that be πŸ™‚

Well, it turns out that we are not allowed to modify the table object via the Tabular #ditor and this includes adding a description to the table

So, if I created a script that added a description – it worked fine but returning to the desktop I was asked to refresh the queries and the descriptions where removed by Power BI desktop –

Well, what is supported as write operations to a model?

Calculation groups is supported – hmmm… could I use this and then actually create a calculation group with a calculation item for each of the tables in my model – that would also mean that I could create a table in my report and expose the descriptions in a report page instead of the description should be hidden away on the tooltip of a table in the field list – This could actually be quite awesome!

C# Script here we go

I know very little about how to write C# code so Darrens example and the documentation of Tabular editor was a big help and please bear in mind that I am a C# novice when reading my code

The steps needed to this is as follows (I have used Tabular Editor 3 but it should be possible to us the free version of Tabular Editor as well

Step 1 – Open your Power BI desktop file and find the external tool ribbon and click the Tabular Editor

Step 2 – Add a Calculation Group called “Documentation”

Step 3 – Choose to create a New C# Script

Step 3 – Use the following script and run it

#r "System.Net.Http"
using System.Net.Http;
using System.Text;
using Newtonsoft.Json.Linq;

// You need to signin to https://platform.openai.com/ and create an API key for your profile then paste that key 
// into the apiKey constant below
const string apiKey = "<YOUR_API_KEY>";
const string uri = "https://api.openai.com/v1/completions";
const string question = "Please describe this power query code for me:\n\n";

using (var client = new HttpClient()) {
    client.DefaultRequestHeaders.Clear();
    client.DefaultRequestHeaders.Add("Authorization", "Bearer " + apiKey);

    foreach (var t in Model.Tables)
    {
        
        foreach ( var p in t.Partitions)
        {
            // Only uncomment the following when running from the command line or the script will 
            // show a popup after each measure
            //
            //var body = new requestBody() { prompt = question + m.Expression   };
                string _type = Convert.ToString(p.SourceType);
                string _exp = Convert.ToString(p.Expression);
        if ( _type == "M" )
            {var body = 
                "{ \"prompt\": " + JsonConvert.SerializeObject( question + p.Expression ) + 
                ",\"model\": \"text-davinci-003\" " +
                ",\"temperature\": 1 " +
                ",\"max_tokens\": 256 " +
                ",\"stop\": \".\" }";

            var res = client.PostAsync(uri, new StringContent(body, Encoding.UTF8,"application/json"));
            //res.Result.EnsureSuccessStatusCode();
            var result = res.Result.Content.ReadAsStringAsync().Result;
            var obj = JObject.Parse(result);
            var desc = obj["choices"][0]["text"].ToString().Trim(); 
          
            //Reference to your calculation group that should hold the calculation Items
            var x =(Model.Tables["Documentation"] as CalculationGroupTable).CalculationItems[t.Name];
            
            //deletes the old version
            x.Delete();

            var calculationItem1 = (Model.Tables["Documentation"] as CalculationGroupTable).AddCalculationItem();
            
            //removes any quotes in the chatGPT description
            var s = desc.Replace("\"", "");

            calculationItem1.Expression = "\"" +  s + "\"";
            calculationItem1.Name = t.Name;
            
            //Info("Processing " + t.Name) ;
            
            }
        }

    }
}

This will create a calculation item for each of your Power Query table and add a Expression that contains the chatGPT description of your M code

Example of one of the tables

Step 4 – Save the changes back to your model and you will be prompted to refresh your calculation group

Step 5 – Add a table where you take the Calculation group name and a measure that I call Query description – the value of this measure will in the table be changed to the expression of the calculation item.

https://tenor.com/embed.js

We could use the same method to document our DAX measures and put that into a calculation group as well – thereby documenting our full model and exposing the information directly in the report and not just in the tooltips of the fields.

I will try to find time to do a blog post on this as well.

Let me know in the comments if you find this useful – would very much like to hear from you.

#PowerQuery – Create a date column from Year number and a month name

Sometimes it’s the little things that can help you minimize the number of steps in your Power Queries.

Here is another example that might help you.

Let’s imagine your data contains two columns with year and the month name and you want to create a date column.

Start by selecting both columns and under the Add Column tab select Merge Columns

Specify the separator as a space and give your column a name (to avoid a rename column step )

This will result in a step

Now modify the step to

And then in one step you have a date column – as I have a Danish format in the format is DD-MM-YYYY.

Happy Querying

#PowerQuery – Convert a referenced query column to a table without Table.FromList() – Magic trick

This week I watched the video from Guy in A cube with Patrick and Alex Powers – You thought DAX? Alex shows you Dynamic M-M-Magic! – YouTube.

And one trick I noticed Alex used was something I didn’t know about Power Query and had annoyed me for a long time.

As you might have seen in my previous posts I like to make the queries as short as possible – and this tip will help me do that even better than my previous tip – How to avoid a rename step in #PowerQuery when you use Table.FromList – #PowerBI – Erik Svensen – Blog about Power BI, Power Apps, Power Query (wordpress.com)

Let’s imagine I have a query with a column called Reportid – and I want to create another query with the unique values from the column.

Normally I would create a new blank query and than reference the column with – queryName[ColumnName]

And then we have a new query that contains a list.

I want a table, so I use the “To Table” in the ribbon or the shortcut menu

Click Ok in the dialog

And a new step is added and I end up with a table

THE TRICK –

Instead use double square brackets and you will not only get a table, but the column name is also as I want it. – Query[[ColumnName]]

That is awesome – hope you find it useful too

Happy Querying!!!

#PowerQuery – Add Year, Month and Day to your date table with Date.ToRecord – #PowerBI

When you build a date table in Power Query you might use the functions under Date to add year, month and day

And this will give you three steps in your Query

But we can do this a bit faster, and you will save a few clicks with your mouse

If you add a custom column using this formula

= Table.AddColumn(#”Changed Type”, “Custom”, each Date.ToRecord([Date]))

You will get a column containing a record

Then next step is to expand the record

And Voila – Year, month and day in 2 steps instead of 3.

Update – 25/10-2021

If you want to specify the datatype for each of the Year, Month and Day you can modify the formula for addColumn to

= Table.AddColumn(#”Changed Type”, “Custom”, each Date.ToRecord([Column1])
, type [ Year = number, Month = number, Day = number]
)

This will make the columns numbers

Happy Querying

How to connect to CDS from #PowerBI – Or where the h.. can I find the server URL

The last few month I have used CDS a few times in my solutions – and connected the data to PowerBI.

But one of the things I always search for is the Server URL

Above is the screen shot using the Common Data Service connector

And even the beta connector requires me to specify URL –

And even though the dialog says “Environment domain” – it is in fact the URL the connector wants – BUT without the https://

OBS – If you want to test the Beta connector – remember to enable TDS Endpoint under the Environments – Settings – Features

So where can I find the address

If you have access to the Admin center – Power Platform admin center (microsoft.com) – you can go into the environment and see URL.

If you do not have access to it – then open the model driven app and the URL is available in the address bar.

Hope this can help you.

Power On !

Tips to limit the number of steps in #PowerQuery – #PowerBI

When returning to some of my first queries built in Power Query, I always get a bit sad.

My old queries contain some unnecessary extra steps that makes it harder to evaluate what is happening in the query and in most cases I haven’t even bothered to rename the steps which makes the queries very hard to read for me and for others.

So, in order to improve here are some tips that can make mine and your queries shorter and easier to read.

Always Rename your Steps

Always spend time to rename your steps and avoid spaces in the step name.

Makes it easier to read and is documentation.

Especially important because the step name does not necessarily describe what is happening.

In this example I used the Remove Other columns function, and the step name refers to that function

– but what is actually happening is I am selecting specific columns – notice the formula bar – Table.SelectColumns

So, rename the step.

Check if you can avoid a column rename step

Using the interface in the Query Editor makes it very easy to get the “Renamed Columns” step but it can sometimes be avoided – For instance expanding a record column – in this example the DimGeography Column

– in this example the DimGeography Column and then double clicking the headers returned will lead into 2 steps but notice the last argument in the formula bar.

That is a list of column names that will be given and by modifying that we can do the expansion and renaming in one step.

Avoid the empty filter steps ie to not filter by (Select All)

When testing my queries, I sometimes filter the result to check the data

And then to clear it I sometimes use the (Select All) to remove the filter

This however leaves a filter step in your query where the Table.SelectRows function just is “each true”.

If you use the clear filter instead

The Filter step will be deleted

Limit the number of Rename Columns

Doing column renaming during the query steps can lead to several column renaming steps.

Try to do only 1 renaming step.

Limit the number of steps with Remove columns

Try to consolidate the column selection or deletion of columns in one step.

And remember the optional parameter in the Table.SelectColumns and Table.RemoveColumns where you can specify what should happen if a field in the column list is missing.

Don’t reorder your columns

You can change the column order in the Query editor but there is no need to do it

The columns will appear in alphabetical order in your data model anyway.

If you have a large table and can’t find the column use the Go to Column to locate it

A good bad example

Yes, this is one of my queries

This is the query after a clean-up – it does exactly the same but in fewer step and with a proper explanation in the step name.

Do you have any tips ?

This was a few tips to limit the number of steps in your query.

Let me know if you have any in the comments.

Stay queryious.

#PowerQuery – Dynamically reference a table via a Parameter

The other day I had a fun challenge from my co-worker – SΓΈren Faurum – that wanted to change the refence to a column name from one table to another table with the same column name.

OBS – The solution is not supported in the Power BI Service.

In his case it was

Let
   x= List.Distinct(tableName1[StoreKey])
in
   x

And he wanted TableName1 to be a value of a parameter.

Here is an example (data is from AdventureWorks) – in my model I have two tables – DimReseller and DimCustomer – both have a GeographyKey

A parameter called TableName should now be used in a query to retrieve either the unique list of values of the key from either DimReseller or DimCustomer.

If we just use

List.Distinct(TableName[GeographyKey])

We do get an expression error as the parameter is a text value and not a table that has fields.

Then I thought we could the function Expression.Evaluate() – link to documentation –

But no – apparently, we can’t reference a Table name like this.

#shared to the rescue πŸ†˜

Then I thought why not use the #shared function – as I knew that #shared will list all the functions, tables, parameters etc. that are available in the current pbix file – including all loaded or not loaded tables.

Then we can turn the record into a table and filter it based on a parameter

This will give us a table with one record.

Where the Value column will contain the table we want to be dynamic

Then by adding the following to the formula

  • {0} to get the first row/record
  • [Value] to retrieve the Table in the record
  • And [GeographyKey]

We can get all the Geography Keys in the table

And by putting all of this into the argument of the List.Distinct function

= List.Distinct(Table.SelectRows(#"Converted to Table", each [Name] = TableName){0}[Value][GeographyKey])

Which returns 510 rows.

And now I can change the parameter to DimCustomer

And see it returns 336 rows

However if we publish the model to the Power BI Service it won’t work if you want to schedule refresh as #shared is not supported in the service.

So we can only use it in the desktop

Stay Querious

Converting xls files to xlsx file using #powerautomate and avoid the pitfalls in #powerquery using xls

As described in this post – https://www.ehansalytics.com/blog/2020/2/15/avoid-using-excel-xls-files-as-data-source
there are issues to be aware off when you use xls files instead of xlsx in Power Query. See also this thread
https://social.technet.microsoft.com/Forums/en-US/41f2c8ec-1f2c-4591-ac6a-54764b2a90a7/bug-in-excelworkbookwebcontents-powerquery?forum=powerquery
.

Answering the twitter started by Imke Feldmann (https://twitter.com/TheBIccountant) thread by Ruth Pozuelo (go follow her excellent youtube – channel – link) encouraged me to write this post – – as I claimed we can convert the xls files to xlsx using Power Automate.

So here is a guide on how to do it

Convert xls files to xlsx

In the scenario I will use a trigger when an e-mail is received and use a rest API provided by https://cloudconvert.com/.

OBS – This is a paid service where you pay by the minute the conversion takes – price from $0.02 to $0.01 per minute.

First we start by selecting to build an automated flow and select the trigger “When a new email arrives (V3)”

Set the advanced options to only trigger when Attachments is included and include the attachments in the following steps in our flow.

As the next step I use an Azure Blob storage to store the file in the e-mail. When selecting the output from the previous step – power automate will automatically create an Apply to each container in which we can refer to each attachment in the mail.

In the Create Blob Action I connect to a blob storage and load the attachment into a preload folder

Now add a step where we create an HTTP request within the Apply to each container

In order to use the cloud convert REST API we need first to create a Process and use the process ID to create the conversion – documentation here

In the body property you specify your own APIKey and tell the process you want to do is a conversion from xls format to xlsx.

Next – add another HTTP request

We use a POST request again

And in the URI we use the result from the previous step as it returns a unique address to the process id from cloud convert

In the Body property we specify where

{

"input": {

"azureblob": {

"storageaccount": "<NAMEOFBLOBACCOUNT>",

"storageaccesskey": "<AccountKey>",

"container": "preload"

}

},

"file": @{body('Create_blob')?['Name']},

"timeout": 10,

"output": {

"azureblob": {

"storageaccount": "<NAMEOFBLOBACCOUNT>",

"storageaccesskey": "<AccountKey>",

"container": "converted"

}

},

"outputformat": "xlsx"

}

Here is the final overview of the steps needed.

Now send an email to the inbox you have connected your trigger to run.

And in our preload folder we can see the files

And in the converted folder we have the converted xlsx files

Hope this can help you converting your xls files to xlsx.

This will also make it much easier if you want to be able to setup

Power On!

#PowerQuery – Using SQL scalar value functions in PowerQuery

Currently I am working with a project where we extract data from a SQL server – some of the business logic is built into scalar value functions (documentation).

Now the magic of PowerQuery enables us to reuse these functions within PowerQuery and Query Folding is supported – more about this at the end of this post.

Let’s step through how to do this

I will use the famous AdventureWorksDW2014 database.

In my database I have added a function called IsoweekwithYear

This function uses a date a parameter and returns an Integer representing the year and the week. In this example it returns 202018 as 28 april 2020 is in iso week 18 of 2020.

In the Power BI Desktop we can now connect to the SQL server

And in this example I pick the DimDate table from the database

And if we scroll down you will also see the functions, we have in the database

Notice that the data preview pane will show the function and the parameters it can take.

Then I click Transform Data and the Power Query Editor will appear

By default it will Invoke the function and in this case it will return Null

Notice the Applied Steps in the Query Settings pane

If you remove the last step where the function is invoked

You will see it as a function that has a parameter called @DATE which is optional.

Now let’s use the function

In the Query DimDate we can the call the function – as we now we need to call the function using the column FullDateAlternateKey as the parameter I select this column first and then click Invoke Custom Function via the Add Column section in the ribbon.

We can then choose the Function Query IsoweekWithYear and use the column name as the parameter value

This will add a new column at the end of my query

Containing the ISOWeek – yuppieee πŸ™‚

Is Queryfolding supported ?

It sure is

As we can see as we can select “View Native Query” and see that the SQL statement is added.

This means that the server does the work instead of the M-engine.

What’s next

I must admit that I wasn’t aware of this possibility before – and I feel like it opens up a lot of scenarios to reuse already built and tested functions that your database developers have built.

Let me know if you find this useful as well by leaving a comment or give the post a like

Happy querying !!!