#PowerQuery – Get the population (or any other) data from Statistics Denmark into Power BI

It turns out that the blogpost I wrote 10 years ago about getting data from Statistics Denmark into Power BI with Power Query still is being used – 😊 – link.

But as the API has changed a bit since then I was asked to do an update of the blogpost – so here is how you can get the population of Denmark imported into Power BI with Power Query.

The source

Statistics Denmark provides you with a lot of information about Denmark. These data can be accessed via this site – Statistics Denmark (dst.dk). When you manually want to download the data you will find all the tables in Statistikbanken (statbank.dk)  – which provides an interface to setup a report that you then will be able to download in different formats.

In this example, I set up a simple report and you can see the export formats here.

As a Power BI developer we don’t want to do this – we want to automatically retrieve new data – API to the rescue

The Statistics Denmark does provide an API that we can use to extract the data automatically.

Its located here – Databank API – Statistics Denmark (dst.dk)

Let’s extract the Population for Denmark

When we click the button β€œSTART CONSOLE” we get this

We want to retrieve data so pick that under 1) Choose API call

Under 2) Choose settings for call we first need to select whether to do a get or a post call – we can do both but a post call requires a bit more manual code so I will stick to a get call. (If you want to know how to create a post call in Power Query – I can recommend this article from Chris Webb – Chris Webb’s BI Blog: Web Services And POST Requests In Power Query (crossjoin.co.uk) or my own here – Specifying JSON Query in Power Query – Example Statistics Sweden – Erik Svensen – Blog about Power BI, Power Apps, Power Query (wordpress.com).

Next is to specify the table name

In the Table id I have specified β€œFolk1a” – you can find the different table names in Statbank (id highlighted in green)

Next – select the Format β€œCSV” as this is the binary format the API will return to Power Query.

When you click more settings, you can change the delimiter if needed.

Next up we will click the Variable- and value codes and this will give us the opportunity to select what data we want to extract from the source.

You can click the individual items/codes or click on the * to get all elements from the variable.

In this example I have picked all KØN (gender), all ALDER (ages), total for CIVILSTAND (marital status), All – TID (time) and all OMRΓ…DE (regions and municipalities)

Under Other settings I leave it to default but you could change the language to English for instance

Under 3) Call the API with the chosen settings

You can see that a URL with the query is created for you.

You can then click Execute to see the result

OOPS – the API is limited to 1.000.000 cells

So, it’s telling me to extract the data either by using BULK or another streaming format instead.

Go back to 2) Choose setting for call

And select BULK under Streaming.

The URL under 3) will now be changed

And you could try and execute the query once again and the result will be shown after a while.

Copy the URL …. And then

Let’s do it in Power Query

Open Power BI Desktop and click get data and choose Web

Paste the URL into the dialog box and click OK

Very quickly you will get this based on the first 200 rows

If the Query is what you want, then click Load but you should really click Transform Data to control what data is imported to your model – so click Transform Data

And wait a couple of seconds for the preview data to load.

The data contains total lines, text after ages, not a proper date for the time and a few other transformations before we are ready to import the data in to our data model.

I won’t go through each of them but have a look at the file.

You can modify the URL to change what time is imported and make it dynamic depending on what you want imported instead of just all periods.

The API is not the fastest so the appx. 1.2 million rows this selection returns takes a couple of minutes to update.

And then we can start to create explicit DAX measures and visualize our data.

In the demo file I have created this page to create an overview of the change in the population between two different periods in top x municipalities in Denmark.

You can download the demo file here – and just click refresh to get the latest data 😊

Let me know what you think and if you like it give the post a like please.

Stay Queryious !!!

Document your #powerbi model with #chatGPT and a #tabulareditor script and browse it in your model

Inspired by Darrens Gosbell’s excellent blog post – Automatically generating measure descriptions for Power BI and Analysis Services with ChatGPT and Tabular Editor – Random Procrastination (gosbell.com) – and my session at SQLBits 2023 where I showed how to call the chatGPT from within Power Query and how you can use the New chat (openai.com) – to describe your M code.

I thought it would be nice if I could add this as a description to my tables in Power BI model.

How hard can that be πŸ™‚

Well, it turns out that we are not allowed to modify the table object via the Tabular #ditor and this includes adding a description to the table

So, if I created a script that added a description – it worked fine but returning to the desktop I was asked to refresh the queries and the descriptions where removed by Power BI desktop –

Well, what is supported as write operations to a model?

Calculation groups is supported – hmmm… could I use this and then actually create a calculation group with a calculation item for each of the tables in my model – that would also mean that I could create a table in my report and expose the descriptions in a report page instead of the description should be hidden away on the tooltip of a table in the field list – This could actually be quite awesome!

C# Script here we go

I know very little about how to write C# code so Darrens example and the documentation of Tabular editor was a big help and please bear in mind that I am a C# novice when reading my code

The steps needed to this is as follows (I have used Tabular Editor 3 but it should be possible to us the free version of Tabular Editor as well

Step 1 – Open your Power BI desktop file and find the external tool ribbon and click the Tabular Editor

Step 2 – Add a Calculation Group called “Documentation”

Step 3 – Choose to create a New C# Script

Step 3 – Use the following script and run it

#r "System.Net.Http"
using System.Net.Http;
using System.Text;
using Newtonsoft.Json.Linq;

// You need to signin to https://platform.openai.com/ and create an API key for your profile then paste that key 
// into the apiKey constant below
const string apiKey = "<YOUR_API_KEY>";
const string uri = "https://api.openai.com/v1/completions";
const string question = "Please describe this power query code for me:\n\n";

using (var client = new HttpClient()) {
    client.DefaultRequestHeaders.Clear();
    client.DefaultRequestHeaders.Add("Authorization", "Bearer " + apiKey);

    foreach (var t in Model.Tables)
    {
        
        foreach ( var p in t.Partitions)
        {
            // Only uncomment the following when running from the command line or the script will 
            // show a popup after each measure
            //
            //var body = new requestBody() { prompt = question + m.Expression   };
                string _type = Convert.ToString(p.SourceType);
                string _exp = Convert.ToString(p.Expression);
        if ( _type == "M" )
            {var body = 
                "{ \"prompt\": " + JsonConvert.SerializeObject( question + p.Expression ) + 
                ",\"model\": \"text-davinci-003\" " +
                ",\"temperature\": 1 " +
                ",\"max_tokens\": 256 " +
                ",\"stop\": \".\" }";

            var res = client.PostAsync(uri, new StringContent(body, Encoding.UTF8,"application/json"));
            //res.Result.EnsureSuccessStatusCode();
            var result = res.Result.Content.ReadAsStringAsync().Result;
            var obj = JObject.Parse(result);
            var desc = obj["choices"][0]["text"].ToString().Trim(); 
          
            //Reference to your calculation group that should hold the calculation Items
            var x =(Model.Tables["Documentation"] as CalculationGroupTable).CalculationItems[t.Name];
            
            //deletes the old version
            x.Delete();

            var calculationItem1 = (Model.Tables["Documentation"] as CalculationGroupTable).AddCalculationItem();
            
            //removes any quotes in the chatGPT description
            var s = desc.Replace("\"", "");

            calculationItem1.Expression = "\"" +  s + "\"";
            calculationItem1.Name = t.Name;
            
            //Info("Processing " + t.Name) ;
            
            }
        }

    }
}

This will create a calculation item for each of your Power Query table and add a Expression that contains the chatGPT description of your M code

Example of one of the tables

Step 4 – Save the changes back to your model and you will be prompted to refresh your calculation group

Step 5 – Add a table where you take the Calculation group name and a measure that I call Query description – the value of this measure will in the table be changed to the expression of the calculation item.

https://tenor.com/embed.js

We could use the same method to document our DAX measures and put that into a calculation group as well – thereby documenting our full model and exposing the information directly in the report and not just in the tooltips of the fields.

I will try to find time to do a blog post on this as well.

Let me know in the comments if you find this useful – would very much like to hear from you.

#PowerQuery – Dynamically reference a table via a Parameter

The other day I had a fun challenge from my co-worker – SΓΈren Faurum – that wanted to change the refence to a column name from one table to another table with the same column name.

OBS – The solution is not supported in the Power BI Service.

In his case it was

Let
   x= List.Distinct(tableName1[StoreKey])
in
   x

And he wanted TableName1 to be a value of a parameter.

Here is an example (data is from AdventureWorks) – in my model I have two tables – DimReseller and DimCustomer – both have a GeographyKey

A parameter called TableName should now be used in a query to retrieve either the unique list of values of the key from either DimReseller or DimCustomer.

If we just use

List.Distinct(TableName[GeographyKey])

We do get an expression error as the parameter is a text value and not a table that has fields.

Then I thought we could the function Expression.Evaluate() – link to documentation –

But no – apparently, we can’t reference a Table name like this.

#shared to the rescue πŸ†˜

Then I thought why not use the #shared function – as I knew that #shared will list all the functions, tables, parameters etc. that are available in the current pbix file – including all loaded or not loaded tables.

Then we can turn the record into a table and filter it based on a parameter

This will give us a table with one record.

Where the Value column will contain the table we want to be dynamic

Then by adding the following to the formula

  • {0} to get the first row/record
  • [Value] to retrieve the Table in the record
  • And [GeographyKey]

We can get all the Geography Keys in the table

And by putting all of this into the argument of the List.Distinct function

= List.Distinct(Table.SelectRows(#"Converted to Table", each [Name] = TableName){0}[Value][GeographyKey])

Which returns 510 rows.

And now I can change the parameter to DimCustomer

And see it returns 336 rows

However if we publish the model to the Power BI Service it won’t work if you want to schedule refresh as #shared is not supported in the service.

So we can only use it in the desktop

Stay Querious

Using Power Query and Power Map to visualize public data of the windmills in denmark

While preparing a training session for a customer within the Windmill industry I found public data that lists all the windmills in Denmark with their size and location.

The data can be found here – http://www.ens.dk/info/tal-kort/statistik-noegletal/oversigt-energisektoren/stamdataregister-vindmoller

So I decided to see what Power BI could do with this dataset.

The data

The file with the latest data (http://www.ens.dk/sites/ens.dk/files/byggeri/anlaegprodtilnettet.xls is named the same way and is structured in a way that makes it easy to make custom analysis and visualisations.

So first tool to use in the Power BI stack…

Power Query to the rescue

First – switch to the Power Query tab and choose to load data from File and Excel File and paste the address to the file in the file open – it does take some time but it will work.

And Power Query will open and list the tables that exists in the file

The data we are interested in starts at row 18 and contains a lot of columns that we aren’t interested in.

And there is also some formatting of columns from text to numbers etc.

Here is the total list of steps shown

Create a Power Query function to create Latitude and Longitude

The data file has doesn’t contain Latitude or longitude but the coordinates is listed with the “European Terrestrial Reference System” – so I had to find a way to convert the values into Latitude and longitude.

The Danish Geodata Agency provides a free webservice that can do the conversion for us – http://geo.oiorest.dk/documentation/api/koordinat.aspx – examples.

So for instance the first wind mill is located at

http://geo.oiorest.dk/wgs84.html?etrs89=6171175,720898.4

And inspired by Rafael Salas – blog post – http://www.rafael-salas.com/2013/08/power-query-creating-function-to.html – I decided to create a Power Query function to do the calculation.

So I created a blank Query and added the following query

The function takes two arguments the east and north coordinates, and uses those coordinates to get and XML table from the Web service. As I run with a comma as decimal separator on due to my Danish regional settings I have to replace the comma with a dot as the web service requires the decimal separator to be a dot.

The result is loaded into a table with the “bredde” – latitude and “LΓ¦ngde” – longitude

And the function returns both of these columns.

Then I can use that function in my first query by adding a Custom Column.

And expand that to get both latitude and longitude as columns in my query.

And voila the custom calculated columns.

Then some number formatting of some columns and renaming to English heading and we are ready to send the data to Excel

The data back in Excel

Returning the query to Excel will then give me the list of all 5.126 running windmills in Denmark per September 2013.

Pretty awesome – it does take about 2-3 minutes for it to update/lookup all the geo location – but only one isn’t matched and that is due to an empty row.

Using Power Map to visualize the data

With the data nicely washed and geotagged we can use Power Map to visualize the data.

Power Map’s first guess on the geomapping is actually very good. The “Kommune” is set to county – which is correct and because I named the columns Latitude and Longitude these are automatically also linked correct.

With the Geography properly matched we can move “Next” to visualize the data.

So for instance the KWH by Supplier

Or the KWH by Region/County

Or the world biggest sea wind mill park

Or a heatmap

This is so much fun and finally let make a video of the development over time.

Power map video

Power map also enables us to create a timeline and play the timeline. The data has the date of when the windmill was established so we can use that date to visualize how the windmills have evolved over time.

So by adding the established field to the Timeline – power map can visualize the development over time.

This can be created as a video.

Creating the 45 sec video at the medium format took about 10 minutes to create – so be patient.

Here is a link to the video – http://sdrv.ms/17cBIrx

The file

You can download the example file – here

Comments please

You are more than welcome to add a comment whether you find this cool or not J