#PowerQuery – Replicate doing an Excel VLOOKUP in M

Power Query has a lot of built in functions, but it doesn’t have a function that exactly matches the VLOOKUP in Excel – but by using the M language and other M functions we can certainly replicate some of the VLOOKUP functionality and make an even more flexible lookup function.

Now the example data is like this

In Excel we would lookup the price for at specific productkey by using this formula

– in this case ProductKey 1 with a price of 100.

In order to replicate this in Power Query we can use the function List.PositionOf

So I add a new blank query

And then use the function List.PositionOf – with the following arguments

List – Is the column ProductKey from my lookuptable Products – refer to like this Products[ProductKey]

Value – Is the value to look in this case the value 1

Occurrence – Is set to 0 to only return one value

This will return the position of the value in the list – sort of like using the MATCH function in Excel

Now to return the price – we can use this result to lookup the price like this

= Products[Price]{List.PositionOf(Products[ProductKey], 1, 0)}

And we get 100 returned which is the price of productkey 1.

The formula is structured like this

=NameOfTheTable[NameOfTheColumnToReturnTheValueOf]{PositionReturnedByListPositionOf}

But we why not change it into a function in PowerQuery so we use the function on all rows in a table or on any table.

The function can be created like this

The code

(lookupValue as any, lookupTable as table, lookupColumnName as text, returnColumnValue as text) =>
let
// lookupTable= Products,
// lookupColumnName = "ProductKey",
// returnColumnValue = "Price",
// lookupValue = 1,
 colLookup = Table.Column(lookupTable, lookupColumnName),
 colToReturn = Table.Column(lookupTable, returnColumnValue),
 lookup = List.PositionOf(colLookup, lookupValue, 0),
 Result = if lookup >=0 then colToReturn{lookup} else "Not found"
in
 Result

The function takes 4 arguments –

lookupValue – The value to find – can be any type

lookupTable – The Table/Query to lookup in

lookupColumnName – The name of the column to lookup the value in

returnColumnValue – The name of the column from the table to return

The colLookup is a variable that uses the function Table.Column to return a list of values in the lookup column.

The colToReturn is a variable that uses the function Table.Column to return a list of values from the values you want to return column.

The lookup variable uses the List.PositionOf to find the position/index of the search value in the lookup column.

Result will use an if statement to test whether a position is found and if so returns the value at the position in the colToReturn list – other wise returns the text “Not Found”.

After this we can use the function in other tables to lookup the for instance the Product price like this – by added a invoke Custom Function Column

OBS – I haven’t tried this on a large table so be aware of any performance issues.

Hope you find this useful – and happy Querying

Here is a link to an example file –Β Link

Guide – How to import data from Eurostat directly into #PowerBI

I follow EU Eurostat on twitter (link – https://twitter.com/EU_Eurostat ) and often see a lot of interesting facts and infographics like this one.

And I have for a long time wanted to see if I could use the webservices that Eurostat also provides (link – https://ec.europa.eu/eurostat/data/web-services) to import the data directly into Power BI.

So here is a guide on how you can do it – and the example will try to extract the data for the Orange production Infographic.

There is a LOT of different datasets in Eurostat and this guide should work on most of them – you just need to find the dataset (https://ec.europa.eu/eurostat/data/database) in the catalogue.

Construct the Query

The REST request we need to construct is defined like this

So, we need to find the datasetCode and specify the filters.

You can find the dataset code by browsing the data catalogue – and the dataset code is stated at the end.

If you need specific items the data explorer you need to specify the code of the items in the request and the Dataexplorer is a good way to find these.

Using the codes we have found we can now use the Query builder to construct our query (click on the picture to try it out)

So after entering the dataset code we get the option to specify the filter and select measures under strucpro

Notice that I have selected Yes to Exclude EU aggregates

The query option will vary from dataset to dataset but the principles are the same.

Clicking the “Generate query filter” will give you

And you can copy the dataset code to clipboard

apro_cpsh1?filterNonGeo=1&precision=1&crops=T1000&strucpro=AR&strucpro=PR_HU_EU&time=2014&time=2015&time=2016&time=2017

Now we have the filter part and this can of course be parametrized in your Power Query.

And we must add

http://ec.europa.eu/eurostat/wdds/rest/data/v2.1/json/en/

before so the full web query is

http://ec.europa.eu/eurostat/wdds/rest/data/v2.1/json/en/apro_cpsh1?filterNonGeo=1&precision=1&crops=T1000&strucpro=AR&strucpro=PR_HU_EU&time=2014&time=2015&time=2016&time=2017

In order to get the data into Power BI we choose get data and select the Web connector

And press OK

This will return the following to the query editor

Convert the JSON response to tables in Power Query

We get a JSON response returned from the Web query and this has to be transformed into a table – and in order to get that we need to understand what and how the data is returned.

When we use the query in the browser you can see that its not a normal structured JSON document.

And what we want is a table with the following fields – Country, Year, Area, Production

I start by renaming the query to “QueryResult” and disable the load – and the following query will use this as a reference

So lets create the Geo dimension

By clicking the Record on the dimension field

Drill down to geo

Down to category

And to label

Turn it into a table

And then add an index column starting from 0 and rename the columns

I then do the same for the other dimensions and end up with the following queries

Getting the values

Now in order to get the values we need to do a little more

The web query returns a one long list of values when converted into a table.

Values contains all the “values” in the grid when

Each cell in the table is referred by a running index starting from 0 to Facts multiplied by Geo by Time.

So when we have 2 facts, 38 countries and 4 years will give us 2 * 39 * 4 = 304 – this should be the number of rows in our Values table.

But when we look at the Values before we do anything we have only 282 rows.

The missing Values in because cells with missing values (represented by a : or :z ) is located in the Status field in the QueryResult.

So we have to add the missing rows from this Status – this gives us the missing 22 cells (304-282)

And we then convert these Values to null

In our values Query we want to append these rows to our table – and we can do this by modifying the step – by modifying the expression to

= Record.ToTable(value) & MissingRows

And we rename and change the Name column to Index and change the data type to an integer.

The index column is our number reference to each cell in the table/matrix from EuroStat.

Next step is to calculate the keys for time, geo and Facts.

To calculate the TimeKey we add a column

And divide it by the number to time periods

This gives a step with the following expression

= Table.AddColumn(#”Added Index”, “TimeKey”, each Number.Mod([Index], 4), type number)

And after click OK we can make it more dynamic by modifying the 4 to the number of rows in

= Table.AddColumn(#”Added Index”, “TimeKey”, each Number.Mod([Index], Table.RowCount(Time)), type number)

And now we have the TimeKey for each row.

To add the FactKey we add another calculated column

= Table.AddColumn(#”Inserted Modulo”, “FactKey”, each Number.IntegerDivide([Index], Table.RowCount(Geo)*Table.RowCount(Time)), Int64.Type)

This will give us the factkey and we can see it shifts when we reach row 152 – time count = 4 and geo count = 38 = 152

Now we need the final key column is the geoKey

= Table.AddColumn(#”Inserted Integer-Division”, “GeoKey”, each Number.IntegerDivide([Index], Table.RowCount(Time) ) – [FactKey]*Table.RowCount(Geo), Int64.Type)

And we are now ready to load the data into our data model.

The data is ready to model

After closing the query editor we get the tables returned to Power BI Desktop and we can finalize the datamodel

And create the relationships between our tables.

And create some measures

And start to visualize the Oranges in Europe

One interesting fact is that the Orange trees are highly productive in Albania.

And the cool part is

When 2018 figures is available in EuroStat – we just modify our query to

http://ec.europa.eu/eurostat/wdds/rest/data/v2.1/json/en/apro_cpsh1?filterNonGeo=1&precision=1&crops=T1000&strucpro=AR&strucpro=PR_HU_EU&time=2014&time=2015&time=2016&time=2017&time=2018

and refresh our model.

What do you think?

You can download a copy of the pbix file – here.

Did you like it then give the post a like or a comment – let me know what you think – hope you find it useful.

And at least take a spin around Eurostat to what interesting facts that you can find in the many datasets that are provided to you for free.

Creating a Power Query function browser – #powerbi

One of the best functions in Power Query (M) is for sure the function =#shared that inside the Query editor lists all the functions in M and in your current pbix file.

Picture 1 Result of =#shared

But sometimes you do want to check out the documentation online and then you have to leave Power BI desktop – so I thought would it be possible to use Power Query to scrape the documentation (link) – and create a Power BI Report in which I could browse the documentation.

And because the documentation is well structured is turned out to be quite easy

Let’s start by getting all the function categories

The start page of the Power Query M function reference contains a nice list of all the categories

So begin with we activate a new query with data from the web

And by using the new feature “Add tables using examples” we get a lot of help

Just type in the first 3-4 elements in the “In this section” list and we get a nice result

Click on the edit button and we have a nice list of the categories in our Query editor

Now each of the categories points to a specific web page

And the address for all of them is the name and the hyphens instead of spaces so I do a simple replace values where we add a new column with the relative path

And to get the function category URL I add a parameter called PQ Documentation URL that points to the relative path and that I can use in my other queries as well later.

And then add a column where I concatenate the path and the name of each categories relative path

And now we have a reference to all the pages of the Power Query function categories.

Let’s create a function to extract all the functions in a category

Now to create a table of all the functions I create a new query and reference the first Query I made

And the idea is then to create a custom function that returns a table of all the functions within a category

Each page of contains tables of the functions

And when creating functions I prefer to create a query that works and then turn it into a function so let’s dig in – and the navigator sometimes tries to help us a little bit to much so I rightclick the address and choose edit.

And we get a nice structure of the html page.

We only want the tables and as you can see in the preview each table is actually a list of the functions

Expand the Custom1 column table

And voila – a nice list of all the List functions in this case is returned

The other columns can now be removed, and we are ready to turn this into a function

Open the advanced editor and modify the query from

To

And we can use this function to add as a column in our function query to get all the functions in the different categories

So now we have all the categories and the functions within

Now to the hard part – get the description of each function into the model

Each of the function has a specific page – in this example the List.Accumulate

The address of each of these function pages is the name of the function https://docs.microsoft.com/en-us/powerquery-m/list-accumulate

With a hyphen instead of the . (dot) – so constructing the address is quite easy.

But I only wanted to extract the data within the black square.

By looking at different pages I found out that all of the information I wanted was located in a specific named element <pre>

So on each page I have to locate that tag and then the get all the html for there to a named element called </main> that marks the end of the area.

Now when we use the Web query interface we get a lot of help but in this case we do not want the help

So after clicking Edit in the navigator menu

We get this – and this is because the Web.Contents is wrapped in the Web.Page function

So by modifying it to Text.FromBinary instead we get the html in text returned instead

Now we can click the Fx to create a new step to find the <pre> tag

And another step where we still refer to Source and find the position of </main>

Now as the final step – we extract the html from the source by using the Custom1 and Custom2 value with some offset

And we have the HTML from the page that we wanted.

And we can turn it into a function

Finally I reference the previous query and create the query with all the syntax and example text

And we are ready to create the data model

Now click close and apply and create a datamodel with the following relationships

I created a measure to the create a default URL to the documentation if a category isn’t selected

And sat the data category to the different URL’s as Web’s url

And the rest is slicers and presentations

To present the HTML extracted from the function page I use the custom visual “HTML Viewer”

And the final report looks like this

And now I can hit refresh and have an updated Power Query Documentation report

Example file and comments

I hope you find this useful and if so please like or comment if you have ideas or feedback.

Link to the pbix file is – here

Or if you just want to browse the report – https://app.powerbi.com/view?r=eyJrIjoiMDczMGE5NjMtMjA4Zi00NmE4LTlmODAtZDRhMWQzMTk3ZmUzIiwidCI6ImUwN2Y3YzBiLTk5NWItNDg0NS1iZWJhLTQxNmYwZDYxZTFlMSIsImMiOjh9

#PowerQuery – Filter by a dynamically set of columns

A few days ago, I ran into a problem where I had to filter a product table by a dynamically set of columns – so in one solution it could be filtering by field x and y and in other solutions it had to be by field a, b and c.

In this case only rows that where not blank in either of these columns should be included in the result.

Hmmm… how can we do this using Power Query –

Here is the demo data

And if we only want the rows where Field A and Field B has the value “xx” we could add a filter statement (this can only by done if you modify the SelectRows formula manually! )

= Table.SelectRows(#”Changed Type”, each ([Field A] <> “” or [Field B] <> “”))

But what if we wanted to make the fields dynamically

Let’s start by creating a list containing the field names we want to filter

This list could also be retrieved by extracting data from a config file or a database if you would make it even more flexible – just remember it must be at list

Now you should know that any row in a table is a Record

And each record has fields and Power Query has functions related to Records.

https://docs.microsoft.com/en-us/powerquery-m/record-functions

One of this is the Record.SelectFields ( https://docs.microsoft.com/en-us/powerquery-m/record-selectfields ) and this function can be used to select specific fields in a record

So lets add a custom column with this formula

= Table.AddColumn(#”Filtered Rows”, “FilterMultipleColumns”, each Record.SelectFields(_, FieldList))

The first argument uses the _ statement and thereby referring to the current record – and the second is parameter containing the list of fields.

And as you can see when selecting a cell the record in the column now contains only of the fields we have specified in our list.

Now we turn this in to a list by using the function Record.ToList in our AddColumn statement

= Table.AddColumn(#”Filtered Rows”, “FilterMultipleColumns”, each Record.ToList(Record.SelectFields(_, FieldList)))

Notice that we now have a list in the column instead of a record

Next step is to use the function List.MatchesAny to check whether the contains our filter

= Table.AddColumn(#”Changed Type”, “FilterMultipleColumns”, each List.MatchesAny(Record.ToList(Record.SelectFields(_, FieldList)), each _ <> “”))

The second argument is then our filter for all fields and it will return TRUE or FALSE if the criteria is met.

And finally, we can filter the rows using the added Column and only include rows that return true

And if we want to filter by other columns/fields, we change the list

And it works

Hope you find this useful and let me know if you do

Here is a link to a demo pbix .

Query On !

#PowerQuery everywhere :-) – now in #microsoftflow as well

In the Microsoft Flow October 2018 update – link – we got a very interesting feature added to Microsoft Flow – Power Query and as they state –

Also, to better empower our users – we have integrated Power Query directly into the product. Power Query is a data ingestion tool that allows power users to transform, filter, manipulate, and import data into popular data stores like Power BI and CDS. We are adding this same capability to Microsoft Flow that will allow makers to shape data mashups from SQL Server using a Power Query Online experience.

So, I decided to try it out

You can find the Action by searching after “Power Query”

So far, we only have one possible data source– which is the SQL server (but there is more) – so selecting the action gives us this

You can either pick an existing SQL server connection or create a new connection via the dots…

Clicking the Create Query open the Power Query Editor and we can start our transformation/selection of data

And when we click next we get too the Query Editor

Notice that the Server and Database is created as Parameters.

OBS!!!

I haven’t tested it but when you right click the pane you actually get the full “Get data” experience – and when you choose one of these you have to use a Enterprise Gateway to use this.

We can now filter the rows

And even go to the advanced editor when you right click the Query name.

Important

You can only return ONE query to your Flow – so be sure to only enable one of your queries.

Getting it into Flow

When we click the Update button we just get back to our flow

And now we need to add an action that uses the result of the Power Query and we can pick between one of the many possible actions

But just to try let’s try and put the data into Power BI

So, I create a streaming dataset in My Workspace –

And remember to turn the Historical data analysis on

Now as the Query can return more than one row we first Add and “Apply to each” step

And then find the Action “Power BI – Add rows to a dataset”

Now we can pick the workspace – dataset we want the rows add to and bind each column to the result from Power Query

Now we are ready to test our flow

And after the connections is validated we can

Hit continue

And the flow starts

We can follow the progress from the Run history

And entering the step – we can check the output

So, as you can see its JSON document.

And we can start to build a report in Power BI

Opportunities

I think this opens for a lot of different opportunities for utilize the incredible Power Query language to extract and transform data from almost any where and then perform actions – like exporting data, sending notifications via e-mail, create task or alerts based on the data and ….

I am pretty sure that we will have great use of this and can’t wait to see what users will use this for.

Happy Power Querying !!!

#powerbi – Switching a report from one PowerBI dataset to another PowerBI dataset

At the Power BI world tour in Copenhagen last week I was asked how to change a Power BI report’s connection from a dataset in the service – for instance if you have a test workspace and then want to move the report to a production workspace.

I started to refer to my previous blogpost – https://eriksvensen.wordpress.com/2017/10/03/how-to-change-the-powerbi-desktop-file-connection-from-data-model-to-a-power-bi-service-dataset-or-azureas/ – but when using this hack it won’t work.

So we have to do it in another way and it is quite simple – one important thing is of course that the dataset has the same dimensions and measures – otherwise some of your visualisations will show an error.

When we create the report we can select a Dataset from the service

When we save the report it will automatically be saved in the workspace where the dataset resides.

But how can we change it –

If you go to Edit Queries

You can click Data source settings – then we will get the option to pick a dataset from another workspace

And then you just pick the dataset you want to use.

When you save the report now – the report will be saved in the workspace where the selected dataset resides.

Hope this can help you as well – and if so give this post a like

#PowerQuery – Change the Column name to the index of the column

Inspired by Ruth Pozuelo from Curbal latest video – https://www.youtube.com/watch?v=jlogBrcYZwc&t=2s where she shows an excellent tip on how to change column headers to upper or proper case –

And the trick is to use the second argument in the Table.TransformColumnNames – and for instance use Text.Upper to sett all names to capitals.


You can also use

Text.Lower

Text.Proper

I decided to check if I could use some of the other text functions to do even more πŸ™‚

So why not see if we could give each column the name of the index number they have in the query.

And it turns out that it’s possible

If you add the following step

= Table.TransformColumnNames(Custom3, each “”)

The each statement will to the renaming for each column and name it to an empty string – and if you have more than one column with the same name it will get a index number added as a suffix – so column 2 will be name empty string + 1 and so on.

In order to rename the columns to start with 1 instead of an empty string we can use the TransformColumnNames once more but this time to add 1 to each column names – but as column names is text we need to convert it to a value and back to text again

= Table.TransformColumnNames(#”Custom1″, each if _ = “” then “1” else Text.From(Value.FromText(_)+1))

And now we can refer to cells in our table using coordinates

Hope you find this useful and happy querying πŸ™‚