Microsoft Power Query is an Excel add-in that helps you

Microsoft Power Query is an Excel add-in that helps you gather, transform and analyse data
from various sources. It’s a very powerful tool and a must have for anyone interested in BI,
no matter how experienced. Microsoft also releases frequent updates of this add-in,
including new source connections.
Google Analytics is a powerful tool for monitoring and analyzing your website traffic. It
helps you know who is visiting, what they are looking for and how they are getting there.
As a BI consultant at Kohera, we’ve had requests to have Google Analytics as a data source
in Power Query. While this is possible very well with the Google Analytics Core Reporting
API, it’s not as straight forward to setup the OAUTH2 authentication for the first time. In a
first step this authentication is explained and further on you will find out how to retrieve
Google Analytics results and process them in Power Query.
OAUTH 2.0
First of all, you need access to a Google Analytics View. You probably wouldn’t be reading
this if you had none. In Google Analytics, go to `Admin` and then navigate to `View
Settings`:
Copy the ID you find there for later use.
If not yet done, go to https://console.developers.google.com/project to create a new project
(this is registering your application with Google). Once created, go to the section with the
APIs and enable the `Analytics API` (change the status from `OFF` to `ON`).
Now go to Credentials to create a new Client ID (for native application):
Copy the Client ID and Secret for later use. The Client Secret should be kept confidential.
With all codes available the user can now grant access, so the application can consume the
service. At first, an authentication code has to be requested with the following URL (use
your Client ID):
https://accounts.google.com/client_id=YourClientID
&response_type=code
&redirect_uri=urn%3aietf%3awg%3aoauth%3a2.0%3aoob
&approval_prompt=force
&scope=https%3a%2f%2fwww.googleapis.com%2fauth%2fanalytics.readonly
&access_type=offline
Set the scope to what is wanted. In this case it was set to readonly access to Google
Analytics. Visit the OAuth 2.0 Playground to browse the available scopes:
Entering the URL in a browser will make it possible to grant access (you need to URL
encode the URL like changing `/` to `%2f` – this is already done in the url above):
When you accept, the first token (authentication code) will be returned:
Save this code, we’ll need it in a few moments.
From Authentication Code to Access Token
With the authentication code, you can now request the access and refresh token. This is a
POST request, so you can not just build the url and launch it from your browser. There are
several ways to do this. One method is using PowerShell. Since V3 it has a command named
`Invoke-RestMethod`. This is some example code (replace the first 3 parameters with the
proper information):
$TokenUrl = "https://accounts.google.com/o/oauth2/token";
$body = @{
code="the authentication code from the previous step";
client_id="your client id";
client_secret="your client secret";
redirect_uri="urn:ietf:wg:oauth:2.0:oob";
grant_type="authorization_code"; # fixed value
};
$json = Invoke-RestMethod -Uri $TokenUrl -Method POST -Body $body;
The json object contains an access token and also the refresh key. The access token is valid
for one hour (or check the expires_in property of the json object, that will normally be
3600).
A sortlike method can be followed to obtain a new access code, as every access token is only
valid for one hour.
$TokenUrl = "https://accounts.google.com/o/oauth2/token";
$body = @{
client_id="your client id";
client_secret="your client secret";
refresh_token="refresh key from the previous step";
grant_type="refresh_token"; # fixed value
};
$json = Invoke-RestMethod -Uri $TokenUrl -Method POST -Body $body;
Getting Information from Google Analytics
Now that the authentication has been set, it’s time to write the URL that will return the
needed information. As an example we want to return information about the Operating
System that was used to visit the website. The URL looks like this:
https://www.googleapis.com/analytics/v3/data/ga?access_token=123456789&ids=ga:33784
873&start-date=2014-10-01&enddate=2014-12-31&dimensions=ga:operatingSystem&metrics=ga:sessions
https://www.googleapis.com/analytics/v3/data/ga is the main part
access_token=123456789 is the Access Token (a token that is not yet expired)
ids=ga:33784873 is the Google Analytics view that we want to query
start-date=2014-10-01 is the start date (format: YYYY-MM-DD)
end-date=2014-12-31 is the end date (format: YYYY-MM-DD)
dimensions=ga:operatingSystem is to let the system know that we want to include the
Operating System as a dimension
metrics=ga:sessions is the number of sessions against the dimension
As the access code is only valid for one hour, it needs to be set dynamically. We will take
care of that in a moment.
Check this site for general information about the Google Analytics API. Available dimensions
and references can be found here or play around with them here. Some common queries are
listed here.
Power Query
Now that the URL is known, the returned information needs to get in Excel. For this, we use
Power Query, as it can parse json. One of the things that needs to be part of the URL is the
access_token parameter. We assume that it’s in an Excel table with the name `TblToken`. It
has one column, named Access Token.
Power Query can read the token like this:
Excel.CurrentWorkbook(){[Name="TblToken"]}[Content]{0}[Access Token]
Combined with some VBA we can update the token in an easy way. Alternatively it’s possible
to put the token in a text file and have Power Query read this (using `Lines.FromBinary`
and `File.Contents`).
This can be done with the PowerShell code from above that we run every five minutes. If the
token is about to expire we run the code and save the result to a file that Power Query can
read.
The VBA and PowerShell code will be explained in a moment, but first, let’s look at the
Power Query code. You can access the code in Power Query, via the `View` > `Advanced
Editor`:
let
analytics = "https://www.googleapis.com/analytics/v3/data/ga?access_token=",
addAccessToken = analytics &
Excel.CurrentWorkbook(){[Name="TblToken"]}[Content]{0}[Access Token],
addViewId = addAccessToken & "&ids=ga:33784873",
addStartDate = addViewId & "&start-date=2014-10-01",
addEndDate = addStartDate & "&end-date=2014-12-31",
addDimensions = addEndDate & "&dimensions=ga:operatingSystem",
url = addDimensions & "&metrics=ga:sessions",
GaJson = Json.Document(Web.Contents(url)),
GetRows = GaJson[rows],
TableFromList = Table.FromList(GetRows, Splitter.SplitByNothing(), null, null,
ExtraValues.Error),
ColOS = Table.AddColumn(TableFromList, "OS", each [Column1]{0}),
ColCount = Table.AddColumn(ColOS, "Count", each [Column1]{1}),
RemoveFirstColumn = Table.RemoveColumns(ColCount,{"Column1"})
in
RemoveFirstColumn
The first lines build the URL. Most parameters are hard-coded, but we could put these
values in Excel, just like we did with the Access Token. The part that contains
`Json.Document` gets the return value of the posted URL and tries to parse it into a JSon
object. As a list was returned, we first convert it to a table. This however results in a table
with a list in each cell:
That is why we add two columns, to extract the values from these lists and present it in a
cleaner way. The last line removes the first column, as it’s not needed any more. The final
result looks like this:
And when we add this as a source in Power View (ordered and with some images) it could
look like this:
Refresh the access token using VBA
Kyle Beachill has created a VBA Class Module that handles OAuth2 authentication,
including a method to renew access tokens. I tried this module with success. If
you’re working with Office 64 bit, the ScriptControl does not work, and so the
GetProp function does not work. But you can change it with this code:
Private Function GetProp(strPropName As String, Optional strJSObject As String = "")
As String
Dim regEx As New RegExp
Dim mc As MatchCollection
If Len(strJSObject) > 0 Then strResponseText = strJSObject
regEx.MultiLine = True
regEx.Global = True
regEx.Pattern = "\""" & strPropName & "\""" & "\s+:\s+\""(.+)\"""
Set mc = regEx.Execute(strResponseText)
If mc.Count > 0 Then
GetProp = mc(0).SubMatches(0)
Exit Function
End If
' maybe we're looking for digits instead of a string
regEx.Pattern = "\""" & strPropName & "\""" & "\s+:\s+(\d+)"
Set mc = regEx.Execute(strResponseText)
If mc.Count > 0 Then
GetProp = CStr(mc(0).SubMatches(0))
Exit Function
End If
End Function
You need to add a few references in order to use the VBA Module:
Refresh the access token using PowerShell
This code was already shown above. We could extend the script to have the access token
being saved in a (shared) location. The script could then run every hour using Windows Task
Scheduler. Using Power Query it’s no problem to read the first line of a text file:
Table.FromColumns({Lines.FromBinary(File.Contents("C:\...\access_token.txt"))}){0}[Col
umn1]
Or when saved on the (intra)net:
Table.FromColumns({Lines.FromBinary(Web.Contents("http://.../access_token.txt"))}){0}[
Column1]
Refresh the access token using Power Query
In case you want to keep everything in Power Query, we can use the Web.Contents function
with the `Content` variable. This will submit a POST request instead of a GET request
(which is the default). This should return a json object with a valid access token. The code
looks like this:
let
clientId = "client_id=YourClientId",
addSecret = clientId & "&client_secret=YourClientSecret",
addRefreshToken = addSecret & "&refresh_token=YourRefreshToken",
url = addRefreshToken & "&grant_type=refresh_token",
GetJson = Web.Contents("https://accounts.google.com/o/oauth2/token",
[
Headers = [#"Content-Type"="application/x-www-form-urlencoded"],
Content = Text.ToBinary(url)
]
),
FormatAsJson = Json.Document(GetJson),
access_token = FormatAsJson[access_token]
in
access_token
The result of this query could then be used in the solution above (instead of getting the code
from a cell in the worksheet you get it from the code above – if you reference another query
in the same workbook, don’t forget to enable the Fast Combine workbook setting). You
could even include further logic to only run the Web.Contents when the current code has
expired, but then you need to save the expiration date and access_token in a worksheet and
only refresh the token when the current date has passed that date.