Data Warehousing for Google Analytics Data


How it works

Needl uses the Google Analytics API to extract and enhance your data and then store it in BigQuery - Google's cloud data warehousing solution. An initialization will go back in time to build up your data and then make daily updates. The BigQuery account can be your own or we can host it for you.

The Google Analytics API limits the number of dimensions requested in a single query, Needl overcomes this by sending off multiple queries and stitching the results together.

BigQuery connects well with analytics tools such as Data Studio and Google’s ML platform. Give your data a proper home!


Features:

  • Session (and hit) level data rather than aggregated - with daily updates.
  • Over 50 dimensions supported as standard - see typical schema below. Google's Analytics API normally restricted to 7.
  • Hit level data - pages visited during a session supported as standard.
  • Client Id extracted - connect users sessions over time to allow more sophisticated analysis, for example channel attribution using a Markov model.
  • Weather data added per session as standard.
  • Sampling can be avoided - run reports across date ranges and as long as there is less than 500K sessions per day, no sampling involved.
  • Bucketed dimensions added to aid analytics:
    • technology = deviceCategory + operatingSystem + browser
    • daypart = combined dayOfWeek and hour
  • User local time added as standard - note that Analytics only reports times based on server time.
  • Custom dimensions supported on a request basis.


See it in action

Example dataset in BigQuery:

https://console.cloud.google.com/bigquery?project=needlbq&p=needlbq&d=public&t=demo&page=table


Typical Schema

Fields not available in Google Analytics are shown with a description.

Dimension/MetricDescription
time.utcDateTimeUTC date/time
time.dateTimeServer (Analytics) date/time
time.hour
time.dayOfWeekName
time.workingDay
time.dayPartCombined day of week/hour
time.userDayOfWeekNameUser’s local day of week
time.userDayPartUsers local daypart
time.userHourUser’s local hour
time.dayLightUser’s local daylight – light/dark/twilight
geoNetwork.cityId
geoNetwork.city
geoNetwork.region
geoNetwork.metro
geoNetwork.metroRegionCombined metro/region
geoNetwork.country
geoNetwork.longitude
geoNetwork.latitude
geoNetwork.timezoneTimezone of the user
geoNetwork.networkLocation
geoNetwork.continent
geoNetwork.subContinent
user.clientId
user.userType
user.sessionCount
user.sessionFrequency
user.sinceSessionBucket
user.weeksSinceLastSession
user.sessionDurationBucket
user.daysSinceLastSession
trafficSource.source
trafficSource.medium
trafficSource.trafficType
trafficSource.keyword
trafficSource.campaign
trafficSource.channelGrouping
trafficSource.fullReferrer
hits.time
hits.utcTimeStamp
hits.page
hits.page.hits
hits.page.pagePath
hits.page.pagePathBasePagepath with query string removed
hits.page.pagePathLevel1
hits.page.pagePathLevel2
hits.page.pagePathLevel3
hits.page.pagePathLevel4
hits.page.previousPagePath
weather.utcStationTimeWeather event UTC timestamp of station
weather.weatherCategoryMain weather feature (Rain/Clouds/Clear…)
weather.weatherDescriptionMore detail of weather
weather.sunriseSunrise (weather station)
weather.sunsetSunset (weather station)
weather.weatherIconIcon – can be used with url for weather image
weather.temperatureTemperature at weather station
device.osMajorVersion
device.browserMajorVersion
device.technologyCombined os/device/browser
device.deviceCategory
device.browser
device.operatingSystemVersion
device.browserVersion
device.operatingSystem
device.language
adWords.adwordsCampaignID
adWords.adwordsAdGroupID
adWords.adwordsCreativeID
totals.sessions
totals.bounces
totals.sessionDuration
totals.transactionRevenue
totals.transactions
totals.goal1Completions
totals.goal2Completions
totals.goal3Completions
:
totals.goal20Completions

Book a demo for free!