GA4 BigQuery analysis
For exact attribution across concurrent experiments — or any case where GA4’s Custom Dimension and Audience limits get in your way — pipe GA4 to BigQuery and joinexperience_impression events to your conversion events
on user_pseudo_id. This is the same join GA4 does internally, but in raw
SQL with no caps, no (other) rollup, and no event-scope limitations.
This page assumes you’ve already enabled the GA4 export. The
experience_impression event with exp_variant_string is what we’ll join
on.
Enable BigQuery export from GA4
Have (or create) a Google Cloud project linked to billing
BigQuery requires a billing-enabled Google Cloud project. The free tier
is generous (10 GB storage, 1 TiB queries / month — plenty for most
sites’ GA4 raw data), but a billing account must be attached.
Link GA4 to BigQuery
In GA4: Admin → Product Links → BigQuery Links → Link. Choose your
GCP project, the dataset location, Daily export (free tier) or
Streaming (paid, ~$0.05/GB), and which events to include (default:
all). Confirm.
Cost reality check
- GA4 Standard (free) supports the daily batch export at no charge up to approximately 1 million events / day. Above that, the daily export is disabled for Standard properties — you’d need streaming (paid GCP billing, ~$0.05/GB) or GA4 360.
- BigQuery query costs are $5 per TiB scanned after the free 1 TiB /
month. Our parameterized template uses
_TABLE_SUFFIXpartitioning to keep scans small; a single experiment over a 30-day window typically scans under 1 GB. - For most ABTestly customers the entire pipeline costs $0/month. If you’re a large site doing >1M events/day, factor in streaming cost.
The parameterized SQL template
Download:ab-test-by-variant.sql
The query joins experience_impression events (the variant exposures) to
your conversion events on user_pseudo_id, then aggregates per
(experiment, variant): unique users, impressions, conversions, revenue, and
conversion rate.
What it does
UNNEST(event_params)to pullexp_variant_string,experiment_id,variant_id, and revenue out of GA4’s nested schema.- Builds the exposed user set per variant for the chosen experiment.
- For each user, finds their first exposure timestamp — conversions before that don’t count (you can’t be influenced by a variant you haven’t seen yet).
- Joins conversions on
user_pseudo_id(or useuser_idif you have User-ID configured) and aggregates per variant.
Running it
In the BigQuery console (orbq query), do a find-and-replace of the
@... parameters and run it, or use parameterized queries with
bq query --use_legacy_sql=false --parameter=experiment_id:STRING:7 ….
Caveats
user_pseudo_idis per-browser-cookie — cross-device users count twice unless you have User-ID configured (swapuser_pseudo_idforuser_idin the query).- Sample-ratio mismatch (SRM) checks should be done in BigQuery too — compare the exposed-user counts per variant against the configured weights. Significant skew means the bucketing isn’t representative.
- This query attributes first-touch at the user level. Other attribution
rules (last-touch, time-decay) require different
JOINsemantics — adapt thefirst_exposureCTE accordingly.
Looker Studio dashboard
For non-technical teammates, the SQL above is wrapped in a Looker Studio template you can connect once and re-use per experiment.Connect Looker Studio to your BigQuery dataset
In Looker Studio: Create → Data source → BigQuery → CUSTOM QUERY.
Pick your billing project, paste the parameterized SQL above (with the
@experiment_id, @conversion, @start_date, @end_date placeholders
promoted to parameters in the query editor — gear icon → Parameters).Build the report
Drop variant_id as a dimension, users, converters, revenue,
and conversion_rate as metrics. Add controls for each parameter so
viewers can pick the experiment, conversion event, and date range.
We’re working on a publicly shareable Looker Studio template you can copy in
one click. In the meantime the recipe above takes about 15 minutes to wire
up the first time.