Introduction

Definition of N-Gram Analysis

N-Gram analysis is a method in natural language processing (NLP) and computational linguistics used to examine text and data by analyzing sequences of N consecutive elements or tokens (e.g., words or letters). It helps to identify patterns, frequencies, and relationships between elements in larger text corpora.

Importance of Optimizing Search Terms in Google Ads

In digital marketing, Google Ads has established itself as an effective advertising tool that helps businesses reach their target audiences and generate clicks and conversions. Optimizing search terms is crucial for the success of a Google Ads campaign because it increases the relevance of ads to the target audience, which in turn improves performance. Well-chosen keywords help ensure that your ads appear for the search queries your audience actually uses, thereby capturing their attention.

Objective of the Blog Post

The purpose of this blog post is to provide you with an understanding of how N-Gram analysis can be used to optimize search terms for Google Ads. We will explain the basics of N-Gram analysis, highlight its benefits for keyword optimization, and provide a step-by-step guide for conducting an N-Gram analysis to improve your Google Ads campaigns.

Introduction to N-Gram Analysis

What is an N-Gram?

An N-Gram is a sequence of N adjacent tokens or elements extracted from a larger text or sentence. In natural language processing (NLP), these tokens are typically words or letters. For example, a bigram (N=2) consists of two consecutive words or letters, while a trigram (N=3) consists of three consecutive words or letters. Analyzing such sequences can help identify patterns and frequencies within textual data.
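To make the definition concrete, here is a minimal Python sketch that slides a window of size N over a list of tokens. It is only an illustration and not part of the BigQuery workflow described later; the helper name `ngrams` and the sample phrase are made up for this example.

```python
def ngrams(tokens, n):
    """Return every run of n adjacent tokens, in order of appearance."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # A sequence of length L contains L - n + 1 windows of size n
    # (and none at all if it is shorter than n).
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


words = "buy red running shoes".split()
print(ngrams(words, 2))         # word bigrams
print(ngrams(list("shoe"), 3))  # letter trigrams
```

The same function covers both cases mentioned above: pass a list of words for word-level N-Grams, or a list of characters for letter-level N-Grams.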

Application Areas of N-Gram Analysis

N-Gram analysis has a variety of applications in different disciplines. In computational linguistics and NLP, it is used to create language models for predicting the probability of word sequences. In machine learning, N-Gram analysis can be used for text classification, sentiment analysis, or automatic text generation.
In the context of Google Ads, N-Gram analysis plays a crucial role in optimizing search terms and understanding user behavior in relation to searching for products and services.

Benefits of N-Gram Analysis for Search Term Optimization

N-Gram analysis offers several advantages for optimizing your search terms in Google Ads:

1. Identify patterns and trends: By uncovering common word combinations in search queries, you can identify patterns and trends relevant to your campaign goals.
2. Improved ad relevance: By using the identified patterns to adapt your ad copy, keywords, and landing pages, you can increase the relevance of your ads and thus potentially achieve better results in terms of CTR, CPC, and conversions.
3. Effective data analysis: Instead of focusing on individual keywords, N-Gram analysis allows you to gain broader insight into your audience’s behavior and make more informed decisions about your campaign strategy.

Overall, N-Gram analysis offers a more efficient approach to keyword optimization in Google Ads and can help improve your campaign results.

Steps for Conducting an N-Gram Analysis for Google Ads

Data Collection: Gathering Search Queries and Clicks

An N-Gram analysis needs a sufficient amount of search query data: ideally, the chosen time period and campaign should yield several thousand records. The data can then be transferred to BigQuery either using this script or the Google Ads transfer for BigQuery. Both methods have already been described in detail.

Data Processing: Decomposing Search Queries into N-Grams

To decompose the search queries into N-Grams, Google BigQuery with its ML.NGRAMS function is the method of choice.

This BigQuery code executes a query on the table `myproject.gads_custom.search_term_view`; it corresponds to the `ngrams` subquery of the complete query further down. The main goals of this query are:

1. Selecting specific columns from the table
2. Creating N-grams (1-gram, 2-gram, 3-gram, 4-gram and 5-gram) from the search term
3. Applying a date range for the query

Here is an explanation of each part of the code:

- `SELECT`: Selects the specified columns and calculated fields from the table.
- `segments_date`: A column in the table that represents the date.
- `searchTermView_search_term search_term`: Selects the column `searchTermView_search_term` and renames it to `search_term`.
- `ML.NGRAMS(..)`: A function to create n-grams from a text sequence.
- `REGEXP_EXTRACT_ALL(LOWER(searchTermView_search_term), '[a-z0-9\\+\\-äüöß]+')`: Converts the search term to lowercase and extracts every run of letters, digits, or the special characters +, -, ä, ü, ö, ß. All other characters (spaces, punctuation) act as separators, so the result is an array of tokens.
- `[1,1]`, `[2,2]`, `[3,3]`, `[4,4]`, `[5,5]`: Each range sets the minimum and maximum length to the same value, so each call returns only n-grams of exactly that length. Together, the five calls cover lengths 1 to 5.
- The columns `metrics_impressions`, `metrics_clicks`, `metrics_cost_micros`, `metrics_conversions` and `metrics_conversions_value` are selected directly from the table.
- `FROM`: Specifies the table from which the data should be queried: `myproject.gads_custom.search_term_view`.
- `WHERE`: Filters the rows in the table based on the specified conditions.
- `segments_date BETWEEN PARSE_DATE("%Y%m%d","20230301") AND PARSE_DATE("%Y%m%d","20230411")`: Filters the rows with a date range between March 1, 2023 and April 11, 2023.
- `AND searchTermView_search_term IS NOT NULL`: Additionally keeps only rows where the search term is not NULL.
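To see what the tokenization step does to a single search term, the following Python sketch mimics the regular expression and the n-gram construction outside of BigQuery. It is an approximation for illustration only: the function names and the sample search term are invented, and joining tokens with a single space is assumed here to match the default separator of `ML.NGRAMS`.

```python
import re

# Same character class as in the query: lowercase letters, digits, +, -, ä, ü, ö, ß.
TOKEN_PATTERN = re.compile(r"[a-z0-9+\-äüöß]+")


def tokenize(search_term):
    """Lowercase the term and keep only runs of the allowed characters."""
    return TOKEN_PATTERN.findall(search_term.lower())


def grams_of_length(tokens, n):
    """Join every run of n adjacent tokens with a space (assumed separator)."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = tokenize("Nike Air-Max Größe 42 (günstig!)")
print(tokens)                      # brackets and "!" are dropped, "air-max" stays one token
print(grams_of_length(tokens, 2))  # the equivalent of the [2,2] range
```

Because the hyphen is part of the character class, hyphenated terms stay together as one token, while characters outside the class simply split the term.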

Further Modification

For further processing, we now want to label each gram with its type so that we can filter the data by “1_gram”, “2_gram”, and so on. We also want to see a few example search queries from which the respective gram was created, and we want to sum up the KPIs so that we can estimate whether the gram is having a positive or negative impact on our campaigns. For the 1-grams, we do this with the following query:

 SELECT
  gram,
  "1_gram" type,
  ARRAY_TO_STRING(ARRAY_AGG(DISTINCT search_term IGNORE NULLS LIMIT 5 ),", ") top_search_terms,
  SUM(impressions) impressions,
  SUM(clicks) clicks,
  SUM(cost) cost,
  SUM(conversions) conversions,
  SUM(conversionValue) conversionValue
  FROM ngrams, UNNEST(gram_1) gram
  WHERE gram IS NOT NULL
  GROUP BY 1,2
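The logic of this aggregation can be sketched in plain Python, which may help when checking the results on a small sample. This is a simplified stand-in, not the query itself: the function name, the row layout, and the sample data are assumptions, only two metrics are summed, and the sketch keeps the first five distinct search terms it sees, whereas `ARRAY_AGG(DISTINCT ... LIMIT 5)` does not guarantee which five it returns.

```python
from collections import defaultdict


def aggregate_grams(rows, gram_type):
    """Sum metrics per gram and collect up to five distinct example search terms."""
    totals = defaultdict(lambda: {"impressions": 0, "clicks": 0, "examples": []})
    for row in rows:
        # Equivalent of UNNEST: one contribution per gram in the row's array.
        for gram in row["grams"]:
            entry = totals[gram]
            entry["impressions"] += row["impressions"]
            entry["clicks"] += row["clicks"]
            term = row["search_term"]
            if term not in entry["examples"] and len(entry["examples"]) < 5:
                entry["examples"].append(term)
    return [
        {
            "gram": gram,
            "type": gram_type,
            "top_search_terms": ", ".join(entry["examples"]),
            "impressions": entry["impressions"],
            "clicks": entry["clicks"],
        }
        for gram, entry in totals.items()
    ]


rows = [
    {"search_term": "red running shoes", "grams": ["red", "running", "shoes"],
     "impressions": 100, "clicks": 10},
    {"search_term": "cheap running shoes", "grams": ["cheap", "running", "shoes"],
     "impressions": 50, "clicks": 2},
]
result = {r["gram"]: r for r in aggregate_grams(rows, "1_gram")}
print(result["running"])
```

Note that every search term contributes its full metrics to each gram it contains, so the sums across grams add up to more than the campaign totals.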

The Complete BigQuery Query

WITH
ngrams as (
 SELECT
  segments_date,
  searchTermView_search_term search_term,
  ML.NGRAMS(REGEXP_EXTRACT_ALL(LOWER(searchTermView_search_term), '[a-z0-9\\+\\-äüöß]+'),[1,1]) gram_1,
  ML.NGRAMS(REGEXP_EXTRACT_ALL(LOWER(searchTermView_search_term), '[a-z0-9\\+\\-äüöß]+'),[2,2]) gram_2,
  ML.NGRAMS(REGEXP_EXTRACT_ALL(LOWER(searchTermView_search_term), '[a-z0-9\\+\\-äüöß]+'),[3,3]) gram_3,
  ML.NGRAMS(REGEXP_EXTRACT_ALL(LOWER(searchTermView_search_term), '[a-z0-9\\+\\-äüöß]+'),[4,4]) gram_4,
  ML.NGRAMS(REGEXP_EXTRACT_ALL(LOWER(searchTermView_search_term), '[a-z0-9\\+\\-äüöß]+'),[5,5]) gram_5,
  metrics_impressions impressions,
  metrics_clicks clicks,
  metrics_cost_micros cost, # cost in micros: 1,000,000 micros = 1 currency unit
  metrics_conversions conversions,
  metrics_conversions_value conversionValue
FROM `myproject.gads_custom.search_term_view`  
#WHERE segments_date BETWEEN PARSE_DATE("%Y%m%d",@DS_START_DATE) AND PARSE_DATE("%Y%m%d",@DS_END_DATE)
WHERE segments_date BETWEEN PARSE_DATE("%Y%m%d","20230301") AND PARSE_DATE("%Y%m%d","20230411")
AND searchTermView_search_term IS NOT NULL
),
gramOne as (
 SELECT
  gram,
  "1_gram" type,
  ARRAY_TO_STRING(ARRAY_AGG(DISTINCT search_term IGNORE NULLS LIMIT 5 ),", ") top_search_terms,
  SUM(impressions) impressions,
  SUM(clicks) clicks,
  SUM(cost) cost,
  SUM(conversions) conversions,
  SUM(conversionValue) conversionValue
  FROM ngrams, UNNEST(gram_1) gram
  WHERE gram IS NOT NULL
  GROUP BY 1,2
),
gramTwo as (
 SELECT
  gram,
  "2_gram" type,
  ARRAY_TO_STRING(ARRAY_AGG(DISTINCT search_term IGNORE NULLS LIMIT 5 ),", ") top_search_terms,
  SUM(impressions) impressions,
  SUM(clicks) clicks,
  SUM(cost) cost,
  SUM(conversions) conversions,
  SUM(conversionValue) conversionValue
  FROM ngrams, UNNEST(gram_2) gram
  WHERE gram IS NOT NULL
  GROUP BY 1,2
),
gramThree as (
 SELECT
  gram,
  "3_gram" type,
  ARRAY_TO_STRING(ARRAY_AGG(DISTINCT search_term IGNORE NULLS LIMIT 5 ),", ") top_search_terms,
  SUM(impressions) impressions,
  SUM(clicks) clicks,
  SUM(cost) cost,
  SUM(conversions) conversions,
  SUM(conversionValue) conversionValue
  FROM ngrams, UNNEST(gram_3) gram
  WHERE gram IS NOT NULL
  GROUP BY 1,2
),
gramFour as (
 SELECT
  gram,
  "4_gram" type,
  ARRAY_TO_STRING(ARRAY_AGG(DISTINCT search_term IGNORE NULLS LIMIT 5 ),", ") top_search_terms,
  SUM(impressions) impressions,
  SUM(clicks) clicks,
  SUM(cost) cost,
  SUM(conversions) conversions,
  SUM(conversionValue) conversionValue
  FROM ngrams, UNNEST(gram_4) gram
  WHERE gram IS NOT NULL
  GROUP BY 1,2
),
gramFive as (
 SELECT
  gram,
  "5_gram" type,
  ARRAY_TO_STRING(ARRAY_AGG(DISTINCT search_term IGNORE NULLS LIMIT 5 ),", ") top_search_terms,
  SUM(impressions) impressions,
  SUM(clicks) clicks,
  SUM(cost) cost,
  SUM(conversions) conversions,
  SUM(conversionValue) conversionValue
  FROM ngrams, UNNEST(gram_5) gram
  WHERE gram IS NOT NULL
  GROUP BY 1,2
)
SELECT * FROM gramOne
UNION ALL
SELECT * FROM gramTwo
UNION ALL
SELECT * FROM gramThree
UNION ALL
SELECT * FROM gramFour
UNION ALL
SELECT * FROM gramFive

The query can be used in Looker Studio right away and should work for testing purposes. For better performance and lower query costs, I recommend scheduling the query to run daily and writing the N-grams to a table.
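The query returns raw sums per gram. To judge whether a gram helps or hurts, ratios such as the CTR and CPC mentioned earlier are usually easier to read. The sketch below shows one way to derive them from a result row; the function name, the sample numbers, and the choice of KPIs are assumptions for illustration. It also converts the cost, which the query takes from `metrics_cost_micros`, from micros into currency units.

```python
def gram_kpis(row):
    """Derive ratio KPIs from one aggregated result row of the n-gram query."""
    cost = row["cost"] / 1_000_000  # micros -> currency units
    clicks = row["clicks"]
    impressions = row["impressions"]
    conversions = row["conversions"]
    return {
        "gram": row["gram"],
        "cost": cost,
        # Ratios are None when the denominator is zero.
        "ctr": clicks / impressions if impressions else None,
        "cpc": cost / clicks if clicks else None,
        "cost_per_conversion": cost / conversions if conversions else None,
        "roas": row["conversionValue"] / cost if cost else None,
    }


row = {
    "gram": "running shoes",
    "impressions": 2000,
    "clicks": 100,
    "cost": 50_000_000,
    "conversions": 5.0,
    "conversionValue": 250.0,
}
kpis = gram_kpis(row)
print(kpis)
```

Grams with high cost and no conversions are natural candidates for negative keywords, while grams with a strong return may deserve their own keywords or ad copy.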

SEA Expert: Bernhard Prange

Bernhard Prange is a Google Ads freelancer and tracking specialist with more than 10 years of experience in performance marketing. His focus is on data-driven work: from Google Shopping and conversion tracking to server-side solutions with Matomo and BigQuery.

As a point of contact for agencies, e-commerce companies, and B2B service providers, he combines technical know-how with a strategic view of marketing and business models.
