The last excerpt from “Achieving the Gold Standard in Paid-Search Bid Optimization” established that model-based optimization is superior to rules-based and that, in the world of models, global optimization beats local. But there is another layer of global optimization that needs exploring. Within the global subset of model-based optimization are keyword-level and cluster modeling. The two primary differences between them are how they deal with performance data, especially the sparse data one finds in tail terms, and their financial impact on paid-search programs. If you want the back-story before you dive in here, you can start with “Paid-Search Overview” and “Rules vs. Models”.
Global Keyword vs. Global Cluster Modeling
Global keyword optimization results in improvements that are significantly better than cluster modeling because each keyword is treated individually. Cluster modeling creates groups of keywords, treating each member the same as the others – a hazardous shortcut for advertisers with large, complex paid-search campaigns.
Cluster model-based systems were developed to cope with the “sparse data” problem of tail terms for which there is very little historical data available. As the name implies, clustering aggregates data from several keywords to hundreds and even thousands of keywords, which are assumed to have the same general performance characteristics. Simply put, clustering was developed to assemble data sets large enough to permit human analysts to apply more traditional statistical techniques to determine bids.
While clustering may seem like a necessary way to manage keywords that receive very few clicks or impressions, the technique is in many ways sub-optimal. On the plus side, using clusters leads to model stability, meaning the results are repeatable, which is something statisticians love. But the negative effects of using clustering are substantial. For one, keywords in a cluster might seem similar, but each keyword is actually unique at some level. While statisticians might value the extra data, individual keywords lose their uniqueness in a cluster. The result is a loss of performance compared to modeling each keyword as a unique entity (see figure below), as the advertiser spends too much on some keywords in a cluster and too little on others.
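To make the over- and under-spending concrete, here is a minimal sketch in Python. The keyword names, the click and revenue figures, and the rule of bidding a fixed share of revenue per click are all hypothetical assumptions chosen for illustration; they are not data or logic from any real bidding system.

```python
# Hypothetical history for three keywords in one cluster:
# keyword -> (clicks, revenue in dollars). All numbers are made up.
HISTORY = {
    "kw_a": (100, 400.0),
    "kw_b": (100, 100.0),
    "kw_c": (100, 40.0),
}
TARGET_COST_SHARE = 0.5  # assumed rule: bid half of expected revenue per click


def revenue_per_click(clicks, revenue):
    return revenue / clicks


def cluster_bids(history, share=TARGET_COST_SHARE):
    """One shared bid for every keyword, derived from pooled cluster data."""
    total_clicks = sum(clicks for clicks, _ in history.values())
    total_revenue = sum(revenue for _, revenue in history.values())
    bid = share * revenue_per_click(total_clicks, total_revenue)
    return {kw: bid for kw in history}


def keyword_bids(history, share=TARGET_COST_SHARE):
    """A separate bid per keyword, derived from that keyword's own data."""
    return {
        kw: share * revenue_per_click(clicks, revenue)
        for kw, (clicks, revenue) in history.items()
    }


cluster = cluster_bids(HISTORY)
keyword = keyword_bids(HISTORY)
for kw in HISTORY:
    gap = cluster[kw] - keyword[kw]
    label = "overbid" if gap > 0 else "underbid"
    print(f"{kw}: cluster {cluster[kw]:.2f}, keyword {keyword[kw]:.2f} "
          f"({label} by {abs(gap):.2f})")
```

With these made-up numbers, the cluster assigns a bid of 0.90 to all three keywords, while the keyword-level bids are 2.00, 0.50 and 0.20. The cluster underbids the strongest keyword and overbids the two weaker ones. The sketch deliberately gives every keyword plenty of clicks so that it isolates the averaging effect; it does not address how to estimate per-keyword values from sparse data.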

A second issue is that clustering typically requires statisticians to tune the models manually. As a result, cluster-based solutions are rarely pure software applications and thus can’t offer the degree of automated bid optimization needed by advertisers managing thousands or millions of keywords in a dynamic bid-based marketplace. A third issue is that creating clusters is time-intensive, and because all statistical models decay over time, the models involved must be retrained periodically. For these two reasons, cluster models tend to be used far beyond their useful life cycle, with a resulting negative impact on performance.
Clustering survives due to the belief that keyword-level modeling, while far superior in performance terms, can’t be done on tail terms for which there simply isn’t enough data available to build accurate models. That’s why most paid-search bid optimization vendors rely extensively on clustering, even though they might not reveal this fact to customers. Why? Clustering has yet to be automated and requires human analysis, which drives up cost and response times, even as it leads to sub-optimal performance on the vast majority of keywords in a campaign.
Fortunately for advertisers, this belief is false. Keyword-level modeling can be done on keywords with as few as 10 conversions a year. What’s required is the right balance of the appropriate math, software automation, and transparency into the specific variables that drive individual keyword performance. Global keyword-level modeling is the gold standard of bid management. It regularly improves performance (profit, revenue, ROAS, etc.) by 25 percent or more in controlled tests against global cluster-level, local keyword-level and rules-based technologies. The basic elements driving success for global keyword-level modeling can be summarized as follows.
Appropriate math: The mathematical approach to global keyword-level modeling rejects the “requirement” to cluster in order to create sufficient click or impression data. Clustering might produce an acceptable “average” for all keywords in the cluster, but that average has little relevance to the future performance of the individual keywords the cluster contains. This accounts for the sub-optimal performance of clustering versus modeling the specific performance of the individual keywords within the cluster, which is achievable even for keywords with small amounts of data. Achievable, that is, using the appropriate mathematical modeling techniques, such as structural risk minimization, a technique that trains models to become simpler as data sets become sparser. Complex clustering techniques, by contrast, deal with sparse data by building models that are often far more elaborate than the data will support.
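The article names structural risk minimization but gives no formulas, so the following is only a textbook-style sketch of the idea, not the method used by any particular vendor. It assumes three nested, hypothetical model classes for a keyword (capacity 1, for example a single constant conversion rate; capacity 2, for example weekday versus weekend; capacity 7, for example one rate per day of the week) and picks the class with the lowest guaranteed risk, defined here as training error plus a Vapnik-style confidence term. The training error values are made up and, for simplicity, held fixed across sample sizes.

```python
import math


def vc_confidence(h, n, delta=0.05):
    """Vapnik-style confidence term for capacity h and n samples (needs n > h)."""
    return math.sqrt((h * (math.log(2 * n / h) + 1) + math.log(4 / delta)) / n)


def select_capacity(empirical_risk, n, delta=0.05):
    """Return the capacity with the lowest bound: training error + confidence term."""
    bounds = {
        h: risk + vc_confidence(h, n, delta)
        for h, risk in empirical_risk.items()
        if h < n  # the bound is only defined when samples outnumber capacity
    }
    return min(bounds, key=bounds.get)


# Hypothetical training error for each nested model class (capacity -> error).
# More complex classes fit the training data better, as they always will.
EMPIRICAL_RISK = {1: 0.30, 2: 0.22, 7: 0.15}

sparse_choice = select_capacity(EMPIRICAL_RISK, n=20)    # tail keyword, little data
rich_choice = select_capacity(EMPIRICAL_RISK, n=5000)    # head keyword, ample data
print(f"sparse data -> capacity {sparse_choice}, rich data -> capacity {rich_choice}")
```

Under these assumptions the selection falls on the simplest class (capacity 1) when only 20 observations are available and on the most complex class (capacity 7) with 5,000 observations. The confidence term shrinks as data accumulate, so extra model complexity is only "paid for" when the data can support it.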
In one controlled test, using the right math to individually model each keyword drove 216 percent more account sign-ups than a competing cluster-level solution. One issue in this test was the age of clusters used, which hadn’t been refreshed and contained keyword groupings that were simply obsolete. Keyword-level modeling techniques identified a handful of good keywords hidden in clusters of bad keywords. Separating those out and bidding them up led to the volume increase.
Software automation: Since keyword-level modeling eliminates the need to cluster, it can be achieved through software automation. For advertisers dealing with large numbers of keywords who want to avoid the performance sacrifice in long-tail keywords that is inevitable with clustering, automation is usually an advantage in a dynamic advertising marketplace, both in terms of lower cost and superior daily bid optimization of all keywords in an SEM program. The same type of software automation techniques can be applied to retraining the models that predict keyword performance. The models can be “taught” to learn automatically from the most recent inputs, which makes them much more responsive to changing marketplace conditions.
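One simple way to picture a model that keeps learning from the most recent inputs is a recency-weighted conversion rate, in which each older day of data counts a little less than the day after it. The sketch below is an illustrative assumption, not a description of any vendor's retraining procedure; the decay factor and the daily figures are hypothetical.

```python
def decayed_rate(daily_stats, decay=0.7):
    """Conversion rate in which each day counts `decay` times the following day.

    daily_stats is a list of (clicks, conversions) tuples, oldest first.
    """
    clicks = conversions = 0.0
    for day_clicks, day_conversions in daily_stats:
        clicks = decay * clicks + day_clicks
        conversions = decay * conversions + day_conversions
    return conversions / clicks if clicks else 0.0


def all_time_rate(daily_stats):
    """Plain conversion rate over the whole history, with every day weighted equally."""
    clicks = sum(c for c, _ in daily_stats)
    conversions = sum(v for _, v in daily_stats)
    return conversions / clicks if clicks else 0.0


# Hypothetical keyword: ten days converting at 10%, then five days at 2%.
DAILY = [(100, 10)] * 10 + [(100, 2)] * 5

recent = decayed_rate(DAILY)
overall = all_time_rate(DAILY)
print(f"recency-weighted: {recent:.3f}, all-time: {overall:.3f}")
```

After the hypothetical shift from a 10 percent to a 2 percent conversion rate, the recency-weighted estimate sits much closer to the new rate than the all-time average does, which is the kind of responsiveness to changing conditions described above. Because the update needs only the previous running totals and the newest day's data, it can run automatically on every refresh.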
Transparency: Transparency is the ability to make visible to the advertiser the individual variables that drive keyword performance to build understanding and trust. In this sense, clusters are about as transparent as mud, since the complexity of clusters makes them almost impossible to describe. Modeling keywords individually allows advertisers to make intelligent determinations on each and every keyword in an SEM campaign. They can see the decisions the software made and why. Figure 3 compared five variables that drove performance in two similar but distinct keywords. Using clustering, marketers would be forced to conclude that only one variable – cost per click – was the common and determinant performance factor, whereas in reality, the two keywords have very dissimilar performance profiles and require very different optimization strategies.
Bottom Line
Model-based keyword bid optimization is by far superior to rules-based, but selecting the right model-based system is critical to maximizing paid-search campaign performance. While local optimization is incrementally better than rules-based, its performance improvement does not approach that of global cluster or global keyword modeling. Only global keyword-level modeling, the gold standard of bid optimization, will truly maximize paid-search performance.
Keyword-level modeling for all keywords in an SEM program, not just the low-hanging fruit of head terms, is a demonstrably superior approach and delivers overall performance gains of 25 percent or more over clustering and other techniques. Software-driven bid optimization techniques that use keyword-level modeling also provide advertisers with greater flexibility and control. Clusters take time to create and rebuild in response to changing business goals and marketplace conditions. With automated keyword-level bid optimization techniques, advertisers can continually change campaign goals and constraints and rebid keywords at any frequency they need.
Keyword-level, global bid optimization is the ultimate solution for deriving maximum profit from SEM campaigns that involve a large number of keywords. It’s the right choice for optimizing head terms, tail terms and everything in between.
@OptiMineInc