We’re a metric-happy bunch, we internet marketers. Data has been
our security blanket since the madness started, way back in 1995.
Now, though, things are getting a little uncomfortable. In 2011,
Google started encrypting search query data. Queries by users who were
connected to Google.com using an ‘https’ connection appear as “(not provided),” “(not set)” or “undefined” in analytics reports. At first, this was about 20-30% of
all searches. As of today, we see reports with over 80% “(not provided)” clicks. If you want the full explanation, visit the Search Engine
With “(not provided),” Google has hidden some of our most important
audience discovery data. For some websites, “(not provided)” has hidden over 70%
of keyword traffic.
We’re not going to get it back. We can’t fight it any more than
I could stop a freight train with my thumbs. We can’t replace the data we’re
losing with another Google data source. Google Webmaster Tools query data is
woefully inaccurate. AdWords data isn’t much better. Nor can we replace Google
data with another source: Google occupies too much of the online universe.
If we’re going to keep finding new customers in the data, we
need a whole new metric.
This isn’t an SEO problem
If you think this is just an SEO problem, think again. Google is
such a dominant source of traffic that it isn’t part of a marketing channel. It is a
marketing channel. Here are Q2 2013 traffic sources from 20 of our clients:
Here’s why this is more than an SEO problem (unless noted otherwise, the data below is aggregated from those 20 clients):
- Clicks from Google organic, paid and display ads outnumber every other channel but direct 10:1.
- 94% of successful searches come from organic search results (from http://searchenginewatch.com/article/2200730/Organic-vs.-Paid-Search-Results-Organic-Wins-94-of-Time).
- Many organic searches are new visitors. That’s a wealth of data on potential new customers.
- Google doesn’t reveal the proportion of branded vs. non-branded searches that end up in (not provided). Extrapolation is impossible.
- As I said above, Google Webmaster Tools data is wildly… random. It’s not even consistent from one report to the next.
In other words: We no longer have access to accurate data on the
single largest source of new online customers.
The new metric: Random Affinities
We have to find a whole new method for data-driven audience
discovery. I’ve been playing with one such method for a while now, with good
results: Random affinities.
Random affinities are topics or brands that are related to each
other only because one or more people like them both. Examples include pizza
and iPods™, Ford Mustangs™ and Tic Tacs™, internet marketing and Dungeons and
Dragons (I’m not kidding), or travel and books.
Random affinities can connect general interests or brands to
each other in ways you wouldn’t expect. Those connections can reveal audiences
you didn’t even know existed.
You can pull random affinities from the social graph. Social
media services don’t generate much traffic to other websites (hence their
absence in the above chart). But they do get
a continent’s worth of users. Any cross-section you can generate from the
social graph is bound to provide a solid sample.
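To make the idea concrete, here’s a minimal sketch of how you might surface random affinities from social-graph data. The sample data, topic names and the “two or more shared users” threshold are all hypothetical assumptions, not part of any particular service’s API — the point is only that pairs of interests co-liked by the same people fall out of a simple co-occurrence count.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample: each entry is one user's set of liked topics/brands,
# as you might collect them from a social network's "likes" data.
user_likes = [
    {"pizza", "ipods", "travel"},
    {"ford mustangs", "tic tacs", "pizza"},
    {"internet marketing", "dungeons and dragons", "travel"},
    {"travel", "books", "pizza"},
    {"internet marketing", "dungeons and dragons", "books"},
]

# Count how often each pair of interests is liked by the same user.
pair_counts = Counter()
for likes in user_likes:
    for pair in combinations(sorted(likes), 2):
        pair_counts[pair] += 1

# Pairs co-liked by at least two users (an arbitrary threshold for this
# tiny sample) are candidate random affinities, strongest first.
affinities = [(pair, n) for pair, n in pair_counts.items() if n >= 2]
affinities.sort(key=lambda item: -item[1])

for pair, n in affinities:
    print(pair, n)
```

On a real cross-section of the social graph you’d want a larger sample and a measure that corrects for overall popularity (otherwise universally liked topics pair with everything), but the co-occurrence count above is the core of it.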
Other new metrics
I’m biased, of course. Random affinities have been my brainchild-in-the-making
for over three years now. There are other possible metrics for new audience discovery:
- Third-party data: Hitwise, Nielsen NetRatings and others use
packet sampling and run their own ‘user panels’. They can provide insight
independent of Google.
- Other search engines. Bing only has 18% market share. But 18% of
a kajillion is still a lot. If you can get accurate keyword data, it’s useful.
One last one: Your instincts. Never, ever underestimate your
gut. Marketing is still more art than science. As long as our audience remains
human, it will be.