KDnuggets : News : 2001 : n07 : item13

Q&A

From: Ismail Parsa, IParsa@epsilon.com
Date: Wed, 21 Mar 2001 13:35:17 -0500
Subject: On differences between the click-through rates reported by ad networks
Summary: Ismail Parsa analyzes the differences between the click-through
rates reported by ad networks, and those obtained from web log analysis.

The problem was as follows:

 One of our clients hired an ad agency to send email blasts and
 to place banner ads on various sites on the Internet.
 Epsilon analyzed the Web logs to identify conversions
 resulting from the advertising click-throughs. During this
 process we discovered that our counts were consistently lower
 than the click-throughs reported by the ad agency.
 For example, the ad agency reported 2059 banner clicks from
 one of the sites where banners were posted, whereas we were
 able to recover only 1019 related visits coming from that site.
 On average, we recovered 25% and 33% of the reported
 click-throughs in Phases 2 and 3 of the campaign, respectively.

The explanation follows:

We were only able to recover a fraction of the click-throughs
reported by the ad network as a result of processing the Web
logs. In a nutshell, the majority of the discrepancy is
explained by:

(1) The difference in methodology deployed by the ad network
    and by Epsilon in counting the online media related hits
(2) Unrecorded accesses caused by the phenomenon known as
    "caching."

The following table lists the recovery rates by media and by
campaign phase.

               Phase 2   Phase 3
     Banner      26%       36%
     Email       18%       28%
     Overall     25%       33%
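
As a rough illustration of how a recovery rate is computed, the sketch
below divides the visits recovered from the Web logs by the clicks the
ad network reported. The function name is hypothetical, and taking an
unweighted mean of the two phases is an assumption made here only to
show that the result is in line with the roughly 70 percent average
difference cited in this note.

```python
def recovery_rate(recovered_visits, reported_clicks):
    """Fraction of the ad network's reported clicks found in the Web logs."""
    return recovered_visits / reported_clicks

# Example site from the problem statement:
# 1019 visits recovered out of 2059 banner clicks reported.
site_rate = recovery_rate(1019, 2059)

# Overall recovery rates from the table, by campaign phase.
overall = {"Phase 2": 0.25, "Phase 3": 0.33}

# Assumption: a simple unweighted mean across the two phases.
mean_recovery = sum(overall.values()) / len(overall)
shortfall = 1 - mean_recovery

print(f"example site recovery: {site_rate:.1%}")
print(f"mean overall recovery: {mean_recovery:.0%}, shortfall: {shortfall:.0%}")
```

Note that the example site, at roughly half of the reported clicks,
recovered more than the campaign average.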

The factors that lead to the inconsistent counts between the
numbers reported by Epsilon and the ad network are as follows:

1) Counting methodology.
Epsilon's numbers reflect unique visits affiliated with online
media, whereas the ad network reports clicks, not visits.
For example, suppose a visitor clicks 3 times on a banner while
waiting for the page to load. Clicking on an ad multiple times out of
impatience is a common phenomenon. The ad network counts this as
3 click-throughs, whereas Epsilon counts it as a single visit to
the site. This particular ad network counts each consecutive banner
ad click separately, up to a maximum of six. Any time a visitor
clicks more than 6 times, it counts as 6. There is no such
limit for email. I believe 20 to 30 percent of the average
difference of 70 percent is explained by this.
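
The two counting rules described above can be sketched as follows. This
is a minimal illustration and not the ad network's or Epsilon's actual
code: the function names and the sample data are hypothetical, and only
the cap of six consecutive banner clicks comes from the text.

```python
NETWORK_CLICK_CAP = 6  # from the text: more than 6 consecutive banner clicks count as 6

def network_clicks(clicks_per_visit, cap=NETWORK_CLICK_CAP):
    """Ad-network style count: every click counts, up to `cap` per visit.

    Pass cap=None to model email, for which the text says there is no limit.
    """
    if cap is None:
        return sum(clicks_per_visit)
    return sum(min(n, cap) for n in clicks_per_visit)

def unique_visits(clicks_per_visit):
    """Web-log style count: each visitor who clicked at all is one visit."""
    return sum(1 for n in clicks_per_visit if n > 0)

# Hypothetical data: consecutive banner clicks made by each of five visitors.
clicks_per_visit = [1, 3, 1, 8, 2]

reported = network_clicks(clicks_per_visit)   # 1 + 3 + 1 + 6 + 2
visits = unique_visits(clicks_per_visit)

print(f"ad network reports {reported} clicks; Web logs show {visits} visits")
```

With this made-up sample the two methods disagree even though every
click reached the site, which is the point of factor 1.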

2) Unrecorded accesses.
When a visitor clicks on an ad, he or she sends a request to a Web
server for the page referenced by the ad. The referenced server
serves the request. The visitor's browser, by design, keeps a local
copy of all files involved in the request. This is called "locally
caching" the web pages, i.e., maintaining fully functional local
copies of them. This is done for efficiency. The next time the
visitor clicks on that ad or follows the email link, the visitor's
browser fulfills the request by serving up the local copy instead of
contacting the server referenced by the ad. As a result, the
referenced server has no record of that repeat visit. There can be
several levels of caching involved. A visitor's ISP or company may
also cache pages requested by its customers or employees on proxy
servers for quick retrieval the next time the same page is
requested. In all of these instances, the server referenced by the
ad does not record the additional visits until a page other than the
one served during the initial visit is requested. I believe another
20 to 30 percent of the average difference of 70 percent is
explained by this.
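
The effect of caching on the server's log can be illustrated with a
deliberately simplified model. The class names and URLs below are
hypothetical, and the model assumes a cache that never expires or
revalidates, which real browsers and proxies do not guarantee.

```python
class OriginServer:
    """Stands in for the server referenced by the ad; logs each request it serves."""

    def __init__(self):
        self.log = []

    def serve(self, url):
        self.log.append(url)
        return f"content of {url}"

class CachingBrowser:
    """Keeps a local copy of every page and reuses it on repeat requests."""

    def __init__(self, server):
        self.server = server
        self.cache = {}

    def click(self, url):
        if url not in self.cache:
            # Only a cache miss reaches the server and is recorded in its log.
            self.cache[url] = self.server.serve(url)
        return self.cache[url]

server = OriginServer()
browser = CachingBrowser(server)

# Hypothetical session: three clicks on the same ad, then a new page.
clicked = ["/landing", "/landing", "/landing", "/purchase"]
for url in clicked:
    browser.click(url)

print(f"{len(clicked)} clicks made, {len(server.log)} requests logged: {server.log}")
```

In this model the repeat clicks never reach the server, but the first
visit and the request for a new page are both logged, matching the
behavior described above.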

The remaining differences may be due to:

3) Not all click-throughs make it to the destination site.  There
may be some sort of network connection problem, or the visitor
may choose to close the window before anything loads.

4) Corporate firewalls may block certain sites. The visitor
clicks through but is not allowed to reach the site. Or the
visitor's e-mail server sits behind a firewall, or the visitor's
e-mail service disables tracking. In these instances, the visitor's
reaction may not be traceable.

5) An unlikely scenario is that some sites cause the ad network's
ad servers to double count clicks because of a local implementation
problem. For example, a banner embedded in a frame may be counted
more than once.

6) Another possible scenario is that ad network click-through
counts may include entries from robots and spiders that come to
register sites with search engines, etc.

To conclude, in the majority of cases, the visits affiliated with
online media were appropriately identified. Even when caching was
involved, the initial visit was identified. All subsequent visits or
events with a different page request (a purchase or a download
event, for example) were also identified. The visits were
consistently underestimated but were not missed (unless the visit
was of type 3 through 6).

---
I compiled my findings along with comments from others (too many to thank
publicly). I want to thank in particular Gordon Linoff
from Data Miners, Changfeng Wang from Engage and Panos Ventikos
from NetGenesis. I think KDnuggets readers can also benefit from
these findings. I would also like to hear their experiences and
comments, if any, on this subject.

Thanks,

     Ismail

Ismail Parsa
Sr. Director, e-Intelligence
Analytic Consulting Group
Epsilon
50 Cambridge Street
Burlington MA 01803 USA
781.685.6734 (tel)
781.685.0806 (fax)

Copyright © 2001 KDnuggets.