Q&A

From: Ismail Parsa, IParsa@epsilon.com
Date: Wed, 21 Mar 2001 13:35:17 -0500
Subject: On differences between the click-through rates reported by ad networks

Summary: Ismail Parsa analyzes the differences between the click-through rates reported by ad networks and those obtained from web log analysis.

The problem was as follows:

One of our clients hired an ad agency to blast emails and to place banner ads on various sites on the Internet. Epsilon analyzed the Web logs to identify conversions resulting from the advertising click-throughs. During this process we discovered that our counts were consistently lower than the click-through counts reported by the ad agency. For example, the ad agency reported 2059 banner clicks from one of the sites where banners were posted, whereas we were able to recover only 1019 related visits coming from that site. On average, we recovered 25% and 33% of the reported click-throughs in Phase 2 and Phase 3 of the campaign, respectively.

The explanation follows:

By processing the Web logs, we were able to recover only a fraction of the click-throughs reported by the ad network. In a nutshell, the majority of the discrepancy is explained by:

(1) The difference in the methodology used by the ad network and by Epsilon to count online media related hits
(2) Unrecorded accesses caused by the phenomenon known as "caching."

The following table lists the recovery rates by media and by campaign phase.

          Phase 2   Phase 3
Banner    26%       36%
Email     18%       28%
Overall   25%       33%

The factors that lead to the inconsistent counts between the numbers reported by Epsilon and the ad network are as follows:

1) Counting methodology. Epsilon's numbers reflect unique visits affiliated with online media, whereas the ad network reports clicks and not visits. For example, a visitor clicks 3 times on a banner while waiting for the page to load. Clicking on an ad multiple times out of impatience is a common phenomenon.
The ad network counts this as 3 click-throughs, whereas Epsilon counts it as a single visit to the site. This particular ad network counts consecutive banner ad clicks separately, up to a maximum of six: any time a visitor clicks more than 6 times, it counts as 6. There are no such limits for email. I believe 20 to 30 percent of the average difference of 70 percent is explained by this.

2) Unrecorded accesses. When a visitor clicks on an ad, s/he makes a request to a Web server to serve him/her the page referenced by the ad. The referenced server serves the request. The visitor's browser, by design, keeps a local copy of all files involved in the request. This is called "locally caching" the web pages, i.e., maintaining fully functional local copies of them, and it is done for efficiency. The next time the visitor clicks on that ad or follows the email link, the visitor's browser fulfills the request by serving up the local copy instead of contacting the server referenced by the ad. As a result, the referenced server has no record of that repeat visit. There can be several levels of caching involved. A visitor's ISP or company may also cache all pages requested by its visitors/workers on proxy servers for easy retrieval the next time the same page is requested. In all of these instances, the server referenced by the ad does not record the additional visits until a new page, other than the one requested during the initial visit, is requested. I believe another 20 to 30 percent of the average difference of 70 percent is explained by this.

The remaining differences may be due to:

3) Not all click-throughs make it to the destination site. There may be some sort of network connection problem, or the visitor may choose to close the window before anything loads.

4) Corporate firewalls may block certain sites. The visitor clicks through but is not allowed to go there.
Alternatively, the visitor's e-mail server may sit behind a firewall, or the visitor's e-mail service may disable tracking. In these instances, the visitor's reaction may not be traceable.

5) An unlikely scenario is that some sites cause the ad network's ad servers to double count a click because of a local implementation problem. For example, if the banner is embedded in a frame, each may be counted.

6) Another possible scenario is that the ad network's click-through counts include entries from robots and spiders that come to register sites with search engines, etc.

To conclude, in the majority of cases, the visits affiliated with online media were appropriately identified. Even when caching was involved, the initial visit was identified. All subsequent visits/events with a different page request, such as a purchase or a download event, were also identified. The visits were consistently under-estimated but were not missed (unless the visit was of type 3 through 6).

---

I compiled my findings along with comments from others (too many to thank publicly). But I want to thank in particular Gordon Linoff from Data Miners, Changfeng Wang from Engage and Panos Ventikos from NetGenesis. I think KDNuggets readers can also benefit from these findings. I would also like to hear their experiences/comments, if any, on this subject.

Thanks,
Ismail

Ismail Parsa
Sr. Director, e-Intelligence
Analytic Consulting Group
Epsilon
50 Cambridge Street
Burlington MA 01803 USA
781.685.6734 (tel)
781.685.0806 (fax)
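
As a rough illustration of factor 1 (counting methodology), the Python sketch below contrasts a click count capped at six per visitor with a unique-visit count, and computes the resulting recovery rate. The function names, the sample click data, and the simplification that all of a visitor's clicks form one consecutive burst are assumptions made for illustration; they are not the ad network's or Epsilon's actual implementation.

```python
from collections import Counter

# From the post: more than six consecutive banner clicks count as six.
BANNER_CLICK_CAP = 6


def ad_network_clicks(clicks, cap=BANNER_CLICK_CAP):
    """Count clicks the way the ad network is described as doing.

    `clicks` is a list of visitor ids, one entry per click. Treating all of
    a visitor's clicks as a single consecutive burst is an assumption.
    Pass cap=None for email, where the post says no limit applies.
    """
    per_visitor = Counter(clicks)
    if cap is None:
        return sum(per_visitor.values())
    return sum(min(n, cap) for n in per_visitor.values())


def log_based_visits(clicks):
    """Count one visit per visitor, as a unique-visit count would."""
    return len(set(clicks))


def recovery_rate(visits, reported_clicks):
    """Share of reported clicks that were recovered as visits."""
    return visits / reported_clicks


# Hypothetical data: visitor "a" clicks 3 times, "b" once, "c" 8 times.
clicks = ["a"] * 3 + ["b"] + ["c"] * 8
reported = ad_network_clicks(clicks)   # 3 + 1 + 6 = 10
visits = log_based_visits(clicks)      # 3 distinct visitors
print(reported, visits, round(recovery_rate(visits, reported), 2))  # 10 3 0.3
```

Applied to the single-site figures quoted in the post, `recovery_rate(1019, 2059)` gives roughly 0.49. The sketch covers only the counting difference; it does not model caching or the other factors listed.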
Copyright © 2001 KDnuggets.