tee-bee-dak | digital analytics, data, marketing

The web analytics implementation will die, part 2

I don’t know everything. From time to time, I am sure to write myself into a hole. Usually I’ll provide an escape route. The perfect example of such an escape route? Last month’s prediction post about “track everything”, where I so elegantly stated:

“We (Digital Analytics) need a solution that captures and organizes everything that happens on our sites, and lets analysts further sort everything out after the fact. Maybe it already exists, and if so… great (where can I sign up?)! I think there are several vendors out there who have some piece of the beginning stages of this dreamed up technology, but in fact they are quite a ways away from offering what I envision.”

Like I said … “maybe it already exists” … “I think there are vendors who have a piece of the beginning stages” … Escape routes! Years of consulting at work here, and I do apologize for it.

As it turns out, minor names such as Google and Tealium are already totally involved in a relationship with this concept. Other names… beautiful new names… like Heap Analytics (page title: “Capture Everything”) and SnowPlow Analytics, are also fully involved in relationships with the concept.

That’s great news.

It’s great news that makes me wonder whether today’s (yesterday’s?) web analytics vendors (HEY IBM, ADOBE, WEBTRENDS) should consider a drastic shift in data collection methodologies, or a drastic shift in business models (ahem… SERVICES). They’re in a dying game, as far as I (and SnowPlow/Heap) are concerned. SnowPlow was kind enough to publish this SlideShare deck yesterday to help me with my point.

At the same time… what these vendors address today is only part of what I was insinuating with my last post on “track everything”.

For data science (this may seem obvious)… you need data. What may not seem as obvious is that the data needs to be, to the extent possible, unbiased, disaggregated, and collected with abandon. That is not how today’s web analytics implementations are done. First off, some solutions don’t offer disaggregated access to the data at all. Beyond that, almost all of today’s (yesterday’s!) solutions produce data that is biased and collected selectively rather than with abandon.

I want to know how a visitor interacted with a page. I don’t want to have to define which components I want to track, and what to call them, as part of the data collection process.

I want to know everything about the page at the point in time that the data was collected. I don’t want to have to reference a CMS or any other system or person to figure out what content was where.

I want to be able to report out at the page component level (those who saw this image were more likely to…)… without pre-defining what components to track.

Referring URL? I want to know everything about that page at the point in time that the data was collected (to the extent possible).

Did a twonversation or other social conversation occur based on this URL? Initiated from this URL? Either way, bring the data.
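A minimal sketch of what that wish list implies, in hypothetical JavaScript (none of these names are any vendor’s actual API): one delegated listener that describes whatever was interacted with at the moment it happened, instead of per-component tags defined up front.

```javascript
// Build an event record from whatever the DOM can say about an element
// at collection time. No predefined component names, no tagging plan.
function describeInteraction(el, context) {
  return {
    type: 'click',
    tag: el.tagName,                          // what kind of element it was
    id: el.id || null,                        // whatever identifiers exist
    classes: el.className || null,
    text: (el.textContent || '').slice(0, 80), // visible content at that moment
    page: context.url,
    referrer: context.referrer || null,        // what we know about the source
  };
}

// In a browser, one listener in the capture phase replaces per-component tags:
// document.addEventListener('click', (e) => {
//   send(describeInteraction(e.target, {
//     url: location.href,
//     referrer: document.referrer,
//   }));
// }, true); // capture phase: every click, not just the instrumented ones
```

The point is the shape of the record: the collector does not decide which components matter or what to call them; it records what the page can say about itself, and the sorting-out happens after the fact.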

Earlier this evening I stated that “our world is not ready” (for this stuff). What I meant by that statement is that our digital analytics technology and expertise are still centered around traditional web analytics, and that suffices for an overwhelming majority of those who care. That said, the world moves fast. Data scientists want this unbiased, disaggregated, mass-collected data. They want to understand the data, but they don’t want someone else to predetermine that understanding and bake it into said data.

As “quickly” as the web analytics world moved from log files to page tags, I expect we will see a migration from predefined implementations to big data implementations.
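A hedged sketch of what that migration changes for the analyst (hypothetical event records, not any vendor’s schema): with everything captured, questions get answered at query time by filtering the raw stream, rather than at collection time by predefining what to tag.

```javascript
// Hypothetical raw events, captured without any predefined tracking plan.
const raw = [
  { type: 'click', tag: 'IMG', id: 'hero' },
  { type: 'click', tag: 'A', id: 'nav-home' },
  { type: 'scroll', tag: null, id: null },
];

// Keep everything at collection time; let the analyst narrow afterward.
function filterOut(events, exclude) {
  return events.filter((ev) => !exclude(ev));
}

// A question nobody anticipated at implementation time: image clicks only.
const imageClicks = filterOut(raw, (ev) => ev.type !== 'click' || ev.tag !== 'IMG');
```

Under a predefined implementation, that question can only be answered if someone tagged the image months ago; here it is just one more filter over data that was already there.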


EDIT: I forgot to credit a couple folks for the conversation/writings that led to part 2 here! Allison Hartsoe / @ahartsoe pointed out Heap in this post. Peter O’Neill pointed out SnowPlow and Celebrus — who didn’t even get a mention in my speedy delivery last night:  http://www.celebrus.com/



Thanks for stopping by. This "web/digital analytics implementation death" thing became a series. There were four posts:


This will continue to be a theme of my analytics hygiene posts, and presentations at eMetrics Chicago and... (Boston?)... and...?
  • jacqueswarren

    It seems to me that this is what iJento has already been doing for some time.

    • toddbdac

      I most certainly will be taking a look — thanks Jacques!

  • Simon Burton

    Hi Todd,
    I am the CEO here at Celebrus Technologies. If you’d like to take a look at a system that can simply and easily collect detailed, individual level interaction data from all digital channels, please send me an email and I will organise a WebEx to show you how it’s done. I can be reached at simon.burton@celebrus.com
    Best regards

    • toddbdac

      Hi Simon — if you have a lull in the upcoming weeks or months, I would love to take you up on a closer look. I’ll reach out via email!

  • Simon Burton

    Sorry Jacques, I disagree with you. If you speak to John Woods, the CTO and founder at iJento, he would be the first to say they are unable to collect the level of data that Celebrus is capable of. We have huge respect for the iJento guys; however, their business model is to improve upon the ability of, say, Adobe to segment and profile individuals from their “tagging”-derived data. Celebrus is about delivering a tagging-free system capable of capturing a complete audit trail of a user’s interactions with a website or web-based application, then making that available in real time to our technology partner systems (such as Teradata, Oracle’s Enterprise Data Warehouse and Real-time Decisioning systems, or SAS’s Campaign Management, BI & Predictive Analytics systems) to drive “customer” analytics from the web or omni-channel, or true 1:1 personalisation.

  • http://tim.webanalyticsdemystified.com/ Tim Wilson

    There is a lot that could be teased apart about these posts. I’m not 100% sold that this is the ideal future. But, I’m not sold that it isn’t, either. Adam Greco made a pretty keen observation after attending a conference earlier this year: “Most of the speakers were from agencies, and they all talked about cross-channel data and data integration. Yet, I spend my days with large companies that are still struggling with actually getting good and actionable data for their web site — actually nailing the fundamentals.” That totally resonated with me. I’ve spent a good chunk of my career in the reality that sooo many marketers struggle to: 1) clearly articulate what it is they’re actually trying to accomplish (a level down from “deliver ROI”…which is the “right” goal…but not all that useful without some meat behind it), and 2) (related) tackle low-ish hanging fruit by using data to answer questions that would lead to action.

    Analysts are complicit in this. As an industry, we don’t do a good job of helping marketers get over those two humps. And that’s bad.

    Coming from another direction, all analytics-related vendors — web analytics, social analytics, BI, tag management, cross-channel data integration, attribution management — make their numbers each quarter by touting that the biggest challenge is “getting the data,” and then data scientists armed with SPSS/SAS/R/you-name-it will be able to “mine for insights.” In my experience, which is heavy digital marketing (not advanced customer analytics where there is a vast amount of detailed customer data in a warehouse), “get all the data first and figure out what to do with it later” is more likely to lead to confusion and frustration than to a smoothly clicking data-driven organization (see the two struggles noted above).

    In the end, it’s not an “either/or” scenario. It’s more a sequencing issue: we can’t chase everything at once, and we have to provide near-term value. So, how do we balance people (training, hiring), process (for doing meaningful analysis), and technology (for capturing and crunching data)? All three are required, and the value delivered will be no more than the least mature of these three.

    • toddbdac

      Thanks for weighing in, Tim… there is much to talk about, for sure.

      Fundamentals are getting the job done for those who are doing it right, and they would be getting the job done for others. That will be the case for years. This direction will provide an additional layer of data above and beyond what we even think about capturing today, and it will remove collection bias… this creates a more inviting arena for data scientists, where they can better trust the data.

      I think of the solution as modularized. A base module covers your standard web stats… no need to “mine for insights” if you’re looking for standard ABC (thanks, Google) reporting.

      You are so right on this not being either/or. You can’t just collect everything and make sense of it with data science. In this magical future, some of today’s implementation engineers might find themselves working with data scientists who need context. Others may find themselves working with web teams, ensuring that sites are properly implemented so that the automated collection makes as much sense as possible.

      If history has anything to say about it, there will then also be a contingent of web analytics implementation engineers who are still doing JavaScript implementations in 10 years.

    • cleveyoung

      I’ve often been guilty of wanting to track much more than was requested. The urge was to track as much as possible and then slice and dice that data into fantastic insights. The reality usually ended with me having access to much more data than I ever had the opportunity to use; the barrier to heavy analysis wasn’t technical, but rather a marketing/business one. The marketing and/or business people simply had X amount of time to listen, grasp, build strategy, and execute on the golden nuggets of information I would occasionally provide.

      As Tim says, this is not an either/or scenario. It’s inevitable that we will continue to track more and more data. In fact, more data than 99% of companies are anywhere near ready to utilize. What we need to be careful of is asking for investments in technologies and resources to gather “it all”, which will not be cheap, and then not being able to deliver an acceptable amount of return on that investment. We will have the data, and maybe even a data scientist or two, but all of the great information and insights in the world won’t mean much if the marketing and business people are not ready or able to make use of it. Until we get better at utilizing the data we have now, then for most companies it will be a waste of money and resources to start acquiring even more data (which was probably Adam’s basic point too).

  • Matt Gershoff

    IMHO, too much focus on data collection perhaps, and certainly too much emphasis on web sites.
    1) The real constraint is not data collection. What is binding is that processes tend to lack the capacity to be responsive/adaptive in the face of information about their environments/context. Data collection is all about the quality, coverage and resolution of the system’s sensors – but sensors are useless if the system has no actuators, i.e. no ability to choose from competing actions. We want to start asking about the expected marginal value of data (or better information) with respect to the actions we can take.
    2) Any future system should have the ability to span tech platforms – so webpage, mobile app, call center IVR and agent scripts, email systems – whatever. You want your CMS to ‘know’ about the data – that way it can start making better decisions about content, layout etc.

    • toddbdac

      Thanks for the comments, Matt! The spirit of openness covered in #2 is important, and perhaps is another warning shot for the enterprise vendors who preach consolidation while aiming for the ultimate suite of features. The consolidation is a myth, and the #1 feature is cross-compatibility.

      I think we share the opinion that what folks are most bound on is the capacity to be responsive/adaptive, but at the time of writing this my mind was purely on collection. Web shines through because that’s where I’ve been for the last fifteen years or so, but I do mean for this series to apply to data collection with regard to applications in general. I feel like data science loses when developers or marketers decide what gets collected and how.

  • http://adam.webanalyticsdemystified.com Adam Greco

    Todd – This is an interesting discussion. As Tim mentions below, my experience has been a bit different, with companies collecting too much data and overwhelming their end-users. If you are talking about a core team of web analysts as your audience, it may be feasible to collect “everything” and classify/make sense of it later, but for day-to-day marketers, my experience shows that you need to present a finite set of data that answers the specific questions they have. This is especially the case when doing web analytics on a global scale, where you need to enforce some sort of data standards to be able to compare different brands or different regions. I have seen that rolling out ten data points for a large global organization can be a challenge, and I would not like to endeavor to roll out hundreds of random data points to thousands of users and hope for the best.

    One other point to make. While I agree that there are some new exciting developments and vendors that allow you to collect everything, it strikes me that vendors such as Tealeaf have been doing this for over a decade. While Tealeaf is a super-powerful tool, I have not seen it used as a daily “web analytics” tool by many companies. It is very complementary to web analytics tools, due to its ability to replay sessions at a detailed level, but how many companies do you know that are saying “we won’t use GA or Adobe or Webtrends for web analytics because we prefer to just give our users access to Tealeaf since it collects every piece of data…”? I have not seen that to be the case, and this seems to be a test case for what you are proposing as the future state. Many companies will feed click data into their internal data warehouses, but I think that is different than just having all “events” at your disposal. Perhaps as marketers become more digitally and analytically savvy, it will happen as you suggest, but so far the last ten years or so that I have been in the game have gone in the complete opposite direction. Why do clients abandon Adobe and go to GA or GA Premium? Whether it is true or not, the reason I hear is that Adobe is “too complicated” for them and they like the simplicity of GA and its easy-to-understand interface. If Adobe is too complex for folks, how will they deal with reams of unstructured data with little to no pre-defined interface?

    In summary, I don’t disagree with you that “collect everything” is enticing and may in fact be where the industry heads, but in my experience I have not seen large global brands exhibit this behavior to date… I also think it depends on whether the audience for this data is a core analytics team or the hundreds of end-users that need a little bit of data each day to do their job. Eric Peterson discussed this a bit over four years ago in this post: http://blog.webanalyticsdemystified.com/weblog/2010/02/the-coming-bifurcation-in-web-analytics-tools.html

    • toddbdac

      Hi Adam! Thank you for joining – I appreciate the insights coming from you. I definitely agree with regards to what folks can, should, and do do with the data that is collected. In most cases, there is a pretty finite set of needs. It’s a stretch for many to handle analytics in the first place!

      In a response to a comment from Michele earlier this week, I conceded that over 50% of organizations are going to be fine with whatever they are given by Google (though I’ve also predicted that Google will offer this methodology within the next two years!), and as you are getting at… another good percentage are going to be fine with whatever they get from the legacy enterprise vendors. I can’t comment on TeaLeaf, but I plan to take a deeper look at some of the alternatives this year so that I can speak more intelligently on the options.

      Eric’s post gets to the point that organizations are set up differently (and, after someone else pointed me back to it, I even included it in part three of this same series! http://www.tbdac.com/death-web-analytics-implementation-part-3/). It’s probably a three-armed branch we’re looking at, where some software vendors will be able to position themselves well by molding themselves to fit around the needs of the most popular formats (1: traditional web analytics w/some customer analytics; 2: business intelligence / data science; 3: traditional + BI/DS).

      This three-armed branch, and all of the vendors becoming available to support it, is why I, in the end, claimed 2014 as the year of expansion for vendors. Not that we will see more vendors, but that it will be recognized that they exist (and I’m sure we’ll see more, too!)… There will be no one way to do analytics. I do think we will get to the point where there is a shift in the most popular way to capture data in web analytics… and it will be by filtering data out rather than selecting data in.

      “Filtering data out rather than selecting data in” is probably the most succinct explanation of how I want things to be done, as “capture everything” does end up sounding like a mountain of data that one must wade through.