Wednesday, April 13, 2016

Multi-Channel Attribution and Understanding Interaction

I'm no cosmologist, but this post is going to rely on a concept well known to astrophysicists, who often have something in common with today's marketers (as much as they might be loath to admit it). So what is it that links marketing analytics to one of the coolest and most 'pure' sciences known to man?

I'll give you a hint: it has to do with such awesome topics as black holes, distant planets, and dark matter.

The answer? It has to do with measuring the impact of things that we can't actually see directly, but that still make their presence felt. This is common practice for scientists who study the universe, and yet not nearly common enough among marketers and people who evaluate media spend and results. Like physics, marketing analytics has progressed in stages, but we have the advantage of coming into a much more mature field, and thus of avoiding the mistakes of earlier times.

Marketing analytics over the years, and the assumptions each stage created:

  • Overall Business Results (i.e. revenue) : if good, marketing is working!
  • Reach/Audience Measures (i.e. GRPs/TRPs) : more eyeballs = better marketing!
  • Last-click Attribution (i.e. click conversions) : put more money into paid search!
  • Path-based Attribution (i.e. weighted conversions) : I can track a linear path to purchase!
  • Model-based Attribution (i.e. beta coefficients) : marketing is a complex web of influences!

So what does this last one mean, and how does it relate to space? When trying to find objects in the distant regions of the cosmos, scientists often rely on indirect means of locating and measuring their targets, because they can't be observed directly. For instance, we can't see planets orbiting distant stars even with our best telescopes. However, based on things like the bend in the light emitted from a star and the composition of gases detected, we can 'know' that there is a planet in orbit, of a certain size and density, that is affecting the measurements we would expect to get from that star in the absence of such a planet. Similarly, we don't see black holes, but we can detect the characteristic radiation signature created when gases under a black hole's immense gravitational force give off x-rays.

This is basically what a good media mix/attribution model is attempting to do, and it's why regression models can work so well. You are trying to isolate the effect of a particular marketing channel or effort, not in a vacuum, but in the overall context of the consumer environment. The first white papers I remember seeing on this were mainly about measuring brand lift due to exposure to TV or display ads, but those were usually simple linear regression problems connecting a single predictor variable to a response, or chi-square style hypothesis tests. Outside of a controlled experiment, that approach simply won't give you an accurate picture of your marketing ecosystem, one that takes into account the whole customer journey.

As a marketer, you've surely been asked at some point, "What's the ROI of x channel?" or "How many sales did x advertisement drive?" And perhaps, once upon a time, you would have been content to pull a quick conversion number out of your web analytics platform and call it a day. However, any company that does things this way isn't just going to get a completely incorrect (and therefore useless) answer; it isn't really even asking the right question.

Modern marketing models tell us that channels can't be evaluated in isolation; the best you can do is estimate, reasonably accurately, a specific channel's contribution to overall marketing outcomes in a particular holistic context.

Why does that last part matter? Because even if you can build a great model out of clean data that is highly predictive, all of the 'contribution' measuring that you are doing is dependent on the other variables.

So, for example, if you determine that PPC is responsible for 15% of all conversions, Facebook for 9%, and email for 6%, and then back into an ROI value based on the cost of each channel and the value of the conversions, you still have to be very careful with what you do with that information. The nature of many common methods for predictive modeling is such that if your boss says, "Well, based on your model PPC has the best ROI and Facebook has the worst, so take the Facebook budget and put it into PPC," you have no reason to think that your results will improve, or that they will change in the way you assume.

Why not? Because hidden interactivity between channels is built into the models: some of the value that PPC is providing in your initial model (as well as any error term) is based on the levels of Facebook activity that were measured during your sample period.
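To make that concrete, here is a minimal sketch in R, using simulated weekly data and hypothetical column names (nothing from a real model), of how the 'value' of PPC shifts once you let it interact with Facebook:

```r
# Minimal sketch with simulated weekly data; all names and effect sizes are invented
set.seed(42)
weeks <- 104
media <- data.frame(
  ppc_spend   = runif(weeks, 5000, 15000),
  fb_spend    = runif(weeks, 2000, 8000),
  email_sends = runif(weeks, 10000, 50000)
)
media$conversions <- with(media,
  50 + 0.010 * ppc_spend + 0.008 * fb_spend + 0.0005 * email_sends +
  0.000001 * ppc_spend * fb_spend +   # hidden PPC x Facebook interaction
  rnorm(weeks, sd = 25))

# Main effects only: PPC's coefficient quietly absorbs part of the interaction
main_only <- lm(conversions ~ ppc_spend + fb_spend + email_sends, data = media)

# With an explicit interaction term, PPC's effect depends on the Facebook level
with_int  <- lm(conversions ~ ppc_spend * fb_spend + email_sends, data = media)

summary(main_only)$coefficients
summary(with_int)$coefficients
```

In the second model, the marginal effect of a PPC dollar is the ppc_spend coefficient plus the interaction coefficient times the current Facebook level, which is exactly why "PPC's ROI" is only meaningful at the channel mix you actually observed.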

It's a subtle distinction, but an important one. If you truly want to have an accurate understanding of the real world that your marketing takes place in, be ready to do a few things:
  1. Ask slightly different questions; look at overall marketing ROI with the current channel mix, and how each channel contributes, taking into account interaction
  2. Use that information to make incremental changes to your budget allocations and marketing strategies, while continuously updating your models to make sure they still predict out-of-sample data accurately
  3. If you are testing something across channels or running a new campaign, try adding it as a binary categorical variable to your model, or a split in your decision tree
Just remember, ROI is a top-level metric, and shouldn't necessarily be applied at the channel level the way people are used to. Say this to your boss: "The marketing ROI, given our current/recent marketing mix, is xxxxxxx, with relative attribution between the channels being yyyyyyy. Knowing that, I would recommend increasing/decreasing investment in channel (variable) A for a few weeks, which according to the model would increase conversions by Z, and then seeing if that prediction is accurate." Re-run the model, check assumptions, rinse, repeat.
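As a sketch of how that conversation might be backed up (continuing the hypothetical model from the earlier snippet; the 10% bump is arbitrary):

```r
# Scenario: increase PPC spend 10% for a few weeks, hold everything else at
# observed levels, and ask the fitted model for the implied lift ("Z" above)
scenario <- media
scenario$ppc_spend <- scenario$ppc_spend * 1.10

baseline_hat <- predict(with_int, newdata = media)
scenario_hat <- predict(with_int, newdata = scenario)

sum(scenario_hat - baseline_hat)   # model-implied incremental conversions
```

Then you make the incremental change, compare the actuals to that prediction, and refit.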


Friday, April 8, 2016

Own Your Data (Or at least house a copy)

There is a common trope among data analysts that 80% of your time is spent collecting, cleaning, and organizing your data, and just 20% is spent on analyzing or modeling. While the exact numbers may vary, if you work with data much you have probably heard something like that, and found it to be more or less true, if exaggerated. As society has moved to collecting exponentially more information, we have of course seen a proliferation in the types, formats, and structures of that information, and thus in the technologies we use to store it. Moreover, very often you want to build a model that incorporates data someone else is holding, according to their own methods, which may or may not mesh well with your own.

For something as seemingly simple as a marketing channel attribution model, you might be starting with a flat file containing upwards of 50-100 variables, pulled from 10+ sources. I recently went through this process to update two years' worth of weekly data, and it no joke took days and days of data prep just to get a CSV that could be imported into R or SAS for the actual analysis and modeling steps. Facebook, Twitter, AdWords, Bing, LinkedIn, YouTube, Google Analytics, a whole host of display providers... the list goes on and on. All of them using different date formats, calling similar or identical variables by different names, limiting data exports to 30 or 90 days at a time, and so on.
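For a flavor of what that prep looks like, here is a rough base-R sketch of the kind of per-source normalization each export needed; the file names, column names, and date formats below are stand-ins, not the real exports:

```r
# Each source gets read, its date parsed with its own format, and its spend
# column renamed, so everything can be stacked into one weekly table
normalize_export <- function(path, date_col, date_format, spend_col, source_name) {
  df <- read.csv(path, check.names = FALSE)
  data.frame(
    date   = as.Date(df[[date_col]], format = date_format),
    source = source_name,
    spend  = as.numeric(df[[spend_col]])
  )
}

all_spend <- rbind(
  normalize_export("adwords.csv",  "Day",        "%Y-%m-%d", "Cost",         "adwords"),
  normalize_export("facebook.csv", "Date start", "%m/%d/%Y", "Amount spent", "facebook")
)

all_spend$week <- cut(all_spend$date, "week")
weekly <- aggregate(spend ~ week + source, data = all_spend, FUN = sum)
```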

Obviously, it was worth the effort for a big one-time study, but what about actually building a model for production? What about wanting to update your data set periodically and make sure the coefficients haven't changed too much? When dealing with in-house data (customer behavior, revenue forecasting, lead scoring, etc.) we often get spoiled by our databases, because we can just bang out a SQL query to return whatever information we want, in whatever shape we want. Plus, most tools like Tableau or R will plug right into a database, so you don't even have to transfer files manually.
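That convenience is the whole argument: once the data lives somewhere query-able, pulling a modeling-ready table is one call from R. A minimal sketch, assuming a Postgres-compatible warehouse and hypothetical connection details and table names:

```r
# Sketch only: host, database, table, and column names are all hypothetical
library(DBI)
library(RPostgres)

con <- dbConnect(RPostgres::Postgres(),
                 host     = "warehouse.example.com",
                 dbname   = "analytics",
                 user     = Sys.getenv("DB_USER"),
                 password = Sys.getenv("DB_PASS"))

weekly <- dbGetQuery(con, "
  SELECT date_trunc('week', event_date) AS week,
         channel,
         SUM(spend)       AS spend,
         SUM(conversions) AS conversions
  FROM   marketing_daily
  GROUP  BY 1, 2
  ORDER  BY 1, 2
")

dbDisconnect(con)
```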

At the end of the day, it quickly became apparent to me that having elements of our data, from CRM to social to advertising, live in an environment that I can't query or code against is just not compatible with solving the kinds of problems we want to solve. So of course, the next call I made was to our superstar data pipeline architect, an all-around genius who was building services for the dev team running off our many AWS instances. I asked him to start thinking about how we should implement a data warehouse and connections to all of these sources if I could hook up the APIs, and he of course said he had already not only thought of it, but started building it. Turns out, he had a Redshift database up and was running Hadoop MapReduce jobs to populate it from our internal MongoDB!
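For the third-party sources, the shape of each nightly job is roughly the same: hit the reporting API, flatten the JSON, and append it to the warehouse. A hedged sketch follows; the endpoint, token, and table names are all made up, and every vendor's actual API looks different:

```r
# Hypothetical nightly pull: none of these endpoints or names are real
library(httr)
library(jsonlite)
library(DBI)

resp <- GET("https://api.example-adplatform.com/v1/report",
            query = list(start = as.character(Sys.Date() - 1),
                         end   = as.character(Sys.Date() - 1)),
            add_headers(Authorization = paste("Bearer", Sys.getenv("AD_API_TOKEN"))))
stop_for_status(resp)

daily <- fromJSON(content(resp, as = "text", encoding = "UTF-8"))$rows

con <- dbConnect(RPostgres::Postgres(),
                 host = "warehouse.example.com", dbname = "analytics",
                 user = Sys.getenv("DB_USER"), password = Sys.getenv("DB_PASS"))
dbWriteTable(con, "ad_platform_daily", daily, append = TRUE)
dbDisconnect(con)
```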

So with that box checked, I started listing out the API calls we would want and all of the fields we should pull in, and figuring out the hookups to the third-party access points. Of course, since we have an agency partner for a lot of our paid media, that became the biggest remaining roadblock to my data heaven. I scheduled a call with the rep in charge of their display and programmatic trade desk unit, just so we could chat about the best way to hook up and siphon off all of our daily ad traffic data from their proprietary system. After some back and forth, we finally arrived at a mostly satisfying strategy (with a few gaps due to how they calculate costs and the potential of exposing other clients' data to us), but here is the kicker:

As we were trying to figure this out, he said that we were the first client to even ask about this.

I was so worried about being late to the game that it didn't even occur to me that we would have to blaze this trail for them.

The takeaway? In an age of virtually limitless cheap cloud storage, and DevOps tools to automate API calls and database jobs, there is no reason that data analysts shouldn't have consistent access to a large, refreshing data lake (pun fully intended). The old model created a problem where we spend too much time gathering and pre-processing data, but the same technological advances that threaten to compound the problem can also solve it. JSON, SQL, unstructured, and every other kind of data can live together, extracted, blended, and loaded by Hadoop jobs into a temporary cloud instance as needed.

The old 80/20 time model existed, and exists, because doing the right thing is harder, and takes more up-front work, but I'm pretty excited to take this journey and see how much time it saves over the long run.

(Famous last words before a 6 month project that ultimately fails to deliver on expectations; hope springs eternal)

What do you think? Have you tried to pull outside data into your own warehouse structure? Already solved this issue, or run into problems along the way? Share your experience in the comments!

Thursday, March 17, 2016

A Data-Driven Approach to the Marketing & Sales Funnel: Part 1

The goal of this series of posts is to provide value to the conversation around measuring marketing efforts and their contribution to the overall business goals of a company, and to make recommendations about how analytics and marketing can best work together to drive those goals. This will entail a bit of background on different marketing metrics, what their strengths and weaknesses are, and how they are commonly used, but is not meant to be a fully exhaustive list. Some points will draw on industry best practices, some will be technology based, and some will be statistical, but in an effort to be broadly useful, overly detailed or complex issues will be in separate posts.

[Figure: probability tree diagram (https://upload.wikimedia.org/wikipedia/commons/thumb/9/9c/Probability_tree_diagram.svg/220px-Probability_tree_diagram.svg.png)]
Thought Process:
The first thing to do, from a metrics/statistical standpoint, is to stop thinking of customers and purchases in the traditional ways. Essentially, every entity, whether an individual or a company, that has a need for our product (i.e., the total IT training market) has what I will term a ‘potential for conversion event,’ which conveniently shares its initials with ‘probability of conversion event,’ or PCE.

(Thinking this way allows for some really beneficial logical progressions, one of the biggest being that there is no distinction to be made between new and renewal customers, or monthly and annual customers, from a qualitative standpoint, but more on that in a moment)

Every PCE or cluster of PCEs is just that: a business opportunity that has some chance of experiencing a ‘conversion event,’ which just means a transaction of some kind. For this endeavour, we are going to consider three different levels of contact with PCEs, the culmination of which is a ‘conversion event,’ which can happen an unlimited number of times.

  • ‘Conversion Event’ - a transaction of some sort, regardless of amount or type
    • ex., trial sign up, first purchase, annual renewal, adding licenses, monthly auto-renew, etc.
  • ‘Event’ - a deliberate, non-transactional interaction initiated by a PCE or involving two-way communication
    • ex., email sign up, webinar sign up, phone call, chat, social follow/conversation
  • ‘Exposure’ - any interaction with the brand that is non-individualized or falls short of the other interactions
    • ex., see a display ad or Facebook status, read a blog post, visit the website, watch a YouTube video, etc.

The reality is that every purchase or conversion event is the cumulative result of a complex and infinitely mutable series of micro-interactions. As a result, we have to recognize that no conversion can be attributed to a single interaction, and that trying to do so will lead to erroneous assumptions. So the key to all measurement will be in the ‘P’ of PCE.

The goal in measurement will be, quite simply, to estimate as accurately as possible the effect of each interaction on the probability of an entity’s next conversion event, whether it is the first or tenth.
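One straightforward way to operationalize that goal, as a sketch only (the touchpoint counts below are simulated and every variable name is invented, not our actual data), is a logistic regression of 'had a next conversion event' on each PCE's interaction history:

```r
# Minimal sketch: logistic regression of next-conversion on touchpoint counts
set.seed(1)
n <- 5000
pce <- data.frame(
  emails_opened = rpois(n, 2),
  webinars      = rbinom(n, 1, 0.1),
  blog_visits   = rpois(n, 3),
  ad_exposures  = rpois(n, 8)
)

# Simulate a 'next conversion event' outcome purely for illustration
logit <- -4 + 0.3 * pce$emails_opened + 1.1 * pce$webinars +
         0.15 * pce$blog_visits + 0.02 * pce$ad_exposures
pce$converted <- rbinom(n, 1, plogis(logit))

fit <- glm(converted ~ emails_opened + webinars + blog_visits + ad_exposures,
           data = pce, family = binomial)

# Each coefficient estimates how one more of that touch shifts the log-odds of
# the next conversion event; plogis(predict(fit)) gives each PCE's estimated P
summary(fit)$coefficients
head(plogis(predict(fit)))
```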

So, to look at what all this means, let’s go back to our initial definition of PCE and examine the ‘real world’ implications of it.

Imagine that there are 10,000,000 individuals/companies in the world who need IT training:

  • The probability that someone who has no awareness of our brand will have a conversion event is 0%. You have to know about us to purchase
  • The probability that someone who knows that we exist, but nothing else, will have a conversion event is, say, .00001%
  • The probability that someone who knows us, follows us on Twitter, and has read a few blog posts will have a conversion event is .001%
  • The probability that someone who has looked around the site, watched our YouTube videos, and discussed us on Reddit will have a conversion event is 1% (trial)
  • The probability that someone who has signed up for a free trial (experienced first conversion event) will have another (become a paying customer) is 20%
  • The probability that someone who has been on the site, gets our emails, did a free trial, and chats in to ask about subscription options will have a conversion event is 65%
  • The probability that someone who has had 7 users on annual licenses for the past three years, who combined watch an average of 16 hours of video per month, uses coaching, and has had two phone calls from a sales person about their account expiration date will have a conversion event is 85%

Now, these are example values (though some of them are not far from real), but the point is that the only number that truly matters, from a revenue standpoint, is the probability of their next conversion event, which, combined with the size of the opportunity, can give us the expected value of an event.
  • i.e., a monthly customer in his/her second month, based on x interaction data, has a 78% chance of a $299 conversion event 20 days from now, with an expected conversion value of $233.22
  • Of course, we can take this further. Expected value allows us to chain events this way:
  • for monthly customers, we know that they, as a whole, renew at some constant rate, with each renewal being a conversion event; this is a conditional probability (chances of X, given that Y)
    • Looking at everyone who buys a monthly then, we start with a total conditional probability estimate that looks like this:



So we would start by saying that the average lifetime value of someone who purchases a monthly subscription is a function of the summed probabilities of each number of purchases in the chart above, times $299
  • i.e. in the first column, we see that the probability of one purchase of a monthly is 1.0 or 100% (which makes sense, since our condition is the purchase of at least one monthly)
  • then the lifetime value would look like this:
    • (1 x $299) + (.8 x $299) + (.7 x $299) … (.025 x $299) = $1422.68
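In code, that chain is just the per-purchase-count probabilities multiplied by the subscription price and summed. A sketch follows, reading the chart's columns as the probability of reaching at least that many monthly purchases; only the first few values and the last appear in the text, so the middle values below are placeholders and the total will not exactly match the $1422.68 figure:

```r
# Chained conditional probabilities, times the monthly price
# First, second, third, and last values come from the text; the rest are placeholders
price        <- 299
p_at_least_k <- c(1.0, 0.8, 0.7, 0.55, 0.45, 0.35, 0.25, 0.15, 0.075, 0.025)

ltv <- sum(p_at_least_k * price)
ltv   # expected lifetime value of a monthly subscriber under these placeholder values
```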

That’s great, but it lacks a lot of detail and context, and moreover it is looking at the lifetime picture, when what we care about is the next conversion probability. However, we can still use a part of that in our model. For instance, we would take as our subset [PCEs who have already renewed twice], who we know in aggregate will make the next purchase 60% of the time.
  • What we want to know, however, is of those PCEs with two previous renewals, what if they had been exposed to a marketing campaign? How much does that affect the probability of their next purchase? Does it go up to 62%? 70%?
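A quick sketch of how that question could be answered once exposure data is in hand, with completely invented counts, is a simple two-proportion comparison between exposed and unexposed PCEs within the two-renewal subset:

```r
# Hypothetical counts: did campaign exposure lift the next-renewal rate above 60%?
renewed <- c(exposed = 1260, not_exposed = 2400)
totals  <- c(exposed = 1800, not_exposed = 4000)

renewed / totals            # observed rates, e.g. 70% vs 60% with these made-up numbers
prop.test(renewed, totals)  # is the difference more than noise?
```

In practice you would want this inside a model that controls for the other touches, but the framing is the same: estimate the change in P, not a binary credit assignment.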

Note: This model/way of thinking is not limited to marketing, but should be considered for all means of contact between a brand and a PCE (potential converter):
  • Talk to a sales rep? That will affect conversion probability
  • Received an automated system email based on behavioral triggers, or reminders? Yup
  • Use a feature of a product? Read a review on a blog or see a funny tweet from the brand? Yes, all of them

It’s also important to keep in mind that this isn’t a model that simply marches from a probability of 0 toward 1, with every touch a positive contribution increasing the likelihood of conversion. You can have inputs that add nothing, or even decrease the probability, which is exactly why we need the models. What models you use, how you select the variables, and what you do with the outputs is a whole other story, but that will be explained later.

For now, start asking yourself questions about how you might be putting your understanding of customers and their journeys into the wrong boxes. A dollar of renewal revenue is just as good as a dollar of new revenue, right? So all you care about is the probability of each one, and how much effort it takes to change those probabilities. How can you increase the likelihood of getting the outcomes you want? How do you even know what events and exposures lead to conversions? We will look into that in the next post.

Protest Politics: Part 2




This is not meant to be a retraction of, but perhaps an amendment to, my earlier post on the cancellation of a Donald Trump rally in Chicago (original post here). Additional information, and additional time, have caused my viewpoint on the matter to evolve a bit, and conversations with others have let me know that on certain points I have either not communicated my position well, or have simply been wrong and had to reconsider.

A big motivation for this further response is the kind of argument being made on the left, like this one, suggesting that liberals who have any criticism of the protesters and the events in Chicago are traitors. While I disagree with some of the characterizations made, or take issue with certain conclusions drawn, on the whole I find this line of reasoning really resonates with me, and I can certainly be sympathetic to the emotion behind it.

Before continuing, full disclosure: I am a straight white American male. Except for 'born rich' I have checked all of the privilege boxes, and if you think that it means that I should avoid participating in this conversation, or that my opinion should carry less weight, I can't fault you for that. I have never known explicit or systemic discrimination, I have never had to speak with a subaltern voice, and I do not daily confront the legacy of any particular oppression. I try to be aware of, and adjust for, the native entitlement that naturally comes from a position of privilege, but there it is. I certainly recognize that what people say is affected by who they are, and how that changes the impact of my words.

There is no doubt that some of my natural inclination to be overly zealous in defending free speech comes from the fact that I get to take my own for granted.

A part of me thinks that maybe there should be a way to silence Donald Trump, and that the country would be a better place without his hateful, divisive rhetoric. In fact, I'm fairly sure about the latter, if not the former. I think that there probably are a number of ways to limit the spread of his ideas, or at least to stop making it so easy for him to reach a mass audience, if we can't prevent him from speaking them in the first place. I think that the freedom to assemble and speak for the protesters is just as important as it is for the supporters of Mr. Trump, and I think that it's a good thing that so many people turned out in Chicago to show those supporters just how unwelcome Trump's views are to broad swaths of the citizenry that they otherwise don't interact with.

Let's be clear about a few things though: Donald Trump is NOT a threat to this country because he represents a potential reality in which he becomes president and abruptly installs a fascist dictatorship. There is literally no mechanism by which he could do that in our current system of government (hell, given the current political environment, it would be a surprise if he could get a lunch order through the legislature), and more importantly, he has not espoused a lot of views that suggest he would even want to. He IS a threat primarily because he has contributed to an environment where explicitly or implicitly racist and xenophobic viewpoints are considered to be part of the mainstream political discourse, rather than something to be ashamed of and kept to one's self.

I think that this is an area where those of us on the left have done everyone, including ourselves, a disservice in our use of language. Calling Trump and his supporters fascists is either intellectually lazy, or deliberate hyperbole to inflame people, and neither one is a good thing when it comes to actually solving problems. Can't we just call him a racist, and a stupid clown who has terrible ideas? This isn't post-WW1 Italy or Germany, and comparisons to Hitler just don't stand up to scrutiny. Trump is his own brand of terrible, but thankfully, he isn't going to be invading Canada or building gas chambers. Again, I don't think that there is any value in the man or his message, but let's not pretend that his narcissism and ignorance make him a mass murderer.

Additionally, as far as the 'he doesn't need/deserve the right to a platform because he already gets too much attention from the media' argument goes, it's either stupid or hypocritical. First of all, pick an argument and stick to it; don't just pull in whatever comes to mind. Media savvy and/or previous media coverage as a prerequisite measuring stick for freedom of speech is a non-starter, and well beside the point. Saying 'we already know what he stands for' is not a reason that he shouldn't have the ability to a) repeat that message, or b) change his message, as has already happened in this campaign cycle.

More importantly, however, we need to be realistic and acknowledge that every time the protests get so virulent as to disrupt an event (like Chicago, or the guy who rushed the stage in Ohio), it increases the attention given to Trump. I would have no idea he was even doing these events if it weren't for the protesters, so it doesn't make sense to blame the media for giving him all this attention when you are turning an otherwise non-newsworthy gathering of local bigots into a lead story. If Trump is a fire that threatens this country, attention is his oxygen. Protesters who make themselves into a news event are pouring gas on that fire when we should be smothering it, with either silence or shame; getting him more publicity does nothing to hurt him.

I think that's a big part of what this keeps coming back to for me: the idea that Trump and the counter-Trump protesters are both expressions of anger, but that the form the expression of that anger takes makes a difference. It might be cathartic, or even empowering, to try and shut down the Trump machine through violent protest, and I can see how that kind of power can feel really good to people who have had to struggle a great deal. But don't conflate it with an effective means of defeating the ideology of Trump and his followers. I've dealt with assholes, and I've wanted to punch them in their dumb faces, and a very few times in my life, I've taken the opportunity to do it. It feels great (unless it's followed by them punching me in my dumb face), but that doesn't mean it was the right thing to do, or that it did anything long-term to modify my opponent's behavior.

...But maybe it did? If all of the above seems like criticism that is overly harsh, I can see the flip side of this, as well. If the problem is that Trump's campaign has normalized harmful speech by legitimizing it as reasoned political discourse, then a potential remedy is to shame the people that show up to listen to it. I think that's a positive outcome of a scene like the one in Chicago, in that it forced a bunch of (likely) suburban white people to confront just how upset Trump's rhetoric makes a big chunk of the American public. It's important to remind these people that whatever the source of their frustrations are, in 21st century America bigotry is something that should not and will not be acceptable in any kind of social setting. This is exactly the point of this type of protest and demonstration, even if it could have achieved that goal without becoming violent (though I can accept the argument that it wouldn't have gotten covered otherwise, the old 'Trump paradox').

It also seems like the ugliness of the eventual outcome in Chicago has helped prod the media to start condemning Trump's complicity in the violence a bit more, which is only a good thing. The Marco Rubio video that has been spreading over the last 24 hours, in which he goes on at length criticizing Trump and what his rise says about the electorate and the democratic process in this country, exemplifies how people are starting to take off the kid gloves with which, for whatever reason, they have treated Trump to this point. I agree with the people calling Trump a coward for not showing up to an event because of protests, but I wish that viewpoint had gained more traction against a candidate billing himself as a tough guy.

Maybe all of this will also bring about a discussion on whether or not the idea of unfettered free speech is still something we choose to value as a society. It's possible that, just as we decided that yelling 'fire' in a crowded theater, or making direct, plausible threats against another person, is not protected, we will now decide that we need to be a little broader in our definition of what constitutes a public safety concern.

Donald Trump is definitely a monster, and his message should be defeated by (almost) any means necessary, but there is room for disagreement, even within the ranks of the left, on how to go about doing that. Certainly he and those who would attend his events should be forced to confront, in no uncertain terms, how hurtful and inappropriate their views are to the rest of the country. But make no mistake, such views are disappearing in this country despite Trump's best efforts, and nothing he can do will change that. If those of us who oppose him and his ilk allow ourselves to be internally divided over how best to defeat his message, we will have opened the only possible door to giving him actual power.