SEMiotics

Thursday, March 17, 2016

A Data-Driven Approach to the Marketing & Sales Funnel: Part 1

The goal of this series of posts is to provide value to the conversation around measuring marketing efforts and their contribution to the overall business goals of a company, and to make recommendations about how analytics and marketing can best work together to drive those goals. This will entail a bit of background on different marketing metrics, what their strengths and weaknesses are, and how they are commonly used, but is not meant to be a fully exhaustive list. Some points will draw on industry best practices, some will be technology based, and some will be statistical, but in an effort to be broadly useful, overly detailed or complex issues will be in separate posts.

Thought Process:

The first thing to try and do is (from a metrics/statistical standpoint) to stop thinking of customers and purchases in the traditional ways. Essentially, every entity, whether an individual or a company, that has a need for our product (i.e., the total IT training market) has what I will term a ‘potential for conversion event’ which is also conveniently the same initials as a ‘probability of conversion event,’ or a PCE.

(Thinking this way allows for some really beneficial logical progressions, one of the biggest being that there is no distinction to be made between new and renewal customers, or monthly and annual customers, from a qualitative standpoint, but more on that in a moment)

Every PCE or cluster of PCEs is just that: a business opportunity that has some chance of experiencing a ‘conversion event,’ which just means a transaction of some kind. For this endeavour, we are going to consider three different levels of contact with PCEs, the culmination of which is a ‘conversion event,’ which can happen unlimited times.

‘Conversion Event’ - a transaction of some sort, regardless of amount or type

ex., trial sign up, first purchase, annual renewal, adding licenses, monthly auto-renew, etc.

‘Event’ - a deliberate, non-transactional interaction initiated by a PCE or involving two-way communication

ex., email sign up, webinar sign up, phone call, chat, social follow/conversation

‘Exposure’ - any interaction with the brand that is non-individualized or falls short of the other interactions

ex., see a display ad or Facebook status, read a blog post, visit the website, watch a YouTube video, etc.

The reality is that every purchase or conversion event is the cumulative result of a complex and infinitely mutable series of micro-interactions. As a result, we have to recognize that no conversion can be attributed to a single interaction, and that trying to do so will lead to erroneous assumptions. So the key to all measurement will be in the ‘P’ of PCE.

The goal in measurement will be, quite simply, to estimate as accurately as possible the effect of each interaction on the probability of an entity’s next conversion event, whether it is the first or tenth.

So, to look at what all this means, let’s go back to our initial definition of PCE, and examine the ‘real world’ implications.

Imagine that there are 10,000,000 individuals/companies in the world who need IT training:

The probability that someone who has no awareness of our brand will have a conversion event is 0%. You have to know about us to purchase
The probability that someone who knows that we exist, but nothing else, will have a conversion event is, say, .00001%
The probability that someone who knows us, follows us on Twitter, and has read a few blog posts will have a conversion event is .001%
The probability that someone who has looked around the site, watched our YouTube videos, and discussed us on Reddit will have a conversion event is 1% (trial)
The probability that someone who has signed up for a free trial (experienced first conversion event) will have another (become a paying customer) is 20%
The probability that someone who has been on the site, gets our emails, did a free trial, and chats in to ask about subscription options will have a conversion event is 65%
The probability that someone who has had 7 users on annual licenses for the past three years, who combined watch an average of 16 hours of video per month, uses coaching, and has had two phone calls from a salesperson about their account expiration date will have a conversion event is 85%

Now, these are example values (though some of them are not far from real), but the point is that the only number that truly matters, from a revenue standpoint, is the probability of their next conversion event, which, combined with the size of the opportunity, can give us the expected value of an event.

e.g., a monthly customer in his/her second month, based on x interaction data, has a 78% chance of a $299 conversion event 20 days from now, with an expected conversion value of $233.22
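
The arithmetic behind that example is just probability times amount. A minimal Python sketch (the function name is my own, purely for illustration):

```python
def expected_conversion_value(probability: float, amount: float) -> float:
    """Expected value of one conversion event: P(event) * size of the event."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return probability * amount


# The example from the text: a 78% chance of a $299 conversion event.
value = expected_conversion_value(0.78, 299.00)
print(f"${value:.2f}")  # prints $233.22
```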

Of course, we can take this further. Expected value allows us to chain events this way:

for monthly customers, we know that they, as a whole, renew at some constant rate, with each renewal being a conversion event; this is a conditional probability (chances of X, given that Y)

Looking at everyone who buys a monthly then, we start with a total conditional probability estimate that looks like this:

So we would start by saying that the average lifetime value of someone who purchases a monthly subscription is a function of the summed probabilities of each number of purchases in the chart above, times $299.

i.e. in the first column, we see that the probability of one purchase of a monthly is 1.0 or 100% (which makes sense, since our condition is the purchase of at least one monthly)
then the lifetime value would look like this:

(1 x $299) + (.8 x $299) + (.7 x $299) … (.025 x $299) = $1422.68
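
As a rough sketch of that sum in Python: only the values 1.0, .8, .7, and .025 come from the line above. The rest of the list is made-up filler standing in for the elided terms, so the total here will not match the figure above. The second function shows how the same list gives the chance of the next purchase for someone who has already made a given number of them. Both function names are hypothetical.

```python
def lifetime_value(purchase_probabilities, price):
    """Sum of P(at least n purchases) * price over every n."""
    return sum(p * price for p in purchase_probabilities)


def next_purchase_probability(purchase_probabilities, purchases_so_far):
    """P(purchase n+1 | purchase n happened) = P(n+1) / P(n)."""
    current = purchase_probabilities[purchases_so_far - 1]
    following = purchase_probabilities[purchases_so_far]
    return following / current


# 1.0, 0.8, 0.7 and 0.025 are from the post; the middle values are placeholders.
example_probabilities = [1.0, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.025]

print(f"${lifetime_value(example_probabilities, 299):.2f}")
print(next_purchase_probability(example_probabilities, 1))
```

The conditional version is the one that matters for the 'next conversion' framing: it answers "given where this PCE is now, how likely is one more purchase?" rather than summarizing the whole lifetime.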

That’s great, but it lacks a lot of detail and context, and moreover it is looking at the lifetime picture, when what we care about is the next conversion probability. However, we can still use a part of that in our model. For instance, we would make our subset [PCEs who have already renewed twice], who we know in aggregate will make the next purchase 60% of the time.

What we want to know, however, is of those PCEs with two previous renewals, what if they had been exposed to a marketing campaign? How much does that affect the probability of their next purchase? Does it go up to 62%? 70%?
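
The simplest possible version of that question is a comparison of renewal rates between PCEs who saw the campaign and those who did not. Here is a minimal sketch with entirely hypothetical counts (only the 60% baseline comes from the text). It assumes exposure was effectively random; if it was not, a naive difference like this mixes the campaign's effect with whatever made those PCEs more likely to be exposed in the first place, which is part of why real models are needed.

```python
def conversion_rate(conversions, total):
    """Share of a group that had the conversion event."""
    if total <= 0:
        raise ValueError("total must be positive")
    return conversions / total


def uplift(exposed_conversions, exposed_total, control_conversions, control_total):
    """Difference in conversion rate between exposed and unexposed groups."""
    exposed_rate = conversion_rate(exposed_conversions, exposed_total)
    control_rate = conversion_rate(control_conversions, control_total)
    return exposed_rate - control_rate


# Hypothetical counts: 1,000 twice-renewed PCEs with no campaign exposure
# (600 renew, matching the 60% baseline) and 500 who saw the campaign (330 renew).
difference = uplift(330, 500, 600, 1000)
print(f"{difference:+.1%}")
```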

Note: This model/way of thinking is not limited to marketing, but should be considered for all means of contact between a brand and a PCE (potential converter):

Talk to a sales rep? That will affect conversion probability
Received an automated system email based on behavioral triggers, or reminders? Yup
Use a feature of a product? Read a review on a blog or see a funny tweet from the brand? Yes, all of them

It’s also important to keep in mind that this isn’t a model that simply marches towards a probability of 1 from 0, with every touch a positive contribution increasing likelihood of conversion. You can have inputs that add nothing, or even decrease the probability, which is what we need the models for. What models you use, how you select the variables, and what you do with the outputs is a whole other story, but that will be explained later.

For now, start asking yourself questions about how you might be putting your understanding of customers and their journeys into the wrong boxes. A dollar of renewal revenue is just as good as a dollar of new revenue, right? So all you care about is the probability of each one, and how much effort it takes to change those probabilities. How can you increase the likelihood of getting the outcomes you want? How do you even know what events and exposures lead to conversions? We will look into that in the next post.

Protest Politics: Part 2

This is not meant to be a retraction of, but perhaps an amendment to, my earlier post on the cancellation of a Donald Trump rally in Chicago (original post here). Additional information, and additional time, have caused my viewpoint on the matter to evolve a bit, and conversations with others have let me know that on certain points I have either not communicated my position well, or have simply been wrong and had to reconsider.

A big motivation for this further response is the arguments being made on the left like this one, suggesting that liberals who have any criticism of the protesters and the events in Chicago are traitors. While I disagree with some of the characterizations made, or take issue with certain conclusions drawn, on the whole I find this line of reasoning to really resonate with me, and can certainly be sympathetic to the emotion behind it.

Before continuing, full disclosure: I am a straight white American male. Except for 'born rich' I have checked all of the privilege boxes, and if you think that it means that I should avoid participating in this conversation, or that my opinion should carry less weight, I can't fault you for that. I have never known explicit or systemic discrimination, I have never had to speak with a subaltern voice, and I do not daily confront the legacy of any particular oppression. I try to be aware of, and adjust for, the native entitlement that naturally comes from a position of privilege, but there it is. I certainly recognize that what people say is affected by who they are, and how that changes the impact of my words.

There is no doubt that some of my natural inclination to be overly zealous in defending free speech comes from the fact that I get to take my own for granted.

A part of me thinks that maybe there should be a way to silence Donald Trump, and that the country would be a better place without his hateful, divisive rhetoric. In fact, I'm fairly sure about the latter, if not the former. I think that there probably are a number of ways to limit the spread of his ideas, or at least to stop making it so easy for him to reach a mass audience, if we can't prevent him from speaking them in the first place. I think that the freedom to assemble and speak for the protesters is just as important as it is for the supporters of Mr. Trump, and I think that it's a good thing that so many people turned out in Chicago to show those supporters just how unwelcome Trump's views are to broad swaths of the citizenry that they otherwise don't interact with.

Let's be clear about a few things though: Donald Trump is NOT a threat to this country because he represents a potential reality in which he becomes president and abruptly installs a fascist dictatorship. There is literally no mechanism by which he could do that in our current system of government (hell, given the current political environment, it would be a surprise if he could get a lunch order through the legislature), and more importantly, he has not espoused a lot of views that suggest he would even want to. He IS a threat primarily because he has contributed to an environment where explicitly or implicitly racist and xenophobic viewpoints are considered to be part of the mainstream political discourse, rather than something to be ashamed of and kept to one's self.

I think that this is an area where those of us on the left have done everyone, including ourselves, a disservice in our use of language. Calling Trump and his supporters fascists is either intellectually lazy, or deliberate hyperbole to inflame people, and neither one is a good thing when it comes to actually solving problems. Can't we just call him a racist, and a stupid clown who has terrible ideas? This isn't post-WW1 Italy or Germany, and comparisons to Hitler just don't stand up to scrutiny. Trump is his own brand of terrible, but thankfully, he isn't going to be invading Canada or building gas chambers. Again, I don't think that there is any value in the man or his message, but let's not pretend that his narcissism and ignorance make him a mass murderer.

Additionally, as far as the 'he doesn't need/deserve the right to a platform because he already gets too much attention from the media' argument goes, it's either stupid or hypocritical. First of all, pick an argument and stick to it, don't just pull in whatever comes to mind. Media savvy and/or previous media coverage as a prerequisite measuring stick for freedom of speech is a non-starter, and well beside the point. Saying 'we already know what he stands for' is not a reason that he shouldn't have the ability to a.) repeat that message, or b.) change his message, as has already happened in this campaign cycle.

More importantly, however, we need to be realistic and acknowledge the fact that every time the protests get so virulent as to disrupt an event (like Chicago, or the guy who rushed the stage in Ohio), it increases the attention given to Trump. I would have no idea that he was even doing these events if it weren't for the protesters, so it doesn't make sense to blame the media for giving him all this attention when you are turning an otherwise non-newsworthy gathering of local bigots into a lead story. If Trump is a fire that threatens this country, attention is his oxygen. Protesters who make themselves into a news event are pouring gas on that fire, when we should be smothering it with either silence or shame. Realize that you are not doing anything to hurt him by getting him more publicity.

I think that's a big part of what this keeps coming back to for me, the idea that Trump and the counter-Trump protesters are both expressions of anger, but that the form that the expression of that anger takes makes a difference. It might be cathartic, or even empowering to try and shut down the Trump machine through violent protest, and I can see how that kind of power can feel really good to people who have had to struggle a great deal. But don't conflate it with an effective means of defeating the ideology of Trump and his followers. I've dealt with assholes, and I've wanted to punch them in their dumb faces, and a very few times in my life, I've taken the opportunity to do it. It feels great (unless it's followed by them punching me in my dumb face), but that doesn't mean that it was the right thing to do, or that it did anything long-term to modify my opponent's behavior.

...But maybe it did? If all of the above seems like criticism that is overly harsh, I can see the flip side of this, as well. If the problem is that Trump's campaign has normalized harmful speech by legitimizing it as reasoned political discourse, then a potential remedy is to shame the people that show up to listen to it. I think that's a positive outcome of a scene like the one in Chicago, in that it forced a bunch of (likely) suburban white people to confront just how upset Trump's rhetoric makes a big chunk of the American public. It's important to remind these people that whatever the source of their frustrations is, in 21st century America bigotry is something that should not and will not be acceptable in any kind of social setting. This is exactly the point of this type of protest and demonstration, even if it could have achieved that goal without becoming violent (though I can accept the argument that it wouldn't have gotten covered otherwise, the old 'Trump paradox').

It also seems like the ugliness of the eventual outcome in Chicago has helped prod the media to start condemning Trump's complicity in the violence a bit more, which is only a good thing. Obviously, the Marco Rubio video that has been spreading over the last 24 hours, in which he goes on at length to be critical of Trump and what it says about the electorate and the democratic process in this country, exemplifies that people are starting to take off the kid gloves that, for whatever reason, they have seemed to treat Trump with to this point. I agree with the people calling Trump a coward for not showing up to an event due to protests, but I wish that it was a viewpoint that had gained more traction for a candidate billing himself as a tough guy.

Maybe all of this will also bring about a discussion on whether or not the idea of unfettered free speech is still something that we choose to value as a society. It's possible that, just as we decided that yelling 'fire' in a crowded theater, or making direct plausible threats against another individual's person, is not protected, we will now decide that we need to be a little broader in our definitions of what constitutes a public safety concern.

Donald Trump is definitely a monster, and his message should be defeated by (almost) any means necessary, but there is room for disagreement, even within the ranks of the left, on how to go about doing that. Certainly he and those who would attend his events should be forced to confront, in no uncertain terms, how hurtful and inappropriate their views are to the rest of the country. But make no mistake, such views are disappearing in this country despite Trump's best efforts, and nothing he can do will change that. If those of us who oppose him and his ilk allow ourselves to be internally divided over how best to defeat his message, we will have opened the only possible door to giving him actual power.

Wednesday, March 25, 2015

The Trouble With Stats (or the people who pretend to use them)

My father also used to enjoy telling me "figures don't lie, but liars figure." He was essentially trying to explain why people often don't trust those who use numbers to explain things that don't naturally appeal to our (highly flawed) instincts. There is a Simpsons quote along similar lines, but for those of you who are fans, you already know it, and for anyone else, it would be a waste of time.

There is another, larger problem, however, that also damages the reputation of statistical analysis and the things it can tell us about our world. It's when people know/do just enough to be dangerous, and without basic rigour produce findings that are unsupportable. When they spread those findings, people who don't have a strong understanding of the analytical process tend to take them at face value, and then end up being let down later. Rather than blaming the particular culprit who practiced the bad stats, they just blame stats in general. These bad actors aren't always guilty of malice, but it doesn't absolve them of crimes against science, and there is a perfect example on ESPN.com today.

Obligatory picture of Nomar, 'cause baseball!

Check out the article here (it's an Insider article, so if you aren't already behind the pay wall, you won't be able to read it). To summarize, Mr. Keating is trumpeting the value of a simple composite stat, runs per hit (R/H from now on), as some sort of gem in terms of offensive value, in both real and fantasy terms. He begins, ironically enough, by pointing out that "many sabermetricians [statisticians who study baseball] barely glance at stats like runs and RBIs, which depend heavily on a player's offensive context" (which is true), and then immediately goes on to make a vague declaration about other skills a player with a high R/H must have (not true).

Keating proceeds to provide a handful of historical examples of other players with a high R/H, in fun anecdotal fashion. Fully three of the seven paragraphs in the article don't talk about the particular player that it focuses on, Brian Dozier of the Twins, and even the ones that mention him hardly focus on him. The examples provided vary widely across run environments, from the height of the power-infused steroid era, to pre-WWII baseball, with no accounting for what the differences in offensive numbers looked like at those times.

Finally he cherry-picks a few contemporary data points that support his assertion, none of which are particularly relevant to the point that he is making (and in some cases, contradictory).

At no point does he cover anything like the methodology he used to arrive at his conclusions, or even imply that there was any methodology, which might be more disturbing. Here is one thing I know about people who appropriately use statistics: they love to share the gory details.

Now, I will admit, I had a little prior experience with this particular stat, because I had looked into it a few years ago, and ultimately rejected it as not useful. However, context always matters, so it was worth another look. One of the first things that Keating does in the article is subtly undermine sabermetricians and complex statistics, which is curious for a guy whose bio at the bottom of the page says that he covers statistical subjects as a senior reporter. Basically, what he says is that R/H is a stat "...that is so simple you can calculate it in your head and it'll tip you off to hidden fantasy values" (I challenge both parts of that statement).

He mentions BABIP later on without any explanation, so he is assuming a certain amount of sophistication on the part of his readers, which is understandable since most fantasy enthusiasts have more exposure to statistical analysis than an average baseball fan. Certainly, anyone interested in "new" composite stats would also be familiar with OPS, and quite possibly wOBA and wRC as well, so the assumption has to be that R/H offers something that these other stats don't.

This is where the whole thing falls apart under scrutiny. Keating says that R/H is a good proxy for identifying other skills like speed, walk rate, and power. All of those other stats I just mentioned do the same thing; the difference is that they actually work. Depending on your analytical preference (or which site you get your baseball fix from), you might choose to believe in OPS, wOBA, or wRC (or base runs, etc.), but the reality is, they are all pretty good, and so they correlate very highly with one another (r values above .95 between all of them). R/H, on the other hand, correlates with none of them, not even close.
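
For anyone who wants to run this kind of check themselves, here is a minimal Python sketch of the two calculations involved: R/H for a player, and the Pearson r between two lists of player stats. The sample numbers are invented placeholders, not real player data or the data behind the r values above, and the function names are my own.

```python
from math import sqrt


def runs_per_hit(runs, hits):
    """R/H: runs scored divided by hits."""
    if hits == 0:
        raise ValueError("hits must be non-zero")
    return runs / hits


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return covariance / sqrt(spread_x * spread_y)


# Invented (runs, hits) pairs and an invented second stat for five players.
seasons = [(100, 150), (80, 160), (95, 140), (60, 120), (110, 170)]
other_stat = [0.850, 0.790, 0.760, 0.700, 0.900]

rh_values = [runs_per_hit(runs, hits) for runs, hits in seasons]
print(round(pearson_r(rh_values, other_stat), 2))
```

The same `pearson_r` call works for a year-over-year check: pass one season's values as `xs` and the next season's values for the same players as `ys`.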

Of course, those are all stats meant to capture real-world offensive value in a context neutral way, and maybe R/H only works for fantasy value. So the obvious thing to do would be look at R/H against ESPN fantasy player values. Unfortunately, once again, R/H has almost no correlation to the player rater values (whereas the fantasy rater correlates really well with those other stats).

"Wait," you might (and should) say, "the fantasy player rater is going to be heavily weighted towards counting stats, and thus might not be appropriate when comparing against a rate stat like R/H." Sadly, you, like Mr. Keating, would be wrong again, as Dozier was 7th in the majors in plate appearances (PA) among qualified batters, so he actually has an unfair advantage in this case.

The truth is, of players who had enough plate appearances to qualify for the batting title last year, 4 out of the top 5 by R/H were not in the top 30 fantasy players, and only 4 of the top 15 made the cut. Most of those were not even guys with a balanced speed/power/discipline profile, like Dozier (which Keating says the stat should identify), but power hitters like Donaldson, Rizzo, and Bautista. If you look at wRC+, every single one of the top-15 players was in the top 40 by fantasy value, and its r value against fantasy value is more than twice as high as that of R/H (.71 vs .31).

A common way to look at how much noise might be in a particular stat is to look at year-over-year correlation, and I will say that for players with at least 350 PA in both 2013 and 2014 there was a slight correlation. At just below an r value of .5, however, it wasn't particularly strong, highlighting just how much luck goes into this formula by including runs, and frankly, this number would probably change by looking at more years (I admit to not going deeper here, but I will point out the failing if someone wants to look into it).

Bottom line: runs per hit is not a great stat overall, and it is not a great stat in the context of fantasy baseball value. I won't venture to say that Keating did this analysis and decided to hide the results from his readers, nor will I say that he was too lazy to do any analysis, but I do know that it took me all of thirty seconds to start finding problems with the assertion. As someone who plays fantasy baseball, I have no interest in giving my competition an advantage, but I also can't figure out how this kind of misleading "analysis" does any good for the field.

The issue isn't just that the number doesn't say what Keating claims it does, because it will, by the nature of its component parts and some of the reasons he gives, have some relationship with the skills he is trying to identify. The issues are that it does so very inefficiently due to flaws in the components used, and that we already have much better stats for highlighting the same information. The last, biggest issue is that despite being at least somewhat aware of these flaws, Keating is touting this approach nonetheless, under the guise of statistical analysis, without providing any evidence in support.

Dozier is a good player, and has fantasy value, but anyone using R/H to evaluate players is going to be disappointed in their fantasy season. It is neither predictive nor descriptive of a player's skill, and shouldn't be used that way. The problem is that by placing this article behind the pay wall of an authority like ESPN, the false premise is given a credence that it doesn't warrant, and undermines statistical analysis as a whole.

Tuesday, March 10, 2015

Have Some Respect for Your Founding Fathers

One thing that really irks me in the debates that fill the public discourse, is the way that people say, "Obviously our founding fathers couldn't have foreseen [insert some eventuality], when framing the Constitution." [Note: I'm also going to include the Bill of Rights in this umbrella term from now on] This is followed by that person using it to justify his or her position in the debate. Now, the first thing that I should say is that I also hate when people use this rhetorical/logical device in support of positions that I myself back.

Allow me to make an example. "The founding fathers can't possibly have foreseen the kind of weapons that we have in society today, so the second amendment can't be taken as a literal statement, applicable in our time."

Now, without going into details that are off-topic, I generally agree that some additional gun control would be good for America, but I have also been a hunter and shooter for most of my life. Despite that, I can't support any attempts to dismiss, off-hand, the decisions made during the formative years of this country based on the supposed lack of foresight of our predecessors.

For one thing, making that argument is somewhat self-defeating. If you are saying that people can't possibly imagine the legislative needs of a future version of their own society, then why are you supporting a particular approach? Doesn't that mean that your efforts are doomed to let down your great-great grandchildren, constraining them to a set of options ill-suited to the problems they face?

Now, that response is as spurious as the argument it refutes. I didn't spend much time on that back-and-forth, because it doesn't seem to bear much effort in my mind, due to the mechanical inconsistency involved (if you disagree, please comment, I would be very happy to see another point of view). As such, I'm going to get to what I think is the central issue, which has more to do with a basic misconception.

I think that what bothers me most is what could be considered an amalgam of two common cognitive biases: recency and confirmation. We want to believe that we are the smartest, most well-informed people ever, and so we gravitate towards evidence that supports that notion. We are also most familiar with our contemporary situation, which puts evidence in support of our modern sensibilities at our fingertips at all times.

...And that's fine. It isn't new; our grandparents suffered from those biases, and so did the framers of the Constitution. You know what else we have in common with them? The ability to consider a large data set, identify patterns, and extrapolate possible futures. We don't have a monopoly on that kind of thinking. We didn't invent it, we inherited it.

By the time the Constitution was being written, humans had investigated cause and effect pretty thoroughly, and utilized geometry and calculus (which they had already invented) to describe and predict the movement of heavenly bodies with a high degree of precision. The ability of humans to evaluate a situation taking into account nuance and subtlety was well established long before the United States came into being (a process which also took place over hundreds of years, not in a moment).

It's not even a very complicated thought process, and there is extensive documentation proving that people of that era (and every one before it) were aware of the evolution of western armaments, along with many other things. Does anyone really think that members of the colonial aristocracy didn't understand the significance of the difference between the weapons possessed by the Native Americans and European settlers? Are people so arrogant as to believe that the founding fathers didn't have a sense of history very nearly as advanced as our own? Do people think that imagination is a recent evolutionary development?

There is no chance that Thomas Jefferson wasn't consciously aware of at least the following progression:

Humans throw rocks at one another
Invent slings to better throw rocks
Invent bows and arrows, replace rocks as missiles
Invent crossbows, replace bows as delivery mechanisms
Invent primitive firearms, replace both the launcher and ammo
Invent muskets, replace delivery mechanism

With plenty of iterations in between these leaps (one of the big advances in crossbow technology was the windlass, which required less strength to more quickly cock the weapon). When we represent things like photon torpedoes, light sabers, and ice-nine, we are taking our current understanding of technology, and applying our imagination to a projection of observed trends to create that image. It's a brilliance that, as far as we know, is unique to humans, but isn't unique to humans in the past 30 years.

Temporally, our distance from those colonists is only about 200 years, even if the technological gap has accelerated lately. That is about the same distance as they were from the 16th century, a time that saw Copernicus and Tycho Brahe challenge the widespread assumptions about the heavens and propose a heliocentric rather than geocentric system. The founders were building a new democratic society on a continent that had hardly existed on maps 200 years earlier. They were the inheritors of a network of settlements created by Protestants as a result of Martin Luther and the Reformation, you guessed it, about 200 years earlier. These were massive scientific and philosophical advancements that had radically changed Western civilization, and to suggest that the founders would not have been self-aware enough to recognize social evolution is either silly or arrogant.

Are we writing laws to control the proliferation of civilian models of future-government space lasers? Of course not. We are dealing with our current problems, not trying to legislate the solution of tomorrow's problems.

[Understand that this isn't a gun control argument, that's just an example of one way that the false dichotomy that I am illustrating is manifest]

Quite frankly, any discussion of the framers' intentions is pointless because we either have their own words expressing their thoughts, or we are guessing. If it doesn't inform our attempts to solve our current problems, it's no more than an interesting academic exercise in the context of the current debates (I say this as someone with a history degree). The Constitution is an important document, but it isn't the Ten Commandments, carved in stone. If we want to sanctify it, then it should be lovingly retired and removed from issues of public governance, because at that point it is no more than an idol for dogmatic worship. If it is sacrosanct and immutable, then it is a religious document and has no place in politics. If we treat it as a vital, living part of our republic, however, then we should be able to shape it and change it to suit our needs the same way that we would a bill that funded local road improvements.

If the founding fathers didn't create rules for us today, it's because they trusted that we would be able to handle that ourselves. We shouldn't belittle their contributions to the growth of this nation in order to justify our own. The United States, as a nation, continues to evolve, as does the body of documentation that defines it. We should feel free to change the Constitution because it is our responsibility to safeguard the well-being of this country today, not because we need to correct the ignorance of yesterday.

Monday, February 2, 2015

Attention Spans are Not Getting Shorter

A great deal has been made in recent years about how the attention span of the average American has been dropping precipitously. This complaint is most often leveled by people who are also most likely to wax nostalgic about how much better things used to be, have a deep-seated mistrust of social media, and are on the long side of 30. That hasn't stopped the narrative from spreading, however, and it has been picked up and repeated in some form across the social spectrum, even by those members of the younger generations who are believed to suffer most notably from this malady.

While the cause of such a disease has yet to be formally identified by the medical community, a list of usual suspects is generally paraded about by the media for public condemnation. In the 80s & 90s it was that kids watched too much television and played video games that rotted their brains. The 2000s saw the internet stretch its tentacles into practically every household occupied by people under 60, with damnable effects like email, instant messaging, and YouTube videos. Then we saw the blame shifting to blogs, Twitter, and Facebook, as mediums that encouraged communication in an ever-shrinking range of word and character counts. Now, with the popularity of emoji, Instagram, and Snapchat, people don't even have to write anything to get a point across!

Kids today, amirite?

This has been coming up a lot at my work, both in terms of directing the creation of the content that we sell (should we be making shorter videos to accommodate these shrinking attention spans?) and in the way that we communicate with our users (are our emails too long? are we making enough image-only content?). Someone recently sent me an email about a report showing that Instagram users had overtaken Twitter users:

"Instagram (300 Mln) overtakes Twitter (284 Mln) in 2014, suggesting a trend towards an even shorter audience attention span - from 140 characters to a single picture!"

...to which I jokingly replied:

"Early human communication was visual, moving from literal pictographs (cave paintings) to metaphorical ones (hieroglyphics), then to cuneiform and eventually written words. We are now moving back in the other direction to more visual forms of getting our messages across."

I didn't even think about it again, or realize that my joke might be taken seriously without a sarcastic tone of voice, until my response was referenced seriously weeks later. I didn't mean what I had said at all, and I am actually dismissive of the notion, but we have come to a point where a suggestion that human communication is regressing to a preliterate state can be taken at face value, and given the direction of public discourse on the matter, I couldn't blame the recipients of my flip remark.

Let's first get one thing straight: in a literal sense, there is no way that our "attention span" (whatever that is) could possibly have changed genetically in the space of a generation or two, as a result of the technology that we are exposed to. Evolution is simply not a process that happens that quickly. Children born today will have the same natural attention span as their great-grandparents, no matter how many tweets their 30-something parents have sent. So, any change in the actual communication preferences of people today is habitual, not hereditary. Now, this is not to say that there can't be any kind of biological change over the course of a person's life (brain chemistry is mysterious beyond my personal kenning), but I will leave that argument to scientists, rather than media theorists.

Perhaps the constant electrical stimulation of certain nerve clusters has indeed rendered us a bunch of easily distracted simpletons, but I submit that the growth of short-form mediums is in no way evidence of any such change. This is yet another example of a classic post hoc fallacy, an attempt to craft a narrative to explain an observed phenomenon. There are, however, alternative explanations that don't rely on faulty causal conclusions.

The simplest explanation for the proliferation of micro-communication has to do with resources, both technological and chronological. Essentially, pieces of media have gotten shorter because it is easier and cheaper to create and disseminate content.

Understand that the situation we are observing is nothing new, but the continuation of a process that has existed since the birth of human society. When writing required materials that were difficult to produce, very few things were written down, because the cost was prohibitive. Even with the invention of the printing press, mass production was difficult and expensive, so books were rare. As the technology for making paper and printing became better and cheaper, we saw broadsheets and newspapers come into being, and film and television likewise made it possible to create content that reached a wider audience, but there were still physical limitations to deal with.

Think about it this way: 30 years ago, if you wanted to make a movie and show it to millions of people, you needed not only the time and resources to produce the original content, but then you had to spend huge amounts of time and money to get it broadcast over the airwaves, or distributed in theater chains across the country. It literally cost millions of dollars and thousands of hours, with a substantial risk that you would not return even a fraction of those investments. 99.999% of people simply couldn't afford to take part in such an economy of scale, and for those who could, it wasn't efficient to create anything that short.

Now? Anyone can make a 30-second video with their phone and upload it to YouTube. The investment requirements in time and money are minimal, meaning there is no risk, so there is no downside to creating something that no one actually cares about.

In the past, if I wanted to share a picture of my awesome dinner with the world, it would require taking out a magazine ad (which of course no one ever did). Now I can just post it to Instagram effortlessly. Because the barrier to entry in content publication is all but non-existent, there is essentially no threshold for evaluating the "worth" of any communication that can exist digitally.


eBooks are cheaper, so the same revenue means more authors

Obviously, these are all issues of creation and distribution, and can easily explain why there would be such a growth in micro-content, without making any assumptions about attention spans. It doesn't explain consumption, however. A tree falling in the forest and all of that: a million cat videos (and counting) in the world wouldn't lead to the current conversation if no one were watching them.

I would suggest that attention spans have not declined, simply that people devote about as much attention to any particular piece of media as it warrants. There is a great deal of content that takes only seconds to consume, and those fit easily into the spaces that exist in a modern schedule while being conveniently formatted for the smart phones that are never far from our hands.

I think that an additional point that needs to be made (and shoe-horned into this post) is that too much of the conversation around new media and attention spans implies that length or medium is somehow equivalent to value. If an Instagram post is considered less valuable because it is visual rather than text-based, what of a classic painting? Does the time that something takes to consume relate to the time that it takes to produce? That picture of a fancy dinner may only take a second to compose, but some chef could have spent a great deal of time on the dish.

What of a haiku, or a sonnet? From Seamus Heaney to Shakespeare, plenty of great poets created works ranging from a few lines to thousands. Some of the greatest prose writers in history utilized a variety of forms, from novels to novellas to short stories, adapting the medium to the muse, but no one questions their artistic merit.

For all of that, the same people who read thousands of texts and tweets, and scan the Reddit headlines without reading the linked articles (myself included), still somehow manage to marshal their mental faculties long enough to watch feature films and even remain gainfully employed. I have it on good authority that people even still read actual books, printed on paper.

It's perhaps a less compelling narrative, but the truth is that the medium simply fits the message, so as mediums are created that are more convenient for delivering short-form messages, it is only natural that more such messages will be crafted. A story that is worth telling, or is worth hearing, will receive the attention it deserves.

As a man in my thirties, I have grown up with all of the offending technologies that should have stunted my ability to focus, from the Nintendo I got as a child to the Twitter account with which I shared this post. On any given day I will check my email, Instagram, Twitter, and several forums, and watch a YouTube video or two. At the same time, I am ten weeks into an online statistics course, 500 pages into Anna Karenina, and recently watched a two-and-a-half-hour movie. I will spend hours cooking a meal, or a whole day exploring a short stretch of river out of cell range. I am not alone in this, because movies, books, and Rolling Stone still have audiences.

So I suppose that my point is, far too many words later, that our attention spans are fine, just so long as we turn them towards objects that are deserving.

[Of course, the fact that probably only one person in a hundred made it this far into the post doesn't say anything good about the value of this blog. If you are reading this, congratulations on your attention span!]

Tuesday, January 13, 2015

Putting the 'Fun' in 'Logistic FUNctions!'

So, this weekend I was playing around with R to graph some data sets and get the variables that go into the formula for doing logistic modeling. While trying to figure out some f(t) or t-values given the inputs, I was annoyed at calculating the results by hand, even if I was able to get the variables from R. It's pretty normal math for this kind of work, and it's good to know how to do it manually step-by-step, but eventually the fun wore off, and I just wanted to get it done.

Anyhow, I built a "Logistic Function Solver" aka calculator in Excel, and figured I would share it on the off-chance anyone else needs such a thing and doesn't feel like taking the time to build one. It's already been useful at work.

Here it is

Anyhow, you should be able to access the file on Google Drive with that link, and then make a copy for yourself. Let me know in the comments if this doesn't work for you.

Basically, cells with the light yellow background and green text are ones in which you should enter your values, and then your desired output will be in the green background with red text. (Leave the other cells alone; they contain formulas.)
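
If you would rather not open a spreadsheet at all, the same two calculations (f(t) from t, and t from f(t)) can be sketched in a few lines of Python. This is only a rough sketch, not a copy of the Excel file: it assumes the common three-parameter form f(t) = L / (1 + e^(-k(t - t0))), and the names L, k, and t0 are my own labels for the ceiling, growth rate, and midpoint, so match them to whatever your fit in R actually reports.

```python
import math

def logistic(t, L, k, t0):
    """Value of the curve f(t) = L / (1 + exp(-k * (t - t0))).

    L, k, t0 are assumed names for the ceiling, growth rate, and midpoint.
    """
    return L / (1.0 + math.exp(-k * (t - t0)))

def solve_t(y, L, k, t0):
    """Invert the curve: the t at which f(t) equals y.

    Only defined for 0 < y < L and k != 0, because the curve never
    reaches 0 or L and is flat when k is 0.
    """
    if k == 0:
        raise ValueError("k must be non-zero")
    if not 0 < y < L:
        raise ValueError("y must lie strictly between 0 and L")
    return t0 - math.log(L / y - 1.0) / k
```

Calling logistic(t, L, k, t0) gives you the f(t) side, and solve_t(y, L, k, t0) answers the "when do we hit y?" side, which is the part that gets tedious by hand.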

Enjoy!

Thursday, January 8, 2015

Marketing & Data for New Businesses, Pt. 2: SEO

I'm going to start with the rub.

As I have mentioned before, the death of traditional search has been widely heralded, and patently disproved. It is a trendy thing to write about, but has little place in today's business world, because it's a prediction that has created many fools. If you were a weather forecaster, and just showed up each day and said "It's going to rain today," you would eventually be correct, but couldn't brag too much if those showers followed three weeks of sunshine. One day, traditional search will indeed see a serious decline in relevance, but that day is not yet come.

As such, small/new businesses have to continue to care about search, and SEO in particular. Why in particular? Primarily because it is "free," but also because it can scale better than any other channel, as organic search almost automatically encompasses your entire target audience.

A search for "Elizabeth Warren" probably wanted a result about this one

So where's the rub? SEO, boon and lifeblood that it is, can be extremely difficult for new companies. In large part, this is because the number one concern of the search engines* (here I will cease deliberately referring to them collectively, and start just using "Google" or "the engines" interchangeably, because, let's face it, Google still calls all of the shots in most countries) is authority. Authority is a somewhat (deliberately) vague term that describes the likelihood that a particular result is the most common intended target of a query.

Example: If someone searches for "Elizabeth Warren," Google is going to assume that they are looking for the official website of the politician (or recent news results, or her Wikipedia page), and not a PDF of the marriage announcement of a woman with the same name from a 1978 edition of the Albuquerque Journal.

Ultimately, authority (and SEO) is about probability. The more specific the query, the easier it is for the search engine, but most users start with something fairly broad, hoping that the engine will guess their intent, and show them a set of results that contains the desired website. The problem is that for a new business, unless you are filling a niche that no one else has ever tried to fill, it's very unlikely that you will have a great deal of authority for any search other than your brand name.

It takes time to build authority, which looks at historical data, including the clicks of users following a particular query (which is one of the few times that a confirmation bias is actually an appropriate thing). However, this isn't to say that a new business should not care about SEO, just that it is a more difficult path early on.

So what can you do? Here are a few recommendations:

Authority is based on a lot of legacy clout. Macy's has been Macy's forever. So try to compete most strenuously in areas that are newer, for example:

social signals are only gaining relevance in SEO, so make sure that your brand is extremely active on all of the social platforms, especially Google+
also, make sure that for each platform, you create the right type of account, that best represents your business
secure your site - Google has recently said that they will give slight preference to sites that are operating on HTTPS instead of HTTP; this is a pain to overhaul for sprawling legacy sites, but should be easy if you are new

Get it right from the start. Many companies have websites for years before they start worrying about SEO. Don't wait until you are big enough to pay a consultant to follow best practices:

audit your pages up front - proper metadata and linking from day one
make sure that you have enough content (500-1000+ words) on your important pages, even if some text is hidden to the viewer (but not to crawlers)
authorize yourself as the owner of your site on Google and Bing webmaster tools, because they will be the respective canaries in your coal mine

Publish or perish

Content and SEO will only be more inextricably linked, so get writing! Provide value and you will get traffic, which will in turn get you authority
if you are going to blog or produce other types of content, keep it on your primary domain; WordPress may be easier, but you are just dividing your 'authority points'
if your content/blog does live offsite, make sure to cross-link it to the most relevant page on your main site, with appropriate anchor text

Measure what you can

On a basic level, use Analytics (or whatever you have) to keep an eye on the number of visits, and new visits (to separate people who just don't use bookmarks or type in the nav bar), from organic search
the move to a 'not provided' world from a keyword standpoint in Google Analytics hurt, but you can see what search terms brought organic traffic to your site in webmaster tools, so identify those queries that should be central to your business and pull that data monthly to identify trends
Webmaster tools will also provide you with information about your rankings for various keywords, which is important in tandem with your volume of traffic from each term; use a combination to identify shifts in search behavior and effectiveness of on-page SEO
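
For the monthly pull, a rough sketch of the bookkeeping in Python might look like the following. It assumes you have already exported your query data by hand into rows of (date, query, clicks), with dates written as YYYY-MM-DD; that row layout and the function names are hypothetical, not anything that webmaster tools gives you in this exact shape.

```python
from collections import defaultdict

def monthly_clicks(rows):
    """Sum clicks per (query, 'YYYY-MM').

    rows: iterable of (date, query, clicks), date as 'YYYY-MM-DD'
    (an assumed layout for a hand-made export).
    """
    totals = defaultdict(int)
    for date, query, clicks in rows:
        totals[(query, date[:7])] += clicks  # first 7 chars are 'YYYY-MM'
    return dict(totals)

def month_over_month(totals, query):
    """List (month, clicks, change vs. previous month) for one query.

    The change is None for the first month, since there is nothing
    earlier to compare it against.
    """
    months = sorted(m for q, m in totals if q == query)
    trend = []
    previous = None
    for month in months:
        clicks = totals[(query, month)]
        change = None if previous is None else clicks - previous
        trend.append((month, clicks, change))
        previous = clicks
    return trend
```

Run month_over_month for each of the handful of queries that are central to your business, and the change column is your trend. The same approach works for average ranking instead of clicks, as long as you average rather than sum.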

That should be enough to keep any new SEO busy for a while! It looks daunting, but the key is to just try and do the right things, generally. The engines ultimately want what is best for the consumer, and will reward sites that provide real value in an authentic manner. Keep SEO in mind with everything that you do, even if it is never the top priority, and you won't have to go back and clean it all up after the fact.

For more tips for small/new businesses, check out part one of the series. Feel free to post questions in the comments.