To Find the Answer, Change the Questions

Earlier blog posts outlined issues around polls and who completes them.  There is a more fundamental question: whether polling is a tool of the television age and not of the internet age.

Polling came into its heyday with television communications.  Advertisers wanted to know how to appeal to the aggregate of television viewers back when 12 million people generally watched Ed Sullivan on Sunday nights – a variety show with something for almost everyone and acts ranging from Elvis to Sam Cooke to Jim Henson’s Muppets to Maria Callas.  (As an aside, only one of them was not born in my new home state of Mississippi.)

Now there are no variety shows, and most people stream video in line with their specific interests.  Opera lovers do not need to watch Elvis and vice versa (https://www.cnbc.com/2018/03/29/nearly-60-percent-of-americans-are-streaming-and-most-with-netflix-cnbc-survey.html).

Advertising to the aggregate still has value.  Half of Americans prefer to watch rather than read the news, and most of those watchers do so on television (https://www.journalism.org/2018/12/03/americans-still-prefer-watching-to-reading-the-news-and-mostly-still-through-television/).  But the online-only audience is growing, and it has the added advantage of being able to appeal to people according to their interests rather than forcing Maria Callas on the Elvis crowd or vice versa.

The internet is a fundamentally different medium from television.  It requires a different kind of message and message delivery.  Different research is needed to design and evaluate both.

1.  The internet is about engagement.  In his classic 1964 work, Understanding Media, Marshall McLuhan distinguished between “hot” media that are low in audience participation and “cool” media that are high in it.  His analogy was that a lecture would be a hot presentation and a seminar a cool one, allowing more participation.  He argued that television is a “cool” medium because it demands an audience response, whereas movies are hotter; laugh tracks were a standard element of television shows but not of movies.  But if television is “cool,” the internet must be cold: the medium is defined by audience response.

Traditional polling does not generally look at what may interest people, or what they want to know, but at what they think given a limited number of pre-coded options.    It doesn’t say nearly as much about what will engage people cognitively, visually, or emotionally.

2.  The audience is different.  The internet does not define audiences the way TV does.  TV buyers buy time to reach broad demographics – like women 35 plus.  Internet buyers can reach people based on interests using Google affinity audiences, custom affinity groups, lists, or look-alike targeting (https://support.google.com/displayvideo/answer/6021489?hl=en).  They can explicitly include or exclude political junkies, strong partisans, and news junkies.  They can target country music listeners, classic rock aficionados, or those who respond to heavy metal – or opera.

Polling does not look at audiences that way and so does not effectively target for internet advertising.  Your candidate – or issue or idea – cannot be different things to different people; that will produce an inauthentic mishmash, and people still do talk to each other.  But people who are searching online for Kim Kardashian, World Cup soccer, or crafting hacks will not likely engage with the same content, even if they are all women 35 plus.

3.  The format(s) are different.  Online ads come in a variety of formats with different purposes and goals.  In most online ads, you do not have people for a full 30 seconds, although if you get their attention, they may stay with you far longer than the typical television ad.  The format of the ad must be tailored to its purpose, with a very broad range of choices.   And the ad will most often be seen on a screen far smaller than the average television set.  

Traditional polling may tell you what voters – at least those willing to be polled – believe distinguishes Candidate A and Candidate B to Candidate A’s advantage.  Even so, that is only a first step toward figuring out how that message may be expressed.  For online communications, add the need to strategically build the argument in a way that engages people over the long haul.  (There is no such thing as 2,000 gross rating points of internet.)

# # #

I believe we are just beginning to learn how to use online communications in politics – there are more messaging options, targets, social networks, and connections than dreamed of in Marshall McLuhan’s television philosophy.

The questions and the methods for internet communications are likely situation-specific.  For the next blog post, I will outline a possible internet research strategy if you were, for example, trying to break into the top tier in the Iowa Democratic caucuses.  Keep a lookout for that before the end of April.

Insurgency and Consultants

This blog is mostly about research, but I decided to interrupt the regularly scheduled program to weigh in on the DCCC “blacklist” of consultants who help candidates running in primaries against Democratic Caucus members.

First, let me say the rule is neither new nor surprising.  The DCCC is funded by Members and by donors who support the current Caucus, and so of course it serves to protect its own membership.  Second, the DCCC has always had its favorites, often former staff who have become consultants and who are self-evidently in a mutually supportive relationship with the status quo at the DCCC (which I am quite certain is also the case at the NRSC).  The only difference is that there is now a form, rather than the classic “hire your friends” habit that remains part of most political (and other) institutions.

I did immediately wonder, however, whether anyone really thinks depriving insurgent candidates of establishment consultants hampers their chances.  While smart and with abundant technical skills, consultants are generally not well-prepared to assist insurgent revolts. 

Still, my firm did help some insurgents historically (and in most cycles we were not a Committee favorite).  So here, free of charge, are some questions insurgents should answer for themselves in assessing and planning strategy.  These questions should serve as a guide as well for incumbents in assessing their vulnerabilities. 

What has changed?  The insurgent is working to unseat someone whom the district has chosen before.  You need to identify what has changed to develop an insurgent strategy.  The change can be among voters as the result of shifting demographics, or district lines, or levels of participation.  Or the change can be in the incumbent or in his or her relationship to the district: he or she doesn’t live there, doesn’t communicate, votes against voters’ interests, or has become self-important and/or distant in some way.

Where can you find the votes to win?  Usually, fewer than 60,000 votes are cast in a congressional primary, although in some districts the number may rise this year if the primary coincides with the presidential primary.  In many cases, the number is far lower: in NY-14, when AOC defeated Joe Crowley, fewer than 30,000 votes were cast.  Come up with a reasonably high guesstimate and figure out where your 50 percent plus 1 can come from.  If you are a different kind of candidate who can excite a different level of participation, support may come from people who have not previously voted in a primary.  (Incumbents: do not limit communications to the core of party activists who have voted in the last four primaries.)
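As a back-of-the-envelope illustration of that arithmetic (every number below is a hypothetical guesstimate, not data):

```python
# Vote goal for a hypothetical primary (all numbers are guesstimates).
past_primary_votes = 30_000   # roughly the size of a low-turnout primary
growth_factor = 1.25          # guess high: a concurrent presidential primary boosts turnout

projected_turnout = int(past_primary_votes * growth_factor)
vote_goal = projected_turnout // 2 + 1   # 50 percent plus 1

print(f"Projected turnout: {projected_turnout:,}")   # 37,500
print(f"Vote goal: {vote_goal:,}")                   # 18,751
```

The point of guessing high is that falling short of an inflated turnout estimate still leaves you with a winning plan; the reverse does not.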

What do you have to say that is new and different and relates to the above?  The insurgent is unlikely to win unless (a) something has changed and (b) the insurgent is representative of that change and/or the need for change.  I have seen insurgents win because voters wanted economic change and the insurgent spoke to that; because the incumbent (a state legislator) also held a second job and the insurgent knocked on every door and said he would work for voters full time; because the incumbent was invisible in key constituencies and the insurgent had the capacity to motivate those same voters; because the insurgent had experienced and survived discrimination by institutions the incumbent embraced.  In each case, the key element was clarity about what needed to change and an insurgent who represented that change.

Are you willing to go door-to-door?  Door-to-door canvassing remains the most effective form of communication, and the incumbent, stuck in Washington (or your state capitol if you are running for a legislative seat), won’t be able to do as much.  If people have talked to you personally, and haven’t heard from the incumbent except through paid advertising, you are advantaged.  Internet organizing (as opposed to using the internet like a TV channel) can also be very effective because people are hearing from those they know and trust.

# # #

If, as an insurgent, you cannot answer these questions affirmatively, you probably will not win.  And if incumbents are genuinely and broadly reaching out to their district’s constituencies – seeking advice and genuinely listening across divides of age, gender, race, ethnicity, and income – they are unlikely to lose.

Far less important in the calculus:  Who has consultants approved by the DCCC.

Next Post:  Back to the regularly scheduled program.   

Could AI Write This Blog?

Yes, in theory, but not in today’s reality.  In my last post, I suggested that polls are still usually predictive, and modeling often more so.  But quick-and-dirty analytics has few advantages over quick-and-dirty polling.

About 15 years ago, campaigns started to use statistical modeling to produce efficiencies and better targeting.  Before that, campaigns targeted persuadable precincts.  Statistical modeling helps find the individual swing voters in precincts that are generally Democratic or Republican, a process that is more inclusive and allows more efficient use of resources. 

Increasingly, campaign communications are directed to individuals – online, through the mail, or through door knocks or addressable TV – and not exclusively (or even mostly) through broadcast media like television.  Polling analyzes people in the aggregate – telling a campaign what percent of men or women, or of younger or older voters, support a particular candidate.  Polls also say which groups are more undecided or seem more likely to move in response to arguments about the candidates.  Modeling makes those same predictions on the individual level, improving efficiency and targeting accuracy.

A decade ago, modeling used commercially available data and advanced statistics to make predictions.  A woman in an urban area where an unusually high number of people are college educated and of color is probably a Democrat – especially if she has a frequent flyer card that shows international vacation travel.  An older man in a rural area that has few people of color is more likely a Republican – especially if he has a hunting license and a subscription to a gun magazine.  Those are stereotypical examples, but the plethora of available data on where people live, shop, and travel, and on what they read, helps make probabilistic predictions on the individual level.
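A minimal sketch of what that kind of individual-level scoring looks like, using scikit-learn and invented consumer-file features (the data, labels, and variable names here are all made up for illustration):

```python
# Toy partisanship model from consumer-file-style features (all data invented).
import numpy as np
from sklearn.linear_model import LogisticRegression

# Columns: urban (0/1), college-graduate share of the area,
# hunting license (0/1), international frequent-flyer activity (0/1)
X = np.array([
    [1, 0.55, 0, 1],
    [1, 0.48, 0, 0],
    [0, 0.12, 1, 0],
    [0, 0.20, 1, 0],
    [1, 0.60, 0, 1],
    [0, 0.15, 0, 0],
])
y = np.array([1, 1, 0, 0, 1, 0])  # 1 = modeled as a likely Democrat (toy labels)

model = LogisticRegression().fit(X, y)

# Score a new voter-file record: rural, low college share, hunting license.
new_voter = np.array([[0, 0.18, 1, 0]])
print(f"Probability of Democratic support: {model.predict_proba(new_voter)[0, 1]:.0%}")
```

A real build uses hundreds of variables and millions of records, but the output is the same kind of thing: a probability attached to an individual name on the file.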

The process, however, still requires a lot of data collection to make the less obvious associations between people’s behavior and their voting habits.  To make modeling less expensive, the next iteration started with a set of assumptions and pre-established algorithms, allowing modelers to collect less data (and do less analysis) in achieving results.  Currently, most modeling is done with artificial intelligence and machine learning, which is even more efficient and uses smaller samples than earlier approaches did.

AI often skips the step of understanding why certain variables are predictive, of asking how a particular situation may be different, or even of analyzing in depth the patterns of errors and misassumptions that might prompt those questions.  Those who assumed people who voted for Obama would vote for Clinton made errors; voters who supported Trump in 2016 did not so reliably vote Republican last year.  Failing to consider the why and the underlying dynamics led to strategic errors.
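The skipped step is not impossible, just often skipped.  One standard diagnostic is to ask which variables the model actually leans on; continuing the toy sketch above with scikit-learn’s permutation importance:

```python
# Which features drive the toy model's predictions? (continues the sketch above)
from sklearn.inspection import permutation_importance

result = permutation_importance(model, X, y, n_repeats=20, random_state=0)
for name, score in zip(["urban", "college_share", "hunting_license", "intl_travel"],
                       result.importances_mean):
    print(f"{name}: {score:+.3f}")
```

Knowing that, say, the college-share variable is doing most of the work is exactly the kind of “why” that flags the 2016-style failure mode before it becomes a strategic error.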

Opinion research at its best has more depth of understanding than AI produces, plus some judgment calls, or hunches, or perhaps a little artistry, which machine learning does not (yet?) produce.  Analytics is unquestionably a boon.  Advanced statistics is an important tool for prediction and targeting, especially as samples are increasingly skewed.  It is not, however, a replacement for strategy or judgment, nor does it help much (although it could) in understanding what people are thinking and feeling, how they perceive a candidate, or how that candidate can improve his or her relationship with constituencies.

Next Post:  To Find the Answer, Change the Questions

Living with Uncertainty

Given the problems with polls, do they accurately predict elections?  The answer is that they usually do, provided their assumptions are more or less correct and their samples inclusive, which is harder than it used to be.  Nobel Prize-winning physicist Richard Feynman said, “It is scientific only to say what’s more likely or less likely, and not be proving all the time what’s possible or impossible.”

Some studies have concluded polls are no less accurate now than they have ever been, but that should not be too reassuring.  Polls have always made assumptions about who is going to vote, and wrong assumptions have long led to wrong predictions – like the prediction that Dewey would beat Truman in 1948.  Samples were off then because people without phones supported Truman.  Samples were off in 2016 when polls included too many voters with college experience and made wrong assumptions about voter turnout.

We never really know the outcome of an election before it happens.  At best, we know what outcome is more likely (and sometimes much more likely).  Most of us do not really have a reason to know the outcome of the election before it happens.  If we work for a partisan committee like the NRSC or DSCC, we may be concerned with allocating resources.  If we work for the media, we may feel polls are newsworthy (although I wish y’all found them less so: is it really news that someone is more likely to win than someone else?).

Polling somewhat ironically shows that voters do not trust the polls they hear about in the media – and those who took that poll arguably trusted polls more than the average voter, or why bother to answer it.  Media coverage of bad polling can shape electoral outcomes, a concern raised by a bipartisan group of pollsters (https://www.huffingtonpost.com/2010/11/08/pollsters-raise-alarm-ina_n_780705.html) back in 2010.

Modeling can help with prediction because it develops a predictive algorithm or formula that does not require a random or representative sample.  The process does require an adequate sample, however, and often more analysis and examination of error than is actually applied (the subject of a subsequent blog post).  And modeling still provides only a probability, not an absolute.  Two plus two may always equal four, but neither polling nor modeling is arithmetic; they say there is some level of probability that Candidate X will win, or that voter Y will support him or her.

The margin of error does not help much.  It describes only sampling error – the range within which the result should fall, say, 95 times out of 100 – and it assumes the sample is truly random, which is rarely the case, or representative, which is increasingly arguable.
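For reference, the reported number comes from a simple formula; here is a sketch at the 95 percent confidence convention most polls use:

```python
# The textbook margin-of-error formula at 95 percent confidence.
import math

def margin_of_error(sample_size: int, proportion: float = 0.5, z: float = 1.96) -> float:
    """Half-width of the confidence interval, in percentage points."""
    return 100 * z * math.sqrt(proportion * (1 - proportion) / sample_size)

print(f"{margin_of_error(800):.1f} points")   # a typical 800-interview poll: ~3.5 points
```

Note that the formula’s only input besides sample size is the assumption of a simple random sample – precisely the assumption the paragraph above calls into question.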

More skepticism about polls is healthy.  It reduces the risk of cutting off resources to a campaign that can win, or of affecting electoral outcomes by publicizing wrong polls.  As for campaign strategy, we could use some new thinking about how we listen to voters – thinking that might make campaigns more interesting and engaging to more people, even while their outcome remains uncertain.  It is sometimes the job of campaign strategists to make what seems impossible not only possible but real, and polling alone does not do that.

Next Post:  Could AI Write This Blog?

The Self-Selecting Internet

The most obvious solution to the problems with telephone polling is to administer polls online.  That solves the burgeoning cost of polling but not the problem of whether the sample is representative of the electorate.  Internet polling is less expensive, and many companies provide polling panels that can mirror population demographics.  But there is no way around the reality that people who are less interested in politics are disinclined to complete polls about politics, even if they are interested enough to vote.

Internet respondents for the most part (there are exceptions) are people who have signed up to be on a panel and take a lot of polls.  Should we assume that those who subscribe to a panel in exchange for a reward of some kind are representative of those who do not?  I do not think so – especially when the invitation to complete the poll often tells you what it is about.  (As a panelist, I chose to complete recent polls on feminism and on the Supreme Court but not on several other topics).    

One group that is often underrepresented in both telephone and online polls is people in the middle of the political spectrum.  In 2018, most voters knew early on which party they would support, particularly in federal races (at least according to the polls).  The election depended on voter turnout patterns and on the relatively small number of people in the middle who were undecided, conflicted, not yet paying attention, uninterested, or considering split tickets.

Voters in the middle are less likely than rabid partisans to want to share their political views, whether asked online or on the phone.  If you are tired of arguments about President Trump from either perspective, you are less likely to agree to spend 10 or 15 minutes talking (or writing) about him.  Internet poll results often show even fewer undecided voters than telephone polls.

Luckily for the pollsters, the middle was a small group in this year’s election and so the absence of people in the middle did not skew too many polls.  Some polls were wrong in Ohio because voters in the middle were disproportionately likely to support Republican Mike DeWine for Governor and Democrat Sherrod Brown for U.S. Senate. Those who were careful to poll the middle correctly predicted the result in each race.  Those who polled more partisans and fewer voters in the middle got it wrong. 

Online polling is also prone to leave out another significant group: people who are not online.  Telephone polling tells us that 80 to 85 percent of voters are online, but 15 to 20 percent still say they are not.  Combining online and telephone samples can fill that gap – except that both will leave out the people who simply, for whatever reason, do not want to be polled.

Next Post:  Living with Uncertainty 

People Do Not Want To Be Polled

The core problem with polling is that people do not wish to be polled.  Those who answer their phones when the caller is unknown are atypical.  And even many who do answer choose not to complete the poll.

This year’s telephone polling results were closer to the final election results than in 2016.  Much of the improvement, however, was due to the nature of the mid-term electorate, not to better polls.  The mid-term electorate was highly polarized, and rabid partisans are easier to poll than voters in the middle.  Polls were still wrong where those in the middle did not break proportionately to the partisans.

Back in the 1980s, polling achieved representative samples of voters by calling phone numbers at random.  The definition of random is that everyone in the universe of interest (people who will vote in the next election) has an equal chance of being polled. With the advent of cell phones, caller ID, and over-polling, samples have not been random for a while – not since the last century anyway. 
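That textbook definition is short enough to state in code (the voter file here is, of course, imaginary):

```python
# A truly random sample: every record has an equal chance of selection.
import random

voter_file = [f"voter_{i}" for i in range(100_000)]  # imaginary universe of likely voters
sample = random.sample(voter_file, 800)              # each voter: an equal 800-in-100,000 chance
```

The problem the next paragraphs describe is that the real-world equivalent of `random.sample` no longer exists: you can dial at random, but you cannot make people answer at random.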

Pollsters replaced random samples with representative ones.  Political parties and commercial enterprises have “modeled” files – for every name on the voter file, there is information on likely age, gender, and race or ethnicity and, using statistics, the chances that the individual will vote as a Democrat or a Republican.  If the sample matches the distribution of those measures on the file, then it is representative and the poll should be correct.

There are (at least) three problems with that methodology.  (1) There may be demographics the pollster is not balancing that matter; pollsters got the 2016 election wrong in part because they included too few voters without college experience in their samples, and college and non-college voters were more different politically than they had been before.  (2) Rather than letting the research determine the demographics of the electorate, the pollster has to make assumptions about who will turn out in order to make the sample representative – including how many Democrats and how many Republicans.  When those assumptions are wrong, so are the polls.  This year, conventional wisdom was correct, and so the polls looked better.

The third problem is perhaps the most difficult and follows from the first two: pollsters “weight” the data to their assumptions.  If there are not enough voters under 30 in the sample (and they are harder to reach), then pollsters count the under-30 voters they did reach extra – up-weighting the number of interviews with young people to what they “should” have been according to the assumptions.  Often, however, the sample of one group or another was not only too small but an inadequate representation in the first place – a skewed sample of young people is still skewed when you pretend it is bigger than it actually was.
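Here is the weighting step stripped to its core, with invented target and sample shares (a minimal sketch, not anyone’s production methodology):

```python
# Post-stratification weighting in miniature (all shares invented).
targets = {"under_30": 0.18, "30_to_64": 0.55, "65_plus": 0.27}  # assumed electorate
sample  = {"under_30": 0.08, "30_to_64": 0.58, "65_plus": 0.34}  # who actually answered

# Weight = what the group "should" be, divided by what it was in the sample.
weights = {group: targets[group] / sample[group] for group in targets}
for group, w in weights.items():
    print(f"{group}: each interview counts {w:.2f} times")   # under_30: 2.25
```

Counting each under-30 interview 2.25 times fixes the group’s share of the sample; it does nothing to fix whether those particular young respondents resemble the ones who never answered.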

The problems can be minimized by making more calls to reduce the need to up-weight the data.  If 30 percent of some groups of voters complete interviews but only 10 percent of other groups do, just make three times the number of calls to the hard-to-reach group.  That is what my firm and others did this year.  It is, however, an expensive proposition, and it still does not ensure that the people who completed interviews are representative of those who did not.
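The tradeoff is easy to put in numbers (the completion rates below are the hypothetical ones from the paragraph above):

```python
# Dials needed to hit an interview quota at a given completion rate (hypothetical rates).
import math

def dials_needed(target_interviews: int, completion_rate: float) -> int:
    return math.ceil(target_interviews / completion_rate)

print(dials_needed(100, 0.30))   # easy group: 334 calls
print(dials_needed(100, 0.10))   # hard-to-reach group: 1,000 calls, three times the effort
```

More dials buy a bigger sample of the hard-to-reach group; they do not buy a less self-selected one.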

Next Post:  The Self-Selecting Internet