How are the surveys aggregated and weighted?
Estimates of candidate support are calculated with a weighted local regression (LOESS) model that models the percentage of support for each candidate as a function of time. LOESS captures non-linear relationships between variables by fitting local polynomials, which makes it well suited to estimating, in a weighted way, the proportion of respondents who prefer a candidate without assuming that the results follow a linear trend.
The model incorporates a weighting scheme designed to give certain observations greater influence on the estimate. Its components are: time (greater weight to recent surveys), the historical accuracy of the polling firm that publishes the results (greater weight to firms with high precision in past surveys), the precision of the survey (greater weight to surveys with smaller margins of error, approximated from the reported sample size), and the survey mode (greater weight to in-home, face-to-face surveys).
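As an illustration of the idea (not the authors' exact implementation), a weighted degree-1 LOESS can be computed by solving a small weighted least-squares problem at each evaluation date, with tricube distance weights multiplied by the per-survey quality weights described above. The function name and the tricube kernel are assumptions for the sketch:

```python
import numpy as np

def loess_weighted(x, y, w, x_eval, bandwidth):
    """Degree-1 LOESS with extra per-observation weights.

    x, y      : survey dates (numeric) and support percentages
    w         : survey-quality weights (recency, pollster accuracy, ...)
    x_eval    : dates at which to evaluate the smoothed trend
    bandwidth : kernel half-width, in the same units as x
    """
    x, y, w = (np.asarray(a, dtype=float) for a in (x, y, w))
    fitted = []
    for x0 in np.asarray(x_eval, dtype=float):
        d = np.abs(x - x0) / bandwidth
        kernel = np.where(d < 1.0, (1.0 - d**3) ** 3, 0.0)  # tricube kernel
        wt = kernel * w           # distance weight times survey weight
        sw = np.sqrt(wt)
        X = np.column_stack([np.ones_like(x), x - x0])
        # weighted least squares: minimize sum(wt * (y - X @ beta)**2)
        beta, *_ = np.linalg.lstsq(sw[:, None] * X, sw * y, rcond=None)
        fitted.append(beta[0])    # local intercept = estimate at x0
    return np.array(fitted)
```

Because the fit is local and linear, the estimate can bend over time while still reproducing a straight-line trend exactly when the data are linear.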
Weighting by historical accuracy
The accuracy indicator for polling firms is based on Arzheimer and Evans's (2014) adaptation, for multiparty systems, of the "precision index A" of Martin, Traugott and Kennedy (2005). The A index is a measure of bias that quantifies how much a survey over- or under-estimated the results once compared with the official outcome, but it was originally designed for two-party systems only. Following Arzheimer and Evans's procedure to adapt it to a multiparty system such as Mexico's, a precision sub-index A'_i is first calculated for each party:

A'_i = ln[ (p_i / (1 − p_i)) / (v_i / (1 − v_i)) ],

where p_i is the proportion of respondents who support party i and v_i is the proportion of votes that party i obtained in the election. The combined index B is then the sum of the absolute values of the sub-indexes divided by the number of parties k in the election:

B = (1/k) Σ_i |A'_i|.
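Under the assumption that the A' sub-index is the log odds ratio between the poll share and the official vote share (as in Arzheimer and Evans), the combined index can be sketched as:

```python
import math

def accuracy_index_B(poll, vote):
    """Multiparty accuracy index B (Arzheimer-Evans adaptation).

    poll, vote : dicts mapping party -> poll share / official vote share.
    For each party i, A'_i is the log odds ratio between the poll and the
    election result; B averages the absolute sub-indexes over the k parties.
    """
    k = len(poll)
    total = 0.0
    for party in poll:
        p, v = poll[party], vote[party]
        a_sub = math.log((p / (1 - p)) / (v / (1 - v)))
        total += abs(a_sub)
    return total / k
```

A perfectly accurate poll yields B = 0; any over- or under-estimate pushes B above zero.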
To calculate each polling firm's accuracy index, we analyzed the surveys published and registered in the National Electoral Institute's archives for the 2006 and 2012 presidential elections. The B index for each firm was calculated as the average of the indexes obtained by its studies published in an electoral cycle. To privilege the polls from the most recent election, the final accuracy index gives more weight to the 2012 results than to those of 2006.
The exponentiated quantity, exp(B), can be interpreted as the average factor by which the odds of the different parties were over- or under-estimated. Smaller values indicate greater precision, so the weight is the inverse of this quantity. Pollsters with no surveys registered in 2006 or 2012 were assigned the largest (least precise) index value in the sample plus a 0.02 penalty.
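A minimal sketch of this weighting step follows. The function and parameter names are hypothetical, and reading the "highest precision number" as the largest (worst) B in the sample is an assumption made explicit here:

```python
import math

def pollster_weight(B=None, max_B=None, penalty=0.02):
    """Historical-accuracy weight for one polling firm.

    B       : the firm's average accuracy index (smaller = more precise)
    max_B   : worst (largest) B observed in the sample, used when the firm
              has no surveys registered in 2006 or 2012
    penalty : extra amount added for unregistered firms (0.02 in the text)

    The weight is the inverse of exp(B), so more precise firms
    (smaller B) receive larger weights.
    """
    if B is None:                    # no record in 2006 or 2012
        B = max_B + penalty
    return 1.0 / math.exp(B)
```

An unregistered firm thus always weighs slightly less than the least accurate registered firm.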
The historical accuracy indicator yields the following ranking of polling firms, listed from largest weight to smallest.

[Table: polling firms ranked by historical-accuracy weight; first entry: Buendía & Laredo]
Weighting by precision and margin of error
The survey-precision weighting is determined by the standard error derived from the sample size n and the proportion of votes p_A registered for candidate A:

SE = sqrt( p_A (1 − p_A) / n ).
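A minimal sketch of this component, assuming (as with the other components) that the weight is the inverse of the standard error sqrt(p(1 − p)/n):

```python
import math

def precision_weight(p, n):
    """Weight from the sampling standard error of a proportion.

    p : reported support share for candidate A
    n : reported sample size

    The standard error is sqrt(p * (1 - p) / n); surveys with smaller
    error margins (larger samples) receive a larger weight.
    """
    se = math.sqrt(p * (1 - p) / n)
    return 1.0 / se
```

For example, at p = 0.5 a sample of 1,600 has SE = 0.0125, so its weight is four times that of a sample of 100 (SE = 0.05).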
Weighting by type of survey
Because telephone surveys suffer coverage error from the inevitable bias of interviewing only the population with a telephone line at home, the weighting gives telephone surveys less relative weight than in-home, face-to-face surveys.
Weighting by recency
To give more importance to the most recent surveys, we calculate the temporal distance between each survey's publication date and election day. Because smaller distances are more desirable, the temporal weight is the inverse of this quantity.
Finally, the four quantities described above are multiplied together to form the vector of final weights.
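Putting the pieces together, a sketch of the final weight for a single survey (hypothetical function; the inverse-distance temporal weight and the simple product follow the description above):

```python
def final_weight(days_to_election, pollster_w, precision_w, mode_w):
    """Combine the four weighting components into one survey weight.

    days_to_election : days between the survey's publication and election day
    pollster_w       : historical-accuracy weight of the polling firm
    precision_w      : weight from the survey's standard error
    mode_w           : survey-mode weight (lower for telephone than in-home)

    The temporal weight is the inverse of the temporal distance; the final
    weight is the product of all four components.
    """
    temporal_w = 1.0 / days_to_election
    return temporal_w * pollster_w * precision_w * mode_w
```

Each survey's final weight then enters the LOESS fit as its observation weight.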
A final note on why we do not use surveys on social media
We decided not to include the results of polls on social media (in particular, the surveys among Facebook users published by SDP) for three fundamental reasons:
1. Users of social media tend to be younger and of higher socioeconomic status than the general population. This introduces selection biases that can be mitigated by post-stratifying the results so that groups underrepresented in the study receive a weight similar to their share of the general population. But post-stratification does not mitigate biases that come from differences in political preferences between those who use social media and those who do not.
2. There is no clear way to know who was or was not invited to participate in the survey, because the systems that control the invitations are essentially black boxes: they conceal the sampling frame and the mechanism by which users are selected to participate. This makes it impossible to calculate any measure of uncertainty for the results, and it renders their samples, however large, useless for estimating how accurate the results actually are.
3. The participants are self-selected; that is, they decide whether or not to participate for reasons that cannot be quantified, and they are drawn from an unknown and unverifiable population.
Finally, it is important to keep in mind that social-media surveys are not considered valid by the industry, at least not for generalizing their results to the population at large. Adding them to the weighted average would go against industry standards (see AAPOR 2014). Moreover, because their sample sizes are very large, adding them to the weighted average calculated by this system would introduce a significant bias in favor of their results, undermining its validity.
AAPOR (2014), “Social Media in Public Opinion Research”, consulted on February 11, 2018.
Kai Arzheimer and Jocelyn Evans (2014), “A New Multinomial Accuracy Measure for Polling Bias”, Political Analysis, vol. 22, no. 1.
Martin, Elizabeth A., Michael W. Traugott and Courtney Kennedy (2005), “A Review and a Proposal for a New Measure of Poll Accuracy”, Public Opinion Quarterly, vol. 69, no. 3.
The Mexico Bloomberg Poll Tracker methodology was developed by Salvador Vázquez, Conacyt Research Professor at the National Public Policy Laboratory at CIDE, Mexico City, and by Michelle Torres, PhD candidate in political science at Washington University in St. Louis.