The statistician and political polling analyst Nate Silver will discuss his career -- from student journalist to baseball prognosticator to the creator of FiveThirtyEight.com, perhaps the most influential political blog in the world -- and the ways in which statistics are changing the face of journalism in a conversation with Seth Mnookin, a former baseball and political writer who co-directs MIT's Graduate Program in Science Writing.
[this is an edited summary, and not a verbatim transcript]
By Katie Edgerton
Photos by Greg Peverill-Conti
Moderator Seth Mnookin introduced Nate Silver to a packed house by reading a tweet sent to the panelists from an audience member: "Nerds nerding out, standing room only."
"It's an honor to be here," Silver said. "We're at ground zero for nerds here in Cambridge."
Silver is best known for his work in political and baseball forecasting, but his career didn't start with statistics. After college, Silver joined the economic consulting firm KPMG, where he worked on transfer pricing, a process that governs how much income a multinational company can derive from its subsidiaries.
"I hadn't thought too much about what I wanted to do," Silver confessed. "It's tough for someone who is 21 or 22 to figure out what course they want to take in life."
Bored at work, Silver began to develop a forecasting system for baseball statistics. This project eventually became PECOTA, or the Player Empirical Comparison and Optimization Test Algorithm, which Silver sold to baseball forecasting website Baseball Prospectus.
Baseball, said Silver, offered one of the world's best datasets. PECOTA drew on statistics dating back to WWII, comparing current players with a wealth of historical information. Soon after the sale, Silver joined the Baseball Prospectus team himself. At the company, "I was doing three or four things at once," he said. In addition to statistical work refining PECOTA, Silver helped manage the business and began to write about his findings.
Over the past decade, statistics have had a profound effect on how baseball is played and managed, Silver said. In the few years he spent at Baseball Prospectus, many people left the company to work in teams' front offices as baseball franchises began to utilize statistics to put their rosters together. "Metrics were co-opted by the teams," Silver said.
When Silver first began to work in baseball forecasting in the early 2000s, methods of evaluating players based on statistics (popularly known as "moneyball," after Michael Lewis's bestselling nonfiction book on the subject) hadn't yet gained wide acceptance. But by 2007 and 2008, that struggle for acceptance had largely been resolved. The new consensus in baseball management was that a mixed-methods approach to player evaluation, weighing both statistics and qualitative observation, worked best.
"Seeing what statistics had done to baseball," said Silver, "I thought it was long overdue in politics." While baseball managers and sports reporters drew on statistics regularly, by contrast, political reporting seemed to be "in the stone ages." It wasn’t data-driven, and Silver felt that a greater engagement with statistics could elevate campaign coverage, in particular.
With regard to statistics, however, politics was in a different place than baseball had been prior to Moneyball. Historically, baseball managers hadn’t made much use of statistics; the quantitative revolution happened "from the outside in," spearheaded by independent statisticians like Bill James. By contrast, Silver said, many recent political campaigns have made sophisticated use of metrics, particularly those of George W. Bush and Barack Obama.
Statisticians on Obama's campaign staff assigned two numbers to every registered voter in the country, Silver explained. The first number measured how likely a given voter was to vote for Obama. The second number quantified how likely it was that the person would vote at all. People would receive different solicitations based on how their numbers lined up. For instance, someone who was likely to vote for Obama, but not as likely to go to the polls, would receive targeted "get out the vote" literature. A swing voter who would almost certainly vote, but not necessarily for Obama, would get another set of advertisements. Statistics, Silver concluded, have allowed campaigns to employ sophisticated "micro-targeting" tactics.
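The two-number scheme Silver described can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Voter` class, the `choose_outreach` function, the 0-to-1 score scale, and the cutoff values are hypothetical and are not drawn from the campaign's actual system.

```python
from dataclasses import dataclass


@dataclass
class Voter:
    name: str
    support: float  # hypothetical score in [0, 1]: likelihood of backing the candidate
    turnout: float  # hypothetical score in [0, 1]: likelihood of voting at all


def choose_outreach(voter, high=0.7, low=0.4):
    """Pick a mailing type from the two scores (cutoffs are illustrative guesses)."""
    # Likely supporter who may stay home: encourage them to vote.
    if voter.support >= high and voter.turnout < high:
        return "get-out-the-vote"
    # Reliable voter who is undecided: try to persuade them.
    if low <= voter.support < high and voter.turnout >= high:
        return "persuasion"
    # Everyone else gets no targeted mailing in this toy scheme.
    return "no targeted mailing"


if __name__ == "__main__":
    for v in (Voter("A", 0.9, 0.3), Voter("B", 0.5, 0.95), Voter("C", 0.1, 0.9)):
        print(v.name, choose_outreach(v))
```

A real campaign would presumably estimate the two scores from voter files and survey data; the sketch only shows how a pair of scores could map to different kinds of outreach.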
Silver started writing about politics under the pseudonym Poblano as the 2008 presidential campaign was heating up. “I was known for writing about baseball,” Silver said, “and I didn’t want to alienate my audience.” Eventually, he revealed his identity so that the project could grow, ultimately leaving Baseball Prospectus so that he could work on politics full time. Silver’s forecasting model correctly predicted all 35 Senate races and 49 of 50 states in 2008; he missed Indiana.
Political journalists were friendlier when Silver first entered the scene than they would become later on. His work received a lot of attention in the mainstream press, in part because there was so much demand for coverage during the 2008 election. In 2008, political journalists often referenced Silver’s work, but once he joined the Times, he became “more of a target.”
Silver’s work with the Times on 538 arose from a chance meeting with New York Times editor Gerald Marzorati on a train platform. Silver had attended a sports business conference at MIT in 2010 and ran into Marzorati on his way back to New York City. The two started talking, and their initial conversation led to Silver’s popular 538 blog, named after the 538 Electoral College votes up for grabs in a presidential election.
After Silver joined the Times, his relationship with the mainstream media became much more contentious and personal, Mnookin observed. Politico, for instance, referred to Silver as “a one-term celebrity.” What did Silver think the root of the conflict was?
It came down to competition, said Silver. When he aligned himself with the Times, he became a competitor to many other news outlets. Furthermore, many people in the mainstream media painted the election as a toss-up, and Silver’s forecasting engine didn’t rate that outcome as likely. Part of what made the media attacks so personal was that there wasn’t a lot of ambiguity in Silver’s predictions. You couldn’t debate the details, Silver said. “We were adding to 270,” the number of electoral votes needed to win the presidency.
In the 2012 election, Silver correctly predicted 50 out of 50 states. But he cautioned against putting too much stock in that achievement. Sometimes the media is too results-oriented, Silver said. His real pride was in the process. Ultimately, calling 50 out of 50 states correctly didn’t matter as much as calling the election, Silver observed. The 538 team predicted that Obama would win the popular vote by 2% and he ultimately won by 4%. If that margin of error had swung the other way, Republican challenger Mitt Romney could have won. But had Romney won, Silver still would have felt confident in his system and forecasting engine. In the early days of the campaign, we tackled a lot of problems, he said, such as how do you weigh incumbency, or economic uncertainty? Towards the end, all the factors in the model converged into how often candidates win states where they have certain percentage leads. Silver felt that his overall process was sound, whatever the final outcome.
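The late-campaign question Silver described, how often candidates win states where they hold a lead of a given size, can be sketched as a simple frequency count. This is a minimal Python illustration and not 538's actual method: the `win_rate` function is hypothetical, and the `toy_records` list is invented purely to make the example run. It is not real polling data.

```python
def win_rate(records, min_lead, max_lead):
    """Share of past races won by the leader when the polling lead
    (in percentage points) fell in the range [min_lead, max_lead).

    records: list of (lead, won) pairs, where won is True or False.
    Returns None if no past race falls in the range.
    """
    outcomes = [won for lead, won in records if min_lead <= lead < max_lead]
    if not outcomes:
        return None
    return sum(outcomes) / len(outcomes)


# Invented toy data for illustration only: (polling lead, did the leader win?)
toy_records = [
    (1.0, True), (1.5, False), (2.5, True), (0.5, False),
    (4.0, True), (6.0, True), (3.5, True), (2.0, False),
]

if __name__ == "__main__":
    print(win_rate(toy_records, 0, 3))   # narrow leads
    print(win_rate(toy_records, 3, 10))  # wider leads
```

With a large set of real historical races, a table of such frequencies would give a rough probability that a candidate holding a given lead goes on to win the state.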
“I was working 100-hour weeks during the election,” Silver said. “I got involved in the psychodrama.” During election season, commentators can become abstracted from the real world, he said. Campaign journalists can fall into a trap where their perspectives are skewed all too easily, said Silver, himself included. In this environment, using statistics can become all the more critical.
Could prediction become a scientific discipline? asked Mnookin. Because of the work of Silver and other statisticians, more people are becoming interested in forecasting.
There is a lot of interest in statistics now, Silver agreed, and clearly, over the past decade, we have had an exponential increase in the amount of data that’s available to us. But the volume of data is never the constraint—the real limiting factor is our ability to process the data and make it meaningful. There are many large data sets, said Silver, but not all of them are rich. Data can be descriptive but not predictive. The algorithms being developed to analyze so-called “big data” sets are refining forecasting methods, but this is happening slowly.
Question and Answer
Ian Condry, Associate Professor of Anthropology, asked Silver if he thought there was a danger that statistical models might come to have too much predictive power. Was Silver concerned that blogs like 538 might make people feel as if forecasts were so accurate that elections were already decided, and there was no point in voting? Will a time ever come when statistics are so advanced we won’t need voting?
I would prefer to have accurate information influencing public opinion, said Silver. But if he ever discovered that 538 had negatively affected voter behavior, he would walk away from it. Silver also said that he believed that his work had a higher likelihood of affecting donors’ behavior than votes. For instance, a donor might check 538, or any other political coverage, to decide which candidate to support. Their decision could, in turn, influence the election; Silver expressed his discomfort with even that level of influence.
Would Silver ever consider making his economic forecasting model freely available, asked an audience member, so that statisticians could learn from it, tweak it, and build on it for their own research? If not, what are the chief impediments to making the code behind his model open source, available to anyone?
I’ve been transparent about the data that goes into the model, so it isn’t that difficult to reverse-engineer, said Silver. “I am sympathetic to the academic viewpoint that science should be open source,” he commented, “but I don’t have a tenure track position.” If you open source a tool, its market value is diminished, Silver concluded, and he was in a for-profit business. Nonetheless, Silver hoped that statisticians could reshape the current discourse so that there was a more open dialogue about how information is modeled. He also said that he wouldn’t rule out open sourcing his model at some future point.
Punditry in the media is a huge problem, said an MIT student. Does Silver have any ideas about how to change political coverage and improve the tone of current discourse?
It is easy, and not all that productive, to critique individual journalists who might be doing things in a silly way, said Silver, but we have to remember that it is in their self-interest to produce overly sensational journalism—it’s what sets them apart from the crowd of other people trying to make names for themselves. Silver argued that journalists who tried to get maximum exposure in the short term by being sensational might be damaging their long-term brand. How can people differentiate themselves without resorting to punditry? he asked. To solve that issue, people would have to look at the media outlets on the organizational level and see how they were incentivizing their journalists. If we changed the incentives, the discourse might change too.
Every few years Silver seems to take on a new topic, observed an audience member. First baseball, now politics—what’s next? What does the future hold for Silver in terms of technical innovation?
I’ve started to get interested in urban data, Silver said. Like baseball and politics, data about cities is rich and underutilized. You can learn a lot modeling data about different neighborhoods. When a statistician has a developed model, as Silver does in baseball and electoral politics, most of the refinements are small adjustments on the margins. Modeling a fresh dataset provides a different kind of challenge.
How can we better apply statistical models to policy? asked Sonny Sidhu, graduate student in MIT Comparative Media Studies. Elections seem like the starting point of statistics’ potential influence on politics. Can Silver’s tactics enrich debates that currently rely on reductive or simplistic metrics?
There are statisticians in the United States Treasury Department who are already doing this kind of work, said Silver. The problem is communication—how do you convey statistical information to the public in an accessible way? Journalists like Ezra Klein of the Washington Post are already bridging this communication gap. Political polling is a relatively simple set of facts, Silver concluded, and for intricate policy questions, the statistical analysis would require much more complicated models than 538’s.