Every autumn, David Pendlebury looks forward to hearing who has won the year’s Nobel Prizes. Pendlebury isn’t hoping to queue up for an award himself, but he still has a big stake in the results. As citation analyst at Thomson Reuters, he spends months digging into data dating back as far as three decades in search of what he calls scientists and researchers of “Nobel class” who were pioneers in their respective fields. These are the men and women responsible for laying the very foundations of today’s discoveries. Last year, each of Pendlebury’s “predictions” came true; he predicted all four prizes and all nine new laureates. That was the best single-year result since the Thomson Reuters Citation Laureates prediction process was officially launched in 2002. To date, Thomson Reuters Citation Laureates, part of the Intellectual Property and Science business of Thomson Reuters, has accurately predicted 27 Nobel Prize winners.
But Pendlebury was realistic about the chances of setting another record in 2012. “We were very fortunate last year,” he says. “I think in statistical terms, we’re due for what we’d call a ‘reversion to the mean.’ The results [for 2012] probably won’t be as spectacular as they were last year.” And he was right. After the Nobel Prize committee issued the roster of its 2012 winners, Pendlebury weighed in on the results: “This year, we predicted only one Prize (Physiology or Medicine) and only one of the two laureates for this Prize: Shinya Yamanaka, whom we named a Thomson Reuters Citation Laureate in 2010. John Gurdon, who also won this Prize, was not a Thomson Reuters Citation Laureate.”
Not All Win the Prize
Pinpointing Nobel Prize winners through citation analysis is not an exact science, and the payoff is rarely immediate. Pendlebury emphasizes that being named a citation laureate this year does not mean winning this year. In fact, last year’s winners were scientists whom Pendlebury had named as citation laureates in 2008 and 2010.
“The problem is that there are so many people who are deserving of the Nobel Prize, and there are so few Nobel Prizes to go around,” says Pendlebury. “There are always going to be the uncrowned people, those of ‘Nobel class’” who may not win for years to come or who may never win an award at all.
Pendlebury begins his research in July by rolling up his sleeves and partitioning the literature into the Nobel Prize-recognized fields of chemistry, physiology or medicine, physics, and economics (he doesn’t weigh in on the Nobel Peace Prize or the one for literature). He employs a holistic blend of data for his predictions: he gathers highly cited papers and citation data from Web of Knowledge, as well as information about a scientist’s credentials and affiliations, such as election to a national academy, prestigious appointments, and receipt of top prizes or awards. The sum total of a scientist’s credentials, citations, and accomplishments moves him or her up in the ranking accordingly. Even so, the spotlight shines on very few elite researchers: Thomson Reuters Citation Laureates rank among the top one-tenth of 1% of researchers in their individual fields.
Thomson Reuters Citation Laureates grew organically from a 1965 experiment done by Eugene Garfield, founder of the Institute for Scientific Information. Garfield began to examine the relationship between the people who won Nobel Prizes and the number of their citations, which he then committed to paper. His findings were first published in Nature in 1970. “It was simple,” says Pendlebury. “He looked at the Science Citation Index (SCI) for the year 1967, and then he tallied the most cited people in that one year of SCI.” Working with the total number of citations for the top 50 scientists he recorded, Garfield discovered that six of them had already won a Nobel Prize. And year after year, more scientists from Garfield’s list won a Nobel Prize.
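The tally Pendlebury describes can be pictured in a few lines of Python. This is only a rough sketch of the idea, not Garfield’s actual procedure: the record layout, the function names (top_cited_authors, laureates_in_top), and the toy data are all assumptions made for illustration, and none of the names or numbers come from the Science Citation Index.

```python
from collections import Counter

def top_cited_authors(citation_records, n):
    """Return the n most-cited authors.

    citation_records is an iterable of (cited_author, citing_paper) pairs,
    one pair per citation received. This layout is a hypothetical stand-in
    for one year of a citation index.
    """
    counts = Counter(author for author, _ in citation_records)
    return [author for author, _ in counts.most_common(n)]

def laureates_in_top(citation_records, laureates, n):
    """Return the authors in the top n by citations who are also in laureates."""
    return [a for a in top_cited_authors(citation_records, n) if a in laureates]

# Toy data, invented for this example only.
records = [
    ("Author A", "p1"), ("Author A", "p2"), ("Author A", "p3"),
    ("Author B", "p4"), ("Author B", "p5"),
    ("Author C", "p6"),
    ("Author D", "p7"),
]
known_laureates = {"Author A", "Author C"}

hits = laureates_in_top(records, known_laureates, n=2)
print(hits)
```

With a real index, n would be 50 and the laureate set would be the list of Nobel winners to date; the count of names returned is the figure Garfield reported as six.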
Pendlebury took over the citation laureate analysis in 1989, and the first citation laureates were announced a decade ago. “All of this work demonstrates what we’re trying to do,” he says. “We’re finding a correlation between citations/literature and peer review.”
The Citation Long Tail
Numbers don’t lie. “Citation indicators have the strongest signal at the highest frequency,” says Pendlebury. He sees the distribution of citations as a long tail: very few items at the top, followed by a long run in which papers become more and more numerous as citation counts fall. “As you move down the line, you see more and more frequency of citations, and that means it’s very hard to discriminate between the differences along the tail,” he says. But at the very top end, there are so few papers that the exceptional events stand out. These are the “black swans,” the rare breed, he says. “And at those high frequencies, it does seem like citations are powerful indicators of peer recognition.”
Peer review is often seen as being in direct opposition to quantitative analysis, says Pendlebury. But he actually considers it quite complementary. “Peer review is a group of experts on a small scale, bottom-up,” he says. “Citation analysis is the whole scientific community with millions of choices, so it’s global, and it’s top-down.” He’s not surprised that the two different groups lead to the same destination.
For 2012, most (13) of the citation laureates are from the U.S., along with two from Canada, three from Japan, and three from the U.K., which is in keeping with the geographic distribution of the recent past. But more citation laureates will come from China, India, Singapore, and Korea in the years to come, he says. In fact, he reports that work being done today in these regions will likely be recognized in the late 2020s or 2030s.
Another trend is that more women are entering the Nobel class. This year, Lene V. Hau of Harvard University was a citation laureate in physics along with Stephen E. Harris of Stanford University for their work on slow light.
Pendlebury says he stands at the end of a long chain of data, one that is the culmination of a massive team effort. There are even Twitter feeds and Facebook pages for those who want to follow the predictions, both of which were abuzz with comments in 2011 and in 2012.
Identifying excellence works the same way as finding the highlights in a literature search, and Pendlebury relishes the challenge of discovery. He is the first to admit that his role in the research process demands “no depth of scientific expertise on my part.” He says he just follows the trail.