How will major publishers be inclined to respond to Elsevier’s announcement? Charles Oppenheim sees little hope for major change: “There will be a strong incentive for other TA-STM publishers, including, I suspect, NPG [Nature Publishing Group], to follow the Elsevier route. This will either result in a plethora of per-publisher click-through licences or a single, probably highly restrictive Elsevier-like licence, available through a publisher supported gateway.” However, the winds of change are blowing stronger than this year’s polar vortex, and publishers may feel the blast of changes far beyond their control.
Peter Murray-Rust, a chemistry professor at Cambridge University and open access (OA) advocate, is one strong critic: “We have literally billions of dollars of information locked up in the current scholarly literature. And 10000 papers come out each day. We need content mining to manage these—read them for us. Organize them. Let us search after we’ve read them. Do some of our routine thinking for us. On our own terms for our own needs. It can happen, just as Wikipedia happened. So don’t turn away—believe that Content Mining matters—matters massively.”
Michael W. Carroll, professor of law at American University’s Washington College of Law, asserts in the LIBLICENSE listserv that “in the United States text mining is a user’s right not a copyright owner’s right. When a library signs an agreement denying users the right to bulk download for the purpose of text mining, the library is giving up a user’s right in exchange for access to the publisher’s database of articles. … Elsevier may have the rights to control text mining in some European countries, but this announcement still means that in the U.S. Elsevier wants to control computational research in ways that go beyond its rights as a copyright owner.”
Carroll goes on to note that “the Google Books decision provides support because Google created a digital archive of publisher’s works for the purpose of making them searchable (and to enable text mining). The court held that Google’s creation of this archive and its continued retention of it was necessary to the beneficial purposes of providing search and text mining. Google’s keeping the archive after it created its index did not affect the publishers’ economic interests in exploiting the copyrighted works and therefore is a fair use. Although the purpose of the text mining researcher and Google are somewhat different, they both can articulate a socially beneficial reason for keeping a private archived copy of the publishers’ works and their doing so does not interfere with the publishers’ ability to economically exploit the works.”
Acting to Open the Gates Much Farther
Last December, the Public Library of Science (PLOS) announced more extensive requirements for making the data underlying its published research available. “PLOS is now releasing a revised Data Policy that will come into effect on March 1, 2014, in which authors will be required to include a data availability statement in all research articles published by PLOS journals. … PLOS journals require authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception. … Refusal to share data and related metadata and methods in accordance with this policy will be grounds for rejection.” Acceptable methods for making data available were outlined in the announcement.
As Emma Ganley and Jonathan Eisen noted on the PLOS blog, “Access to data facilitates reproducibility and testing of a [paper’s] conclusions and methods and also enables new discoveries to be made without the expense of redoing the experiments. We believe that the more open we all are about open data, the more we discuss the benefits and challenges, and the more we shift the bar towards openness, the better off all of science will be.” Taken together with the new data management requirements from the National Institutes of Health (NIH) and the National Science Foundation (NSF), such policies suggest that American researchers are clearly positioning themselves for an OA future.
In 2011, Ian Hargreaves’ report, “Digital Opportunity: A Review of Intellectual Property and Growth,” was completed at the request of the British prime minister due to “the risk that the current intellectual property framework might not be sufficiently well designed to promote innovation and growth in the UK economy.” The report, in sum, states, “We have found that the UK’s intellectual property framework, especially with regard to copyright, is falling behind what is needed. Copyright, once the exclusive concern of authors and their publishers, is today preventing medical researchers studying data and text in pursuit of new treatments.”
Murray-Rust sees resolution coming soon, at least for the United Kingdom: “In two months the UK parliament is expected to table and pass the Hargreaves recommendations for TDM, when we will be able legally to carry this out in UK. Since my institution subscribes to a large number of NPG journals which I have the right to read I expect to start mining them, without further negotiations and without your further permission, in the near future.” If you combine data mining’s proven successes in research and business applications with semantic analysis of texts and add in the emerging area of image mining as well, the ability to integrate and extract information from previously inaccessible materials opens exciting possibilities for new discoveries, treatments, and understanding. Today, the door is opening wider, with the promise of even greater access and advances yet to come.