Computer-based searching has become the central component not just of computer processing, but of the dominant information platforms of the 21st century. Algorithms, raw computing power, speed optimization, and other technologies are making it easier to deal with the huge amounts of internet-based information from libraries, governments, and other organizations, much of which is now web-only. The internet has also become our primary communications medium.
“Every minute, there are an estimated 590,278 Tinder swipes, 694 Uber rides, and 4,166,667 Facebook likes,” notes IFLScience. Even with the many techniques being designed to provide adequate access to these stores of information at reasonable speed, we are in the midst of a major shift: from silicon to genomics and biological computing. In the past year, major advances have been made that might seem like something from science fiction and yet hold promise for taking computing to a whole new level of performance and value.
The Limits of Silicon and Moore’s Law
For many years now, the computer industry has relied on an observation made by Intel co-founder Gordon Moore in 1965 concerning the future potential for computing power in a silicon-based hardware environment. Moore’s Law states that the number of transistors in a dense integrated circuit (such as a central processing unit, or CPU) would double every 2 years as newer advances in computer science allowed for greater processing power. As Intel puts it, “The insight, known as Moore’s Law, became the golden rule for the electronics industry, and a springboard for innovation. As a co-founder, Gordon paved the path for Intel to make the ever faster, smaller, more affordable transistors that drive our modern tools and toys. Even 50 years later, the lasting impact and benefits are felt in many ways.”
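Moore’s observation is, at bottom, simple doubling arithmetic. The sketch below projects transistor counts under the 2-year doubling rule; the 1971 Intel 4004 figure of roughly 2,300 transistors is used only as an illustrative starting point, and the numbers are projections, not actual chip counts.

```python
def transistors(year, base_year=1971, base_count=2300, doubling_years=2):
    """Project transistor counts under Moore's Law doubling.

    base_count: the Intel 4004 (1971) shipped with roughly 2,300
    transistors; used here only as an illustrative baseline.
    """
    doublings = (year - base_year) / doubling_years
    return base_count * 2 ** doublings

# Doubling every 2 years implies 5 doublings per decade, i.e. a 32x jump.
print(round(transistors(1981) / transistors(1971)))  # → 32
```

The exponential in that last line is the whole story: steady doubling compounds into a 1,000-fold increase every two decades, which is why the eventual flattening of the curve matters so much.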
However, physical technology has its limits, and as a 2016 MIT Technology Review article titled “Moore’s Law Is Dead. Now What?” says, “[I]n a few years technology companies may have to work harder to bring us advanced new use cases for computers.” Given the growth of the internet as a platform for publishing, archiving, and entertainment, new approaches beyond silicon are clearly needed. Until now, SEO has been a matter of incremental changes and improvements to systems as a way to increase speed and accuracy. As even Google notes, SEO “is about putting your site’s best foot forward when it comes to visibility in search engines, but your ultimate consumers are your users, not search engines.”
University of Michigan’s Verdict Takes On Algorithms
All search systems today rely heavily on algorithms, which are used, in effect, as surrogates for human mental processing. However, as an article in the Harvard Business Review says, “[A]lgorithms often can predict the future with great accuracy but tell you neither what will cause an event nor why. An algorithm can read through every New York Times article and tell you which is most likely to be shared on Twitter without necessarily explaining why people will be moved to tweet about it. An algorithm can tell you which employees are most likely to succeed without identifying which attributes are most important for success.” Algorithms process, identify, and understand only what they are explicitly instructed to handle, yet anyone, anywhere, at any time can be searching for information, with knowledge and expectations shifting constantly. Given these perhaps natural limitations of current programming techniques, software cannot yet afford the kind of certainty that is needed today.
One problem resulting from this reality is that algorithms are not able to think, reason, or develop new understandings as people are able to do. Additionally, there are clearly serious social problems inherent in many of the systems that exist. Harvard University professor Latanya Sweeney found that web searches for names common in African-American communities tended to return ads for services related to being arrested, such as bail bondsmen. The study led the Atlanta Daily World to note that these practices were “perpetuating the myth of the Black criminal. In other words: Which came first, the racist chicken or the racist egg?”
XPRIZE Foundation’s Peter Diamandis predicts that “advances in quantum computing and the rapid evolution of AI [artificial intelligence] and AI agents embedded in systems and devices in the Internet of Things will lead to hyper-stalking, influencing and shaping of voters, and hyper-personalized ads, and will create new ways to misrepresent reality and perpetuate falsehoods.”
Researchers at the University of Michigan developed a promising response to these issues with software they call Verdict, which “enables existing databases to learn from each query a user submits, finding accurate answers without trawling through the same data again and again.” For routine queries, the system can return fast approximate answers that save time and energy; for complex research questions, “Verdict allows databases to deliver answers more than 200 times faster while maintaining 99 percent accuracy. In a research environment, that could mean getting answers in seconds instead of hours or days.” It is among the first of many learning-based database systems being worked on across the globe.
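The core idea that Mozafari describes below, i.e. not throwing away the work of earlier queries, can be illustrated with a toy sketch. This is not Verdict’s actual algorithm (which uses statistical inference over past answers), only the reuse principle; the class and column names are hypothetical.

```python
# Toy illustration of "database learning": remember the results of past
# aggregate queries so later identical ones need not rescan the table.
class LearningAggregator:
    def __init__(self, rows):
        self.rows = rows          # the underlying table (list of dicts)
        self.cache = {}           # past query -> remembered answer
        self.scans = 0            # count of full scans actually performed

    def avg(self, column):
        if column in self.cache:  # reuse work from a previous query
            return self.cache[column]
        self.scans += 1           # otherwise pay for a full scan
        values = [row[column] for row in self.rows]
        answer = sum(values) / len(values)
        self.cache[column] = answer
        return answer

db = LearningAggregator([{"price": 10}, {"price": 20}, {"price": 30}])
print(db.avg("price"))  # 20.0, computed with one full scan
print(db.avg("price"))  # 20.0 again, answered without touching the data
print(db.scans)         # 1
```

A conventional database, in Mozafari’s framing, would perform the scan both times; the speedups Verdict reports come from generalizing this reuse to new, previously unseen queries rather than exact repeats.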
Barzan Mozafari, one of the designers of Verdict, says, “Databases have been following the same paradigm for the past 40 years. You submit a query, it does some work and provides an answer. When a new query comes in, it starts over. All the work from previous queries is wasted.”
The paper “Database Meets Deep Learning: Challenges and Opportunities” notes, “The foundation of deep learning was established about twenty years ago in the form of neural networks. Its recent resurgence is mainly fueled by three factors: immense computing power, which reduces the time to train and deploy new models … massive (labeled) training datasets [that] enable a more comprehensive knowledge of the domain to be acquired; [and] new deep learning models [that] improve the ability to capture data regularities.”
However, according to a recent article in Nature, “[T]his field is still in its infancy,” and it will be some time before we see it integrated into scientific, let alone commercial, databases. Still, this is a major step forward for the future of 21st-century computing.
Way back in 2003, I was asked to write about the potential futures for computers in the 21st century (“Nanotechnology, Genomics, and Biological Computing: The 21st Century World of Computers,” ONLINE 27: 26–31). I predicted that “the powerful combination of genomics and nanotechnology—along with other discoveries—is creating a new type of computer—one based on living organisms.” At the time, the news media was shocked at this prediction. But as we have learned, nothing is impossible.
In the current issue of The Scientist, Catherine Offord describes the development of DNA hard drives in which a “few kilograms of DNA could theoretically store all of humanity’s data.” As Mike Murphy writes on Quartz, “There are one billion gigabytes in an exabyte, and 1,000 exabytes in a zettabyte. The cloud computing company EMC estimated that there were 1.8 zettabytes of data in the world in 2011, which means we would need only about 4 grams (about a teaspoon) of DNA to hold everything from Plato through the complete works of Shakespeare to Beyoncé’s latest album (not to mention every brunch photo ever posted on Instagram).” Many issues need to be addressed, including data errors, random access retrieval, and the stability of the DNA molecules themselves. However, as we continue to make more and more information available, keeping up with it (and searching it) will require what now appear to be extraordinary solutions.
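Murphy’s figures can be sanity-checked with back-of-envelope arithmetic. The snippet below simply works out the storage density his numbers imply; it uses only the quantities quoted above.

```python
# Back-of-envelope check of the Quartz figures quoted above.
GB_PER_EXABYTE = 1_000_000_000      # one billion gigabytes per exabyte
EXABYTES_PER_ZETTABYTE = 1_000      # 1,000 exabytes per zettabyte

world_data_zb = 1.8                 # EMC's 2011 estimate, in zettabytes
dna_grams = 4                       # mass of DNA claimed to hold it all

world_data_eb = world_data_zb * EXABYTES_PER_ZETTABYTE
density_eb_per_gram = world_data_eb / dna_grams
print(density_eb_per_gram)          # 450.0 exabytes per gram
```

An implied density of hundreds of exabytes per gram is many orders of magnitude beyond any silicon medium, which is why a teaspoon of DNA suffices in Murphy’s illustration.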
In 2012, Stanford University scientists reported in the Proceedings of the National Academy of Sciences that they had programmed DNA to work as a rewritable biological data storage system. Drew Endy, the key researcher in the project, told the BBC, “I’m not even really concerned with the ways genetic data storage might be useful down the road, only in creating scalable and reliable biological bits as soon as possible. Then we’ll put them in the hands of other scientists to show the world how they might be used.”
Singularity Hub sees three major challenges ahead in the development of DNA-based information storage:
- Develop new and better ways of translating digital information into biological information; ways that enable fast, accurate and cost-efficient retrieval of information.
- Invent and advance new chemistries to enable cheap DNA synthesis.
- Incorporate more automation in production workflows to achieve cost reductions.
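The first of these challenges, translating digital into biological information, is often introduced with a simple mapping from pairs of bits to the four DNA bases. The sketch below is illustrative only and is not the encoding used by any particular research group; real schemes add error correction and avoid long runs of the same base, which are hard to synthesize and sequence accurately.

```python
# Two-bits-per-base mapping: 00->A, 01->C, 10->G, 11->T (illustrative only).
BIT_PAIR_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BIT_PAIR = {base: bits for bits, base in BIT_PAIR_TO_BASE.items()}

def encode(data: bytes) -> str:
    """Translate bytes into a DNA base sequence."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BIT_PAIR_TO_BASE[bits[i:i + 2]]
                   for i in range(0, len(bits), 2))

def decode(strand: str) -> bytes:
    """Translate a DNA base sequence back into bytes."""
    bits = "".join(BASE_TO_BIT_PAIR[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

strand = encode(b"Hi")
print(strand)          # CAGACGGC
print(decode(strand))  # b'Hi'
```

Each base carries 2 bits, so a byte needs only four bases; the engineering problems in the list above are about doing this writing (synthesis) and reading (sequencing) cheaply, accurately, and automatically at scale.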
Offord notes in The Scientist, “Whatever the future of DNA in these more complex technologies, such projects are a testament to the perceived potential of molecular data storage—and an indicator of just how much the field has progressed in a very short period of time.”
Clearly, computer technologies are moving quickly, but not as fast as the growth of information and the ubiquity of the internet. Nevertheless, these efforts show that the future is bright for quality searching across all forms of content. For information professionals grappling with the increasing mass of content now available online, systems may emerge in the near future that allow the long-hoped-for “information superhighway” to work at truly highway speeds, connecting the world’s information as never before.