Anurag Acharya Helped Google’s Scholarly Leap  
by: Francis C. Assisi

03 January 2005 -- As if you weren’t spending enough time Googling, the world’s favorite search engine now offers another reason to loiter there: a tool aimed at scientists, scholars, students and other researchers, helping them dig through millions of pages of hefty information.

Thanks to principal engineer Dr. Anurag Acharya, the new tool, called Google Scholar, is a version of the company’s popular search service but limits its results to "scholarly literature such as peer-reviewed papers, theses, books, preprints, abstracts, and technical reports." The new search tool’s goal is to offer the most comprehensive list of research papers available on the Web.

Announcing the launch of Google Scholar’s experimental or beta version on November 18, 2004, Acharya expressed the hope that it would become a first stop for researchers looking for scholarly literature like peer-reviewed papers, books, abstracts and technical reports. Acharya, who started the Google Scholar project, said his motivation, in part, had been a desire to help the academic community from which Google emerged.

To begin with, Acharya had to make his software identify and gather scientific papers from around the Web, using simple rules based on the common format of scientific papers, and then extract the title, abstract, authors and references. "I started the project because I wanted to build something better for researchers," says Acharya. Building automated citation indexes was new to him, but his background in designing large-scale distributed computing systems helped with scaling up. "Extracting information and references was the hard part," he says. "Building an index, making it run fast, and stable, that was easy; I already know how to do all that."
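As an illustration only, rule-based extraction of this kind might be sketched as follows. This is not Google's code: the function name, the assumed layout (title on the first line, authors on the second, an "Abstract" heading, and a "References" heading followed by entries numbered like "[1]") and the sample paper are all hypothetical choices made for the example.

```python
import re

def extract_paper_fields(text):
    """Extract title, authors, abstract and references from plain text.

    Assumes a hypothetical layout: line 1 is the title, line 2 lists the
    authors, an 'Abstract' heading precedes the abstract, and a
    'References' heading precedes entries numbered like '[1] ...'.
    """
    nonempty = [ln.strip() for ln in text.splitlines() if ln.strip()]
    title = nonempty[0] if nonempty else ""

    # Authors: split the second non-empty line on commas or the word "and".
    authors = []
    if len(nonempty) > 1:
        authors = [a.strip() for a in re.split(r",|\s+and\s+", nonempty[1]) if a.strip()]

    # Abstract: everything after the 'Abstract' heading up to the next blank line.
    flags = re.I | re.M | re.S
    m = re.search(r"^abstract[ \t]*\n(.*?)(?=\n[ \t]*\n|\Z)", text, flags)
    abstract = " ".join(m.group(1).split()) if m else ""

    # References: numbered entries after the 'References' heading.
    m = re.search(r"^references[ \t]*\n(.*)\Z", text, flags)
    references = re.findall(r"^\[\d+\][ \t]*(.+)$", m.group(1), re.M) if m else []

    return {"title": title, "authors": authors,
            "abstract": abstract, "references": references}

# A made-up paper used only to exercise the rules above.
SAMPLE = """A Study of Widgets
Jane Doe, John Roe

Abstract
Widgets are useful.
We measure them.

1. Introduction
Text here.

References
[1] A. Author. Widgets. 1999.
[2] B. Writer. Gadgets. 2001.
"""

fields = extract_paper_fields(SAMPLE)
print(fields["title"], fields["authors"], len(fields["references"]))
```

Real papers vary far more than this sample does, which is consistent with Acharya's remark that extracting information and references was the hard part.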

Acharya has stated that the goal of the service is to “make it easier to find content, open access or not. The first step in any research is to find the information you need to learn and then build on that. Not being able to find information hinders scholarly endeavor.”

That is why Acharya’s search algorithm gathers articles, reports, and other documents from publishers, universities, professional societies, and medical databases such as PubMed. Observing that much of scholarly research is learning what others have discovered and building on it, Acharya added: “We at Google have benefited much from academic research. This is one of the ways in which we are giving back to the research community. We hope Google Scholar will help all of us stand on the shoulders of giants.”

Google officials say they have the cooperation of a broad range of academic publishers, library groups, scholarly societies, and colleges, while academics are looking at the search engine giant’s new service as a welcome addition to their research repertoire. Almost all top scholarly publishers have agreed to let Google index their sites, says Acharya, including the publishers of Science and Nature. And because the site is in beta, it is likely that other additions and changes will be made as scholars use the service. Google has requested that users send in suggestions, questions and comments.

Since its launch in September 1998, search engine Google has chugged past well-funded giants in the field and into the hearts of Internet information seekers worldwide. The brainchild of two Stanford University doctoral students, Google has evolved into a kind of ultimate search tool, currently indexing over 8 billion pages. And the proudly geekish company has quietly built its search engine into the world’s most popular and trusted Web site, according to research company Jupiter Media Metrix.

Today, most web users agree that search engine Google has become an invaluable asset in navigating through the vast ocean of material on the Web. And just as with Google Web Search, Google Scholar orders search results by how relevant they are to your query, so the most useful references appear at the top of the page. This relevance ranking takes into account the full text of each article as well as the article’s author, the publication in which the article appeared and how often it has been cited in scholarly literature.
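How the four signals named above might be combined can be shown with a toy example. Google has not published its formula, so everything here is an assumption made for illustration: the `Article` fields, the `venue_weight` score for a publication, the numeric weights and the use of a logarithm to dampen raw citation counts.

```python
import math
from dataclasses import dataclass

@dataclass
class Article:
    title: str
    full_text: str
    authors: list
    venue_weight: float  # hypothetical 0..1 score for the publication
    citations: int       # how often the article has been cited

def score(article, query):
    """Combine text, author, publication and citation signals (toy weights)."""
    terms = query.lower().split()
    words = article.full_text.lower().split()
    # Fraction of words in the full text that match a query term.
    text_match = sum(words.count(t) for t in terms) / max(len(words), 1)
    # 1.0 if any query term appears in any author name.
    author_match = 1.0 if any(t in a.lower() for a in article.authors for t in terms) else 0.0
    # log1p keeps heavily cited papers from swamping the other signals.
    citation_signal = math.log1p(article.citations)
    return (3.0 * text_match + 1.0 * author_match
            + 1.0 * article.venue_weight + 0.5 * citation_signal)

def rank(articles, query):
    """Return articles ordered from most to least relevant."""
    return sorted(articles, key=lambda a: score(a, query), reverse=True)
```

With two articles whose text, authors and publication are identical, the one cited more often comes first under this sketch.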

Google Scholar also automatically analyzes and extracts citations and presents them as separate results, even if the documents they refer to aren’t online. This means search results may include citations of older works and seminal articles that appear only in books or other offline publications.

To rank the results, Google Scholar applies the same criteria that scientists use when deciding which papers to read, says Acharya, including the importance of the journal and how often the work has been cited. Although the tool obtains abstracts for most articles, one will need a subscription to download the full text of some publications. Acharya says upcoming features will include limiting searches by date.

According to Acharya, a former faculty member at UCSB, the project was also an effort to address a problem he confronted while studying for his B.Tech at IIT Kharagpur. As a student he found the materials in his college library to be, at times, significantly out of date. Acharya, who earned his Ph.D. from Carnegie Mellon in 1997, expects Google Scholar will make the world’s scientific literature universally accessible.

Three years ago, principal scientist Krishna Bharat, who now leads Google’s research division in Bangalore, conceived the idea for Google News and helped set it up with his ‘Hilltop algorithm’. It enabled Google News to automatically arrange news so that the most relevant and up-to-date news is presented first.

Since September 2002, Google News has evolved into one of the largest and most up-to-date news services online, gathering content from more than 4,500 online news sources around the world, and then determining which stories are related and grouping them based on importance. Country-specific versions of the service are already available in the U.K., China, France, Australia, India, Germany and Canada.

What is the secret of Google’s spectacular success over the past six years? Says Krishna Bharat: "At Google we have a broad charter - to organize the world’s information and make it universally accessible and useful. The mission puts people before profits, so our focus is locked on to a stable target - meeting people’s expectations, rather than the whims of the marketplace."

He explains: "Google’s formula for success is our emphasis on personal creativity. Engineers devote time to pet projects that may bloom into the next big product. Experimentation is encouraged and people work very hard on things they really believe in. Good ideas find champions quickly. And our history is paved with fresh ideas and healthy partnerships. In fact, our labs are a playground for projects not yet deployed."

francisassisi@hotmail.com

