Weekly News Digest

March 12, 2001 — In addition to this week's NewsBreaks article and the monthly NewsLink Spotlight, Information Today, Inc. (ITI) offers Weekly News Digests that feature recent product news and company announcements. Watch for additional coverage to appear in the next print issue of Information Today. For other up-to-the-minute news, check out ITI’s Twitter account: @ITINewsBreaks.

Reuters Releases Free Archive of Over 800,000 News Stories

Reuters, the global information, news, and technology group, is for the first time making large quantities of archived Reuters news stories available free of charge to research communities around the world. The first Reuters Corpus archive includes over 800,000 English-language news stories, equivalent to the annual global news output of Reuters.

The Reuters Corpus offers researchers a unique body of static information upon which to research, test, and benchmark emerging technologies. These include research into language processing, speech synthesis, voice recognition, indexation, search, and information retrieval.

The growth of the Internet has led to an explosion in the information services available to businesses and consumers. Additionally, improvements in bandwidth have increased the variety of channels and devices used to deliver and access information. Consequently, research into technologies that help businesses and individuals improve the way they access, search, and manipulate information has assumed even greater significance. According to the announcement, the availability of the Reuters Corpus assists organizations conducting this research.

Richard Willis, head of research and standards for the Reuters Chief Technology Office, said: "Reuters has always been heavily involved in language and data research, and to strengthen our links with the research community around the world, we have made available one of the most complete news archives ever released. The data provided will aid research into many aspects of language processing and information retrieval."

The archive includes all English-language stories produced by Reuters globally between August 20, 1996, and August 19, 1997. The news data is available on two CD-ROMs and is formatted in XML to make it easier to use as a research tool. All the news stories are fully referenced using a total of 775 different category codes for topic, geography, and industry sector.

Marc Moens, head of Edinburgh University's Language Technology Group, said: "Because of its size and the amount of preparation that has gone into it, the Reuters collection provides scope for many new types of research and development work. It allows for the systematic evaluation of progress and comparison of results between different development groups. I am sure this Corpus will soon be seen as a standard in document-related work."

Yorick Wilks, a professor at Sheffield University, said: "We can already see the potential benefits of such a corpus for stylistic language analysis. The topic codes would also give us the opportunity to analyze the geographic location, industry area, or topic that received news coverage from Reuters. Areas such as semantic Web applications, categorization research, and machine learning of topic routings would also benefit. This will be a very useful resource."

As part of the research agreement covering use of the archive, researchers will supply Reuters with a copy of any material published using the data. Working with this feedback from research groups, Reuters hopes to bring out other corpora, including multilingual versions and volumes covering other date ranges. Further information on the corpus is available at

Source: Reuters

Mondosoft Unveils MondoSearch 4.1

Mondosoft has announced the release of MondoSearch version 4.1, a world-class site-search engine for large Web sites and intranets. Unveiled at the Internet World Spring 2001 conference in Los Angeles, MondoSearch 4.1 also provides significant new tools for Web site administrators.

According to the announcement, with MondoSearch 4.1, managers of large Internet and intranet sites can extend to site visitors a powerful search engine technology that enables them to see search results in relevant, pre-defined categories and in 12 languages.

Also available as an optional addition to MondoSearch 4.1 is a robust tracking module that enhances the power of the MondoSearch InSite management tool. The new module enables Webmasters to determine specifically what information users are looking for when they come to a site, what search terms they employ, and the degree of success the users achieve in their searches.

"We all know that many Web sites fail simply because users cannot easily find relevant information," said Laust Sondergaard, president and CEO of Mondosoft. "MondoSearch helps Webmasters make their sites 'sticky.' With this tracking tool, we will help unleash the raw power that has been hidden away in troves of information on intranets and large corporate or organizational Web sites."

MondoSearch 4.1 includes the following features, among others:

  • Full support for intranets
  • Indexing and searching of Adobe PDF files
  • Indexing and searching of Microsoft Office files
  • Split-pages functionality that brings a person directly to the part of the document in which his or her search term is located
  • A tracking module that provides Web-site owners with information such as what terms users searched for and whether their searches were successful

Like earlier versions of the software, MondoSearch 4.1 is easy to install and configure, an operation that usually takes only a few minutes, even for the most complex Web sites. MondoSearch is a multilingual site-search engine that presents results in customizable categories by relevance, language, and file-data type, complete with icons identifying the media/format and language of the file. MondoSearch also eliminates duplicate information, even when it is served from differently named servers. For frames-based sites, MondoSearch stores the correct frames layout of each page.

MondoSearch 4.1 will be available in April through Mondosoft and through authorized Mondosoft application service providers, business partners, and resellers. Pricing depends on the number of pages on the site, starting at about $150 per month for up to 200 pages for hosted applications and beginning at about $800 for an individual site license.

Source: Mondosoft

CRIBIS Announces New Content Integration Into SkyMinder

CRIBIS Corp. has announced the ability to access, through SkyMinder, content from the Investext, MarkIntel, and Industry Insider collections provided by Thomson Financial. These brokerage, analyst, and industry reports, the majority of which SkyMinder users can purchase by individual page as their needs require, can be used for financial projections, market analysis, business intelligence, and evaluation of industry performance.

SkyMinder is a corporate information and decision-support tool that provides in-depth information on more than 31 million public and private companies in more than 200 countries. SkyMinder offers company information, company financial data, industry information, executive profiles, news, market research, tables, and credit information from leading sources, including Thomson Financial, Jordans, Integra Information, Hoover's, Graham & Whiteside, Dun & Bradstreet, COMTEX, and more.

In addition to the Investext, MarkIntel, and Industry Insider reports from Thomson Financial, SkyMinder recently integrated Dun & Bradstreet's (D&B) Million Dollar Database Plus, which allows information professionals to retrieve detailed information on 481,000 U.S. and Canadian businesses. The database is composed of companies that have more than $3 million in sales or more than 45 total employees or branches with more than 50 employees. The company information includes eight-digit SIC codes, number of employees, annual sales figures, location types, principal executives and their biographies, and a company's banking and accounting firms when available.

"In order to maintain SkyMinder's record of continuous improvement and to expand our service offering, we chose to integrate the very comprehensive Thomson Financial content for its authoritative financial data and quality business research on companies, industries, and markets worldwide," said Carlo Gherardi, chief operating officer for CRIBIS. "This addition, along with the integration of the new D&B database, which is well-known for its reliability, allows users to access the best company and industry information possible."

In addition to the millions of online reports available through SkyMinder, its users can order custom off-line reports through the CRIBIS Consultancy Service. According to the announcement, information professionals will be interested in the ability to request a "fresh investigation" for any company anywhere in the world. For a free trial, register at the SkyMinder Web site.

Source: CRIBIS Corp.

Send correspondence concerning the Weekly News Digest to NewsBreaks Editor Brandi Scardilli