Weekly News Digest
June 5, 2014 — In addition to this week's NewsBreaks article and the monthly NewsLink Spotlight, Information Today, Inc. (ITI) offers Weekly News Digests that feature recent product news and company announcements. Watch for additional coverage to appear in the next print issue of Information Today. For other up-to-the-minute news, check out ITI's Twitter account: @ITINewsBreaks.
HathiTrust Dataset Analyzes Page-Level Features
The HathiTrust Research Center (HTRC) released the alpha version of a new dataset of page-level features (notable or informative text characteristics) extracted from HathiTrust’s original, scanned representations of public domain volumes.
Extracted features include counts of each term by part of speech, term-frequency counts, and line and sentence counts for each page of text. The dataset covers more than 67 million pages in total. Each page is broken into header, body, and footer sections so the features can be analyzed at scale.
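To make the description above concrete, here is a minimal Python sketch of what one page's features might look like and how they could be aggregated. The field names (`line_count`, `sentence_count`, `tokens`) and the sample values are assumptions made for illustration only. They are not the HTRC's actual schema, which the announcement does not specify.

```python
from collections import Counter

# Hypothetical record for a single page. The layout and field names are
# illustrative assumptions, not the HTRC file format.
page = {
    "header": {
        "line_count": 1,
        "sentence_count": 1,
        "tokens": {"CHAPTER": {"NN": 1}, "I": {"CD": 1}},
    },
    "body": {
        "line_count": 3,
        "sentence_count": 2,
        "tokens": {
            "whale": {"NN": 2},
            "the": {"DT": 3},
            "sails": {"VBZ": 1, "NNS": 1},
        },
    },
    "footer": {
        "line_count": 1,
        "sentence_count": 0,
        "tokens": {"12": {"CD": 1}},
    },
}

SECTIONS = ("header", "body", "footer")


def term_frequencies(page, sections=SECTIONS):
    """Sum each term's counts over its part-of-speech tags and sections."""
    counts = Counter()
    for name in sections:
        for term, pos_counts in page[name]["tokens"].items():
            counts[term] += sum(pos_counts.values())
    return counts


def total_lines(page, sections=SECTIONS):
    """Add up the line counts of the chosen sections."""
    return sum(page[name]["line_count"] for name in sections)


body_only = term_frequencies(page, sections=("body",))
whole_page = term_frequencies(page)
```

Keeping the sections separate lets a researcher restrict an analysis to body text, for example to leave running headers and page numbers out of term counts.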
The HTRC welcomes feedback on how the dataset can help researchers.
Send correspondence concerning the Weekly News Digest to NewsBreaks Editor