Text and data mining: Together at last!

CINF 16

Anthony J. Trippe, atrippe@cas.org, Science IP/Chemical Abstracts Service, 2540 Olentangy River Rd., Columbus, OH 43210
Many techniques and tools have long been available to information professionals for statistical analysis of fielded (structured) data. Lately, there has been an increased focus on the analysis of textual (unstructured) data. Traditionally, these two forms of analysis have been conducted separately, and in general their respective strengths could not be combined. New software now allows the application of rigorous data mining tools, e.g., data grouping and clean-up, to the creation of bar charts and 2-D matrix charts from fielded data. It also allows the use of text mining elements, including data harmonization, for the creation of concept clusters and maps from unstructured data. Output from both is linked and dynamically interactive. A brief discussion of the software's capabilities will be followed by a case study on how the marriage of text and data mining supports strategic business research by providing rapid, insightful analyses.