Journal of Pathology Informatics (Jan 2015)
Extraction and analysis of discrete synoptic pathology report data using R
Abstract
Background: Synoptic pathology reports can serve as a rich source of cancer information, particularly when the content is available as discrete electronic data fields. Our institution generates such reports as part of a province-wide program in Ontario, but the resulting data is not easily extracted and analyzed at the local level. Methods: A low-cost system was developed using the open-source, freely available R scripting and data analysis environment to parse synoptic report results into a data frame and compute basic summary statistics. Results: As a pilot project, text reports from 427 prostate needle biopsies were successfully read into R, and the data elements were split out and converted into appropriate data classes for analysis. Conclusion: This approach provides a simple solution at minimal cost that can make discrete synoptic report data readily available for quality assurance and research activities.
Keywords