Challenges in understanding clinical notes: Why NLP engines fall short and where background knowledge can help

Sujan Perera, Amit Sheth, Krishnaprasad Thirunarayan, Suhas Nair, Neil Shah

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Understanding Electronic Medical Records (EMRs) plays a crucial role in improving healthcare outcomes. However, the unstructured nature of EMRs poses several technical challenges for extracting structured information from clinical notes to enable automatic analysis. Although Natural Language Processing (NLP) techniques developed to process EMRs are effective for a variety of tasks, they often fail to preserve the semantics of the original information expressed in EMRs, particularly in complex scenarios. This paper illustrates the complexity of the problems involved, deals with conflicts created by the shortcomings of NLP techniques, and demonstrates where domain-specific knowledge bases can come to the rescue in resolving conflicts, which can significantly improve semantic annotation and structured information extraction. We discuss various insights gained from our study on a real-world dataset.

Original language: American English
Title of host publication: DARE '13
Subtitle of host publication: Proceedings of the 2013 international workshop on Data management & analytics for healthcare
Pages: 21-26
Number of pages: 6
State: Published - Nov 1 2013
Event: 2013 International Workshop on Data Management and Analytics for Healthcare - San Francisco, CA, United States
Duration: Nov 1 2013 to Nov 1 2013

Conference

Conference: 2013 International Workshop on Data Management and Analytics for Healthcare
Abbreviated title: DARE 2013
Country/Territory: United States
City: San Francisco, CA
Period: 11/1/13 to 11/1/13
Other: Co-located with the 22nd ACM International Conference on Information and Knowledge Management, CIKM 2013

ASJC Scopus Subject Areas

  • General Decision Sciences
  • General Business, Management and Accounting

Keywords

  • Knowledge base
  • Natural language processing
  • Negation detection

Disciplines

  • Bioinformatics
  • Communication
  • Communication Technology and New Media
  • Computer Sciences
  • Databases and Information Systems
  • Life Sciences
  • OS and Networks
  • Physical Sciences and Mathematics
  • Science and Technology Studies
