Title | Biomedical named entity extraction: some issues of corpus compatibilities |
---|---|
Published in | SpringerPlus, November 2013 |
DOI | 10.1186/2193-1801-2-601 |
Pubmed ID | |
Authors | Asif Ekbal, Sriparna Saha, Utpal Kumar Sikdar |
Abstract | Named Entity (NE) extraction is one of the most fundamental and important tasks in biomedical information extraction. It involves identifying certain entities in text and classifying them into predefined categories. In the biomedical community there is as yet no general consensus on NE annotation, so it is very difficult to compare existing systems because of corpus incompatibilities. For the same reason, we also cannot exploit the advantages of using different corpora together. In our present work we address the issues of corpus compatibility and use a single objective optimization (SOO) based classifier ensemble technique that uses the search capability of a genetic algorithm (GA) for NE extraction in biomedicine. We hypothesize that the reliability of each classifier's predictions differs among the various output classes. We use Conditional Random Field (CRF) and Support Vector Machine (SVM) frameworks to build a number of models depending upon the various representations of the set of features and/or feature templates. Note that we tried to extract the features without using any deep domain knowledge and/or resources. |
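
The abstract's hypothesis, that each classifier is more reliable for some output classes than for others, can be illustrated with per-class weighted voting. The sketch below is only an illustration and not the authors' implementation. The function name `weighted_vote`, the labels, and the weight values are all hypothetical. In the paper the combination is searched by a genetic algorithm, whereas here the weights are simply written by hand.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Combine the labels that several classifiers assign to one token.

    predictions: list of labels, one per classifier.
    weights: list of dicts; weights[m][label] is how much classifier m
             is trusted when it predicts that label (hypothetical values).
    """
    scores = defaultdict(float)
    for m, label in enumerate(predictions):
        # A classifier's vote counts for as much as its weight for that class.
        scores[label] += weights[m].get(label, 0.0)
    # Iterate labels in sorted order so ties are broken deterministically.
    return max(sorted(scores), key=lambda label: scores[label])


# Hypothetical per-class weights for three models (assumed, not from the paper).
weights = [
    {"B-protein": 0.9, "I-protein": 0.6, "O": 0.5},  # e.g. a CRF model
    {"B-protein": 0.4, "I-protein": 0.8, "O": 0.7},  # e.g. an SVM model
    {"B-protein": 0.5, "I-protein": 0.5, "O": 0.6},  # e.g. another SVM model
]

# Two moderately trusted "O" votes outweigh one strongly trusted "B-protein" vote.
print(weighted_vote(["B-protein", "O", "O"], weights))
```

In an optimization-based ensemble of the kind the abstract describes, a search procedure would tune such weights (or binary include/exclude decisions) against a development set instead of fixing them manually.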
Mendeley readers
Geographical breakdown
Country | Count | As % |
---|---|---|
Switzerland | 1 | 5% |
Unknown | 19 | 95% |
Demographic breakdown
Readers by professional status | Count | As % |
---|---|---|
Student > Ph. D. Student | 5 | 25% |
Student > Master | 3 | 15% |
Lecturer | 2 | 10% |
Student > Bachelor | 2 | 10% |
Student > Postgraduate | 2 | 10% |
Other | 5 | 25% |
Unknown | 1 | 5% |
Readers by discipline | Count | As % |
---|---|---|
Computer Science | 8 | 40% |
Social Sciences | 3 | 15% |
Medicine and Dentistry | 2 | 10% |
Business, Management and Accounting | 1 | 5% |
Nursing and Health Professions | 1 | 5% |
Other | 3 | 15% |
Unknown | 2 | 10% |