Collective Classification of Biomedical Articles using T-Cell Cross-regulation


Alaa Abi-Haidar and Luis M. Rocha

School of Informatics, Indiana University, 919 East Tenth Street, Bloomington IN 47408, USA
and
FLAD Computational Biology Collaboratorium, Instituto Gulbenkian de Ciencia, Portugal

Citation: A. Abi-Haidar and L.M. Rocha [2010]. "Collective Classification of Biomedical Articles using T-Cell Cross-regulation". In: Artificial Life XII: Twelfth International Conference on the Simulation and Synthesis of Living Systems. H. Fellermann et al. (Eds.). MIT Press, pp. 706-713.

The pre-print is available in Adobe Acrobat (.pdf) format only. Due to mathematical notation and graphics, only the abstract is presented here.

Abstract.

We continue our investigation of a bio-inspired solution for binary classification of textual documents based on T-cell cross-regulation in the vertebrate adaptive immune system, a complex adaptive system of millions of cells interacting to distinguish between self and nonself substances. By analogy, automatic document classification assumes that the interaction and co-occurrence of thousands of words in text can be used to identify conceptually related classes of documents: at a minimum, two classes with relevant and irrelevant documents for a given concept (e.g. articles with protein-protein interaction information). Our agent-based method for document classification expands the analytical model of Carneiro et al. [5] by allowing us to deal simultaneously with many distinct populations of antigen-specific T-cells and their collective dynamics. We have previously extended this model to produce a spam-detection system [2; 3]. We have also developed our agent-based model further to apply it to biomedical article classification [4], testing it on a dataset of biomedical articles provided by the BioCreative 2.5 challenge [17]. Here, we study the effect that the sequence of presentation of articles has on classification performance, as well as the robustness of the ensuing T-cell cross-regulation dynamics to initial biases in the proportions of effector and regulatory T-cells. We show that classification is improved when we preserve the original temporal order of biomedical articles, suggesting that our model is capable of tracking the natural conceptual drift of the relevant biomedical literature. We further show that initial biases in the proportions of T-cells are corrected by the dynamics of the model. Our results are useful for biomedical text mining, but they also help us understand T-cell cross-regulation as a potential general principle of classification available to collectives of molecules without a central controller.
While there is still much to know about the specifics of T-cell cross-regulation in adaptive immunity, Artificial Life allows us to explore alternative emergent classification principles while producing useful bio-inspired tools.

Keywords: Artificial Immune System, Collective Classification, Collective Behavior, Collective Computation, Bio-medical Document Classification, T-cell Cross-Regulation, Bio-inspired Computing, Artificial Intelligence.

For the full paper, please download the preprint in PDF.


For more information contact Luis Rocha at rocha@indiana.edu.
Last Modified: October 26, 2010