Authors: Mengjie Zhang, Xiaoying Gao, Minh Duc Cao
Source: GZipped PostScript (54kb); Adobe PDF (154kb)
This paper describes an approach that uses neural networks to improve the classification of scientific papers. Starting from the initial class labels produced by a content-based Naive Bayes classifier, the approach uses neural networks to model the citation link structure of the papers and refine those labels. The approach is evaluated against the Naive Bayes method on a standard paper classification data set with increasing training set sizes. The results suggest that, by using citation link structure, neural networks can significantly improve performance over the content-based Naive Bayes method for all training set sizes.
Keywords: Document classification, content-based classification, citation links
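
The two-stage idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the network architecture or the link features, so the code below assumes a single softmax layer as a stand-in for the neural network, and assumes each paper is described by its own Naive Bayes posterior concatenated with the mean posterior of the papers it is linked to. The toy corpus, the `links` dictionary, and all function names are hypothetical.

```python
import math
from collections import Counter

def train_nb(docs, labels, classes):
    """Multinomial Naive Bayes with add-one smoothing; returns a posterior function."""
    vocab = {w for d in docs for w in d}
    prior = Counter(labels)
    counts = {c: Counter() for c in classes}
    for d, y in zip(docs, labels):
        counts[y].update(d)

    def posterior(doc):
        logp = []
        for c in classes:
            total = sum(counts[c].values()) + len(vocab)
            lp = math.log((prior[c] + 1) / (len(labels) + len(classes)))
            lp += sum(math.log((counts[c][w] + 1) / total) for w in doc if w in vocab)
            logp.append(lp)
        m = max(logp)
        exps = [math.exp(v - m) for v in logp]
        return [e / sum(exps) for e in exps]

    return posterior

def link_features(i, post, links):
    """Assumed feature layout: own NB posterior + mean posterior of linked papers."""
    nbrs = links.get(i, [])
    k = len(post[i])
    if nbrs:
        mean = [sum(post[j][c] for j in nbrs) / len(nbrs) for c in range(k)]
    else:
        mean = [1.0 / k] * k
    return post[i] + mean

def scores(W, x):
    xb = x + [1.0]  # append a bias input
    return [sum(w * v for w, v in zip(row, xb)) for row in W]

def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    return [e / sum(exps) for e in exps]

def train_net(X, y, n_classes, epochs=200, lr=0.5):
    """Single softmax layer trained by per-sample gradient descent (a simplification)."""
    W = [[0.0] * (len(X[0]) + 1) for _ in range(n_classes)]
    for _ in range(epochs):
        for x, t in zip(X, y):
            p = softmax(scores(W, x))
            xb = x + [1.0]
            for c in range(n_classes):
                g = p[c] - (1.0 if c == t else 0.0)
                for j in range(len(xb)):
                    W[c][j] -= lr * g * xb[j]
    return W

def predict(W, x):
    s = scores(W, x)
    return s.index(max(s))

# Hypothetical toy corpus: papers 0-3 are labelled, papers 4-5 are unlabelled.
docs = [
    ["neural", "network", "learning"],
    ["network", "training", "neural"],
    ["database", "query", "index"],
    ["query", "sql", "database"],
    ["learning", "index"],   # content alone is ambiguous
    ["training", "query"],
]
train_ids, test_ids = [0, 1, 2, 3], [4, 5]
train_labels = [0, 0, 1, 1]
links = {0: [1], 1: [0], 2: [3], 3: [2], 4: [0, 1], 5: [2, 3]}

# Stage 1: content-based Naive Bayes gives initial class posteriors.
nb = train_nb([docs[i] for i in train_ids], train_labels, classes=[0, 1])
post = [nb(d) for d in docs]

# Stage 2: a small network refines the labels using citation-link features.
features = [link_features(i, post, links) for i in range(len(docs))]
W = train_net([features[i] for i in train_ids], train_labels, n_classes=2)
refined = [predict(W, features[i]) for i in test_ids]
print(refined)
```

In this toy setup, paper 4 contains one word from each class, so its content gives no clear signal and the refinement stage has to rely on the papers it cites. A real system would need a proper data set, held-out evaluation, and likely a richer network than the single layer assumed here.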