This paper describes the classification of Web documents using a graph-based model instead of the traditional vector-based model for document representation. We compare the classification accuracy of the vector-model approach using the k-nearest neighbor (k-NN) algorithm with that of a novel approach that allows graphs to be used for document representation in the k-NN algorithm. The proposed method is evaluated on three different Web document collections, using the leave-one-out approach to measure classification accuracy. The results show that the graph-based k-NN approach can outperform traditional vector-based k-NN methods in terms of both accuracy and execution time.
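As a rough illustration of the idea in the abstract, the following Python sketch runs k-NN over graphs and scores it with leave-one-out. The abstract does not specify how documents become graphs or how graph distance is measured, so everything here is an assumption made for illustration: the toy graph construction (unique terms as nodes, adjacent terms as directed edges), the distance based on the size of the common subgraph, and all function names.

```python
from collections import Counter


def text_to_graph(text):
    """Toy graph (assumption): nodes are unique terms, directed edges link adjacent terms."""
    terms = text.lower().split()
    return frozenset(terms), frozenset(zip(terms, terms[1:]))


def graph_size(graph):
    """Size of a graph, counted as number of nodes plus number of edges."""
    nodes, edges = graph
    return len(nodes) + len(edges)


def graph_distance(g1, g2):
    """Assumed distance: 1 - |common subgraph| / max(|g1|, |g2|).

    Because node labels are unique, the largest common subgraph is simply
    the intersection of the node sets and of the edge sets.
    """
    common = len(g1[0] & g2[0]) + len(g1[1] & g2[1])
    largest = max(graph_size(g1), graph_size(g2))
    return 0.0 if largest == 0 else 1.0 - common / largest


def knn_classify(query, training, k=3):
    """Majority vote among the k training graphs closest to the query graph."""
    nearest = sorted(training, key=lambda item: graph_distance(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(dataset, k=3):
    """Classify each graph using all the others and report the fraction correct."""
    correct = 0
    for i, (graph, label) in enumerate(dataset):
        rest = dataset[:i] + dataset[i + 1:]
        if knn_classify(graph, rest, k) == label:
            correct += 1
    return correct / len(dataset)
```

A dataset here is a list of `(graph, label)` pairs, for example `[(text_to_graph("graph model web document"), "ir"), ...]`. Only the distance function differs from a vector-based k-NN; the voting and leave-one-out steps are unchanged.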
Classification of Web documents using a graph model
2003-01-01
264185 bytes
Conference paper
Electronic resource
English
Classification of Web Documents Using a Graph Model
British Library Conference Proceedings | 2003
Classification and Postprocessing of Documents Using an Error-Correcting Parser
British Library Conference Proceedings | 1995
Classification and Functional Decomposition of Business Documents
British Library Conference Proceedings | 1995
An Adaptive System for Automatic Invoice-Documents Classification
British Library Conference Proceedings | 2005