Abstract: To take full account of the characteristics of networked technology resources, which are classified in many different ways and exist in large quantities, this paper proposes a crawler algorithm based on a directory tree. The algorithm extracts and identifies directory links, using domain ontology knowledge to evaluate their relevance, connects the resulting nodes through a modified link-analysis strategy, and finally carries out the collection operation. Beyond an in-depth study of the crawler architecture, the work also optimizes the speed at which the latest resources are acquired. Experimental results show that the algorithm achieves its stated objectives in both speed and efficiency.
science and technology resource,