The most powerful crawler project: only simple configuration is needed to implement your own functionality
Resource Description
The most powerful crawler project. It needs only simple configuration to implement your own crawling functionality.
File List
lib
commons-collections-3.1.jar
jasper-runtime-tomcat-4.1.30.jar
poi-2.0-RC1-20031102.jar
commons-httpclient-3.0.1.jar
commons-cli-1.0.jar
commons-codec-1.3.jar
bsh-2.0b4.jar
commons-pool-1.3.jar
commons-logging-1.0.4.jar
mg4j-1.0.1.jar
commons-lang-2.1.jar
libidn-0.5.9.jar
poi-scratchpad-2.0-RC1-20031102.jar
jasper-compiler-tomcat-4.1.30.jar
servlet-tomcat-4.1.30.jar
dnsjava-1.6.2.jar
junit-3.8.1.jar
ant-1.6.2.jar
je-3.0.12.jar
fastutil-5.0.3-heritrix-subset-1.0.jar
itext-1.2.0.jar
javaswf-CVS-SNAPSHOT-1.jar
jetty-4.2.23.jar
commons-net-1.4.1.jar
webapps
admin.war
selftest.war
st
ata
selftest
order.xml
profiles
default
org
archive
apache
my
postprocessor
extractor
SohuNewsExtractor.java
SohuNewsExtractor.class
modules
BaseRule.options
CrawlScope.options
Credential.options
DecideRule.options
Filter.options
Frontier.options
Processor.options
StatisticTracking.options
jobs
WSDL-20080304091545468
Sohu_new-20080304091136015
Sohu_new-20080304090953234
Sohu_new-20080304090440671
.classpath
.project
arcMetaheaderBody.xsl
heritrix.properties
heritrix_dmesg.log
heritrix_out.log
jndi.properties