Resource Overview
Introduction
Snaker is an open-source web crawler written in Java. It is a powerful, pluggable download/crawler platform that is easy to set up and use. Users can write customized crawler scripts in JavaScript. It also has a friendly web console that helps users manage and monitor the download process.
Features
High-performance downloading
Supports sessions and cookies
Supports customizable crawler scripts
Supports HTTPS
Supports HTTP proxies
Supports OCR
Supports multiple character sets
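To make the multi-charset feature concrete, here is a minimal sketch of how a crawler might pick a character set from a response's Content-Type header and decode the body accordingly. This is not Snaker's implementation. The helper names `charsetFromContentType` and `decodeBody` are hypothetical, and the code assumes a runtime that provides the standard `TextDecoder` (a browser or Node.js), which is not the script environment Snaker itself uses.

```javascript
// Hypothetical helper: read the charset parameter from a Content-Type value.
// Falls back to the given default (or "utf-8") when none is declared.
function charsetFromContentType(contentType, fallback) {
  var match = /charset\s*=\s*"?([^\s;"]+)"?/i.exec(contentType || "");
  return match ? match[1].toLowerCase() : (fallback || "utf-8");
}

// Hypothetical helper: decode raw response bytes using the declared charset.
// Assumes TextDecoder is available and supports the requested encoding.
function decodeBody(bytes, contentType) {
  var charset = charsetFromContentType(contentType, "utf-8");
  return new TextDecoder(charset).decode(bytes);
}

// Example: the three bytes below are the UTF-8 encoding of one CJK character.
var body = decodeBody(
  new Uint8Array([0xe4, 0xb8, 0xad]),
  "text/html; charset=UTF-8"
);
console.log(body);
```

A real crawler would usually also inspect `<meta charset>` declarations in the page itself, since servers often omit or misreport the header.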
Crawler script example
Please refer to HowToWriteCrawlerEngineScript for details.
The simplest example, which just downloads a single URL:
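Crawler scripts open with a userscript-style metadata header: a `// ==UserScript==` line followed by `// @key value` lines, as in the example in this section. As a rough illustration of how such a header can be read, the sketch below collects the key/value pairs into an object. The function `parseScriptHeader` is hypothetical and is not Snaker's loader, and the closing `// ==/UserScript==` marker it looks for is borrowed from the common userscript convention rather than confirmed for Snaker.

```javascript
// Hypothetical helper, not part of Snaker: extract "@key value" pairs
// from a userscript-style metadata header.
function parseScriptHeader(source) {
  var meta = {};
  var inHeader = false;
  var lines = source.split(/\r?\n/);
  for (var i = 0; i < lines.length; i++) {
    var line = lines[i].trim();
    if (line === "// ==UserScript==") { inHeader = true; continue; }
    // Assumed closing marker (userscript convention); stop reading there.
    if (line === "// ==/UserScript==") { break; }
    if (!inHeader) { continue; }
    // Match lines of the form "// @key value".
    var match = /^\/\/\s*@(\S+)\s+(.*)$/.exec(line);
    if (match) { meta[match[1]] = match[2].trim(); }
  }
  return meta;
}

// The header lines from the single-file example.
var header = [
  "// ==UserScript==",
  "// @name SingleFile",
  "// @title Single File"
].join("\n");

var meta = parseScriptHeader(header);
console.log(meta.name);  // SingleFile
console.log(meta.title); // Single File
```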
```
// ==UserScript==
// @name SingleFile
// @title Single File
```
File List
engines
train.js
english_pod.js
recognize_test.js
sina_carmodel.js
single_file.js
tianya.js
lib
jna.jar
platform.jar
commons-codec-1.4.jar
commons-el-from-jetty-5.1.4.jar
commons-httpclient-3.0.1.jar
commons-logging-1.1.1.jar
jasper-compiler-5.5.12.jar
jasper-compiler-jdt-5.5.12.jar
jasper-runtime-5.5.12.jar
jetty-6.1.19.jar
jetty-util-6.1.19.jar
js.jar
jsp-2.1.jar
jsp-api-2.1.jar
log4j-1.2.14.jar
servlet-api-2.5-20081211.jar
spring.jar
xstream-1.4.2.jar
snaker.jar
snaker.log
startSnaker.bat
ocr
tesseract
webapps
WEB-INF
css
js
downloaded.jsp
downloading.jsp
menu.jsp
newtask.jsp
recognize.jsp
setting.jsp
startSnaker.sh