Resource Overview
All development has ceased on this version of warctools.
A newer version, written in Python, has been published and is maintained on our site: http://code.hanzoarchives.com/warc-tools
The main goal of WARC Tools is to promote adoption of the WARC file format for storing web archives among mainstream web developers. To that end, it provides an open source software library, a set of command line tools, web server plug-ins, and technical documentation for manipulating and managing web archive (WARC) files.
WARC files are produced by web archiving crawlers such as Heritrix, the open-source, extensible, web-scale, archival-quality web crawler developed by the Internet Archive with the Nordic National Libraries, and by Hanzo's own commercial crawlers.
The project is led by Hanz