Resource Description
.NET architecture
arachnode.net is the most comprehensive open-source C#/.NET web crawler available. It can be used from any .NET language.
Configurable Rules and Actions
Implement custom pre- and post-request crawl rules and actions without source recompilation. The existing crawl rules and actions architecture easily enables crawling enhancements such as federation, partitioning and distributed caching.
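The idea of pre- and post-request rules that can be switched on or off without recompiling can be sketched roughly as follows. This is a minimal, language-neutral illustration in Python, not arachnode.net's actual API: the `Crawl` record, the `RULES` registry, the rule names and the `config` layout are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional
from urllib.parse import urlparse


@dataclass
class Crawl:
    """Hypothetical record of one crawl step; not an arachnode.net type."""
    url: str
    depth: int
    content_type: Optional[str] = None  # known only after the response arrives


# A rule returns True when the crawl should be disallowed.
Rule = Callable[[Crawl], bool]

# Registry of available rules, keyed by the name a configuration file would use.
RULES: Dict[str, Rule] = {
    "scheme": lambda c: urlparse(c.url).scheme not in ("http", "https"),
    "depth": lambda c: c.depth > 3,
    "content_type": lambda c: c.content_type not in ("text/html", "application/pdf"),
}


def is_disallowed(crawl: Crawl, enabled: List[str]) -> bool:
    """Apply the enabled rules in order; the first match rejects the crawl."""
    return any(RULES[name](crawl) for name in enabled)


# Which rules run at each stage is plain data, so it can change without code edits.
config = {"pre_request": ["scheme", "depth"], "post_request": ["content_type"]}
```

A crawler built this way would call `is_disallowed(crawl, config["pre_request"])` before issuing a request and `is_disallowed(crawl, config["post_request"])` once the response headers are known. Because the stage lists are data, enabling or reordering rules needs only a configuration change.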
Lucene.NET Integration
Lucene.NET integration provides full-text search through a familiar web interface. Search results can be integrated into Solr or any other Lucene-based solution, whether written in .NET, Java or another language that supports Lucene.
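Full-text search of this kind rests on an inverted index, which maps each term to the documents that contain it. The sketch below illustrates that data structure in plain Python. It does not use the Lucene or Lucene.NET API, and the function names `tokenize`, `build_index` and `search` are assumptions made for illustration.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def tokenize(text: str) -> List[str]:
    """Lower-case the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map every term to the set of document ids that contain it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index: Dict[str, Set[str]], query: str) -> Set[str]:
    """Return the ids of documents containing every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    # Start from the first term's postings, then intersect with the rest.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result
```

A real Lucene index also stores term positions and frequencies so that results can be ranked and phrases matched. This sketch only answers whether a document contains all of the query terms.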
SQL Server 2005/2008 Full-Text Indexing
SQL Server 2005/2008 full-text indexing is configured at all appropriate content storage locations for files, images and web pages.
.DOC/.PDF/.PPT/.XLS Indexing
Crawl, index and search Microsoft Word, PowerPoint and Excel documents and Adobe PDF files.
File List
Structures
arachnode.net.snk
bin
obj
PriorityQueue.cs
Properties
Structures.csproj
Value
Test
obj
1.htm
Adsense Code 2.txt
Adsense Code.txt
Computer Term Dictionary.txt
LPTD-DBCH1-Finish-Shot.jpg
LPTDGTCH1-Finish-Shot.jpg
LPTD-HBCH1-Finish-Shot.jpg
TestPage1.htm
TestPage2.htm
TestPage3.htm
TestPage4.htm
TestSite.exe
TestSite.pdb
TestSite.vshost.exe
UriTest
UriTest.htm
web.config
bin
Utilities
arachnode.net.snk
Base64Encoder.cs
bin
Directories.cs
EXIF
Files.cs
obj
Properties
Resources.Designer.cs
Resources.resx
Strings.cs
Utilities.cd
Utilities.csproj
Web.cs
AbsoluteUriUtilities.cs
FileUtilities.cs
ImageUtilities.cs
WebPageUtilities.cs
UriExtensionMethods.cs
Web
Arachnode.Master
Arachnode.Master.cs
Arachnode.Master.designer.cs
arachnode.net.snk
bin
Browse.aspx
Browse.aspx.cs
Browse.aspx.designer.cs
Cached.aspx
Cached.aspx.cs
Cached.aspx.designer.cs
Depth_1
Explanation.aspx
Explanation.aspx.cs
Explanation.aspx.designer.cs
Global.asax
Global.asax.cs
Managers
obj
Properties
Search.aspx
Search.aspx.cs
Search.aspx.designer.cs
Service.asmx
Service.asmx.cs
style.css
Test.aspx
Test.aspx.cs
Test.aspx.designer.cs
Test
UserControls
Web.config
Web.csproj
Web.csproj.user
arachnode.net.bak_2005.zip
arachnode.net.bak_2008.zip
arachnode.net.ndoc
arachnode.net.sln
arachnode.net.snk
arachnode.net.suo
arachnode.net.vsmdi
Arachnode.SiteCrawler.Next.saproj
Arachnode.SiteCrawler.saproj
cfg.Configuration.xml
CommonQueries.sql
LocalTestRun.testrunconfig
lukeall-1.0.1.jar
readme.txt
RestoreDatabaseWithNames.sql
UpdateDatabase_2.0.0.0_to_2.5.0.0.sql
UpdateDatabase_2.5.0.0_to_2.5.0.12.sql
UpdateDatabase_2.5.0.12_to_3.0.0.0.sql
Administration
Administration.csproj
Administration.csproj.user
App_Data
arachnode.net.snk
bin
Default.aspx
Default.aspx.cs
Default.aspx.designer.cs
DynamicData
Global.asax
Global.asax.cs
obj
Properties
Site.css
Site.master
Site.master.cs
Site.master.designer.cs
web.config
Analysis
Analysis.database
Analysis.dwproj
Analysis.dwproj.user
bin
Discovery Types.dim
Domains.dim
Domains.dmm
Domains_DMDim.dim
Domains_DMDSV.dsv
Extensions.dim
Hosts Discoveries.dim
obj
Schemes Discoveries.dim
UriClassificationCube.cube
UriClassificationCube.partitions
UriClassificationCube_DM.cube
UriClassificationCube_DM.partitions
UriClassificationDataSource.ds
UriClassificationDataSourceView.dsv
Application
Application.csproj
Application.csproj.user
Arachnode.Master
Arachnode.Master.cs
Arachnode.Master.designer.cs
arachnode.net.snk
bin
Crawl.aspx
Crawl.aspx.cs
Crawl.aspx.designer.cs
obj
Properties
Scripts
style.css
Value
Web.config
App_Data
Cache
arachnode.net.snk
bin
Cache.csproj
DistributedCache.cs
obj
Properties
Configuration
ApplicationSettings.cs
arachnode.net.keys
arachnode.net.snk
bin
Configuration.cd
Configuration.csproj
ConnectionStrings.config
obj
Properties
Value
WebSettings.cs
Console
App.config
arachnode.net.snk
bin
Console.cd
Console.csproj
CrawlRequests.txt
obj
Program.cs
Properties
Web References
Console.Benchmark
arachnode.net.snk
bin
Console.Benchmark.csproj
obj
Program.cs
Properties
Console.Next
arachnode.net.snk
bin
Console.Next.csproj
Console.Next.csproj.user
CrawlRequests.txt
obj
Program.cs
Properties
DataAccess
arachnode.net.snk
ArachnodeDAO.cs
bin
DataAccess.cd
DataAccess.csproj
Managers
obj
Properties
Value
ArachnodeNextDAO.cs
DataSource
app.config
arachnode.net.snk
ArachnodeDataSet.cs
ArachnodeDataSet.Designer.cs
ArachnodeDataSet.xsc
ArachnodeDataSet.xsd
ArachnodeDataSet.xss
bin
ConnectionString.cs
DataSource.cd
DataSource.csproj
DataSource.csproj.user
Next
obj
Properties
ReportingLinqToSql.dbml
ReportingLinqToSql.dbml.layout
ReportingLinqToSql.designer.cs
DemoFiles
bin
Debug
CoreConfiguration.xml
CrawlRequestPluginConfiguration.xml
DemoFiles.csproj
DemoFiles.csproj.user
FileAndImageAbsoluteUriRegularExpression.txt
HyperLinkAbsoluteUriRegularExpression.txt
Next
obj
Properties
readme.txt
sqlceca35.dll
sqlcecompact35.dll
sqlceer35EN.dll
sqlceme35.dll
sqlceoledb35.dll
sqlceqp35.dll
sqlcese35.dll
System.Data.SqlServerCe.dll
System.Data.SqlServerCe.Entity.dll
Documentation
obj
readme.txt
Functions
arachnode.net.snk
bin
ComputeLevenshteinDistance.cs
ConvertSource.cs
ExtractAlphaNumericCharacters.cs
ExtractDirectory.cs
ExtractDomain.cs
ExtractExtension.cs
ExtractFileExtension.cs
ExtractFileName.cs
ExtractHash.cs
ExtractHost.cs
ExtractIPAddress.cs
ExtractNonAlphaNumericCharacters.cs
ExtractPhrases.cs
ExtractResponseHeader.cs
ExtractScheme.cs
ExtractTags.cs
ExtractText.cs
ExtractWords.cs
Functions.cd
Functions.csproj
Functions.csproj.user
GenerateIncorrectKeystrokeTypos.cs
GenerateMissedKeystrokeTypos.cs
GenerateRepeatedKeystrokeTypos.cs
GenerateTransposedKeystrokeTypos.cs
IsDisallowed.cs
obj
Properties
Value
GraphicalUserInterface
app.config
Arachnode.cs
Arachnode.dbml
Arachnode.dbml.layout
Arachnode.designer.cs
arachnode.net.snk
bin
Form1.cs
Form1.Designer.cs
Form1.resx
GraphicalUserInterface.csproj
obj
Program.cs
Properties
Integration
Integration.database
Integration.dtproj
Integration.dtproj.user
obj
TermExtraction.dtsx
TermLookup.dtsx
Library
arachnode.net.snk
Arachnode.SiteCrawler.dll
Arachnode.SiteCrawler.Next.dll
bin
Highlighter.Net.dll
HtmlAgilityPack.dll
Interop.SHDocVw.dll
itextsharp.dll
Library.csproj
Lucene.Net.dll
Microsoft.mshtml.dll
NClassifier.dll
obj
Properties
RSS.NET.dll
sqlceca35.dll
sqlcecompact35.dll
sqlceer35EN.dll
sqlceme35.dll
sqlceoledb35.dll
sqlceqp35.dll
sqlcese35.dll
System.Data.SqlServerCe.dll
System.Data.SqlServerCe.Entity.dll
WatiN.Core.dll
Performance
arachnode.net.snk
bin
Counters.cs
obj
Performance.csproj
Properties
Plugins
app.config
arachnode.net.snk
bin
CrawlActions
CrawlRules
EngineActions
obj
Plugins.csproj
Properties
Web References
Plugins.Next
arachnode.net.snk
bin
CrawlerPlugins.cs
CrawlRequestPlugins.cs
FileAndImageDiscoveryPlugins.cs
FilePlugins.cs
HyperLinkDiscoveryPlugins.cs
ImagePlugins.cs
obj
Plugins.Next.csproj
Properties
WebPagePlugins.cs
PostProcessing
App.config
arachnode.net.snk
bin
Main.cs
Main.Designer.cs
Main.resx
obj
PostProcessing.csproj
Program.cs
Properties
Proxy
app.config
arachnode.net.snk
Authentication
bin
Clients
Connections
ConsoleAttributes.cs
Handlers
license.txt
Listeners
make.bat
obj
Properties
Proxy.chm
Proxy.cs
Proxy.csproj
ProxyConfig.cs
readme.txt
Value
Renderer
App.config
arachnode.net.snk
bin
BrowserOptions.cs
COMInterop.cs
HtmlRenderer.cs
Iid_Clsids.cs
obj
Program.cs
Properties
Renderer.cs
Renderer.csproj
Renderer.Designer.cs
Renderer.resx
TestData
Value
WinAPIs.cs
Search
arachnode.net.snk
bin
Document.cs
InvertedIndex.cs
obj
Properties
Search.csproj
SearchEngine.cs
Word.cs
Security
arachnode.net.snk
bin
Encryption.cs
Hash.cs
Obfuscation.cs
obj
Properties
Security.csproj
Service
App.config
arachnode.net.snk
bin
CrawlRequests.txt
obj
Program.cs
ProjectInstaller.cs
ProjectInstaller.Designer.cs
ProjectInstaller.resx
Properties
Service.cd
Service.cs
Service.csproj
Service.Designer.cs
SiteCrawler
app.config
arachnode.net.snk
bin
Components
Core
Crawler.cs
EngineActions
Managers
obj
Properties
READ_ME_DEMO.txt
Rules
SiteCrawler.cd
SiteCrawler.csproj
Utilities
Value
Web References
Actions
SiteCrawler.Next
App.config
arachnode.net.snk
bin
ClassDiagram.cd
ConnectionStrings.config
Core.cd
CoreConfiguration.xml
Crawler.cs
CrawlRequestPluginConfiguration.xml
DataAccess
ExtensionMethods
FileAndImageAbsoluteUriRegularExpression.txt
HyperLinkAbsoluteUriRegularExpression.txt
Managers
obj
Properties
SearchEngine.cs
SiteCrawler.Next.csproj
SiteCrawler.Next.csproj.user
Utilities
Value