
= Anemone

Anemone is a web spider framework that can spider a domain and collect useful
information about the pages it visits. It is versatile, allowing you to
write your own specialized spider tasks quickly and easily.

See http://anemone.rubyforge.org for more information.

== Features
* Multi-threaded design for high performance
* Tracks 301 HTTP redirects
* Built-in BFS algorithm for determining page depth
* Allows exclusion of URLs based on regular expressions
* Choose the links to follow on each page with <tt>focus_crawl()</tt>
* HTTPS support
* Records response time for each page
* CLI program can list all pages in a domain, calculate page depths, and more
* Obeys robots.txt
* In-memory or persistent storage of pages during crawl, using TokyoCabinet, MongoDB, or Redis
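
To illustrate what breadth-first page depth and regular-expression exclusion mean in practice, here is a minimal sketch in plain Ruby. It is not Anemone's implementation and does not use Anemone's API: the <tt>LINK_GRAPH</tt> hash, the <tt>page_depths</tt> method, and its <tt>skip:</tt> parameter are all hypothetical names invented for this example, and the "site" is an in-memory hash rather than pages fetched over HTTP.

```ruby
# Hypothetical in-memory "site": each URL maps to the links found on that page.
# A real spider would fetch and parse each page instead of reading a Hash.
LINK_GRAPH = {
  '/'            => ['/about', '/blog', '/login'],
  '/about'       => ['/', '/team'],
  '/blog'        => ['/blog/post-1', '/about'],
  '/blog/post-1' => ['/team'],
  '/team'        => [],
  '/login'       => ['/admin']
}.freeze

# Breadth-first traversal starting at +root+.
# Returns a Hash of URL => depth, where depth is the smallest number of links
# that must be followed from +root+ to reach that URL.
# URLs matching any Regexp in +skip+ are never queued, so pages reachable
# only through them are never visited either.
def page_depths(graph, root, skip: [])
  depths = { root => 0 }
  queue  = [root]

  until queue.empty?
    url = queue.shift
    graph.fetch(url, []).each do |link|
      next if depths.key?(link) # already reached by a path at least as short
      next if skip.any? { |pattern| pattern.match?(link) }

      depths[link] = depths[url] + 1
      queue << link
    end
  end

  depths
end

depths = page_depths(LINK_GRAPH, '/', skip: [/login/, /admin/])
depths.each { |url, depth| puts "#{depth} #{url}" }
```

Because the queue is processed in first-in, first-out order, the first time a URL is seen is always via a shortest path, so its recorded depth never needs updating. In this sketch, excluding <tt>/login</tt> also keeps <tt>/admin</tt> out of the result, since no other page links to it.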

== Examples
See the scripts under the <tt>lib/anemone/cli</tt> directory for examples of several useful Anemone tasks.

== Requirements
* nokogiri
* robots

== Development
To test and develop this gem, additional requirements are:
* rspec
* fakeweb
* tokyocabinet
* mongo
* redis

You will need to have {Tokyo Cabinet}[http://fallabs.com/tokyocabinet/], {MongoDB}[http://www.mongodb.org/], and {Redis}[http://code.google.com/p/redis/] installed on your system and running.