Crate spider_middleware

§spider-middleware

Built-in middleware for the crawler runtime.

This crate contains the request/response hooks that sit between scheduling, downloading, and parsing. It is the right layer for retry policy, rate limiting, cookies, proxies, user agents, robots.txt, and caching.

§Example

use spider_middleware::{rate_limit::RateLimitMiddleware, retry::RetryMiddleware};

// `CrawlerBuilder` and `MySpider` are assumed to come from the crawler
// runtime crate and your own spider implementation; this snippet runs
// inside an async function that returns a `Result`.
let crawler = CrawlerBuilder::new(MySpider)
    .add_middleware(RateLimitMiddleware::default())
    .add_middleware(RetryMiddleware::new())
    .build()
    .await?;

Modules§

autothrottle
Adaptive throttling middleware.
cookies
Cookie persistence middleware.
http_cache
On-disk HTTP response cache middleware.
middleware
Middleware trait and control-flow types.
prelude
Common spider-middleware re-exports.
proxy
Proxy middleware.
rate_limit
Rate-limiting middleware.
referer
Middleware that fills Referer headers for follow-up requests.
retry
Retry middleware.
robots
robots.txt middleware.
user_agent
User-agent middleware.
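The `middleware` module's trait and control-flow types are not shown on this page. As a rough, self-contained sketch of how such a request hook chain typically works (the trait shape, the `Flow` type, and the `UserAgentMiddleware` example here are assumptions for illustration, not this crate's real API):

```rust
// Hypothetical sketch of a request middleware chain; the real
// `spider_middleware::middleware` trait may differ in names and signatures.
#[derive(Debug, Clone)]
struct Request {
    url: String,
    headers: Vec<(String, String)>,
}

/// What the chain should do after a middleware runs (assumed control-flow type).
#[allow(dead_code)]
enum Flow {
    Continue,     // pass the request on to the next middleware
    Drop(String), // stop processing, with a reason
}

trait Middleware {
    fn on_request(&self, req: &mut Request) -> Flow;
}

/// Example middleware: injects a User-Agent header if one is missing.
struct UserAgentMiddleware {
    agent: String,
}

impl Middleware for UserAgentMiddleware {
    fn on_request(&self, req: &mut Request) -> Flow {
        let has_ua = req
            .headers
            .iter()
            .any(|(k, _)| k.eq_ignore_ascii_case("user-agent"));
        if !has_ua {
            req.headers.push(("User-Agent".into(), self.agent.clone()));
        }
        Flow::Continue
    }
}

fn main() {
    let chain: Vec<Box<dyn Middleware>> = vec![Box::new(UserAgentMiddleware {
        agent: "spider/0.1".into(),
    })];
    let mut req = Request {
        url: "https://example.com".into(),
        headers: vec![],
    };
    for mw in &chain {
        if let Flow::Drop(reason) = mw.on_request(&mut req) {
            println!("dropped: {}", reason);
            return;
        }
    }
    println!("{} header(s) on {}", req.headers.len(), req.url);
}
```

Running each hook in order, with an early exit on `Drop`, is the usual design for this layer: retry, rate limiting, and robots.txt checks all reduce to "mutate or veto the request before the downloader sees it."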

Structs§

Request
Outgoing HTTP request used by the crawler runtime.
Response
Represents an HTTP response received from a server.

Enums§

Body
Request body variants supported by the default downloader.
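The page does not list the `Body` enum's actual variants. As an illustrative, self-contained sketch of how a default downloader might model request bodies (the variant names `Empty`, `Bytes`, and `Form` and the serialization helper are assumptions, not this crate's real definition):

```rust
// Hypothetical request-body variants; the real `spider_middleware::Body`
// enum may differ.
enum Body {
    Empty,
    Bytes(Vec<u8>),
    Form(Vec<(String, String)>),
}

impl Body {
    /// Serialize the body to raw bytes for the wire (illustrative only;
    /// no percent-encoding is applied to form fields here).
    fn to_bytes(&self) -> Vec<u8> {
        match self {
            Body::Empty => Vec::new(),
            Body::Bytes(b) => b.clone(),
            Body::Form(fields) => fields
                .iter()
                .map(|(k, v)| format!("{}={}", k, v))
                .collect::<Vec<_>>()
                .join("&")
                .into_bytes(),
        }
    }
}

fn main() {
    let body = Body::Form(vec![("q".into(), "rust".into())]);
    println!("{}", String::from_utf8(body.to_bytes()).unwrap());
}
```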