Middleware

Trait Middleware 

pub trait Middleware<C: Send + Sync>:
    Any
    + Send
    + Sync
    + 'static {
    // Required method
    fn name(&self) -> &str;

    // Provided methods
    fn process_request<'life0, 'life1, 'async_trait>(
        &'life0 self,
        _client: &'life1 C,
        request: Request,
    ) -> Pin<Box<dyn Future<Output = Result<MiddlewareAction<Request>, SpiderError>> + Send + 'async_trait>>
       where Self: 'async_trait,
             'life0: 'async_trait,
             'life1: 'async_trait { ... }
    fn process_response<'life0, 'async_trait>(
        &'life0 self,
        response: Response,
    ) -> Pin<Box<dyn Future<Output = Result<MiddlewareAction<Response>, SpiderError>> + Send + 'async_trait>>
       where Self: 'async_trait,
             'life0: 'async_trait { ... }
    fn handle_error<'life0, 'life1, 'life2, 'async_trait>(
        &'life0 self,
        _request: &'life1 Request,
        error: &'life2 SpiderError,
    ) -> Pin<Box<dyn Future<Output = Result<MiddlewareAction<Request>, SpiderError>> + Send + 'async_trait>>
       where Self: 'async_trait,
             'life0: 'async_trait,
             'life1: 'async_trait,
             'life2: 'async_trait { ... }
}

Trait implemented by request/response middleware.

Middleware runs around the downloader boundary:

  1. process_request sees outgoing requests before download
  2. the downloader executes the request unless middleware short-circuits it
  3. process_response sees successful responses
  4. handle_error sees download failures

Each hook returns a MiddlewareAction, which tells the pipeline to continue normal processing, stop it, or redirect control flow.
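The control flow above can be sketched as a match on the action returned by a request-side hook. This is a simplified, synchronous model and not the crate's implementation: Request, Response, and NextStep are hypothetical stand-ins, the variant names come from the hook descriptions on this page, and the payload types (in particular Duration for the delay) are assumptions.

```rust
use std::time::Duration;

// Stand-in types: hypothetical simplifications, not the crate's definitions.
#[derive(Debug, Clone, PartialEq)]
struct Request {
    url: String,
}

#[derive(Debug, Clone, PartialEq)]
struct Response {
    status: u16,
}

// Variant names follow this page; the payload shapes are assumed.
#[derive(Debug, PartialEq)]
enum MiddlewareAction<T> {
    Continue(T),
    Drop,
    Retry(Request, Duration),
    ReturnResponse(Response),
}

// Hypothetical description of what the pipeline does next.
#[derive(Debug, PartialEq)]
enum NextStep {
    Download(Request),
    Deliver(Response),
    Reschedule(Request, Duration),
    Discard,
}

// Interprets the action returned by a request-side hook.
fn after_process_request(action: MiddlewareAction<Request>) -> NextStep {
    match action {
        // Normal path: hand the request to the downloader.
        MiddlewareAction::Continue(request) => NextStep::Download(request),
        // Short-circuit: a response is already available, skip the download.
        MiddlewareAction::ReturnResponse(response) => NextStep::Deliver(response),
        // Put the request back in the queue after the given delay.
        MiddlewareAction::Retry(request, delay) => NextStep::Reschedule(request, delay),
        // Stop processing this request.
        MiddlewareAction::Drop => NextStep::Discard,
    }
}
```

In this model only Continue reaches the downloader; every other action bypasses it.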

Required Methods

fn name(&self) -> &str

Returns a human-readable middleware name for logs and diagnostics.

Provided Methods

fn process_request<'life0, 'life1, 'async_trait>(
    &'life0 self,
    _client: &'life1 C,
    request: Request,
) -> Pin<Box<dyn Future<Output = Result<MiddlewareAction<Request>, SpiderError>> + Send + 'async_trait>>
where
    Self: 'async_trait,
    'life0: 'async_trait,
    'life1: 'async_trait,

Intercepts an outgoing request before the downloader runs.

Typical uses include header injection, request filtering, cache lookup, throttling, or proxy selection.

Return:

  • Continue(request) to keep normal processing
  • Drop to stop processing that request entirely
  • ReturnResponse(response) to bypass the downloader
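As an illustration of header injection and request filtering, the following is a minimal sketch of a process_request override. It does not use the real crate: Request, Response, SpiderError, and MiddlewareAction are hypothetical stand-ins, the trait is reduced to two methods (without the Any bound or default bodies), and the async_trait expansion is written by hand as a boxed future. UserAgent, HookFuture, and block_on are names invented for this example; block_on is a tiny busy-polling executor that stands in for a real async runtime.

```rust
use std::collections::HashMap;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Stand-in types (hypothetical); the crate's real types differ.
#[derive(Debug, Clone, PartialEq)]
struct Request {
    url: String,
    headers: HashMap<String, String>,
}

#[derive(Debug, Clone, PartialEq)]
struct Response {
    status: u16,
}

#[derive(Debug, PartialEq)]
struct SpiderError(String);

#[derive(Debug, PartialEq)]
enum MiddlewareAction<T> {
    Continue(T),
    Drop,
    ReturnResponse(Response),
}

// Hand-written equivalent of the boxed future that async_trait generates.
type HookFuture<'a, T> =
    Pin<Box<dyn Future<Output = Result<MiddlewareAction<T>, SpiderError>> + Send + 'a>>;

// Reduced version of the trait: only `name` and the request hook.
trait Middleware<C: Send + Sync>: Send + Sync + 'static {
    fn name(&self) -> &str;
    fn process_request<'a>(&'a self, client: &'a C, request: Request) -> HookFuture<'a, Request>;
}

// Hypothetical middleware: drops requests to one host, tags the rest.
struct UserAgent {
    value: String,
    blocked_host: String,
}

impl<C: Send + Sync> Middleware<C> for UserAgent {
    fn name(&self) -> &str {
        "user-agent"
    }

    fn process_request<'a>(
        &'a self,
        _client: &'a C,
        mut request: Request,
    ) -> HookFuture<'a, Request> {
        Box::pin(async move {
            // Request filtering: stop processing for the blocked host.
            if request.url.contains(&self.blocked_host) {
                return Ok(MiddlewareAction::Drop);
            }
            // Header injection: continue with the modified request.
            request
                .headers
                .insert("User-Agent".to_string(), self.value.clone());
            Ok(MiddlewareAction::Continue(request))
        })
    }
}

// Minimal executor for trying the sketch; it polls in a loop with a no-op waker.
fn block_on<F: Future>(fut: F) -> F::Output {
    struct NoopWake;
    impl Wake for NoopWake {
        fn wake(self: Arc<Self>) {}
    }
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```

A cache-lookup middleware would follow the same shape and return ReturnResponse(response) on a hit instead of Continue(request).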

fn process_response<'life0, 'async_trait>(
    &'life0 self,
    response: Response,
) -> Pin<Box<dyn Future<Output = Result<MiddlewareAction<Response>, SpiderError>> + Send + 'async_trait>>
where
    Self: 'async_trait,
    'life0: 'async_trait,

Intercepts a successful response after download.

Typical uses include cache population, adaptive throttling, cookie extraction, or retry decisions based on status/body.

Return:

  • Continue(response) to forward the response to later middleware and parsing
  • Drop to stop processing the response
  • Retry(request, delay) to reschedule work after an optional wait
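A status-based retry decision of the kind mentioned above could be factored into a plain function that a process_response override delegates to. The sketch below uses hypothetical stand-in types. It assumes that the response carries the request that produced it (needed to build Retry) and that the delay is a Duration; this page confirms neither. The status codes and the 5-second wait are arbitrary choices for illustration.

```rust
use std::time::Duration;

// Stand-in types (hypothetical); the crate's real types differ.
#[derive(Debug, Clone, PartialEq)]
struct Request {
    url: String,
}

// Assumption: the stand-in response remembers the originating request.
#[derive(Debug, Clone, PartialEq)]
struct Response {
    status: u16,
    request: Request,
}

#[derive(Debug, PartialEq)]
enum MiddlewareAction<T> {
    Continue(T),
    Drop,
    Retry(Request, Duration),
}

// Decides what to do with a downloaded response based on its status code.
fn classify_response(response: Response) -> MiddlewareAction<Response> {
    match response.status {
        // Server asked us to back off: reschedule the request after a wait.
        429 | 503 => MiddlewareAction::Retry(response.request, Duration::from_secs(5)),
        // Page is gone: nothing useful to parse.
        404 | 410 => MiddlewareAction::Drop,
        // Everything else goes on to later middleware and parsing.
        _ => MiddlewareAction::Continue(response),
    }
}
```

Keeping the decision in a synchronous function makes it easy to unit-test independently of the async hook.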

fn handle_error<'life0, 'life1, 'life2, 'async_trait>(
    &'life0 self,
    _request: &'life1 Request,
    error: &'life2 SpiderError,
) -> Pin<Box<dyn Future<Output = Result<MiddlewareAction<Request>, SpiderError>> + Send + 'async_trait>>
where
    Self: 'async_trait,
    'life0: 'async_trait,
    'life1: 'async_trait,
    'life2: 'async_trait,

Handles downloader errors for a request.

The default behavior propagates the error unchanged. Override this for retry policy, selective suppression, or custom recovery behavior.

Return:

  • Continue(request) to resubmit immediately
  • Drop to swallow the error and stop processing
  • Retry(request, delay) to resubmit after waiting
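A retry policy for handle_error might look like the sketch below: transient errors are retried with exponential backoff, unrecoverable ones are swallowed, and everything else is propagated unchanged, as the default does. All types are hypothetical stand-ins. In particular, the attempts counter on Request, the SpiderError variants, and the MAX_ATTEMPTS limit are assumptions made for this example and do not come from the crate.

```rust
use std::time::Duration;

// Stand-in types (hypothetical); the crate's real types differ.
#[derive(Debug, Clone, PartialEq)]
struct Request {
    url: String,
    attempts: u32, // assumed per-request retry counter
}

// Assumed error variants, for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum SpiderError {
    Timeout,
    InvalidUrl(String),
    Other(String),
}

#[derive(Debug, PartialEq)]
enum MiddlewareAction<T> {
    Continue(T),
    Drop,
    Retry(Request, Duration),
}

const MAX_ATTEMPTS: u32 = 3;

// Decision logic that a handle_error override could delegate to.
fn retry_policy(
    request: &Request,
    error: &SpiderError,
) -> Result<MiddlewareAction<Request>, SpiderError> {
    match error {
        // Transient failure: retry with exponential backoff (1s, 2s, 4s).
        SpiderError::Timeout if request.attempts < MAX_ATTEMPTS => {
            let delay = Duration::from_secs(1u64 << request.attempts);
            let mut next = request.clone();
            next.attempts += 1;
            Ok(MiddlewareAction::Retry(next, delay))
        }
        // Retrying cannot help: swallow the error and stop processing.
        SpiderError::InvalidUrl(_) => Ok(MiddlewareAction::Drop),
        // Anything else, including exhausted retries, is propagated unchanged.
        other => Err(other.clone()),
    }
}
```

Returning Err from the policy keeps the failure visible to whatever handles errors after the middleware, whereas Drop hides it.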

Implementors