ParseContext

Struct ParseContext 

pub struct ParseContext<'a, S>
where S: Spider + ?Sized,
{ /* private fields */ }

Part of the core runtime types and traits used to define and run a crawl: the parse-time context passed into Spider::parse.

This bundles the current Response, shared spider state, and the async output sink into a single value so user-facing parse signatures stay small.

The context dereferences to Response, which means selector-heavy code can keep the natural cx.css(...) style without manually reaching through a nested response field.
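The Deref relationship described above can be sketched with stand-in types. This is a minimal illustration, not the crate's code: `Response`, `Context`, `body_len`, and `body_len_via_context` are hypothetical names, and a `u32` stands in for the spider state.

```rust
use std::ops::Deref;

// Stand-in for the crate's Response; only a body is modelled here.
struct Response {
    body: String,
}

impl Response {
    // Hypothetical helper standing in for selector methods such as `css`.
    fn body_len(&self) -> usize {
        self.body.len()
    }
}

// Stand-in for ParseContext: an owned response plus borrowed shared state.
struct Context<'a> {
    response: Response,
    state: &'a u32,
}

impl<'a> Context<'a> {
    // Mirrors the shape of `state()`: the borrow outlives the context itself.
    fn state(&self) -> &'a u32 {
        self.state
    }
}

impl Deref for Context<'_> {
    type Target = Response;

    fn deref(&self) -> &Response {
        &self.response
    }
}

// Calls a Response method directly on the context; auto-deref resolves it.
fn body_len_via_context(cx: &Context<'_>) -> usize {
    cx.body_len()
}
```

Because method resolution follows Deref, `cx.body_len()` compiles without naming the inner response field, which is the same mechanism that lets parse code write `cx.css(...)`.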

Implementations

impl<'a, S> ParseContext<'a, S>
where S: Spider + ?Sized,

pub fn state(&self) -> &'a <S as Spider>::State

Returns the shared spider state for this parse call.

pub fn response(&self) -> &Response

Returns the current response explicitly.

pub fn response_mut(&mut self) -> &mut Response

Returns the current response as a mutable reference.

pub fn output(&self) -> &ParseOutput<<S as Spider>::Item>

Returns the underlying async parse output sink.

pub async fn add_item(&self, item: <S as Spider>::Item) -> Result<(), SpiderError>

Emits a scraped item into the runtime.

pub async fn add_items(&self, items: impl IntoIterator<Item = <S as Spider>::Item>) -> Result<(), SpiderError>

Emits multiple scraped items into the runtime.

pub async fn add_request(&self, request: Request) -> Result<(), SpiderError>

Emits a follow-up request into the runtime.

pub async fn add_requests(&self, requests: impl IntoIterator<Item = Request>) -> Result<(), SpiderError>

Emits multiple follow-up requests into the runtime.
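To make the single versus batch emit pattern concrete, here is a rough synchronous sketch built on `std::sync::mpsc`. It is not how the crate's async sink works: `Sink` is a hypothetical stand-in, and the choice to stop at the first failed send is an assumption that the source does not state.

```rust
use std::sync::mpsc::{channel, Receiver, SendError, Sender};

// Hypothetical synchronous stand-in for the async output sink.
struct Sink<T> {
    tx: Sender<T>,
}

impl<T> Sink<T> {
    // Creates a sink together with the receiving end that a runtime would own.
    fn new() -> (Self, Receiver<T>) {
        let (tx, rx) = channel();
        (Sink { tx }, rx)
    }

    // Emits one value; fails if the receiving side has been dropped.
    fn add_item(&self, item: T) -> Result<(), SendError<T>> {
        self.tx.send(item)
    }

    // Emits every value in order, stopping at the first failure (assumed behaviour).
    fn add_items(&self, items: impl IntoIterator<Item = T>) -> Result<(), SendError<T>> {
        for item in items {
            self.add_item(item)?;
        }
        Ok(())
    }
}
```

In the real API the equivalent calls are awaited and report `SpiderError`; the sketch only shows that the batch form accepts any `IntoIterator` and preserves order.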

pub fn into_parts(self) -> (Response, &'a <S as Spider>::State, ParseOutput<<S as Spider>::Item>)

Consumes the context and returns the inner response, state reference, and output sink.

pub fn into_response(self) -> Response

Consumes the context and returns the inner response.

Methods from Deref<Target = Response>

pub fn request_from_response(&self) -> Request

Reconstructs the original Request that led to this response.

This method creates a new Request with the same URL and metadata as the request that produced this response. Useful for retry scenarios or when you need to re-request the same resource.

Example
let original_request = response.request_from_response();

pub fn get_meta(&self, key: &str) -> Option<Value>

Returns a cloned metadata value by key.


pub fn meta_value<T>(&self, key: &str) -> Result<Option<T>, Error>

Deserializes a metadata value into the requested type.


pub fn discovery_rule_name(&self) -> Option<String>

Returns the runtime discovery rule name attached to this response, if any.


pub fn matches_discovery_rule(&self, rule_name: &str) -> bool

Returns true when the response was reached through the named discovery rule.
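One plausible use of these two methods is routing a response to a different handler depending on how it was discovered. The sketch below models that with a stand-in `Response`; the rule names `"product"` and `"listing"` and the `route` function are hypothetical and chosen only for illustration.

```rust
// Stand-in for a response that may carry a discovery rule name.
struct Response {
    rule: Option<String>,
}

impl Response {
    // Same shape as the documented method: a cloned, optional rule name.
    fn discovery_rule_name(&self) -> Option<String> {
        self.rule.clone()
    }

    // True only when a rule name is attached and equals `rule_name`.
    fn matches_discovery_rule(&self, rule_name: &str) -> bool {
        self.rule.as_deref() == Some(rule_name)
    }
}

// Hypothetical routing: pick a handler label from the rule name.
fn route(response: &Response) -> &'static str {
    if response.matches_discovery_rule("product") {
        "parse_product"
    } else if response.matches_discovery_rule("listing") {
        "parse_listing"
    } else {
        "parse_default"
    }
}
```

A response with no rule attached falls through to the default branch, since `matches_discovery_rule` returns false for every name in that case.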


pub fn insert_meta(&mut self, key: impl Into<String>, value: Value)

Inserts a metadata value, lazily allocating the map if needed.
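Lazy allocation of the metadata map can be sketched with an `Option` around a standard `HashMap`. This is only an approximation under stated assumptions: the `clone_meta` signature suggests the real storage is an `Arc<DashMap<String, Value>>`, whereas the stand-in `Response` below uses `HashMap<String, String>` to stay within the standard library.

```rust
use std::collections::HashMap;

// Stand-in response whose metadata map does not exist until first use.
struct Response {
    meta: Option<HashMap<String, String>>,
}

impl Response {
    // Allocates the map on the first insert, then stores the value.
    fn insert_meta(&mut self, key: impl Into<String>, value: String) {
        self.meta
            .get_or_insert_with(HashMap::new)
            .insert(key.into(), value);
    }

    // Returns a cloned value, or None when the map or the key is absent.
    fn get_meta(&self, key: &str) -> Option<String> {
        self.meta.as_ref()?.get(key).cloned()
    }
}
```

Responses that never carry metadata pay no allocation cost, and readers handle the missing map and the missing key through the same `None` result.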


pub fn clone_meta(&self) -> Option<Arc<DashMap<String, Value>>>

Returns a clone of the internal metadata map, if present.


pub fn json<T>(&self) -> Result<T, Error>

Deserializes the response body as JSON.

Type Parameters
  • T: The target type to deserialize into (must implement DeserializeOwned)

Errors

Returns a serde_json::Error if the body cannot be parsed as JSON or if it cannot be deserialized into type T.

Example
let data: Data = response.json()?;

pub fn css(&self, query: &str) -> Result<SelectorList, SpiderError>

Applies a builtin CSS selector to the response body using a Scrapy-like API.

Supports standard CSS selectors plus terminal extraction suffixes:

  • ::text
  • ::attr(name)
Example
let heading = response.css("h1::text")?.get().unwrap_or_default();
let next_href = response.css("a::attr(href)")?.get();
Errors

Returns SpiderError::Utf8Error when the body is not valid UTF-8 and SpiderError::HtmlParseError when the selector is invalid.
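The terminal suffixes `::text` and `::attr(name)` are not standard CSS, so a query has to be split into a plain selector and an extraction step before matching. The sketch below shows one way that split could work; `Extract` and `split_query` are hypothetical names, and this is not the crate's actual parser.

```rust
// What to pull out of each matched element (hypothetical type).
#[derive(Debug, PartialEq)]
enum Extract {
    Element,
    Text,
    Attr(String),
}

// Splits a query into its CSS selector and its terminal extraction suffix.
fn split_query(query: &str) -> (&str, Extract) {
    if let Some(selector) = query.strip_suffix("::text") {
        return (selector, Extract::Text);
    }
    if let Some(idx) = query.rfind("::attr(") {
        let rest = &query[idx + "::attr(".len()..];
        if let Some(name) = rest.strip_suffix(')') {
            return (&query[..idx], Extract::Attr(name.to_string()));
        }
    }
    // No recognised suffix: the whole query is an ordinary selector.
    (query, Extract::Element)
}
```

Under this scheme `"h1::text"` matches `h1` elements and yields their text, while `"a::attr(href)"` matches `a` elements and yields the `href` attribute.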


pub fn text(&self) -> Result<&str, Utf8Error>

Returns the response body as UTF-8 text.
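The `Result<&str, Utf8Error>` return type has the same shape as `std::str::from_utf8`, which borrows the bytes and validates them without copying. The sketch below assumes the body is available as a byte slice; `body_text` is a hypothetical free function, not part of the crate.

```rust
use std::str::Utf8Error;

// Borrows the body as text, failing when the bytes are not valid UTF-8.
fn body_text(body: &[u8]) -> Result<&str, Utf8Error> {
    std::str::from_utf8(body)
}
```

Callers that expect non-UTF-8 pages can match on the error and fall back to another decoding strategy instead of propagating it.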


pub fn page_metadata(&self) -> Result<PageMetadata, Utf8Error>

Extracts structured page metadata from HTML responses.

Response::links_iter returns a customizable iterator of links discovered in the response body.

Unlike Response::links, links_iter does not deduplicate results. Callers that need uniqueness can collect into a set or use Response::links.

Example
let links: Vec<_> = response
    .links_iter(LinkExtractOptions::default())
    .collect();
assert!(!links.is_empty());
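Since the iterator form does not deduplicate, uniqueness has to be enforced by the caller. The sketch below does this while keeping first-seen order; it works on plain `String` URLs because the fields and trait implementations of the crate's link type are not shown here, and `unique_in_order` is a hypothetical helper.

```rust
use std::collections::HashSet;

// Keeps the first occurrence of each URL and preserves discovery order.
fn unique_in_order<I>(links: I) -> Vec<String>
where
    I: IntoIterator<Item = String>,
{
    let mut seen = HashSet::new();
    links
        .into_iter()
        // `insert` returns false for a URL that was already seen.
        .filter(|url| seen.insert(url.clone()))
        .collect()
}
```

Collecting straight into a `HashSet` is shorter when order does not matter; the filter form is useful when links should be followed in the order they appear on the page.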

Response::links extracts all unique, same-site links from the response body.

This method discovers links from:

  • HTML elements with href or src attributes (<a>, <link>, <script>, <img>, etc.)
  • URLs found in text content (using link detection)

Only links pointing to the same site (same registered domain) are included.

Returns

A DashSet of Link objects containing the URL and link type.

Example
let links = response.links();
for link in links.iter() {
    println!("Found {:?} link: {}", link.link_type, link.url);
}

Trait Implementations

impl<S> Deref for ParseContext<'_, S>
where S: Spider + ?Sized,

type Target = Response

The resulting type after dereferencing.

fn deref(&self) -> &<ParseContext<'_, S> as Deref>::Target

Dereferences the value.

impl<S> DerefMut for ParseContext<'_, S>
where S: Spider + ?Sized,

fn deref_mut(&mut self) -> &mut <ParseContext<'_, S> as Deref>::Target

Mutably dereferences the value.

Auto Trait Implementations

impl<'a, S> !Freeze for ParseContext<'a, S>

impl<'a, S> !RefUnwindSafe for ParseContext<'a, S>

impl<'a, S> Send for ParseContext<'a, S>
where S: ?Sized,

impl<'a, S> Sync for ParseContext<'a, S>
where S: ?Sized,

impl<'a, S> Unpin for ParseContext<'a, S>
where S: ?Sized,

impl<'a, S> !UnwindSafe for ParseContext<'a, S>

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> Pointable for T

const ALIGN: usize

The alignment of the pointer.

type Init = T

The type for initializers.

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a value with the given initializer.

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.

impl<T> PolicyExt for T
where T: ?Sized,

fn and<P, B, E>(self, other: P) -> And<T, P>
where T: Policy<B, E>, P: Policy<B, E>,

Create a new Policy that returns Action::Follow only if self and other return Action::Follow.

fn or<P, B, E>(self, other: P) -> Or<T, P>
where T: Policy<B, E>, P: Policy<B, E>,

Create a new Policy that returns Action::Follow if either self or other returns Action::Follow.

impl<P, T> Receiver for P
where P: Deref<Target = T> + ?Sized, T: ?Sized,

type Target = T

🔬This is a nightly-only experimental API. (arbitrary_self_types)
The target type on which the method may be called.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V

impl<T> WithSubscriber for T

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.