Sitemap Crawler
by DevRev
Categories
Automation
Documentation
Knowledge Base

This snap-in enables crawling of SharePoint and Single-Page Application (SPA) websites whose documents are isolated, with no discoverable links pointing to them, such as pages reachable only through query-parameter–based URLs. It does this by fetching the sitemap from a DevRev artifact.

Features

  • Artifact-Based Sitemap
    Fetches the sitemap from a DevRev artifact and automatically refreshes the artifact's signed URL, which expires after 7 days.

  • Act-As Token Creation
    Automatically creates act-as user tokens required for web crawler API permissions.

  • Explicit URL Indexing
    Crawls all URLs explicitly listed in the sitemap without relying on link discovery.

  • Periodic Refresh Support
    Designed for timer-based execution, so that each run can pick up a fresh signed URL before the previous one expires.
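
To illustrate what "explicit URL indexing" means in practice, the sketch below reads a sitemap and lists every URL it declares, without following any links. It assumes the sitemap uses the standard sitemaps.org XML format; the function name `extract_urls` and the example URLs are hypothetical and are not part of the snap-in.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_urls(sitemap_xml: str) -> list[str]:
    """Return every <loc> value listed in a sitemap document, in order."""
    root = ET.fromstring(sitemap_xml)
    urls = []
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        # Skip empty <loc> elements rather than emitting blank URLs.
        if loc.text and loc.text.strip():
            urls.append(loc.text.strip())
    return urls


# Hypothetical sitemap content; the URLs are placeholders.
# Note that "&" must be written as "&amp;" inside XML.
EXAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/app?doc=101</loc></url>
  <url><loc>https://example.com/app?doc=102&amp;lang=en</loc></url>
</urlset>
"""

if __name__ == "__main__":
    for url in extract_urls(EXAMPLE_SITEMAP):
        print(url)
```

Because each URL is taken directly from the sitemap, pages that differ only in their query parameters are still listed individually, even though no other page links to them.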

Installation

  1. Add the snap-in's node to the workflow after a timer trigger.
  2. Pass the ID of the sitemap artifact to the node.
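
Since the signed URL expires after 7 days, the timer in step 1 presumably needs to fire more often than once a week. The sketch below shows that arithmetic only; the function `needs_refresh` and the one-day safety margin are assumptions made for illustration and do not describe the snap-in's internal logic.

```python
from datetime import datetime, timedelta, timezone

# Signed URL lifetime stated in the feature list.
SIGNED_URL_LIFETIME = timedelta(days=7)


def needs_refresh(
    issued_at: datetime,
    now: datetime,
    safety_margin: timedelta = timedelta(days=1),  # assumed margin, not from the snap-in
) -> bool:
    """Return True once the signed URL is within safety_margin of expiring."""
    return now >= issued_at + SIGNED_URL_LIFETIME - safety_margin


if __name__ == "__main__":
    issued = datetime(2024, 1, 1, tzinfo=timezone.utc)
    for days in (3, 6, 8):
        print(days, needs_refresh(issued, issued + timedelta(days=days)))
```

With these assumed values, a timer that fires every few days would always run before the previous signed URL lapses.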