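This snap-in enables crawling of SharePoint and Single-Page Application (SPA) websites whose documents are isolated, with no discoverable links pointing to them (for example, pages reachable only through query-parameter–based URLs). Instead of relying on link discovery, it fetches the list of URLs from a sitemap artifact.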
Features
Artifact-Based Sitemap
Fetches sitemap from DevRev artifacts with automatic signed URL refresh (7-day expiry).
Act-As Token Creation
Automatically creates act-as user tokens required for web crawler API permissions.
Explicit URL Indexing
Crawls all URLs explicitly listed in the sitemap without relying on link discovery.
Periodic Refresh Support
Designed for timer-based execution to handle artifact URL expiry.
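To illustrate what explicit URL indexing means in practice, here is a minimal Python sketch that reads every URL out of a sitemap, including query-parameter URLs that no page links to. It assumes the artifact holds a standard sitemaps.org XML file; the function name `extract_urls` and the sample URLs are hypothetical and are not part of the snap-in.

```python
import xml.etree.ElementTree as ET

# Assumption: the sitemap follows the standard sitemaps.org XML schema.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_urls(sitemap_xml: str) -> list[str]:
    """Return every <loc> URL listed in a sitemap, in document order."""
    root = ET.fromstring(sitemap_xml)
    urls = []
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        # Skip empty <loc> elements instead of failing on them.
        if loc.text and loc.text.strip():
            urls.append(loc.text.strip())
    return urls


# Hypothetical sitemap with query-parameter URLs that a link-following
# crawler would not find on its own.
EXAMPLE_SITEMAP = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://example.com/app?doc=101</loc></url>"
    "<url><loc>https://example.com/app?doc=102&amp;lang=en</loc></url>"
    "</urlset>"
)

if __name__ == "__main__":
    for url in extract_urls(EXAMPLE_SITEMAP):
        print(url)
```

Each extracted URL would be handed to the crawler directly, so a page is indexed even when nothing else on the site links to it.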
Installation
- Add the node to the workflow after a timer trigger.
- Pass the ID of the sitemap artifact to the node.
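The timer trigger matters because the artifact's signed URL expires after 7 days. The snap-in handles the refresh itself; the Python sketch below only illustrates the timing reasoning, under the assumption that a URL should be renewed some margin before it expires. The names `needs_refresh` and `safety_margin` and the one-day margin are hypothetical choices, not settings of the snap-in.

```python
from datetime import datetime, timedelta, timezone

# From the feature description: signed artifact URLs expire after 7 days.
SIGNED_URL_LIFETIME = timedelta(days=7)


def needs_refresh(
    issued_at: datetime,
    now: datetime,
    safety_margin: timedelta = timedelta(days=1),  # assumed margin
) -> bool:
    """Return True once the signed URL is within safety_margin of expiry."""
    return now >= issued_at + SIGNED_URL_LIFETIME - safety_margin


if __name__ == "__main__":
    issued = datetime(2024, 1, 1, tzinfo=timezone.utc)
    print(needs_refresh(issued, issued + timedelta(days=3)))
    print(needs_refresh(issued, issued + timedelta(days=6, hours=12)))
```

Under these assumptions, a timer that fires more often than once every 7 days gives each run a chance to work with a URL that is still valid.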