How to Make Your Site Reliable for AI Support in BoldDesk

To make your website a dependable reference source for BoldDesk AI, you can add your website’s publicly accessible pages to the Web Pages section within the AI module. Once added, BoldDesk AI indexes this content and uses it to generate accurate, context‑aware responses in tickets and customer interactions.
This guide explains prerequisites, supported URL types, and all available methods for adding your web content to BoldDesk’s AI knowledge base.

Requirements and permissions

Before adding web pages, ensure the following:

  • The URL must be publicly accessible. Pages requiring login, returning authentication errors, or blocked by firewalls cannot be indexed.
  • Only HTTP or HTTPS URLs are supported.
  • If an agent does not have the required permissions to access certain content sources, AI will not use that material when assisting them.
  • AI Copilot analyses the most recent ticket message when generating replies.
  • Ensure your website allows crawling. If bot restrictions are in place, whitelist the BoldDesk crawler.
  • Trial and active accounts have different data source limits. Trial users may add up to 50 web pages, while active plans support up to 10,000 pages.
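Before submitting a long list of URLs, it can help to screen them locally against the scheme requirement above. The following is a minimal sketch using only the Python standard library; the function name `is_supported_url` is hypothetical and is not part of BoldDesk. It checks the URL's form only and cannot tell whether a page is publicly reachable.

```python
from urllib.parse import urlsplit


def is_supported_url(url: str) -> bool:
    """Return True if the URL uses http or https and names a host."""
    try:
        parts = urlsplit(url.strip())
    except ValueError:
        # urlsplit rejects some malformed inputs, such as broken IPv6 hosts.
        return False
    return parts.scheme in {"http", "https"} and bool(parts.hostname)


if __name__ == "__main__":
    for candidate in ("https://example.com/docs",
                      "ftp://example.com/file",
                      "example.com/docs"):
        print(candidate, is_supported_url(candidate))
```

A URL without a scheme, such as example.com/docs, fails this check, so add https:// before entering it.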

How to add Web Pages as a Knowledge Source

BoldDesk provides a dedicated workflow for adding web pages to the AI module. Follow these steps:

1: Log In
Sign in to the BoldDesk Agent Portal using your agent or admin account.

2: Navigate to Web Pages
Go to:
AI > Content > Web Pages
This opens the Web Pages screen, where all previously added URLs are listed.

3: Add a New Web Page
Click the “Add Web Page” button.
A Web Page dialog box appears with three available methods for adding source URLs:

  • Crawl Links
  • Import Sitemap
  • Add Single URL

Each method supports different indexing requirements, depending on the structure of your documentation.

Methods for Adding Web Pages

METHOD 1: Crawl Links
Use this when you want BoldDesk to start from a single base URL and automatically crawl eligible pages within the same domain.
Fields include:

  • Brand: Select the brand this data source belongs to.

  • URL: Enter the base URL (e.g., https://example.com).

  • Include Only These Paths (optional): Restrict crawling to specific sections, such as /docs or /support.

  • Exclude Paths (optional): Prevent crawling of areas like /blog or /legal.

  • Crawl Depth: Controls how many levels deep the crawler follows links.

  • Pages to Crawl: Maximum number of pages allowed to be indexed.

  • Visibility: Choose Public or Private.

  • Sync Frequency: Determine how often pages should be refreshed.

  • Slow Scrapping: Enable this to crawl the website gradually, processing only a small number of pages at scheduled intervals to reduce load on the site.

    Crawl link.png

Click “Add & Sync” to start the indexing process.
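To preview which pages a given combination of Include Only These Paths and Exclude Paths would select, the filtering can be approximated locally. The sketch below is an illustration under stated assumptions, not BoldDesk's actual matching logic: it assumes paths are matched as whole-segment prefixes, that excludes take priority over includes, and that "same domain" means an identical host name. The function names are hypothetical.

```python
from urllib.parse import urlsplit


def path_matches(path: str, prefix: str) -> bool:
    """True if path equals prefix or is nested under it, segment by segment."""
    prefix = "/" + prefix.strip("/")
    path = "/" + path.strip("/")
    return prefix == "/" or path == prefix or path.startswith(prefix + "/")


def should_crawl(url, base_url, include=(), exclude=()):
    """Approximate the include/exclude filtering for one candidate URL."""
    target, base = urlsplit(url), urlsplit(base_url)
    # Assumption: only pages on exactly the same host are eligible.
    if target.hostname != base.hostname:
        return False
    path = target.path or "/"
    # Assumption: an exclude rule wins over an include rule.
    if any(path_matches(path, p) for p in exclude):
        return False
    # With no include list, every non-excluded path is eligible.
    if include and not any(path_matches(path, p) for p in include):
        return False
    return True


if __name__ == "__main__":
    base = "https://example.com"
    print(should_crawl("https://example.com/docs/setup", base,
                       include=["/docs", "/support"], exclude=["/blog"]))
    print(should_crawl("https://example.com/blog/post", base,
                       exclude=["/blog"]))
```

Under these assumptions, /docs matches /docs and /docs/setup but not /docs-old. Confirm the real behavior by checking the indexed page list after the first sync.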

METHOD 2: Import Sitemap
Use your domain’s XML sitemap to allow BoldDesk to discover structured URLs.

Fields include:

  • Brand

  • URL (your sitemap, such as https://example.com/sitemap.xml)

  • Include Only These Paths (optional)

  • Exclude Paths (optional)

  • Crawl Depth

  • Pages to Crawl

  • Visibility

  • Sync Frequency

  • Slow Scrapping: Enable this to crawl the website gradually, processing only a small number of pages at scheduled intervals to reduce load on the site.

    Import Sitemap.png

Click “Add & Sync” to begin sitemap processing.
This method is ideal for documentation-rich sites or portals with multiple sections.

Default limits:

  • Crawl Depth defaults to 5 when no value is specified.
  • Pages to Crawl defaults to 1,000 when no value is specified.
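To see which URLs a sitemap exposes before importing it, you can list its entries locally. This is a minimal sketch using Python's standard XML parser; the function name `sitemap_urls` and the sample XML are illustrative, and the `max_pages` default simply mirrors the 1,000-page default mentioned above. It reads a plain `urlset` sitemap only and does not follow sitemap index files. How BoldDesk itself parses sitemaps is not documented here.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/docs/</loc></url>
  <url><loc>https://example.com/docs/setup</loc></url>
  <url><loc>https://example.com/support/faq</loc></url>
</urlset>"""


def sitemap_urls(xml_text: str, max_pages: int = 1000) -> list:
    """Return up to max_pages page URLs listed in a urlset sitemap."""
    root = ET.fromstring(xml_text.strip())
    urls = [
        loc.text.strip()
        for loc in root.findall(f"{SITEMAP_NS}url/{SITEMAP_NS}loc")
        if loc.text and loc.text.strip()
    ]
    return urls[:max_pages]


if __name__ == "__main__":
    for url in sitemap_urls(SAMPLE_SITEMAP):
        print(url)
```

If the list is much longer than the Pages to Crawl value you plan to set, consider narrowing the import with Include Only These Paths.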

METHOD 3: Add Single URL
Use this method when you want BoldDesk AI to index only one specific page.
Fields include:

  • Brand

  • URL: Provide the exact page link.

  • Visibility

  • Sync Frequency

    Single URL.png

Click “Add & Sync” to save and index the page.

Additional Indexing Rules

  • Providing a root domain, such as https://www.bolddesk.com/, will result in the processing of all pages within that domain. If a specific path is provided, such as https://www.bolddesk.com/pricing, only the nested pages within that path (for example, https://www.bolddesk.com/pricing/team-based) will be processed. Other URLs will not be crawled.
  • Additionally, during crawling, only links to pages within the same domain will be considered. Links to external domains will be disregarded.
  • Sitemap URLs are also accepted. BoldDesk discovers and processes the URLs listed in the sitemap.
  • To simplify whitelisting and approval through the robots.txt file, we provide a dedicated user agent, BoldDesk-Bot. If you have restrictions in place for external crawlers, you can grant access to BoldDesk by whitelisting this specific user agent.
  • To remove URLs from the AI’s knowledge base, go to AI > Content > Web Pages, select the relevant website, and delete the URLs from the displayed list. Once deleted, the AI no longer indexes, retrieves, or references content from these URLs, which keeps outdated or irrelevant information out of its responses.
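If your site restricts crawlers through robots.txt, you can check locally whether a rule set would admit the BoldDesk-Bot user agent named above. The sketch below uses Python's standard `urllib.robotparser`. The sample robots.txt is illustrative only: it allows BoldDesk-Bot everywhere and blocks all other agents. The helper name `can_bot_fetch` is hypothetical, and Python's parser may interpret edge cases differently from BoldDesk's crawler.

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules: admit BoldDesk-Bot, block every other crawler.
ROBOTS_TXT = """\
User-agent: BoldDesk-Bot
Allow: /

User-agent: *
Disallow: /
"""


def can_bot_fetch(robots_txt: str, url: str,
                  user_agent: str = "BoldDesk-Bot") -> bool:
    """Evaluate robots.txt text for one user agent and one URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/docs/setup"
    print("BoldDesk-Bot:", can_bot_fetch(ROBOTS_TXT, page))
    print("OtherBot:", can_bot_fetch(ROBOTS_TXT, page, user_agent="OtherBot"))
```

To test your live rules, paste the contents of your own robots.txt into `ROBOTS_TXT`. A robots.txt rule does not override a firewall or bot-protection service, which must be configured separately.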

Troubleshooting Web Page Sync Errors

Possible causes of error when syncing a URL in the Web Pages section:

  • The URL might contain characters or formatting that BoldDesk doesn’t accept.
  • If the URL points to a page that doesn’t exist or returns a 404/403 error, BoldDesk may reject it.
  • If the URL is set to Private or requires authentication, BoldDesk might not be able to access or sync it.
  • Only https:// and http:// links are supported. Other schemes (such as ftp://) will cause the sync to fail.
  • If the site has an expired or invalid SSL certificate, BoldDesk may block the URL for security reasons.
  • If the target site blocks bots or external requests (for example, through robots.txt rules, bot protection, or firewall settings), BoldDesk might not be able to fetch the content.
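Several of the causes above can be narrowed down by requesting the page yourself from outside your network. The following is a rough diagnostic sketch using Python's standard library. The function names and messages are hypothetical, and sending `BoldDesk-Bot` as the User-Agent header is an assumption for testing agent-based rules. The real crawler's full header and source addresses may differ, so a successful result here does not guarantee that a sync will succeed.

```python
import ssl
import urllib.error
import urllib.request
from urllib.parse import urlsplit


def explain_status(status: int) -> str:
    """Map an HTTP status code to a likely sync problem."""
    if 200 <= status < 300:
        return "reachable"
    if status in (401, 403):
        return "access denied (login required or bot blocked)"
    if status == 404:
        return "not found"
    return f"unexpected HTTP status {status}"


def diagnose(url: str, user_agent: str = "BoldDesk-Bot",
             timeout: float = 10.0) -> str:
    """Fetch a URL once and describe the most likely obstacle."""
    try:
        scheme = urlsplit(url).scheme
    except ValueError:
        return "malformed URL"
    if scheme not in ("http", "https"):
        return "unsupported scheme (only http and https)"
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return explain_status(response.status)
    except urllib.error.HTTPError as err:
        # HTTPError must be caught before its parent class URLError.
        return explain_status(err.code)
    except urllib.error.URLError as err:
        if isinstance(err.reason, ssl.SSLError):
            return "TLS certificate problem (expired or invalid)"
        return f"could not connect: {err.reason}"
    except OSError as err:
        # Covers timeouts and other socket-level failures.
        return f"network error: {err}"


# Example (performs a real network request):
# print(diagnose("https://example.com/docs"))
```

Run it from a machine outside your corporate network or VPN, since a page reachable internally may still be blocked for external crawlers.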

Removing Individual Page Entries

To remove a specific indexed page:

  1. Open the primary website entry under AI > Content > Web Pages.

  2. Select the individual page you want to remove.

  3. Delete it from the list.

    Deleting_Web_page.gif

Changes are reflected in AI responses within approximately 10 minutes.

FAQs

1. Can I add multiple documentation domains for one brand in BoldDesk?
Yes. Each brand in BoldDesk supports multiple Web Page data sources, allowing you to combine documentation from different subdomains or portals.

2. Does BoldDesk automatically re‑index website updates?
Yes. Indexing occurs according to the Sync Frequency you configure when adding or editing a Web Page source.

3. Will BoldDesk index PDF or file content from my website?
No. The Web Pages feature indexes HTML page content only. File-based knowledge must be added through the Files section of the AI module.

4. What happens if my site temporarily goes offline?
If a page cannot be accessed at the time of sync, BoldDesk will skip it. It will be re‑indexed during the next scheduled sync when the site becomes available again.

5. Can Web Pages sources be restricted to internal use only?
Yes. Set the Visibility field to Private to restrict access to internal agents.

6. Does BoldDesk support multilingual website indexing?
Yes. As long as the URL is publicly accessible, BoldDesk AI can index and interpret multilingual documentation pages.
