If a third-party vendor needs to crawl your Webflow site for legal compliance archiving, you'll need to make sure their crawler can actually reach the site. Webflow has no built-in IP whitelisting, so this comes down to configuring the settings you do control and, if necessary, involving Webflow support.
1. Understand Webflow’s Security Settings
- Webflow does not provide built-in IP whitelisting.
- However, if you’re using site-wide password protection or restricted page access, you might need to share credentials with your vendor.
2. Check if Webflow’s Firewall is Blocking the Vendor
- Webflow automatically blocks some bot traffic to prevent abuse.
- Ensure that the vendor’s crawler is not being flagged as a harmful bot.
- Ask the vendor for their IP addresses and user agent(s) so you can pass these details to Webflow support if needed.
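You can also check from your side whether the vendor's user agent gets blocked, by sending a test request that presents that user agent. A minimal sketch using only the standard library; `VendorBot/1.0` and the site URL are placeholders, so substitute the real values the vendor gives you:

```python
from urllib import error, request

# Placeholder -- substitute the user agent string the vendor gives you.
VENDOR_UA = "VendorBot/1.0"

def build_probe_request(url: str, user_agent: str) -> request.Request:
    """Build a HEAD request that presents the vendor's user agent."""
    return request.Request(url, method="HEAD",
                           headers={"User-Agent": user_agent})

def probe(url: str, user_agent: str = VENDOR_UA) -> int:
    """Return the HTTP status the site returns to this user agent.

    A 403 here -- while a normal browser user agent gets a 200 --
    suggests the crawler is being blocked at the edge.
    """
    try:
        with request.urlopen(build_probe_request(url, user_agent)) as resp:
            return resp.status
    except error.HTTPError as exc:
        # 4xx/5xx responses are raised as HTTPError; the status code
        # is still what we want to report.
        return exc.code

if __name__ == "__main__":
    # Replace with your own site's URL before running.
    print(probe("https://your-site.webflow.io/"))
```

Comparing the status for the vendor's user agent against the status for a browser-like one tells you whether the block is user-agent based at all.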
3. Modify robots.txt to Allow Crawling
- In Webflow, go to Project Settings > SEO and edit the robots.txt file.
- Add a rule that explicitly allows the vendor's user agent. Example:
```
User-agent: VendorBot
Allow: /
```
- If the vendor's bot honors robots.txt rules, this will permit access.
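Before publishing the rule, you can sanity-check it locally with Python's standard-library robots.txt parser. In this sketch, `VendorBot` is a placeholder name, and the extra `User-agent: *` group shows that you can open the site to the vendor while still restricting other crawlers:

```python
from urllib.robotparser import RobotFileParser

# The robots.txt we plan to publish: the vendor's bot may fetch
# everything, while all other bots are kept out of /private/.
ROBOTS_TXT = """\
User-agent: VendorBot
Allow: /

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The vendor's crawler is allowed everywhere...
print(parser.can_fetch("VendorBot", "/private/archive.html"))      # True
# ...while a generic crawler is still blocked from /private/.
print(parser.can_fetch("SomeOtherBot", "/private/archive.html"))   # False
print(parser.can_fetch("SomeOtherBot", "/index.html"))             # True
```

Because a bot matching a specific `User-agent` group ignores the `*` group, the vendor's rules stand alone and won't be narrowed by your defaults.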
4. Use Webflow’s APIs Instead of Crawling (If Feasible)
- If the vendor only needs specific page or CMS data, they may be able to pull it via Webflow’s CMS (Collections) API instead, reducing or eliminating the need for full crawling.
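As a rough sketch of the API route, the request below lists the items in one CMS collection. It assumes Webflow's v2 REST API with bearer-token authentication; the collection ID and token are placeholders, so verify the exact endpoint and scopes against Webflow's current API documentation:

```python
import json
from urllib import request

# Assumed v2 REST base URL -- confirm against Webflow's API docs.
API_BASE = "https://api.webflow.com/v2"

def build_items_request(collection_id: str, token: str) -> request.Request:
    """Build the request for listing items in one CMS collection."""
    url = f"{API_BASE}/collections/{collection_id}/items"
    return request.Request(url, headers={
        "Authorization": f"Bearer {token}",
        "Accept": "application/json",
    })

def fetch_items(collection_id: str, token: str) -> dict:
    """Fetch the collection's items as parsed JSON (network call)."""
    with request.urlopen(build_items_request(collection_id, token)) as resp:
        return json.load(resp)

if __name__ == "__main__":
    # Placeholders: use your real collection ID and an API token
    # with read access to CMS data.
    items = fetch_items("YOUR_COLLECTION_ID", "YOUR_API_TOKEN")
    print(json.dumps(items, indent=2))
```

For an archiving vendor, pulling structured JSON like this is often easier to verify and store than scraped HTML.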
5. Contact Webflow Support
- Since Webflow does not natively support IP whitelisting, you may need to contact Webflow Support for assistance.
- Explain that the vendor requires access for legal compliance archiving, and provide their IP address list and user agent name.
Summary
Webflow does not offer direct IP whitelisting, but you can allow third-party crawlers by adjusting robots.txt or by contacting Webflow Support with the vendor's details. If the vendor needs only specific content, pulling it through Webflow's API is an alternative to full crawling.