Updated on Mar 7, 2024
The LiteSpeed Cache plugin for WordPress comes with its own crawler. It scans your website and refreshes any expired pages in the cache. That way, your visitors are far less likely to be served outdated content. The crawler must be enabled at the server or virtual host level, as it can consume considerable resources. Since it is a Shared Hosting plan, FastCloud Extra does not support this feature. Still, it can be easily enabled on any of our VPS or DS solutions where LiteSpeed has been purchased. The Crawler section is split into several tabs, so let’s look at each of them quickly.
As the name suggests, the Summary tab offers a general overview of the crawler’s activities. Here, you can find the files it has cached, those waiting to be scanned, and even those you have blocked from getting cached.
Additionally, the crawler can be manually run with the Manually Run button or reset to the start position with the Reset Position button. Finally, you can even monitor its progress via the button at the bottom: Show Crawler Status.
The Map tab shows which URIs are currently in the crawler map, that is, the URIs the crawler is scanning or will scan. If the Sitemap List is empty, or you want an updated version, click the Refresh Crawler Map button. There is also a search bar to check whether a particular URI has been scanned or is still in the queue.
Additionally, adding a URI to the blocklist is as easy as clicking the button next to it in the Sitemap List. You can also see the status of each URI via the colored dots (legend at the bottom of the page), and finally, you can clear the list with the Clear Crawler Map button.
The Blocklist tab displays any URIs that are currently blocked. Here, you can also manually remove entries from the blocklist. Once more, the status of each entry is depicted by colored dots, with a legend at the bottom of the page. Finally, the Empty Blocklist button will purge all entries from the list in case you want to start from scratch.
The General Settings tab has all the options for controlling the LiteSpeed Cache crawler. These options only apply if the crawler is enabled at the server or virtual host level. Even if you enable the crawler from this tab, it will not run if it has not been enabled on the server. Nonetheless, let’s look at what each option does. The default values are typically good enough, but you should tweak them according to your server’s capabilities.
These options are useful if you want to fine-tune the amount of resources the LiteSpeed Cache crawler consumes. If your server struggles at times, give the crawler stricter timeframes.
The Simulation Settings tab allows you to run the crawler from different points of view. By default, the crawler “views” the website as a guest who is not logged in. However, if you want it to run as a specific user group, you can put that user group’s ID in the field. The crawler will go down the list and crawl the site as each of those user groups in turn.
Additionally, LiteSpeed Cache can crawl for specific cookies. Add a new cookie to simulate and insert its name alongside the values the crawler should look for. We recommend you check out LiteSpeed’s official documentation on this section for more information and examples.
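To get a feel for how simulation settings can multiply the crawler's workload, here is a minimal Python sketch. It assumes that every combination of a simulated user group and a cookie value is crawled as its own pass, which is an illustration rather than the plugin's confirmed logic. The function name `crawl_passes`, the `role_ids` and `cookie_values` parameters, and the `currency` cookie are all hypothetical.

```python
from itertools import product

def crawl_passes(role_ids, cookie_values):
    """List hypothetical crawl passes: one per (role, cookie combination).

    Assumption: each combination is crawled separately. This is an
    illustration, not the plugin's actual implementation.
    """
    roles = ["guest"] + list(role_ids)      # the guest view is always included
    cookie_names = sorted(cookie_values)    # fixed order for reproducibility
    value_lists = [cookie_values[name] for name in cookie_names]

    passes = []
    for role in roles:
        # product() with no cookies yields one empty combination,
        # so each role still gets exactly one pass.
        for combo in product(*value_lists):
            passes.append({"role": role, "cookies": dict(zip(cookie_names, combo))})
    return passes

# Hypothetical example: two simulated IDs and one cookie with two values.
passes = crawl_passes(role_ids=[2, 3], cookie_values={"currency": ["usd", "eur"]})
print(len(passes))  # 3 roles x 2 cookie values
```

Under that assumption, each extra ID or cookie value adds a full crawl of the site, so it is worth keeping the lists short on servers with limited resources.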
The final tab in the Crawler section is Sitemap Settings. Here, you can give the crawler a specific sitemap to scan instead of your entire website. Third-party tools can generate such sitemaps. The only requirement is that they be in Google XML Sitemap Format. A custom sitemap is useful if you want to tell the crawler to go through particular parts of your website.
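If you would rather build a small custom sitemap yourself, the sketch below shows one way to do it in Python using only the standard library. It assumes the standard sitemaps.org XML protocol is what the crawler expects, and the `build_sitemap` helper and the example.com URLs are placeholders, not something the plugin provides.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(urls):
    """Return a minimal sitemap XML string listing the given absolute URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")

# Placeholder URLs: list only the parts of the site you want crawled.
sitemap_xml = build_sitemap([
    "https://example.com/",
    "https://example.com/shop/",
])
print(sitemap_xml)
```

Save the output as an `.xml` file somewhere publicly reachable on your site and point the crawler's sitemap field at its URL.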
Before crawling, the crawler parses the sitemap and saves it to the database. With the Drop Domain from Sitemap switch turned on, the URLs are stored without the domain, which reduces the sitemap’s footprint in the database. This is generally a helpful feature, but you should leave it off if you use multiple domains for one site and those domains are in the sitemap.
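The following Python sketch illustrates the idea of dropping the domain and why it clashes with multi-domain sitemaps. It is an approximation for illustration only, not the plugin's code. The `drop_domain` helper and all URLs are hypothetical.

```python
from urllib.parse import urlsplit

def drop_domain(url):
    """Return the path (and query string, if any) of a URL without its domain."""
    parts = urlsplit(url)
    path = parts.path or "/"  # a bare domain maps to the site root
    return path + ("?" + parts.query if parts.query else "")

urls = ["https://example.com/shop/", "https://example.com/blog/post?page=2"]
stored = [drop_domain(u) for u in urls]
print(stored)

# With multiple domains, different pages collapse into the same entry,
# so the crawler can no longer tell them apart.
same = drop_domain("https://a.example/about/") == drop_domain("https://b.example/about/")
print(same)
```

Shorter entries mean less data stored per URL, but once the domain is gone, pages that share a path on different domains become indistinguishable.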
The final option in the Crawler section is Sitemap Timeout. It sets how many seconds the crawler has to scan the sitemap. That concludes our look into the Crawler section of the LiteSpeed Cache plugin for WordPress. All that is left is the Toolbox.