WordPress Robots.txt Generator & Tester

A misconfigured robots.txt file is one of the most damaging technical SEO mistakes a website can make, and one of the easiest to introduce accidentally. When we audit WordPress sites for clients, we see two recurring errors: sites blocking their own CSS and JavaScript (which prevents Google from rendering pages), and staging configurations that accidentally carry over to live sites. This tool generates safe WordPress-specific defaults and lets you test any path before going live.

Includes safe defaults for WordPress.

How to Use

Start with the safe defaults and add extra rules only if needed.
Add your sitemap URL(s) if you want them referenced.
Use the tester to check if a path is allowed or blocked for a bot.

Generator

WordPress safe defaults
Block all crawlers

WordPress-specific options

Block author archives (/?author=, /author/)
Block RSS/Atom feeds (/feed/, /comments/feed/)
Block WordPress boilerplate files (readme.html, license.txt, xmlrpc.php…)

Extra rules (one per line)

Sitemap URL(s) (one per line)

robots.txt output

WordPress users: place this file at the root of your domain (https://example.com/robots.txt). Most hosting panels have a File Manager, or you can use an SEO plugin with a built-in robots.txt editor: in Yoast SEO it is under Tools → File editor, and in Rank Math under General Settings → Edit robots.txt.

Rules Tester (Google-like: longest match wins)

User-agent

URL path to test

Result
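
The tester is labelled "Google-like: longest match wins". The page's own implementation is not shown, so the following is only a minimal Python sketch of that rule, assuming the rules for a single user-agent group have already been parsed into (directive, pattern) pairs. The names is_allowed and _pattern_to_regex are hypothetical. It supports the * wildcard and the trailing $ anchor, and on a tie in pattern length it prefers the Allow rule.

```python
import re


def _pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    # '*' matches any run of characters; a trailing '$' anchors the end of the path.
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))


def is_allowed(rules, path):
    """rules: list of ("allow" | "disallow", pattern) pairs for ONE user-agent group."""
    best_len = -1
    best_allow = True  # a path that matches no rule is allowed
    for directive, pattern in rules:
        if not pattern:  # an empty pattern matches nothing
            continue
        if _pattern_to_regex(pattern).match(path):
            allow = directive == "allow"
            # The longest pattern wins; on equal length, Allow wins.
            if len(pattern) > best_len or (len(pattern) == best_len and allow):
                best_len, best_allow = len(pattern), allow
    return best_allow


rules = [
    ("disallow", "/wp-admin/"),
    ("allow", "/wp-admin/admin-ajax.php"),
]
print(is_allowed(rules, "/wp-admin/admin-ajax.php"))
print(is_allowed(rules, "/wp-admin/options.php"))
```

With these two rules, admin-ajax.php is allowed because its Allow pattern is longer than the Disallow pattern that also matches it, while other /wp-admin/ paths stay blocked.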

What does robots.txt actually do?

The robots.txt file sits at the root of your domain (yourdomain.com/robots.txt) and instructs web crawlers which paths they’re allowed to request. It’s the first thing Googlebot checks when it visits a site for the first time.

Critical distinction: robots.txt controls crawling, not indexing. A page that’s blocked in robots.txt can still appear in Google search results: if Google has seen the URL in a link from another site, it can index the URL without ever reading the content. To prevent indexing, use a noindex meta tag on the page itself (or an X-Robots-Tag: noindex response header), and leave the page crawlable so Googlebot can actually read that directive. robots.txt and noindex serve different purposes.
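
To make the noindex side of that distinction concrete, here is a small Python sketch that checks whether a page carries a noindex signal, either in a robots meta tag or in an X-Robots-Tag header value passed in by the caller. The class and function names are hypothetical, and the check is deliberately simplified: it ignores bot-specific prefixes in the header (such as "googlebot: noindex").

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" ...> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for token in attr.get("content", "").split(","):
                self.directives.add(token.strip().lower())


def has_noindex(html, x_robots_tag=""):
    """True if the HTML or the X-Robots-Tag header value asks not to be indexed."""
    parser = RobotsMetaParser()
    parser.feed(html)
    header = {token.strip().lower() for token in x_robots_tag.split(",")}
    signals = parser.directives | header
    # "none" is shorthand for "noindex, nofollow".
    return "noindex" in signals or "none" in signals


page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(has_noindex(page))
```

A crawler can only see these signals if it is allowed to fetch the page, which is why a URL that must stay out of the index should not also be disallowed in robots.txt.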

What should a WordPress robots.txt block by default?

Block these:

/wp-admin/ – the admin dashboard (except /wp-admin/admin-ajax.php, which frontend forms depend on)
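/wp-includes/ – WordPress core files not needed by crawlers (if you block it, keep /wp-includes/js/ and /wp-includes/css/ crawlable, since front-end scripts and styles load from there)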
/trackback/ and /xmlrpc.php – legacy endpoints that generate spam crawl traffic
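
The block list above can be sketched as a small generator function. This is an illustration in Python, not this page's actual generator: the function name wordpress_robots_txt and its parameters are hypothetical, and the two Allow lines for /wp-includes/js/ and /wp-includes/css/ are an added assumption, included so that core scripts and styles stay crawlable, in line with the warning about blocking CSS and JavaScript.

```python
def wordpress_robots_txt(sitemaps=(), extra_rules=()):
    """Build a robots.txt string from the default WordPress block list."""
    lines = [
        "User-agent: *",
        "Disallow: /wp-admin/",
        "Allow: /wp-admin/admin-ajax.php",  # frontend forms depend on this endpoint
        "Disallow: /wp-includes/",
        "Allow: /wp-includes/js/",   # assumption: keep core scripts crawlable
        "Allow: /wp-includes/css/",  # assumption: keep core styles crawlable
        "Disallow: /trackback/",
        "Disallow: /xmlrpc.php",
    ]
    lines.extend(extra_rules)  # extra rules, one per line, as in the form above
    if sitemaps:
        lines.append("")
        lines.extend(f"Sitemap: {url}" for url in sitemaps)
    return "\n".join(lines) + "\n"


print(wordpress_robots_txt(sitemaps=["https://example.com/sitemap_index.xml"]))
```

The sitemap URL in the usage line is a placeholder; confirm the real one on your site first.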

What should you never block in robots.txt?

Your sitemap URL – always include a Sitemap: directive pointing to your sitemap index (for example, sitemap_index.xml)
/wp-content/uploads/ – blocking this hides your images from Google Image Search
/wp-content/themes/ – blocking theme CSS/JS prevents Google from rendering your pages correctly

When is it safe to block all crawlers?

Use Disallow: / only on staging or development environments that should not appear in search results at all. Never deploy this rule to a live production site. A safer approach for staging is password protection (Basic Auth) combined with a noindex header, rather than relying on robots.txt alone: password protection is absolute, whereas robots.txt is only advisory.
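
One way to catch a staging file that has carried over to production is a pre-deploy check. The Python sketch below flags a robots.txt whose "User-agent: *" group contains a blanket Disallow: /. The function name blocks_everything is hypothetical, and the parsing is intentionally minimal: it handles comments and grouped user-agent lines but is not a full robots.txt parser.

```python
def blocks_everything(robots_txt: str) -> bool:
    """True if the group for 'User-agent: *' contains a bare 'Disallow: /'."""
    agents = []
    in_rules = False
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a user-agent line after rules starts a new group
                agents, in_rules = [], False
            agents.append(value.lower())
        elif field in ("allow", "disallow"):
            in_rules = True
            if field == "disallow" and value == "/" and "*" in agents:
                return True
    return False


staging = "User-agent: *\nDisallow: /\n"
live = "User-agent: *\nDisallow: /wp-admin/\nAllow: /wp-admin/admin-ajax.php\n"
print(blocks_everything(staging), blocks_everything(live))
```

A deploy script could refuse to publish when this returns True for the production file.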

Note for Yoast Plugin

If you’re using Yoast SEO, your sitemap is typically available at /sitemap_index.xml.

Confirm it on your site before adding it to robots.txt.

FAQs

Will blocking a URL in robots.txt remove it from Google search results? No. robots.txt controls whether Googlebot will crawl a URL, not whether Google will index it. A blocked URL can still appear in search results if Google has seen it referenced in another site’s links. To remove a URL from search results, you need a noindex meta tag on the page itself (which requires that Googlebot can access the page to read it), or you can use the Removals tool in Google Search Console for temporary removal.
Should I block /wp-admin/ in robots.txt? Yes, with one exception: block /wp-admin/ but explicitly allow /wp-admin/admin-ajax.php. Many WordPress contact forms, WooCommerce cart functions, and frontend plugins route their requests through admin-ajax.php — blocking the entire /wp-admin/ directory without this exception can break frontend functionality.
How do I add my sitemap to robots.txt? Add a Sitemap: directive at the end of your file, on its own line: Sitemap: https://yourdomain.com/sitemap_index.xml. Both Yoast SEO and Rank Math serve the sitemap index at /sitemap_index.xml by default. Confirm by visiting the URL in your browser before adding it.
Can robots.txt block AI crawlers like GPTBot? Yes, for crawlers that honor robots.txt. Add a group for each crawler: User-agent: GPTBot followed by Disallow: / tells OpenAI’s crawler not to crawl your site for training data. Common user-agent tokens include GPTBot (OpenAI), CCBot (Common Crawl), and Google-Extended, which is a control token rather than a separate crawler: it governs whether Google may use the content it crawls for its generative AI models.
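Why does Google sometimes ignore my robots.txt? In most cases it isn’t ignoring the file. Googlebot obeys robots.txt for crawling, but robots.txt does not control indexing: if a blocked URL has external links pointing to it, Google may still index the URL and show it in results without the full content. The file’s own status code also matters. Google treats most 4xx responses for robots.txt (such as a 404) as if no robots.txt exists and crawls without restrictions, while 5xx and 429 responses are treated as a temporary block on crawling the whole site. Google also caches the file, generally for up to 24 hours, so changes are not picked up instantly. Keep your robots.txt accessible and returning a 200 status code.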
Can I block bad bots here? You can discourage them, but true blocking is done at the server/WAF level.
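
The AI-crawler answer above can be sketched as a helper that emits one group per token. This is an illustrative Python snippet, not this page's generator: the function name ai_crawler_rules and the constant AI_CRAWLER_TOKENS are hypothetical, and the token list is only the three named in the FAQ.

```python
AI_CRAWLER_TOKENS = ("GPTBot", "CCBot", "Google-Extended")


def ai_crawler_rules(tokens=AI_CRAWLER_TOKENS):
    """Return robots.txt groups that disallow everything for each given token."""
    groups = [f"User-agent: {token}\nDisallow: /" for token in tokens]
    return "\n\n".join(groups) + "\n"


print(ai_crawler_rules())
```

Append the output to an existing robots.txt. These rules only affect crawlers that choose to honor them; anything stricter belongs at the server or WAF level.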

Technical SEO issues such as misconfigured crawl rules, indexation problems, and Core Web Vitals failures are covered in every SEO audit we run for clients. We fix them, not just report them.

See Our SEO Services
