Note: This plugin requires at least version 3.3 of WordPress.
The Name Redactor is a WordPress plugin which allows WordPress users to
hide personal data from search engines. As the name of the plugin implies,
the personal data in question consists of personal names. The plugin works
by checking whether the visitor to the site is
human or a search engine robot. If the visitor is a search engine robot,
the plugin will redact any personal names before delivering the content,
replacing them with the text [redacted]. To human visitors, the names
will appear as normal.
The web is full of personal names, which are usually attached to some
contextual data (e.g. utterances, images, etc.). If these personal names
are indexed by search engines, along with the contextual
data attached to them, both will be discoverable by anyone searching for a
specific name. While some such discoveries may be beneficial to the
subject, others may be harmful. The purpose of the Name Redactor is not to
block search engines from accessing your WordPress site or indexing your
content. The purpose is to keep personal names from being indexed along
with the contextual data attached to those names.
The Name Redactor plugin works by detecting if a visitor to the site is a
search engine robot, and if so, the plugin will redact any personal names
(which have been tagged with <redact content="name"></redact>) before
delivering the content, replacing them with the text [redacted].
The tagging can be done either manually by the publisher or automatically
by the plugin.
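As a rough illustration of the redaction step described above, the following Python sketch replaces every tagged name with the text [redacted]. It is not the plugin's own code (the plugin runs inside WordPress); the pattern REDACT_TAG and the function redact_tagged are hypothetical names, and the sketch assumes the tag is written exactly as <redact content="name">...</redact>.

```python
import re

# Assumed tag form, as quoted in the text: <redact content="name">...</redact>.
# The names REDACT_TAG and redact_tagged are illustrative only.
REDACT_TAG = re.compile(r'<redact content="name">(.*?)</redact>', re.DOTALL)


def redact_tagged(html: str) -> str:
    """Replace each tagged name, together with its tags, with [redacted]."""
    return REDACT_TAG.sub("[redacted]", html)


sample = 'Yesterday <redact content="name">Jane Doe</redact> spoke at the meeting.'
print(redact_tagged(sample))
# prints: Yesterday [redacted] spoke at the meeting.
```

The non-greedy (.*?) keeps two tagged names on the same line from being merged into a single redaction.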
When you install the plugin for the first time, it is set by default to
only redact names that have been manually tagged. If you go to add a new
post, page, or comment (or edit already existing content) and select the
Text Editor, you will see that a new button has been added to the
pre-existing ones. This button, labeled redact, allows you to tag a
name in the text. Simply select the name you want to tag, and press the
redact button. Alternatively, place the cursor before a name, press the
redact button to add the name redact tag, place the cursor after the name,
and press the redact button again to close the tag. Note that these tags
will only be visible in the page source of the website. Before publishing
something, you can view the text from a bot’s point of view by pressing
the ‘Preview’ button (note that you first need to select this option from
the plugin settings menu).
Also note that when uninstalling the plugin, any manually tagged names
will remain tagged. If you want to remove the tags, you will have to
remove them manually as well, by going back and editing the content.
You can also set the plugin to automatically try to detect personal names,
and redact them accordingly. This automatic name detection is accomplished
by using a simple set of rules, written as regular expressions:
1. It will match a single word with the first letter capitalized, as long
as that word is not at the beginning of the sentence.
2. It will match two or more consecutive words, each with the first
letter capitalized, as long as the first word is not at the beginning
of the sentence.
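The two rules above can be approximated with a single regular expression. The Python sketch below is only an illustration, not the plugin's actual expressions: NAME_RUN and redact_auto are hypothetical names, and it assumes that a sentence begins either at the very start of the text or directly after ".", "!" or "?" followed by one whitespace character.

```python
import re

# One or more consecutive capitalized words, where the first word is neither
# at the start of the text nor directly after a sentence terminator.
# Assumption: exactly one whitespace character follows ".", "!" or "?".
NAME_RUN = re.compile(
    r"(?<!^)(?<![.!?]\s)\b[A-Z][a-z]+\b(?: [A-Z][a-z]+\b)*"
)


def redact_auto(text: str) -> str:
    """Return a copy of text with every matched run replaced by [redacted]."""
    return NAME_RUN.sub("[redacted]", text)


print(redact_auto("Yesterday we met John Smith. Then Mary left."))
# prints: Yesterday we met [redacted]. Then [redacted] left.
```

Rules of this kind cannot tell names from other capitalized words, so a weekday or place name in the middle of a sentence would be redacted as well, while a name that opens a sentence would be missed.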
Names that have been tagged manually will remain tagged until the
tags are manually removed (so if you later wish to remove the
tags from a name, you will have to go back and edit the post, comment, or
page in question). Automatic tagging is done on the spot whenever the
content is requested by a search engine bot. This means that the content
in the database is left unchanged, and no tags are saved along with the
text.
Detecting whether a visitor to the site is a web crawler is done by
checking the “User-Agent” header of the client software originating the
request (see the Wikipedia page for more information on this). Whenever a
visitor requests to view the content, be it a page, comment, or post, the
plugin will check the User-Agent string against a list of known search engine bot
names. If the User-Agent matches a name in the list, the plugin will
redact any tagged content before returning it to the bot. Upon
installation, the plugin will add a default set of bot names to the list.
The user can then freely add names to the list or delete names from it.
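The check described above might look roughly like the following Python sketch. It is an assumption-laden illustration, not the plugin's code: the list KNOWN_BOTS stands in for the plugin's default list (whose actual contents are not given here), is_search_bot is a hypothetical name, and the sketch assumes a case-insensitive substring match, which the plugin may or may not use.

```python
# Hypothetical stand-in for the plugin's editable list of bot names.
KNOWN_BOTS = ["Googlebot", "bingbot", "DuckDuckBot"]


def is_search_bot(user_agent: str, bots=KNOWN_BOTS) -> bool:
    """Return True if any listed bot name occurs in the User-Agent string.

    Assumption: matching is a case-insensitive substring test.
    """
    ua = user_agent.lower()
    return any(bot.lower() in ua for bot in bots)


crawler = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
browser = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Gecko/20100101 Firefox/121.0"

print(is_search_bot(crawler))  # prints: True
print(is_search_bot(browser))  # prints: False
```

Because the User-Agent header is supplied by the client, a check like this only recognizes crawlers that identify themselves with a name on the list.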
Note that while the plugin is primarily meant as a way of preventing
search engines from indexing personal names, it can, in theory, also be
used to prevent disclosure of other types of personal data, by manually
tagging it in the same manner as you would names.
You can change the settings for the plugin in the ‘Name Redactor Settings’
submenu, located in the ‘Tools’ menu in the admin panel. The settings are
organized into three options pages, with tabs to make navigation easier:
Options, Opt-in/opt-out, and Bots.
Credit
The original idea for this plugin comes from Gisle Hannemyr
(http://hannemyr.com/index.php).