HrefHawk

HrefHawk

Details
View on WordPress

HrefHawk is an internal linking plugin that reads your content, extracts meaningful phrases, scores them for relevance, and suggests where one post should link to another. No external API calls. No third-party dependencies. Everything runs inside your WordPress installation.

The Lexical engine scans every published post through a seven-stage cleaning pipeline that strips Gutenberg markup, shortcodes, HTML tags, entities, unicode artefacts, and punctuation. The cleaned text is split into sentences, then into phrases at configurable depth (1-word, 2-word, 3-word). Each phrase is indexed with an MD5 hash and mapped back to its source post with frequency, position, heading presence, title presence, and first-paragraph presence recorded as scoring signals.

After scanning, the scoring engine evaluates every phrase-to-post relationship using five weighted scorers: TF-IDF (term uniqueness across your corpus), phrase length (longer phrases score higher), structural context (heading and first-paragraph bonuses), title matching (exact phrase appears in the target post title), and category matching (source and target share a taxonomy term). Recency scoring biases suggestions toward fresher content. Each scorer contributes a weighted component to the final score.

Suggestions appear in the post editor sidebar under the HrefHawk panel. Each suggestion shows the source phrase, the target post, and the computed score. You accept or reject each suggestion individually. Accepted suggestions are committed as live links in your post content. Rejected suggestions are hidden permanently. You can revoke an accepted link at any time, which removes the anchor tag from the content and returns the suggestion to pending state.

The scan runs on a burst-aware pipeline that calibrates itself to your server’s timeout limits. A loopback test measures available execution time, then the Dispatcher fires sequential bursts that process posts in batches, saving resume position on timeout. Long scans across thousands of posts complete reliably without hitting PHP timeout limits or memory ceilings.

Auto-rescan fires 60 seconds after each post save, rescanning only the saved post. This keeps the phrase index current without requiring a full site-wide scan after every edit. Auto-rescan can be paused during bulk edits from the Settings page.

Daily maintenance runs automatically: orphan cleanup removes database rows referencing deleted posts, and weekly table optimisation reclaims fragmented space and rebuilds indexes.

All diagnostic output routes through a dedicated Logger that writes daily-rotated log files with .htaccess protection. Debug logging is off by default and toggled from the Settings page.

Upgrade Notices

0.1.0

Initial release.

Details

Plugin code:
href-hawk
Plugin version:
0.1.0
Author:
Outdated:
No
WP version:
6.0 or higher
PHP version:
7.4 or higher
Test up to WP version:
7.0
Total installations:
0
Last updated:
2026-06-04
Rating:
Times rated:
0
content-links
interlinking
internal-links
link-suggestions
seo