BackLink - Check
Sometimes it is helpful to check external links to your own page: Does the link exist, do robots.txt and meta name='robots' allow every search engine to read and follow it, and is the link placed directly on the page, or only inside an iFrame or JavaScript?
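The tool itself ships only as a .NET 1.1 executable, so none of its code is shown here. Purely as an illustration of the meta name='robots' part of this check, a minimal sketch in Python (standard library only; the class and function names are made up) might look like this:

```python
# Hedged sketch, not the tool's code: read a linking page and check whether a
# meta name='robots' tag forbids following links ('nofollow' or 'none').
from html.parser import HTMLParser
import urllib.request


class MetaRobotsParser(HTMLParser):
    """Collects the content of <meta name='robots' ...> tags."""

    def __init__(self):
        super().__init__()
        self.robots_content = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            self.robots_content.append((attrs.get("content") or "").lower())


def link_may_be_followed(url):
    """Return False if the page declares 'nofollow' (or 'none') for robots."""
    html = urllib.request.urlopen(url).read().decode("utf-8", errors="replace")
    parser = MetaRobotsParser()
    parser.feed(html)
    return not any("nofollow" in c or "none" in c for c in parser.robots_content)
```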
First, the program lets you add a list of URLs: some as destinations, and for each destination the pages that should contain external links to it. For each domain, the program checks robots.txt to see whether the page may be accessed. Then the page is loaded, all comments and JavaScript blocks are removed, the a-elements pointing to the destination are counted, and the InnerHtml / InnerText of the first match is added to the table.
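As a rough sketch of this counting step (again in Python rather than the tool's actual .NET code; count_backlinks is a made-up name, and only the InnerText part is reproduced, not InnerHtml), the logic could look like this:

```python
# Sketch of the counting step described above: strip HTML comments and script
# blocks, then count <a> elements whose href points at the destination and
# remember the text of the first matching anchor.
import re
from html.parser import HTMLParser


class AnchorCounter(HTMLParser):
    def __init__(self, destination):
        super().__init__()
        self.destination = destination
        self.count = 0
        self.in_first_match = False
        self.first_text = None

    def handle_starttag(self, tag, attrs):
        href = dict(attrs).get("href") or ""
        if tag == "a" and href.startswith(self.destination):
            self.count += 1
            if self.first_text is None:
                self.in_first_match = True  # inside the first matching anchor

    def handle_data(self, data):
        if self.in_first_match and data.strip():
            self.first_text = data.strip()  # rough stand-in for InnerText
            self.in_first_match = False

    def handle_endtag(self, tag):
        if tag == "a":
            self.in_first_match = False


def count_backlinks(html, destination):
    # Remove comments and script blocks before parsing, as described above.
    html = re.sub(r"<!--.*?-->", "", html, flags=re.DOTALL)
    html = re.sub(r"<script\b.*?</script>", "", html,
                  flags=re.DOTALL | re.IGNORECASE)
    parser = AnchorCounter(destination)
    parser.feed(html)
    return parser.count, parser.first_text
```

Called as count_backlinks(page_html, "http://example.org/"), such a function would return the number of matching links and the text of the first one.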
Between two scans the program sleeps for 2 seconds, so it is slow but does not fetch too many pages. It honors the Robots Exclusion Protocol, so it can be banned completely by adding 'User-agent: BackLink-Check', 'Disallow: /' to a server's robots.txt. The list of URLs is stored in an XML file, which can be edited directly with Notepad. It is a .NET 1.1 tool, so no installation or uninstallation is required: unzip it into a new folder (.exe and .exe.xml) and use it, or delete the folder; the registry is not used.
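The robots.txt handling and the 2-second pause could be sketched as follows (assumptions: Python's standard urllib.robotparser stands in for whatever the tool uses internally, and scan_pages is a made-up driver function):

```python
# Sketch of the politeness behaviour described above: ask robots.txt whether
# the 'BackLink-Check' user agent may fetch a page, and wait 2 seconds
# between scans.
import time
import urllib.robotparser
from urllib.parse import urljoin, urlparse

USER_AGENT = "BackLink-Check"  # the name a server can ban via robots.txt


def allowed_by_robots(url):
    root = "{0.scheme}://{0.netloc}/".format(urlparse(url))
    rp = urllib.robotparser.RobotFileParser()
    rp.set_url(urljoin(root, "robots.txt"))
    rp.read()
    return rp.can_fetch(USER_AGENT, url)


def scan_pages(urls):
    for url in urls:
        if allowed_by_robots(url):
            pass  # load and analyse the page here
        time.sleep(2)  # pause between two scans so the crawl stays slow
```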
Additional information (2004/05/20): The source HTML pages must have correct html / body elements for links to be found. There is no official protocol for how search engines should handle links inside head or noframes elements, or whether they should accept HTML links outside a body element. If a link cannot be found, check the HTML source directly.
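To illustrate why links outside a body element may be missed, a parser that only collects anchors while it is inside body could be sketched as follows (hypothetical class name, Python standard library only, not the tool's actual implementation):

```python
# Sketch of the limitation above: anchors are only collected while the parser
# is inside the body element, so links in head, noframes, or outside body are
# ignored.
from html.parser import HTMLParser


class BodyOnlyAnchors(HTMLParser):
    def __init__(self):
        super().__init__()
        self.in_body = False
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        elif tag == "a" and self.in_body:
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False
```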