uk.ac.starlink.xdoc
Class LinkChecker

java.lang.Object
  extended by uk.ac.starlink.xdoc.LinkChecker

public class LinkChecker
extends Object

Checks an XHTML document to see that the links it references are valid URLs.

Author:
Mark Taylor (Starlink)

Constructor Summary
LinkChecker(URL context, boolean attemptExternal)
          Constructs a new LinkChecker with a given home context.
 
Method Summary
 boolean checkLinks(Source xsltSrc, Source xmlSrc)
          Checks the result of an XML transformation to see if the links in the result are OK or not.
 boolean checkLinks(Source xsltSrc, Source xmlSrc, Map params)
          Checks the result of an XML transformation to see if the links in the result are OK or not, with an optional list of parameters.
 URLConnection followRedirectsWithTimeout(URLConnection conn)
          Takes a URLConnection and repeatedly follows 303 redirects until a non-303 status is achieved.
 int getExternalFailures()
          Returns the total number of external link resolution failures this checker has come across.
 int getLocalFailures()
          Returns the total number of local link resolution failures this checker has come across.
 int getTimeout()
          Returns the network timeout used for retrieving URLs.
protected  void logMessage(String msg)
          Interface through which short messages about progress can be logged.
static void main(String[] args)
          Checks the links of the result of a given transformation to XHTML (or an HTML-like result).
 void setTimeout(int timeoutSecs)
          Sets the network timeout used for retrieving URLs.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

LinkChecker

public LinkChecker(URL context,
                   boolean attemptExternal)
Constructs a new LinkChecker with a given home context. This should be the URL of the document being checked, or at least its directory, if relevant. It is used for relative link resolution.

Parameters:
context - document context
attemptExternal - true if you want to check external (http) links; if false, only local ones will be checked
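
How a context drives relative link resolution can be illustrated with the standard library alone. The following is a minimal sketch, not LinkChecker's own code: the class ContextResolveSketch and its resolve method are hypothetical names, and it assumes resolution follows the usual java.net.URI rules.

```java
import java.net.URI;

/** Hypothetical illustration; not part of uk.ac.starlink.xdoc. */
public class ContextResolveSketch {

    /**
     * Resolves an href against a context URI using java.net.URI rules.
     * Absolute hrefs come back unchanged; relative ones (including
     * fragment-only hrefs such as "#intro") are resolved against the context.
     */
    public static URI resolve(URI context, String href) {
        return context.resolve(href);
    }

    public static void main(String[] args) {
        // Assumed example context: the URL of the document being checked.
        URI context = URI.create("http://example.org/docs/index.html");
        System.out.println(resolve(context, "sect.html"));
        System.out.println(resolve(context, "#intro"));
        System.out.println(resolve(context, "http://other.example/"));
    }
}
```

Under these rules a directory context needs a trailing slash ("http://example.org/docs/"); without it the last path segment is treated as a file name and is replaced during resolution.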
Method Detail

checkLinks

public boolean checkLinks(Source xsltSrc,
                          Source xmlSrc)
                   throws TransformerException,
                          MalformedURLException
Checks the result of an XML transformation to see if the links in the result are OK or not.

Parameters:
xsltSrc - source for the XSLT stylesheet which converts to HTML or an HTML-like output format
xmlSrc - source for the XML document which will be transformed by xsltSrc to produce the HTML to test
Returns:
true iff all the links in the resulting XHTML document can be successfully resolved
Throws:
TransformerException
MalformedURLException

checkLinks

public boolean checkLinks(Source xsltSrc,
                          Source xmlSrc,
                          Map params)
                   throws TransformerException,
                          MalformedURLException
Checks the result of an XML transformation to see if the links in the result are OK or not, with an optional list of parameters.

Parameters:
xsltSrc - source for the XSLT stylesheet which converts to HTML or an HTML-like output format
xmlSrc - source for the XML document which will be transformed by xsltSrc to produce the HTML to test
params - stylesheet parameter map (or null)
Returns:
true iff all the links in the resulting XHTML document can be successfully resolved
Throws:
TransformerException
MalformedURLException
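
To make the roles of xsltSrc, xmlSrc and params concrete, the sketch below applies a stylesheet to a document with an optional parameter map, using only javax.xml.transform. It is an assumption about the general shape of such a transformation, not LinkChecker's implementation; the class TransformSketch, the stylesheet, and the parameter name "base" are all hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Map;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical illustration; not part of uk.ac.starlink.xdoc. */
public class TransformSketch {

    /** Transforms xmlSrc with xsltSrc, passing params (which may be null). */
    public static String transform(Source xsltSrc, Source xmlSrc, Map<String, ?> params)
            throws TransformerException {
        Transformer trans = TransformerFactory.newInstance().newTransformer(xsltSrc);
        if (params != null) {
            // Each map entry becomes a top-level stylesheet parameter.
            for (Map.Entry<String, ?> entry : params.entrySet()) {
                trans.setParameter(entry.getKey(), entry.getValue());
            }
        }
        StringWriter out = new StringWriter();
        trans.transform(xmlSrc, new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        // Made-up stylesheet: turns each <ref file="..."/> into an <a href="...">.
        String xslt =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
            + "<xsl:param name='base'/>"
            + "<xsl:template match='/doc'>"
            + "<html><body><xsl:for-each select='ref'>"
            + "<a href='{$base}/{@file}'><xsl:value-of select='@file'/></a>"
            + "</xsl:for-each></body></html>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";
        String xml = "<doc><ref file='intro.html'/></doc>";
        System.out.println(transform(new StreamSource(new StringReader(xslt)),
                                     new StreamSource(new StringReader(xml)),
                                     Map.of("base", "docs")));
    }
}
```

The hrefs in the resulting markup are what a link checker would then go on to resolve.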

setTimeout

public void setTimeout(int timeoutSecs)
Sets the network timeout used for retrieving URLs.

Parameters:
timeoutSecs - timeout in seconds

getTimeout

public int getTimeout()
Returns the network timeout used for retrieving URLs.

Returns:
timeout in seconds
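
The timeout here is expressed in seconds, whereas java.net.URLConnection takes milliseconds. The sketch below shows one way the conversion might be applied. It assumes the timeout covers both connecting and reading, which this documentation does not state; TimeoutSketch and applyTimeout are hypothetical names.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLConnection;

/** Hypothetical illustration; not part of uk.ac.starlink.xdoc. */
public class TimeoutSketch {

    /** Applies a timeout given in seconds to a not-yet-connected connection. */
    public static void applyTimeout(URLConnection conn, int timeoutSecs) {
        // URLConnection expects milliseconds; multiplyExact guards against overflow.
        int millis = Math.multiplyExact(timeoutSecs, 1000);
        conn.setConnectTimeout(millis);  // assumption: covers connection setup
        conn.setReadTimeout(millis);     // assumption: also covers reads
    }

    public static void main(String[] args) throws IOException {
        // openConnection() only creates the object; no network traffic happens here.
        URLConnection conn = URI.create("http://example.org/").toURL().openConnection();
        applyTimeout(conn, 5);
        System.out.println(conn.getConnectTimeout() + " ms");
    }
}
```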

getLocalFailures

public int getLocalFailures()
Returns the total number of local link resolution failures this checker has come across. Local ones are those which correspond to hrefs representing relative URLs or file-type URLs.

Returns:
total number of bad local links

getExternalFailures

public int getExternalFailures()
Returns the total number of external link resolution failures this checker has come across. External links are all links that are not local, for example http URLs.

Returns:
total number of bad non-local links
See Also:
getLocalFailures()
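
The local/external distinction described for these two counters can be sketched as a simple classification of hrefs: relative URLs and file-type URLs are local, everything else is external. This is an illustration under stated assumptions, not the classifier LinkChecker actually uses; LinkKindSketch and isLocal are hypothetical names.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical illustration; not part of uk.ac.starlink.xdoc. */
public class LinkKindSketch {

    /** Returns true for relative hrefs and file: URLs, false otherwise. */
    public static boolean isLocal(String href) {
        try {
            String scheme = new URI(href).getScheme();
            // No scheme means a relative reference, e.g. "sect.html" or "#intro".
            return scheme == null || scheme.equalsIgnoreCase("file");
        } catch (URISyntaxException e) {
            // Assumption: an href that cannot be parsed has no usable scheme,
            // so this sketch counts it as local.
            return true;
        }
    }

    public static void main(String[] args) {
        String[] hrefs = { "#intro", "sect.html", "file:/tmp/doc.html", "http://example.org/" };
        for (String href : hrefs) {
            System.out.println(href + " -> " + (isLocal(href) ? "local" : "external"));
        }
    }
}
```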

logMessage

protected void logMessage(String msg)
Interface through which short messages about progress can be logged.

Parameters:
msg - message to log

followRedirectsWithTimeout

public URLConnection followRedirectsWithTimeout(URLConnection conn)
                                         throws IOException
Takes a URLConnection and repeatedly follows 303 redirects until a non-303 status is achieved. Redirect loops are guarded against.

Parameters:
conn - initial URL connection
Returns:
target URL connection (if no redirects, the same as conn)
Throws:
IOException
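
The general technique of following 303 redirects by hand, with a guard against endless loops, can be sketched as follows. This is not the method's actual implementation: the hop limit MAX_HOPS, the class RedirectSketch, and the explicit timeoutMillis argument are assumptions made for illustration, and the real method may guard against loops differently.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URLConnection;

/** Hypothetical illustration; not part of uk.ac.starlink.xdoc. */
public class RedirectSketch {

    /** Hypothetical cap on redirect hops; the real limit is not documented. */
    static final int MAX_HOPS = 10;

    /** Follows 303 responses until a different status is seen or the cap is hit. */
    public static URLConnection follow303(URLConnection conn, int timeoutMillis)
            throws IOException {
        for (int hop = 0; hop < MAX_HOPS; hop++) {
            if (!(conn instanceof HttpURLConnection)) {
                return conn;  // non-HTTP connections carry no status code
            }
            HttpURLConnection hconn = (HttpURLConnection) conn;
            hconn.setInstanceFollowRedirects(false);  // redirects are handled below
            hconn.setConnectTimeout(timeoutMillis);
            hconn.setReadTimeout(timeoutMillis);
            if (hconn.getResponseCode() != HttpURLConnection.HTTP_SEE_OTHER) {
                return hconn;
            }
            String location = hconn.getHeaderField("Location");
            if (location == null) {
                throw new IOException("303 response without Location header");
            }
            URI target;
            try {
                // The Location value may be relative to the current URL.
                target = URI.create(hconn.getURL().toString()).resolve(location);
            } catch (IllegalArgumentException e) {
                throw new IOException("Bad redirect target: " + location, e);
            }
            hconn.disconnect();
            conn = target.toURL().openConnection();
        }
        throw new IOException("Gave up after " + MAX_HOPS + " redirects");
    }
}
```

A fixed hop limit is the simplest loop guard; remembering visited URLs would detect cycles sooner but costs a little more bookkeeping.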

main

public static void main(String[] args)
                 throws MalformedURLException,
                        TransformerException
Checks the links of the result of a given transformation to XHTML (or an HTML-like result). For each link that fails to resolve in the transformation result, a short warning message is output during processing, and a summary of any bad links is output at the end. The process exits with error status 1 if any local link fails to resolve. If the only bad links are non-local ones (hrefs that don't start with "#" or "file:"), warnings are still logged but the exit status is zero.

Usage: LinkChecker stylesheet xmldoc

Parameters:
args - arguments
Throws:
MalformedURLException
TransformerException
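
The exit status rule described for main can be written down as a small function of the two failure counts. This is a sketch of the documented policy only; ExitStatusSketch and exitStatus are hypothetical names, and the counts stand in for the values returned by getLocalFailures() and getExternalFailures().

```java
/** Hypothetical illustration; not part of uk.ac.starlink.xdoc. */
public class ExitStatusSketch {

    /**
     * Returns 1 if any local link failed, otherwise 0.
     * External failures are only warned about and never change the status.
     */
    public static int exitStatus(int localFailures, int externalFailures) {
        return localFailures > 0 ? 1 : 0;
    }

    public static void main(String[] args) {
        // Made-up counts for illustration.
        int local = 0;
        int external = 3;
        if (external > 0) {
            System.err.println("Warning: " + external + " bad external links");
        }
        System.out.println("exit status: " + exitStatus(local, external));
    }
}
```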


Copyright © 2015 Central Laboratory of the Research Councils. All Rights Reserved.