Network Downloads : LinkExaminer Documentation /// AnalogX

Introduction 
------------ 
I imagine most people who have downloaded LinkExaminer have a pretty good
idea of what it is and what it does, but on the off chance that you don't:
simply put, it scans a website and returns a bunch of information about the
pages on the site to help whoever runs it.  How does it help them out?  First,
it shows you some of the more obvious things, such as links that are broken,
both internally on your site and externally.  LinkExaminer also gives insight
into SEO properties such as the existence of page titles, keywords, etc., and
can generate completely custom reports based on the scans performed.


The Basics of Use 
----------------- 
Using LinkExaminer is pretty straightforward - launch the application, under 
the menu Scan choose 'Set URL', enter in your main URL, click 'Start' and 
you're off and running!  The amount of time it takes to scan a site can vary 
widely based on the options that are selected, so keep that in mind.  It's 
also possible to abort a scan, tweak the configuration, and then resume the 
scan from where you stopped - but keep in mind that if you do this any changes 
made won't affect any pages that were previously scanned. 

When entering your URL into LinkExaminer it's usually a good idea to be
as explicit as possible - the program will try to make its best guess for 
anything missing, but if you tell it, then it doesn't need to guess. 


What does it scan? 
------------------ 
The current version can scan any type of HTTP and HTTPS content, so basically 
anything that a webserver can transmit.  The parsing engine is able to fully 
understand HTML and should do an intelligent job with anything you throw at 
it.  CSS is not quite as fully understood by the parser but it shouldn't have 
any issues extracting links - that being said, it won't be able to easily tell 
whether or not particular images are used on a page (so it will check them 
all).  JavaScript is barely supported - enough to extract any simple links 
that might be present, but not sophisticated enough to be able to interpret 
the code in any meaningful way.  The code for handling SSL is provided by the
excellent OpenSSL library; all other code was internally developed.

The Main Display 
---------------- 
On the main display (listview) you see the results of the scan as it goes. The 
background of each line also indicates the current processing state - red
indicates an error, green means everything is fine, and the shades of grey
(dark to light) mean the link is not yet processed, retrieved, or parsed and
inserted, in that order.  A right-click context menu is also available to
give additional
information about selected rows; more detail about what each function does 
is covered later in the documentation. 

It's worth noting that while the report and sitemaps will not be affected by 
anything removed or sorted on the main listview, the CSV and text file exports 
will save data as it's seen in the main display.  This also is true of the 
view filters if they are currently in use. 


Columns Explained 
----------------- 
Hopefully the columns themselves are relatively self-explanatory, but for the
sake of clarity here's what I think they are: :) 

    URL 
        The full URL that this row is referring to 

    HTTP Code 
        This is the HTTP code returned by the server.  There are also error 
        codes with parens () around them, that are greater than 900 - these are 
        internal error codes generated by the harvester (not the server). 

    HTTP Message 
        This is simply the human-readable version of the HTTP code. 

    Internal 
        This indicates whether or not the link checker considers the link 
        internal or external to the site being scanned.  What constitutes 
        internal links can be tweaked in the configuration. 

    Robots.txt 
        This indicates whether or not a URL matches the criteria of a robots.txt 
        file.  Depending on the configuration settings, these links may or may 
        not be examined. 

    NoFollow 
        This indicates whether or not a link contains the NoFollow attribute - 
        this tells the search engines not to index or assign any weight to the 
        link. 

    Dynamic 
        This indicates whether or not a link (or page) looks to be dynamically 
        generated.  A scripted page that returns static content may also be
        considered dynamic.

    Relative 
        This indicates whether or not a link was relative (just the path, or
        a location relative to where the user is on the site) instead of
        explicit (including the full domain name).

    SEO 
        This shows whether or not some basic SEO components are missing from 
        a page.  Currently it checks to make sure that a page title, meta 
        keywords and meta description are all present - if they are not, then 
        it displays which are missing. 

    Title 
        This is either the text of the first link going into the page, or once 
        a page is harvested it becomes the title text from the page (if 
        available). 

    Depth 
        This is the minimum number of links that need to be clicked on to get 
        to this particular page. 

    In 
        This is the number of inbound links pointing to this page. 

    Out 
        This is the number of outbound links on the page.  Depending on config 
        settings this may include duplicates. 

    Content Type 
        This displays the Content-Type header returned by the webserver. 

    Size 
        The size (in K) of the link's contents.

    Last Modified 
        The last date the file was marked as modified by the server. 

    Link Type 
        These are what type of links are pointing into this page.  For example, 
        if it was an image file, the link type might be CSS and IMG. 

    Duration 
        This is the time (in fractional seconds) it took to get the page. 

    Similarity 
        This shows how similar this page is to other pages, if similarity is 
        turned on. 

As with most listviews, you can move around the column headers as well as sort 
a particular column by clicking on it.  These changes will be saved if you have 
'Save column info' turned on in the configurations. 
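
To make the Depth column a bit more concrete, here is a minimal sketch of how
a "minimum number of clicks" value can be computed with a breadth-first walk.
This is only an illustration, not LinkExaminer's actual code; the function
name link_depths and the three-page site are made up for the example.

```python
from collections import deque

def link_depths(links, start):
    """Return the minimum number of clicks from start to each reachable page."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:      # first visit is the shortest route
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

# Hypothetical site: each key lists the pages it links to.
site = {
    "/": ["/about", "/blog"],
    "/blog": ["/blog/post1", "/about"],
    "/blog/post1": ["/"],
}
depths = link_depths(site, "/")
```

Because pages are visited in order of distance from the start URL, the first
time a page is seen is also the shortest route to it.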


The Status Bar 
-------------- 
Across the bottom of the main dialog is the status bar, which gives you 
some up-to-the-second statistics of what's going on.  The first section tells
you what the application is currently doing, and the second is the URL that's
currently set as active.  The third is how many links per second (lps) are
being scanned (on average), followed by a listing of exactly how many
pages are new, how many remain to be harvested, and the total count.  The
final section is how much memory the application is currently using - this 
can be helpful in giving you a heads up when you're scanning a larger site 
or are caching the harvested pages. 


The Exporters 
------------- 
The following exporters get their data from what is currently being displayed 
in the main dialog.  This means that if you have the view set to show only errors,
then they will only export errors.  Also, if you have some kind of sorting 
being applied, that will also be reflected in the exported data. 

    Text File (URLs) 
        This is pretty simple, it's just the URL, each one on its own line. 

    CSV file 
        Suitable for importing into other applications, this contains almost 
        everything displayed in the main dialog. 

The next exporters get their data directly from the internal database maintained 
by LinkExaminer, so they are not influenced by most things you'd do in the 
dialog.  The only notable exception is that if you remove a page (via the
right-click menu), it's removed from these reports as well.  All of these
exports are generated from user-editable templates, so it's possible for you to make
your own; if you're interested in that, check out the Custom Templates section. 

    Sitemap (XML) 
        This is a template-driven sitemap generator; the default template is 
        for the type of sitemaps used by the search engines like Google, etc. 

    Report (HTML) 
        This is a template-driven HTML report that can be highly customized 
        to include what you want.  The default report just contains the basic 
        items that most other link checkers report. 


View Source 
----------- 
This right-click option is only available when Caching is enabled under the 
Parser configuration.  With this, you can view the raw HTML source that was 
downloaded from the site. 


View Link Details 
----------------- 
This is probably the option you'll be using the most - it gives you a view of 
how links flow into and out of a page.  In the top section it shows all of the 
inbound pages that link in, as well as what type of link it was that they 
referenced the page with, and at what depth in the website hierarchy they are. 
The bottom pane contains all of the outbound links from the page, with similar 
additional information to the inbound links.  You can also right-click on any 
link to open it in a browser or copy the URL to the clipboard. 


View Parser HTML 
---------------- 
The parser HTML view lets you see how the HTML parsing engine views the page - 
if for some reason it doesn't seem to be able to find a link, then looking at 
this is a good way to get a sense for what might be going wrong.  This is the 
actual parser output (as opposed to the page source), so it will automatically 
format it as well as break up tags from content. 

The first column is what line in the source code that the corresponding HTML 
came from - this can be helpful if you spot something wrong, to get to the 
exact place in your code.  The second column is the tag depth, which is 
similar to the link depth, except that this looks at tag opens and closes for
this parsed page only.  You can use this to spot whether or not your page may 
be missing a closing tag, etc - if everything is fine you should start at 
depth 0 and when you scroll to the bottom it should end at depth 0 - if it's 
anything else, then there are either too many closes or not enough to map to 
the actual HTML tags.  The third column is how the parser interprets the chunk 
of code: whether it's a tag, if it's a close, if it's a single (a tag which 
either doesn't expect a close or has one specified at the end of the tag), or 
if it's text.  The final column is the parsed HTML output back into a sort of 
formatted way, so it looks like what you'd expect to see. 

As with View Source, this option is only available if you have Caching turned 
on in the parser configuration. 
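
The tag depth idea can be sketched with Python's standard html.parser module.
This is an illustration of the concept only - the real parser is internally
developed and will differ - and the DepthTracker class, the VOID_TAGS set and
the final_depth helper are names invented for this example.

```python
from html.parser import HTMLParser

# Tags that never expect a close (the "single" kind described above).
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}

class DepthTracker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.depth = 0
        self.rows = []                      # (source line, depth, kind, tag)

    def handle_starttag(self, tag, attrs):
        line = self.getpos()[0]
        if tag in VOID_TAGS:
            self.rows.append((line, self.depth, "single", tag))
        else:
            self.rows.append((line, self.depth, "tag", tag))
            self.depth += 1                 # an open goes one level deeper

    def handle_startendtag(self, tag, attrs):
        # Self-closed tags such as <br/> do not change the depth.
        self.rows.append((self.getpos()[0], self.depth, "single", tag))

    def handle_endtag(self, tag):
        self.depth -= 1                     # a close comes back up one level
        self.rows.append((self.getpos()[0], self.depth, "close", tag))

def final_depth(html):
    """Return the depth after the last tag; 0 means opens and closes balance."""
    tracker = DepthTracker()
    tracker.feed(html)
    tracker.close()
    return tracker.depth
```

A result above 0 suggests a missing close somewhere, and a result below 0
suggests one close too many.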


View Parser Content 
------------------- 
This viewer lets you get a sense of how a search engine sees your page - it's 
only the extracted content, removed from the HTML.  Some of this may not be 
quite as readable as the HTML would be, because it also includes the
alt text from links, images, etc.
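
As a rough sketch of what "extracted content" means, the following pulls the
visible text and any alt text out of a page using Python's standard
html.parser.  It is not the program's parser; the ContentExtractor class and
extract_content function are hypothetical, and skipping script and style
blocks is an assumption made for the example.

```python
from html.parser import HTMLParser

class ContentExtractor(HTMLParser):
    SKIP = {"script", "style"}              # assumption: never treated as content

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        alt = dict(attrs).get("alt")
        if alt:                             # alt text counts as content too
            self.chunks.append(alt)

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)

def extract_content(html):
    """Return the text content of a page with the markup removed."""
    parser = ContentExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```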


Configuration: General 
---------------------- 
    Save column info 
        With this turned on, any changes made to the columns in the main 
        display (such as order or size) will be saved. 

    Auto scan after URL entry 
        When you enter a URL, the scanner will automatically start. 

    Clear data before scan 
        When a scan is started, all the previous data is removed. 

    Colorize Listview 
        This toggles the background coloring on the main display. 

    Display after processing 
        With this turned on, links will only show up on the main display after 
        they have been harvested, instead of as they are discovered. 

    Refresh 
        This changes the interval at which the main display is updated.  If
        you have a particularly large site (more than a million pages), then
        you might want to increase this, but in most cases 1 second is fine.

    Double-click action 
        With this you can customize what happens on the main listview when you 
        double-click on a row. 


Configuration: Exclude 
---------------------- 
The exclusion rules are a powerful aspect of the scanning engine.  If you would 
like to exclude certain files or directories from the search, then you can do 
so from here.  Multiple rules can be specified; each rule should be on a new
line.  A test URL section is provided below the rules, and this will let you 
see whether or not a particular URL would get excluded as well as which rule 
(or rules) in particular it matched. 

The exclusion engine itself is more like a mini search engine, and it has some
understanding of URL structure built in.  For instance, let's say we had the URL:

    http://www.testurl.com/gallery/favorites/thumbnails.php 

If you added the rule: 

    favorites 

Then it would exclude this URL - but, if you just had the word "favorite"
without the s at the end, then it would NOT match.  This is because it 
understands certain things like slashes (/), underscores (_), etc are 
indications of word boundaries.  Now, let's say that you wanted it to match 
either case, such as favorite or favorites; there are three ways you could 
do this: 

    Two rules: 
                            favorite 
                            favorites 
    Character Wildcard: 
                            favorite? 
    Wildcard: 
                            favorite* 

The first way is obvious, just explicitly list the two words that you want to 
match.  The second example uses a character wildcard, the question mark (?); 
this will consider any character, or the absence of a character, as a match. 
The character wildcard doesn't need to appear at the end, it can appear at the 
beginning or in the middle, and you can use as many as you'd like.  The final
method is to use the normal wildcard (*); this says that as long as it matched
the first part, then anything after that it doesn't care about.  So, if it ran
into the word "favoritely" (yes, I know it's not a word), then it would NOT 
match with the character wildcard match, but it WOULD match with the normal 
wildcard.  The wildcard (*) can be used either at the start or the end of the 
search term, and you could put one on either (or both) ends of the word, but 
never in the middle. 
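
The word and wildcard behaviour described above can be approximated in a few
lines of Python.  This is a sketch under stated assumptions, not the real
exclusion engine: it treats every non-alphanumeric character as a word
boundary, and the helper names url_words, term_to_regex and term_matches are
invented for the example.

```python
import re

def url_words(url):
    """Split a URL into words; any non-alphanumeric character is a boundary."""
    return [w for w in re.split(r"[^A-Za-z0-9]+", url) if w]

def term_to_regex(term):
    """Translate one rule term: '?' is zero or one character, and '*' is
    only honoured at the start or the end of the term."""
    lead = term.startswith("*")
    trail = term.endswith("*") and len(term) > 1
    core = term.strip("*")
    body = "".join(".?" if ch == "?" else re.escape(ch) for ch in core)
    return re.compile((".*" if lead else "") + body + (".*" if trail else ""))

def term_matches(term, url):
    """True if the term matches at least one whole word of the URL."""
    pattern = term_to_regex(term)
    return any(pattern.fullmatch(word) for word in url_words(url))

url = "http://www.testurl.com/gallery/favorites/thumbnails.php"
results = {t: term_matches(t, url)
           for t in ("favorites", "favorite", "favorite?", "favorite*")}
```

Here only the plain term "favorite" fails to match, because each term is
compared against a whole word rather than a substring.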

Now, let's say that you want to exclude everything in the favorites directory 
EXCEPT the thumbnails; you could do that with the following: 

    favorites -thumbnails 

You can use the plus (+) or minus (-) sign to indicate if something must exist 
or must not exist in the match.  So, in this instance it sees that favorites 
exists, but then it also sees that thumbnails exists - but thumbnails has the 
minus (must not) at the start, so the rule doesn't match.  So, you might be 
wondering if this means that "favorites" is the same as "+favorites", right? 
The answer is: sort of.  :)  In this case, there's no difference between the 
plus (must) version and nothing at all, but let's say you used the following 
rule: 

    favorite favorites -thumbnails 

This is effectively saying that it can match EITHER favorite or favorites and 
must not match thumbnails.  So if there are multiple things that you might 
want to match against, then you can just add them in the same rule.
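
Here is a small sketch of how a rule with plain, plus (+) and minus (-) terms
could be evaluated.  To keep it short it only does exact word matches (no
wildcards), and it assumes that a rule with no plain terms matches as long as
its + and - conditions hold; the names url_words and rule_matches are made up
for illustration and this is not the actual engine.

```python
import re

def url_words(url):
    """Split a URL into a set of words on non-alphanumeric characters."""
    return {w for w in re.split(r"[^A-Za-z0-9]+", url) if w}

def rule_matches(rule, url):
    """Evaluate one exclusion rule against a URL."""
    words = url_words(url)
    optional, required, forbidden = [], [], []
    for term in rule.split():
        if term.startswith("-"):
            forbidden.append(term[1:])      # must NOT exist
        elif term.startswith("+"):
            required.append(term[1:])       # must exist
        else:
            optional.append(term)           # any one of these is enough
    if not (optional or required or forbidden):
        return False                        # an empty rule matches nothing
    if any(term in words for term in forbidden):
        return False
    if not all(term in words for term in required):
        return False
    return any(term in words for term in optional) if optional else True
```

With the example URL from earlier, "favorites -thumbnails" does not match
(so the thumbnails page is kept), while a URL such as
http://www.testurl.com/gallery/favorites/full.php would match and be
excluded.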

There are a couple of other rules, but they're a bit less likely to be used. 
The first is the number wildcard (#); this functions just like the character
wildcard, except that it only matches numbers.  You can use a caret (^) if you
need to match only against the start or the end of the URL.  For example, if
you wanted to exclude any GIFs, you could use the rule:

    *.gif^ 

You could also leave off the "*." and it would match as well since it considers 
periods as word breaks.  Finally, you can use quotes to encapsulate phrases, 
such as: 

    "happy cat" 

Otherwise it would see that as two words, happy OR cat.  Ultimately what you 
do with these is really limited only by your imagination. 


Configuration: Harvester 
------------------------ 
    Obey robots.txt 
        A website can include a file which limits what automated systems (such 
        as this) go to.  With this turned on, LinkExaminer will skip whatever 
        a search engine would skip. 

    Obey NoFollow 
        It's possible in HTML to tag a link as nofollow.  Search engines
        still follow the link; they just don't assign any weight or track the
        fact that you link to it.  With this checked, the harvester won't
        follow these links at all.

    Check external links 
        With this checked, it will check that external links are valid, but it 
        will not add any links found on the page to the scan. 

    Allow parent walk 
        If you start on a subdirectory, such as 
                http://localhost/blog/index.htm 
        The engine will not go any higher or into any link that doesn't start 
        with '/blog'.  With this turned on, it will walk back to the root and 
        include everything. 

    Consider URL case sensitive 
        In most cases you won't want this turned on, but if you do, then it 
        will consider index.htm to be different than index.HTM, and will check 
        both links. 

    Subdomains internal 
        With this turned on, subdomains will be considered internal; so, if 
        you were scanning localhost, it would also consider image.localhost 
        or files.localhost to be internal. 

    Accept cookies 
        With this turned on, cookies are accepted and persist throughout the 
        scan.  Cookies are specific to each thread and not shared. 

    Max depth 
        This is the maximum depth that will be walked; if you want it to go
        as deep as possible, it should be set to all 9's. 

    Max redirects 
        When a redirection happens, and it points to another redirection, 
        this is considered "Redirection Depth", and this value limits how many 
        redirection to redirection links will be followed.  In general you 
        don't want this too large to avoid redirection loops. 

    Max links 
        You can limit the maximum number of links that will be scanned; once 
        the number is met the scan will stop. 

    Threads 
        This is the number of concurrent threads that will be used to retrieve 
        links.  In most cases, the default of 10 is fine, but if you have CPU 
        and bandwidth to spare, you can increase this to shorten your scan 
        times. 

    Retry count 
        This is how many times it will retry when it is unable to connect to
        a page before going on to the next link.

    Retry timeout 
        How long (in seconds) it will wait before it considers a request to 
        have failed. 

    Max filesize (k) 
        This is the maximum amount of data (in kilobytes) it will download. 

    User Agent 
        This is the user agent that is reported by the harvester while it scans 
        pages.  It's recommended that this remains the default. 
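
To illustrate how 'Subdomains internal' and 'Allow parent walk' interact, here
is a rough sketch of the two decisions using Python's urllib.parse.  It is a
guess at the logic based on the descriptions above, not the harvester's real
code; is_internal and in_scan_scope are hypothetical names.

```python
import posixpath
from urllib.parse import urlsplit

def is_internal(start_url, link_url, subdomains_internal=False):
    """Same host is internal; subdomains count only when the option is on."""
    start_host = urlsplit(start_url).hostname or ""
    link_host = urlsplit(link_url).hostname or ""
    if link_host == start_host:
        return True
    return subdomains_internal and link_host.endswith("." + start_host)

def in_scan_scope(start_url, link_url, allow_parent_walk=False,
                  subdomains_internal=False):
    """Decide whether a discovered link should be added to the scan."""
    if not is_internal(start_url, link_url, subdomains_internal):
        return False
    if allow_parent_walk:
        return True                          # walk back to the root as well
    # Without parent walk, the link path must start with the directory of
    # the start URL (for /blog/index.htm that prefix is '/blog').
    prefix = posixpath.dirname(urlsplit(start_url).path)
    return urlsplit(link_url).path.startswith(prefix)
```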


Configuration: Parser 
--------------------- 
    Cache pages 
        With this turned on, a copy of all HTML pages downloaded will be saved 
        in memory.  This enables some of the advanced features that rely on 
        content analysis. 

    Count duplicate links 
        With this turned on, if a page links to the same URL twice, it will 
        consider this to be two 'hits' for that page; if it was turned off, it would 
        only be counted once. 

    Walk forms 
        With this enabled, it will attempt to walk any form fields it finds. 

    Spaces in URL as '+' 
        Some webservers want to see spaces as '+' instead of %20; if this is
        the case (it normally won't be), then check this. 

    Skip graphic elements 
        With this turned on, anything linked through a graphic tag, such as 
        IMG, will not be added to the scan list. 

    Ignore anchors 
        With this turned on, it will not check to see whether or not anchors
        exist on a page.  Turning this off can radically increase your link
        count if your site uses anchors extensively.

    Identify duplicate pages 
        With this turned on, a SHA-1 hash is calculated for each page (the
        raw HTML) and is used to find identical content.

    Last-Modified <5m is dynamic 
        This helps the harvester to identify dynamic content - if the Date and 
        the Last-Modified date reported by the server are within 5 minutes of 
        each other, then it's considered dynamic. 

    Find similar pages 
        This requires caching to be turned on and increases memory usage
        considerably.  Once the scan completes, it will compare the
        contents of all the HTML pages and report on the maximum similarity 
        encountered as well as how many pages were close to the same. 

    No Last-Modified is dynamic 
        If a webserver responds without a Last-Modified date, the page will be 
        considered dynamic. 

    Similar spread 
        This is the percentage tolerance in matches necessary to be considered 
        a hit in the similarity analysis. 
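
The two Last-Modified options above boil down to a simple header comparison.
The sketch below shows one way to express it in Python; it is an
approximation for illustration only, and the function name looks_dynamic and
its two flag parameters are invented to mirror the checkboxes.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime

def looks_dynamic(headers, recent_is_dynamic=True, missing_is_dynamic=True):
    """Guess whether a response is dynamic from its Date/Last-Modified headers."""
    last_modified = headers.get("Last-Modified")
    if last_modified is None:
        return missing_is_dynamic            # 'No Last-Modified is dynamic'
    date = headers.get("Date")
    if date is None or not recent_is_dynamic:
        return False
    # 'Last-Modified <5m is dynamic': compare the two server timestamps.
    gap = abs(parsedate_to_datetime(date) - parsedate_to_datetime(last_modified))
    return gap < timedelta(minutes=5)
```

A page stamped as modified two minutes before the response Date would be
flagged as dynamic, while one modified the day before would not.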


Configuration: Export 
--------------------- 
    Auto save report when done 
        With this turned on, when a scan has been completed it will automatically 
        save a copy (using the auto save filename). 

    Auto open report 
        With this enabled, when a report is generated it will automatically open 
        it in the browser. 

    Add date/time to filename 
        This automatically adds the current date and time to the filename, so 
        reports won't be overwritten. 

    Sitemap template 
        This specifies what sitemap template to use when generating sitemaps. 

    Report template 
        This specifies what report template to use when generating reports. 


Custom Templates 
---------------- 
Some of the exporters can be customized using a template/scripting system.  The 
template system itself was originally developed with HTML in mind, so it looks
similar to it (especially when generating HTML), which also makes it well suited
for XML (although it can really be used for any text-based format).

The templating system (in use) is actually very simple - there is one section 
of the page specified for layout and the rest are scripted sections.  The 
layout section is the first called, and it (unsurprisingly) controls how the 
page is laid out.  Each template section is contained inside of an HTML 
comment, and starts with BEGINPAGE: followed by the name of the section. 
Here's an example of the layout section: 

<!-- BEGINPAGE:Layout --> 

Another common type of section is a loop section - these are sections that 
are going to be looped for each member of a list.  They typically start out 
with a Header section, followed by a Member and close up with a Footer.  When 
this is actually generated, it will output the Header once, the Member as 
many times as there are members in the list, and the Footer when it's done.  If
you would like to alternate the format (to alternate the background color, etc) 
then you can have multiple Member sections followed by a number, for example: 

<!-- BEGINPAGE:PageListHeader --> 
<!-- BEGINPAGE:PageListMember1 --> 
<!-- BEGINPAGE:PageListMember2 --> 
<!-- BEGINPAGE:PageListFooter --> 

In this case it would do Header, Member1, Member2, Member1, ...etc..., Footer. 
If you don't want multiple Member sections, then just leave the number off. 
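
The Header / Member / Footer flow can be pictured with a short Python sketch.
This only mimics the ordering described above - it is not the template engine
itself - and the render_loop function and the sample table rows are made up
for the example.

```python
from itertools import cycle

def render_loop(header, members, footer, urls):
    """Emit Header once, cycle through the Member sections for each item,
    then emit Footer once."""
    out = [header]
    for template, url in zip(cycle(members), urls):
        out.append(template.replace("{Page.URL}", url))
    out.append(footer)
    return "\n".join(out)

# Two Member sections give alternating row styles.
html = render_loop(
    "<table>",
    ['<tr class="odd"><td>{Page.URL}</td></tr>',
     '<tr class="even"><td>{Page.URL}</td></tr>'],
    "</table>",
    ["/a", "/b", "/c"],
)
```

With three pages and two Member sections, the rows come out as Member1,
Member2, Member1, wrapped in a single Header and Footer.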


Advanced Scripting 
------------------ 
The HTML Report and Sitemap Generator both have their own template-driven 
scripting system - this means that it's possible to make custom reports with 
a focus on whatever you're interested in.  While it is a simple system, it 
does allow you a large amount of flexibility. 

The system level variables are the most basic of any of the variables; they
just print out some high-level info about the system itself: 

    {System.CurrentTime} 
        This displays the current time.  Time is handled specially in the 
        scripting system, so it's possible to format it in a variety of ways. 

    {System.Version} 
        This displays the program's name and version information 

When it comes to actually generating some output, you use the {Display.*} 
functions - these run through all the members in special ways based on the 
settings you've chosen. 

    {Display.OverallSummary} 
        This displays the overall summary information 

    {Display.PageList} 
        This iterates through the list of pages 

    {Display.SiteMap} 
        This generates a sitemap 

    {Display.LinkInList} 
        This iterates through any pages linking into the current page 

    {Display.LinkOutList} 
        This iterates through any pages linked from the current page 

    {Display.HTTPCodeSummary} 
        This prints a summary of the HTTP codes 

    {Display.ContentTypeSummary} 
        This prints a summary of the content types 

When calling these, it's also possible to change which page template they will 
use when outputting information - in this way it's easy to customize the way 
things are output.  When the page template is overridden, it will always look 
for 4 sections: Empty, Header, Member and Footer.  So, if you were using 
{Display.PageList} the default sections would be PageListEmpty, PageListHeader, 
etc - now if you call {Display.PageList-Example} instead, it will use the page 
templates ExampleEmpty, ExampleHeader, etc.  Now just tweak those for their 
specific application - pretty cool, eh? 
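
The naming rule for overridden page templates can be written down as a tiny
Python helper.  This is just a restatement of the paragraph above in code,
with the assumption that the text after the dash simply replaces the default
prefix; template_sections is a hypothetical name.

```python
SECTION_KINDS = ("Empty", "Header", "Member", "Footer")

def template_sections(call):
    """Return the four section names a {Display.*} call would look for."""
    inner = call.strip("{}")                     # 'Display.PageList-Example'
    name = inner.split(".", 1)[1]                # 'PageList-Example'
    display, _, override = name.partition("-")
    prefix = override or display                 # override wins when present
    return [prefix + kind for kind in SECTION_KINDS]
```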

Also, if there isn't a predefined Display (such as SiteMap, PageList, etc), it 
will pass it on as a template section, so you could call just one of your own 
sections.  You can see this in the 'report-template.htm', at the end of layout 
you'll notice it doesn't include the body and html close, but instead calls 
{Display.Footer}.  This then looks for the template section Footer, which is 
at the bottom of the document - this is done so that the entire page is 
contained in the HTML document and looks (somewhat) correct in the browser 
even when it hasn't been generated. 

Options are special in that they don't actually print anything, they're used 
to change the way a display request is processed: 

    {Option.Reset} 
        Reset all the options 

    {Option.SetPageListFilter} 
        Probably the most powerful and most frequently used, this lets you 
        filter which pages are returned - so anything put as a parameter to 
        this WILL NOT be displayed.  Most of the filter types have positive 
        and negative modes, and you can use multiple ones in a filter by 
        using commas between each term.  For example: 

            {Option.SetPageListFilter-NoError} 

        Will skip any pages that don't have an error - so effectively returning 
        an error list.  Now, let's say you only wanted to show internal errors - 
        just change it to this: 

            {Option.SetPageListFilter-NoError,External} 

        See?  Now we're filtering out pages without errors, and that are 
        external.  Here's a list of all the filters: 

                Redirect 
                NoRedirect 
                Error 
                NoError 
                Internal 
                External 
                RobotBlocked 
                NoRobotBlocked 
                NoFollow 
                NoNoFollow 
                Relative 
                Literal 
                Dynamic 
                Static 



    {Option.SetPageListContentTypeAllow} 
        This will limit the returns only to the content type specified, so this 
        works differently than the filter.  For example: 

            {Option.SetPageListContentTypeAllow-text/html} 

        Will only return html files.  As with the filter, multiple content 
        types can be specified using the comma. 

    {Option.SetPageListBiggerThan} 
        Pretty straightforward, only pages bigger than the number (specified
        in kilobytes) will be returned

    {Option.SetPageListSmallerThan} 
        Have a guess?  Only pages smaller than the number are returned 

    {Option.SetPageListSlowerThan} 
        As with the above options, this is related to the transfer time, and 
        is in fractional seconds - so to only return pages that took more than 
        1.5 seconds, you would do: 

            {Option.SetPageListSlowerThan-1.5} 
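
Since the page list filter removes whatever it names, it can be easy to get
backwards.  The following Python sketch shows the idea for four of the filter
names; it is an illustration only, the page dictionaries are invented, and
treating any HTTP code of 400 or above as an error is an assumption.

```python
# Each filter name maps to a test; a page matching ANY listed filter is dropped.
FILTERS = {
    "Error":    lambda page: page["http_code"] >= 400,   # assumed definition
    "NoError":  lambda page: page["http_code"] < 400,
    "Internal": lambda page: page["internal"],
    "External": lambda page: not page["internal"],
}

def apply_page_list_filter(pages, filter_spec):
    """Return the pages that survive a comma-separated filter list."""
    terms = [term for term in filter_spec.split(",") if term]
    return [p for p in pages if not any(FILTERS[t](p) for t in terms)]

pages = [
    {"url": "/ok",              "http_code": 200, "internal": True},
    {"url": "/missing",         "http_code": 404, "internal": True},
    {"url": "http://other/bad", "http_code": 404, "internal": False},
    {"url": "http://other/ok",  "http_code": 200, "internal": False},
]
internal_errors = apply_page_list_filter(pages, "NoError,External")
```

Filtering out NoError and External leaves only the internal pages that
returned an error.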

When running any of the Summary-type subqueries, they will return their 
contents into the following variables: 

    {Summary.Value} 
        If there is a value associated with this summary 

    {Summary.String} 
        If there is a string associated with the summary 

    {Summary.Count} 
        The number of matches to this particular summary row 

    {Summary.Percent} 
        The percentage of this row (basically count/total) 

    {Summary.Total} 
        The total number of entries related to this summary 

    {Summary.SizeAverage} 
        The average filesize (if appropriate) 

    {Summary.SizeTotal} 
        The total filesize (if appropriate) 

It's important to point out that not all the summary variables will be filled in 
with valid info - it depends on which summary is being generated.  It won't hurt 
anything to try using one - it just won't contain any information. 
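
As a rough picture of what a summary row holds, here is a Python sketch that
builds HTTP code summary rows with Value, Count, Percent and Total fields.
It is not how the program computes them internally; the function name
http_code_summary is made up, and Percent is assumed to be on a 0-100 scale.

```python
from collections import Counter

def http_code_summary(codes):
    """Build one summary row per distinct HTTP code."""
    counts = Counter(codes)
    total = len(codes)
    return [
        {"Value": code, "Count": count,
         "Percent": 100.0 * count / total,    # basically count/total
         "Total": total}
        for code, count in sorted(counts.items())
    ]

rows = http_code_summary([200, 200, 200, 404])
```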

Now, when iterating through relationships, each pass will load the contents of 
the page referenced - so if you're going through LinkInList, all of the {Page.*} 
variables will be related to the LinkIn, not to the page it's linking to.  There 
are also a couple of relationship-specific members: 

    {Relationship.Hits} 
        This is the total number of relationship hits 

    {Relationship.LinkType} 
        This is the type of inbound link from the current page (a, img, etc) 

    {Relationship.InLinkCount} 
        This is the number of links coming into this page from others 

    {Relationship.OutLinkCount} 
        This is the number of links going out from this page to other pages 


There are a couple of scan-level variables; these are things that deal with
the scan as a whole instead of an individual page: 

    {Website.URL} 
        This is the main URL that the scan was done on 

    {Website.MaxDepth} 
        This is the deepest depth of the walk 

The following variables are all pretty straightforward; they map to what you
see in the listview, and their names describe what they contain:

    {Page.Count} 
    {Page.CurrentCount} 
    {Page.URL} 
    {Page.HTTPCode} 
    {Page.HTTPCodeString} 
    {Page.RawHTTPCode} 
    {Page.RawHTTPCodeString} 
    {Page.IsInternal} 
    {Page.IsRobotBlocked} 
    {Page.IsNoFollow} 
    {Page.IsDynamic} 
    {Page.IsRelative} 
    {Page.Title} 
    {Page.Depth} 
    {Page.Rank} 
    {Page.RedirectDepth} 
    {Page.Hits} 
    {Page.Links} 
    {Page.ComponentHits} 
    {Page.ClickableHits} 
    {Page.ContentType} 
    {Page.Size} 
    {Page.LastModifiedTime} 
    {Page.LinkType} 
    {Page.Duration} 
    {Page.SimilarityPercent} 
    {Page.SimilarityHits} 
    {Page.ParsedContent} 


Scripting Time 
-------------- 
As mentioned above, Time is handled slightly differently than normal variables, 
and has all sorts of variations you can use.  For example, {System.CurrentTime} 
has the following variations: 

    {System.CurrentTime} 
    {System.CurrentTimeRFC822} 
    {System.CurrentTimeDate} 
    {System.CurrentTimeSecond} 
    {System.CurrentTimeSecondPad} 
    {System.CurrentTimeMinute} 
    {System.CurrentTimeMinutePad} 
    {System.CurrentTimeHour24} 
    {System.CurrentTimeHour24Pad} 
    {System.CurrentTimeHour} 
    {System.CurrentTimeHourPad} 
    {System.CurrentTimeAMPM} 
    {System.CurrentTimeDay} 
    {System.CurrentTimeDayPad} 
    {System.CurrentTimeMonth} 
    {System.CurrentTimeMonthPad} 
    {System.CurrentTimeMonthName} 
    {System.CurrentTimeMonthNameAbbreviation} 
    {System.CurrentTimeYear2} 
    {System.CurrentTimeYear4} 

Most (if not all) should be pretty obvious, but basically any variable that 
ends with the word Time probably has all these modifiers. 
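
If the difference between the plain, Pad, and Hour24 variations isn't obvious, 
the following Python sketch shows what each one would presumably print for a 
single timestamp.  It is only an illustration: the function name and the 
dictionary are made up for the example, the exact formats LinkExaminer emits 
are an assumption, and the base, RFC822 and Date variations are left out 
because their layout isn't described here. 

```python
from datetime import datetime

MONTH_NAMES = [
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
]


def time_variations(t: datetime) -> dict:
    """Return assumed values for the suffixes listed above (hypothetical helper)."""
    hour12 = t.hour % 12 or 12  # 0 and 12 both show as 12 on a 12-hour clock
    month_name = MONTH_NAMES[t.month - 1]
    return {
        "Second": str(t.second),
        "SecondPad": f"{t.second:02d}",
        "Minute": str(t.minute),
        "MinutePad": f"{t.minute:02d}",
        "Hour24": str(t.hour),
        "Hour24Pad": f"{t.hour:02d}",
        "Hour": str(hour12),
        "HourPad": f"{hour12:02d}",
        "AMPM": "AM" if t.hour < 12 else "PM",
        "Day": str(t.day),
        "DayPad": f"{t.day:02d}",
        "Month": str(t.month),
        "MonthPad": f"{t.month:02d}",
        "MonthName": month_name,
        "MonthNameAbbreviation": month_name[:3],
        "Year2": f"{t.year % 100:02d}",
        "Year4": f"{t.year:04d}",
    }


if __name__ == "__main__":
    sample = datetime(2009, 3, 7, 15, 4, 5)
    for suffix, text in time_variations(sample).items():
        print(f"{{System.CurrentTime{suffix}}} -> {text}")
```

For the sample time above (3:04:05 PM on March 7, 2009), the sketch gives 3 
for Hour, 03 for HourPad, 15 for Hour24, and 09 for Year2. 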


Scripting Modifiers 
------------------- 
Speaking of modifiers, it's possible to have the scripting system do a bit of 
processing on any variables it sees.  To do this, simply add a dash followed 
by the modifier, so if you wanted the year {System.CurrentTimeYear4} to be 
displayed with a comma (I'm not sure why), then you would do the following: 
{System.CurrentTimeYear4-Comma}.  Here's a list of the modifiers currently 
available: 

        -Checked 
            If the string immediately following the modifier matches the 
            variable, then it returns "CHECKED".  Used for inputs. 

        -Selected 
            If the string immediately following the modifier matches the 
            variable, then it returns "SELECTED".  Used for inputs. 

        -ESC 
            This does the escaping necessary for the string to be used in 
            quotes. 

        -HTML 
            This does any escaping necessary for HTML, such as converting 
            greater than/less than, etc. 

        -CGI 
            This does the appropriate escaping to be sent as a parameter in a 
            URL. 

        -YESNO 
            This converts the variable 0/1 into Yes or No 

        -ONOFF 
            This converts the variable 0/1 into On or Off 

        -IsEmpty 
            This prints either "Yes" or "No" depending on whether the variable 
            is empty. 

        -Clip 
            This limits the length of a string to the number specified.  If 
            it must clip the string, then it does so and adds "..." to the 
            end. 

        -Chop 
            This does the same thing as clip, but doesn't add the "..." 

        -Left 
            This is used with percentages (0-100); it returns what's left, so 
            if the percentage was 24%: 

                {Summary.Percent}           would print         24 
                {Summary.Percent-Left}      would print         76 

        -SizeConvert 
            This is a bit special, it takes 2 (or 3) parameters after it.  The 
            first is the format of the incoming variable, the second is what 
            format to convert it to, and the third (optional) parameter is how 
            much precision it should use.  For instance, let's say you have the 
            filesize in bytes of the page, which is 1234567: 

                {Page.Size}                     would print         1234567 
                {Page.Size-SizeConvertBK}       would print         1206 
                {Page.Size-SizeConvertBM}       would print         1 
                {Page.Size-SizeConvertBM3}      would print         1.206 

            The types are: 

                            B = byte            b = bit 
                            K = kilobyte        k = kilobit 
                            M = megabyte        m = megabit 
                            G = gigabyte        g = gigabit 
                            T = terabyte        t = terabit 

        -Precision 
            Just like with SizeConvert, this can take a floating point variable 
            and reduce the number of digits to the right of the decimal. 

        -Approx 
            This changes an integer into an approximation 

        -Comma 
            This adds commas to an integer, so: 

                {Page.Size}                     would print         1234567 
                {Page.Size-Comma}               would print         1,234,567 
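
For anyone who wants to sanity-check the numbers in the examples above, here 
is a rough Python sketch of the arithmetic behind a few of the modifiers.  The 
function names are hypothetical and are not part of LinkExaminer; the sketch 
assumes that each size unit is 1024 times the previous one, that results are 
rounded rather than truncated, and that the "..." added by Clip does not count 
toward the length limit.  Under those assumptions the kilobyte and whole 
megabyte figures match the examples, but a 3-digit megabyte conversion of 
1234567 bytes comes out as 1.177, so treat the scaling as a guess rather than 
a description of what the program really does. 

```python
# Hypothetical helpers that mimic the documented behaviour of a few modifiers.
# None of these names exist in LinkExaminer itself.

# Exponent of 1024 for each unit letter (binary scaling is an assumption).
UNIT_EXPONENT = {"b": 0, "k": 1, "m": 2, "g": 3, "t": 4}


def unit_in_bits(unit: str) -> int:
    """Size of one unit in bits: upper case means bytes, lower case means bits."""
    scale = 1024 ** UNIT_EXPONENT[unit.lower()]
    return scale * 8 if unit.isupper() else scale


def size_convert(value: float, src: str, dst: str, precision: int = 0) -> str:
    """-SizeConvert: convert value from unit src to unit dst, then round."""
    converted = value * unit_in_bits(src) / unit_in_bits(dst)
    return f"{converted:.{precision}f}"


def comma(value: int) -> str:
    """-Comma: add thousands separators to an integer."""
    return f"{value:,}"


def left(percent: int) -> int:
    """-Left: what remains of a 0-100 percentage."""
    return 100 - percent


def clip(text: str, limit: int) -> str:
    """-Clip: shorten to limit characters and append "..." if anything was cut."""
    return text if len(text) <= limit else text[:limit] + "..."


def chop(text: str, limit: int) -> str:
    """-Chop: shorten to limit characters without adding "..."."""
    return text[:limit]


if __name__ == "__main__":
    print(size_convert(1234567, "B", "K"))  # 1206
    print(size_convert(1234567, "B", "M"))  # 1
    print(comma(1234567))                   # 1,234,567
    print(left(24))                         # 76
    print(clip("LinkExaminer", 4))          # Link...
    print(chop("LinkExaminer", 4))          # Link
```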

It should be pointed out that only one modifier at a time can be used - there 
is no chaining of them, so make sure to choose carefully!  :) 
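
To make the variable-dash-modifier form a bit more concrete, here is a minimal 
Python sketch of a template expander that accepts at most one modifier per 
variable.  It is not how LinkExaminer is implemented: the expand function, the 
MODIFIERS table and the regular expression are all invented for this example, 
only a handful of parameterless modifiers are covered, and the exact escaping 
done by -HTML and -CGI is assumed to match Python's html.escape and 
urllib.parse.quote. 

```python
import html
import re
from urllib.parse import quote

# Hypothetical table of parameterless modifiers (a small subset of the list).
MODIFIERS = {
    "HTML": html.escape,                              # < > & and quotes
    "CGI": lambda s: quote(s, safe=""),               # percent-encoding
    "YESNO": lambda s: "Yes" if s == "1" else "No",
    "ONOFF": lambda s: "On" if s == "1" else "Off",
    "Comma": lambda s: f"{int(s):,}",
}

# {Name} or {Name-Modifier}; a second dash makes the token fail to match.
TOKEN = re.compile(r"\{([A-Za-z0-9_.]+)(?:-([A-Za-z]+))?\}")


def expand(template: str, variables: dict) -> str:
    """Replace known {Name} / {Name-Modifier} tokens; leave the rest alone."""

    def replace(match: re.Match) -> str:
        name, modifier = match.group(1), match.group(2)
        if name not in variables:
            return match.group(0)      # unknown variable: keep the token
        value = str(variables[name])
        if modifier is None:
            return value
        if modifier not in MODIFIERS:
            return match.group(0)      # unknown modifier: keep the token
        return MODIFIERS[modifier](value)

    return TOKEN.sub(replace, template)


if __name__ == "__main__":
    page = {"Page.Title": "Tom & <Jerry>", "Page.Size": 1234567,
            "Page.IsInternal": 1}
    print(expand("{Page.Title-HTML} is {Page.Size-Comma} bytes", page))
    print(expand("Internal: {Page.IsInternal-YESNO}", page))
```

In this sketch a token with two modifiers, such as {Page.Size-Comma-HTML}, 
simply isn't recognized and is left in the output as-is; what the real program 
does with such a token isn't covered here. 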


Scripting Example 
----------------- 
So, let's try an example of adding something to the report-template.htm - open 
it up in your favorite editor and add the following: 

    <h2>All:</h2> 
    {Option.Reset} 
    {Display.PageList} 

Hopefully you have some idea of what this will do at this point, and if you 
don't, take a look at the header tag, it might give you a clue.  :)  Just 
generate a report and it should also now include a listing of all the pages 
that were harvested. 

One quick tip: when you're working on the template, LinkExaminer will reload 
it each time you generate a report.  This means you can perform a scan once, 
then just work on the template and generate the report over and over until 
you've got what you want. 


Conclusion 
---------- 
Hopefully you found this readme file useful (for the few who actually read 
these) - my least favorite type of thing to write is docs (my most favorite 
being code).  I hope you enjoy LinkExaminer and it serves you well... 
Last updated on Tuesday, March 9, 2021 12:24:47 PM PST. AnalogX™ is a registered trademark of AnalogX, LLC. All other trademarks are the sole property of their respective owners. All contents copyright ©1998-2009, AnalogX. All rights reserved.