Appendix 2

Xenu
update history

As of June 2006

1.6.2006 (1.2h)


Major improvements:
- Tip of the day
Minor improvements:
- ALT part of <IMG > used for the title column
- [Options] FailSimilarHosts=0 (current behaviour and default is 1)
- more statistics for managers (min size with link, max size with link, avg size)
- "In Links" and "Out Links" in headings for better readability when small
- correct error message for empty ftp orphan directories
- error message for empty local orphan directories
- error message for non existing local orphan directories
- orphan list sorted
- (Test / by request only) IGNOREFRONTPAGEORPHANS
- ftp host field allows port number, ftphostname:port
- ftp dialog fields stored in .INI file
- ftp default page (e.g. index.html, home.html, default.asp, etc)
- ftp dialog does not appear when Xenu is launched with "-url", but is still available in "corporate" version
- ReportBroken2 more efficient
- 8.6.2005 Switched to InnoSetup 5
- slight change in .TAB format: Status-Code and Status-Text instead of Status only
- prevent empty input in NEW dialog
- Ignore "error" HTTP_STATUS_ACCEPTED (for user with VMware, host Fedora Core 9 who has NAT problems)
- changed handling of "%XX" with file:// orphan files
- AfxMyParseURL removes "%XX" with file:// URLs
- include/exclude wildcard test thanks to http://www.codeproject.com/string/wildcmp.asp
- better text for ftp orphan dialog
- 18.3.2006: currently selected URL is first next new thread
- 19.3.2006: ftp/gopher segment only when such URLs exist
- 19.3.2006: put include/exclude settings into report
- 1.4.2006: link to Google Sitemaps in report
Bug fixes:
- in file://///UNC-Host/Share, leading "//" is not an error
- &#xnnn; now recognised (in addition to &#nnn;)
- vNormalizeURL() when reading URL List
- need space or semicolon before a "name", "href", etc
- process % when checking an ftp URL on an ftp server
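
The include/exclude wildcard test listed above can be illustrated with a small matcher for "*" and "?". This is only a sketch in Python, not the wildcmp code from the linked article and not Xenu's own code; the function name wildcard_match and the case-sensitive comparison are assumptions.

```python
def wildcard_match(pattern: str, text: str) -> bool:
    """Return True if text matches pattern; '*' = any run, '?' = one char."""
    p = t = 0          # current positions in pattern and text
    star = -1          # position of the most recent '*' in pattern, if any
    mark = 0           # text position where that '*' started matching
    while t < len(text):
        if p < len(pattern) and pattern[p] == "*":
            # Remember the star and first try matching it against nothing.
            star, mark = p, t
            p += 1
        elif p < len(pattern) and pattern[p] in ("?", text[t]):
            p += 1
            t += 1
        elif star != -1:
            # Mismatch: let the last '*' swallow one more character.
            p = star + 1
            mark += 1
            t = mark
        else:
            return False
    # Trailing stars can match the empty string.
    while p < len(pattern) and pattern[p] == "*":
        p += 1
    return p == len(pattern)
```

For example, wildcard_match("http://*.example.com/*", "http://www.example.com/a/b.html") is true, while wildcard_match("*.gif", "logo.gift") is false.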

18.3.2004 (1.2g)
Major improvements:
- Attempt at javascript thanks to
http://www.codeguru.com/Cpp/Cpp/string/regex/article.php/c2779/
details explained at
http://home.snafu.de/tilman/xenulink.html#javascript
Minor improvements:
- Show elapsed time in status bar [15.1.2005 changed archiving format]
- TARGET=_blank instead of TARGET=Xenu in report
- New Version 2.44 of CSMTPConnection http://www.naughter.com/smtp.html
- "//" in local files is always an error
- mailed report as "XXXX.htm" instead of "XXXX.tmp.htm"
- Version String in .XEN file
- vTimeoutSimilarHosts() more efficient with huge sites
- Faster local link checking (no copying to %temp% file)
- HTTP_STATUS_REDIRECT_KEEP_VERB (307)
- ERROR_INTERNET_CLIENT_AUTH_CERT_NEEDED (12044) error handling
- passive ftp mode in orphan dialog box
- Send XENU.INI file as mail test instead of CONFIG.SYS
- Orphan check also for https://
- "New" Dialogbox can be used to enter a ftp link (no crawling!)
- Cookies allowed when [Options] AllowCookies=1
don't use this if you have links that delete or change something!
- (Test / by request only) MSOEXCLUDE, PARSETEST
Bug fixes:
- better error handling for error 12003 in FTP orphan check
- _findclose in local orphan check (to unlock directory!)
- /> bug fixed in META REFRESH
- &# handling in vReplaceAmpStuff() and in bProcessLink()
- handle redirection target as a possibly relative link
- No empty URLs in URL list
- date, size for file:///
- alexa, google cache and wayback only for http:// and https://
- offset in ParseTag as int instead of short for tags > 64K
- cut off after '?' in remote orphan check
- exclude excluded URLs in Orphan list
- WINVER 0x0400

6.8.2004 (1.2f)
Major improvements:
- Real setup
- Status code for redirections
- Context menu: Open in Google Cache
- Context menu: Open in Wayback Machine
- Context menu: Open Alexa
Minor improvements:
- report as "XXXX.htm" instead of "XXXX.tmp.htm"
- Max-Level also "connected" to the URL
- Compiled on Windows XP
- List of unfinished threads when closing
- Don't display ODP context menu for broken http://editors.dmoz.org links
- "Display error" mentioned in Properties Box when too many links
- Look for subdirectories when doing orphan searches
- Remove "file:///" when launching local URLs without DDE
- Change "\" to "/" for "file://" URLs because of problems with Opera 7.5
Bug fixes:
- Deletes TGH*.* files also when limited number of levels
- Can work with http://www.dbdebunk.com: "location: " instead of "Location: "
- Correct time in report (minute and second were mismatched)
- ReportStatistics and ReportOrphans flags in .XEN file
- No error message when clicking "not on a line"
- Prevent re-entrancy of vAttention() when e-mailing report


28.9.2003 (1.2e)
Major improvements:
- Remote Orphans
- Bugfix for sites > 65535 links: m_FromTab set to 32bit
- timeout feature (default: 60 secs)
- STOP button in addition to the PAUSE toolbar button
- Scan https:// websites with bad certificate
- Validate URL with right mouse click
Minor improvements:
- Skip irc://, mms://, rtsp://, pnm://, wtai://
- <hr> instead of "=========" in report
- in report, so that it is correct HTML
- "Normalization" of URLs in include/exclude list
- Len = 0 when file error with http GET
- OpenRequest() with INTERNET_FLAG_NO_COOKIES
- Site Map recursion warning
- "//" in URL after the host name is not "broken" when part of
"http://" or "https://"
- empty line in report after local link error for a page
- Local orphans case insensitive
- Automatic retries only when m_bBusy
- CInternetSession local to thread, to make STOP possible
- "http://dmoz.org" instead of "http://dmoz.org/" comparison, to avoid
extra menu item for dmoz-internal links
- Properties at right-mouse-key always the last item
- Make current item visible after sort
- More random spidering to balance the load
- URL sort case-insensitive
- Buffer overflow bug in unknown errors removed

14.9.2002 (1.2d)
- "//" in URL after the host name is not "broken" when after a "?"
- Corrected bug that local non-HTML files would be downloaded in full
- Corrected GUI bug in "new" dialog
- Converted %5F to _
- Change in cmdline version about profile reading
(Matching now done before Normalization)

16.7.2002 (1.2c)
- <BLOCKQUOTE CITE
- Consider nonexistent types like "httttp" as "not found"
- Editing of ODP websites in the right-mouse-menu
(useful for editors at http://dmoz.org)
- For local files, launch related applications (e.g. viewer, editor)
with the right-mouse-menu
- Corrected bug that had root page twice in Xenu list
- "//" in URL after the host name is always an error
- Prevent closing when threads running
- "R" launches "Properties" in right-mouse-menu
- Save directory of "Browse" location
- Enlarged "New" Dialogbox
- Retry also for error 403
- Local checks for "#"
- HTTP_STATUS_PROXY_AUTH_REQ handling not dependent on password setting
- Ignore "error" HTTP_STATUS_RESET_CONTENT
(at http://www.vietnamthink.com/ )
- Corrected '%' bug with Orphan files
- "\" not a bug when after a "?"
- Correct # of Threads and URLs in status line when finished
- Corrected Bug with stuff like "nohref=" or "classname=" inside

30.11.2001 (1.2b)
- !!!!! Moved the xenu.ini file from
\windows or \winnt to the current working directory
- Corrected bug with </Script>
- <TR BACKGROUND
- <TH BACKGROUND

6.10.2001 (1.2a)
- extra column: time spent
- Correct count for broken links in report
- Can get size of some ftp files
- <TABLE BACKGROUND
- Append header information from redirected files even if a body exists,
because of http://wap.loop.de
- Look up MIME type for local files
- Unofficial Option in XENU.INI:
[Options] UseDDE=0 to disable DDE on some systems
- Combined html and wml (WAP) scanning
- <INPUT SRC="image.gif"> checked
- Skip <SCRIPT>...</SCRIPT>
- Logo in About-Box changed
- Min Level can be 0
- CTRL-Numpad-ADD to resize all columns
- Attempt at Orphan files
- Improved speed
- Better method for Url lookup
- no UrlTable search in ctor of CLinkInfo
- check for "txt", "jpg" etc more efficient
- m_csRootURL tested in bIncluded()
- CLinkInfo::vAddFromURL more efficient
- Internal function bHasBrokenToURLs() more efficient
- Corrected weird bug in initial Combo-Box
- Changed Text in NEW Dialogbox
- Compiled with VC++ 6

22.7.2001 (1.1f)
- Changed User-Agent string to
Xenu Link Sleuth
because of problems with many websites, e.g. www.sptimes.com

21.7.2001 (1.1e)
- CTRL-W and CTRL-Q shortcuts for Close and Exit
- Ability to consider hard redirections as errors
- Changed character in User-Agent string from ' to

2.7.2001 (1.1d)
- new error "no info to return" for empty web pages
- corrected bug about saving to tab file when file exists
- added statistics for managers :-)
- HEAD command also for .zip, .exe, .swf (saves bandwidth)
- serializing requests for name/password
- changed include/exclude so as to work only on the *beginning* of URLs
(don't forget to start them with "http"!)
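
The prefix-only include/exclude rule above can be sketched as follows. This is an illustration in Python, not Xenu's code; the function name is_included, the rule that an exclude entry wins over an include entry, and the treatment of an empty include list as "everything" are all assumptions.

```python
def is_included(url, include_prefixes, exclude_prefixes):
    """Decide whether url should be checked, matching only at the start."""
    # Assumption: an exclude entry takes priority over any include entry.
    if any(url.startswith(prefix) for prefix in exclude_prefixes):
        return False
    # Assumption: an empty include list means "include everything".
    if not include_prefixes:
        return True
    return any(url.startswith(prefix) for prefix in include_prefixes)
```

With this rule an entry such as "example.com/docs/" never matches "http://example.com/docs/a.html", which is why entries need to start with "http".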

(1.1c)
- Added some extra error messages
- Saving columns width
- Adjusting column width with double-click
- e-mail feature
- removed mailto:www-request@infoseek.com from report
- added LAYER SRC, IFRAME SRC and IMG LOWSRC
- sort URLs in broken link section of the report
- HEAD command also for .txt, .png, .rtf and .pdf (saves bandwidth)

(1.1b)
- Added <TD BACKGROUND="">
- file:/// instead of file://
- added BGSOUND
- Compiled with VC++ 5.0, smaller
- Can now launch URLs even with registry poorly configured
- URLs of the report open in new window
- Property box with Link Text / Title
- URLs for include/exclude are "bound" to the URL

(1.1a)
- [ and ] in URLs
- corrected bug in CODEBASE (must add "/" if not there)
- corrected bug that deleted include/exclude fields
- improved include/exclude dialog
- added text for error 300
- corrected bug about password sites

(1.0w 14.4.2000)
- PLUGINSPACE in EMBED tag now checked
- APPLET now checked, with CLASS and ARCHIVE, relative to CODEBASE

(1.0v 7.4.2000)
- EMBED tag now checked
- "Options" in the "New" dialog
- "Return to top" in Report
- Corrected bug in site map: broken links are not included
- Now converting more &blah; characters
- Titles get also converted
- converting &blah; characters before normalizing
- convert &# characters in URL
- can now handle URLs like http://user:password@host/ or ftp or https
- export always exports *all*, regardless of the view.
- sadly an old bug is back in: URLs with "\" are not recognised as broken.
- Links that start with "/../" are considered to be broken
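
The &# conversion mentioned above deals with numeric character references. A minimal Python sketch of decoding the decimal form (&#nnn;) and the hexadecimal form (&#xnnn;) follows. The function name decode_numeric_refs is hypothetical, and this is not Xenu's actual code.

```python
import re

# Matches decimal (&#65;) and hexadecimal (&#x41;) numeric character references.
_NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")

def decode_numeric_refs(text: str) -> str:
    """Replace &#nnn; and &#xnnn; with the characters they denote."""
    def _substitute(match):
        hex_digits, dec_digits = match.groups()
        code = int(hex_digits, 16) if hex_digits else int(dec_digits)
        try:
            return chr(code)
        except (ValueError, OverflowError):
            # Out-of-range code points are left exactly as written.
            return match.group(0)
    return _NUMERIC_REF.sub(_substitute, text)
```

Named references such as &amp; are deliberately left alone here; only the numeric forms are converted.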

(1.0u 15.10.1999)
- "skip these" feature - this really excludes URLs
- &U for Check URL menu

(1.0t 9.9.1999)
- corrected /./ bug
- added CTRL-B to switch between views

(1.0s 12.8.1999)
- "normalizing" received URLs. Advantage: hostnames always converted
into lower case.
- considering all pending URLs with the same host as failed when
timeout, connection failed, or similar
- moved the "Browse..." button
- changed the URL combining method, now using Microsoft's
InternetCombineURL() instead of my own algorithm
- proxy authentication now supported
- corrected bug with '
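
The normalizing and combining steps above can be illustrated in Python. This sketch uses urllib.parse as a stand-in for the Windows function named in the entry, so details may differ from what Xenu does; the function names normalize_url and combine_url are hypothetical.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit

def normalize_url(url: str) -> str:
    """Lowercase the scheme and host name, leaving path and query untouched."""
    parts = urlsplit(url)  # urlsplit already lowercases the scheme
    # Keep any "user:password@" prefix as written; only the host:port part
    # is case-insensitive.
    userinfo, at, hostport = parts.netloc.rpartition("@")
    netloc = userinfo + at + hostport.lower()
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))

def combine_url(base: str, link: str) -> str:
    """Resolve a possibly relative link against the page it was found on."""
    return normalize_url(urljoin(base, link))
```

For example, combine_url("http://WWW.Example.com/a/b.html", "../c.html") gives "http://www.example.com/c.html", so the same page reached through differently cased host names is stored only once.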

(1.0r 29.5.1999)
- Corrected bug with image maps

(1.0q 29.5.1999)
- include titles of links
- include / exclude
- allowing the use of '
- corrected bug re: e.g. "src" being used *before* the actual "src" word
- new tags: link, script (the applet tag will come in a later version)
- removed empty <ul></ul> sequences in the report
- date in the title of the report
- corrected bug re: HTML pages with CR only
- set "text/html" for local files
- save size of columns

(1.0p 8.1.1999)
- corrected bug about URL-in-URL
- convert & when in URLs
- REFRESH META Tag
- Focus set to OK after entering local file
- remote URLs with "\" now always fail (because netscape cannot handle them)

13.10.1998 (1.0o)
- corrected bug that prevented checking local files with a space in the name
- corrected bug that thread count was not updated when finished
- corrected bug that ignored http:/host/directory error
- added banners

If anyone has locations that offer banners, please e-mail me.
I would advertise for non-profit organisations that deal with
human rights or environmental topics. Attention - I will only
use banners that I like, and link to organisations that I like.

5.9.1998 (1.0n)
- Can check local files - useful for people who don't want to install
a local WWW server; simplified toolbar / initial window
- "Check External" in INI file for new windows
- "Show Broken Links Only" in INI for new windows
- Corrected "//" bug for www.workstation.digital.com
- Added random seed for banner (actually, uploaded this already on 17.7)
- included HTML file in the ZIP file
- RegisterShellFileTypes(FALSE) to prevent the "new" and "print"
in the registry for new users
- Errors between 1 and 199 are also "errors"
- maximize MDI child when opening
- Randomize checking, so that there is less volume on just one host
(reduces peak volume on the ISP who hosts the site being checked)
- Slight change in report because of OPERA bug with <PRE> after <H2>
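
The randomized checking order above can be sketched as a simple shuffle of the pending URL list, so that consecutive requests are less likely to hit the same host. This Python fragment is only an illustration; the function name randomized_order and the optional seed parameter are assumptions, and Xenu's actual scheduling may differ.

```python
import random

def randomized_order(urls, seed=None):
    """Return the pending URLs in random order without modifying the input."""
    rng = random.Random(seed)   # a fixed seed gives a repeatable order
    shuffled = list(urls)       # copy, so the caller's list stays intact
    rng.shuffle(shuffled)
    return shuffled
```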

16.7.1998 (1.0m)
- Added banners in report
- corrected the "406" bug

24.6.1998 (1.0l)
- Added a column at the right (error text).
- removed "DELETE_ON_CLOSE" technique, didn't work on Windows NT
due to different OS behaviour. Sorry!

5.6.1998 (1.0k)
- Changed ftp access completely. It is now reliable, but won't work with proxies.
- more than 32767 URLs
- Optimized HTML parser

18.4.1998 (1.0j)
(I was on vacation, and I am still behind in my other activities,
so no "big" new feature this time)
- no need to enter "http://" in the NEW dialog box
- Cool Xenu icon! See on the page above.
- CTRL-R for "retry broken links"
- Removed "search" from context menu (nothing was associated with it)

6.2.1998 (1.0i)
- URL launch should now work properly with Netscape Communicator

1.2.1998 (1.0h)
- added "export to TAB separated file" for Excel (for Marc)
- added max level
- 100% CPU usage problem solved (Miguelito) / changed idle processing
- Site Map
- URL launching improved (but still not perfect)

25.12.1997 (also 1.0g)
- corrected "%26" endless loop bug (Electronic Telegraph)

24.12.1997 (1.0g)
- added lots of new options (for Stu)
- choose what you want to have in the report
- choose to "fail" passworded sites
- changed the way that URLs are launched: now with DDE so that only
one instance (but another window) of Netscape comes up. Behaviour
with IE and Opera might be different
- corrected "text/html;...." bug (for Hanno)
- you can now launch URLs with ENTER
- you can now get the property box with ALT-ENTER
- force reload for every call --> INTERNET_FLAG_RELOAD (for Doug)
- changed initial dialog box, after two users didn't realize that one
has to input only one URL, and not every page of the site
- removed unused toolbar icons and menu elements

23.11.1997 (1.0f)
- corrected bug that made it difficult to check local or very fast sites
- corrected minor bug in Properties Dialog
- Added column with link level
- Added error message for wrong input
- Added different tries for image maps

12.10.1997 (1.0e)
- list of redirected URLs (useful because certain ISPs, e.g. www.primenet.com do not provide proper error returns, instead they redirect to an error page)
- checking of targets of redirected URLs (this often leads to more broken links, as lots of sites make automatic redirection without checking if the target site exists)
- ftp & gopher list for manual check
- added tips on how to repair broken links in the FAQ
- retry mechanism enhanced (for sites that fail with the HEAD command)
- error handler improved (open file problem)
- status line accuracy improved

7.9.1997 (1.0d)
- "Find" dialog box
- # of threads can be configured (watch your TCP/IP line glow!)
- corrected bug related to titles that do not end
- added authorization for "simple" password sites (HTTP error 401)
(will not work with web-based passwords, e.g. NY Times)

24.8.1997 (1.0c)
- HTML report, so that you can view with your browser
- % of checked URLs in the status bar
- URL list to choose from in "new" dialog box
- Automatic retry with GET when certain conditions are
met that suggest that the server cannot process the HEAD
command (www.amazon.com , www.wildkidz.com, www.dejanews.com )
- corrected display bug in "Reset Item" feature
- corrected bug when http:// in the middle of a URL
(www.sueddeutsche.de used this)
- corrected bug that incorrectly processed URLs that started
with a space
- corrected bug when saving while busy, that made reloading crash
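
The automatic retry with GET described above can be sketched as follows. The entry does not say which conditions trigger the retry, so the status codes used here (405 and 501) are an assumption, as are the function name check_url and the request hook, which stands for whatever transport actually sends the request.

```python
# Assumption: these responses suggest that the server does not handle HEAD.
HEAD_UNSUPPORTED = {405, 501}

def check_url(url, request):
    """Check url with HEAD first and fall back to GET if HEAD looks unsupported.

    request(method, url) is a hypothetical hook returning an HTTP status code.
    """
    status = request("HEAD", url)
    if status in HEAD_UNSUPPORTED:
        # HEAD saves bandwidth, but some servers only answer GET properly.
        status = request("GET", url)
    return status
```

Passing the transport in as a parameter keeps the retry logic separate from the networking code, so it can be tried out with a stub that returns fixed status codes.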

15.8.1997 (1.0b)
- now handled correctly (www.trancenet.org used it)
- "Reset Item" feature to recheck a single broken URL
- Automatic saving of window placement in INI file
- Error msg when trying to check non-http/https sites
- Reports are deleted when the next report is made
(*** Please go to your temp directory and delete all the TGH*.* files)
- "Scroll bug" found and removed!
- Now possible to check your bookmark file
- found column click bug, corrected, implemented time sorting
- New column: server.
- New column: title.
- Properties Dialog Box

10.8.1997
- ability to save & restore
- complete list of URLs (good to submit to a search engine)
- new icons
- # of threads in status line
- correct size of dynamic html files
- "copy" and "launch URL" function in menu and popup menu
- launch report all the time