ZebraTester's Page Scanner function automatically and recursively browses and explores the web pages of a web server, similar to a web spider or web crawler.

Page Scanner's Purpose

Primary: To turn a "normal" web surfing session into a load test program. This provides a simplified way to create a web surfing session instead of recording single web pages manually.

However, Page Scanner can only be used to acquire web surfing sessions that do not require HTML form-based authentication. This tool is not a replacement for recording web surfing sessions of real web applications.

Other: Page Scanner detects broken links inside a website and provides statistical data about the largest and slowest web pages. It also supports searching for text fragments across all scanned web pages.

Info
Note 1: Page Scanner does not interpret JavaScript code and does not submit forms. Only hyperlinks are considered. Cookies are automatically supported.
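
To illustrate what "only hyperlinks are considered" means in practice, here is a minimal Python sketch of a link extractor that follows `<a href>` targets and ignores scripts and forms. It is not Page Scanner's implementation; the class and function names are hypothetical and it uses only the Python standard library.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class HyperlinkCollector(HTMLParser):
    """Collects href targets of <a> tags; scripts and forms are ignored."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return  # <form>, <input>, <script> etc. are never followed
        href = dict(attrs).get("href")
        # Assumption: javascript: pseudo-links are skipped because no JS is run.
        if href and not href.lower().startswith("javascript:"):
            self.links.append(urljoin(self.base_url, href))


def extract_links(base_url, html):
    """Return absolute URLs of all plain hyperlinks found in the HTML text."""
    collector = HyperlinkCollector(base_url)
    collector.feed(html)
    return collector.links


sample = """
<a href="/sales/customers.html">Customers</a>
<a href="javascript:openPopup()">Popup</a>
<form action="/login"><input type="submit"></form>
<script>location.href = "/hidden.html";</script>
"""
print(extract_links("http://www.example.com/index.html", sample))
# ['http://www.example.com/sales/customers.html']
```

Pages that are reachable only through a form submission or a script-generated link, such as `/login` and `/hidden.html` in the sample, never appear in a scan of this kind.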

Info
Note 2: Page Scanner keeps the entire scanned website in its transient memory (RAM) in compressed form. This means that large websites can be scanned, but it also means that transient memory is not unlimited.

Note
Please note that the Page Scanner tool may return no result or an incomplete result because some websites or web pages contain malformed HTML code or because old, unusual HTML options have been used within the scanned web pages. Although this tool has been intensively tested, we cannot provide any warranty for error-free behavior. Possible website- or web-page-related errors may be impossible to fix because of divergent requirements or complexity. The functionality and behavior are similar to those of search engines, which have similar restrictions.

Overview

GUI Display

The window is divided into two parts.

Scan Result: The upper part of the window shows the scan's progress or, once the scan has completed, the scan results.

Page Scanner Input Parameter: The lower part of the window allows you to set the scan input parameters and start a scan.

Page Scanner Parameter Inputs

`Starting Web Page`	The scan starts from this URL. Optionally, scan only parts of a website by entering a deep-linked URL path; for example, `http://www.example.com/sales/customers.html`. In this case, only web pages below, or at, the same level of the URL path are scanned.
`Char Encoding`	The default value, `Auto Detect`, can be overridden if some or all web pages are wrongly encoded, that is, the character set specified in the HTML header does not match the character set actually used in the HTML body (malformed HTML on the server side). If Page Scanner cannot extract hyperlinks (succeeding web pages) from the starting web page, try `ISO-8859-1` or `UTF` as a workaround.
`Exclude Path Patterns`	Excludes one or more URL path patterns from scanning. Commas separate the path patterns.
`Follow Web Servers`	Includes content and web pages from other web servers in the scan; for example, when images embedded in the web pages are located on another web server. Enter several additional web servers, separated by commas. Example: `http://www.example.com`, `https://imgsrv.example.com:444`. The protocol (HTTP or HTTPS), the hostname (usually www), the domain, and the TCP/IP port are considered, but URL paths are NOT considered.
`Verify External Links`	Verify all external links to all other web servers. This is commonly used to detect broken hyperlinks to other web servers.
`Include`	Selects which sets of embedded content types are also included in the scan. Page Scanner uses the file extension of the URL path (if available) to determine the content type, because this can be done before the hyperlink of the embedded content itself is processed. This saves execution time, but a few URLs of excluded content types may still end up in the scan result, because the MIME type in the received HTTP response headers is used only to detect web pages. Remove these unwanted URLs after the scan has completed by using the "remove URL" form in the Display Result window.
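
The extension-based decision described for `Include` can be sketched as follows. This is an illustrative Python sketch, not the tool's code: the function names are hypothetical, the mapping holds only a subset of the documented extensions, and the rule that URLs without a known extension are always fetched is an assumption made to show how excluded content can still reach the result.

```python
import posixpath
from urllib.parse import urlparse

# Subset of the content-type sets documented for the Include parameter.
CONTENT_TYPE_SETS = {
    "Images, Flash, CSS, JS": {".gif", ".png", ".jpg", ".jpeg", ".ico", ".swf", ".css", ".js"},
    "PDF Documents": {".pdf"},
    "ASCII Text Files": {".txt", ".text", ".log"},
}


def content_set_for(url):
    """Return the content-type set for a URL, or None if the extension is unknown."""
    path = urlparse(url).path  # the query string is not part of the path
    extension = posixpath.splitext(path)[1].lower()
    for set_name, extensions in CONTENT_TYPE_SETS.items():
        if extension in extensions:
            return set_name
    return None


def should_fetch(url, included_sets):
    """Decide from the URL text alone, before any request is sent."""
    set_name = content_set_for(url)
    # Assumption: a URL without a known extension might be a web page,
    # so it is fetched even though it could turn out to be excluded content.
    return set_name is None or set_name in included_sets


included = {"PDF Documents"}
print(should_fetch("http://www.example.com/docs/manual.pdf", included))  # True
print(should_fetch("http://www.example.com/logo.GIF?v=2", included))     # False
print(should_fetch("http://www.example.com/download?id=7", included))    # True
```

The last call shows the trade-off: a URL such as `/download?id=7` has no extension to test, so it is fetched even if the server answers with an excluded content type.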


Content-Type Sets	Corresponding File Extensions
Images, Flash, CSS, JS	`.img`, `.bmp`, `.gif`, `.pct`, `.pict`, `.png`, `.jpg`, `.jpeg`, `.tif`, `.tiff`, `.tga`, `.ico`, `.swf`, `.stream`, `.css`, `.stylesheet`, `.js`, `.javascript`
PDF Documents	`.pdf`
Office Documents	`.doc`, `.ppt`, `.pps`, `.xls`, `.mdb`, `.wmf`, `.rtf`, `.wri`, `.vsd`, `.rtx`
ASCII Text Files	`.txt`, `.text`, `.log`, `.asc`, `.ascii`, `.cvs`
Music and Movies	`.mp2`, `.mp3`, `.mpg`, `.avi`, `.wav`, `.mov`, `.wm`, `.rm`, `.mpeg`
Binary Files	`.exe`, `.msi`, `.dll`, `.bat`, `.com`, `.pif`, `.dat`, `.bin`, `.vcd`, `.sav`
`Include Options`	Allows you to select or deselect specific file extensions using the keywords `-add` and `-remove`. Example: `-remove .gif -add .mp2`
`Max Scan Time`	Limits the maximum scan time in minutes. The scan will be stopped if this time is exceeded.
`Max Web Pages`	Limits the maximum number of scanned web pages. The scan will be stopped if the maximum number of web pages is exceeded.
`Max Received Bytes`	Limits the maximum size of the received data (in megabytes), measured over the entire scan. The scan will be stopped if the maximum size of the received data is exceeded.
`Max URL Calls`	Limits the maximum number of executed URL calls, measured over the entire scan. The scan will be stopped if the maximum number of executed URL calls is exceeded.
`URL Timeout`	Defines the response timeout, in seconds, per single URL call. If this timeout expires, the URL call will be reported as failed (no response from web server).
`Max Path Depth`	Limits the maximum URL path depth of scanned web pages. Example: `http://www.example.com/docs/content/about.html` has a path depth of 3.
`Follow Redirections`	Limits the total number of followed HTTP redirects during the scan.
`Follow Path Repetitions`	Limits the number of path repetitions that can occur within a single URL path. This parameter acts as protection against endless loops in scanning, and should usually be set to 1 (default) or 2. Example: `http://www.example.com/docs/content/about.html` has a path repetition value of 3.
`Follow CGI Parameters`	This option (disabled by default) acts as protection against receiving almost identical URLs many times if they differ only in their CGI parameters. If disabled, only the first similar URL will be processed. Example: the first URL `http://www.example.com/showDoc?context=12` will be processed, but the subsequent similar URLs `http://www.example.com/showDoc?context=10` and `http://www.example.com/showDoc?context=13` will not be processed.
`Authentication`	Allows scanning protected websites (or web pages).
`Browser Language`	Sets which default language should be preferred when scanning multilingual websites.
`Use Proxy`	Apply the Personal Settings menu's Next Proxy Configuration when scanning through an (outgoing) proxy server.
`SSL Version`	Select the SSL protocol version to communicate with HTTPS servers (encrypted connections).
`Annotation`	Enter a short comment about the scan.
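
Several of the parameters above are rules on URLs, so a short sketch may help when predicting which URLs a scan will accept. The Python below is an assumption-laden illustration, not Page Scanner's implementation: all function names are hypothetical, and the exact scoping and comparison rules of the tool may differ. It covers `Starting Web Page` scoping, `Follow Web Servers` matching, `Max Path Depth`, and the similarity test behind `Follow CGI Parameters`.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """(protocol, host, port) of a URL; the URL path is deliberately ignored."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    return scheme, host, parts.port or DEFAULT_PORTS.get(scheme)


def path_depth(url):
    """Number of non-empty path segments, counting the file name."""
    return len([segment for segment in urlsplit(url).path.split("/") if segment])


def in_scope(url, start_url):
    """True if url is on the start server, at or below the start URL's directory."""
    if origin(url) != origin(start_url):
        return False
    start_path = urlsplit(start_url).path or "/"
    start_dir = start_path.rsplit("/", 1)[0] + "/"
    return (urlsplit(url).path or "/").startswith(start_dir)


def without_cgi_parameters(url):
    """Key under which URLs differing only in CGI parameters count as similar."""
    return urlsplit(url)._replace(query="", fragment="").geturl()


start = "http://www.example.com/sales/customers.html"
print(in_scope("http://www.example.com/sales/eu/list.html", start))  # True
print(in_scope("http://www.example.com/index.html", start))          # False
print(path_depth("http://www.example.com/docs/content/about.html"))  # 3

# With Follow CGI Parameters disabled, only the first similar URL is kept.
seen, processed = set(), []
for candidate in [
    "http://www.example.com/showDoc?context=12",
    "http://www.example.com/showDoc?context=10",
    "http://www.example.com/showDoc?context=13",
]:
    key = without_cgi_parameters(candidate)
    if key not in seen:
        seen.add(key)
        processed.append(candidate)
print(processed)  # ['http://www.example.com/showDoc?context=12']
```

Comparing `origin()` tuples mirrors the documented rule that protocol, hostname, domain, and port are considered while URL paths are not, so `https://imgsrv.example.com:444/img/logo.png` matches a followed server entered as `https://imgsrv.example.com:444`.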

Analyze Scan

Convert Scan Result

A Page Scanner result can be converted into a “normal” web surfing session, which can then be used to create a load test program.

Version	Old Version 6	New Version 7
Changes made by	Glenn Huang	Glenn Huang
Saved on	Jan 14, 2020	Feb 16, 2021
