
The window is divided into two parts.

Scan Result: The upper part of the window shows the scan's progress or, once the scan has completed, the scan results.


Page Scanner Input Parameter: The lower part of the window allows you to enter scan input parameters and start a scan.



Page Scanner Parameter Inputs

Starting Web Page

The scan starts from this URL. Optionally, scan only part of a website by entering a deep-linked URL path; for example, http://www.example.com/sales/customers.html. In this case, only web pages at or below the same level of the URL path are scanned.
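For illustration only, here is a minimal sketch (not ZebraTester code; the helper name in_scope is invented) of how such a scope check can work:

    from urllib.parse import urlparse

    def in_scope(start_url, candidate_url):
        # A candidate URL is in scope if it is on the same server and
        # at or below the directory level of the starting URL path.
        start, cand = urlparse(start_url), urlparse(candidate_url)
        if (start.scheme, start.netloc) != (cand.scheme, cand.netloc):
            return False
        base_dir = start.path.rsplit("/", 1)[0] + "/"   # e.g. "/sales/"
        return cand.path.startswith(base_dir)

    # in_scope("http://www.example.com/sales/customers.html",
    #          "http://www.example.com/sales/2024/list.html")  -> True
    # in_scope("http://www.example.com/sales/customers.html",
    #          "http://www.example.com/about.html")            -> False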

Char Encoding

The default value, Auto Detect, can be overridden in case some or all web pages are wrongly coded, such that the character set specified in the HTML header does not match the character set actually used within the HTML body of the web pages (malformed HTML at the server side). You can try ISO-8859-1 or UTF-8 as a workaround if the Page Scanner cannot extract hyperlinks (succeeding web pages) from the starting web page.
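To illustrate the underlying problem (a sketch under the assumption that decoding is simply retried with fallback charsets; this is not product code):

    def decode_body(body_bytes, declared_charset=None):
        # Try the charset declared in the HTTP header first, then UTF-8,
        # then ISO-8859-1 (which accepts any byte sequence).
        for charset in filter(None, (declared_charset, "utf-8", "iso-8859-1")):
            try:
                return body_bytes.decode(charset)
            except (UnicodeDecodeError, LookupError):
                continue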

Exclude Path Patterns

Excludes one or more URL path patterns from scanning. Separate multiple path patterns with commas.
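The exact matching rule is not specified here, so the following sketch (an invented helper using simple substring matching) only illustrates how a comma-separated pattern list can be applied:

    def is_excluded(url_path, exclude_field):
        # exclude_field is the comma-separated input value,
        # e.g. "/archive, /private/reports"
        patterns = [p.strip() for p in exclude_field.split(",") if p.strip()]
        return any(pattern in url_path for pattern in patterns)

    # is_excluded("/archive/2020/index.html", "/archive, /private") -> True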

Follow Web Servers

Includes content and web pages from other web servers in the scan; for example, when images embedded in the web pages are located on another web server. Enter additional web servers separated by commas. Example: http://www.example.com, https://imgsrv.example.com:444. The protocol (HTTP or HTTPS), the hostname (usually www), the domain, and the TCP/IP port are considered, but URL paths are NOT considered.
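As a sketch of the matching rule (protocol, host, and port are compared; paths are ignored), with invented helper names and the usual default ports assumed:

    from urllib.parse import urlparse

    DEFAULT_PORTS = {"http": 80, "https": 443}

    def server_key(url):
        # Only protocol, hostname, and TCP/IP port identify a web
        # server; the URL path is deliberately ignored.
        u = urlparse(url)
        return (u.scheme, u.hostname, u.port or DEFAULT_PORTS[u.scheme])

    followed = "http://www.example.com, https://imgsrv.example.com:444"
    allowed = {server_key(s.strip()) for s in followed.split(",")}

    def may_follow(url):
        return server_key(url) in allowed

    # may_follow("https://imgsrv.example.com:444/img/logo.png") -> True
    # may_follow("http://other.example.org/index.html")         -> False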

Verify External Links

Verify all external links to all other web servers. This is commonly used to detect broken hyperlinks to other web servers.

Include

Determines which sets of embedded content types are also included in the scan. The Page Scanner uses the file extension of the URL path to determine the content type (if available), because this can be done before the hyperlink of the embedded content itself is processed. This saves execution time, but it means that a few URLs of excluded content types may still flow into the scan result, because the MIME type of the received HTTP response headers is only used to detect web pages. Remove these unwanted URLs after the scan has completed by using the "remove URL" form in the Display Result window. A lookup sketch follows the table below.

Content-Type Sets and Corresponding File Extensions

Images, Flash, CSS, JS: .img, .bmp, .gif, .pct, .pict, .png, .jpg, .jpeg, .tif, .tiff, .tga, .ico, .swf, .stream, .css, .stylesheet, .js, .javascript

PDF Documents: .pdf

Office Documents: .doc, .ppt, .pps, .xls, .mdb, .wmf, .rtf, .wri, .vsd, .rtx

ASCII Text Files: .txt, .text, .log, .asc, .ascii, .cvs

Music and Movies: .mp2, .mp3, .mpg, .avi, .wav, .mov, .wm, .rm, .mpeg

Binary Files: .exe, .msi, .dll, .bat, .com, .pif, .dat, .bin, .vcd, .sav
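To make the extension-based detection concrete, here is a sketch (extension sets abridged from the table above; helper names are invented):

    import os
    from urllib.parse import urlparse

    CONTENT_TYPE_SETS = {
        "Images, Flash, CSS, JS": {".gif", ".png", ".jpg", ".css", ".js"},
        "PDF Documents": {".pdf"},
        "Office Documents": {".doc", ".ppt", ".xls"},
    }

    def content_type_set(url):
        # Classify by the file extension of the URL path, which is
        # known before the embedded content itself is fetched.
        ext = os.path.splitext(urlparse(url).path)[1].lower()
        for set_name, extensions in CONTENT_TYPE_SETS.items():
            if ext in extensions:
                return set_name
        return None   # unknown: only the response MIME type can tell

    # content_type_set("http://www.example.com/img/logo.png")
    # -> "Images, Flash, CSS, JS"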

Include Options

Allows you to select or de-select specific file extensions using the keywords -add or -remove.

Example: 

-remove .gif -add .mp2
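A sketch of how such an option string can be parsed (an invented helper; the product's exact parsing rules are not documented here):

    def apply_include_options(extensions, options):
        # Adjust a selected extension set according to an option
        # string such as "-remove .gif -add .mp2".
        selected = set(extensions)
        tokens = options.split()
        for keyword, ext in zip(tokens[::2], tokens[1::2]):
            if keyword == "-add":
                selected.add(ext)
            elif keyword == "-remove":
                selected.discard(ext)
        return selected

    # apply_include_options({".gif", ".png"}, "-remove .gif -add .mp2")
    # -> {".png", ".mp2"}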

Max Scan Time

Limits the maximum scan time in minutes. The scan will be stopped if this time is exceeded.

Max Web Pages

Limits the maximum number of scanned web pages. The scan will be stopped if the maximum number of web pages is exceeded.

Max Received Bytes

Limits the maximum size of the received data (in megabytes), measured over the entire scan. The scan will be stopped if the maximum size of the received data is exceeded.

Max URL Calls

Limits the maximum number of executed URL calls, measured over the entire scan. The scan will be stopped if the maximum number of executed URL calls is exceeded.

URL Timeout

Defines the response timeout, in seconds, per single URL call. If this timeout expires, the URL call will be reported as failed (no response from the web server).

Max Path Depth

Limits the maximum URL path depth of scanned web pages.

Example: http://www.example.com/docs/content/about.html has a path depth of 3.
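The depth simply counts the path segments, as this sketch (invented helper) shows:

    from urllib.parse import urlparse

    def path_depth(url):
        # "/docs/content/about.html" -> ["docs", "content", "about.html"] -> 3
        return len([s for s in urlparse(url).path.split("/") if s])

    # path_depth("http://www.example.com/docs/content/about.html") -> 3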

Follow Redirections

Limits the total number of followed HTTP redirects during the scan.

Follow Path Repetitions

Limits the number of path repetitions which can occur within a single URL path. This parameter acts as protection against endless loops in scanning, and should usually be set to 1 (default) or 2.

Example: in http://www.example.com/docs/docs/docs/about.html, the path element "docs" occurs three times, so the URL has a path repetition value of 3.
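A sketch of how the repetition value can be computed (invented helper):

    from collections import Counter
    from urllib.parse import urlparse

    def max_path_repetition(url):
        # Highest number of occurrences of any single path element.
        segments = [s for s in urlparse(url).path.split("/") if s]
        return max(Counter(segments).values(), default=0)

    # max_path_repetition("http://www.example.com/docs/docs/docs/about.html") -> 3
    # max_path_repetition("http://www.example.com/docs/content/about.html")   -> 1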

Follow CGI Parameters

This option (disabled by default) acts as protection against receiving almost identical URLs many times when they differ only in their CGI parameters. If the option is disabled, only the first of these similar URLs will be processed.

Example: the first URL http://www.example.com/showDoc?context=12 will be processed, but the subsequent similar URLs http://www.example.com/showDoc?context=10 and http://www.example.com/showDoc?context=13 will not be processed.
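A sketch of the deduplication idea (invented helper; the URL without its query string is assumed to serve as the identity key when the option is disabled):

    from urllib.parse import urlparse

    processed = set()

    def should_process(url, follow_cgi_parameters=False):
        # With the option disabled, URLs that differ only in their
        # CGI (query) parameters count as the same URL.
        u = urlparse(url)
        key = url if follow_cgi_parameters else (u.scheme, u.netloc, u.path)
        if key in processed:
            return False
        processed.add(key)
        return True

    # should_process("http://www.example.com/showDoc?context=12") -> True
    # should_process("http://www.example.com/showDoc?context=10") -> False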

Authentication

Allows scanning protected web sites (or web pages).

Browser Language

Sets which default language should be preferred when scanning multilingual web sites.

Use Proxy

Applies the Next Proxy Configuration from the Personal Settings menu when scanning through an (outgoing) proxy server.

SSL Version

Selects the SSL protocol version used to communicate with HTTPS servers (encrypted connections).

Annotation

Enter a short comment about the scan.

Analyze Scan: see "Analyzing the Scan Result" below.

Convert Scan Result: see "Converting a Scan Result into a Web Surfing Session" below.

Authentication

Allows scanning protected web sites (or web pages).

Supported Authentication Methods

Basic: Applies HTTP Basic Authentication (a Base64-encoded username:password sent within all HTTP request headers; see the sketch below). Also enter a username and password into the corresponding input fields.

NTLM: Applies NTLM authentication for all URL calls (if requested by the web server). The NTLM configuration of the Personal Settings menu will be used.

PKCS#12 Client Certificate: Applies an HTTPS/SSL client certificate for authentication. The active PKCS#12 client certificate of the Personal Settings menu will be used.
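Basic Authentication itself is standardized (RFC 7617); the following sketch shows the header construction described above (not product code):

    import base64

    def basic_auth_header(username, password):
        # Base64-encode "username:password" and send it in the
        # Authorization header of every HTTP request.
        token = base64.b64encode(f"{username}:{password}".encode()).decode()
        return {"Authorization": f"Basic {token}"}

    # basic_auth_header("alice", "secret")
    # -> {'Authorization': 'Basic YWxpY2U6c2VjcmV0'}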

Scan Options


ABORT: You can abort a running scan by clicking the "Abort Scan" ("X") icon.


DISPLAY: Displays the scan result.


CONVERT: Converts the Page Scanner result into a "normal" web surfing session (.prxdat), from which a load test program can be created for additional ZebraTester actions.

  • A filename, without path or file extension, is required.

  • An annotation is recommended to provide a hint in Project Navigator.

  • Click Convert and Save when ready.

  • Optionally display the newly converted session in the Main Menu.

Filename

The filename of the web surfing session. You must enter a "simple" filename, with no path and no file extension. The file extension is always .prxdat. The file will be saved in the selected Project Navigator directory.

Web Pages

Selects the scanned web pages which should flow into the web surfing session. “All Pages” means that all scanned web pages are selected. Alternatively, the option “Page Ranges” allows you to select one or several ranges of page numbers. If you use several ranges, they must be separated by commas.

Example: "1, 3-5, 7, 38-81"

Max. URL Calls

Limits the number of URL calls that should flow into the web surfing session. 
Tip: Apica recommends not converting more than 1,000 URL calls into a web surfing session.

Annotation

Enter a short comment about the web surfing session. This will become a hint in Project Navigator.

Load Session into

Optionally loads the web surfing session into the transient memory area of the Main Menu, or into one of two memory Scratch Areas of the Session Cutter.


SAVE: When a scan has completed, save the scan result to a file. The file will be saved in the selected Project Navigator directory and will always have the file extension .prxscn. Scan results can be restored and loaded back into the Page Scanner by clicking on the corresponding "Load Page Scan" icon inside Project Navigator.


DISCARD: Discards the scan result.


Analyzing the Scan Result


The most important statistical data about the scan are shown in the summary/overview, near the top of the window. Below the overview, select the various scan result details.

The search form, on the right side near the scan result detail selection, allows you to search for an ASCII text fragment over all web pages of the scan result. By default, the text fragment is searched for within all HTTP request headers, all HTTP response headers, and all HTTP response content data.

The remove URL form, shown below the scan result detail selection, allows you to remove specific sets of URLs from the scan result. The set of removed URLs is selected by the received MIME type (for example, IMAGE/GIF or APPLICATION/PDF), combined by a logical AND condition with either the received HTTP status code of the URLs (200, 302, and so on) or a Page Scanner error code such as "network connection failed".

with content MIME type

Selects a specific MIME type. The input field is case insensitive (upper- and lowercase characters are treated as identical). any means that all MIME types are selected, independent of their value. none means that only URL calls whose HTTP response headers do NOT contain MIME type information (HTTP response header field "Content-Type" not set) are selected.

HTTP status code

Selects an HTTP status code or a Page Scanner error code.
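A sketch of the combined filter (hypothetical URL-call records with content_type and status fields; not product code):

    def matches_remove_filter(call, mime_type, status):
        # A URL call is removed when BOTH conditions match (logical AND).
        if mime_type == "any":
            mime_ok = True
        elif mime_type == "none":
            mime_ok = call.get("content_type") is None
        else:
            mime_ok = (call.get("content_type") or "").lower() == mime_type.lower()
        return mime_ok and call.get("status") == status

    # Remove all PDF documents that were received with status 200:
    # calls = [c for c in calls if not matches_remove_filter(c, "application/pdf", 200)]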

Note: A few URLs with excluded content types may flow into the scan result (not selected by scan input parameter). You can use the "remove URL" form to clean up the scan result, and to remove any unwanted URLs. The most common case is to remove PDF documents from the scan result.

Scan Result Details


The Scan Input Parameter displays all input parameters for the scan (without authentication data).


 

Scan Statistic displays some additional statistical data about the scan. Similar Web Pages is the number of web pages with duplicate content (same content but different URL path). Failed URL Calls is the number of URL calls that failed, meaning that either no HTTP status code was available (no response received from the web server) or the received HTTP status was an error code (400-599).

 


Non-Processed Web Servers displays a summary of all web servers that have been found in hyperlinks but whose web pages or page elements have not been scanned. The number before the server name shows how many times hyperlinks to that server were ignored by the Page Scanner.

 


Scan Result per Web Page displays all scanned web pages. The embedded content of a web page, such as images, is always displayed in a Web Browser Cached View. This means, for example, that a particular (unique) image is shown only once, inside the web page in which it was referenced for the first time; all subsequent web pages will not show the same embedded content. This behavior is roughly equal to what a web browser does: it caches duplicate references over all web pages within a web surfing session. A minimal sketch of this caching behavior follows below.

More details about a specific URL call can be shown by clicking on the corresponding URL hyperlink.
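A minimal sketch of the cached view (hypothetical page records; not product code):

    def cached_view(pages):
        # Show each embedded element only on the first page that
        # references it, like a browser cache across the session.
        seen = set()
        for page in pages:
            fresh = [e for e in page["embedded"] if e not in seen]
            seen.update(page["embedded"])
            yield {"url": page["url"], "embedded": fresh}

    # pages = [{"url": "/a.html", "embedded": ["logo.png", "s.css"]},
    #          {"url": "/b.html", "embedded": ["logo.png", "x.png"]}]
    # -> /a.html shows logo.png and s.css; /b.html shows only x.png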



Broken Links displays a list of all broken hyperlinks.


 

Duplicated Content displays a list of URLs with duplicate content (same content but different URL path).

Largest Web Pages displays a list of the largest web pages.


Slowest Web Pages displays a list of the slowest web pages.


Tip: You can click the bars to display the corresponding page details.

Converting a Scan Result into a Web Surfing Session

A Page Scanner result can be converted into a "normal" web surfing session, which can then be used to create a load test program.


Input Fields

The input fields of this dialog (Filename, Web Pages, Max. URL Calls, Annotation, and Load Session into) are the same as those described above for the CONVERT scan option.

 


After the web surfing session has been stored, it will be automatically loaded into the Main Menu if the “Load Session into” checkbox was selected. After this, you can generate the load test program.
