...
The window is divided into two parts.
Scan Result: The upper part of the window shows the scan's progress or, once the scan has completed, the scan results. | |
Page Scanner Input Parameter: The lower part of the window lets you enter the scan input parameters and start a scan. |
...
Page Scanner Parameter Inputs
| The scan starts from this URL. Optionally, scan only parts of a website by entering a deep-linked URL path; for example, |
| The default value, |
| Excludes one or more URL path patterns from scanning. Commas separate the path patterns. |
| Include content and web pages from other web servers within the scan; for example, when images embedded in the web pages are located on another web server. Enter several additional web servers, separated by commas. Example: |
| Verify all external links to all other web servers. This is commonly used to detect broken hyperlinks to other web servers. |
| Selects which sets of embedded content types are included in the scan. Page Scanner uses the file extension of the URL path (if available) to determine the content type, because this can be done before the hyperlink of the embedded content itself is processed. This saves execution time, but a few URLs of excluded content types may still flow into the scan result, because the MIME type in the received HTTP response headers is used only to detect web pages. Remove these unwanted URLs after the scan has completed by using the "remove URL" form in the Display Result window. |
Content-Type Sets | Corresponding File Extensions |
Images, Flash, CSS, JS |
|
PDF Documents |
|
Office Documents |
|
ASCII Text Files |
|
Music and Movies |
|
Binary Files |
|
| Allows you to select or de-select specific file extensions using the keywords -add or -remove. Example: -remove .gif -add .mp2 |
| Limits the maximum scan time in minutes. The scan will be stopped if this time is exceeded. |
| Limits the maximum number of scanned web pages. The scan will be stopped if the maximum number of web pages is exceeded. |
| Limits the maximum size of the received data (in megabytes), measured over the entire scan. The scan will be stopped if the maximum size of the received data is exceeded. |
| Limits the maximum number of executed URL calls, measured over the entire scan. The scan will be stopped if the maximum number of executed URL calls is exceeded. |
| Defines the response timeout, in seconds, per single URL call. If this timeout expires, the URL call will be reported as failed (no response from web server). |
| Limits the maximum URL path depth of scanned web pages. Example: |
| Limits the total number of followed HTTP redirects during the scan. |
| Limits the number of path repetitions which can occur within a single URL path. This parameter acts as protection against endless loops in scanning, and should usually be set to 1 (default) or 2. Example: |
| This (by default disabled) option acts as protection against receiving almost identical URLs many times if they differ only in their CGI parameters. If disabled, only the first similar URL will be processed. Example: the first URL |
| Allows scanning protected web sites (or web pages). |
| Sets which default language should be preferred when scanning multilingual web sites. |
| Apply the Personal Settings menu's Next Proxy Configuration when scanning through an (outgoing) proxy server. |
| Select the SSL protocol version to communicate with HTTPS servers (encrypted connections). |
| Enter a short comment about the scan. |
Analyze Scan
Convert Scan Result
. | |
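The protection against almost identical URLs described in the parameter table above (URLs that differ only in their CGI parameters) can be sketched as follows. This is a minimal illustration, not Page Scanner's actual implementation: it assumes that "similar" means the same scheme, host, and path once the query string is dropped. The function names and the example.com URLs are hypothetical.

```python
from urllib.parse import urlsplit


def similarity_key(url: str) -> tuple[str, str, str]:
    """Return scheme, host, and path; the query string (CGI parameters) is dropped."""
    parts = urlsplit(url)
    return (parts.scheme.lower(), parts.netloc.lower(), parts.path)


def drop_similar_urls(urls: list[str]) -> list[str]:
    """Keep only the first URL of each group that differs solely in CGI parameters."""
    seen = set()
    kept = []
    for url in urls:
        key = similarity_key(url)
        if key not in seen:
            seen.add(key)
            kept.append(url)
    return kept


if __name__ == "__main__":
    # Hypothetical URLs, used only for illustration.
    found = [
        "http://www.example.com/show.jsp?id=1",
        "http://www.example.com/show.jsp?id=2",
        "http://www.example.com/list.jsp",
    ]
    print(drop_similar_urls(found))
```

In this sketch the second show.jsp URL is skipped because it has the same key as the first one.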
| Allows scanning protected web sites (or web pages). |
Supported Authentication Methods | |
Authentication Method | Description |
---|---|
Basic | Apply HTTP Basic Authentication (Base64-encoded username:password sent within all HTTP request headers). You should also enter a username and password into the corresponding input fields. |
NTLM | Apply NTLM authentication for all URL calls (if requested by the web server). The NTLM configuration of the Personal Settings menu will be used. |
PKCS#12 Client Certificate | Apply an HTTPS/SSL client certificate for authentication. The active PKCS#12 client certificate of the Personal Settings menu will be used. |
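To illustrate what the Basic method in the table above sends, the following sketch builds the HTTP `Authorization` request header from a username and password as defined for HTTP Basic Authentication. The function name is hypothetical and the credentials are sample values. Page Scanner builds this header itself, so the code is for understanding only.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict[str, str]:
    """Build the Authorization header used by HTTP Basic Authentication."""
    # The credentials are joined with a colon and Base64-encoded (not encrypted).
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {token}"}


if __name__ == "__main__":
    # Sample credentials, used only for illustration.
    print(basic_auth_header("Aladdin", "open sesame"))
```

Because Base64 is an encoding and not encryption, Basic Authentication only protects the credentials when it is used over an HTTPS connection.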
Scan Options
ABORT: Abort a running scan by clicking the “Abort Scan” (“X”) icon. | |
DISPLAY: Display the scan result | |
CONVERT: Converts the Page Scanner Result into a “normal” Web Surfing Session |
|
| The filename of the web surfing session. You must enter a "simple" filename, with no path and no file extension. The file extension is always |
| Selects the scanned web pages which should flow into the web surfing session. “All Pages” means that all scanned web pages are selected. Alternatively, the option “Page Ranges” allows you to select one or several ranges of page numbers. If you use several ranges, they must be separated by commas. Example: "1, 3-5, 7, 38-81" |
| Limits the number of URL calls that should flow into the web surfing session. |
| Enter a short comment about the web surfing session. This will become a hint in Project Navigator. |
| Optionally loads the web surfing session into the transient memory area of the Main Menu, or into one of two memory Scratch Areas of the Session Cutter. |
SAVE: When a scan has completed, save the scan result to a file. The file will be saved in the selected Project Navigator directory and will always have the file extension | |
DISCARD | Discards the Scan Results |
...
Analyzing the Scan Result
...
The most important statistical data about the scan are shown in the summary/overview, near the top of the window. Below the overview, select the various scan result details.
The search form, on the right side near the scan result detail selection, allows you to search for an ASCII text fragment across all web pages of the scan result. By default, the text fragment is searched for within all HTTP request headers, all HTTP response headers, and all HTTP response content data.
The remove URL form, which is shown below the scan result detail selection, allows you to remove specific sets of URLs from the scan result. The set of removed URLs is selected by the received MIME-type (examples: IMAGE/GIF, APPLICATION/PDF, ..), and linked with a logical AND condition with the received HTTP status code for the URLs (200, 302, ..), or with a Page Scanner error code, such as "network connection failed".
| Selects a specific MIME type. The input field is case-insensitive (upper- and lower-case characters are treated as identical). any means that all MIME types are selected, regardless of their value. none means that only URL calls whose HTTP response headers do NOT contain MIME type information (HTTP response header field "Content-Type" not set) are selected. |
| Selects an HTTP status code or a Page Scanner error code. |
Note: A few URLs with content types that were excluded by the scan input parameters may still flow into the scan result. You can use the "remove URL" form to clean up the scan result and to remove any unwanted URLs. The most common case is to remove PDF documents from the scan result.
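The selection logic of the remove URL form (MIME type AND status or error code, with the special values any and none) can be sketched as follows. This is an illustration under assumptions, not the product's code: the `UrlCall` record, the function names, and the representation of the status as text are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UrlCall:
    url: str
    mime_type: Optional[str]  # None when no Content-Type header was received
    status: str               # HTTP status code as text, or an error label


def matches(call: UrlCall, mime_filter: str, status_filter: str) -> bool:
    """Return True if the call is selected by both filters (logical AND)."""
    wanted = mime_filter.strip().lower()
    if wanted == "any":
        mime_ok = True
    elif wanted == "none":
        mime_ok = call.mime_type is None
    else:
        # The MIME type comparison is case-insensitive.
        mime_ok = call.mime_type is not None and call.mime_type.lower() == wanted
    return mime_ok and call.status == status_filter


def remove_urls(calls: list[UrlCall], mime_filter: str, status_filter: str) -> list[UrlCall]:
    """Return the scan result without the selected URL calls."""
    return [c for c in calls if not matches(c, mime_filter, status_filter)]
```

For example, calling `remove_urls(calls, "APPLICATION/PDF", "200")` on such a list would drop the successfully loaded PDF documents and keep everything else.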
Scan Result Details
...
The Scan Input Parameter displays all input parameters for the scan (without authentication data).
...
Scan Statistic displays some additional statistical data about the scan. Similar Web Pages is the number of web pages with duplicate content (same content but different URL path). Failed URL Calls is the number of URL calls that failed, either because no HTTP status code was available (no response received from a web server) or because the received HTTP status was an error code (400..599).
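The definition of a failed URL call given above can be written as a small predicate. This is only a sketch of the stated rule. The function name is hypothetical, and a missing response is represented here as `None`.

```python
from typing import Optional


def is_failed_url_call(status: Optional[int]) -> bool:
    """Return True if a URL call counts as failed under the rule stated above."""
    if status is None:
        # No response was received from the web server.
        return True
    # HTTP error codes are in the range 400..599.
    return 400 <= status <= 599
```

Under this rule, redirects such as 302 are not counted as failed.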
...
Non-Processed Web Servers displays a summary of all web servers which have been found in hyperlinks, but whose web pages or page elements have not been scanned. The number before the server name shows the number of times the hyperlink was ignored by Page Scanner.
...
Scan Result per Web Page displays all scanned web pages. The embedded content of a web page, such as images, is always displayed in a Web Browser Cached View. This means, for example, that a particular (unique) image is shown only once, inside the web page in which it was first referenced. Subsequent web pages will not show the same embedded content again. This behavior is similar to what a web browser does: it caches duplicate references across all web pages within a web surfing session.
More details about a specific URL call can be shown by clicking on the corresponding URL hyperlink.
...
Broken Links displays a list of all broken hyperlinks.
...
Duplicated Content displays a list of URLs with duplicate content (same content but different URL path).
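One common way to find URLs with the same content but a different URL path is to group the received response bodies by a hash value. The following sketch shows that idea. It is an assumption about how such a list could be produced, not a description of how Page Scanner detects duplicates, and the function name is hypothetical.

```python
import hashlib
from collections import defaultdict


def find_duplicate_content(pages: dict[str, bytes]) -> list[list[str]]:
    """Group URLs whose response bodies are byte-for-byte identical."""
    by_digest = defaultdict(list)
    for url, body in pages.items():
        by_digest[hashlib.sha256(body).hexdigest()].append(url)
    # Only groups with more than one URL are duplicates.
    return [urls for urls in by_digest.values() if len(urls) > 1]
```
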
Largest Web Pages displays a list of the largest web pages.
...
Slowest Web Pages displays a list of the slowest web pages.
...
Tip: you can click the bars to display the corresponding page details.
Converting a Scan Result into a Web Surfing Session
A Page Scanner result can be converted into a “normal” web surfing session, which can then be used to create a load test program.
...
Input Fields
| The filename of the web surfing session. You must enter a "simple" filename, with no path and no file extension. The file extension is always | |
| Allows you to select the scanned web pages which should flow into the web surfing session. “All Pages” means that all scanned web pages are selected. Alternatively, the option “Page Ranges” allows you to select one or several ranges of page numbers. If you use several ranges, they must be separated by commas. | Example: "1, 3-5, 7, 38-81" |
| Limits the number of URL calls that should flow into the web surfing session. |
|
Tip: Apica recommends not converting more than 1,000 URL calls into a web surfing session. |
| We recommend that you enter a short comment about the web surfing session. |
| Also loads the web surfing session into the transient memory area of the Main Menu, or into a scratch area of the Session Cutter. |
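The “Page Ranges” syntax from the table above (single page numbers and ranges, separated by commas) can be expanded into individual page numbers as sketched below. The function name is hypothetical, and the handling of invalid input is an assumption. How Page Scanner itself validates the field is not documented here.

```python
def parse_page_ranges(text: str) -> list[int]:
    """Expand a string such as "1, 3-5, 7" into a sorted list of page numbers."""
    pages = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue  # ignore empty entries such as a trailing comma
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"invalid range: {part!r}")
            pages.update(range(start, end + 1))  # ranges are inclusive
        else:
            pages.add(int(part))
    return sorted(pages)


if __name__ == "__main__":
    print(parse_page_ranges("1, 3-5, 7"))
```

With the example from the table, "1, 3-5, 7, 38-81", this sketch selects pages 1, 3, 4, 5, 7 and every page from 38 through 81.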
...
After the web surfing session has been stored, it will be automatically loaded into the Main Menu if the “Load Session into” checkbox was selected. After this, you can generate the load test program.
...