How to Capture Multiple Web Pages at Once (Bulk URL Capture)

Page Vault’s Batch Web Pages feature allows you to make captures by submitting a list of URLs to our preservation system. A static PDF will be generated and saved to your portal for each URL submitted. 

Note that if a video URL (YouTube, Vimeo, etc.) is submitted, the Webpage option will take a screenshot of the page, but will not capture the video file on the page. To capture the video file, submit the URL separately as a video capture job.

Supported Pages

Batch Web Pages can capture a static PDF of most regular web URLs.

We do not recommend using Batch to capture web apps, messaging systems, or social media. URLs that require a login can be attempted using cookies (see instructions below), but will likely return a login page as the capture, especially if there is a CAPTCHA prompt. For these URLs, captures will need to be made using Page Vault Browser rather than Batch.

Some social media websites are not supported for Batch web page capture. URLs from these websites will not be attempted:

  • Facebook
  • Twitter
  • Instagram
  • Pinterest
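
Before submitting, it can help to screen your list for these sites locally. The Python sketch below is only an illustration. The domain names and the subdomain matching rule are assumptions, since this article does not specify how Page Vault identifies these URLs, and `is_unsupported` is a hypothetical helper, not part of any Page Vault tool.

```python
from urllib.parse import urlsplit

# Assumed domains for the sites listed above; Page Vault's actual
# matching rules are not documented in this article.
UNSUPPORTED_DOMAINS = ("facebook.com", "twitter.com", "instagram.com", "pinterest.com")


def is_unsupported(url: str) -> bool:
    """Return True if the URL's host is an assumed domain or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in UNSUPPORTED_DOMAINS)


urls = [
    "https://www.example.com/about",
    "https://www.facebook.com/somepage",
    "https://m.twitter.com/someuser",
]

# URLs that can go into a Batch job versus those to capture another way.
batch_ready = [u for u in urls if not is_unsupported(u)]
needs_browser = [u for u in urls if is_unsupported(u)]
```

Here `batch_ready` holds the URLs to paste into a Batch job, and `needs_browser` holds the ones to set aside.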

Capturing a full website with Batch

If you need to capture an entire website, you will need a list of all URLs on that website. Please see the Batch FAQs for more information on crawling websites for a full list of URLs.

How to capture web pages with Batch

  1. Gather your URLs: you will need the URL of each webpage you would like to capture. Each URL entered will be captured as a single PDF.

    The list of URLs can be from many different websites, or from the same site, and up to 500 URLs can be submitted in each Batch Job request.
  2. Access the Batch feature from the top of your Page Vault Portal.
  3. If you are not currently on the New Batch Job Request page, click on “New Job Request” in the top navigation bar.

  4. Click the “Web pages” button.
  5. Select the folder in your Page Vault Portal account where you would like the Batch job to be saved. An additional sub-folder will be created for each Batch job submitted; so, if you simply select My Folders, a sub-folder will be created with the Job Name (see next step).

    All resulting captures from this Batch job request will be created in the sub-folder. For example, if you submit a Batch Job with 90 unique URLs, those 90 webpage captures will be deposited into the sub-folder as the job is processed.
  6. Enter a job name for the capture. We recommend a unique name for each job, as a sub-folder with the job name will be created in your Portal. This will also appear in the Jobs History list for the job.
  7. Optional: If you would like the Case Matter ID to be associated with the captures, enter it here.
  8. Optional: Set the advanced options. See “Advanced Options” below for further details.
  9. Paste in your list of URLs. Please note:
    • One URL per line
    • Include the http:// or https:// in the URL
    • Webpage Batch jobs are limited to 500 URLs per job. Please split larger URL groups into smaller Batch jobs.
    • Your account may have up to 10 job requests that are not “done processing” at one time (see the explanation in the Job History help article).
  10. Click “Start Batch Capture”. If there are any unsupported URLs in your list, you will receive an error notice and the job will not submit. Make any changes suggested in the error notice and click “Start Batch Capture” again to submit.
  11. To view the status of the job, see the Job History tab.
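
The URL list rules in step 9 (one URL per line, http:// or https:// included, at most 500 URLs per job) can be checked before pasting. The following Python sketch shows one way to do that on your own computer. The function name `prepare_batches` and the sample URLs are hypothetical, and the check only looks at the form of each line. It does not predict whether Page Vault will accept or successfully capture a URL.

```python
from urllib.parse import urlsplit

MAX_URLS_PER_JOB = 500  # per-job limit stated in step 9
MAX_OPEN_JOBS = 10      # limit on job requests not yet "done processing"


def prepare_batches(raw_text: str):
    """Split pasted text into lists of at most 500 well-formed URLs, plus rejected lines."""
    valid, rejected = [], []
    for line in raw_text.splitlines():
        url = line.strip()
        if not url:
            continue  # ignore blank lines
        try:
            parts = urlsplit(url)
        except ValueError:
            rejected.append(url)  # malformed beyond parsing
            continue
        if parts.scheme in ("http", "https") and parts.netloc:
            valid.append(url)
        else:
            rejected.append(url)  # e.g. missing http:// or https://
    batches = [
        valid[i:i + MAX_URLS_PER_JOB]
        for i in range(0, len(valid), MAX_URLS_PER_JOB)
    ]
    return batches, rejected


raw = "https://example.com/a\nexample.com/b\n\nhttp://example.org/c\n"
batches, rejected = prepare_batches(raw)

for n, batch in enumerate(batches, start=1):
    print(f"Job {n}: {len(batch)} URLs")
if len(batches) > MAX_OPEN_JOBS:
    print("More than 10 jobs needed; submit the rest as earlier jobs finish.")
if rejected:
    print("Lines that must start with http:// or https://:", rejected)
```

Each list in `batches` can then be pasted into its own Batch job request, one URL per line.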

Using cookies to log in and remove pop-ups

If there are pop-ups on the page or the page requires a login, you can try submitting your browser’s cookies with the job to attempt to access the gated content. This method is not guaranteed to work.
Websites can validate a login or dismiss pop-ups in many ways. If the website’s method is not cookie-based, entering your cookies will not give you access to the logged-in page or close the pop-ups.

In the event that Batch cannot capture a URL correctly, please use the Page Vault Browser to navigate to the desired URL and log in or close pop-ups before capturing.

Refer to this help page to learn more about how and when to use Cookies.

Advanced Options

In addition to submitting cookies, the Advanced Options allow you to:

  • Set the screen width: This is how wide the browser will be when capturing. The default is 1200 pixels; if you make a capture and it looks squished or formatted incorrectly, try increasing the browser width.
  • Overlap: This is how much of the previous PDF page of the capture is repeated on the next PDF page, which helps confirm that no content is missing between pages.
  • Remove cover page: Click this if you do not want the standard Page Vault PDF cover page on your capture. Note that if you have cover pages turned off in your account settings (in the Portal) you do not need to select this option.
Updated on June 24, 2024