
Exporting only a specific set of pages in Python with urllib3

In this guide, we'll show you how to export only a specific set of pages when converting a document to PDF with PDFShift's API, using Python and the urllib3 library.

When you're converting a document, you might want to export only some of its pages. You can do this by setting the pages parameter on the request.

It behaves exactly like the pages option you see when you print a document from your browser.

It accepts the following patterns:

  • A number, such as 2. This prints page 2 only
  • A range, such as 2-4. This prints pages 2 to 4 (2, 3 and 4)
  • A list, such as 2,4,5,9. This prints pages 2, 4, 5 and 9

This way, you can export only the pages you really need and get the result you want in the PDF right away, without having to edit it afterwards.
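
If the pages value comes from user input, it can help to check it locally before sending the request. The sketch below is a minimal client-side check for the three patterns listed above. The function name pages_kind is hypothetical and not part of PDFShift's API, and the rules it enforces (page numbers start at 1, a range runs from a lower page to a higher one) are assumptions; the API remains the authority on what it accepts.

```python
import re

# Assumption: page numbers are positive integers starting at 1.
_PAGE = re.compile(r"[1-9][0-9]*")


def pages_kind(spec: str) -> str:
    """Return 'number', 'range' or 'list' for a pages value, or raise ValueError.

    Hypothetical helper for local validation only; it does not call PDFShift.
    """
    spec = spec.strip()
    if _PAGE.fullmatch(spec):
        return "number"

    if "-" in spec:
        start, _, end = spec.partition("-")
        # Assumption: a range goes from a lower page to a higher (or equal) one.
        if _PAGE.fullmatch(start) and _PAGE.fullmatch(end) and int(start) <= int(end):
            return "range"
        raise ValueError(f"Invalid page range: {spec!r}")

    parts = spec.split(",")
    if len(parts) > 1 and all(_PAGE.fullmatch(part.strip()) for part in parts):
        return "list"
    raise ValueError(f"Unrecognized pages value: {spec!r}")


for value in ("2", "2-4", "2,4,5,9"):
    print(value, "->", pages_kind(value))
```

This check rejects mixed values such as 2,4-6 only because the list above doesn't mention them. Relax it if the API turns out to accept them.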

Here's an example:

import urllib3, json, base64

# You can get an API key at https://pdfshift.io
api_key = 'sk_xxxxxxxxxxxx'

params = {
    'source': 'https://en.wikipedia.org/wiki/PDF',
    'pages': '2-4'
}

http = urllib3.PoolManager()
response = http.request(
    'POST',
    'https://api.pdfshift.io/v3/convert/pdf',
    headers={
        'Content-Type': 'application/json',
        'Authorization': 'Basic {}'.format(
            base64.b64encode('api:{}'.format(api_key).encode('utf-8')).decode('utf-8')
        )
    },
    body=json.dumps(params).encode('utf-8')
)

if response.status != 200:
    raise ValueError(f"Request failed with status code {response.status}: {response.data.decode('utf-8')}")

with open('result.pdf', 'wb') as f:
    # urllib3 exposes the response body as bytes on .data (there is no .content attribute)
    f.write(response.data)

print('The PDF document was generated and saved to result.pdf')

For further details on the pages parameter and its usage, please refer to our dedicated documentation.

We hope this guide was helpful. If you have any questions or notice any issues in the code above, feel free to drop us a line.