View all our articles

Convert an HTML document to PDF from raw HTML in Python with aiohttp

In this guide, we'll show you how to convert an HTML document to PDF using Python and the aiohttp library. Providing the HTML document as raw as a lot of great advantages :

  • It avoids PDFShift to make a network request to fetch the HTML document.
  • It allows you to convert HTML documents that are not publicly accessible.
  • It improves the speed of the conversion

If, on top of that, you can provide the styles and javascript also inline in the document (<style> and <script> tags), it will also drastically reduce the duration of the conversion.

For converting a raw document, we use the same parameter as for the URL, which is the source parameter. All you have to do is provide an HTML document.

Here's an example:

import aiohttp, asyncio, json, base64

# You can get an API key at https://pdfshift.io
api_key = 'sk_xxxxxxxxxxxx'

params = {
    'source': '<html><body><h1>This will be a PDF document</h1><p>This will generate a basic PDF to show how you can add raw HTML</body></html>',
}

response = None
try:
    auth = base64.b64encode(
        'api:{}'.format(api_key).encode('utf-8')
    ).decode('utf-8')

    async with aiohttp.ClientSession() as session:
        async with session.post(
            'https://api.pdfshift.io/v3/convert/pdf',
            headers={'Authorization': f'Basic {auth}'},
            json=params
        ) as response:
            if response.status >= 400:
                raise Exception('Invalid request: {}'.format(await response.text()))

            with open('result.pdf', 'wb') as f:
                f.write(await response.read())

            print('The PDF document was generated and saved to result.pdf')
except asyncio.TimeoutError:
    raise Exception('The request took too long to process')
except aiohttp.ClientError as e:
    raise Exception(f'An error occurred: {e}')
except Exception as e:
    # We highly recommend you to handle exceptions. Often, PDFShift will provide you with a clear explanation about what happened.
    # Moreover, in case of error, no PDF are returned !
    raise Exception(f'An error occurred: {e}')

Providing the source parameter as a raw HTML document is by far the best recommended method to convert HTML documents in PDFShift as it reduce the amount of network requests, loading time of each documents (image, css, javascript, etc) and thus improve the conversion time.

For further details on the property and its usage, please refer to our dedicated documentation.

We hope this guide was helpful. If you have any questions or noticed any issues on the code above,
feel free to drop us a line.