As a web developer, you may have wanted to generate a PDF file of a web page to share with your clients, use in presentations, or add as a new feature in your web app. Whatever your reason, Puppeteer, Google’s Node API for headless Chrome and Chromium, makes the task quite simple.
In this tutorial, we will see how to convert web pages into PDF with Puppeteer and Node.js. Let’s start with a quick introduction to what Puppeteer is.
What is Puppeteer, and why is it awesome?
In Google’s own words, Puppeteer is “a Node library which provides a high-level API to control headless Chrome or Chromium over the DevTools Protocol.”
What is a headless browser?
If you are unfamiliar with the term headless browser, it’s simply a browser without a GUI. In that sense, a headless browser is just another browser that understands how to render HTML web pages and process JavaScript. Due to the lack of a GUI, interactions with a headless browser take place over a command line.
Even though Puppeteer runs the browser in headless mode by default, you can configure it to use non-headless Chrome or Chromium.
What can you do with Puppeteer?
Puppeteer’s powerful browser capabilities make it a perfect candidate for web app testing and web scraping.
To name a few use cases where Puppeteer provides the right functionality for web developers:
- Generate PDFs and screenshots of web pages
- Automate form submission
- Scrape web pages
- Perform automated UI tests while keeping the test environment up to date
- Generate pre-rendered content for Single Page Applications (SPAs)
Set up the development environment
You can use Puppeteer on the backend and frontend to generate PDFs. In this tutorial, we are using a Node backend for the task.
Initialize NPM and set up the usual Express server to get started with the tutorial.
Make sure to install the Puppeteer npm package before you start, for example with npm install puppeteer.
Convert web pages to PDF
Now we get to the exciting part of the tutorial. With Puppeteer, we only need a few lines of code to convert web pages into PDF.
First, create a browser instance using Puppeteer’s launch function.
Then, we create a new page instance and visit the given page URL using Puppeteer.
We have set the waitUntil option to networkidle0. With networkidle0, Puppeteer considers navigation finished once there have been no network connections for at least 500 ms. It is a way to determine whether the site has finished loading. It’s not exact, and Puppeteer offers other options, but it is one of the most reliable for most cases.
Finally, we create the PDF from the crawled page content and save it to our device.
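The launch and navigation steps described above can be sketched as follows. This is a minimal sketch that assumes the puppeteer package is installed. The helper name openPage and its optional second parameter (a stand-in for the Puppeteer module, useful for trying the function without a real browser) are illustrative and not part of the original tutorial.

```javascript
// Minimal sketch: launch a browser, open a page, and wait for the network to go idle.
// `openPage` and the injectable `puppeteer` parameter are illustrative names.
// The default argument is only evaluated when no second argument is passed.
async function openPage(url, puppeteer = require('puppeteer')) {
  // Start headless Chrome/Chromium.
  const browser = await puppeteer.launch();

  // Open a new page (tab) in that browser.
  const page = await browser.newPage();

  // Resolve once there have been no network connections for at least 500 ms.
  await page.goto(url, { waitUntil: 'networkidle0' });

  // The caller is responsible for calling browser.close() when done.
  return { browser, page };
}
```

The function returns both the browser and the page so that the caller can keep working with the page and close the browser afterwards.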
The print to PDF function is quite complicated and allows for a lot of customization, which is fantastic. Here are some of the options we used:
- printBackground: When this option is set to true, Puppeteer prints any background colors or images you have used on the web page to the PDF.
- path: Path specifies where to save the generated PDF file. You can also keep the PDF in memory to avoid writing to disk.
- format: You can set the PDF format to one of the given options: Letter, A4, A3, A2, etc.
- margin: You can specify a margin for the generated PDF with this option.
When the PDF creation is over, close the browser connection with browser.close().
Build an API to generate and respond with PDFs from URLs
With the knowledge we have gathered so far, we can now create a new endpoint that accepts a URL as a query string and streams the generated PDF back to the client.
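Putting the PDF step and the cleanup together, a minimal sketch might look like the following. It assumes the puppeteer package is installed. The function name savePageAsPdf, the default output path, the margin values, and the injectable puppeteer parameter are illustrative choices, not taken from the original.

```javascript
// Minimal sketch: render a URL to a PDF file and always close the browser.
// `savePageAsPdf`, the default path, and the margin values are illustrative.
async function savePageAsPdf(url, outputPath = 'page.pdf', puppeteer = require('puppeteer')) {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: 'networkidle0' });

    await page.pdf({
      path: outputPath,      // where the PDF is written on disk
      format: 'A4',          // paper size
      printBackground: true, // include background colors and images
      margin: { top: '20px', right: '20px', bottom: '20px', left: '20px' },
    });

    return outputPath;
  } finally {
    // Close the browser even if navigation or rendering fails.
    await browser.close();
  }
}
```

Wrapping the work in try/finally is a design choice of this sketch: it keeps a failed navigation from leaving a browser process running.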
Here is the code:
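A minimal sketch of such an endpoint is shown below, assuming the express and puppeteer packages are installed. The factory name createPdfHandler, the error messages, and the injectable puppeteer parameter are illustrative. Note that this sketch sends the PDF as an in-memory buffer rather than as a true stream.

```javascript
// Minimal sketch: an Express-style handler that converts ?target=<url> to a PDF.
// `createPdfHandler` and the error messages are illustrative names and texts.
function createPdfHandler(puppeteer = require('puppeteer')) {
  return async function pdfHandler(req, res) {
    const target = req.query.target;
    if (!target) {
      res.status(400).send('Missing "target" query parameter');
      return;
    }

    let browser;
    try {
      browser = await puppeteer.launch();
      const page = await browser.newPage();
      await page.goto(target, { waitUntil: 'networkidle0' });

      // Without a `path` option, the PDF is returned in memory instead of written to disk.
      const pdf = await page.pdf({ format: 'A4', printBackground: true });

      res.set('Content-Type', 'application/pdf');
      res.send(Buffer.from(pdf));
    } catch (err) {
      res.status(500).send('Could not generate the PDF');
    } finally {
      if (browser) await browser.close();
    }
  };
}

// Wiring it into an Express app (assumes `npm install express`):
//   const express = require('express');
//   const app = express();
//   app.get('/pdf', createPdfHandler());
//   app.listen(3000);
```

Launching a new browser for every request keeps the sketch simple. A busier service would more likely reuse one browser instance across requests.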
If you start the server and visit the /pdf route with a target query param containing the URL we want to convert, the server will serve the generated PDF directly, without ever storing it on disk.
URL example: http://localhost:3000/pdf?target=https://google.com
This generates the following PDF, as shown in the image:

That’s it! You have completed the conversion of a web page to PDF. Wasn’t that easy?
As mentioned, Puppeteer offers many customization options, so make sure you play around with them to get different results.
Next, we can change the viewport size to capture websites under different resolutions.
Capture websites with different viewports
In the previously created PDF, we didn’t specify the viewport size for the web page Puppeteer is visiting, and instead used the default viewport size, 800×600px.
However, we can explicitly set the page’s viewport size before loading the page.
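Setting the viewport before navigation can be sketched as follows. This is a minimal sketch that assumes the puppeteer package is installed. The function name savePdfWithViewport, the 1680×1050 default size, and the injectable puppeteer parameter are illustrative.

```javascript
// Minimal sketch: set a custom viewport before loading the page, then create the PDF.
// `savePdfWithViewport` and the default 1680x1050 size are illustrative.
async function savePdfWithViewport(
  url,
  outputPath,
  viewport = { width: 1680, height: 1050 },
  puppeteer = require('puppeteer')
) {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();

    // Replace the default 800x600 viewport before navigating.
    await page.setViewport(viewport);

    await page.goto(url, { waitUntil: 'networkidle0' });
    await page.pdf({ path: outputPath, format: 'A4', printBackground: true });
    return outputPath;
  } finally {
    await browser.close();
  }
}
```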