


Web scraping and automation with JavaScript has evolved a lot in recent years. Generally, there are two methods of accessing and parsing web pages. The first method uses packages such as Axios: it sends a GET request directly to the web page and receives the HTML content, which can then be parsed using packages like Cheerio. The second uses a headless browser, and in this tutorial we will cover how to do it with Google Puppeteer.
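To illustrate the first method, here is a minimal sketch, assuming axios and cheerio are installed and using a placeholder URL and selector:

```js
// Sketch of the request-and-parse approach.
// Assumes `npm install axios cheerio`; the URL and selector are placeholders.
const axios = require('axios');
const cheerio = require('cheerio');

(async () => {
  const { data: html } = await axios.get('https://example.com');
  const $ = cheerio.load(html); // parse the returned HTML
  console.log($('title').text()); // e.g. read the page title
})();
```

This works well for static pages, but it never executes the page's JavaScript, which is where a headless browser comes in.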

Since version 59, headless Chrome has been available via the chrome command (note: you may need to add an alias to use the command). To get the DOM contents of a page, for example, we can use the --dump-dom flag. To take a screenshot, we can use the --screenshot flag instead: chrome --headless --disable-gpu --screenshot. Puppeteer brings the power of headless Chrome to a simple Node API, enabling us to use headless Chrome almost anywhere.
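For instance, a sketch of the Puppeteer equivalent of --dump-dom might look like this (the URL is a placeholder):

```js
// Sketch: print a page's rendered DOM, roughly what `--dump-dom` does.
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://example.com'); // placeholder URL
  console.log(await page.content()); // serialized HTML of the rendered page
  await browser.close();
})();
```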

In this article, I will cover the first part of this solution, which is using a headless browser to create a screenshot of the page.

To take a screenshot using Puppeteer, we have to go through four steps: launch a browser, open a new page, navigate to the URL, and take the screenshot. Here is how that looks:

```js
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://example.com'); // placeholder URL
  await page.screenshot({ path: 'screenshot.png' });
  await browser.close();
})();
```

This will take a screenshot of the page and save a PNG file to the current directory with the name screenshot.png.

Taking a screenshot for the embed

For my use case, I needed to take a screenshot of the embed and save it as binary data, to later be uploaded to Cloudinary (affiliate link). To do that, I had to make a few modifications to the default example from above.

Defining the browser viewport

By default, Puppeteer will use an 800px by 600px viewport size for the browser. To change this, we can manually set the height and width of the viewport we prefer in the options passed to puppeteer.launch():

```js
const browser = await puppeteer.launch({ defaultViewport: { width: 1280, height: 800 } }); // example dimensions
```
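Putting those modifications together, here is a rough sketch rather than the original code: it assumes the embed can be located with a hypothetical '.embed' selector, and it relies on the fact that screenshot() resolves to a Buffer when no path option is given:

```js
// Sketch: capture the embed as binary data (a Buffer) instead of a file.
// The URL, the '.embed' selector, and the viewport size are assumptions.
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({
    defaultViewport: { width: 1280, height: 800 },
  });
  const page = await browser.newPage();
  await page.goto('https://example.com/embed'); // placeholder URL
  const embed = await page.$('.embed'); // locate the embed element
  const buffer = await embed.screenshot(); // Buffer, ready to upload (e.g. to Cloudinary)
  await browser.close();
  return buffer;
})();
```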
