Puppeteer headless scrape → grid

When Firecrawl is not enough: full-browser scraping with form login and JavaScript rendering, followed by a push of the scraped rows to an Instadash grid.

#scraping #puppeteer #browser

What it builds

✓ A headless Chromium session with form-based login
✓ DOM extraction via page.evaluate()
✓ Failure-snapshot capture to diagnose selector drift
✓ Push to a private grid; credentials never leave the runtime
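
The failure-snapshot item can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the recipe's actual code: `withFailureSnapshot`, the `snapshots/` directory, and the injected `writeFile` parameter are hypothetical names chosen for illustration. It relies only on Puppeteer's `page.screenshot()` and `page.content()`.

```javascript
// Hypothetical helper: run one scraping step and, if it throws (for example
// because a selector no longer matches), save a screenshot and the page HTML
// before re-throwing the original error.
// `page` is a Puppeteer Page; `writeFile` is injected (e.g. fs.promises.writeFile).
async function withFailureSnapshot(page, label, step, writeFile) {
  try {
    return await step(page)
  } catch (err) {
    const base = `snapshots/${label}-${Date.now()}`
    try {
      await page.screenshot({ path: `${base}.png`, fullPage: true })
      await writeFile(`${base}.html`, await page.content())
    } catch (snapErr) {
      // A failed snapshot must not hide the original scraping error
      console.error('snapshot failed:', snapErr.message)
    }
    throw err
  }
}

// Usage (assumes the snapshots/ directory already exists):
//   import { writeFile } from 'node:fs/promises'
//   await withFailureSnapshot(page, 'login',
//     (p) => p.waitForSelector('input[name="acct"]', { timeout: 5000 }),
//     writeFile)
```

Saving the HTML alongside the screenshot makes it possible to compare the markup a selector was written against with what the site serves now.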

The key step

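import puppeteer from 'puppeteer'

// Assumes the surrounding setup has loaded .env into process.env
const { HN_USER, HN_PASS } = process.env

const browser = await puppeteer.launch({ headless: true })
const page    = await browser.newPage()

// Log in through the regular HTML form
await page.goto('https://news.ycombinator.com/login')
await page.type('input[name="acct"]', HN_USER)
await page.type('input[name="pw"]',   HN_PASS)
// Wait for the post-login navigation together with the click that triggers it
await Promise.all([
  page.waitForNavigation(),
  page.click('input[type="submit"]'),
])

// The callback runs in the page context; its return value must be serializable
const rows = await page.evaluate(() => { /* scrape DOM */ })
await fetch('https://instadash.io/ingest', { /* … */ })

await browser.close()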

Note: The snippet above is the core of the recipe. The full file (including setup, error handling, and the surrounding scaffolding) lives in the GitHub folder linked below; clone or copy it directly.
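
The two placeholders in the key step (`/* scrape DOM */` and the `fetch` options) could be filled in along these lines. Everything specific here is an assumption rather than the recipe's actual code: the `tr.athing` and `.titleline > a` selectors reflect Hacker News markup as commonly seen and may drift, and the JSON body shape, the bearer-token header, and the `INSTADASH_TOKEN` name are hypothetical, since the source does not show the ingest endpoint's contract.

```javascript
// Runs in the browser context when passed to page.evaluate(), so it may only
// use browser globals and must return JSON-serializable data.
// Selectors are assumptions about the target page's markup.
function extractRows() {
  return Array.from(document.querySelectorAll('tr.athing'), (tr) => {
    const link = tr.querySelector('.titleline > a')
    return {
      id: tr.id,
      title: link ? link.textContent.trim() : null,
      url: link ? link.href : null,
    }
  })
}

// Hypothetical request builder: the body shape and auth header are
// illustrative assumptions, not a documented Instadash contract.
function buildIngestRequest(rows, token) {
  return {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${token}`,
    },
    body: JSON.stringify({ rows }),
  }
}

// Usage inside the key step:
//   const rows = await page.evaluate(extractRows)
//   const res  = await fetch('https://instadash.io/ingest',
//     buildIngestRequest(rows, process.env.INSTADASH_TOKEN))
//   if (!res.ok) throw new Error(`ingest failed: ${res.status}`)
```

Keeping the extractor as a named, self-contained function makes it easy to exercise against a saved HTML snapshot when selectors stop matching.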

Run it

git clone https://github.com/instadashio/instadash-recipes
cd instadash-recipes/puppeteer-headless-scrape
npm install
cp .env.example .env       # fill in your keys
npm start
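
Since `npm start` depends on the values filled into `.env`, a fail-fast check at startup can prevent a confusing login failure later. A minimal sketch, assuming the recipe reads `HN_USER` and `HN_PASS` (the names used in the key step) from `process.env`; `requireEnv` is a hypothetical helper, and the real `.env.example` may list other keys.

```javascript
// Hypothetical helper: return the requested variables, or throw one error
// that names every variable that is unset or blank.
function requireEnv(names, env = process.env) {
  const missing = names.filter((name) => !env[name] || env[name].trim() === '')
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`)
  }
  return Object.fromEntries(names.map((name) => [name, env[name]]))
}

// Usage:
//   const { HN_USER, HN_PASS } = requireEnv(['HN_USER', 'HN_PASS'])
```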