#StackBounty: #node.js #automated-tests #puppeteer #end-to-end #puppeteer-cluster Is Puppeteer-Cluster Stealthy enough to pass bot tests?

Bounty: 100

I wanted to know if anyone using Puppeteer-Cluster could elaborate on how the Cluster.Launch({settings}) protects against sharing of cookies and web data between pages in different context.

Do the browser contexts here, actually block cookies and user-data is not shared or tracked? Browserless’ now infamous page seems to think no, here and that .launch({}) should be called on the task, not ahead of the queue.

So my question is, how do we know if puppeteer-cluster is sharing cookies / data between queued tasks? And what kind of options are in the library to lower the chances of being labeled a bot?

Setup: I am using page.authenticate with a proxy service, random user agent, and still getting blocked(403) occasionally by the site which I’m performing the test.

async function run() {
// Create a cluster with 2 workers
  const cluster = await Cluster.launch({
    concurrency: Cluster.CONCURRENCY_BROWSER, //Cluster.CONCURRENCY_PAGE,
    maxConcurrency: 2, //5, //25, //the number of chromes open
    monitor: false, //true,
    puppeteerOptions: {
      executablePath,
      args: [
        "--proxy-server=pro.proxy.net:2222",
        "--incognito",
        "--disable-gpu",
        "--disable-dev-shm-usage",
        "--disable-setuid-sandbox",
        "--no-first-run",
        "--no-sandbox",
        "--no-zygote"
      ],
      headless: false,
      sameDomainDelay: 1000,
      retryDelay: 3000,
      workerCreationDelay: 3000
    }
  });

   // Define a task 
      await cluster.task(async ({ page, data: url }) => {
         extract(url, page); //call the extract
      });

   //task
      const extract = async ({ page, data: dataJson }) => {
         page.setExtraHTTPHeaders({headers})

         await page.authenticate({
           username: proxy_user, 
           password: proxy_pass
         });

       //Randomized Delay
         await delay(2000 + (Math.floor(Math.random() * 998) + 1));

         const response = await page.goto(dataJson.Url);
 }

//loop over inputs, and queue them into cluster
  var dataJson = {
      url: url
      };

  cluster.queue(dataJson, extract);

 }

 // Shutdown after everything is done
 await cluster.idle();
 await cluster.close();

}


Get this bounty!!!

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.