#StackBounty: #python #selenium #web-scraping #scrapy #web-crawler how to run a spider multiple times with different inputs

Bounty: 50

I’m trying to scrape information from different sites about some products. Here is the structure of my program:

import scrapy
from scrapy.crawler import CrawlerProcess

product_list = ['iPad', 'iPhone', 'AirPods', ...]

class tmallSpider(scrapy.Spider):
    # ...
    def parse(self, response):
        self.driver.find_element_by_id('searchKeywords').send_keys(product_list[a])
        # ...

class jdSpider(scrapy.Spider):
    # ...
    def parse(self, response):
        self.driver.find_element_by_id('searchKeywords').send_keys(product_list[a])
        # ...

if __name__ == '__main__':

    for a in range(len(product_list)):
        process = CrawlerProcess(settings={
            "FEEDS": {
                "itemtmall.csv": {"format": "csv",
                                  'fields': ['product_name_tmall', 'product_price_tmall', 'product_discount_tmall']},
                "itemjd.csv": {"format": "csv",
                               'fields': ['product_name_jd', 'product_price_jd', 'product_discount_jd']},
            },
        })

        process.crawl(tmallSpider)
        process.crawl(jdSpider)
        process.start()

Basically, I want to run all spiders for every input in product_list. Right now, my program only runs through all spiders once (in this case, for iPad); then a ReactorNotRestartable error is raised and the program terminates. Does anybody know how to fix this?
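
For context, the pattern shown in the Scrapy docs for running multiple spiders in the same process is to schedule every crawl on a single CrawlerProcess and call start() only once, since the Twisted reactor cannot be restarted. Below is a minimal sketch of how that might look with my code; it reuses tmallSpider, jdSpider and product_list from above, my_settings is just a placeholder name for the FEEDS dictionary, and keyword is an assumed spider argument (Scrapy turns extra keyword arguments to process.crawl() into attributes on the spider instance):

from scrapy.crawler import CrawlerProcess

# my_settings = the FEEDS dictionary shown above (placeholder name)
process = CrawlerProcess(settings=my_settings)

for keyword in product_list:
    # Extra keyword arguments become spider attributes, so each spider
    # could call send_keys(self.keyword) instead of product_list[a].
    process.crawl(tmallSpider, keyword=keyword)
    process.crawl(jdSpider, keyword=keyword)

process.start()  # called exactly once; the reactor is never restarted

The idea is that start() blocks until every scheduled crawl finishes, so the per-keyword loop has to happen before it rather than around it.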
Also, my overall goal is to run the spiders multiple times; the input doesn't necessarily have to be a list. It could be a CSV file or something else. Any suggestions would be appreciated!
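
In case it matters for an answer, here is a minimal sketch of how the keywords could come from a CSV file instead of a hard-coded list; products.csv and its keyword column are made-up names for illustration:

import csv

# Hypothetical products.csv with a header row and a "keyword" column:
#   keyword
#   iPad
#   iPhone
with open('products.csv', newline='', encoding='utf-8') as f:
    product_list = [row['keyword'] for row in csv.DictReader(f)]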

