Jumia Web Scraper for Jomartt.com
- Node.js
- Puppeteer
- Bull
- CSV (Node Module)
- Stringify
- There are two primary scripts: consumer.js and producer.js.
- The producer.js script parses 'categories.csv' for the category links and pushes them onto the Queue.
- The consumer.js script takes each category link off the Queue, navigates to the product pages, extracts the relevant information and saves it to a CSV file.
- The example file 'example_categories.csv' shows the expected format for the categories. You can create 'categories.csv' manually, following that format, with the categories you want to scrape.
- There's also a categoriesProducer.js script that generates the 'categories.csv' file for you from the categories index page on jumia.com.
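
The producer step described above can be sketched roughly as follows. This is an illustration only, not the code in producer.js: the `name,link` column layout, the function names and the queue name are all assumptions (check 'example_categories.csv' for the real format), and the parser handles only simple, unquoted CSV.

```javascript
// Sketch of the producer step. The column layout (name,link) is an assumption;
// see example_categories.csv for the format the scripts actually expect.

// Parse simple CSV text (no quoted fields) into one object per row,
// keyed by the names in the header line.
function parseCategories(csvText) {
  const lines = csvText.split(/\r?\n/).filter((line) => line.trim() !== '');
  const [header, ...rows] = lines;
  const columns = header.split(',').map((c) => c.trim());
  return rows.map((row) => {
    const values = row.split(',').map((v) => v.trim());
    return Object.fromEntries(columns.map((col, i) => [col, values[i]]));
  });
}

// Push one job per category onto the queue and return how many were added.
// `queue` is any object with an async add(data) method, which matches a Bull queue:
//   const Queue = require('bull');
//   const queue = new Queue('categories'); // hypothetical queue name; needs a running Redis
async function enqueueCategories(queue, categories) {
  for (const category of categories) {
    await queue.add(category);
  }
  return categories.length;
}
```

Passing the queue in as a parameter keeps the sketch runnable without Redis; in a real run you would read 'categories.csv' with `fs.readFileSync` and pass a Bull queue instead of a stand-in.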
- Install Node.js
- Run
npm install
to install all the necessary packages
- Run
node categoriesProducer.js
to generate the 'categories.csv' file
- Run
node producer.js
to parse the 'categories.csv' file and add the categories to the Queue for processing
- Run
node consumer.js
to extract the information from the product pages and create the categories CSV files
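
For orientation, the per-category work that the consumer performs might look something like the sketch below. It is not the code in consumer.js: `makeCategoryHandler`, `scrapeCategory`, `saveRows`, `toCsvLine` and the product fields (`title`, `price`, `url`) are hypothetical names chosen for illustration, and the CSV line builder is a hand-written stand-in for what a stringify module would do.

```javascript
// Build a Bull-style job handler. Both arguments are hypothetical helpers
// supplied by the caller:
//   scrapeCategory(link) -> Promise of [{ title, price, url }, ...]
//   saveRows(name, rows) -> Promise, writes the rows for one category
function makeCategoryHandler(scrapeCategory, saveRows) {
  return async function handle(job) {
    // Assumes each job carries the { name, link } payload for one category.
    const { name, link } = job.data;
    const products = await scrapeCategory(link);
    const rows = products.map((p) => [p.title, p.price, p.url]);
    await saveRows(name, rows);
    return rows.length;
  };
}

// Turn one row into a CSV line, quoting fields that contain
// commas, double quotes or line breaks.
function toCsvLine(fields) {
  return fields
    .map((field) => {
      const text = String(field ?? '');
      return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
    })
    .join(',');
}
```

With a real Bull queue the handler would be registered with `queue.process(makeCategoryHandler(scrapeCategory, saveRows))`, and `scrapeCategory` would typically open the link in Puppeteer (`puppeteer.launch()`, `browser.newPage()`, `page.goto(link)`) and read the product details from the page.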