Cheerio in Production

CodeBreaker edited this page Sep 12, 2024 · 72 revisions

Companies using cheerio in production

  • Weiji Lab srl uses cheerio to process EPUB and other HTML-based editorial content.
  • The Old Net uses Cheerio to remove modern JavaScript that would break vintage computer web browsers.
  • Sysco Labs uses Cheerio to parse scraped websites
  • Cre8teGPT uses cheerio for website, article, and sitemap generation.
  • Scraper API uses Cheerio to parse scraped websites
  • AfterShip uses Cheerio to parse courier tracking results.
  • Walmart uses Cheerio for the server-side rendering of its mobile website
  • Cloudup uses Cheerio to provide a better viewing experience for certain websites
  • Kimono uses Cheerio to parse the scraped websites
  • Courseoff uses cheerio to scrape college course catalogs and schedule listings
  • Iframely uses cheerio to parse specific domains and generic patterns, such as microformats or oEmbed. It also uses cheerio to analyze detected embed codes, make them responsive, check SSL support, etc.
  • Higher Ed Careers Canada uses Cheerio to automatically verify details about job postings and to add "nofollow" to links in submitted HTML.
  • Workray uses Cheerio to parse Job Alert emails from Job Boards and extract the job listings, or extract application information from application confirmation emails.
  • BotFactory uses Cheerio to parse wishlists from Amazon and AliExpress, as well as courier tracking results
  • ZenLocator uses Cheerio to strip out JavaScript from customer templates, and in pre-rendering custom dashboard controls.
  • GitHub Trending API uses Cheerio to scrape GitHub trending projects.
  • InspectorHub uses Cheerio to parse top e-commerce products
  • Vingle uses Cheerio to detect suspicious content, and to parse XML-based content data.
  • OWEB uses Cheerio to parse scraped websites.
  • Affiliate Stats Tracker uses Cheerio to extract data scraped using Puppeteer.
  • REVOL uses Cheerio to parse scraped websites
  • Remotehour uses Cheerio to modify meta attributes
  • Talon uses Cheerio to parse insurance carrier and health plan portals.
  • BikeSleepBike uses Cheerio to parse blog posts about bikepacking and bicycle travel.
  • JishinAlert uses Cheerio to parse scraped disaster prevention data from government sources such as NHK, National Research Institute for Earth Science and Disaster Resilience (NIED), and the Japan Meteorological Agency.
  • Tibbo uses Cheerio to parse XML files and output HTML topics for its documentation platform.
  • ScrapeNinja uses Cheerio to extract JSON data from scraped web pages (including Puppeteer-rendered pages), and provides a free Cheerio Sandbox for quick Cheerio syntax testing in the browser (think regex101 for cheerio).
  • The Distance App Developers uses Cheerio to pull clients' web data into their app solutions (via AWS Lambda).
  • promptmate.io uses Cheerio to enrich LLMs such as ChatGPT with structured data.

Libraries Built with cheerio

  • x-ray is a web scraper
  • Backbone.LayoutManager
  • breakdance is an HTML-to-Markdown converter that uses cheerio to parse the HTML
  • itteco/iframely is the library behind Iframely
  • fruit-loops is Walmart's isomorphic JavaScript environment
  • CheerioBin runs Cheerio and jQuery commands simultaneously
  • AkashaCMS is a content management system which produces static HTML files. It uses Cheerio extensively for DOM manipulation of generated HTML pages before writing to disk.
  • Postxml is a tool for transforming HTML/XML with plugins based on cheerio.
  • CheerioGetCssSelector extends cheerio to get a unique CSS selector for any cheerio element.
  • jsonframe-cheerio brings a crazy simple way to input/output JSON-structured data
  • temme is a concise and convenient jQuery-like selector for Node crawlers.
  • Jason the Miner harvests data at the <html> mine. Cheerio enables Jason to express simple yet powerful schema definitions for DOM element selection, matching and extraction.
  • SeaSite is a new approach to simple static website generation using jQuery-like selectors. Convenient predefined plugins and tasks solve everyday problems in complex website building.
  • icsd-scraper retrieves details about professors and courses from the University of the Aegean's ICSD department to help students with their academic projects. It uses Cheerio for DOM manipulation and data collection.
  • Tumblweed is a fully cross-platform Tumblr blog downloader, using Cheerio to scrape posts to extract embedded media for download.
  • Typi is a scraper that uses a headless browser along with cheerio to scrape anything that can be viewed by a real user
  • @luxdamore/nuxt-prune-html is a Nuxt module that prunes HTML before sending it to the browser by removing elements that match the given CSS selector(s). It is useful for boosting performance with dynamic rendering: bots are served a different HTML with all the scripts removed.
  • scrapio is a very simple JSON-template-based scraper, using cheerio.
  • iam-floyd uses Cheerio to generate code from the AWS documentation
  • cairn is an npm package and CLI tool for saving a web page as a single HTML file.
  • @get-set-fetch/scraper - web scraper supporting multiple databases and headless clients
  • ees-announcements-scraper-v2 provides effortless access to the most recent Engineering - Economics Master's Programme announcements from the National Technical University of Athens site.
  • Fredy constantly scrapes new listings on real estate pages like Immoscout or Immowelt and sends new results to you, so that you can focus on more important things in life ;).

Feel free to add your company or library using cheerio!