---
layout: default
title: Companies using Scrapy
permalink: companies/
---
{% assign stable = site.data.scrapy.stable %}
{% assign oldstable = site.data.scrapy.oldstable %}
{% assign devel = site.data.scrapy.development %}
<div class="container">
<h1>Companies that are using Scrapy</h1>
<p>Check who is using Scrapy to do business and make an impact on the world.
<br />
(<span class="highlight">Should you be on this list?</span>
Fork <a href="https://github.com/scrapy/scrapy.org">this GitHub repo</a>,
add yourself, and send a pull request!)</p>
<div class="companies-container">
<div class="company-box">
<a href="http://scrapinghub.com/">
<img src="../img/shub-logo.png" />
</a>
<hr />
<p>
<span class="highlight">Scrapinghub:</span>
From the <a href="http://scrapinghub.com/about">creators of Scrapy</a>,
Scrapinghub is a leading technology and professional services company,
providing successful web crawling and data processing solutions. </p>
</div>
<div class="company-box">
<a href="http://parsely.com/">
<img src="../img/01-parsely-logo.png" />
</a>
<hr />
<p>
<span class="highlight">Parsely:</span>
Uses Scrapy to scrape articles from hundreds of news sites.
Its CTO talks about Scrapy in <a href="https://speakerdeck.com/amontalenti/web-crawling-and-metadata-extraction-in-python">this talk.</a> </p>
</div>
<div class="company-box">
<a href="http://directemployersfoundation.org/">
<img src="../img/02-direct-employers-logo.png" />
</a>
<hr />
<p>
<span class="highlight">DirectEmployers Foundation:</span>
Uses Scrapy to scrape job postings from many websites,
which are published on the <a href="http://www.my.jobs/">My.jobs</a> site. </p>
</div>
<div class="company-box">
<a href="http://www.weotta.com/">
<img src="../img/03-weotta-logo.png" />
</a>
<hr />
<p>
<span class="highlight">Weotta:</span>
Uses Scrapy to crawl data for post-processing <a href="http://twitter.com/japerk/status/79304855486865408">(tweet)</a>. </p>
</div>
<div class="company-box">
<a href="http://www.flax.co.uk/">
<img src="../img/06-flax-logo.png" />
</a>
<hr />
<p>
<span class="highlight">Flax:</span>
Is a search consulting company based in Cambridge (UK) that uses Scrapy
to power the crawling needs of their solutions <a href="http://www.flax.co.uk/blog/2013/02/22/cambridge-search-meetup-a-night-of-crawling-and-scraping/">(blog post)</a>. </p>
</div>
<div class="company-box">
<a href="http://medialab.sciences-po.fr/">
<img src="../img/07-media-sciences-logo.png" />
</a>
<hr />
<p>
<span class="highlight">Médialab Sciences Po:</span>
Based in Paris, uses Scrapy to develop a web mining tool for social sciences researchers <a href="https://groups.google.com/d/topic/scrapy-users/ApfTGGokSKo/discussion">(announcement here)</a>. </p>
</div>
<div class="company-box">
<a href="http://www.lyst.com/">
<img src="../img/08-lyst-logo.png" />
</a>
<hr />
<p>
<span class="highlight">Lyst:</span>
Uses Scrapy to crawl and scrape the fashion websites they index. </p>
</div>
<div class="company-box">
<a href="https://scraperwiki.com/">
<img src="../img/09-scraper-wiki-logo.png" />
</a>
<hr />
<p>
<span class="highlight">ScraperWiki:</span>
Is a data services company based in Liverpool providing bespoke solutions
for data scraping and aggregation using Scrapy as a core technology <a href="http://blog.scraperwiki.com/2013/03/14/tools-of-the-trade/">(blog post)</a>. </p>
</div>
<div class="company-box">
<a href="http://www.data.gov.uk/">
<img src="../img/10-datagovuk-logo.png" />
</a>
<hr />
<p>
<span class="highlight">Data.Gov.Uk:</span>
UK government data aggregation site <a href="http://twitter.com/bfirsh/status/8025368963">(tweet)</a>. </p>
</div>
<div class="company-box">
<a href="http://www.oposicionesaldia.com/">
<img src="../img/11-oposiciones-logo.png" />
</a>
<hr />
<p>
<span class="highlight">Oposicionesaldia:</span>
Uses Scrapy to collect data from job postings, scholarships and free online courses in Spain. </p>
</div>
<div class="company-box">
<a href="http://www.iberstudios.com/">
<img src="../img/12-iberestudios-logo.png" />
</a>
<hr />
<p>
<span class="highlight">Iberestudios:</span>
Uses Scrapy to collect data on master's degrees, doctorates and postgraduate degrees in Spain. </p>
</div>
<div class="company-box">
<a href="http://www.usedaywatch.com/">
<img src="../img/13-daywatch-logo.png" />
</a>
<hr />
<p>
<span class="highlight">DayWatch:</span>
Is an Internet Market Intelligence tool that uses Scrapy to empower real-time business information retrieval from Daily Deal sites. </p>
</div>
<div class="company-box">
<a href="http://www.pricewiki.com/">
<img src="../img/14-pricewiki-logo.png" />
</a>
<hr />
<p>
<span class="highlight">PriceWiki:</span>
Uses Scrapy to scrape various websites for cost of living information. </p>
</div>
<div class="company-box">
<a href="http://dealshelve.com/">
<img src="../img/15-dealshelve-logo.png" />
</a>
<hr />
<p>
<span class="highlight">Dealshelve:</span>
Uses Scrapy to scrape daily deals from many sites. </p>
</div>
<div class="company-box">
<a href="http://www.careerbuilder.com/">
<img src="../img/16-careerbuilder-logo.png" />
</a>
<hr />
<p>
<span class="highlight">CareerBuilder:</span>
Uses Scrapy to scrape job offers from many sites. </p>
</div>
<div class="company-box">
<a href="http://grablab.org/">
<img src="../img/17-grablab-logo.png" />
</a>
<hr />
<p>
<span class="highlight">GrabLab:</span>
Is a Russian company which specializes in web scraping, data collection and web automation tasks. </p>
</div>
<div class="company-box">
<a href="http://www.simplespot.it/">
<img src="../img/18-simplespot-logo.png" />
</a>
<hr />
<p>
<span class="highlight">SimpleSpot:</span>
Uses Scrapy to build their geolocalized information service. </p>
</div>
<div class="company-box">
<a href="http://www.monetate.com/">
<img src="../img/19-monetate-logo.png" />
</a>
<hr />
<p>
<span class="highlight">Monetate:</span>
Uses Scrapy daily to collect catalog information from their clients. </p>
</div>
<div class="company-box">
<a href="http://www.clanslots.com/">
<img src="../img/20-clanslots-logo.png" />
</a>
<hr />
<p>
<span class="highlight">ClanSlots:</span>
Uses Scrapy daily to collect levels and plugins for games they host. </p>
</div>
<div class="company-box">
<a href="http://www.alisverisrobotu.com/">
<img src="../img/21-alisveris-logo.png" />
</a>
<hr />
<p>
<span class="highlight">Alışveriş Robotu:</span>
Is a Turkish price comparison site that uses Scrapy to collect data from hundreds of retailers every day. </p>
</div>
<div class="company-box">
<a href="http://www.tuvalabs.com/">
<img src="../img/22-tuvalabs-logo.png" />
</a>
<hr />
<p>
<span class="highlight">Tuvalabs:</span>
Uses Scrapy to scrape the web, find the most interesting articles about significant news stories
taking place around the world, and transform them into interactive math learning units. </p>
</div>
<div class="company-box">
<a href="http://www.alistek.com/">
<img src="../img/23-alistek-logo.png" />
</a>
<hr />
<p>
<span class="highlight">Alistek:</span>
Uses Scrapy to update partner-related information in their OpenERP-based back-office system
by scraping various data sources, both on the web and offline. </p>
</div>
<div class="company-box">
<a href="http://www.zhitongba.com/">
<img src="../img/24-Zhitongba-logo.png" />
</a>
<hr />
<p>
<span class="highlight">Zhitongba:</span>
Is a company trying to help people commute better within big cities in China.
They use Scrapy to scrape ride-sharing information from multiple sources. </p>
</div>
<div class="company-box">
<a href="http://www.offertazo.com/">
<img src="../img/25-offertazo-logo.png" />
</a>
<hr />
<p>
<span class="highlight">Offertazo:</span>
Uses Scrapy to scrape offers from many Spanish websites. </p>
</div>
<div class="company-box">
<a href="http://www.lionseek.com/">
<img src="../img/26-lionseek-logo.png" />
</a>
<hr />
<p>
<span class="highlight">Lionseek:</span>
Is a search engine that uses Scrapy to find items for sale in forums. </p>
</div>
<div class="company-box">
<a href="http://www.stilivo.com/">
<img src="../img/27-stilivo-logo.png" />
</a>
<hr />
<p>
<span class="highlight">Stilivo:</span>
Is a discovery shopping site that uses Scrapy to collect product information from e-commerce sites. </p>
</div>
<div class="company-box">
<a href="http://www.mapado.com/">
<img src="../img/28-mapado-logo.png" />
</a>
<hr />
<p>
<span class="highlight">Mapado:</span>
Uses Scrapy to find local activities on the web. </p>
</div>
<div class="company-box">
<a href="http://oony.com/">
<img src="../img/29-oony-logo.png" />
</a>
<hr />
<p>
<span class="highlight">Oony:</span>
Is a deal aggregator in more than 16 countries.
They currently have more than 500 Scrapy spiders running to gather their information. </p>
</div>
<div class="company-box">
<a href="http://woppu.my/">
<img src="../img/30-woppu-logo.png" />
</a>
<hr />
<p>
<span class="highlight">Woppu:</span>
Is using Scrapy to collect product information from online shopping malls in Malaysia and Singapore. </p>
</div>
<div class="company-box">
<a href="http://jobuzu.co.uk/">
<img src="../img/31-jobuzu-logo.png" />
</a>
<hr />
<p>
<span class="highlight">Jobuzu:</span>
Uses Scrapy to scrape over 100,000 jobs daily from UK job boards. </p>
</div>
<div class="company-box">
<a href="http://www.zopper.com/">
<img src="../img/32-zopper-logo.png" />
</a>
<hr />
<p>
<span class="highlight">Zopper:</span>
Uses Scrapy to crawl hundreds of ecommerce portals for classifying the products sold online. </p>
</div>
<div class="company-box">
<a href="http://wp-rocket.me/">
<img src="../img/34-wprocket-logo.png" />
</a>
<hr />
<p>
<span class="highlight">WP Rocket:</span>
Uses Scrapy to preload the cache of all customer sites. </p>
</div>
<div class="company-box">
<a href="http://competera.net/">
<img src="../img/36-competera-logo.png" />
</a>
<hr />
<p>
<span class="highlight">Competera:</span>
Is a price intelligence service that uses Scrapy to collect price, availability and promo data
from over a million product pages every day. </p>
</div>
<div class="company-box">
<a href="http://www.tarlabs.com/">
<img src="../img/37-tarlabs-logo.png" />
</a>
<hr />
<p>
<span class="highlight">Tarlabs:</span>
Uses Scrapy for information and text processing and automated testing. </p>
</div>
<div class="company-box">
<a href="http://www.jobijoba.com/">
<img src="../img/38-jobijoba-logo.png" />
</a>
<hr />
<p>
<span class="highlight">Jobijoba:</span>
Uses Scrapy to scrape job offers daily from many job boards.
Operates in France, several European countries, Russia, Mexico and Australia. </p>
</div>
<div class="company-box">
<a href="http://dataquarry.co.uk/">
<img src="../img/39-dataquarry-logo.png" />
</a>
<hr />
<p>
<span class="highlight">Data Quarry:</span>
Designs data scrapers specifically to address the needs of e-commerce users and offers custom Scrapy development. </p>
</div>
<div class="company-box">
<a href="http://utero.pe/">
<img src="../img/40-utero-logo.png" />
</a>
<hr />
<p>
<span class="highlight">El Útero de Marita:</span>
A leading Peruvian news blog that uses Scrapy to download public documents from governmental institutions in Peru for data journalism. </p>
</div>
<div class="company-box">
<a href="http://www.shimply.com/">
<img src="../img/41-shimply-logo.png" />
</a>
<hr />
<p>
<span class="highlight">Shimply:</span>
Building the world's largest online marketplace connecting sellers and buyers.
Parsing over 500 large and small sites daily <a href="https://twitter.com/rajatgarg79/status/508132805440581632">(tweet)</a>. </p>
</div>
<div class="company-box">
<a href="https://allclasses.com/">
<img src="../img/42-allclasses-logo.png" />
</a>
<hr />
<p>
<span class="highlight">Allclasses:</span>
Uses Scrapy to collect over 100,000 online and local classes from recreational through advanced education sites - providing more accessibility to education. </p>
</div>
<div class="company-box">
<a href="http://www.monkeylearn.com/">
<img src="../img/43-monkeylearn-logo.png" />
</a>
<hr />
<p>
<span class="highlight">MonkeyLearn:</span>
Is a cloud platform that allows any company to extract relevant data from unstructured text using machine learning.
It uses Scrapy to get data to train its algorithms. </p>
</div>
<div class="company-box">
<a href="http://neu.land/" title="neu.land GmbH"><img src="../img/44-neuland-logo.png" alt="neu.land GmbH Logo"/></a>
<hr />
<p><span class="highlight">neu.land GmbH:</span>
Uses Scrapy to crawl client websites and identify possible optimization measures,
with the aim of making websites faster, more accessible, and more user-friendly.</p>
</div>
<div class="company-box">
<a href="http://lavoweb.net/" title="LavoWeb">
<img src="../img/45-lavoweb-logo.png" alt="LavoWeb SAS Logo"/>
</a>
<hr />
<p><span class="highlight">LavoWeb:</span>
Uses Scrapy to crawl e-commerce websites and to perform SEO audits and Magento migrations.</p>
</div>
<div class="company-box">
<a href="http://sayonetech.com/" title="SayOne"><img src="../img/46-sayone-logo.png" alt="SayOne Logo"/></a>
<hr />
<p><span class="highlight">SayOne:</span>
Uses Scrapy to crawl data for their clients and thereby develop more customer-centric applications for them.</p>
</div>
<div class="company-box">
<a href="http://www.videdressing.com/">
<img src="../img/47-videdressing-logo.png" />
</a>
<hr />
<p>
<span class="highlight">Videdressing.com:</span>
Uses Scrapy to crawl and collect data on fashion products, clothes and accessories.</p>
</div>
<div class="company-box">
<a href="http://zimigo.com/">
<img src="../img/48-zimigo-logo.png" />
</a>
<hr />
<p>
<span class="highlight">Zimigo.com:</span>
Is a vertical search engine for car, real estate, job and product classified ads in more than 30 countries.
Uses Scrapy to crawl and collect ads on partner classified websites.</p>
</div>
<div class="company-box">
<a href="https://uphail.com/">
<img src="../img/49-uphail-logo.png" />
</a>
<hr />
<p>
<span class="highlight">Up Hail:</span>
Does for taxi services what so many websites have done for other travel costs — compares them in real time to find the best deal.
Uses Scrapy to crawl and scrape taxi and transportation sites to gather rates, coverage zones, and deals.</p>
</div>
<div class="company-box">
<a href="http://raystorm.place/">
<img src="../img/50-raystorm-logo.png" />
</a>
<hr />
<p>
<span class="highlight">raystorm:</span>
Is a data services and consulting company which built their Impulse crawling framework on top of Scrapy.</p>
</div>
</div>
</div>
<div class='fourth-row'>
<div class="container">
<div class="block-left">
<h2><span class="regular">Want to be part of this list?</span> <br /></h2>
<p>Fork <a href="https://github.com/scrapy/scrapy.org">this GitHub repo</a>,
add yourself, and send a pull request!</p>
</div>
<div class="block-right">
<a href="../support/">
<h2 class="float"><span class="regular">Commercial support?</span> <br /> Meet the Scrapy pros</h2>
<img src="../img/scrapy-pros.png" />
</a>
<p><span class="big-font">Be a part of the Community</span> <br />
<a href="../community/">Join our channels and collaborate!</a></p>
</div>
</div>
</div>