= About SuperMario =
SuperMario is an advanced web crawler library written in Python. It
provides a number of methods for mining data from various kinds of sites.
== License ==
BSD License
See 'LICENSE' for details.
== Requirements ==
Platform: *nix-like system (Unix, Linux, Mac OS X, etc.)
Python: 2.5+
Storage: MongoDB
Other Python modules:
- simplejson
- BeautifulSoup
- eventlet
- PIL
- pycurl
- chardet
- feedparser
- mongokit
- templatemaker
- flickrapi
- pyyaml
- MySQLdb
- dateutil
== Features ==
+ robots.txt protocol support;
+ cache each URL's HTML;
+ normalize URLs;
+ convert all content to Unicode;
+ extract the main text from HTML using a specified link-threshold;
+ convert partial RSS feeds to full RSS feeds;
+ proxy list support;
+ persistent cookie support;
+ login support.
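
To make two of the features above concrete, here is a minimal sketch of URL
normalization and a robots.txt check. It uses only the Python 3 standard
library and does not show SuperMario's own API: the names normalize_url and
is_allowed are hypothetical, and the normalization rules (lowercase scheme and
host, drop default ports, userinfo and fragments, default the path to "/") are
an assumption about what "normalize" might cover.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser

# Default ports that can be dropped without changing the resource.
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_url(url):
    """Return a canonical form of url (hypothetical helper, not SuperMario's API)."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    # .hostname is already lowercased and excludes any userinfo or port.
    host = parts.hostname or ""
    netloc = host
    # Keep the port only when it differs from the scheme's default.
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(scheme):
        netloc = "%s:%d" % (host, parts.port)
    # An empty path and "/" name the same resource.
    path = parts.path or "/"
    # Drop the fragment: it is never sent to the server.
    return urlunsplit((scheme, netloc, path, parts.query, ""))


def is_allowed(robots_txt, user_agent, url):
    """Check url against the text of a robots.txt file (hypothetical helper)."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(normalize_url("HTTP://Example.COM:80/a/b?x=1#frag"))
    rules = "User-agent: *\nDisallow: /private/\n"
    print(is_allowed(rules, "SuperMario", "http://example.com/private/x"))
```

Run as a script, this prints http://example.com/a/b?x=1 followed by False.
A crawler would typically normalize a URL before using it as a cache key, so
that equivalent spellings of the same address share one cached page.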