Vitalkrilov/mediawiki_downloader

Mediawiki downloader

This Python script scans a MediaWiki website for pages and downloads them. Note: it does not download media files such as photos or videos.

Download directory: the directory the script is executed from.

It first indexes all pages and creates a file "num2name.txt", which gives the original page name for each numbered XML document (see below). It then downloads every page in XML format with its full history. The files are named "0.xml", "1.xml", and so on.
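To inspect a downloaded file, something like the following minimal sketch may help. It assumes the XML files follow MediaWiki's standard export layout (a `page` element containing a `title` and one `revision` per history entry); this README does not confirm the exact layout, so treat that as an assumption. The helper names `local_name` and `summarize_export` are hypothetical and are not part of the script.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def summarize_export(xml_data):
    """Return a list of (page title, revision count) pairs.

    Assumes the MediaWiki export layout: <page> with a <title> child
    and one <revision> child per history entry.
    """
    root = ET.fromstring(xml_data)
    summary = []
    for page in root.iter():
        if local_name(page.tag) != "page":
            continue
        title = None
        revisions = 0
        for child in page:
            name = local_name(child.tag)
            if name == "title":
                title = child.text
            elif name == "revision":
                revisions += 1
        summary.append((title, revisions))
    return summary


# Tiny hand-written sample for illustration, not real output of the script.
SAMPLE = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page>
    <title>Main Page</title>
    <revision><timestamp>2020-01-01T00:00:00Z</timestamp><text>first</text></revision>
    <revision><timestamp>2020-01-02T00:00:00Z</timestamp><text>second</text></revision>
  </page>
</mediawiki>"""

if __name__ == "__main__":
    print(summarize_export(SAMPLE))
```

For a real file, reading it as bytes, for example `summarize_export(open("0.xml", "rb").read())`, lets the parser honor the encoding declared in the file.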

Tested on Linux, where it works. It should work on macOS and Windows too.

Requirements

  • Python

  • BeautifulSoup4

         pip install bs4

Usage

python mediawiki_downloader.py "url"

The URL should look like the example below (with 'http://' or 'https://' at the beginning).

Example:

python mediawiki_downloader.py "http://wiki.example.com"
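Before running the script, the URL format described above can be checked with a short sketch like this one. The function name `is_valid_wiki_url` is hypothetical; it only illustrates the scheme requirement and is not how the script itself validates its argument.

```python
from urllib.parse import urlparse


def is_valid_wiki_url(url):
    """Return True if the URL starts with http:// or https:// and names a host."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


if __name__ == "__main__":
    for candidate in ("http://wiki.example.com", "wiki.example.com"):
        print(candidate, is_valid_wiki_url(candidate))
```

A URL without a scheme, such as "wiki.example.com", fails this check because `urlparse` treats it as a path rather than a host.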
