Metadrive helps control information from different Internet resources (e.g. LinkedIn, Halfbakery, etc.). It provides one API to rule them all at the operating system filesystem level, by letting you mount and sync web resources as if they were disks (mounted filesystems) on your operating system. To gather information from a specific resource, a so-called driver must be written specifically for that resource. Some drivers already exist. For example:
- Halfbakery: halfbakery_driver
- Linkedin: linkedin_driver
- Metaculus: metaculus_driver
- HTH Worldwide: hthworld_driver
- Kompass: kompass_driver
- ResearchGate: researchgate_driver
- Versli Lietuva: verslilietuva_driver
Many drivers are still waiting to be implemented at [drivernet](https://github.com/drivernet). Studying Metadrive will help developers write drivers for the resources they need right now. A unified API is the killer feature of Metadrive: once drivers are written, they give a unified interface to the whole world.
sudo apt install virtualenv python3 python3-dev build-essential chromium-browser chromium-chromedriver pandoc
This guide installs Metadrive into a virtual environment, so create and activate one first, then run the following command:
pip install metadrive
You might also need to run pip install -U pandas, as the library is not up to date at this point.
Define a config in ~/.metadrive/config, like:
[GITHUB]
username = mindey
[PROXIES] # leave empty, if none
http = socks5h://127.0.0.1:9999
https = socks5h://127.0.0.1:9999
[GPG]
key = 5AFDB16B89805133F450688BDA580D1D5F5CC7AD
[DRIVERS]
auto_upgrade = False
[SELENIUM]
headless = False
[DRIVER_BACKENDS]
chrome = /usr/bin/chromedriver
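To make the structure of this file concrete, here is a minimal sketch that parses the same sections with Python's standard-library configparser. It is an illustration only and not how Metadrive itself loads its config; the helper name load_settings and the SAMPLE text are assumptions introduced for this example.

```python
import configparser

# Sample text mirroring the config above (values are placeholders).
SAMPLE = """
[GITHUB]
username = mindey

[PROXIES]
http = socks5h://127.0.0.1:9999
https = socks5h://127.0.0.1:9999

[DRIVERS]
auto_upgrade = False

[SELENIUM]
headless = False

[DRIVER_BACKENDS]
chrome = /usr/bin/chromedriver
"""


def load_settings(text):
    """Parse INI text and return a small dict of typed settings (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    # Keep only proxies that are actually set; empty values mean "no proxy".
    proxies = {key: value for key, value in parser["PROXIES"].items() if value}
    return {
        "username": parser.get("GITHUB", "username"),
        "proxies": proxies,
        "auto_upgrade": parser.getboolean("DRIVERS", "auto_upgrade"),
        "headless": parser.getboolean("SELENIUM", "headless"),
        "chromedriver": parser.get("DRIVER_BACKENDS", "chrome"),
    }


settings = load_settings(SAMPLE)
print(settings["username"], settings["headless"])
```

Reading the boolean options with getboolean avoids the common mistake of treating the string "False" as a truthy value.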
Note: by default, all sessions are stored at ~/.metadrive/sessions/, under a subfolder named after the underscored "metadrive" in use, e.g.:
- _selenium: the default session is at ~/.metadrive/sessions/_selenium/default
- _requests: the default session data is at ~/.metadrive/sessions/_requests/default
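The layout described in this note can be sketched with pathlib. The helper session_path is hypothetical, and the assumption that a non-default profile lives in a sibling folder named after the profile is inferred from the two default paths given here, not confirmed by the source.

```python
from pathlib import Path

# Root directory taken from the note above.
SESSIONS_ROOT = Path.home() / ".metadrive" / "sessions"


def session_path(backend, profile="default", root=SESSIONS_ROOT):
    """Return the expected session directory for a backend and profile.

    `backend` is the bare name ("selenium" or "requests"); the leading
    underscore is added here to match the folder names in the note.
    Hypothetical helper, not part of Metadrive.
    """
    return root / f"_{backend}" / profile


print(session_path("selenium"))
# Assumption: named profiles sit next to "default".
print(session_path("requests", profile="novel"))
```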
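import metadrive
# Selenium-backed drives: first with the default profile, then with explicit options.
drive = metadrive._selenium.get_drive(profile='default')
drive = metadrive._selenium.get_drive(headless=False, profile='default', proxies={'socksProxy': '127.0.0.1:7777'})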
Mount a site to ~/Sites or to a custom location:
drive halfbakery.com # defaults to /home/<user>/Sites
drive halfbakery.com /my/custom/location
The command above will ask you for your GitHub username. Once you have entered it, .metadrive/config is created in your home directory and the server starts. Here is an example of what .metadrive/config may look like:
[GITHUB]
username = mindey
[DRIVER_BACKENDS]
chrome = /usr/bin/chromedriver
[PROXIES]
http =
https =
[GPG]
key = 5AFDB16B89805133F450688BDA580D1D5F5CC7AD
import metadrive
# Examples: generic drives, not tied to a specific driver
drive = metadrive._requests.get_drive() # metadrive: 'requests', driver: None, profile: 'default'
drive = metadrive._requests.get_drive(profile='novel') # metadrive: 'requests', driver: None, profile: 'novel'
drive = metadrive._selenium.get_drive(headless=False) # metadrive: 'selenium', driver: None, profile: 'default'
# Examples: drives bound to a driver, optionally with a named profile
drive = metadrive.drives.get('halfbakery-driver') # metadrive: implied, driver: halfbakery-driver, profile: 'default'
drive = metadrive.drives.get('halfbakery-driver:SomeName') # metadrive: implied, driver: halfbakery-driver, profile: 'SomeName'
This installs halfbakery-driver (via pip install halfbakery-driver) and uses it. Each driver must have a __site_url__ attribute; this is how metadrive determines which driver is required to read which resource.
The documentation for Metadrive can be found at https://metadrive.readthedocs.io.
See AUTHORS.
metadrive is available under the Apache License, Version 2.0.