Running Python script

Hyojoon Kim edited this page Jul 28, 2017 · 18 revisions

Note: You probably want to run this on the web server where you plan to host the graphs, charts, and tables.

  1. Git clone the globus-stats source

    1. $ mkdir ~/globus
    2. $ cd ~/globus
    3. $ git clone https://github.com/PrincetonUniversity/globus-stats
  2. Make sure you have Python and the two required libraries (pytz and the Globus Python SDK)

    1. Install pytz
      • $ pip install pytz
      • Or build from source.
        1. Download *.zip source from here.
        2. Unzip. Enter created directory.
        3. $ python setup.py install --user (if you have sudo, put sudo in front and remove --user)
    2. Install Globus Python SDK. Follow installation guide here.
      • Generally, install with pip if possible.
      • If pip is not available, or you do not have sudo or root access, build from source:
        1. $ git clone https://github.com/globus/globus-sdk-python.git
        2. $ cd globus-sdk-python
        3. $ python setup.py install --user (if you have sudo, put sudo in front and remove --user)
    3. Test by running the script:
      • $ python get_globus_data.py --help
      • There should be no failures or errors, just a print of how to use the script.
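If the test run fails with an import error, a quick check like the following can show which of the two libraries is missing. This is a minimal sketch, not part of globus-stats: the helper name `missing_modules` is made up for illustration, and it assumes Python 3.4 or later (for `importlib.util.find_spec`). `pytz` and `globus_sdk` are the import names of the two packages installed above.

```python
import importlib.util


def missing_modules(names):
    """Return the module names that cannot be found by the current interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Import names of the two libraries from step 2 (hypothetical helper above).
    missing = missing_modules(["pytz", "globus_sdk"])
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("Both libraries are importable.")
```

Run it with the same `python` interpreter you use for get_globus_data.py, since a `--user` install is only visible to the interpreter it was installed for.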
  3. Set up the configuration file and create the client secret file

    1. Copy the template and rename the copy.
      1. $ cd ~/globus/globus-stats
      2. $ cp globus_config_template.json globus_config.json
    2. Fill in “globus_config.json”.
      • A detailed explanation is in the template JSON file, as comments.
      • Basically, you need to fill in the timezone, client ID, client secret, the user IDs ("owner_id" in Globus) you want to exclude, the Globus endpoint IDs (UUIDs) you administer, and the dates to exclude.
      • You can find a Globus endpoint's UUID in that endpoint's "Overview" tab.
    3. Create the client secret file. The filename should be "client.secret".
  4. Run script

    1. Decide where to store the output CSV files and create that directory, e.g., “~/data”.
      • $ mkdir ~/data/
    2. Specify the configuration file (-c option), the output directory (-o option), and whether to pull new raw data from Globus.org (-n option).
      1. $ cd ~/globus/globus-stats
      2. $ python get_globus_data.py -c globus_config.json -o ~/data/ -n
    3. Some information should be printed to standard output.
    4. See here for more information on running the script and its options. ("Basic usage of the script itself")
  5. Check data

    1. Check that a bunch of CSV files (and some additional files) are stored in the data directory you chose.
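The data check can also be done from Python. The following is a minimal sketch, assuming the output directory is “~/data” as in the example above. The helper name `list_csv_files` is hypothetical and not part of globus-stats, and the sketch makes no assumption about what the CSV files are called.

```python
from pathlib import Path


def list_csv_files(directory):
    """Return the sorted names of the CSV files directly inside `directory`."""
    return sorted(p.name for p in Path(directory).expanduser().glob("*.csv"))


if __name__ == "__main__":
    # "~/data" is the example output directory passed with the -o option.
    names = list_csv_files("~/data")
    print(f"{len(names)} CSV file(s) found")
    for name in names:
        print(" ", name)
```

An empty listing after a run with the -n option suggests that the script wrote to a different directory or stopped before producing output.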