gsheets

gsheets is a small wrapper around the Google Sheets API (v4) to provide more convenient access to Google Sheets from Python scripts.

Turn on the API, download an OAuth client ID as a JSON file, and create a Sheets object from it. Use its index access (__getitem__) to retrieve SpreadSheet objects by their id, or use .get() with a sheet URL. Iterate over the Sheets object to get all spreadsheets, or fetch spreadsheets by title with the .find() and .findall() methods.

SpreadSheet objects are collections of WorkSheets, which provide access to the cell values via spreadsheet coordinates/slices (e.g. ws['A1']) and zero-based cell position (e.g. ws.at(0, 1)).
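The relationship between the two addressing styles can be illustrated with a small, self-contained sketch. It is not taken from the gsheets source: a1_to_rowcol is a hypothetical helper, and the (row, col) ordering is an assumption based on the keyword names used with .at() in the Quickstart.

```python
import re


def a1_to_rowcol(coord):
    """Convert a coordinate such as 'B3' to a zero-based (row, col) tuple.

    Hypothetical helper for illustration only, not part of gsheets.
    """
    match = re.fullmatch(r'([A-Za-z]+)([1-9][0-9]*)', coord)
    if match is None:
        raise ValueError(f'not a valid cell coordinate: {coord!r}')
    letters, digits = match.groups()
    col = 0
    for char in letters.upper():
        # Column letters form a bijective base-26 number: A=1, ..., Z=26, AA=27.
        col = col * 26 + (ord(char) - ord('A') + 1)
    # Spreadsheet rows and columns are one-based, positions are zero-based.
    return int(digits) - 1, col - 1


print(a1_to_rowcol('A1'))    # (0, 0)
print(a1_to_rowcol('B1'))    # (0, 1)
print(a1_to_rowcol('AA10'))  # (9, 26)
```

Under these assumptions, ws['B1'] and ws.at(0, 1) would address the same cell.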

Save WorkSheets (or all worksheets of a SpreadSheet) as CSV files with the .to_csv() method. Create pandas.DataFrames from worksheets with the .to_frame() method.

Installation

This package runs under Python 3.9+. Use pip to install it:

$ pip install gsheets

This will also install google-api-python-client and its dependencies, notably httplib2 and oauth2client, as required dependencies.

Quickstart

Log into the Google Developers Console with the Google account whose spreadsheets you want to access. Create (or select) a project and enable the Drive API and Sheets API (under Google Apps APIs).

Go to the Credentials for your project and create New credentials > OAuth client ID > of type Other. In the list of your OAuth 2.0 client IDs click Download JSON for the Client ID you just created. Save the file as client_secrets.json in your home directory (user directory). Another file, named storage.json in this example, will be created after successful authorization to cache OAuth data.

On your first usage of gsheets with this file (holding the client secrets), your web browser will be opened, asking you to log in with your Google account and to authorize this client for read access to all of the account's Google Drive files and Google Sheets.
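Before the first authorization it can be useful to confirm that the downloaded file is where you expect it and looks like an OAuth client ID download. The following is a minimal sketch, not part of gsheets: check_client_secrets is a hypothetical helper, and the expected JSON layout (a single top-level 'installed' or 'web' key containing 'client_id' and 'client_secret') is an assumption about the file Google provides.

```python
import json
import pathlib


def check_client_secrets(path='~/client_secrets.json'):
    """Return the expanded path if the file looks like an OAuth client ID file.

    Hypothetical helper for illustration only, not part of gsheets.
    """
    resolved = pathlib.Path(path).expanduser()
    if not resolved.is_file():
        raise FileNotFoundError(f'client secrets not found: {resolved}')
    with resolved.open(encoding='utf-8') as f:
        data = json.load(f)
    if isinstance(data, dict):
        # Assumed layout: {'installed': {...}} or {'web': {...}}.
        for kind in ('installed', 'web'):
            section = data.get(kind)
            if (isinstance(section, dict)
                    and {'client_id', 'client_secret'} <= section.keys()):
                return resolved
    raise ValueError(f'{resolved} does not look like an OAuth client ID file')
```

If the check passes, the returned path can be given to Sheets.from_files() as its first argument.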

Create a sheets object:

>>> from gsheets import Sheets

>>> sheets = Sheets.from_files('~/client_secrets.json', '~/storage.json')
>>> sheets  #doctest: +ELLIPSIS
<gsheets.api.Sheets object at 0x...>

Fetch a spreadsheet by id or url:

# id only
>>> sheets['1dR13B3Wi_KJGUJQ0BZa2frLAVxhZnbz0hpwCcWSvb20']
<SpreadSheet 1dR13...20 'Spam'>

# id or url
>>> url = 'https://docs.google.com/spreadsheets/d/1dR13B3Wi_KJGUJQ0BZa2frLAVxhZnbz0hpwCcWSvb20'
>>> s = sheets.get(url)
>>> s
<SpreadSheet 1dR13...20 'Spam'>
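The id is the path segment that follows /spreadsheets/d/ in the URL. As a rough illustration of that relationship (not the library's own implementation), a hypothetical spreadsheet_id helper could extract it with a regular expression. The URL shape and the id character set are assumptions based on the example above.

```python
import re

# Assumed URL shape: .../spreadsheets/d/<id>[/...]
# Assumed id alphabet: ASCII letters, digits, '-' and '_'.
_ID_IN_URL = re.compile(r'/spreadsheets/d/([A-Za-z0-9_-]+)')


def spreadsheet_id(id_or_url):
    """Return the id contained in a spreadsheet URL, or the input unchanged.

    Hypothetical helper for illustration only, not part of gsheets.
    """
    match = _ID_IN_URL.search(id_or_url)
    return match.group(1) if match else id_or_url


url = 'https://docs.google.com/spreadsheets/d/1dR13B3Wi_KJGUJQ0BZa2frLAVxhZnbz0hpwCcWSvb20'
print(spreadsheet_id(url))
# 1dR13B3Wi_KJGUJQ0BZa2frLAVxhZnbz0hpwCcWSvb20
```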

Access worksheets and their values:

# first worksheet with title
>>> s.find('Tabellenblatt2')
<WorkSheet 1747240182 'Tabellenblatt2' (10x2)>

# worksheet by position, cell value by index
>>> s.sheets[0]['A1']
'spam'

# worksheet by id, cell value by position
>>> s[1747240182].at(row=1, col=1)
1

Dump a worksheet to a CSV file:

>>> s.sheets[1].to_csv('Spam.csv', encoding='utf-8', dialect='excel')

Dump all worksheets to CSV files (deriving filenames from spreadsheet and worksheet title):

>>> csv_name = lambda infos: '%(title)s - %(sheet)s.csv' % infos
>>> s.to_csv(make_filename=csv_name)
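Spreadsheet and worksheet titles may contain characters that are not allowed in file names on some platforms. The sketch below is a possible variant of the csv_name callable above that replaces such characters. It assumes only what the lambda shows, namely that the argument is a mapping with 'title' and 'sheet' keys; the set of replaced characters is an illustrative choice.

```python
import re


def csv_name(infos):
    """Build '<title> - <sheet>.csv', replacing characters unsafe in file names."""
    name = '%(title)s - %(sheet)s.csv' % infos
    # Replace characters that are reserved on Windows or act as path separators.
    return re.sub(r'[\\/:*?"<>|]', '_', name)


print(csv_name({'title': 'Spam', 'sheet': 'Tabellenblatt2'}))
# Spam - Tabellenblatt2.csv
print(csv_name({'title': 'Q1/Q2', 'sheet': 'a:b'}))
# Q1_Q2 - a_b.csv
```

A function defined this way can be passed as make_filename in the same manner as the lambda.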

Load the worksheet data into a pandas DataFrame (requires pandas):

>>> s.find('Tabellenblatt2').to_frame(index_col='spam')
      eggs
spam
spam  eggs
...

WorkSheet.to_frame() passes its keyword arguments on to pandas.read_csv().

See also

  • gsheets.py – self-contained script to dump all worksheets of a Google Spreadsheet to CSV or convert any subsheet to a pandas DataFrame (Python 2 prototype for this library)
  • gspread – Google Spreadsheets Python API (more mature and featureful Python wrapper, updated to API v4)
  • example Jupyter notebook using gspread to fetch a sheet into a pandas DataFrame
  • df2gspread – Transfer data between Google Spreadsheets and Pandas (built upon gspread, currently Python 2 only, GPL)
  • pygsheets – Google Spreadsheets Python API v4 (v4 port of gspread providing further extensions)
  • gspread-pandas – Interact with Google Spreadsheet through Pandas DataFrames
  • pgsheets – Manipulate Google Sheets Using Pandas DataFrames (independent bidirectional transfer library, using the legacy v3 API, Python 3 only)
  • PyDrive – Google Drive API made easy (google-api-python-client wrapper for the Google Drive API, currently v2)

License

This package is distributed under the MIT license.