Overview

datasheets is a library for interfacing with Google Sheets, including reading data from, writing data to, and modifying the formatting of Google Sheets. It is built on top of Google’s google-api-python-client and oauth2client libraries using the Google Drive v3 and Google Sheets v4 REST APIs.

It can be installed with pip via pip install datasheets.

Basic Usage

Get the necessary OAuth credentials from the Google Developer Console as described in Getting OAuth Credentials.

After that, using datasheets looks like:

import datasheets

# Create a data set to upload
import pandas as pd
df = pd.DataFrame([('a', 1.3), ('b', 2.7), ('c', 3.9)], columns=['letter', 'number'])

client = datasheets.Client()
workbook = client.create_workbook('my_new_workbook')
tab = workbook.create_tab('my_new_tab')

# Upload a data set
tab.insert_data(df, index=False)

# Fetch the data again
df_again = tab.fetch_data()

# Show workbooks you have access to; this may be slow if you are shared on many workbooks
client.fetch_workbooks_info()

# Show tabs within a given workbook
workbook.fetch_tab_names()

Documentation Contents

Functionality

Note that before you can use datasheets you will need to set up your access keys as described in Getting OAuth Credentials.

Workbook Interactions

See all workbooks that exist
client = datasheets.Client()
client.fetch_workbooks_info()

# Optionally limit to a specific folder
client.fetch_workbooks_info(folder='Finance Reports')

This produces a pandas.DataFrame:

_images/fetch_workbooks_info.png
See all folders that exist
client.fetch_folders()

# Optionally limit to only folders you own
client.fetch_folders(only_mine=True)
Get a workbook
workbook = client.fetch_workbook('Marketing Projections')
Create a workbook
client.create_workbook('Q2 Roadmap')
Delete a workbook
client.delete_workbook('Q2 Roadmap')
Get a workbook’s URL
workbook.url

Tab Interactions

See all tabs within a workbook
workbook = client.fetch_workbook('Marketing Projections')
workbook.fetch_tab_names()

This produces a pandas.DataFrame:

_images/fetch_tab_names.png
Get a tab
workbook.fetch_tab('Radio Spend')
Create a tab
workbook.create_tab('Podcast Performance')
Delete a tab
workbook.delete_tab('Podcast Performance')

Sharing and Unsharing

See who is shared on a workbook
workbook.fetch_permissions()

This produces a pandas.DataFrame:

_images/fetch_permissions.png
Share or unshare a workbook
workbook.share(email='bossman@mycompany.com', role='writer', notify=True,
               message="Here's the 2020 revenue projections you'd asked for")

workbook.unshare(email='coworker@mycompany.com')

Data Interactions

Get all data in a tab
tab = workbook.fetch_tab('Radio Spend')
df = tab.fetch_data()

# Optionally return a dict of rows, where the keys are the values of the first row of cells
data = tab.fetch_data(fmt='dict')

# Or return a list of headers (the values of the first row) and a list of rows
data = tab.fetch_data(fmt='list')
Add data to a tab
# Clear all data in a tab and replace it with a new data set
tab.insert_data(df)

# Add more data to a tab
tab.append_data(df)

# For either command, optionally skip uploading the index for the DataFrame
tab.insert_data(df, index=False)

In addition, by setting autoformat=True within the insert_data or append_data methods the data set will be formatted within the tab as shown below:

_images/formatted_upload.png

The dimensions of the tab match the data set, the headers are formatted nicely, all cells are left-aligned, and column widths are auto-scaled to fit their contents.

Remove all data from a tab
tab.clear_data(df)

Tab Formatting Interactions

# Add rows
tab.add_rows(30)

# Add columns
tab.add_columns(5)

# Change horizontal and vertical alignment of all cells in tab
tab.align_cells(horizontal='LEFT', vertical='MIDDLE')

# Alter the number of rows or columns in a tab; if new dimensions are smaller
# than current values the tab will be trimmed down to that size
tab.alter_dimensions(nrows=25, ncols=10)

# Shortcut to color headers dark gray, set all cells to use Proxima Nova size 10, left-align
# and middle-align all cells, resize columns to fit their data, and eliminate empty columns
# and rows from tab
tab.autoformat(n_header_rows=2)

# Resize widths of all columns in a tab to fit their data
tab.autosize_columns()

# Change font and font size for all cells in a tab
tab.format_font(font='Proxima Nova', size=10)

# Set header rows in a tab to be dark gray with off-white text, font Proxima Nova size 10,
# left-aligned and middle-aligned, and rows will be made "frozen" so that when the user
# scrolls these rows stay visible
tab.format_headers(nrows=3)

In addition, anything not explicitly supported by the datasheets library as a stand-alone method can be accomplished using the Workbook.batch_update method and referencing Google Sheets’ spreadsheets.batchUpdate method. More details and an example exist within the docstring for datasheets.Workbook.batch_update().

Getting OAuth Credentials

To utilize datasheets, you’ll need to get it connected to your Google Drive and Google Sheets repositories. There are two possible ways to do this: user-granted authorization (‘OAuth Client ID’) or the use of a service account (‘OAuth Service Account’), both of which are implemented through the underlying Google oauth2client library. A description of both authentication mechanisms follows below, but before we can use either mechanism we need to create a project.

Setting Up A Project

  1. Head to https://console.developers.google.com and sign in (or sign up if you haven’t yet).

  2. Create a new project. You can title this whatever you want; it won’t be shown to you or any other end user by datasheets. Note that if your organization has put restrictions on who can create projects you may have to reach out to get added to an existing datasheets project (if it exists) or ask to have a project created.

    _images/project_creation.png
  3. Use the search bar at the top to search for Google Drive API, then click ‘Enable’.

    _images/search_for_api.png _images/enable_api.png
  4. Do the same as above to enable the Google Sheets API.

  5. Click on ‘Credentials’, go to ‘OAuth consent screen’, and add a product name (e.g. ‘datasheets Python library’) and click Save.

    _images/oauth_consent_screen.png

OAuth Client ID Access

OAuth Client ID access is the kind of auth that shows up when you click ‘Sign in with Google’ on a page: you authorize the application to access the information attached to a Google account. For non-ETL-based use, this is the auth you want. The list of possible accounts to sign in with is based on the gmail accounts tied to that particular browser. The end-user flow, which is entered on instantiation of a datasheets.Client() object, would be as follows.

The list of possible accounts is brought up:

_images/oauth_account_selection.png

After selecting an account (and authorizing access on the next screen, if this is the first time you’ve selected that account), an authorization confirmation screen is presented:

_images/oauth_signin_success_screen.png

To set this up:

  1. Click on ‘Create Credentials’.

    _images/create_credentials_button.png
  2. Choose ‘OAuth client ID’.

    _images/create_oauth_client_id_button.png
  3. Select ‘Web application’, input a name, and enter Authorized JavaScript origins and Authorized redirect URIs. The Authorized JavaScript origins are http://localhost:8888 and http://localhost:8080. The Authorized redirect URIs are the same except with a forward slash (/) appended.


    Note

    Don’t forget the forward slash at the end of the Authorized redirect URIs or you will get an error!

    _images/create_oauth_client_settings.png
  4. After clicking ‘Create’ and then ‘Ok’ on the following screen, click the download button.

    _images/download_oauth_client_secrets.png
  5. Move the file and rename it. By default datasheets will look for this file in ~/.datasheets/client_secrets.json, but if you’d prefer to place this file elsewhere you just have to specify the path in the $DATASHEETS_SECRETS_PATH envvar. Assuming you want to use the default location though, first create the ~/.datasheets folder and move the file you downloaded there, renaming it to client_secrets.json. You can do this in Finder, but if you’d prefer terminal then follow the commands below, using your client_secrets file’s original name:

    mkdir ~/.datasheets
    mv ~/Downloads/client_secret_<really_long>.apps.googleusercontent.com.json ~/.datasheets/client_secrets.json
    

Congratulations! You’re now set up with client auth and can start using the library! If you only plan to use datasheets on your local machine then you’re done, but if you might need service account access then read on!

OAuth Service Account Access

OAuth service accounts are pseudo-users that have their own email address. Documents shared with a service account will be accessible by the service (as a side note, this is how gspread works). This approach is ideal for ETL as a user doesn’t have to manually authorize access each time. However, a major weakness to this method is that anyone who gets hold of the credentials we are about to generate will be able to access all documents shared with this account, which is a weakness that the OAuth Client ID access does not share. With that in mind, the credentials we are about to generate should be kept secure.

To set up service account access:

  1. Click on ‘Create Credentials’ again, this time choosing ‘Service account key’.

    _images/create_service_account_button.png
  2. Select ‘New service account’, under role select Project -> Browser, and key type ‘JSON’, and click ‘Create’. The file will automatically be downloaded.

    _images/create_service_account_settings.png
  3. As before, save the file to ~/.datasheets/, this time naming the file service_key.json. Again, you can do this in Finder, but if you’d prefer terminal follow the commands below, using your service_key file’s original name:

    mv ~/Downloads/datasheets-auth-1e8be8d27209.json ~/.datasheets/service_key.json
    

As before, if you have a preferred location for this file you can instead place it there and specify the path to the file to datasheets with the envvar $DATASHEETS_SERVICE_PATH.

To use datasheets with your service account, create your client with datasheets.Client(service=True).

Congratulations! You’re all set up! Now just pip install datasheets and away you go!

Comparison To gspread

gspread is a popular library for reading data from Google Sheets, and it was a big inspiration for this project. datasheets attempts to improve on gspread by:

  • Supporting uploading of pandas DataFrames into Google Sheets.
  • Ensuring that data pulled from Google Sheets keeps the data type it had within Google Sheets, e.g. datetimes will come in as datetimes, numbers as numbers, etc. Within gspread, non-numbers are generally all converted to strings.
  • Allowing users to authenticate with their own Google account, meaning there is no need to create a service account and share all your files with it (though you can still do both of those things). Service accounts can be a security liability (as described below under “OAuth Service Account Access”); being able to use OAuth Client ID access diminishes that concern.
  • Providing a number of additional tools for interacting with Google Sheets: format them, add/remove rows and columns, create or delete tabs and workbooks, share or unshare a workbook with users, etc. See below for more details.
  • Using more modern, Google-maintained tools (e.g. Google’s google-api-python-client and oauth2client libraries) as opposed to parsing XML feeds.

API Reference

An Index of all functionality also exists.

Client class

class datasheets.Client(service=False, storage=True, user_agent='Python datasheets library')
__init__(service=False, storage=True, user_agent='Python datasheets library')

Create an authenticated client for interacting with Google Drive and Google Sheets

Parameters:
  • service (bool) –

    Whether to authenticate as a user or as a service account. If service=False, you will be prompted to authorize this instance to access the Google Drive attached to one of your Gmail accounts.

    Service-based authorization proceeds using the JSON file located at $DATASHEETS_SERVICE_PATH (default: ~/.datasheets/service_key.json).

    User-based authorization proceeds using the client secrets stored at $DATASHEETS_SECRETS_PATH (default: ~/.datasheets/client_secrets.json). Successful authentication by this method creates a set of credentials. By default these credentials are stored at $DATASHEETS_CREDENTIALS_PATH (default: ~/.datasheets/client_credentials.json), though storage of these credentials can be disabled with storage=False.

  • storage (bool) –

    Whether to use authorized credentials stored at $DATASHEETS_CREDENTIALS_PATH (or the default of ~/.datasheets/client_credentials.json). If credentials have not been stored yet or are invalid, whether to store newly obtained credentials at this location.

    If False, authorization will be requested every time a new Client instance is created. This mode deliberately does not store credentials to disk in order to allow for use of the library in multi-user environments.

  • user_agent (str) – The user agent tied to new credentials, if new credentials are required. This is primarily metadata, and thus unless you have a reason to change this the default value is probably fine.
create_workbook(filename, folders=())

Create a blank workbook with the specific filename

Parameters:
  • filename (str) – The name to give to the new workbook
  • folders (tuple) – An optional tuple of folders to add the new workbook to
Returns:

An instance of the newly created workbook

Return type:

datasheets.Workbook

delete_workbook(filename=None, file_id=None)

Delete a workbook from Google Drive

Either filename (i.e. title) or file_id should be provided. Providing file_id is preferred as it is more precise.

Parameters:
  • filename (str) – The name of the workbook to delete
  • file_id (str) – The ID of the workbook to delete
Returns:

None

fetch_folders(only_mine=False)

Fetch all folders shared with this account

Parameters:only_mine (bool) – If True, limit results to only those folders owned by this user
Returns:One row per folder listing folder name, ID, most recent modified time, and webview link to the folder
Return type:pandas.DataFrame
fetch_workbook(filename=None, file_id=None)

Fetch a workbook

Either filename (i.e. title) or file_id should be provided. Providing file_id is preferred as it is more precise.

Parameters:
  • filename (str) – The name of the workbook to fetch
  • file_id (str) – The ID of the workbook to fetch
Returns:

An instance of the requested workbook

Return type:

datasheets.Workbook

fetch_workbooks_info(folder=None)

Fetch information on all workbooks shared with this account

Parameters:folder (str) – An optional folder name to limit the results to
Returns:One row per workbook listing workbook name, ID, most recent modified time, and webview link to the workbook
Return type:pandas.DataFrame

Workbook class

class datasheets.Workbook(filename, file_id, client, drive_svc, sheets_svc)
__init__(filename, file_id, client, drive_svc, sheets_svc)

Create a datasheets.Workbook instance of an existing Google Sheets doc

This class in not intended to be directly instantiated; it is created by datasheets.Client.fetch_workbook().

Parameters:
  • filename (str) – The name of the workbook
  • file_id (str) – The Google Sheets-assigned ID for the file
  • client (datasheets.Client) – The client instance that instantiated this workbook
  • drive_svc (googleapiclient.discovery.Resource) – An instance of Google Drive
  • sheets_svc (googleapiclient.discovery.Resource) – An instance of Google Sheets
batch_update(body)

Apply updates to a workbook or tab using Google Sheets’ spreadsheets.batchUpdate method

Parameters:body (list) – A list of requests, with each request provided as a dict

The Google Sheets batch update method is a flexible method exposed by the Google Sheets API that permits effectively all functionality allowed for by the Google Sheets.

For more details on how to use this method, see the following:

Explanation of spreadsheets.batchUpdate method:

List of available request types and the parameters they take:

Example

The following is an example request that would perform only one update operation (the ‘repeatCell’ operation):

request_body = {
    'repeatCell': {
        'range': {
            'sheetId': self.tab_id,
            'startRowIndex': 0,
            'endRowIndex': self.nrows
        },
        'cell': {
            'userEnteredFormat': {
                'horizontalAlignment': horizontal,
                'verticalAlignment': vertical,
            }
        },
        'fields': 'userEnteredFormat(horizontalAlignment,verticalAlignment)'
    }
}
body = {'requests': [request_body]}
client

Property for the client instance that instantiated this workbook

create_tab(tabname, nrows=1000, ncols=26)

Create a new tab in the given workbook

Parameters:
  • tabname (str) – The name for the tab
  • nrows (int) – An integer number of rows for the tab to have. The Google Sheets default is 1000
  • ncols (int) – An integer number of columns for the tab to have. The Google Sheets default is 26
Returns:

An instance of the newly created tab

Return type:

datasheets.Tab

delete_tab(tabname)

Delete a tab with the given name from the current workbook

Parameters:tabname (str) – The name of the tab to delete
Returns:None
fetch_permissions()

Fetch information on who is shared on this workbook and their permission level

Returns:One row per email address shared, including the permission level that that email has been granted
Return type:pandas.DataFrame
fetch_tab(tabname)

Return a datasheets.Tab instance of the given tab associated with this workbook

Parameters:tabname (str) – The name of the tab to fetch
Returns:An instance of the requested tab
Return type:datasheets.Tab
fetch_tab_names()

Show the names of the tabs within the workbook, returned as a pandas.DataFrame.

Returns:One row per tabname within the workbook
Return type:pandas.DataFrame
share(email, role='reader', notify=True, message=None)

Share this workbook with someone.

Parameters:
  • email (str) – The email address to share the workbook with
  • role (str) – The type of permission to grant. Values must be one of ‘owner’, ‘writer’, or ‘reader’
  • notify (bool) – If True, send an email notifying the recipient of their granted permission. These notification emails are the same as what Google sends when a document is shared through Google Drive
  • message (str) – If notify is True, the message to send with the email notification
Returns:

None

unshare(email)

Unshare this workbook with someone.

Parameters:email (str) – The email address that will be unshared
Returns:None

Tab class

class datasheets.Tab(tabname, workbook, drive_svc, sheets_svc)
__init__(tabname, workbook, drive_svc, sheets_svc)

Create a datasheets.Tab instance of an existing Google Sheets tab.

This class in not intended to be directly instantiated; it is created by datasheets.Workbook.fetch_tab().

Parameters:
  • tabname (str) – The name of the tab
  • workbook (datasheets.Workbook) – The workbook instance that instantiated this tab
  • drive_svc (googleapiclient.discovery.Resource) – An instance of Google Drive
  • sheets_svc (googleapiclient.discovery.Resource) – An instance of Google Sheets
add_columns(n)

Add n columns to the given tab

Parameters:n (int) – The number of columns to add
Returns:None
add_rows(n)

Add n rows to the given tab

Parameters:n (int) – The number of rows to add
Returns:None
align_cells(horizontal='LEFT', vertical='MIDDLE')

Align all cells in the tab

Parameters:
  • horizontal (str) – The horizontal alignment for cells. May be one of ‘LEFT’, ‘CENTER’, or ‘RIGHT’
  • vertical (str) – The vertical alignment for cells. May be one of ‘TOP’, ‘MIDDLE’, ‘BOTTOM’
Returns:

None

alter_dimensions(nrows=None, ncols=None)

Alter the dimensions of the current tab.

If either dimension is left to None, that dimension will not be altered. Note that it is possible to set nrows or ncols to smaller than the current tab dimensions, in which case that data will be eliminated.

Parameters:
  • nrows (int) – The number of rows for the tab to have
  • ncols (int) – The number of columns for the tab to have
Returns:

None

append_data(data, index=True, autoformat=True)

Append data to the existing data in this tab.

If the new data exceeds the tab’s current dimensions the tab will be resized to accommodate it. Data headers will not be included among the appended data as they are assumed to already be among the existing tab data.

If the dimensions of data are larger than the tab’s current dimensions, the tab will automatically be resized to fit it.

Parameters:
  • data (pandas.DataFrame or dict or list) – The data to be uploaded, formatted as a pandas.DataFrame, a dict of lists, or a list of lists
  • index (bool) – If data is a pandas.DataFrame, whether to upload the index as well
Returns:

None

autoformat(n_header_rows)

Apply default stylings to the tab

This will apply the following stylings to the tab:

  • Header rows will be formatted to a dark gray background and off-white text
  • Font for all cells will be set to size 10 Proxima Nova
  • Cells will be horizontally left-aligned and vertically middle-aligned
  • Columns will be resized to display their largest entry
  • Empty columns and rows will be trimmed from the tab
Parameters:n_header_rows (int) – The number of header rows (i.e. row of labels / metadata)
Returns:None
autosize_columns()

Resize the widths of all columns in the tab to fit their data

Returns:None
clear_data()

Clear all data from the tab while leaving formatting intact

Returns:None
fetch_data(headers=True, fmt='df')

Retrieve the data within this tab.

Efforts are taken to ensure that returned rows are always the same length. If headers=True, this length will be equal to the length of the headers. If headers=False, this length will be equal to the longest row.

In either case, shorter rows will be padded with Nones and longer rows will be truncated (i.e. if there are 3 headers then all rows will have 3 entries regardless of the amount of populated cells they have).

Parameters:
  • headers (bool) – If True, the first row will be used as the column names for the pandas.DataFrame. Otherwise, a 0-indexed range will be used instead
  • fmt (str) – The format in which to return the data. Accepted values: ‘df’, ‘dict’, ‘list’
Returns:

When fmt=’df’ –> pandas.DataFrame

When fmt=’dict’ –> list of dicts, e.g.:

[{header1: row1cell1, header2: row1cell2},
 {header1: row2cell1, header2: row2cell2},
 ...]

When fmt=’list’ –> tuple of header names, list of lists with row data, e.g.:

([header1, header2, ...],
 [[row1cell1, row1cell2, ...], [row2cell1, row2cell2, ...], ...])

format_font(font='Proxima Nova', size=10)

Set the font and size for all cells in the tab

Parameters:
  • font (str) – The name of the font to use
  • size (int) – The size to set the font to
Returns:

None

format_headers(nrows)

Format the first n rows of a tab.

The following stylings will be applied to these rows:

  • Background will be set to dark gray with off-white text
  • Font will be set to size 10 Proxima Nova
  • Text will be horizontally left-aligned and vertically middle-aligned
  • Rows will be made “frozen” so that when the user scrolls these rows stay visible
Parameters:nrows (int) – The number of rows of headers in the tab
Returns:None
insert_data(data, index=True, autoformat=True)

Overwrite all data in this tab with the provided data.

All existing data in the tab will be removed, even if it might not have been overwritten (for example, if there is 4x2 data already in the tab and only 2x2 data is being inserted).

If the dimensions of data are larger than the tab’s current dimensions, the tab will automatically be resized to fit it.

Parameters:
  • data (pandas.DataFrame or dict or list) – The data to be uploaded, formatted as a pandas.DataFrame, a dict of lists, or a list of lists
  • index (bool) – If data is a pandas.DataFrame, whether to upload the index as well
Returns:

None

ncols

Property for the number (int) of columns in the tab

nrows

Property for the number (int) of rows in the tab

tab_id

Property that gives the ID for the tab

workbook

Property for the workbook instance that this tab belongs to

Helpers

datasheets.create_tab_in_existing_workbook(filename, tabname, file_id=None)

Create a new tab in an existing workbook and return an instance of that tab

Either filename (i.e. title) or file_id should be provided. Providing file_id is preferred as it is more precise.

Parameters:
  • filename (str) – The name of the existing workbook in which the tab will be created
  • tabname (str) – The name of the tab to be created
  • file_id (str) – The unique file_id for the workbook. If provided, the file_id will be used instead of the filename
Returns:

An instance of the newly created tab

Return type:

datasheets.Tab

datasheets.create_tab_in_new_workbook(filename, tabname, emails=(), role='reader', notify=True, message=None)

Create a new tab in a new workbook and return an instance of that tab

Parameters:
  • filename (str) – The name of the workbook to be created
  • tabname (str) – The name of the tab to be created
  • emails (str or tuple) – The email address(es) to grant the permission to. This may be one address in string form or a series of addresses in tuple form
  • role (str or tuple) – The type of permission(s) to grant. This can be either a tuple of the same size as emails or a single value, in which case all emails are granted that permission level. Values must be one of ‘owner’, ‘writer’, or ‘reader’
  • notify (bool) – If True, send an email notifying the recipient(s) of their granted permissions. These notification emails are the same as what Google sends when a document is shared through Google Drive
  • message (str) – If notify is True, the message to send with the email notification
Returns:

An instance of the newly created tab

Return type:

datasheets.Tab

datasheets.helpers.convert_cell_index_to_label(row, col)

Convert two cell indexes to a string address

Parameters:
  • row (int) – The cell row number, starting from 1
  • col (int) – The cell column number, starting from 1

Note that Google Sheets starts both the row and col indexes at 1.

Example

>>> sheets.convert_cell_index_to_label(1, 1)
A1
>>> sheets.convert_cell_index_to_label(10, 40)
BH10
Returns:The cell reference as an address (e.g. ‘B6’)
Return type:str
datasheets.helpers.convert_cell_label_to_index(label)

Convert a cell label in string form into one based cell indexes of the form (row, col).

Parameters:label (str) – The cell label in string form

Note that Google Sheets starts both the row and col indexes at 1.

Example

>>> sheets.convert_cell_label_to_index('A1')
(1, 1)
>>> sheets.convert_cell_label_to_index('BH10')
(10, 40)
Returns:The cell reference in (row_int, col_int) form
Return type:tuple

Exceptions

exception datasheets.exceptions.DatasheetsException

Base Exception for all other datasheets exceptions

This is intended to make catching exceptions from this library easier.

exception datasheets.exceptions.FolderNotFound

Attempting to open non-existent or inaccessible folder

exception datasheets.exceptions.MultipleWorkbooksFound

Multiple workbooks found for the given filename

exception datasheets.exceptions.PermissionNotFound

Trying to retrieve non-existent permission for workbook

exception datasheets.exceptions.TabNotFound

Trying to open non-existent tab

exception datasheets.exceptions.WorkbookNotFound

Trying to open non-existent or inaccessible workbook

Developing / Testing

Getting Set Up

First, get your Python environment set up:

mkvirtualenv datasheets
pip install -e . -r requirements-dev.txt

Various testing functionality exists:

  • make test - Run tests for both Python 2 and 3
  • pytest - Run tests for whichever Python version is in your virtualenv
  • make coverage - Check code coverage

Manual tests also exist in the tests/manual_testing.ipynb Jupyter Notebook. To run the manual tests, install Jupyter Notebook (pip install jupyter notebook), then run jupyter notebook, open the file in the browser, and execute each cell.

Releasing A New Version

If you make a PR that gets merged into master, a new version of datasheets can be created as follows.

  1. Increment the __version__ in the datasheets/__init__.py file and commit that change.

  2. Push a new git tag to the repo by doing:

    • Write the tag message in a dummy file called tag_message. We do this to allow multi-line tag messages
    • git tag x.x.x -F tag_message
    • git push --tags origin master
  3. Run make release_pypitest to test that you can release to pypi.

  4. Run make release_pypi to actually push the release to pypi.

Indices and tables