diff --git a/README.md b/README.md index c88ea58..3f40afa 100644 --- a/README.md +++ b/README.md @@ -9,8 +9,10 @@ Content include in this repository are listed below. |----|-----|----| | **[Data_Discovery_CMR_API_Request.ipynb](/python/tutorials/Data_Discovery_CMR_API_Request.ipynb)** | Jupyter Notebook | Demonstrates how to search for Earthdata data collections and granules using CMR API and Request Python package| | **[Data_Discovery_CMR_API_Bulk_Query.ipynb](/python/tutorials/Data_Discovery_CMR_API_Bulk_Query.ipynb)** | Jupyter Notebook | Demonstrates how to search and extract data URLs for an entire collection using Python's `asyncio` package| +| **[create_netrc_file.md](/guides/create_netrc_file.md)** | Markdown | Demonstrates how to create a .netrc file in your home directory | | **[bulk_download_using_curl.md](/guides/bulk_download_using_curl.md)** | Markdown | Demonstrates how to bulk download LP DAAC data using Curl from command line | | **[bulk_download_using_wget.md](/guides/bulk_download_using_wget.md)** | Markdown | Demonstrates how to bulk download LP DAAC data using Wget from command line | +| **[DAACDataDownload.py](/python/scripts/daac_data_download_python)** | Python script | Demonstrates how to download LP DAAC data using a command line executable script | The other guides, tutorials, how-tos and scripts can be accessed in our mission specific repositories. diff --git a/guides/bulk_download_using_curl.md b/guides/bulk_download_using_curl.md index 47de5e6..f49ff0d 100644 --- a/guides/bulk_download_using_curl.md +++ b/guides/bulk_download_using_curl.md @@ -11,40 +11,7 @@ This guide shows how to bulk download [LP DAAC](https://lpdaac.usgs.gov/) data u Save download links for your data as a text file using [Nasa Earthdata Search](https://search.earthdata.nasa.gov/search) or [Common Metadata Repository (CMR)](https://www.earthdata.nasa.gov/eosdis/science-system-description/eosdis-components/cmr) API. 
Follow the steps in the [Earthdata Search guide](https://github.com/nasa/EMIT-Data-Resources/blob/main/guides/Getting_EMIT_Data_using_EarthData_Search.md) to find your data and save the download links. If you prefer to use an API to find your data and save the download links, a tutorial on how to use the CMR API can be found [here](https://github.com/nasa/LPDAAC-Data-Resources/blob/main/python/tutorials/Data_Discovery_CMR_API_Request.ipynb). ## Step 2: Set up a .netrc file for Authentication -Set up a .netrc file in your home directory. - - -- ### Manual set up - - Download the [.netrc template file](https://github.com/nasa/LPDAAC-Data-Resources/tree/main/data/.netrc) and save it in your home directory. - - Open the .netrc file in a text editor and replace `` with your NASA Earthdata Login username and `` with your NASA Earthdata Login password. - -- ### Create .netrc file from the Command Line - - Enter the following in Terminal: - - #### Windows - To Create a .netrc file, enter the following in the command line. - ``` - NUL >> %userprofile%\.netrc | echo machine urs.earthdata.nasa.gov >> %userprofile%\.netrc - ``` - To insert your NASA Earthdata login username and password into the file, enter the following in the Command Prompt and replace your username and password. - - ``` - echo login Insert_Your_Username >> %userprofile%\.netrc | echo password Insert_Your_Password >> %userprofile%\.netrc - ``` - - #### MacOS: - - To Create a .netrc file, enter the following in the command line. - ``` - touch ~/.netrc | chmod og-rw ~/.netrc | echo machine urs.earthdata.nasa.gov >> ~/.netrc - ``` - To insert your NASA Earthdata login username and password into the file, enter the following in the Command Prompt and replace your username and password. 
- - ``` - echo login Insert_Your_Username >> ~/.netrc | echo password Insert_Your_Password >> ~/.netrc - ``` - -- ### Programmatically: - - Run [Authentication for NASA Earthdata notebook](https://github.com/nasa/LPDAAC-Data-Resources/blob/main/python/how-tos/Earthdata_Authentication__Create_netrc_file.ipynb) to create _.netrc_ file. - - Alternatively, you can run the [EarthdataLoginSetup script](https://github.com/nasa/LPDAAC-Data-Resources/blob/main/python/scripts/EarthdataLoginSetup.py) in a Python interpreter or from the command line. +Follow the instructions [here](https://github.com/nasa/LPDAAC-Data-Resources/blob/main/guides/create_netrc_file.md) to create a `.netrc` file using your Earthdata Login credentials. ## Step 3: Download LP DAAC Data You should now be able to run the command to download data directly from the LP DAAC. diff --git a/guides/create_netrc_file.md b/guides/create_netrc_file.md new file mode 100644 index 0000000..da9f881 --- /dev/null +++ b/guides/create_netrc_file.md @@ -0,0 +1,54 @@ +# How to Set up a `.netrc` file for Authentication +There are several ways to set up a `.netrc` file in your home directory. + +- ### Manual set up + Download the [.netrc template file](https://github.com/nasa/LPDAAC-Data-Resources/tree/main/data/.netrc) and save it in your home directory, where *user* is your personal user directory. For example: `C:\Users\user\.netrc` or `/home/user/.netrc`. + - Open the `.netrc` file in a text editor and replace the username and password placeholders with your NASA Earthdata Login username and password. 
+ + After editing, the file should look something like this: + + ![Example .netrc 1](../img/example_netrc1.png) + + Alternatively, everything can be on a single line, separated by spaces: + + ![example .netrc 2](../img/example_netrc2.png) + + +- ### Create .netrc file from the Command Line + + **For Linux/MacOS:** + + To create a `.netrc` file, enter the following in the command line, adding your NASA Earthdata username after `login` and your password after `password`. This will create a file in your home directory or append your NASA credentials to an existing file. + + ```bash + echo "machine urs.earthdata.nasa.gov login password " >>~/.netrc + ``` + + **For Windows:** + + To create a `.netrc` file, enter the following in the command line, adding your NASA Earthdata username after `login` and your password after `password`. This will create a file in your home directory or append your NASA credentials to an existing file. + + ```cmd + echo machine urs.earthdata.nasa.gov login password >> %userprofile%\.netrc + ``` + + You can verify that the file is correct by opening it with a text editor. It should look like one of the example figures above. + +- ### Programmatically: + - #### Python + The [`earthaccess` Python library](https://earthaccess.readthedocs.io/en/latest/) provides a convenient way to authenticate, search, and access NASA Earth science data using Python. It can be used to manage Earthdata Login and generate access tokens. + Run the code below to create a `.netrc` file in your home directory. You will be prompted to enter your Earthdata Login credentials. + ```python + import earthaccess + earthaccess.login(persist=True) + ``` + Instructions on how to install the `earthaccess` library are provided [here](https://earthaccess.readthedocs.io/en/latest/quick-start/). + + - #### R + The [`earthdatalogin` R Package](https://cran.r-project.org/web/packages/earthdatalogin/index.html) provides convenient authentication and access to NASA 'EarthData' products using R. 
The `edl_netrc` function creates a `.netrc` file using your Earthdata Login (EDL) credentials. + ```r + library(earthdatalogin) + edl_netrc(username = "Insert_Your_Username", password = "Insert_Your_Password", netrc_path = '~/.netrc') + ``` + More details can be found [here](https://github.com/boettiger-lab/earthdatalogin/blob/main/R/edl_netrc.R). + diff --git a/python/how-tos/Earthdata_Authentication__Create_netrc_file.ipynb b/python/how-tos/Earthdata_Authentication__Create_netrc_file.ipynb deleted file mode 100644 index 4cb100e..0000000 --- a/python/how-tos/Earthdata_Authentication__Create_netrc_file.ipynb +++ /dev/null @@ -1,154 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "id": "bright-oregon", - "metadata": {}, - "source": [ - "# Authentication for NASA Earthdata " - ] - }, - { - "cell_type": "markdown", - "id": "cardiac-franchise", - "metadata": {}, - "source": [ - "## Summary\n", - "This notebook creates a hidden `.netrc` file containing your [Earthdata Login](https://urs.earthdata.nasa.gov/) credentials in your home directory. This file is needed to access NASA Earthdata assets from a scripting environment like Python.\n", - "\n", - "### Earthdata Login\n", - "\n", - "An Earthdata Login account is required to access data, as well as discover restricted data, from the NASA Earthdata system. Thus, to access NASA data, you need Earthdata Login. Please visit to register and manage your Earthdata Login account. This account is free to create and only takes a moment to set up.\n", - "\n", - "### Authentication via netrc File\n", - "\n", - "You will need a netrc file containing your NASA Earthdata Login credentials in order to execute the notebooks. A netrc file can be created manually within text editor and saved to your home directory. 
An example of the required content is below.\n", - "\n", - "```text\n", - "machine urs.earthdata.nasa.gov\n", - "login \n", - "password \n", - "```\n", - "\n", - "`` and `` would be replaced by your actual Earthdata Login username and password respectively." - ] - }, - { - "cell_type": "markdown", - "id": "numerical-wilderness", - "metadata": {}, - "source": [ - "## Import Required Packages" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "induced-shell", - "metadata": {}, - "outputs": [], - "source": [ - "from netrc import netrc\n", - "from subprocess import Popen\n", - "from platform import system\n", - "from getpass import getpass\n", - "import os" - ] - }, - { - "cell_type": "markdown", - "id": "dominican-carry", - "metadata": {}, - "source": [ - "The code below will:\n", - "\n", - "1. check if you have an netrc file, and if so, varify if those credentials are for the Earthdata endpoint\n", - "2. create a netrc file if a netrc file is not present." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "signal-slide", - "metadata": {}, - "outputs": [], - "source": [ - "urs = 'urs.earthdata.nasa.gov' # Earthdata URL endpoint for authentication\n", - "prompts = ['Enter NASA Earthdata Login Username: ',\n", - " 'Enter NASA Earthdata Login Password: ']\n", - "\n", - "netrc_name = \".netrc\"\n", - "\n", - "# Determine if netrc file exists, and if so, if it includes NASA Earthdata Login Credentials\n", - "try:\n", - " netrcDir = os.path.expanduser(f\"~/{netrc_name}\")\n", - " netrc(netrcDir).authenticators(urs)[0]\n", - "\n", - "# Below, create a netrc file and prompt user for NASA Earthdata Login Username and Password\n", - "except FileNotFoundError:\n", - " homeDir = os.path.expanduser(\"~\")\n", - " Popen('touch {0}{2} | echo machine {1} >> {0}{2}'.format(homeDir + os.sep, urs, netrc_name), shell=True)\n", - " Popen('echo login {} >> {}{}'.format(getpass(prompt=prompts[0]), homeDir + os.sep, netrc_name), shell=True)\n", - " 
Popen('echo \\'password {} \\'>> {}{}'.format(getpass(prompt=prompts[1]), homeDir + os.sep, netrc_name), shell=True)\n", - " # Set restrictive permissions\n", - " Popen('chmod 0600 {0}{1}'.format(homeDir + os.sep, netrc_name), shell=True)\n", - "\n", - " # Determine OS and edit netrc file if it exists but is not set up for NASA Earthdata Login\n", - "except TypeError:\n", - " homeDir = os.path.expanduser(\"~\")\n", - " Popen('echo machine {1} >> {0}{2}'.format(homeDir + os.sep, urs, netrc_name), shell=True)\n", - " Popen('echo login {} >> {}{}'.format(getpass(prompt=prompts[0]), homeDir + os.sep, netrc_name), shell=True)\n", - " Popen('echo \\'password {} \\'>> {}{}'.format(getpass(prompt=prompts[1]), homeDir + os.sep, netrc_name), shell=True)" - ] - }, - { - "cell_type": "markdown", - "id": "white-democracy", - "metadata": {}, - "source": [ - "#### See if the file was created" - ] - }, - { - "cell_type": "markdown", - "id": "modern-italic", - "metadata": {}, - "source": [ - "If the file was created, we'll see a `.netrc` file in the list printed below. \n", - "\n", - "> **!!! Beware,** your password will be visible if the `.netrc` file is opened in the text editor. 
" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "adjusted-render", - "metadata": {}, - "outputs": [], - "source": [ - "!ls -al ~/" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.9.16" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} diff --git a/python/scripts/EarthdataLoginSetup.py b/python/scripts/EarthdataLoginSetup.py deleted file mode 100644 index 0712bc5..0000000 --- a/python/scripts/EarthdataLoginSetup.py +++ /dev/null @@ -1,40 +0,0 @@ -# -*- coding: utf-8 -*- -""" ---------------------------------------------------------------------------------------------------- - How to Set Up Direct Access the LP DAAC Data Pool with Python - The following Python code will configure a netrc profile that will allow users to download data - from an Earthdata Login enabled server. 
See README for additional information ---------------------------------------------------------------------------------------------------- - Author: Cole Krehbiel - Last Updated: 11/20/2018 -------------------------------------------------------------------------------- -""" -# Load necessary packages into Python -from netrc import netrc -from subprocess import Popen -from getpass import getpass -import os - -# -----------------------------------AUTHENTICATION CONFIGURATION-------------------------------- # -urs = 'urs.earthdata.nasa.gov' # Earthdata URL to call for authentication -prompts = ['Enter NASA Earthdata Login Username \n(or create an account at urs.earthdata.nasa.gov): ', - 'Enter NASA Earthdata Login Password: '] - -# Determine if netrc file exists, and if so, if it includes NASA Earthdata Login Credentials -try: - netrcDir = os.path.expanduser("~/.netrc") - netrc(netrcDir).authenticators(urs)[0] - -# Below, create a netrc file and prompt user for NASA Earthdata Login Username and Password -except FileNotFoundError: - homeDir = os.path.expanduser("~") - Popen('touch {0}.netrc | chmod og-rw {0}.netrc | echo machine {1} >> {0}.netrc'.format(homeDir + os.sep, urs), shell=True) - Popen('echo login {} >> {}.netrc'.format(getpass(prompt=prompts[0]), homeDir + os.sep), shell=True) - Popen('echo password {} >> {}.netrc'.format(getpass(prompt=prompts[1]), homeDir + os.sep), shell=True) - -# Determine OS and edit netrc file if it exists but is not set up for NASA Earthdata Login -except TypeError: - homeDir = os.path.expanduser("~") - Popen('echo machine {1} >> {0}.netrc'.format(homeDir + os.sep, urs), shell=True) - Popen('echo login {} >> {}.netrc'.format(getpass(prompt=prompts[0]), homeDir + os.sep), shell=True) - Popen('echo password {} >> {}.netrc'.format(getpass(prompt=prompts[1]), homeDir + os.sep), shell=True) diff --git a/python/scripts/daac_data_download_python/DAACDataDownload.py b/python/scripts/daac_data_download_python/DAACDataDownload.py new file 
mode 100644 index 0000000..8e19e8f --- /dev/null +++ b/python/scripts/daac_data_download_python/DAACDataDownload.py @@ -0,0 +1,80 @@ +""" +--------------------------------------------------------------------------------------------------- + How to Access the LP DAAC Data with Python + The following Python code example demonstrates how to configure a connection to download LP DAAC data + from Data Pool or NASA Earthdata Cloud. + 'earthaccess' package handles the NASA EarthData Login (EDL). + Last Updated: 09/06/2024 +--------------------------------------------------------------------------------------------------- +""" +# Load necessary packages into Python +from subprocess import Popen +from colorama import Fore, Back, Style +import earthaccess +import argparse +import os + +# ----------------------------------USER-DEFINED VARIABLES--------------------------------------- # +# Set up command line arguments +parser = argparse.ArgumentParser(formatter_class=argparse.ArgumentDefaultsHelpFormatter) +parser.add_argument('-dir', '--directory', required=True, help='Specify directory to save files to') +parser.add_argument('-f', '--files', required=True, help='A single granule URL, or the location of csv or textfile containing granule URLs') +args = parser.parse_args() + +saveDir = args.directory # Set local directory to download to +files = args.files # Define file(s) to download from the LP DAAC Data Pool + +# ---------------------------------SET UP WORKSPACE---------------------------------------------- # +# Create a list of files to download based on input type of files above +if files.endswith('.txt') or files.endswith('.csv'): + with open(files, 'r') as f: + fileList = f.read().splitlines() # If input is text/csv file with file URLs + + # fileList = open(files, 'r').readlines().splitlines() +elif isinstance(files, str): + fileList = [files] # If input is a single file +# Check if the directory exists +if not os.path.isdir(saveDir): + os.makedirs(saveDir) + +# 
--------------------------------AUTHENTICATION CONFIGURATION----------------------------------- # +# Authenticate using the earthaccess.login function. + +earthaccess.login(strategy = 'netrc', persist = True) + +print(Fore.RED + Back.GREEN + 'Please note: if you just entered your Earthdata Login info, your username and password are now stored in a .netrc file in the home directory on this system.') +print(Style.RESET_ALL) +# -----------------------------------------DOWNLOAD FILE(S)-------------------------------------- # +# Loop through and download all files to the directory 8 files at a time, keeping the same filenames +num = int(len(fileList)/8) + +for n in list(range(num+1)): + try: + subList = fileList[8*(n):8*(n+1)] + except: + try: + subList = fileList[8*(n):len(fileList)] + except: + subList = fileList[8*(n)] + # Remove the files that are downloaded already + subList_filter = [] + for f in subList: + if not os.path.isfile(os.path.join(saveDir, f.rsplit('/')[-1])): + subList_filter.append(f) + else: + print(f'{f.rsplit("/")[-1]} already exists in {saveDir}.') + + if len(subList_filter) != 0: + attempts = 0 + success = False + while (attempts < 3 and not success): + try: + # Download files + earthaccess.download(subList_filter, saveDir, threads=8) + success = True + except Exception: + attempts +=1 + print(Fore.RED + Back.GREEN + f'There is an issue with downloading files in {subList_filter}') + print(Style.RESET_ALL) + + diff --git a/python/scripts/daac_data_download_python/README.md b/python/scripts/daac_data_download_python/README.md new file mode 100644 index 0000000..a346596 --- /dev/null +++ b/python/scripts/daac_data_download_python/README.md @@ -0,0 +1,58 @@ +# How to Access LP DAAC Data From Data Pool and NASA Earthdata Cloud +--- +# Objective: +The `DAACDataDownload.py` script demonstrates how to configure a connection to download data directly in Python from an Earthdata Login-enabled server, specifically the [LP DAAC Data 
Pool](https://www.earthdata.nasa.gov/learn/use-data/tools) and [NASA Earthdata Cloud](https://www.earthdata.nasa.gov/). The script is a command line executable, where a user will submit either a single URL to a file to be downloaded, or the location of a csv or text file containing multiple URLs to be downloaded, and a desired directory to download files to. + + +The script begins with authenticating and then downloads the URL(s) that you provided. If multiple URLs are included in the file list, the script loops through URLs and downloads 8 files at a time using the `earthaccess` multithreading download method to speed up the download process. If the files provided in your list exist in the provided directory, the script skips those files. The output file name will be the same as the input file name.   + +--- +## Prerequisites/Setup Instructions   + +### Environment Setup + +Instruction for setting up a compatible environment is available at: + +### NASA Earthdata Login: + +**You will need a NASA Earthdata Login account to download LP DAAC data (and use this script).** To create a NASA Earthdata Login account, go to the [Earthdata Login website](https://urs.earthdata.nasa.gov) and click the “Register” button, which is next to the green “Log In” button under the Password entry box. Fill in the required boxes (indicated with a red asterisk), then click on the “Register for Earthdata Login” green button at the bottom of the page. An email including instructions for activating your profile will be sent to you. Activate your profile to complete the registration process. + +To download data from the LP DAAC archive, you need to authorize our applications to view your NASA Earthdata Login profile. Once authorization is complete, you may resume your session. +To authorize Data Pool, please [click here](https://urs.earthdata.nasa.gov/approve_app?client_id=ijpRZvb9qeKCK5ctsn75Tg&_ga=2.128429068.1284688367.1541426539-1515316899.1516123516).   
+ +### **Netrc file** +The netrc file is needed to download NASA Earthdata science data from a scripting environment like Python. There are multiple methods to create a .netrc file. Here, the `earthaccess` package is used to automatically create a netrc file using your Earthdata Login credentials if one does not exist. +--- + +# Procedures: + +## Getting Started: + +> #### 1. Save a download URL for your data from  [NASA Earthdata Search](https://search.earthdata.nasa.gov/) for a single file. For multiple files, download the text file containing URLs to files.   + +> #### 2. Access `DAACDataDownload.py` from [LPDAAC-Data-Resources] GitHub Repository   +  > 1. You can download the raw file for the script from +    +  > 2. Additionally, you can download all contents of this repository as a [zip file](https://github.com/nasa/LPDAAC-Data-Resources/archive/refs/heads/main.zip). You can also clone the repository by typing `git clone https://github.com/nasa/LPDAAC-Data-Resources.git` in a command line. Navigate to `python/scripts/daac_data_download_python/DAACDataDownload.py`.   + +## Script Execution + +> #### 1. Activate your environment on MacOS/Windows, then run the script with the following command in your Command Prompt/terminal window: + +  > 1.  `python DAACDataDownload.py -dir -f `   +  > - Ex:   `python C:\User\Downloads\DAACDataDownload.py  -dir C:\User\downloads -f C:\User\downloads\ECOSTRESS-granule-list.txt` +  > - Ex: `python C:\User\Downloads\DAACDataDownload.py  -dir C:\User\downloads -f https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/ECO_L2T_LSTE.002/ECOv002_L2T_LSTE_34912_026_54GXQ_20240902T022621_0712_01/ECOv002_L2T_LSTE_34912_026_54GXQ_20240902T022621_0712_01_LST.tif` + +  > 2. If you do not have a netrc file configured in your home directory, the script will prompt you for your NASA Earthdata Login username and password. Enter your username and password and hit enter to continue downloading your data. 
**Please note that your Earthdata Login username and password will be stored in a .netrc file in the home directory of the system you are using.** You will get the same message when you run the script as a reminder. If you do not trust the machine you are using, make sure to delete the created netrc file.   +  > 3. Your file(s) will be downloaded to the directory specified with `-dir`. +---   + +## Contact Info   + +Email: LPDAAC@usgs.gov   +Voice: +1-866-573-3222   +Organization: Land Processes Distributed Active Archive Center (LP DAAC)¹   +Website:   +Date last modified: 02-20-2024   + +¹Work performed under USGS contract G15PD00467 for NASA contract NNG14HH33I.
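
The batching and skip-existing behavior described in the Objective section of the `DAACDataDownload.py` README can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions, not the script itself: `chunk_urls` and `filter_missing` are hypothetical helper names, the `example.com` URLs are placeholders, and the batch size of 8 simply mirrors the script's choice.

```python
import os


def chunk_urls(urls, size=8):
    """Split a list of URLs into consecutive batches of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def filter_missing(urls, save_dir):
    """Return the URLs whose file name is not already present in save_dir."""
    missing = []
    for url in urls:
        file_name = url.rsplit('/', 1)[-1]  # last path component of the URL
        if not os.path.isfile(os.path.join(save_dir, file_name)):
            missing.append(url)
    return missing


if __name__ == '__main__':
    # Placeholder URLs, used only to show how the batches are formed
    urls = [f'https://example.com/data/file_{i}.tif' for i in range(20)]
    for batch in chunk_urls(urls):
        todo = filter_missing(batch, save_dir='.')
        # In the script, each non-empty batch is handed to
        # earthaccess.download(todo, save_dir, threads=8)
        print(len(batch), len(todo))
```

Slicing past the end of a list returns a shorter (or empty) list rather than raising an error, so the batching step needs no exception handling; only the download call itself benefits from a retry loop.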