A Python 2.7 utility that scrapes data from Yahoo Finance for stocks belonging to the S&P 500 index.
Tools such as Finviz let you create screeners to identify certain stocks, e.g., potentially undervalued ones. Unfortunately, those tools do not let you download the underlying data. With that in mind, I wrote this program, which downloads the following data from Yahoo Finance for each ticker:
- Ticker
- Current Price
- 5-year PEG ratio
- Trailing P/E ratio
- Forward P/E ratio
- Beta
- Enterprise Value/EBITDA
- 52-week Change
- 52-week Change Compared to S&P 500
- Short % of Float
- Profit Margin %
- Quarterly Earnings Growth (yoy)
- Quarterly Revenue Growth (yoy)
- Return on Assets (ttm)
- Return on Equity (ttm)
- Total Cash
- Total Debt
- Target price estimate (low)
- Target price estimate (average)
- Target price estimate (median)
- Target price estimate (high)
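Once the data is in a CSV file, a simple screener can be built on top of it. The following is a minimal sketch, not part of the utility itself. It assumes the output file has a header row, and the column name `PEG` used in the usage note is hypothetical, since the exact header names written by the scraper are not listed here.

```python
import csv


def load_rows(path):
    """Read a CSV file with a header row into a list of dicts."""
    with open(path) as handle:
        return list(csv.DictReader(handle))


def to_float(text):
    """Convert a cell to float; return None for blanks or non-numeric cells."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return None


def screen(rows, column, maximum):
    """Keep the rows whose numeric value in `column` is at or below `maximum`."""
    selected = []
    for row in rows:
        value = to_float(row.get(column))
        if value is not None and value <= maximum:
            selected.append(row)
    return selected
```

For example, `screen(load_rows("result.csv"), "PEG", 1.0)` would keep the rows with a PEG ratio of at most 1, provided the file really has a column with that (assumed) name. Rows with missing or non-numeric values are skipped rather than treated as zero.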
python yahoo-finance-scraper.py --help
usage: yahoo-finance-scraper.py [options]
Get data about the S&P 500 companies.
optional arguments:
-h, --help show this help message and exit
--all_tickers Download data for all companies whose tickers are listed in the file given by the -i option [default: true].
--ticker TICKER Download data for "ticker" only.
-d, --download_index Download list of all S&P 500 components from Wikipedia [default: false].
-i INPUT, --input INPUT
File containing the tickers, in CSV format. Tickers are in the 1st column [default: ./tickers.csv]. You can have any ticker available in Yahoo Finance here, not only those of companies belonging to the S&P 500 index.
-o OUTPUT, --output OUTPUT
File where results are stored, in CSV format [default: ./result.csv].
-a, --append Do not create header [Default: false].
-p PAUSE, --pause PAUSE
Interval between requests to Yahoo Finance [default: 4 seconds]. Without a pause between requests, Yahoo Finance will bounce queries.
-t REQUEST_TIMEOUT, --request_timeout REQUEST_TIMEOUT
Timeout for Yahoo Finance requests [default: 1 second]. The timeout value will be applied to both the connect and the read timeouts.
Please refer to http://docs.python-requests.org/en/master/user/advanced/#timeouts for more details.
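The interplay of the -i, -p and -t options can be sketched as follows. This is an illustration under stated assumptions, not the utility's actual code: the function names `read_tickers`, `fetch_all` and `make_requests_fetcher` are hypothetical, the input file is assumed to have no header row, and the URL template is left to the caller because the Yahoo Finance endpoint is not given here.

```python
import csv
import time


def read_tickers(path):
    """Return the non-empty values found in the first column of a CSV file."""
    tickers = []
    with open(path) as handle:
        for row in csv.reader(handle):
            if row and row[0].strip():
                tickers.append(row[0].strip())
    return tickers


def fetch_all(tickers, fetch, pause=4.0):
    """Call fetch(ticker) for each ticker, sleeping `pause` seconds between calls.

    A failed request is recorded as None so one bad ticker does not stop the run.
    """
    results = {}
    for position, ticker in enumerate(tickers):
        if position > 0:
            time.sleep(pause)  # space out requests, as the -p option does
        try:
            results[ticker] = fetch(ticker)
        except Exception:
            results[ticker] = None
    return results


def make_requests_fetcher(url_template, request_timeout=1.0):
    """Build a fetch function backed by the third-party `requests` library.

    `url_template` is a caller-supplied string containing "{ticker}".
    """
    import requests  # imported lazily; the rest of the sketch needs only the stdlib

    def fetch(ticker):
        # A single number is used by requests for both connect and read timeouts.
        response = requests.get(url_template.format(ticker=ticker),
                                timeout=request_timeout)
        response.raise_for_status()
        return response.text

    return fetch
```

Passing the fetch function in as an argument keeps the pause logic independent of the HTTP library, so the loop can be exercised with a stub instead of live requests.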
The following libraries are used: