- Add official support for Python 3.10.
- Pass `Accept-Encoding: gzip, deflate` and `Host: www.sec.gov` headers with all requests, as recommended by the SEC fair access rules: https://www.sec.gov/os/accessing-edgar-data. This should reduce response sizes, since responses can now be gzip-compressed. It should also help with `403 Forbidden` errors, since the package now conforms to the full set of fair access rules.
- CIKs are now automatically zero-padded to 10 digits to ensure that filings are accurately retrieved by the SEC Edgar system. For example, passing either `"0000789019"` or `"789019"` (the CIK for MSFT) to `get()` will yield equivalent results:
>>> dl.get("10-K", "0000789019", amount=1)
1
>>> dl.get("10-K", "789019", amount=1)
1
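The padding step can be sketched with `str.zfill`. This is a minimal illustration rather than the package's actual code; `pad_cik` is a hypothetical helper name.

```python
def pad_cik(cik: str) -> str:
    """Left-pad a CIK with zeros to 10 digits (hypothetical helper)."""
    return cik.strip().zfill(10)


print(pad_cik("789019"))      # 0000789019
print(pad_cik("0000789019"))  # 0000789019
```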
- Updated the `User-Agent` header to comply with new SEC Edgar Fair Access requirements. This should resolve the 403 network errors some users are encountering when downloading a significant number of filings.
- A `ValueError` is now raised when a CIK of length >10 or a blank ticker/CIK is passed to `get()`.
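A check along these lines would produce the behavior described. It is a sketch under assumptions: `validate_ticker_or_cik` is a hypothetical name, and treating only all-digit strings as CIKs is a guess about how the two input kinds are told apart.

```python
def validate_ticker_or_cik(ticker_or_cik: str) -> str:
    """Reject blank input and over-long CIKs (hypothetical helper)."""
    value = ticker_or_cik.strip()
    if not value:
        raise ValueError("Ticker or CIK must not be blank.")
    # Assumption: an all-digit string is a CIK; anything else is a ticker.
    if value.isdigit() and len(value) > 10:
        raise ValueError("CIK must be at most 10 digits long.")
    return value
```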
- Anchor links inside of filings are now resolved correctly. Fragments and external links should now function as intended.
- Renamed `requirements.txt` to `requirements-dev.txt` in order to prevent confusion with the dependencies listed in `setup.py`.
- The `httpx` package has been replaced by `requests` to enable the use of an exponential backoff retry mechanism to help alleviate `403 Forbidden` errors some users are seeing. A request to `sec.gov` will be retried at most 10 times (with an exponential backoff applied to each request) before failing.
- A random `User-Agent` string is now included in the headers of each `GET` and `POST` request to `sec.gov`, rather than per session.
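The retry behavior described above can be illustrated in plain Python. This is a generic sketch, not the package's `requests`-based implementation: `retry_with_backoff`, the base delay, and the choice to retry on any exception are assumptions made for illustration.

```python
import time


def retry_with_backoff(func, max_retries=10, base_delay=0.1, sleep=time.sleep):
    """Call func(), retrying failed calls with exponentially growing delays.

    max_retries mirrors the limit of 10 described above; base_delay and
    retrying on any Exception are assumptions for illustration.
    """
    for attempt in range(max_retries + 1):
        try:
            return func()
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error
            # Delay doubles after every failed attempt.
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter is injectable so that the delays can be observed in tests without actually waiting.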
- HTTP connections are now re-used when possible (using `httpx.Client()`) to improve download performance.
- Requests are now retried at most 5 times if a request fails. This should solve the `500 Server Error`s that some users are experiencing when downloading a large number of filings.
- Replaced the internal `requests` package with `httpx`, a more modern and performant alternative.
- Fixed a `403 Client Error` that could randomly occur when bulk downloading a large number of filings. This error was most likely caused by recent changes to SEC rate-limiting behavior. It has been fixed by including a random user-agent string, generated by the Faker package, in the request headers.
- Fixed a `RecursionError` that could occur when downloading older filings with the `download_details` flag set to true. Thanks to @neilbartlett for reporting and fixing this bug!
- Downloads will no longer halt prematurely if a filing document (full or detail) cannot be found (e.g. when the EDGAR Search API outputs incorrect download URLs). Now, the package will automatically catch such network errors, print a helpful warning message, and then proceed to download the remaining filings.
This is a major breaking release. Please see the v4 migration guide for information on how to upgrade and adapt your existing codebase to the new package version.
- The SEC Edgar Full-Text Search API is now used to fetch filing download URLs. This approach replaces the fragile scraping and ATOM RSS implementation used in previous versions of this package.
  - Note: this API only allows for fetching filings after December 1, 2000.
- Added support for searching and downloading filings via a `query` kwarg:
dl = Downloader()
# Download all Apple proxy statements that contain the word "antitrust"
dl.get("DEF 14A", "AAPL", query="antitrust")
- Filing details, whose extensions vary based on filing type, are now downloaded in addition to the full submission `txt` file. See the migration guide for information on the revamped folder structure.
- Added the ability to download all available SEC filings. Please see the README for a full list of all supported filings.
- `Path` objects can now be used to specify a `download_folder` when constructing a `Downloader` object.
- Added type annotations throughout the codebase.
- The current working directory is now used as the default download location. Custom paths can be specified when constructing `Downloader` objects.
- All arguments passed to `dl.get()` other than `filing` and `ticker_or_cik` must be used with a keyword:
from sec_edgar_downloader import Downloader
dl = Downloader()
dl.get(
    "10-K",
    "AAPL",
    # All other arguments must be used with a keyword
    amount=1,
    after="2019-01-01",
    before="2021-01-01",
    include_amends=True,
    download_details=True,
    query="sample query",
)
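In Python, this kind of restriction is typically expressed with a bare `*` in the signature. The stand-in below only illustrates the mechanism; it is not the package's `get()` method, and its reduced parameter list and return value are assumptions.

```python
def get(filing, ticker_or_cik, *, amount=None, after=None, before=None):
    """Stand-in with a keyword-only signature; it only echoes its arguments."""
    return {
        "filing": filing,
        "ticker_or_cik": ticker_or_cik,
        "amount": amount,
        "after": after,
        "before": before,
    }


get("10-K", "AAPL", amount=1)  # accepted: amount is passed by keyword

try:
    get("10-K", "AAPL", 1)  # a third positional argument is rejected
except TypeError as exc:
    print(f"TypeError: {exc}")
```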
- The `after_date`, `before_date`, and `num_filings_to_download` kwargs have been renamed to `after`, `before`, and `amount`, respectively.
- Added support for DEF 14A filings (proxy statements).
- Added `tox.ini` and `Makefile` to distribution package.
- Fixed a failing test in the distributed package due to missing sample filings test data.
- Added support for 20-F filings.
- Added support for form 4 filings.
- Added a 0.15s delay to download logic in order to prevent rate-limiting by SEC Edgar.
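A fixed pause between requests can be sketched as follows. `download_all` and `fetch` are hypothetical names; only the 0.15 second value comes from the note above.

```python
import time


def download_all(urls, fetch, delay=0.15, sleep=time.sleep):
    """Fetch each URL in turn, pausing after every request (hypothetical helper)."""
    results = []
    for url in urls:
        results.append(fetch(url))
        sleep(delay)  # stay under the server's rate limit
    return results
```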
- Added support for S-1 filings.
- Added the ability to download more than 100 filings.
- Added the ability to specify an `after_date` argument to the `get` method. Example usage:
from sec_edgar_downloader import Downloader
dl = Downloader()
# Get all 8-K filings for Apple after January 1, 2017 and before March 25, 2017
dl.get("8-K", "AAPL", after_date="20170101", before_date="20170325")
- Added a `supported_filings` property to the `Downloader` class, which gets a list of all filings supported by the `sec_edgar_downloader` package. Example usage:
from sec_edgar_downloader import Downloader
dl = Downloader()
dl.supported_filings
- Package has been completely re-written from the ground up.
- The `Downloader` class now has a single `get` entry point method. This change was made to improve and ease maintainability. Here is the new stub for the `get` method:
class Downloader:
    def get(
        self,
        filing_type,
        ticker_or_cik,
        num_filings_to_download=None,
        after_date=None,
        before_date=None,
        include_amends=False,
    ):
        ...
Example usage of the new method:
from sec_edgar_downloader import Downloader
dl = Downloader()
# Get all 8-K filings for Apple
dl.get("8-K", "AAPL")
- Replaced retrieval methods for each filing type with a single point of entry. The bulk method `get_all_available_filings` has also been removed, so any bulk actions need to be completed manually as follows:
# Get the latest supported filings, if available, for Apple
for filing_type in dl.supported_filings:
    dl.get(filing_type, "AAPL", 1)

# Get the latest supported filings, if available, for a
# specified list of tickers and CIKs
symbols = ["AAPL", "MSFT", "0000102909", "V", "FB"]
for s in symbols:
    for filing_type in dl.supported_filings:
        dl.get(filing_type, s, 1)
- Added support for form 10KSB.
- Added docs and tests to PyPI distribution package.
- Locked the `requests` dependency to `v2.22.0` or greater to ensure optimal performance and compatibility.
- `sec-edgar-downloader` is now fully documented 🎉. You can view the latest documentation at sec-edgar-downloader.readthedocs.io.
- Changed file encoding for filing downloads from `utf-8` to `ascii`. This switch was made because SEC filings should be submitted in ASCII format.
- Locked the `lxml` dependency to `v4.3.4` or greater to fix Python 3.8 install issues.
- Added `before_date` parameter to each filing download method. If this value is not specified, it will default to the current date.
- Added `include_amends` parameter to each filing download method. If this value is not specified, it will default to false.
- Added support for passing relative (e.g. `./`, `../`) and user (e.g. `~/`) download paths to the `Downloader` constructor.
- An `IOError` is no longer thrown when an invalid download path is passed to the `Downloader` constructor. Instead, `sec_edgar_downloader` will create all the necessary directories in the path if they do not exist.
- Filing documents are no longer downloaded in streamed chunks.
- Downloads are now written to disk with UTF-8 encoding.
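The download path handling noted above (relative and `~/` paths, plus automatic directory creation) can be approximated with `pathlib`. This is a sketch of the idea, not the package's code; `resolve_download_folder` is a hypothetical helper.

```python
from pathlib import Path


def resolve_download_folder(download_folder):
    """Expand ~, make the path absolute, and create missing directories."""
    path = Path(download_folder).expanduser().resolve()
    # parents=True creates intermediate directories;
    # exist_ok=True makes the call a no-op if the folder already exists.
    path.mkdir(parents=True, exist_ok=True)
    return path


# Example calls (not executed here, since they create directories):
# resolve_download_folder("~/filings")
# resolve_download_folder("../data/filings")
```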
- Added `__version__` variable to package.
- Travis CI now uses tox to lint and run tests.
- Added `verbose` flag to `Downloader` constructor to enable information printing (e.g. how many filings are found and downloaded). `verbose` will default to false, meaning that no download information will be printed by default.
- Cleaned up README
- Tweaked package naming in setup.py
- The method for obtaining 13F filings has been split up into two methods: one for obtaining 13F-NT filings and another one for obtaining 13F-HR filings
- You can read about the differences here
- You can now specify the number of filings to download in the `get_all_available_filings` method
- Simplified API by combining ticker and CIK functionality into a single method for each filing type
  - Available methods: `get_8k_filings`, `get_10k_filings`, `get_10q_filings`, `get_13f_nt_filings`, `get_13f_hr_filings`, `get_sc_13g_filings`, `get_sd_filings`, `get_all_available_filings`
  - All these methods can be passed either a CIK or ticker string
- Removed ticker validation to facilitate this simplified API change
- Added a full suite of unit and integration tests along with an internal Travis CI pipeline for increased reliability
- Class methods now return the number of filings downloaded
- Added Python 3.8 support
- Added the ability to specify the number of filings to download
  - For example, you can download the latest 10-K for MSFT with this command: `downloader.get_10k_filing_for_ticker("MSFT", 1)`
  - This is available for all non-bulk methods: `get_8k_filing_for_ticker`, `get_10k_filing_for_ticker`, `get_10q_filing_for_ticker`, `get_13f_filing_for_ticker`, `get_sc_13g_filing_for_ticker`, `get_sd_filing_for_ticker`, and the CIK equivalents
- Internal renaming changes
- Reduced size of ticker validation data by changing an internal data structure
- Fixed the "FileNotFoundError" on import
- Tweaked PyPI description
- Filing downloads are now handled in chunks to improve download and save speed
- Removed get_select_filings_for_ticker() to reduce redundancy
- Added support for SC 13G filings
- Separated ticker and CIK class methods for easier use
- Added ticker symbol validation
- Files now save as ".txt" rather than ".html"
- Initial release