NYSCEF Downloader

This is a WIP set of scripts for pulling data from the New York Unified Court System website and putting it into a readable, searchable format.

This was originally created to make it easier to aggregate information during the Child Victims Act look-back period, for people who cannot log in to NYSCEF or who want to play with the data.

The data is currently hosted in a sqlite database with datasette here.

Challenges

A huge challenge of getting data from NYSCEF is that, to aggregate all cases of a certain type over a time period, you must use the "Case Search" option, which only lets you search by court and a single date. Once you have every case submitted to that court on that date, you have to go through the entire list and pick out the ones with your case type. If anyone has a better way to search through this, please let me know.
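The date-by-date search and filtering described above can be sketched roughly as follows. This is only an illustration, not the repository's code: the `case_type` and `docket_id` keys and the sample rows are made up, since the real shape of a day's search results is not shown here.

```python
from datetime import date, timedelta


def dates_in_window(start, end):
    """Yield every date from start to end, inclusive."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)


def filter_by_case_type(cases, case_type):
    """Keep only cases whose type matches exactly (hypothetical 'case_type' key)."""
    return [case for case in cases if case.get("case_type") == case_type]


# Made-up rows standing in for one court's results on one date.
sample = [
    {"docket_id": "abc", "case_type": "Torts - Child Victims Act"},
    {"docket_id": "def", "case_type": "Some Other Type"},
]

window = list(dates_in_window(date(2019, 8, 14), date(2019, 8, 16)))
matches = filter_by_case_type(sample, "Torts - Child Victims Act")
```

Each date in `window` would need its own "Case Search" query, which is why long windows use up so many requests.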

Then there is the issue that all use of "Case Search" is protected by a captcha. With a little testing, it seemed that the session cookie associated with the captcha, JSESSIONID, is good for fewer than 100 requests. I chose 60 just to be safe. This means that after every 60 requests you must give the script a new JSESSIONID to keep going. I've sat for a while feeding new ids from fresh, captcha'd sessions into the script.

However, if you already have a set of docket ids, there is no captcha requirement for going directly to each docket's case information, so the other scripts run much faster.
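One way to picture the 60-request budget is a small counter that asks for a fresh id once the current one is used up. This is a hypothetical sketch, not the script's actual logic; the `SessionBudget` class and its callback are invented for illustration.

```python
MAX_REQUESTS_PER_SESSION = 60  # the conservative limit chosen above


class SessionBudget:
    """Tracks how many requests the current JSESSIONID has been used for."""

    def __init__(self, session_id, ask_for_new_id, limit=MAX_REQUESTS_PER_SESSION):
        self.session_id = session_id
        self.ask_for_new_id = ask_for_new_id  # callable returning a fresh id
        self.limit = limit
        self.used = 0

    def next_session_id(self):
        """Return the id for the next request, rotating once the budget is spent."""
        if self.used >= self.limit:
            self.session_id = self.ask_for_new_id()
            self.used = 0
        self.used += 1
        return self.session_id
```

In interactive use, the callback could be something like `lambda: input("New JSESSIONID: ")`, so the script pauses until you paste an id from a fresh, captcha'd session.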

Fetching Docket Ids

The first step to using the docket_id_fetcher.py script is going to https://iapps.courts.state.ny.us/nyscef/CaseSearch?TAB=courtDateRange and passing the captcha test. Then inspect the page's cookies and copy the value of the JSESSIONID cookie. You need this to run the script, and you need a fresh one for every subsequent 60 queries.

The script takes the following arguments:
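As a rough illustration of how the copied cookie value is used, the sketch below builds (but does not send) a request that carries it. The helper name is hypothetical, and the form fields that an actual search needs are not shown here, so only the cookie handling is illustrated.

```python
from urllib.request import Request

CASE_SEARCH_URL = "https://iapps.courts.state.ny.us/nyscef/CaseSearch?TAB=courtDateRange"


def build_search_request(session_id, url=CASE_SEARCH_URL):
    """Build, without sending, a request carrying the captcha-cleared session cookie."""
    return Request(url, headers={"Cookie": f"JSESSIONID={session_id}"})


# Placeholder value; paste the id copied from your own browser session.
request = build_search_request("PASTE-YOUR-JSESSIONID-HERE")
```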

argument	required	description
start-date	true	the start date of your search window
session-id	true	the JSESSIONID generated by completing captcha test
case-type	true	the case type you are looking to aggregate, default: no type
end-date	false	the end date of your search window, default: today
court	false	the court you want to search, default: all courts
output	false	the file you want to output your ids to
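For readers who want to see the table as code, here is a minimal `argparse` sketch of the same arguments. It is an assumption about how such a parser could look, not necessarily how docket_id_fetcher.py defines it. The MM/DD/YYYY date format is inferred from the example command, and the help strings are mine.

```python
import argparse
from datetime import date


def build_parser():
    """Mirror the argument table above (a sketch, not the script's real parser)."""
    parser = argparse.ArgumentParser(description="Collect docket ids from NYSCEF Case Search")
    parser.add_argument("--start-date", required=True, help="start of the search window, MM/DD/YYYY")
    parser.add_argument("--session-id", required=True, help="JSESSIONID from a captcha-cleared session")
    parser.add_argument("--case-type", required=True, help="case type, worded exactly as NYSCEF words it")
    parser.add_argument("--end-date", default=date.today().strftime("%m/%d/%Y"),
                        help="end of the search window, default: today")
    parser.add_argument("--court", default=None, help="court to search, default: all courts")
    parser.add_argument("--output", default=None, help="file to write the docket ids to")
    return parser


args = build_parser().parse_args([
    "--start-date", "08/14/2019",
    "--case-type", "Torts - Child Victims Act",
    "--session-id", "PASTE-YOUR-JSESSIONID-HERE",
    "--output", "ids.txt",
])
```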

Here is the command I used for the Child Victims Act:

python3 docket_id_fetcher.py --start-date "08/14/2019" --case-type "Torts - Child Victims Act" --session-id "37B5BE431C206047654303BE1BE00F70.server2037" --output "ids.txt"

Note that the case type must match the exact wording NYSCEF uses for that type.

Fetching Case Data

If you have a list of all docket ids you want to inspect, you can then use a script to generate data about the cases themselves.

All you need to do is either pass a list of docket ids like so:

python3 case_data_fetcher.py --docket-ids "123,456"

or a file of the docket ids like this:

python3 case_data_fetcher.py --docket-ids-file "ids.txt"

The script writes its results to a JSON output file that captures all the metadata for each docket.
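The two ways of supplying docket ids shown above can be sketched as a single helper. This is a hypothetical illustration: the function names are mine, and I am assuming the ids file separates ids with commas or whitespace (for example one per line), which may not match the real ids.txt layout.

```python
import re


def parse_docket_ids(text):
    """Split on commas and whitespace, dropping empty pieces."""
    return [piece for piece in re.split(r"[,\s]+", text) if piece]


def load_docket_ids(ids=None, ids_file=None):
    """Return docket ids from a file if one is given, else from the inline string."""
    if ids_file is not None:
        with open(ids_file, encoding="utf-8") as handle:
            return parse_docket_ids(handle.read())
    return parse_docket_ids(ids or "")
```

Accepting both separators means the same helper handles `--docket-ids "123,456"` and a file with one id per line.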
