Sefaria is creating interfaces, apps (like a source sheet builder), and infrastructure (like an API and a structured dataset) for Jewish texts and textual learning.
You can find outputs of our entire database in Sefaria-Export.
Interested developers should join the sefaria-dev mailing list.
For general discussion questions about the project, please email [email protected].
If you find textual problems in our library, please email [email protected].
You can post bugs or request/discuss features on GitHub Issues. Tackling an issue marked as a "Starter Project" is a good way to sink your teeth into Sefaria.
If you're interested in working on a project you see listed here, please email the sefaria-dev mailing list.
First clone the Sefaria-Project repository to a directory on your computer, then follow the instructions:
Note: if you are a developer that might want to contribute code to Sefaria, we suggest first making a fork of this repository by clicking the "Fork" button above when logged in to GitHub.
Note for macOS users: install Homebrew with:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install.sh)"
There are two ways to run the Sefaria-Project: docker-compose and a local install.
Follow the instructions here to install Docker and here to install Docker Compose.
In your terminal run:
docker-compose up
This will build the project and run it. You should now have all the proper services set up; let's add some texts.
Connect to mongo running on port 27018. We use 27018 instead of the standard port 27017 to avoid conflicts with any mongo instance you may already have running.
Follow instructions in section 8 below to restore the mongo dump.
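Before restoring the dump, it can help to confirm that the Mongo port published by docker-compose is reachable from your machine. The following is a minimal sketch using only the Python standard library. The helper name `port_is_open` is hypothetical and not part of Sefaria, and the check only shows that something accepts TCP connections on the port, not that it is MongoDB.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection performs the TCP handshake and raises OSError
        # (including timeouts and refused connections) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example: 27018 is the host port used by the docker-compose setup above.
# print(port_is_open("localhost", 27018))
```

If this returns False, check that docker-compose up is still running before attempting the restore.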
Copy the local settings file:
cp sefaria/local_settings_example.py sefaria/local_settings.py
Replace the following values:
DATABASES = {
'default': {
'ENGINE': 'django.db.backends.postgresql',
'NAME': 'sefaria',
'USER': 'admin',
'PASSWORD': 'admin',
'HOST': 'postgres',
'PORT': '',
}
}
MONGO_HOST = "db"
Optionally, you can replace the cache values as well:
MULTISERVER_REDIS_SERVER = "cache"
REDIS_HOST = "cache"
Also replace the respective values in CACHES.
In a new terminal window run:
docker exec -it sefaria-project_web_1 bash
This will connect you to the django container. Now run:
python manage.py migrate
In a new terminal window run:
docker exec -it sefaria-project-node-1 bash
This will connect you to the node container. Now run:
npm run build-client
or
npm run watch-client
In your browser go to http://localhost:8000
If the server isn't running, you may need to run docker-compose up again.
We recommend using the latest Python 3.7 rather than later versions of Python (especially 3.10 and up), since later versions have been known to cause some compatibility issues. These are solvable, but for an easier install experience, we currently recommend 3.7.
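To see how the interpreter on your path compares with this recommendation, a small check along the following lines may help. This is only a sketch: the function name `version_status` and the three labels it returns are illustrative and not part of Sefaria.

```python
import sys

RECOMMENDED = (3, 7)  # the minor version recommended in the text above


def version_status(version_info=None) -> str:
    """Classify a Python version relative to the 3.7 recommendation."""
    major_minor = tuple(version_info or sys.version_info)[:2]
    if major_minor == RECOMMENDED:
        return "recommended"
    if major_minor > RECOMMENDED:
        # Later versions may work but can need compatibility fixes.
        return "newer"
    # Python 2 and early Python 3 releases land here.
    return "older"


# Example: classify the interpreter running this script.
# print(version_status())
```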
Most UNIX systems come with a python interpreter pre-installed. However, this is generally still Python 2. The recommended way to get Python 3, and not mess up any software the OS is dependent on, is by using Pyenv. You can use the instructions here and also here.
In a later step, we can configure virtual environments to work with Pyenv so you can completely isolate your Sefaria stack.
The Pyenv repository above also has recommendations for Windows.
To install Python directly, without Pyenv:
- Read the Official Python documentation on Windows
- Go to the Python Download Page and download and install Python.
- Add the python directory to your OS' PATH variable if the installer has not done so.
If you work on many Python projects, you may want to keep Sefaria's python installation separate using Virtualenv. If you're happy having Sefaria requirements in your main Python installation, skip this step.
You can use virtualenv functionality while also using Pyenv; this allows you to further isolate your code and requirements from other projects.
Instructions here.
Instructions here.
Create a pyenv virtualenv.
In your Sefaria directory, run pyenv local [venv-name]. This will create a .python-version file and write the version name provided to it (e.g. 3.7.5/envs/sefaria-venv). This should serve to activate the virtualenv whenever you are in the Sefaria directory.
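If you want to confirm which environment the directory is pinned to, you can read the .python-version file directly. A minimal sketch follows; the helper name `pinned_pyenv_version` is hypothetical, and it simply returns the stripped file contents without reproducing pyenv's own lookup rules.

```python
from pathlib import Path
from typing import Optional


def pinned_pyenv_version(project_dir: str) -> Optional[str]:
    """Return the name stored in .python-version, or None if there is no pin."""
    version_file = Path(project_dir) / ".python-version"
    if not version_file.is_file():
        return None
    # The file is plain text; an empty file is treated as "no pin".
    return version_file.read_text().strip() or None


# Example, run from the Sefaria-Project directory:
# print(pinned_pyenv_version("."))
```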
Note: If, after following the installation and configuration instructions, running python -V still displays the system version, you may have to manually add the shims directory to your path. If that does not correct the issue, check your shell init file (.zshrc, .bashrc, or the like) for the line eval "$(pyenv init -)" (not inside an if statement), change it to eval "$(pyenv init --path)", and restart the shell.
Note: If you are using an IDE like PyCharm, you can (and should) configure the interpreter options on your Sefaria-Project to point to the python executable of this virtualenv (e.g. ~/.pyenv/versions/3.7.5/envs/sefaria-venv/bin/python3.7).
Install virtualenv, then enter these commands:
Note: You may need to install Pip (see below) first in order to install virtualenv
Note: You can perform this step from anywhere in your command line, but it might be easier and tidier to run it from the root of the project directory that you just cloned, e.g. ~/web-projects/Sefaria-Project $
virtualenv venv --distribute
source venv/bin/activate
Now you should see (venv) in front of your command prompt. The second command sets your shell to use the Python virtual environment that you've just created. This is something that you have to run every time you open a new shell and want to run the Sefaria demo. You can always tell if you're in the virtual environment by checking if (venv) is at the beginning of your command prompt.
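The prompt prefix is a visual cue only. If you want to check from inside Python whether the interpreter is running in a virtual environment, a sketch like the following can be used. The function name `in_virtualenv` is illustrative; the check relies on the standard `sys.prefix` and `sys.base_prefix` attributes, plus the `sys.real_prefix` attribute that older virtualenv releases set.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside venv or a recent virtualenv, sys.prefix points at the environment
    # while sys.base_prefix points at the interpreter it was created from.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    # Older virtualenv releases set sys.real_prefix instead.
    return hasattr(sys, "real_prefix") or sys.prefix != base_prefix


# Example:
# print("virtualenv active" if in_virtualenv() else "using a base interpreter")
```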
If you used pyenv, Pip should be available via the pyenv version of Python.
If you don't already have it in your Python installation, install Pip. Then use it to install the required Python packages.
Use instructions here and then make sure that the scripts subfolder of the python installation directory is also in PATH.
Note: this step (and most of the following command line instructions) must be run from the Sefaria-Project root directory
Please temporarily comment out the line psycopg2==2.8.6 in requirements.txt, and then run:
pip install -r requirements.txt
If you are not using virtualenv, you may have to run it with sudo: sudo pip install -r requirements.txt
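Commenting out the psycopg2 line by hand is simple, but if you rebuild your environment often you may prefer to script it. The sketch below is one possible approach, not part of the Sefaria tooling; the function name `comment_out_requirement` is hypothetical. It works on the text of requirements.txt and leaves already commented lines and similarly named packages untouched.

```python
import re


def comment_out_requirement(requirements_text: str, package: str) -> str:
    """Prefix '# ' to each active line that names exactly the given package."""
    # Match the package name followed by a version operator or end of line,
    # so that a different package sharing the same prefix is not affected.
    pattern = re.compile(
        r"^\s*" + re.escape(package) + r"\s*(?:[=<>~!]=|$)", re.IGNORECASE
    )
    lines = []
    for line in requirements_text.splitlines(keepends=True):
        lines.append("# " + line if pattern.match(line) else line)
    return "".join(lines)


# Example, run from the Sefaria-Project root directory:
# with open("requirements.txt") as f:
#     updated = comment_out_requirement(f.read(), "psycopg2")
# with open("requirements.txt", "w") as f:
#     f.write(updated)
```

Remember to restore the line afterwards, since the text above asks for the change to be temporary.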
Note: You'll probably need to install the Python development libraries as well:
sudo apt-get install python-dev python3-dev libpq-dev
sudo dnf install python2-devel libpq-devel
Note: If you see an error that pg_config executable not found, you need to install PostgreSQL. If on macOS you see an error while Building wheel for psycopg2 about linker command failed with exit code 1, you may need to add the path to OpenSSL:
export LDFLAGS="-L/usr/local/opt/openssl/lib"
export CPPFLAGS="-I/usr/local/opt/openssl/include"
After installing the Python development libraries or other dependencies, run pip install -r requirements.txt again.
gettext is a GNU utility that Django uses to manage localizations.
brew install gettext
On some macOS systems gettext will still not run after installation and python manage.py makemessages will fail. In such a case, one easy solution is to add the following to your .bashrc (or its equivalent on your system), replacing the x's with your gettext version number:
export TEMP_PATH=$PATH
export PATH=$PATH:/usr/local/Cellar/gettext/0.xx.x/bin
sudo apt-get install gettext
Note: this step must be run from the Sefaria-Project root directory
cd sefaria
cp local_settings_example.py local_settings.py
vim local_settings.py
Replace the placeholder values with those matching your environment. For the most part, you should only have to specify values in the top part of the file where it directs you to change the given values.
Note: Among the placeholder values that need to be replaced, set the DATABASES default NAME field to a path (including a file name) where Django can create a sqlite database. Using /path/to/Sefaria-Project/db.sqlite is sufficient, as we git-ignore all .sqlite files.
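As an illustration of the note above, the sqlite entry can be built from the project path. This is a sketch under assumptions: the helper name `sqlite_databases` is hypothetical, and `django.db.backends.sqlite3` is Django's standard sqlite backend, which may or may not match the ENGINE value already present in your copy of local_settings.py.

```python
from pathlib import Path


def sqlite_databases(project_root: str) -> dict:
    """Build a DATABASES setting that stores Django's data in db.sqlite."""
    return {
        "default": {
            # Django's built-in sqlite backend.
            "ENGINE": "django.db.backends.sqlite3",
            # A full path including the file name; Django creates the file.
            "NAME": str(Path(project_root) / "db.sqlite"),
        }
    }


# In local_settings.py (the path is a placeholder, as in the note above):
# DATABASES = sqlite_databases("/path/to/Sefaria-Project")
```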
You can name your local database (sefaria will be the default created by mongorestore below). You can leave SEFARIA_DB_USER and SEFARIA_DB_PASSWORD blank if you don't need to run authentication on Mongo.
Create a directory called log under the project folder. To do this, run mkdir log from the project's root directory.
Make sure that the server user has write access to it by using a command such as chmod 777 log.
If you don't already have it, install MongoDB. Our current data dump requires MongoDB version 4.4 or later. After installing Mongo according to the instructions, you should have a mongo service automatically running in the background.
Otherwise, run the mongo daemon with:
mongod
(Only use sudo if necessary, as it may result in a locked socket file being created that may prevent mongo from running later on).
Or use your os service manager to have it start on startup.
Mongo usually sets all the correct paths for itself to run properly nowadays, but see here for details on setting a specific path for it to store data in.
MongoDB dumps of our database are available to download.
The recommended dump (which is a more manageable size) is available here.
A complete dump is also available here. The complete dump includes the history collection, which contains a complete revision history of every text in our library. For many applications, this data is not relevant. We recommend using the smaller dump unless you're specifically interested in the revision history of texts within Sefaria.
Unzip the file and extract the dump folder. If you don't have an app for unzipping, this can be done from Command Prompt by navigating to the folder containing the download and using tar -xf dump.tar.gz or tar -xf dump_small.tar.gz (depending on which dump you downloaded).
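If the tar command is not available on your system, the archive can also be extracted with Python's standard tarfile module. The sketch below assumes the archive contains a top-level dump folder, as described above; the function name `extract_dump` is hypothetical.

```python
import tarfile
from pathlib import Path


def extract_dump(archive_path: str, dest_dir: str = ".") -> Path:
    """Extract a .tar.gz dump archive into dest_dir and return the dump folder path."""
    dest = Path(dest_dir)
    with tarfile.open(archive_path, "r:gz") as archive:
        # extractall writes every member path stored in the archive, so only
        # use it on archives downloaded from a source you trust.
        archive.extractall(dest)
    # Assumption: the archive's top-level folder is named "dump".
    return dest / "dump"


# Example, run from the folder containing the download:
# print(extract_dump("dump_small.tar.gz"))
```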
This dump must be restored as a MongoDB database.
Starting with MongoDB version 4.4, mongorestore ships separately from the MongoDB server, so if you don't already have it, you'll also need to download and unzip the Database Tools. You may then need to add its \bin\ directory to your PATH environment variable.
Once you have the unzipped dump, run the following from the folder which contains dump:
mongorestore --drop
This will create (or overwrite) a mongo database called sefaria.
If you used the recommended dump, dump_small.tar.gz, create an empty collection inside the sefaria database called history. This can be done using the mongo client shell or a GUI you have installed, such as MongoDB Compass.
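The empty history collection can also be created from Python. The following sketch assumes pymongo is installed in your environment and that mongod is listening on the default port; the helper name `ensure_history_collection` is hypothetical. It only uses the `list_collection_names` and `create_collection` methods of a pymongo database object.

```python
def ensure_history_collection(db) -> bool:
    """Create an empty 'history' collection if it is missing.

    `db` is expected to be a pymongo Database object.
    Returns True if the collection was created, False if it already existed.
    """
    if "history" in db.list_collection_names():
        return False
    db.create_collection("history")
    return True


# Usage (assumes pymongo is installed and mongod runs on localhost:27017):
# from pymongo import MongoClient
# ensure_history_collection(MongoClient("localhost", 27017)["sefaria"])
```

The function is safe to run more than once, since it leaves an existing history collection untouched.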
Sefaria uses Google's reCAPTCHA to verify that the user is not a bot. For a deployment, you should register and use your own reCAPTCHA keys (https://pypi.org/project/django-recaptcha/#installation). For local development, the default test keys suffice. The resulting warning can be suppressed by uncommenting the following line in the local_settings.py file:
SILENCED_SYSTEM_CHECKS = ['captcha.recaptcha_test_key_error']
manage.py is used to run and manage the local server. It is located in the root directory of the Sefaria-Project codebase.
Django auth features run on a separate database. To init this database and set up Django's auth system, switch to the root directory of the Sefaria-Project code base and run:
python manage.py migrate
Note: Older versions of Node and npm ran into a file name length limit on Windows OS. This problem should be mitigated in newer versions on Windows 10.
Node is now required to run the site. Even if you choose to have JavaScript run only on the client, we also use Webpack to bundle our JavaScript, and Webpack itself runs on Node.
Installing Node and npm from the main installers on Node's homepage may cause permission issues. For that reason, it is recommended to use one of the alternative methods (with a preference for a version manager like nvm) listed here.
You are better off using apt-get install nodejs or following the instructions here. They will install both Node and npm.
Use brew install node or nvm.
Now download the required Javascript libraries and install some global tools for development with the setup script (from the project root).
npm install
npm run setup
If the second command fails, you may have to install using sudo npm run setup.
To get the site running, you need to bundle the javascript with Webpack. Run:
npm run build-client
to bundle once. To watch the javascript for changes and automatically rebuild, run:
npm run watch-client
Sefaria uses React.js. To render HTML server-side, we use a Node.js server. For development, the site is fully functional without server-side rendering. For deploying in a production environment, however, server-side HTML is very important for bots and SEO.
To use server-side rendering, you must also install Redis Cache: brew install redis or sudo apt-get install redis.
To run redis: redis-server. On macOS: brew services start redis
Django
Update your local_settings.py file and replace the CACHES variable with the CACHES variable meant for server-side rendering, which is commented out in local_settings_example.py.
Also, make sure the following are set:
SESSION_CACHE_ALIAS = "default" #declares where Django's session engine will store data
USER_AGENTS_CACHE = 'default' #declares where the Django user agent middleware will store data (cannot be JSON encoded!)
SHARED_DATA_CACHE_ALIAS = 'shared' #Tells the application where to store data placed in cache by `sefaria.system.cache` shared-cache functions
USE_NODE = True
NODE_HOST = "http://localhost:3000" #Or whatever host and port you set Node up to listen on
Node
The following environment variables, defined in ./node/local_settings.js, can be set to configure the node instance:
Env Var | Default | Description |
---|---|---|
NODEJS_PORT | 3000 | The port to be used by the NodeJs service |
DEBUG | false | Determines whether the NodeJs service should run in debug mode |
REDIS_HOST | 127.0.0.1 | The Redis instance to point Node to when running |
REDIS_PORT | 6379 | The Default port Redis listens on |
These variables can be set explicitly on the command line, defined in your shell's startup configuration, or set in your IDE's run settings for the node server.
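If you start the Node server from a Python script rather than directly from a shell, the same variables can be passed through the process environment. The following is a sketch only: the helper name `node_environment` is hypothetical, and the defaults are copied from the table above so they are visible in one place, even though the node settings file applies its own defaults.

```python
import os

# Defaults copied from the environment variable table above.
NODE_DEFAULTS = {
    "NODEJS_PORT": "3000",
    "DEBUG": "false",
    "REDIS_HOST": "127.0.0.1",
    "REDIS_PORT": "6379",
}


def node_environment(overrides=None, base=None) -> dict:
    """Return an environment mapping for the Node process.

    Values already present in the base environment win over the defaults,
    and explicit overrides win over both.
    """
    env = dict(os.environ if base is None else base)
    for key, default in NODE_DEFAULTS.items():
        env.setdefault(key, default)
    # Environment values must be strings.
    env.update({key: str(value) for key, value in (overrides or {}).items()})
    return env


# Usage with the standard library (npm must be on your PATH):
# import subprocess
# subprocess.run(["npm", "start"], env=node_environment({"NODEJS_PORT": 4000}), check=True)
```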
For development, you can run the Node server using nodemon with:
npm start
To run webpack with server-side rendering, use:
npm run build
or
npm run watch
python manage.py runserver
You can also make it reachable from other machines by specifying 0.0.0.0 for the host, which binds the server to all network interfaces:
python manage.py runserver 0.0.0.0:8000
The shell script cli will invoke a python interpreter with the core models loaded, and can be used as a standalone interface to texts or for testing.
$ ./cli
>>> p = LinkSet(Ref("Genesis 13"))
>>> p.count()
226
We're grateful to the following organizations for providing us with donated services: