From 42d85e3dba02c58c463eaba43d61e891f15e02c6 Mon Sep 17 00:00:00 2001
From: criamos <981166+Criamos@users.noreply.github.com>
Date: Fri, 29 Nov 2024 19:23:15 +0100
Subject: [PATCH] docs: update Readme to reflect Python 3.13 / Poetry 1.8.4
 requirements

---
 Readme.md | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/Readme.md b/Readme.md
index 96768d7a..69a7235b 100644
--- a/Readme.md
+++ b/Readme.md
@@ -1,9 +1,9 @@
 # Open Edu Hub Search ETL
 
-## Step 1: Project Setup - Python 3.12 (manual approach)
+## Step 1: Project Setup — Python 3.13 (manual approach)
 
 - make sure you have python3 installed ()
-  - (Python 3.12 or newer is required)
+  - (Python 3.13 is required)
 - go to project root
 - Run the following commands:
 
@@ -22,7 +22,7 @@ python3 -m venv .venv
 ## Step 1 (alternative): Project Setup - Python (automated, via `poetry`)
 
-- Step 1: Make sure that you have [Poetry](https://python-poetry.org) v1.5.0+ installed
+- Step 1: Make sure that you have [Poetry](https://python-poetry.org) [v1.8.4](https://github.com/python-poetry/poetry/releases/tag/1.8.4)+ installed
   - for detailed instructions, please consult the [Poetry Installation Guide](https://python-poetry.org/docs/#installation)
 - Step 2: Open your terminal **in the project root directory**:
   - Step 2.1: If you want to have your `.venv` to be created inside the project root directory:
 
@@ -31,6 +31,7 @@
 - Step 3: **Install dependencies** (according to `pyproject.toml`) by running: `poetry install`
 
 ## Step 2: Project Setup - required Docker Containers
+
 If you have Docker installed, use `docker-compose up` to start up the multi-container for `Splash` and `Playwright`-integration.
 
 As a last step, set up your config variables by copying the `.env.example`-file and modifying it if necessary:
@@ -40,7 +41,7 @@
 
 # Running crawlers
 
 - A crawler can be run with `scrapy crawl <spider-name>`.
-  - (It assumes that you have an edu-sharing v6.0+ instance in your `.env` settings configured which can accept the data.)
+  - (It assumes that you have an edu-sharing v8.1+ instance configured in your `.env` settings which can accept the data.)
 - If a crawler has [Scrapy Spider Contracts](https://docs.scrapy.org/en/latest/topics/contracts.html#spiders-contracts) implemented, you can test those by running `scrapy check <spider-name>`
 
@@ -60,8 +61,10 @@ docker compose up
 
 - We use Scrapy as a framework. Please check out the guides for Scrapy spider (https://docs.scrapy.org/en/latest/intro/tutorial.html)
 - To create a new spider, create a file inside `converter/spiders/<spider-name>_spider.py`
-- We recommend inheriting the `LomBase` class in order to get out-of-the-box support for our metadata model
-- You may also Inherit a Base Class for crawling data, if your site provides LRMI metadata, the `LrmiBase` is a good start. If your system provides an OAI interface, you may use the `OAIBase`
+- We recommend inheriting the `LomBase` class to get out-of-the-box support for our metadata model
+- You may also inherit a base class (see: `converter/spiders/base_classes/`) for crawling data.
+  - If your site provides LRMI metadata, the `LrmiBase` is a good start.
+  - If your system provides an OAI interface, you may use the `OAIBase`
 - As a sample/template, please take a look at the `sample_spider.py` or `sample_spider_alternative.py`
 - To learn more about the LOM standard we're using, you'll find useful information at https://en.wikipedia.org/wiki/Learning_object_metadata
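
For the manual Step 1, the full shell session might look like the sketch below. The `source` activation line assumes a Unix-like shell (on Windows it would be `.venv\Scripts\activate`), and the dependency file name `requirements.txt` is an assumption; check the project root for the actual file.

```bash
# Create a virtual environment inside the project root and activate it.
python3 -m venv .venv
source .venv/bin/activate

# Install the project's dependencies (assumed filename).
pip3 install -r requirements.txt
```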
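The Poetry route from the alternative Step 1 condenses to three commands. The `virtualenvs.in-project` setting corresponds to Step 2.1, and `poetry shell` (available in the Poetry 1.8.x line the patch pins) is one way to enter the resulting environment; this is a sketch, not the only valid sequence.

```bash
# Step 2.1 (optional): keep the .venv inside the project root directory.
poetry config virtualenvs.in-project true

# Step 3: install all dependencies declared in pyproject.toml.
poetry install

# Enter the freshly created virtual environment.
poetry shell
```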
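Step 2 plus the `.env` setup, as one hedged sketch: whether you invoke `docker-compose` or `docker compose` depends on your Docker installation (the Readme uses both spellings), and the `-d` flag merely detaches the containers.

```bash
# Copy the config template and edit it for your edu-sharing instance.
cp .env.example .env

# Start the Splash and Playwright containers in the background.
docker compose up -d

# Confirm that both containers are running.
docker ps
```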
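Running and checking a crawler could then look like this; the spider name `sample_spider` is an assumption based on the bundled `sample_spider.py`, so consult `scrapy list` for the names actually registered in your checkout.

```bash
# Show every spider name known to the project.
scrapy list

# Run one crawler; it pushes its results to the edu-sharing instance from .env.
scrapy crawl sample_spider

# Exercise the spider's Scrapy Spider Contracts, if it implements any.
scrapy check sample_spider
```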
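Finally, a minimal sketch of a new spider inheriting `LomBase`, following the multiple-inheritance pattern the sample spiders use. The import path, the `getId`/`getHash` hook names, and the URL are assumptions for illustration; treat `sample_spider.py` as the authoritative template.

```python
# converter/spiders/my_site_spider.py (hypothetical skeleton, not part of the repo)
import scrapy

from converter.spiders.base_classes import LomBase  # assumed import path


class MySiteSpider(scrapy.Spider, LomBase):
    name = "my_site_spider"  # run via: scrapy crawl my_site_spider
    version = "0.0.1"        # bump whenever the crawler's output changes

    start_urls = ["https://example.org/materials"]  # placeholder URL

    def getId(self, response) -> str:
        # Stable, unique identifier for the crawled item (assumed hook name).
        return response.url

    def getHash(self, response) -> str:
        # Should change whenever the item or the crawler version changes,
        # so unchanged items can be skipped on re-crawls.
        return f"{response.url}v{self.version}"

    def parse(self, response):
        # Delegate to LomBase, which assembles the LOM metadata item.
        return LomBase.parse(self, response)
```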