[OJS][stable-3_3_0] Create a CSV importexport tool for issues and users #4627

Status: Open (2 commits to be merged into base: stable-3_3_0)
196 changes: 196 additions & 0 deletions plugins/importexport/csv/CSVImportExportPlugin.inc.php
@@ -0,0 +1,196 @@
<?php

/**
* @file plugins/importexport/csv/CSVImportExportPlugin.inc.php
*
* Copyright (c) 2014-2025 Simon Fraser University
* Copyright (c) 2003-2025 John Willinsky
* Distributed under the GNU GPL v3. For full terms see the file docs/COPYING.
*
* @class CSVImportExportPlugin
*
* @ingroup plugins_importexport_csv
*
* @brief CSV import/export plugin
*/

namespace PKP\Plugins\ImportExport\CSV;

import('lib.pkp.classes.plugins.ImportExportPlugin');

use PKP\Plugins\ImportExport\CSV\Classes\CachedAttributes\CachedDaos;
use PKP\Plugins\ImportExport\CSV\Classes\Commands\IssueCommand;
use PKP\Plugins\ImportExport\CSV\Classes\Commands\UserCommand;

class CSVImportExportPlugin extends \ImportExportPlugin
{

/**
* The command passed on the CLI. Currently supports "issues" or "users"
*
* @var string
*/
private $_command;

/**
* Username passed as parameter from CLI
*
* @var string
*/
private $_username;

/**
* User registered on system to perform the CLI command
*
* @var \User
*/
private $_user;

/**
* The folder containing all CSV files that the command must go through
*
* @var string
*/
private $_sourceDir;

/**
* Whether to send welcome email to the user
*
* @var bool
*/
private $_sendWelcomeEmail = false;

/**
* @copydoc Plugin::register()
*
* @param null|mixed $mainContextId
*/
public function register($category, $path, $mainContextId = null)
{
$success = parent::register($category, $path, $mainContextId);
$isInstalled = !!\Config::getVar('general', 'installed');
$isUpgrading = defined('RUNNING_UPGRADE');

if (!$isInstalled || $isUpgrading) {
return $success;
}

if ($success && $this->getEnabled()) {
$this->addLocaleData();
}

return $success;
}

/**
* @copydoc Plugin::getDisplayName()
*/
public function getDisplayName()
{
return __('plugins.importexport.csv.displayName');
}

/**
* @copydoc Plugin::getDescription()
*/
public function getDescription()
{
return __('plugins.importexport.csv.description');
}

/**
* @copydoc Plugin::getName()
*/
public function getName()
{
return 'CSVImportExportPlugin';
}

/**
* @copydoc PKPImportExportPlugin::usage
*/
public function usage($scriptName)
{
echo __('plugins.importexport.csv.cliUsage', [
'scriptName' => $scriptName,
'pluginName' => $this->getName()
]) . "\n\n";
echo __('plugins.importexport.csv.cliUsage.examples', [
'scriptName' => $scriptName,
'pluginName' => $this->getName()
]) . "\n\n";
}

/**
* @see PKPImportExportPlugin::executeCLI()
*/
public function executeCLI($scriptName, &$args)
{
$startTime = microtime(true);
$this->_command = array_shift($args);
$this->_username = array_shift($args);
$this->_sourceDir = array_shift($args);
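// Parse the optional flag as a real boolean so that "false" or "0" disables the email
$this->_sendWelcomeEmail = filter_var(array_shift($args), FILTER_VALIDATE_BOOLEAN);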

if (!in_array($this->_command, ['issues', 'users'], true) || !$this->_sourceDir || !$this->_username) {
$this->usage($scriptName);
exit(1);
}

if (! is_dir($this->_sourceDir)) {
echo __('plugins.importexport.csv.unknownSourceDir', ['sourceDir' => $this->_sourceDir]) . "\n";
exit(1);
}

import('plugins.importexport.csv.classes.cachedAttributes.CachedDaos');

$this->_validateUser();

import('plugins.importexport.csv.classes.handlers.CSVFileHandler');
import('plugins.importexport.csv.classes.validations.InvalidRowValidations');
import('plugins.importexport.csv.classes.cachedAttributes.CachedEntities');

switch ($this->_command) {
case 'issues':
import('plugins.importexport.csv.classes.commands.IssueCommand');
(new IssueCommand($this->_sourceDir, $this->_user))->run();
break;
case 'users':
import('plugins.importexport.csv.classes.commands.UserCommand');
(new UserCommand($this->_sourceDir, $this->_user, $this->_sendWelcomeEmail))->run();
break;
default:
throw new \InvalidArgumentException("Invalid command: {$this->_command}");
}

$endTime = microtime(true);
$executionTime = $endTime - $startTime;
echo "Executed in: " . number_format($executionTime, 2) . " seconds\n";
}

/**
* Retrieve and validate the User by username
*
* @return void
*/
private function _validateUser()
{
$this->_user = $this->_getUser();
if (!$this->_user) {
echo __('plugins.importexport.csv.unknownUser', ['username' => $this->_username]) . "\n";
exit(1);
}
}

/**
* Retrieves a user by username, or null if not found
*
* @return \User|null
*/
private function _getUser()
{
/** @var \UserDAO */
$userDao = CachedDaos::getUserDao();
return $userDao->getByUsername($this->_username);
}
}
161 changes: 161 additions & 0 deletions plugins/importexport/csv/README.md
@@ -0,0 +1,161 @@
# CSV Import/Export Plugin

This plugin allows you to import issues and users into OJS using CSV files.

The tool processes each row (issue or user) individually. If an error is found during processing:
1. The problematic row will be saved to a new CSV file
2. A new column called 'reason' will be added to this CSV, explaining what went wrong
3. The tool will continue processing the remaining rows
4. At the end of processing, you can check the error CSV file to fix and reprocess the failed entries

For example, if your original CSV had 10 issues and 2 failed, you'll get:
- 8 issues successfully imported
- A new CSV file containing the 2 failed rows with their error descriptions
- Processing will complete for all rows, regardless of individual failures
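
The row-by-row behaviour described above could be sketched roughly as follows. This is an illustration only, not the plugin's actual implementation: the function name `processCsvWithErrorFile`, the `$importRow` callback, and the error file path are all assumptions made for the example.

```php
<?php

/**
 * Hypothetical sketch: process a CSV row by row and copy each failed row,
 * plus a 'reason' column, to a separate error CSV.
 *
 * @param string   $csvPath   Input CSV (first line is the header)
 * @param string   $errorPath Where failed rows are written (assumed name)
 * @param callable $importRow Throws an Exception when a row cannot be imported
 * @return array Counts keyed by 'imported' and 'failed'
 */
function processCsvWithErrorFile(string $csvPath, string $errorPath, callable $importRow): array
{
    $counts = ['imported' => 0, 'failed' => 0];
    $in = fopen($csvPath, 'r');
    if ($in === false) {
        throw new RuntimeException("Cannot open {$csvPath}");
    }

    $header = fgetcsv($in, 0, ',', '"', '\\');
    if ($header === false) {
        fclose($in);
        return $counts; // empty file, nothing to do
    }

    $out = null;
    while (($row = fgetcsv($in, 0, ',', '"', '\\')) !== false) {
        if ($row === [null]) {
            continue; // skip blank lines
        }
        try {
            $importRow($row);
            $counts['imported']++;
        } catch (Exception $e) {
            // Create the error file lazily, only when the first failure occurs
            if ($out === null) {
                $out = fopen($errorPath, 'w');
                fputcsv($out, array_merge($header, ['reason']), ',', '"', '\\');
            }
            fputcsv($out, array_merge($row, [$e->getMessage()]), ',', '"', '\\');
            $counts['failed']++;
        }
    }

    fclose($in);
    if ($out !== null) {
        fclose($out);
    }
    return $counts;
}
```

Because a failure is caught per row, one bad entry never stops the loop, and the error file can be corrected and fed back into the same function.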

## Usage

```bash
php tools/importExport.php CSVImportExportPlugin [command] [username] [directory] [sendWelcomeEmail]
```

### Parameters:

- `command`: Either 'issues' or 'users'
- `username`: Username of an existing user in the system who will perform the import
- `directory`: Path to the directory containing CSV files. Can be absolute (e.g., `/full/path/to/directory`) or relative to the current directory (e.g., `./relative/path`). For issues import, this directory must contain both the CSV files and all referenced assets (PDFs, images, etc.)
- `sendWelcomeEmail`: (Optional, users only) Set to true to send welcome emails to imported users

### Examples:

```bash
# Import issues
php tools/importExport.php CSVImportExportPlugin issues admin /path/to/csv/directory

# Import users with welcome email
php tools/importExport.php CSVImportExportPlugin users admin /path/to/csv/directory true
```

## CSV Fields

### Issues Import

Complete field list (in order):
```
journalPath,locale,articleTitle,articlePrefix,articleSubtitle,articleAbstract,articleFilepath,authors,keywords,subjects,coverage,categories,doi,coverImageFilename,coverImageAltText,galleyFilenames,galleyLabels,genreName,sectionTitle,sectionAbbrev,issueTitle,issueVolume,issueNumber,issueYear,issueDescription,datePublished,startPage,endPage
```

Required fields only:
```
journalPath,locale,articleTitle,articleAbstract,articleFilepath,authors,issueTitle,issueVolume,issueNumber,issueYear,datePublished
```

> **Important**: Even when using only required fields, always maintain the same field order as shown in the "Complete field list". For unused optional fields, keep them empty but preserve their position in the CSV.
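
One way to keep the column positions intact when you only have the required values is to build each row from the complete field list and leave the rest empty. The helper below is a minimal sketch for preparing your own CSV files; `ISSUE_FIELDS` and `buildIssueRow` are hypothetical names and are not part of the plugin, and the sample values are placeholders.

```php
<?php

// Column order copied from the "Complete field list" above.
const ISSUE_FIELDS = [
    'journalPath', 'locale', 'articleTitle', 'articlePrefix', 'articleSubtitle',
    'articleAbstract', 'articleFilepath', 'authors', 'keywords', 'subjects',
    'coverage', 'categories', 'doi', 'coverImageFilename', 'coverImageAltText',
    'galleyFilenames', 'galleyLabels', 'genreName', 'sectionTitle', 'sectionAbbrev',
    'issueTitle', 'issueVolume', 'issueNumber', 'issueYear', 'issueDescription',
    'datePublished', 'startPage', 'endPage',
];

/**
 * Hypothetical helper: turn a field => value map into a positional row,
 * filling every field that was not supplied with an empty string.
 */
function buildIssueRow(array $values): array
{
    $unknown = array_diff(array_keys($values), ISSUE_FIELDS);
    if ($unknown) {
        throw new InvalidArgumentException('Unknown field(s): ' . implode(', ', $unknown));
    }

    $row = [];
    foreach (ISSUE_FIELDS as $field) {
        $row[] = (string) ($values[$field] ?? '');
    }
    return $row;
}

// Only the required fields are supplied; optional columns stay empty.
$row = buildIssueRow([
    'journalPath' => 'myjournal',
    'locale' => 'en',
    'articleTitle' => 'Sample article',
    'articleAbstract' => 'Sample abstract',
    'articleFilepath' => 'article1.pdf',
    'authors' => 'John,Doe,,',
    'issueTitle' => 'Sample issue',
    'issueVolume' => '1',
    'issueNumber' => '1',
    'issueYear' => '2024',
    'datePublished' => '2024-01-15',
]);
```

The resulting array can be passed to `fputcsv()` to write one line of the import file.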

#### File Structure

All files referenced in the CSV must be placed in the same directory as your CSV file. Required files:
- The CSV file(s) containing issue metadata
- Article files referenced in `articleFilepath` column
- Galley files referenced in `galleyFilenames` column
- Cover images referenced in `coverImageFilename` column

For example, if your CSV contains:
```
articleFilepath=article1.pdf,galleyFilenames=galleys1.pdf;galleys2.pdf,coverImageFilename=cover.png
```

Your directory should contain:
```
/your/import/directory/
├── issues.csv
├── article1.pdf
├── galleys1.pdf
├── galleys2.pdf
└── cover.png
```
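
Before running an import, it can help to confirm that every referenced file is actually present. The following is a rough pre-flight check under stated assumptions: `findMissingFiles` is a hypothetical helper, not part of the plugin, and it expects one CSV row already mapped to field names.

```php
<?php

/**
 * Hypothetical pre-flight check: list the files a row references
 * that do not exist in the import directory.
 *
 * @param string $sourceDir Directory holding the CSV and its assets
 * @param array  $row       One CSV row keyed by field name
 * @return string[] Names of missing files (empty when all are present)
 */
function findMissingFiles(string $sourceDir, array $row): array
{
    // galleyFilenames may hold several names separated by semicolons
    $referenced = array_merge(
        [$row['articleFilepath'] ?? '', $row['coverImageFilename'] ?? ''],
        explode(';', $row['galleyFilenames'] ?? '')
    );

    $missing = [];
    foreach ($referenced as $name) {
        $name = trim($name);
        if ($name !== '' && !is_file(rtrim($sourceDir, '/') . '/' . $name)) {
            $missing[] = $name;
        }
    }
    return $missing;
}
```

An empty result means the directory layout matches what the row expects.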

Field descriptions:

- `journalPath`: Journal path identifier
- `locale`: Content language (e.g., 'en')
- `articleTitle`: Title of the article
- `articlePrefix`: Prefix for the article title
- `articleSubtitle`: Subtitle of the article
- `articleAbstract`: Article abstract
- `articleFilepath`: Path to the article's main file
- `authors`: Author information with the following rules:
  - Each author's data must follow the format: "GivenName,FamilyName,email,affiliation"
  - Multiple authors must be separated by semicolons (;)
  - FamilyName, email, and affiliation are optional and can be left empty (e.g., "John,,,")
  - If email is empty, the system will use the primary contact email
  - The first author in the list will be set as the primary contact
  - Example with multiple authors:
    ```
    "John,Doe,[email protected],University A;Jane,,[email protected],;Robert,Smith,,"
    ```
- `keywords`: Keywords (semicolon-separated)
- `subjects`: Subjects (semicolon-separated)
- `coverage`: Coverage information
- `categories`: Categories (semicolon-separated)
- `doi`: Digital Object Identifier
- `coverImageFilename`: Cover image file name
- `coverImageAltText`: Alt text for cover image
- `galleyFilenames`: Names of galley files (semicolon-separated)
- `galleyLabels`: Labels for galleys (semicolon-separated). Must have the same number of items as `galleyFilenames` to ensure correct pairing between files and labels
- `genreName`: Genre name
- `sectionTitle`: Journal section title
- `sectionAbbrev`: Section abbreviation
- `issueTitle`: Title of the issue
- `issueVolume`: Issue volume number
- `issueNumber`: Issue number
- `issueYear`: Year of publication
- `issueDescription`: Description of the issue
- `datePublished`: Publication date (YYYY-MM-DD)
- `startPage`: Starting page number
- `endPage`: Ending page number
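
To make the `authors` format concrete, here is a minimal parsing sketch. It is not the plugin's own parser: `parseAuthors` is a hypothetical function, the fallback email is passed in explicitly because this README does not say where the primary contact email comes from, and the addresses in the usage example are placeholders.

```php
<?php

/**
 * Hypothetical parser for the `authors` field.
 *
 * @param string $authorsField  "Given,Family,email,affiliation;Given,Family,..."
 * @param string $fallbackEmail Used when an author has no email (assumption)
 * @return array[] One associative array per author
 */
function parseAuthors(string $authorsField, string $fallbackEmail): array
{
    $authors = [];
    foreach (explode(';', $authorsField) as $entry) {
        if (trim($entry) === '') {
            continue;
        }
        // Pad to four parts so missing trailing values become empty strings
        $parts = array_pad(array_map('trim', explode(',', $entry)), 4, '');
        [$givenName, $familyName, $email, $affiliation] = $parts;

        $authors[] = [
            'givenName' => $givenName,
            'familyName' => $familyName,
            'email' => $email !== '' ? $email : $fallbackEmail,
            'affiliation' => $affiliation,
            // The first author in the list is the primary contact
            'primaryContact' => count($authors) === 0,
        ];
    }
    return $authors;
}

$authors = parseAuthors(
    'John,Doe,john@example.com,University A;Jane,,jane@example.com,;Robert,Smith,,',
    'contact@example.com'
);
```

Note that this simple split treats every comma as a separator, so an affiliation that itself contains a comma would be cut short.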

### Users Import

Complete field list (in order):
```
journalPath,firstname,lastname,email,affiliation,country,username,tempPassword,roles,reviewInterests
```

Required fields only:
```
journalPath,firstname,lastname,email,roles
```

> **Important**: Even when using only required fields, always maintain the same field order as shown in the "Complete field list". For unused optional fields, keep them empty but preserve their position in the CSV.

Field descriptions:

- `journalPath`: Journal path identifier
- `firstname`: User's first name
- `lastname`: User's last name
- `email`: User's email address
- `affiliation`: User's institutional affiliation
- `country`: Two-letter country code
- `username`: Desired username
- `tempPassword`: Temporary password
- `roles`: User roles (semicolon-separated, e.g., "Reader;Author")
- `reviewInterests`: Review interests (semicolon-separated)

For a users import, only the CSV file(s) are needed in your import directory.
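
As a rough illustration of the required-field rule for users, the sketch below checks a single row before import. The names `USER_FIELDS`, `USER_REQUIRED`, and `validateUserRow`, as well as the wording of the returned messages, are assumptions made for this example and may differ from what the plugin writes to its 'reason' column.

```php
<?php

// Column order copied from the users "Complete field list" above.
const USER_FIELDS = [
    'journalPath', 'firstname', 'lastname', 'email', 'affiliation',
    'country', 'username', 'tempPassword', 'roles', 'reviewInterests',
];

const USER_REQUIRED = ['journalPath', 'firstname', 'lastname', 'email', 'roles'];

/**
 * Hypothetical check: return a reason string when the row is invalid,
 * or null when all required fields are present.
 */
function validateUserRow(array $row): ?string
{
    if (count($row) !== count(USER_FIELDS)) {
        return sprintf('Expected %d columns, found %d', count(USER_FIELDS), count($row));
    }

    // Map positional values to their field names
    $named = array_combine(USER_FIELDS, $row);
    foreach (USER_REQUIRED as $field) {
        if (trim((string) $named[$field]) === '') {
            return "Missing required field: {$field}";
        }
    }
    return null;
}
```

A row such as `myjournal,Jane,Doe,jane@example.com,,,,,Reader;Author,` passes this check, since the empty positions belong to optional fields.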

## Multiple Values

For fields that accept multiple values:
- Use semicolons (;) to separate multiple values within a field
- For authors field:
  - Format for each author: "GivenName,FamilyName,email,affiliation"
  - FamilyName, email, and affiliation are optional (can be left empty)
  - If email is empty, the system will use the primary contact email
  - The first author in the list will be set as the primary contact
  - Multiple authors must be separated by semicolons
  - Example: "John,Doe,[email protected],University A;Jane,,[email protected],;Robert,Smith,,"
- For galleys:
  - Both `galleyFilenames` and `galleyLabels` support multiple values
  - They must have the same number of items to ensure correct pairing between files and their labels
  - Example: if `galleyFilenames=article.pdf;article.html`, then `galleyLabels` must be something like `PDF;HTML`
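
The pairing rule for galleys can be sketched as below. This is an illustrative check, not the plugin's code: `splitList` and `pairGalleys` are hypothetical helpers, and throwing an exception on a count mismatch is an assumption about how such an error might be reported.

```php
<?php

/**
 * Split a semicolon-separated field into trimmed, non-empty items.
 */
function splitList(string $value): array
{
    $items = array_map('trim', explode(';', $value));
    return array_values(array_filter($items, function (string $item): bool {
        return $item !== '';
    }));
}

/**
 * Hypothetical helper: pair each galley file with the label at the
 * same position, rejecting lists of different lengths.
 */
function pairGalleys(string $galleyFilenames, string $galleyLabels): array
{
    $files = splitList($galleyFilenames);
    $labels = splitList($galleyLabels);

    if (count($files) !== count($labels)) {
        throw new InvalidArgumentException(sprintf(
            'galleyFilenames has %d item(s) but galleyLabels has %d',
            count($files),
            count($labels)
        ));
    }

    $pairs = [];
    foreach ($files as $i => $file) {
        $pairs[] = ['file' => $file, 'label' => $labels[$i]];
    }
    return $pairs;
}
```

With `article.pdf;article.html` and `PDF;HTML`, the first file is paired with `PDF` and the second with `HTML`; a single label for two files would be rejected.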