local Zotero access (no API) #37

Open
jangenoe opened this issue Nov 4, 2021 · 1 comment
Labels
enhancement New feature or request

Comments

jangenoe commented Nov 4, 2021

It is great that cite2c has now been ported to JupyterLab. I note that the user's reference database is accessed online using the API key, as was already done in cite2c. However, online access is not always possible, e.g. when traveling by train or plane.

Many citation manager users already run Zotero on their PC, and when Zotero is running with Better BibTeX, citations can also be entered from the local Zotero database, as is done in:

https://github.com/retorquere/zotero-citations
https://github.com/mblode/vscode-zotero

Moreover, when Zotero is running and the user has a specific folder or paper selected in Zotero (typically the one being explored), that selection could also be picked up by the extension.

In view of this, I believe it would be useful to also enable access to the local database without an API key.

PS: It is also great that the README suggests how to set the path to the csl-styles folder. Zotero users usually already have quite a collection of CSL files in their $Zotero-home-dir/styles folder; they can just add this folder to their "jupyter --paths".

jangenoe added the enhancement label Nov 4, 2021

krassowski (Owner) commented

  1. We already cache the Zotero collections in the browser database, so this extension works properly when offline, and once back online it syncs with the Zotero API using established protocols. I agree that an issue may arise when you want to use the Zotero client to enter a new citation manually (when offline) and then sync the collections in JupyterLab; this could potentially be solved in three ways:
    • a) you could install the Zotero dataserver, which implements Zotero Web API v3, locally and sync through it
    • b) we could support some basic editing/creation of citations
    • c) we could connect directly to the local Zotero client
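
For option (a), the Zotero Web API client quoted further down reads its base URL from the `server` setting, so in principle only that setting would need to point at the local dataserver. The following is a minimal sketch of how such a request is assembled, mirroring the `fetch` method of `ZoteroClient`. The helper name `buildZoteroRequest`, the user id `1`, the key `secret`, and the `http://localhost:8080` address are assumptions made for illustration; they are not part of the extension.

```typescript
// Hypothetical helper: mirrors how ZoteroClient.fetch builds the URL and headers.
interface IZoteroRequest {
  url: string;
  headers: Record<string, string>;
}

function buildZoteroRequest(
  serverURL: string,
  endpoint: string,
  args: Record<string, string>,
  key: string,
  lastModifiedLibraryVersion: string | null = null
): IZoteroRequest {
  const headers: Record<string, string> = {
    'Zotero-API-Key': key,
    'Zotero-API-Version': '3'
  };
  // Only ask for changes since the last known library version, if we have one.
  if (lastModifiedLibraryVersion) {
    headers['If-Modified-Since-Version'] = lastModifiedLibraryVersion;
  }
  const query = new URLSearchParams(args).toString();
  return { url: serverURL + '/' + endpoint + '?' + query, headers };
}

// Same request against the hosted API and against an assumed local dataserver.
const hostedRequest = buildZoteroRequest(
  'https://api.zotero.org',
  'users/1/items',
  { format: 'csljson' },
  'secret'
);
const localRequest = buildZoteroRequest(
  'http://localhost:8080', // assumed address of a locally installed dataserver
  'users/1/items',
  { format: 'csljson' },
  'secret',
  '42'
);
console.log(hostedRequest.url);
console.log(localRequest.url, localRequest.headers);
```

Only the base URL differs between the two requests, which is why this route would not require changes to the provider itself.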

Let's discuss (c) in more detail:

  2. It is not possible for the frontend to connect directly to the local Zotero client without developing a browser extension (see how Zotero implements its Google Docs integration). Many developers have attempted this, but the Zotero API blocks such connections at the CORS policy level, and there is no way for the user to allow external traffic in; there are a few discussions of this on the zotero-dev group: https://groups.google.com/g/zotero-dev/search?q=cors
  3. We could connect from the backend to a Zotero client installed on the backend, but because of JupyterHub and friends, the machine where the server extension lives and the machine where the user runs JupyterLab in the browser are often two different computers
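
To make the CORS point concrete, here is a simplified model of the check a browser applies before letting page scripts read a cross-origin response. It covers only simple, non-credentialed requests and is not a full implementation of the CORS protocol. The function name `isReadableCrossOrigin`, the JupyterLab address `http://localhost:8888`, and the local Zotero address `http://127.0.0.1:23119/example` are assumptions used for illustration.

```typescript
/**
 * Simplified model of the browser's CORS check for a simple,
 * non-credentialed request: the response is readable only if it is
 * same-origin, or if `Access-Control-Allow-Origin` is `*` or the page origin.
 */
function isReadableCrossOrigin(
  pageOrigin: string,
  requestURL: string,
  allowOriginHeader: string | null
): boolean {
  const targetOrigin = new URL(requestURL).origin;
  if (targetOrigin === pageOrigin) {
    // Same-origin requests are not subject to CORS.
    return true;
  }
  return allowOriginHeader === '*' || allowOriginHeader === pageOrigin;
}

const jupyterLabOrigin = 'http://localhost:8888'; // assumed JupyterLab address
const localZoteroURL = 'http://127.0.0.1:23119/example'; // assumed local endpoint

// No Access-Control-Allow-Origin header in the response: the frontend cannot read it.
const blocked = isReadableCrossOrigin(jupyterLabOrigin, localZoteroURL, null);
// If the local server did send a permissive header, the read would be allowed.
const allowed = isReadableCrossOrigin(jupyterLabOrigin, localZoteroURL, '*');
console.log({ blocked, allowed });
```

Because the header is set by the server being contacted, the JupyterLab frontend cannot opt in on its own, which is why the alternatives are a browser extension or a backend-side connection.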

This is why this extension uses the web API first and foremost. If someone wants to develop a new citation provider that queries the local instance of Zotero (despite the limitation described in the last point above, i.e. it will be useless for users who connect to remote Jupyter servers), they are welcome to open a draft pull request. It would need to consist of a server extension that interacts with the local Zotero client and a frontend IReferenceProvider that interacts with this server extension, implementing:

/**
 * A provider of references such as Zotero, EndNote, Mendeley, etc.
 */
export interface IReferenceProvider {
  /**
   * Identifier; cannot contain pipe (`|`) character.
   */
  id: string;
  /**
   * The name as shown in the user interface.
   */
  name: string;
  icon: IIcon;
  citableItems: Map<string | number, ICitableData>;
  /**
   * @param force - whether to update even if recently updated
   */
  updatePublications(force?: boolean): Promise<ICitableData[]>;
  isReady: Promise<any>;
  progress?: ISignal<any, IProgress>;
  // getCollections?(): Promise<Map<string, ICitableData[]>>;
}

As does the Zotero Web API Client:

/**
 * Zotero client implementing the Zotero Web API protocol in v3.
 */
export class ZoteroClient implements IReferenceProvider {
  id = 'zotero';
  name = 'Zotero';
  icon = zoteroIcon;
  private _serverURL: string | null = null;
  private _key: string | null = null;
  private _user: IUser | null = null;
  /**
   * Version number from API representing the library version,
   * as returned in `Last-Modified-Version` of response header
   * for multi-item requests.
   *
   * Note: responses for single-item requests will have item versions rather
   * than global library versions, please do not write those onto this variable.
   *
   * https://www.zotero.org/support/dev/web_api/v3/syncing#version_numbers
   */
  lastModifiedLibraryVersion: string | null = null;
  citableItems: Map<string, ICitableData>;
  isReady: Promise<any>;
  /**
   * If the API requests us to backoff we should wait given number of seconds before making a subsequent request.
   *
   * This promise will resolve once the backoff time passed.
   *
   * https://www.zotero.org/support/dev/web_api/v3/basics#rate_limiting
   */
  protected backoffPassed: Promise<void>;
  /**
   * The Zotero Web API version that we support
   *
   * https://www.zotero.org/support/dev/web_api/v3/basics#api_versioning
   */
  protected apiVersion = '3';
  /**
   * Bump this version if changing the structure/type of data stored
   * in the StateDB if the change would invalidate the existing data
   * (e.g. CSL version updates); this should make updates safe.
   *
   * Do not bump the version if extra information is stored; instead
   * prefer checking if it is present (conditional action).
   */
  private persistentCacheVersion = '0..';
  progress: Signal<ZoteroClient, IProgress>;

  constructor(
    protected settings: ISettings,
    protected trans: TranslationBundle,
    protected state: IStateDB | null
  ) {
    this.progress = new Signal(this);
    this.citableItems = new Map();
    settings.changed.connect(this.updateSettings, this);
    const initialPreparations: Promise<any>[] = [this.updateSettings(settings)];
    if (state) {
      initialPreparations.push(this.restoreStateFromCache(state));
    }
    this.isReady = Promise.all(initialPreparations);
    // no backoff to start with
    this.backoffPassed = new Promise(accept => {
      accept();
    });
  }
  private async restoreStateFromCache(state: IStateDB) {
    return new Promise<void>(accept => {
      state
        .fetch(PLUGIN_ID)
        .then(JSONResult => {
          if (!JSONResult) {
            console.log(
              'No previous state found for Zotero in the StateDB (it is normal on first run)'
            );
          } else {
            const result = JSONResult as IZoteroPersistentCacheState;
            if (result.apiVersion && result.apiVersion !== this.apiVersion) {
              // do not restore from cache if Zotero API version changed
              return;
            }
            if (
              result.persistentCacheVersion &&
              result.persistentCacheVersion !== this.persistentCacheVersion
            ) {
              // do not restore from cache if we changed the structure of cache
              return;
            }
            // restore from cache
            this.lastModifiedLibraryVersion = result.lastModifiedLibraryVersion;
            if (result.citableItems) {
              this.citableItems = new Map([
                ...Object.entries(result.citableItems)
              ]);
              console.log(
                `Restored ${this.citableItems.size} citable items from cache`
              );
            }
            this.updateCacheState();
          }
        })
        .catch(console.warn)
        // always resolve this one (if cache is not present or corrupted we can always fetch from the server)
        .finally(() => accept());
    });
  }
  private async fetch(
    endpoint: string,
    args: Record<string, string> = {},
    isMultiObjectRequest = false,
    forceUpdate = false
  ) {
    if (!this._key) {
      const userKey = await getAccessKeyDialog(this.trans);
      if (userKey.value) {
        this._key = userKey.value;
        this.settings.set('key', this._key).catch(console.warn);
      } else {
        return;
      }
    }
    const requestHeaders: IZoteroRequestHeaders = {
      'Zotero-API-Key': this._key,
      'Zotero-API-Version': this.apiVersion
    };
    if (
      !forceUpdate &&
      isMultiObjectRequest &&
      this.lastModifiedLibraryVersion
    ) {
      requestHeaders['If-Modified-Since-Version'] =
        this.lastModifiedLibraryVersion;
    }
    // wait until the backoff time passed;
    await this.backoffPassed;
    return fetch(
      this._serverURL + '/' + endpoint + '?' + new URLSearchParams(args),
      {
        method: 'GET',
        headers: requestHeaders as any
      }
    ).then(response => {
      this.processResponseHeaders(response.headers, isMultiObjectRequest);
      return response;
    });
  }
  protected processResponseHeaders(
    headers: Headers,
    fromMultiObjectRequest: boolean
  ): void {
    const zoteroHeaders = new ZoteroResponseHeaders(headers);
    this.handleBackoff(zoteroHeaders.backoffSeconds);
    if (fromMultiObjectRequest && zoteroHeaders.lastModifiedVersion) {
      // this is the library version only if we had multi-version request
      this.lastModifiedLibraryVersion = zoteroHeaders.lastModifiedVersion;
    }
    if (
      zoteroHeaders.apiVersion &&
      zoteroHeaders.apiVersion !== this.apiVersion
    ) {
      console.warn(
        `Zotero servers moved to a newer version API (${zoteroHeaders.apiVersion},` +
          ` but this client only supports ${this.apiVersion});` +
          ' please consider contributing a code to update this client to use the latest API'
      );
    }
  }
  protected handleBackoff(seconds: number | null): void {
    if (seconds) {
      this.backoffPassed = new Promise<void>(accept => {
        // setTimeout expects milliseconds, while the backoff is given in seconds
        window.setTimeout(accept, seconds * 1000);
      });
    }
  }
  public async updatePublications(force = false): Promise<ICitableData[]> {
    const progressBase: Partial<IProgress> = {
      label: this.trans.__('Zotero sync.'),
      tooltip: this.trans.__(
        'Connector for Zotero is synchronizing references…'
      )
    };
    this.progress.emit({ ...progressBase, state: 'started' });
    const publications = await this.loadAll(
      'users/' + this._user?.id + '/items',
      // TODO: also fetch json to get the full tags and collections data and parse from <zapi:subcontent>?
      'csljson',
      'items',
      true,
      force,
      progress => {
        this.progress.emit({
          ...progressBase,
          state: 'ongoing',
          value: progress
        });
      }
    ).finally(() => {
      this.progress.emit({ ...progressBase, state: 'completed' });
    });
    if (publications) {
      console.log(`Fetched ${publications?.length} citable items from Zotero`);
      this.citableItems = new Map(
        (publications || []).map(item => {
          const data = item as ICitableData;
          return [data.id + '', data];
        })
      );
      this.updateCacheState().catch(console.warn);
    } else {
      console.log('No new items fetched from Zotero');
    }
    return [...this.citableItems.values()];
  }
  protected async updateCacheState(): Promise<any> {
    if (!this.state) {
      return;
    }
    const state: IZoteroPersistentCacheState = {
      persistentCacheVersion: this.persistentCacheVersion,
      apiVersion: this.apiVersion,
      lastModifiedLibraryVersion: this.lastModifiedLibraryVersion,
      citableItems: Object.fromEntries(this.citableItems)
    } as IZoteroPersistentCacheState;
    return this.state.save(PLUGIN_ID, state);
  }
  private async loadAll(
    endpoint: string,
    format = 'csljson',
    extract?: string,
    isMultiObjectRequest = true,
    forceUpdate = false,
    progress?: (progress: number) => void
  ) {
    let result = await this.fetch(
      endpoint,
      { format: format },
      isMultiObjectRequest,
      forceUpdate
    );
    if (result?.status === 304) {
      console.log(`Received 304 status (${result?.statusText}), skipping...`);
      return null;
    }
    const responses = [];
    // TODO
    const total =
      parseInt(result?.headers.get('Total-Results') as string, 10) || 10000;
    let i = 0;
    let done = false;
    while (!done && i <= total) {
      if (!result) {
        console.log('Could not retrieve all pages for ', endpoint);
        return;
      }
      responses.push(result);
      const links = parseLinks(result?.headers.get('Link') as string);
      const next = links.get('next')?.url;
      if (next) {
        const nextParams = Object.fromEntries(
          new URLSearchParams(new URL(next).search).entries()
        );
        if (nextParams.start) {
          // `start` is the absolute offset of the next page, so assign it
          // rather than accumulate (accumulating would end the loop early)
          i = parseInt(nextParams.start, 10);
          if (progress) {
            progress((100 * i) / total);
          }
        }
        result = await this.fetch(
          endpoint,
          {
            ...nextParams,
            format: format
          },
          isMultiObjectRequest,
          // do not add library version condition in follow up requests (we did not fetch entire library yet)
          true
        );
      } else {
        done = true;
        if (progress) {
          progress(100);
        }
      }
    }
    const results = [];
    for (const response of responses) {
      let responseItems = await response.json();
      if (extract) {
        responseItems = responseItems[extract];
      }
      for (const item of responseItems) {
        results.push(item);
      }
    }
    return results;
  }
  private reloadKey() {
    if (!this._key) {
      console.warn('No access key to Zotero, cannot reload');
    }
    return this.fetch('keys/' + this._key).then(response => {
      if (!response) {
        console.error(response);
        return;
      }
      return response.json().then(result => {
        this._user = {
          name: result.username,
          id: result.userID
        };
        return this.updatePublications();
      });
    });
  }

  private updateSettings(settings: ISettings) {
    this._key = settings.composite.key as string;
    this._serverURL = settings.composite.server as string;
    return this.reloadKey();
  }
}
export const zoteroPlugin: JupyterFrontEndPlugin<void> = {
  id: PLUGIN_ID,
  requires: [ICitationManager, ISettingRegistry],
  optional: [ITranslator, IStateDB, IStatusBar],
  autoStart: true,
  activate: (
    app: JupyterFrontEnd,
    manager: ICitationManager,
    settingRegistry: ISettingRegistry,
    translator: ITranslator | null,
    state: IStateDB | null,
    statusBar: IStatusBar | null
  ) => {
    console.log('JupyterLab citation manager provider of Zotero is activated!');
    translator = translator || nullTranslator;
    const trans = translator.load('jupyterlab-citation-manager');
    settingRegistry
      .load(zoteroPlugin.id)
      .then(settings => {
        const client = new ZoteroClient(settings, trans, state);
        manager.registerReferenceProvider(client);
        console.log(
          'jupyterlab-citation-manager:zotero settings loaded:',
          settings.composite
        );
        if (statusBar) {
          statusBar.registerStatusItem(PLUGIN_ID, {
            item: new UpdateProgress(client.progress),
            rank: 900
          });
        }
      })
      .catch(reason => {
        console.error(
          'Failed to load settings for jupyterlab-citation-manager:zotero.',
          reason
        );
      });
  }
};
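
The `loadAll` method above relies on a `parseLinks` helper that is not shown in this thread. As a rough idea of what paging through the `Link` response header involves, here is a hypothetical stand-in (named `parseLinksSketch` to make clear it is not the extension's implementation). It assumes that `rel` is the first parameter of each entry, and the sample header value is made up for the example.

```typescript
interface ILinkSketch {
  url: string;
  rel: string;
}

// Hypothetical stand-in for `parseLinks`: maps each `rel` value to its link.
function parseLinksSketch(header: string | null): Map<string, ILinkSketch> {
  const links = new Map<string, ILinkSketch>();
  if (!header) {
    return links;
  }
  // Each entry is expected to look like: <https://host/path?query>; rel="next"
  const pattern = /<([^>]*)>\s*;\s*rel="?([^",;]+)"?/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(header)) !== null) {
    links.set(match[2], { url: match[1], rel: match[2] });
  }
  return links;
}

// Absolute offset of the next page, or null when there is no further page.
function nextStart(header: string | null): number | null {
  const next = parseLinksSketch(header).get('next');
  if (!next) {
    return null;
  }
  const start = new URL(next.url).searchParams.get('start');
  return start === null ? null : parseInt(start, 10);
}

// Made-up header value in the shape used for paginated responses.
const sampleLinkHeader =
  '<https://api.zotero.org/users/1/items?format=csljson&start=100>; rel="next", ' +
  '<https://api.zotero.org/users/1/items?format=csljson&start=900>; rel="last"';

console.log(nextStart(sampleLinkHeader));
console.log(Array.from(parseLinksSketch(sampleLinkHeader).keys()));
```

The `start` value read from the `next` link is an absolute offset into the result set, which matters when it is used to track progress through the pages.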

But please note that it should not use the native Zotero picker; it should only sync collection data in CSL JSON format.
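
As a starting point for such a contribution, the frontend half could look roughly like the sketch below. Everything in it is an assumption made for illustration: the `ICitableDataSketch` type stands in for the extension's `ICitableData`, the class omits `icon` and `progress` and therefore does not fully satisfy `IReferenceProvider`, and the `/jupyterlab-citation-manager/local-zotero/items` route is a hypothetical server extension endpoint that does not exist today.

```typescript
// Stand-in for the extension's ICitableData (a CSL JSON item); assumption.
interface ICitableDataSketch {
  id: string | number;
  title?: string;
  [field: string]: unknown;
}

type ItemFetcher = (force: boolean) => Promise<ICitableDataSketch[]>;

// Index CSL JSON items by stringified id, as ZoteroClient does for its cache.
function indexById(
  items: ICitableDataSketch[]
): Map<string, ICitableDataSketch> {
  const index = new Map<string, ICitableDataSketch>();
  for (const item of items) {
    index.set(String(item.id), item);
  }
  return index;
}

class LocalZoteroProviderSketch {
  id = 'zotero-local'; // identifiers cannot contain the pipe character
  name = 'Zotero (local)';
  citableItems = new Map<string, ICitableDataSketch>();
  isReady: Promise<void> = Promise.resolve();
  private fetchItems: ItemFetcher;

  constructor(fetchItems: ItemFetcher) {
    this.fetchItems = fetchItems;
  }

  // Only syncs CSL JSON data; no native Zotero picker is involved.
  async updatePublications(force = false): Promise<ICitableDataSketch[]> {
    const items = await this.fetchItems(force);
    this.citableItems = indexById(items);
    return Array.from(this.citableItems.values());
  }
}

// Hypothetical fetcher talking to a server extension that queries local Zotero.
const fetchFromServerExtension: ItemFetcher = async force => {
  const response = await fetch(
    '/jupyterlab-citation-manager/local-zotero/items?force=' + force
  );
  if (!response.ok) {
    throw new Error('Local Zotero sync failed with status ' + response.status);
  }
  return (await response.json()) as ICitableDataSketch[];
};

const localProvider = new LocalZoteroProviderSketch(fetchFromServerExtension);
console.log(localProvider.id, localProvider.name);
```

Passing the fetcher into the constructor keeps the provider independent of how the server extension reaches Zotero, so the same class could be exercised with a stub while the backend part is still being designed.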
