Replace XSLT with TS #13

Merged
merged 50 commits into from
Feb 29, 2024
aa157f2
experimenting to replace xslt
nleanba Feb 20, 2024
4f77895
Merge branch 'main' of github.com:plazi/gg2rdf
nleanba Feb 20, 2024
9960463
added some explanatory comments
nleanba Feb 20, 2024
3a324fb
rename
nleanba Feb 20, 2024
e5608a5
move
nleanba Feb 20, 2024
050b023
replaced <xsl:template match="/">
nleanba Feb 20, 2024
3ac8035
added templates with names authority, authorityNameForURI, taxonNameB…
nleanba Feb 20, 2024
4f4b5d4
updated dependency
nleanba Feb 20, 2024
6c8fc29
added a bit of the publication
nleanba Feb 20, 2024
36582e3
removed unnecessary imports
nleanba Feb 20, 2024
4587f32
formatting
nleanba Feb 20, 2024
8fa9308
working on citations
nleanba Feb 21, 2024
53c164d
taxonRelation
nleanba Feb 21, 2024
7acd621
publication done
nleanba Feb 21, 2024
187e0c5
added matCit in object position
nleanba Feb 21, 2024
0db6464
treatment done (?)
nleanba Feb 21, 2024
08e63e8
fixed taxon citations
nleanba Feb 21, 2024
5c07490
fixed false self-citations
nleanba Feb 21, 2024
d2f691f
added taxon concepts (incomplete)
nleanba Feb 21, 2024
b80a3a3
taxon-concepts done?
nleanba Feb 21, 2024
b9a330b
it can now translate some treatments correctly
nleanba Feb 21, 2024
e919519
cited materials!
nleanba Feb 21, 2024
6eebdd1
partial figures
nleanba Feb 22, 2024
d0e469a
figures done
nleanba Feb 22, 2024
5dada16
added test script
nleanba Feb 22, 2024
cff8a72
fixed handling of missing authority attributes
nleanba Feb 22, 2024
613b8c9
fixed some authorities and mc
nleanba Feb 22, 2024
dfc6607
moved to more complicated internal representation
nleanba Feb 22, 2024
d8f58f5
fixes
nleanba Feb 22, 2024
ea76c7c
checked up until 006E64249F0BFFC1A6E1FAC0364EF948
nleanba Feb 22, 2024
dd3e8a8
various fixes
nleanba Feb 23, 2024
b5c774b
making it callable from within other deno programs
nleanba Feb 23, 2024
2b3c50e
replaced xslt
nleanba Feb 23, 2024
14a5942
removed xslt
nleanba Feb 23, 2024
8ee909c
changed an error message
nleanba Feb 26, 2024
6af4f8f
updated constructor
nleanba Feb 26, 2024
805df31
slight refactor to use Subject also for figures and material citations
nleanba Feb 26, 2024
605a0f5
added fish
retog Feb 28, 2024
5f679a1
removed outputProperties
nleanba Feb 29, 2024
7891c82
removed fish
nleanba Feb 29, 2024
501c91a
cleanup
nleanba Feb 29, 2024
e1c0874
Changed behaviour according to #10
nleanba Feb 29, 2024
9f6ccf7
fixes #2
nleanba Feb 29, 2024
4989ea1
fixed title escape
nleanba Feb 29, 2024
e4ef8d9
nomen dubium does no longer "define" the taxon concept, but mark it a…
nleanba Feb 29, 2024
73f8969
removed bad import
nleanba Feb 29, 2024
ef116d9
removed comment
nleanba Feb 29, 2024
0635483
added comment
retog Feb 29, 2024
dcdc967
target to run single transformation
retog Feb 29, 2024
fe6bb43
added comment
retog Feb 29, 2024
16 changes: 16 additions & 0 deletions .vscode/launch.json
@@ -18,6 +18,22 @@
],
"envFile": "${workspaceFolder}/.env",
"attachSimplePort": 9229
},
{
"request": "launch",
"name": "Single transformation",
"type": "node",
"program": "${workspaceFolder}/src/gg2rdf.ts",
"cwd": "${workspaceFolder}",
"args": ["-i","example-data/000040332F2853C295734E7BD4190F05.xml","-o", "/tmp/000040332F2853C295734E7BD4190F05.ttl"],
"runtimeExecutable": "/usr/bin/deno",
"runtimeArgs": [
"run",
"--inspect-brk",
"--allow-all"
],
"envFile": "${workspaceFolder}/.env",
"attachSimplePort": 9229
}
]
}
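The "Single transformation" launch configuration above passes `-i <input.xml>` and `-o <output.ttl>` to `src/gg2rdf.ts`. As a rough illustration of that command-line contract only, the following sketch parses the two flags by hand. The names `CliOptions` and `parseCliArgs` are hypothetical and do not come from this PR; the real script may well use the `parseArgs` helper that `src/deps.ts` re-exports.

```typescript
// Hypothetical sketch of the "-i <input> -o <output>" interface used by the
// launch configuration. Not taken from src/gg2rdf.ts.
interface CliOptions {
  input?: string;
  output?: string;
}

function parseCliArgs(args: string[]): CliOptions {
  const options: CliOptions = {};
  for (let i = 0; i < args.length; i++) {
    const value = args[i + 1];
    if (args[i] === "-i" && value !== undefined) {
      options.input = value;
      i++; // skip the value we just consumed
    } else if (args[i] === "-o" && value !== undefined) {
      options.output = value;
      i++;
    }
  }
  return options;
}

// Same arguments as in the "args" array of the launch configuration.
const parsed = parseCliArgs([
  "-i",
  "example-data/000040332F2853C295734E7BD4190F05.xml",
  "-o",
  "/tmp/000040332F2853C295734E7BD4190F05.ttl",
]);
console.log(parsed.input, "->", parsed.output);
```

A flag without a following value is simply ignored in this sketch; a real entry point would probably report it as a usage error instead.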
1 change: 0 additions & 1 deletion Dockerfile
@@ -1,6 +1,5 @@
FROM denoland/deno:ubuntu-1.39.0

# Install cron
RUN apt update
RUN DEBIAN_FRONTEND=noninteractive apt install -y raptor2-utils openjdk-11-jdk git curl
RUN git config --system http.postBuffer 1048576000
310 changes: 310 additions & 0 deletions example-data/000040332F2853C295734E7BD4190F05.xml

Large diffs are not rendered by default.

2 changes: 2 additions & 0 deletions manual run/readme
@@ -1,3 +1,5 @@
# this is outdated, do not use

Run the ./main-on-all-wrapper.sh script to convert everything in pwd/xml to ttl (placed into pwd/ttl)

`pwd`/xml must exist & contain xml data
91 changes: 15 additions & 76 deletions src/action_worker.ts
@@ -4,11 +4,12 @@
* Jobs are accepted as messages and stored on disk; when the worker is started, uncompleted jobs are picked up and executed.

*/
import * as path from "https://deno.land/[email protected]/path/mod.ts";
import { path } from "./deps.ts";
import { config } from "../config/config.ts";
import { createBadge } from "./log.ts";
import { type Job, JobsDataBase } from "./JobsDataBase.ts";
import { getModifiedAfter, updateLocalData } from "./repoActions.ts";
import { gg2rdf } from "./gg2rdf.ts";

const GHTOKEN = Deno.env.get("GHTOKEN");

@@ -54,41 +55,6 @@ async function run() {
updateLocalData("source", log);

// run saxon on modified files
for (const file of modified) {
if (file.endsWith(".xml")) {
Deno.mkdirSync(
config.workDir + "/tmprdf/" + file.slice(0, file.lastIndexOf("/")),
{
recursive: true,
},
);
const p = new Deno.Command("java", {
args: [
"-jar",
path.fromFileUrl(import.meta.resolve("./saxon-he-10.8.jar")),
`-s:${file}`,
`-o:${config.workDir}/tmprdf/${file.slice(0, -4)}.rdf`,
`-xsl:${path.fromFileUrl(import.meta.resolve("./gg2rdf.xslt"))}`,
],
cwd: config.workDir + "/repo/source",
});
const { success, stdout, stderr } = await p.output();
if (!success) {
log("saxon failed:");
} else {
log("saxon succesful:");
}
log("STDOUT:");
log(new TextDecoder().decode(stdout));
log("STDERR:");
log(new TextDecoder().decode(stderr));
if (!success) {
throw new Error("Saxon failed");
}
}
}

// convert modified files to ttl
for (const file of modified) {
if (file.endsWith(".xml")) {
Deno.mkdirSync(
@@ -97,45 +63,16 @@ async function run() {
recursive: true,
},
);
const p = new Deno.Command("rapper", {
args: [
"-e",
"-w",
"-q",
`${file.slice(0, -4)}.rdf`,
"--output",
"turtle",
],
cwd: config.workDir + "/tmprdf",
stdin: "piped",
stdout: "piped",
stderr: "piped",
});
const child = p.spawn();

// open a file and pipe the subprocess output to it.
child.stdout.pipeTo(
Deno.openSync(`${config.workDir}/tmpttl/${file.slice(0, -4)}.ttl`, {
write: true,
create: true,
}).writable,
);

child.stderr.pipeTo(
Deno.openSync(path.join(jobStatus.dir, "log.txt"), {
append: true,
write: true,
create: true,
}).writable,
);

// manually close stdin
child.stdin.close();

const status = await child.status;
if (!status.success) {
log(`Rapper failed on ${file.slice(0, -4)}.rdf`);
throw new Error("Rapper failed");
try {
gg2rdf(
`${config.workDir}/repo/source/${file}`,
`${config.workDir}/tmpttl/${file.slice(0, -4)}.ttl`,
);
log("gg2rdf successful");
} catch (error) {
log("gg2rdf failed:");
log(error);
throw new Error("gg2rdf failed");
}
}
}
@@ -169,7 +106,9 @@ async function run() {
try {
Deno.removeSync(ttlFile);
} catch (e) {
log(`Failed to remove file ${ttlFile}. Possbly the xml file was removed before it was trnsformed. \n${e}`);
log(
`Failed to remove file ${ttlFile}. Possibly the xml file was removed before it was transformed. \n${e}`,
);
}
}
}
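The change to `src/action_worker.ts` replaces two subprocess steps (Saxon to produce RDF/XML, then `rapper` to produce Turtle) with one in-process call per modified `.xml` file. The sketch below restates that loop in isolation, under some assumptions: `sourcePathFor`, `ttlPathFor`, `convertModified` and the `Converter` type are hypothetical names, the converter is passed in as a parameter so the sketch runs without the real `gg2rdf`, and the `Deno.mkdirSync` call for the output directory is left out.

```typescript
// Hypothetical restatement of the per-file conversion loop. The path layout
// (repo/source for input, tmpttl for output) follows the diff above.
type Converter = (inputPath: string, outputPath: string) => void;

function sourcePathFor(workDir: string, file: string): string {
  return `${workDir}/repo/source/${file}`;
}

function ttlPathFor(workDir: string, file: string): string {
  // slice(0, -4) drops the ".xml" suffix before ".ttl" is appended
  return `${workDir}/tmpttl/${file.slice(0, -4)}.ttl`;
}

function convertModified(
  workDir: string,
  modified: string[],
  convert: Converter,
  log: (message: string) => void,
): string[] {
  const written: string[] = [];
  for (const file of modified) {
    if (!file.endsWith(".xml")) continue; // only GoldenGate XML is converted
    const outputPath = ttlPathFor(workDir, file);
    try {
      convert(sourcePathFor(workDir, file), outputPath);
      log(`converted ${file}`);
      written.push(outputPath);
    } catch (error) {
      // Log the cause, then abort the whole job as the worker does
      log("gg2rdf failed:");
      log(String(error));
      throw new Error("gg2rdf failed");
    }
  }
  return written;
}

// Usage with a stub converter that only records its arguments.
const calls: string[] = [];
const logs: string[] = [];
const outputs = convertModified(
  "/work",
  ["data/00/A.xml", "README.md"],
  (input, output) => calls.push(`${input} => ${output}`),
  (message) => logs.push(message),
);
console.log(outputs, calls);
```

Because the converter is a plain function argument here, a failure surfaces as a thrown exception rather than as a non-zero exit status, which is why the subprocess stdout/stderr plumbing of the old code is no longer needed.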
9 changes: 9 additions & 0 deletions src/deps.ts
@@ -10,3 +10,12 @@ export {
} from "https://deno.land/[email protected]/http/file_server.ts";

export { existsSync } from "https://deno.land/[email protected]/fs/mod.ts";
export * as path from "https://deno.land/[email protected]/path/mod.ts";

export { parseArgs } from "https://deno.land/[email protected]/cli/parse_args.ts";

export { DOMParser } from "https://esm.sh/[email protected]/cached";

// broken somehow??
// export { Element } from "https://esm.sh/v135/[email protected]/types/interface/element.d.ts";
// export type Element = globalThis.Element