analyze_project_pairs fails (divide-by-zero error) if there are no parallel verses in a corpus pair #693

Open
@mmartin9684-sil

Description

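If a configuration is mistakenly set up with a corpus pair that has no parallel verses (e.g., OT books only in the source, NT books only in the target), the analyze_project_pairs script fails with a ZeroDivisionError while computing Eflomal alignment scores. The sentence count is zero, so the iteration calculation in silnlp/alignment/eflomal.py divides by sqrt(0).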

2025-03-28 14:34:41,308 - silnlp.nmt.analyze_project_pairs - INFO - Computing alignment scores using Eflomal
2025-03-28 14:34:41,310 - silnlp.alignment.eflomal - INFO - Generating alignments
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/root/.clearml/venvs-builds/3.10/task_repository/silnlp.git/silnlp/nmt/analyze_project_pairs.py", line 524, in <module>
    main()
  File "/root/.clearml/venvs-builds/3.10/task_repository/silnlp.git/silnlp/nmt/analyze_project_pairs.py", line 498, in main
    get_corpus_stats(config, exp_name, args.recalculate, args.deutero)
  File "/root/.clearml/venvs-builds/3.10/task_repository/silnlp.git/silnlp/nmt/analyze_project_pairs.py", line 95, in get_corpus_stats
    add_alignment_scores(corpus, aligner_id)
  File "/root/.clearml/venvs-builds/3.10/task_repository/silnlp.git/silnlp/alignment/utils.py", line 86, in add_alignment_scores
    scores = compute_alignment_scores(src_path, trg_path, aligner_id)
  File "/root/.clearml/venvs-builds/3.10/task_repository/silnlp.git/silnlp/alignment/utils.py", line 104, in compute_alignment_scores
    aligner.train(src_tok_output_path, trg_tok_output_path)
  File "/root/.clearml/venvs-builds/3.10/task_repository/silnlp.git/silnlp/alignment/eflomal.py", line 49, in train
    iters = max(2, int(round(1.0 * 5000 / sqrt(n_sentences))))
ZeroDivisionError: float division by zero
2025-03-28 10:34:59
Process failed, exit code 1
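
One possible guard, sketched under assumptions: the only line taken from the codebase is the iteration formula shown in the traceback (eflomal.py, line 49). The function name `eflomal_iterations` and the error message are hypothetical, and how `n_sentences` is obtained inside `train` is not shown in this issue.

```python
from math import sqrt


def eflomal_iterations(n_sentences: int) -> int:
    """Return the iteration count using the formula from the traceback.

    Hypothetical helper: fails early with a descriptive error instead of
    letting sqrt(0) trigger a ZeroDivisionError.
    """
    if n_sentences <= 0:
        raise ValueError(
            "No parallel verses found for this corpus pair; "
            "check that the source and target cover overlapping books."
        )
    # Same expression as eflomal.py line 49 in the traceback above.
    return max(2, int(round(1.0 * 5000 / sqrt(n_sentences))))


if __name__ == "__main__":
    print(eflomal_iterations(10000))
    try:
        eflomal_iterations(0)
    except ValueError as err:
        print(f"Skipped: {err}")
```

Raising a descriptive error is only one option. analyze_project_pairs could instead log a warning and skip alignment scoring for the empty pair so the remaining pairs are still analyzed.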

Metadata

Assignees

No one assigned

    Labels

    bug (Something isn't working)
    pipeline 3: preprocess (Issue related to preprocessing.)

    Projects

    Status

    🔖 Ready

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
