Sanskrit text preparation
sujato edited this page Jul 23, 2021 · 15 revisions
Here is an outline of the steps for preparing a Sanskrit text for translation on Bilara.
- Select a source text.
- Let’s assume our text is the Candrasūtra.
- If the text is already on SC, identify it by its project and UID.
- project = sf, UID = sf276
- If it is not on SC, assign a project and UID.
- Add the folder named with the SC UID to the appropriate project in publication-sources: bilara-data/.publication-sources/sf/sf276
- Copy the source file or files to the folder.
- Keep the original file name: sa_candrasUtra.xml
- Make an HTML file from a local copy of the text.
- Delete all front and end matter, including metadata etc.
- Ensure the HTML file is well-structured with appropriate heading and <p> tags. Occasionally other semantic tags such as lists might be used. Ensure each text is wrapped in <article id='uid'>, and each <h1> is wrapped in <header>.
- Add paragraph numbers of the form <p id='sf276:1'>. Remember, headings take zeroth level.
- Paragraph increments are usually added to <hX>, <p>, <ul>, <ol>, <dl>. However, do not be rigid about this, especially to keep consistency with source text.
- If data other than regular text content is present, make sure it is wrapped as <span class='reference'>, <span class='comment'>, <span class='variant'>, etc.
- Check that any other HTML in the file is well-formed and consistent with SC standards and usages.
- Make sure all HTML uses 'single quotes'.
- Create segments.
- Typically, use punctuation as the basis, then refine it by an initial reading of the text. It is much more efficient to get the segmenting right now than fix it later!
- Wrap segments in <span class='root'> (and <span class='translation'> if there is one).
- Run the following command and fix any errors: tidy --doctype html5 --output-html 1 --tidy-mark 0 --quiet 1 --output-encoding utf8 -w 0 --show-warnings 0 -m *.html
This will produce an HTML file something like the following.
<!DOCTYPE html>
<html>
<head>
<title></title>
</head>
<body>
<article id='sf276'>
<h1 id='sf276:0'><span class='root'>Candrasūtra</span></h1>
<p id='sf276:1'><span class='root'>evaṃ mayā śrutam</span> <span class='root'>ekasama<i>yaṃ bhagavāñ</i> śrāvastyāṃ viharati jet<i>a</i>v<i>a</i>n<i>a</i> anāthapiṇḍad<i>ā</i>r<i>ā</i>m<i>e /</i></span></p>
<p id='sf276:2'><span class='root'>tena khalu samayena rāhuṇā asurendreṇa sarvaṃ candramaṇḍalam āvṛtam*</span> <i>/</i></p>
<p id='sf276:3'><span class='root'><i>atha</i> yā devatā tasmiṃ<i>ś</i> candramaṇḍala adhyuṣitā sā bhītā trast<i>ā</i> saṃvignā āhṛṣṭaromakūpā yena bhagavāṃs teno<i>pajagāma /</i> upetya bha<i>ga</i>v<i>a</i>tpādau śirasā <i>vanditvaikāṃ</i>te ‘sthād ekāntasthitā sā devatā tasyāṃ velāyāṃ gāthā babhāṣe //</span></p>
<p id='sf276:4'><span class='root'>buddhavīra namas te ‘stu vipramuktāya sarvataḥ<span class='comment'>Ed. bhitā but MS reads bhītā</span></span> <span class='root'>saṃbādhapratipannāsmi tasya me śaraṇaṃ bhava :<span class='comment'>Ed. buddha vīra</span></span></p>
<blockquote class='gatha'>
<p id='sf276:5'><span class='verse-line'><span class='root'>arhantaṃ sugataṃ loke candramāḥ śaraṇaṃ gataḥ</span></span> <span class='verse-line'><span class='root'>rāhoś candramasaṃ muñca buddhā lokānukampakāḥ //</span></span></p>
</blockquote>
<p id='sf276:6'><span class='root'>bhagavān āha //</span></p>
<p><span class='root'>tamonudaṃ taṃ nabhasi prabhākaraṃ virocanaṃ śukla<i>v</i>iśuddhavarcasam*</span> <span class='root'>rāho ś<i>a</i>śāṅkaṃ grasa māntarīkṣe praj<i>ā</i>pr<i>a</i>dīpaṃ drutam utsṛjainam* //</span></p>
<p id='sf276:7'><span class='root'>atha rāhuṇā as<i>u</i>rendreṇa tvaritatvaritaṃ candramaṇḍalam utsṛṣṭam* ⟨/⟩</span> <span class='root'>tataḥ sa<i>ṃ</i>tvaramāṇo ‘sau rāhuś candram avāsṛ<i>jat*</i></span> <span class='root'><i>saṃsvinnagātro vya</i>thitaḥ saṃbhr<i>ānta āturo ya</i>thā //</span></p>
<p id='sf276:8'><span class='root'>adrākṣīd baḍir vairocano <i>rāhuṇā</i> asurendreṇa tvaritatvaritaṃ candr<i>a</i>maṇḍala<i>m utsṛṣṭam* / dṛṣṭvā ca baḍi</i>r gāthāṃ babhāṣe //</span></p>
<p id='sf276:9'><span class='root'>ki<i>ṃ</i> nu sa<i>ṃ</i>tv<i>aramāṇas</i> tv<i>aṃ</i> rāhuś candraṃ vimuñcasi ·</span> <span class='root'>saṃsvinnagātro vyathitaḥ saṃ<i>bhrānta āturo yathā</i> <i>//</i><span class='comment'>Cf. Pelliot Sanskrit bleu 449 Ac: /// ro yathā //</span></span></p>
<p id='sf276:10'><span class='root'><i>rāhur avocat* //</i></span></p>
<p><span class='root'><i>sa</i>ptadhā me sphalen mūrdhā <i>jīvan na sukha</i>m āp<i>nu</i>yāṃ</span> <span class='root'>ta<i>tra buddh</i>ābhigītena muñceyaṃ śaśinaṃ na cet*<span class='comment'>Cf. Pelliot Sanskrit bleu 449 Ac: rāhu prāha // saptadhā me sphal[e] mūrdhā</span></span></p>
<p id='sf276:11'><span class='root'><i>baḍir vairocano ‘vocat* /</i></span> <span class='root'>x x x x x - - - x x x x madarśi<i>nāṃ</i></span> <span class='root'><i>teṣāṃ gāthābhigītena rāhuś candraṃ vimuñcati //</i></span><span class='comment'>Cf. Pelliot Sanskrit bleu 449 Ad: + + + + + .. .. .. .. .. .. .. (bh)i(g)itena muñce</span></p>
<p id='sf276:12'><span class='root'><i>candrasūtraṃ samāptam* //</i></span></p>
</article>
</body>
</html>
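Several of the checks in the outline above can be partly automated. The following is a minimal sketch, not part of the Bilara tooling: the names `PreparedTextChecker` and `check_prepared_html` are hypothetical, and it assumes single-level paragraph numbers counting up from 0 as in the sample above. It reports double-quoted attributes, a missing or mismatched `<article>` id, paragraph ids that are malformed or out of order, and text that sits outside any `<span>`. The double-quote test is a rough regular-expression heuristic, and since the outline says not to be rigid about numbering, treat the results as hints rather than errors.

```python
import re
from html.parser import HTMLParser

# Elements that usually carry a paragraph number, per the outline above.
NUMBERED_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6", "p", "ul", "ol", "dl"}


class PreparedTextChecker(HTMLParser):
    """Collects article ids, paragraph ids, and text found outside any span."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.article_ids = []
        self.paragraph_ids = []
        self.span_stack = []   # class attribute of each currently open <span>
        self.in_block = 0      # nesting depth inside numbered elements
        self.unsegmented = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "article":
            self.article_ids.append(attrs.get("id"))
        elif tag in NUMBERED_TAGS:
            self.in_block += 1
            if "id" in attrs:
                self.paragraph_ids.append(attrs["id"])
        elif tag == "span":
            self.span_stack.append(attrs.get("class"))

    def handle_endtag(self, tag):
        if tag in NUMBERED_TAGS and self.in_block:
            self.in_block -= 1
        elif tag == "span" and self.span_stack:
            self.span_stack.pop()

    def handle_data(self, data):
        text = data.strip()
        # Text inside a numbered element but outside every span is unsegmented.
        if text and self.in_block and not self.span_stack:
            self.unsegmented.append(text)


def check_prepared_html(html, uid):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    # Heuristic: an attribute value opened with a double quote inside a tag.
    if re.search(r"<[^>]*=\s*\"", html):
        problems.append("double-quoted attribute found")
    parser = PreparedTextChecker()
    parser.feed(html)
    parser.close()
    if parser.article_ids != [uid]:
        problems.append(f"expected one <article id='{uid}'>, found {parser.article_ids}")
    numbers = []
    for pid in parser.paragraph_ids:
        match = re.fullmatch(re.escape(uid) + r":(\d+)", pid)
        if match:
            numbers.append(int(match.group(1)))
        else:
            problems.append(f"unexpected id: {pid}")
    # Assumption: ids run 0, 1, 2, ... with the heading taking 0.
    if numbers != list(range(len(numbers))):
        problems.append(f"ids are not 0, 1, 2, ... in order: {numbers}")
    for text in parser.unsegmented:
        problems.append(f"text outside any span: {text!r}")
    return problems


sample = (
    "<article id='sf276'>"
    "<h1 id='sf276:0'><span class='root'>Candrasūtra</span></h1>"
    "<p id='sf276:1'><span class='root'>evaṃ mayā śrutam</span></p>"
    "</article>"
)
print(check_prepared_html(sample, "sf276"))  # prints []
```

Run against the full sample above, a checker like this would flag the stray `/` that follows the root span in `sf276:2`, which is the kind of leftover that is easier to fix before segmenting is finalized.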