From 121b3a6b9b4534c9a183aab71e2b248c3d664086 Mon Sep 17 00:00:00 2001
From: Julius Richter
Date: Mon, 24 Jun 2024 13:50:08 +0200
Subject: [PATCH] clean up

---
index.html | 268 ++++++++++++++++++++++++++++++++---------------------
1 file changed, 162 insertions(+), 106 deletions(-)

diff --git a/index.html b/index.html
index c3ee1c8..bfd2333 100644
--- a/index.html
+++ b/index.html
@@ -67,11 +67,18 @@
}
-

EARS: An Anechoic Fullband Speech Dataset Benchmarked for Speech Enhancement and Dereverberation

+

+ EARS: An Anechoic Fullband Speech Dataset Benchmarked for Speech Enhancement and Dereverberation
+

-

Julius Richter, Yi-Chiao Wu, Steven Krenn, Simon Welker, Bunlong Lay, Shinji Watanabe, Alexander Richard, Timo Gerkmann

+

+ Julius Richter, Yi-Chiao Wu, Steven Krenn, Simon Welker, Bunlong Lay, Shinji Watanabe, Alexander Richard, Timo
+ Gerkmann
+

-

Abstract

+

+ Abstract
+

We release the EARS (Expressive Anechoic Recordings of Speech) dataset, a high-quality
@@ -84,9 +91,14 @@

Abstract

Dataset download links and automatic evaluation server can be found online.

-

EARS Dataset

+

+ EARS Dataset
+

-

The EARS dataset is characterized by its scale, diversity, and high recording quality. In Table 1, we list characteristics of the EARS dataset in comparison to other speech datasets.

+

+ The EARS dataset is characterized by its scale, diversity, and high recording quality. In Table 1, we list
+ characteristics of the EARS dataset in comparison to other speech datasets.
+

@@ -150,99 +162,106 @@

EARS Dataset

- EARS contains 100 h of anechoic speech recordings at 48 kHz from over 100 English speakers with high demographic diversity.
- The dataset spans the full range of human speech, including reading tasks in seven different reading styles, emotional reading
- and freeform speech in 22 different emotions, conversational speech, and non-verbal sounds like laughter or coughing. Reading
- tasks feature seven styles (regular, loud, whisper, fast, slow, high pitch, and low pitch). Additionally, the dataset features
- unconstrained freeform speech and speech in 22 different emotional styles. We provide transcriptions of the reading portion and
- meta-data of the speakers (gender, age, race, first language).
+ EARS contains 100 h of anechoic speech recordings at 48 kHz from over 100 English speakers with high demographic
+ diversity. The dataset spans the full range of human speech, including reading tasks in seven different reading
+ styles, emotional reading and freeform speech in 22 different emotions, conversational speech, and non-verbal sounds
+ like laughter or coughing. Reading tasks feature seven styles (regular, loud, whisper, fast, slow, high pitch, and
+ low pitch). Additionally, the dataset features unconstrained freeform speech and speech in 22 different emotional
+ styles. We provide transcriptions of the reading portion and meta-data of the speakers (gender, age, race, first
+ language).
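The file paths shown in the audio examples on this page (e.g. p002/emo_adoration_sentences.wav, p012/rainbow_05_whisper.wav) suggest that each recording's speaking style is encoded as a prefix of its filename inside a per-speaker directory. As a minimal sketch under that assumption (the directory layout and naming are inferred from the examples here, not from an official spec), recordings could be grouped by style like this:

```python
from collections import defaultdict
from pathlib import Path


def group_by_style(dataset_root: str) -> dict:
    """Group EARS recordings by the style prefix encoded in each filename,
    e.g. 'emo', 'rainbow', 'freeform', 'vegetative'.

    Assumes the p<speaker>/<style>_... .wav layout seen in the audio
    examples on this page (hypothetical; check the dataset docs).
    """
    groups = defaultdict(list)
    for wav in sorted(Path(dataset_root).glob("p*/*.wav")):
        # The first underscore-separated token of the stem names the style.
        style = wav.stem.split("_")[0]
        groups[style].append(wav)
    return dict(groups)
```

This keeps one pass over the tree and works regardless of how many speakers or styles are present.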

-

Audio Examples

+

+ Audio Examples
+

-

Here we present a few audio examples from the EARS dataset.

+

+ Here we present a few audio examples from the EARS dataset.
+

-p002/emo_adoration_sentences.wav
- -
+ p002/emo_adoration_sentences.wav
+ +
-p008/emo_contentment_sentences.wav
- -
+ p008/emo_contentment_sentences.wav
+ +
-p010/emo_cuteness_sentences.wav
- -
+ p010/emo_cuteness_sentences.wav
+ +
-p011/emo_anger_sentences.wav
- -
+ p011/emo_anger_sentences.wav
+ +
-p012/rainbow_05_whisper.wav
- -
+ p012/rainbow_05_whisper.wav
+ +
-p014/rainbow_04_loud.wav
- -
+ p014/rainbow_04_loud.wav
+ +
-p016/rainbow_03_regular.wav
- -
+ p016/rainbow_03_regular.wav
+ +
-p017/rainbow_08_fast.wav
- -
+ p017/rainbow_08_fast.wav
+ +
-p018/vegetative_eating.wav
- -
+ p018/vegetative_eating.wav
+ +
-p019/vegetative_yawning.wav
- -
+ p019/vegetative_yawning.wav
+ +
-p020/freeform_speech_01.wav
- + p020/freeform_speech_01.wav
+


-

Benchmarks

+

+ Benchmarks
+

The EARS dataset enables various speech processing tasks to be evaluated in a controlled and comparable way. Here, we
@@ -251,12 +270,14 @@

Benchmarks


-

EARS-WHAM

+

+ EARS-WHAM
+

- For the task of speech enhancement, we construct the EARS-WHAM dataset, which mixes speech from the EARS dataset with
- real noise recordings from the WHAM! dataset . More details can be
- found in the paper.
+ For the task of speech enhancement, we construct the EARS-WHAM dataset, which mixes speech from the EARS dataset
+ with real noise recordings from the WHAM! dataset . More details can
+ be found in the paper.
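Mixing clean speech with recorded noise at a controlled signal-to-noise ratio is the core operation behind a dataset like EARS-WHAM. The sketch below shows the generic recipe only; the actual EARS-WHAM construction (SNR range, normalization, train/test splits) is specified in the paper, and the function name and details here are illustrative assumptions:

```python
import numpy as np


def mix_at_snr(speech: np.ndarray, noise: np.ndarray, snr_db: float):
    """Mix a clean speech signal with a noise signal at a target SNR in dB.

    Generic sketch, not the official EARS-WHAM recipe: the noise is
    looped/trimmed to the speech length and scaled so that
    10*log10(P_speech / P_noise) equals snr_db.
    """
    # Loop the noise if it is shorter than the speech, then trim.
    if len(noise) < len(speech):
        reps = int(np.ceil(len(speech) / len(noise)))
        noise = np.tile(noise, reps)
    noise = noise[: len(speech)]

    # Scale the noise to reach the target SNR.
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2) + 1e-12  # guard against silence
    scale = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))

    scaled_noise = scale * noise
    return speech + scaled_noise, scaled_noise
```

Returning the scaled noise alongside the mixture makes it easy to verify the realized SNR or to reuse the same noise instance for paired clean/noisy training examples.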

Results

@@ -264,8 +285,8 @@

Results

@@ -322,9 +343,18 @@

Results

- Table 2: Results on EARS-WHAM. Values indicate the mean of the metrics over the test set. The
- best results are highlighted in bold.
+ Table 2: Results on EARS-WHAM. Values indicate the mean of the metrics over the test set.
+ The best results are highlighted in bold.
-

Audio Examples

+

+ Audio Examples
+

-

Here we present audio examples for the speech enhancement task. Below we show the noisy input, processed files for Conv-TasNet , CDiffuSE , Demucs , SGMSE+ , and the clean ground truth.

+

+ Here we present audio examples for the speech enhancement task. Below we show the noisy input, processed files for
+ Conv-TasNet ,
+ CDiffuSE ,
+ Demucs ,
+ SGMSE+ ,
+ and the clean ground truth.
+

Select an audio file: