Commit 040dc37

angelhof, dkarnikis, tammam1998, nvasilakis, and mgree authored
Fixes for OSDI 22 Artifact Evaluation (#561)
* Clean up and add comments about structure in README
* nit
* More notes in the readme
* Remove dev checks
  Signed-off-by: Dimitris Karnikis <[email protected]>
* Run the evaluation tests only once
  Signed-off-by: Dimitris Karnikis <[email protected]>
* link fix
  Signed-off-by: Dimitris Karnikis <[email protected]>
* Readme
  Signed-off-by: Dimitris Karnikis <[email protected]>
* fix install dep
  Signed-off-by: Dimitris Karnikis <[email protected]>
* fix dep installation
  Signed-off-by: Dimitris Karnikis <[email protected]>
* fixes to plot generation
  Signed-off-by: Dimitris Karnikis <[email protected]>
* switch to correct input
  Signed-off-by: Dimitris Karnikis <[email protected]>
* Fix bar colors and remove AOT from plots
  Signed-off-by: Dimitris Karnikis <[email protected]>
* Add fig5,6,7
  Signed-off-by: Dimitris Karnikis <[email protected]>
* Fix input setups for full size inputs
  Signed-off-by: Dimitris Karnikis <[email protected]>
* Add option to run full size input
  Signed-off-by: Dimitris Karnikis <[email protected]>
* Fix input flags
  Signed-off-by: Dimitris Karnikis <[email protected]>
* Evaluation text/logs
  Signed-off-by: Dimitris Karnikis <[email protected]>
* Fix broken link
  Signed-off-by: Dimitris Karnikis <[email protected]>
* added links and descriptions to README components section
* more links and descriptions to components
* fix typos
* fix missed link
* fix wrong link
* fix broken links
* fix typos
* more portable links to specific lines of code
* update commutativty link to include line of code
* more links and replace here
* Add graphviz dep
  Signed-off-by: Dimitris Karnikis <[email protected]>
* Fixes to quickcheck
  Signed-off-by: Dimitris Karnikis <[email protected]>
* Posix documentation
  Signed-off-by: Dimitris Karnikis <[email protected]>
* Quickcheck. Graphviz and POSIX
  Signed-off-by: Dimitris Karnikis <[email protected]>
* README checkpoint
* README checkpoint
* Checkpoint on README
* Nits
* Add Spin instructions
* ssh instructions. Benchmark links
  Signed-off-by: Dimitris Karnikis <[email protected]>
* Fixes
  Signed-off-by: Dimitris Karnikis <[email protected]>
* Update README.md
* Update README.md
* Update README.md
* Fix broken links
  Signed-off-by: Dimitris Karnikis <[email protected]>
* Text fixes
  Signed-off-by: Dimitris Karnikis <[email protected]>
* Link fixes
  Signed-off-by: Dimitris Karnikis <[email protected]>
* typo
  Signed-off-by: Dimitris Karnikis <[email protected]>
* Text fixes
  Signed-off-by: Dimitris Karnikis <[email protected]>
* typo fix
* new directory + link fixes
  Signed-off-by: Dimitris Karnikis <[email protected]>
* quickcheck
  Signed-off-by: Dimitris Karnikis <[email protected]>
* tweaks
* ref
* clarify 'smaller'
* wordsmith
* small fixes to instructions
* Silence find error output
  Signed-off-by: Dimitris Karnikis <[email protected]>
* small note about pkill errors
* Nit
* Antikythera address typo
* Try to fix idempotence of unix50
* Fix unix50 input setup
* Update setup for web-index to actually download the 1000.txt index
* Correct the web-index input downloading
* Fix a bug in setup when installing pash in docker
* Fine if we don't get the big indices
* Fix an idempotence issue when downloading input for web-index
* Correctly install npm dependencies...
* Add two more necessary packages for samtools
* Add instructions for docker container on osdi22-ae branch
* fix cleanup script
* Fix plotting script
* remove some prints
* Fix bio input index
* Clean up
  Signed-off-by: Konstantinos Kallas <[email protected]>

Co-authored-by: Dimitris Karnikis <[email protected]>
Co-authored-by: Tammam Mustafa <[email protected]>
Co-authored-by: Nikos Vasilakis <[email protected]>
Co-authored-by: Michael Greenberg <[email protected]>
1 parent 9f18496 commit 040dc37

26 files changed, +1493 -169 lines changed

README.md

+2 -1
@@ -3,6 +3,7 @@
 > _A system for parallelizing POSIX shell scripts._
 > _Hosted by the [Linux Foundation](https://linuxfoundation.org/press-release/linux-foundation-to-host-the-pash-project-accelerating-shell-scripting-with-automated-parallelization-for-industrial-use-cases/)._
 
+
 | Service | Main | Develop |
 | :--- | :----: | :----: |
 | Tests | [![Tests](https://github.com/binpash/pash/actions/workflows/tests.yaml/badge.svg?branch=main&event=push)](https://github.com/binpash/pash/actions/workflows/tests.yaml?query=branch%3Amain++) | [![Tests](https://github.com/binpash/pash/actions/workflows/tests.yaml/badge.svg?branch=future&event=push)](https://github.com/binpash/pash/actions/workflows/tests.yaml?query=branch%3Afuture++) |
@@ -45,7 +46,7 @@ This repo hosts the core `pash` development. The structure is as follows:
 * [compiler](./compiler): Shell-dataflow translations and associated parallelization transformations.
 * [docs](./docs): Design documents, tutorials, installation instructions, etc.
 * [evaluation](./evaluation): Shell pipelines and example [scripts](./evaluation/other/more-scripts) used for the evaluation.
-* [runtime](./runtime): Runtime component — e.g., `eager`, `split`, and assocaited combiners.
+* [runtime](./runtime): Runtime component — e.g., `eager`, `split`, and associated combiners.
 * [scripts](./scripts): Scripts related to continuous integration, deployment, and testing.
 
 ## Community & More

docs/README.md

+1 -1
@@ -5,7 +5,7 @@ Quick Jump: [using pash](#using-pash) | [videos](#videos--video-presentations) |
 
 The following resources offer overviews of important PaSh components.
 
-* Short tutorial: [introduction](./tutorial#introduction), [installation](./tutorial#installation), [execution](./tutorial#running-scripts), and [next steps](./tutorial#what-next)
+* Short tutorial: [introduction](./tutorial#introduction), [installation](./install#installation), [execution](./tutorial#running-scripts), and [next steps](./tutorial#what-next)
 * Annotations: [parallelizability](../annotations#main-parallelizability-classes), [study](../annotations#parallelizability-study-of-commands-in-gnu--posix), [example 1](../annotations#a-simple-example-chmod), [example 2](../annotations#another-example-cut), [howto](../annotations#how-to-annotate-a-command)
 * Compiler: [intro](../compiler#introduction), [overview](../compiler#compiler-overview), [details](../compiler#zooming-into-fragments), [earlier versions](../compiler#earlier-versions)
 * Runtime: [split](../runtime#stream-splitting), [eager](../runtime#eager-stream-polling), [cleanup](../runtime#cleanup-logic), [aggregate](../runtime#aggregators)

evaluation/benchmarks/dependency_untangling/input/install-deps.sh

+2 -3
@@ -1,7 +1,7 @@
 IN=$PASH_TOP/evaluation/benchmarks/dependency_untangling/input/
 mkdir -p ${IN}/deps/
 # install dependencies
-pkgs='ffmpeg unrtf imagemagick libarchive-tools zstd liblzma-dev libbz2-dev zip unzip nodejs tcpdump'
+pkgs='ffmpeg unrtf imagemagick libarchive-tools libncurses5-dev libncursesw5-dev zstd liblzma-dev libbz2-dev zip unzip nodejs tcpdump'
 
 if ! dpkg -s $pkgs >/dev/null 2>&1 ; then
 sudo apt-get install $pkgs -y
@@ -25,10 +25,9 @@ if [ ! -d ${IN}/deps/samtools-1.7 ]; then
 echo 'Samtools installed'
 fi
 
-if [ ! -f ${IN}/deps/makedeb.deb ]; then
+if ! dpkg -s "makedeb-makepkg" >/dev/null 2>&1 ; then
 cd ${IN}/deps/
 wget http://pac-n4.csail.mit.edu:81/pash_data/makedeb.deb
 sudo dpkg -i makedeb.deb
 echo 'Makedeb installed'
 fi
-
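
Both guards in this file now ask `dpkg -s` whether a package is installed before doing any work. A minimal sketch of the same idea, checking packages one at a time so the missing ones can be listed. The function name `missing_pkgs` and the `PKG_CHECK` override are hypothetical and are not part of the commit; the override exists only so the sketch can run on a machine without `dpkg`.

```shell
# Hypothetical helper, not part of the commit.
# PKG_CHECK is an assumption of this sketch: a command that exits 0 when the
# package named by its argument is installed. It defaults to `dpkg -s`.
PKG_CHECK=${PKG_CHECK:-dpkg -s}

# Print, one per line, the packages from "$@" that the check reports missing.
missing_pkgs() {
    for pkg in "$@"; do
        # $PKG_CHECK is left unquoted so "dpkg -s" splits into command and flag.
        if ! $PKG_CHECK "$pkg" >/dev/null 2>&1; then
            printf '%s\n' "$pkg"
        fi
    done
}
```

An install step could then run only when `missing_pkgs $pkgs` prints something, and could pass just that list to the package manager.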

evaluation/benchmarks/dependency_untangling/input/setup.sh

+11 -10
@@ -96,17 +96,18 @@ setup_dataset() {
 wget http://pac-n4.csail.mit.edu:81/pash_data/100G.txt
 # download the Genome loc file
 wget http://pac-n4.csail.mit.edu:81/pash_data/Gene_locs.txt
-# start downloading the real dataset
+# start downloading the real dataset
+IN_NAME=$PASH_TOP/evaluation/benchmarks/dependency_untangling/input/bio/100G.txt
 cat ${IN_NAME} |while read s_line;
-do
-echo ${IN_NAME}
-sample=$(echo $s_line |cut -d " " -f 2);
-if [[ ! -f $sample ]]; then
-pop=$(echo $s_line |cut -f 1 -d " ");
-link=$(echo $s_line |cut -f 3 -d " ");
-wget -O "$sample".bam "$link"; ##this part can be adjusted maybe
-fi
-done;
+do
+echo ${IN_NAME}
+sample=$(echo $s_line |cut -d " " -f 2);
+if [[ ! -f $sample ]]; then
+pop=$(echo $s_line |cut -f 1 -d " ");
+link=$(echo $s_line |cut -f 3 -d " ");
+wget -O "$sample".bam "$link"; ##this part can be adjusted maybe
+fi
+done;
 fi
 echo "Genome data downloaded"
 fi
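
The loop in this hunk reads each line of the sample list and cuts out three space-separated fields. A minimal sketch of the same loop that lets `read` split the fields directly, assuming the list format is `pop sample link`. Two things here are assumptions and not taken from the commit: `fetch_cmd` is a hypothetical stand-in that only prints what it would download, and the existence check uses `$sample.bam` (the file the hunk writes), whereas the hunk tests `$sample`. Note also that `read` splits on runs of whitespace, while `cut -d " "` splits on single spaces.

```shell
# Hypothetical stand-in for the download step. It only prints its plan so the
# sketch runs offline; a real version might call: wget -O "$1" "$2"
fetch_cmd() {
    printf 'fetch %s -> %s\n' "$2" "$1"
}

# Walk a "pop sample link" list and fetch every sample that is not yet present.
download_samples() {
    list_file=$1
    # read splits each line on whitespace; pop is read but unused here
    while read -r pop sample link; do
        # skip blank lines
        [ -n "$sample" ] || continue
        # skip samples whose .bam file already exists in the current directory
        if [ ! -f "$sample.bam" ]; then
            fetch_cmd "$sample.bam" "$link"
        fi
    done < "$list_file"
}
```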

evaluation/benchmarks/nlp/input/setup.sh

+1 -1
@@ -18,7 +18,7 @@ setup_dataset() {
 if [ ! -e ./pg ]; then
 mkdir pg
 cd pg
-if [[ "$1" == "--gen-full" ]]; then
+if [[ "$1" == "--full" ]]; then
 echo 'N.b.: download/extraction will take about 10min'
 wget ndr.md/data/pg.tar.xz
 if [ $? -ne 0 ]; then

evaluation/benchmarks/run.par.sh

+1 -1
@@ -231,7 +231,7 @@ nlp_pash(){
 
 cd nlp/
 
-install_deps_source_setup
+install_deps_source_setup $1
 
 mkdir -p "$outputs_dir"
 mkdir -p "$pash_logs_dir"

evaluation/benchmarks/run.seq.sh

+1 -2
@@ -192,7 +192,7 @@ nlp(){
 
 cd nlp/
 
-install_deps_source_setup
+install_deps_source_setup $1
 
 mkdir -p "$outputs_dir"
 
@@ -244,7 +244,6 @@ nlp(){
 cd ..
 }
 
-
 aliases(){
 seq_times_file="seq.res"
 outputs_suffix="seq.out"
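
The `nlp` runners now pass their first argument on to `install_deps_source_setup`, and the nlp setup script matches `--full` where it used to match `--gen-full`. A minimal sketch of that flag forwarding, using hypothetical function names (`setup_dataset_sketch`, `run_benchmark_sketch`) and hypothetical size labels. Unlike the hunks, which forward an unquoted `$1`, the sketch forwards `"$@"`, which keeps each argument intact and expands to nothing when the runner is called without a flag.

```shell
# Hypothetical stand-in for a setup step: maps the flag to an input size label.
setup_dataset_sketch() {
    case "${1:-}" in
        --full)  echo "full" ;;
        --small) echo "small" ;;
        *)       echo "default" ;;
    esac
}

# Hypothetical runner: forwards its own arguments to the setup step unchanged.
run_benchmark_sketch() {
    setup_dataset_sketch "$@"
}
```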

evaluation/benchmarks/unix50/input/setup.sh

+6 -4
@@ -34,10 +34,12 @@ setup_dataset() {
 
 for input in ${inputs[@]}
 do
-if [ ! -f "${input}.txt" ]; then
-wget "http://ndr.md/data/unix50/${input}.txt"
-"$PASH_TOP/scripts/append_nl_if_not.sh" "${input}.txt"
-fi
+# To get idempotence
+rm -f "${input}.txt"
+#if [ ! -f "${input}.txt" ]; then
+wget "http://ndr.md/data/unix50/${input}.txt"
+"$PASH_TOP/scripts/append_nl_if_not.sh" "${input}.txt"
+#fi
 done
 
 ## FIXME: Calling this script with --full is not idempotent.
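
This hunk gets idempotence by deleting each input before downloading it again. An alternative sketch, not what the commit does: download to a temporary name and rename it over the destination, so a rerun replaces the old copy and a failed download leaves the old copy in place. `fetch_to` and `refresh_input` are hypothetical names, and `fetch_to` fakes the download so the sketch runs offline.

```shell
# Hypothetical stand-in for a download: writes a body for URL $1 to file $2.
# A real version might call: wget -O "$2" "$1"
fetch_to() {
    printf 'data for %s' "$1" > "$2"
}

# Replace the file at $2 with a fresh copy of URL $1.
refresh_input() {
    url=$1
    dest=$2
    tmp="$dest.tmp.$$"
    # Download next to the destination, then rename over it. If the download
    # fails, remove the partial file and keep whatever was there before.
    if fetch_to "$url" "$tmp"; then
        mv -f "$tmp" "$dest"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

Running `refresh_input` twice with the same arguments leaves exactly one file with the same contents, which is the property the hunk's comment is after.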

evaluation/benchmarks/web-index/input/install-deps.sh

+7 -3
@@ -14,8 +14,12 @@ if ! dpkg -s pandoc > /dev/null 2>&1 ; then
 rm ./pandoc-2.2.1-1-$(dpkg --print-architecture).deb
 fi
 
+if ! dpkg -s nodejs > /dev/null 2>&1 ; then
+# node version 10+ does not need external npm
+curl -fsSL https://deb.nodesource.com/setup_10.x | sudo -E bash -
+sudo apt-get install -y nodejs
+fi
+
 if [ ! -d node_modules ]; then
-# node version 10+ does not need external npm
-curl -fsSL https://deb.nodesource.com/setup_10.x | sudo -E bash -
-npm install
+npm install
 fi

evaluation/benchmarks/web-index/input/setup.sh

+15 -11
@@ -11,25 +11,29 @@ PASH_TOP=${PASH_TOP:-$(git rev-parse --show-toplevel)}
 
 setup_dataset() {
 rm -rf ../1-grams.txt ../2-grams.txt
-if [ "$1" = "--small" ]; then
-if [[ ! -d ./en ]]; then
-# 500 entries
-wget http://pac-n4.csail.mit.edu:81/pash_data/small/web-index.small.zip
-unzip web-index.small.zip
-mv small/* .
-rm -rf small web-index.small.zip
-fi
-elif [ "$1" = "--gen-full" ]; then
+
+## Downloading the dataset needs to happen for both small and large
+if [[ ! -d ./en ]]; then
 wget $wiki_archive || eexit "cannot fetch wikipedia"
 7za x wikipedia-en-html.tar.7z
 tar -xvf wikipedia-en-html.tar
-wget http://ndr.md/data/wikipedia/index.txt || eexit "cannot fetch wiki indices"
+wget http://ndr.md/data/wikipedia/index.txt # || eexit "cannot fetch wiki indices"
+# It is actually OK if we don't have this index since we download the 500/1000 below
+fi
+
+if [ "$1" = "--small" ]; then
+# 500 entries
+wget http://pac-n4.csail.mit.edu:81/pash_data/small/web-index.small.zip
+unzip web-index.small.zip
+mv small/500.txt .
+rm -rf small web-index.small.zip
 else
+# elif [ "$1" = "--full" ]; then
 # the default full
 # 1000 entries
 wget http://pac-n4.csail.mit.edu:81/pash_data/full/web-index.full.zip
 unzip web-index.full.zip
-mv full/* .
+mv full/1000.txt .
 rm -rf full web-index.full.zip
 fi
 }
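
This hunk keeps the Wikipedia archive download fatal (`|| eexit ...`) but comments out the failure handler on the `index.txt` download, since the 500/1000-entry index is fetched afterwards anyway. A minimal sketch of that required-versus-optional distinction, using hypothetical wrappers `require` and `optional` that are not part of the commit. The sketch's `require` returns a non-zero status; the script's own `eexit` helper is not shown in the diff, so its behavior is not reproduced here.

```shell
# Hypothetical wrappers, not part of the commit.

# Run a command that must succeed; report and return non-zero if it fails.
require() {
    "$@" || { echo "required step failed: $*" >&2; return 1; }
}

# Run a command that may fail; warn on failure and carry on with status 0.
optional() {
    "$@" || echo "optional step failed, continuing: $*" >&2
}
```

With wrappers like these, the archive fetch would go through `require` and the large index fetch through `optional`, which makes the intent visible without a commented-out handler.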

evaluation/eval_script/README.md

-62
This file was deleted.