import { CodeSurfer, CodeSurferColumns, Step } from "code-surfer";
import { Appear, Notes, Invert, Split } from "mdx-deck";
import { github, vsDark } from "@code-surfer/themes";
import { Logo } from "./components";
import "prismjs/components/prism-docker";
import "prismjs/components/prism-shell-session";
export const theme = vsDark;
- Software developer with CIS here at Brown
- I work with cloud technologies and containerization
- Worked in private, government and now higher-ed
- Worked for small and large companies
- Most recently worked for the company that made the sensors in the mail room
- Here to provide a developer's perspective on reproducibility
- For all of us, not just scientists
- Yes, software reproducibility enables better science
- Also reduces the set of bugs that can occur in production software
- I am not qualified to talk about data
- I'm going to focus on software reproducibility
- Need each ingredient to get good reproducibility
- Peg libraries to one version, not a range
- Select libraries that are maintained and have a corporate sponsor
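
The "one version, not a range" note can be checked mechanically. The following is a minimal sketch, assuming a plain `requirements.txt` with one requirement per line; the helper name `unpinned_requirements` is hypothetical, and the pattern is deliberately simple (it does not handle environment markers or URL requirements).

```python
import re

# Matches "name==version", optionally with extras such as "name[extra]==1.2.3".
# Anything else (>=, ~=, wildcards, bare names) is treated as a range.
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[A-Za-z0-9._,-]+\])?==[A-Za-z0-9.!+_-]+$")


def unpinned_requirements(text):
    """Return requirement lines that are not pinned to one exact version."""
    offenders = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not PINNED.match(line):
            offenders.append(line)
    return offenders


example = """\
numpy==1.17.3
requests>=2.0
lxml
"""
print(unpinned_requirements(example))  # ['requests>=2.0', 'lxml']
```

Running a check like this before building the image catches a floating dependency before it changes the results.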
- How many have heard of containers?
- How many use containers?
- Great abstraction for capturing environment
- Still not perfect
- Two main contenders at Brown
- Singularity is used more for research
- Docker used for application development
- Going to use Docker today
- I'm more familiar with Docker
- Both are OCI compliant
- OCI = Open Container Initiative
--- Dockerfile 2019-10-25 14:21:55.000000000 -0400
+++ Dockerfile 2019-10-25 14:41:29.000000000 -0400
@@ -1,4 +1,4 @@
-FROM python
+FROM python:3.8
RUN apt-get update && apt-get install -y \
libxml2-dev \
--- Dockerfile 2019-10-25 14:21:55.000000000 -0400
+++ Dockerfile 2019-10-25 14:41:29.000000000 -0400
@@ -1,4 +1,4 @@
-FROM python:3.8
+FROM python:3.8.0-buster
RUN apt-get update && apt-get install -y \
libxml2-dev \
--- Dockerfile 2019-10-25 14:21:55.000000000 -0400
+++ Dockerfile 2019-10-25 14:52:54.000000000 -0400
@@ -1,8 +1,8 @@
FROM python:3.8.0-buster
RUN apt-get update && apt-get install -y \
- libxml2-dev \
- libsqlite3-dev
+ libxml2-dev=2.9.4+dfsg1-7+b3 \
+ libsqlite3-dev=3.27.2-3
COPY requirements.txt .
RUN pip install -r requirements.txt
--- Dockerfile 2019-10-25 22:33:13.000000000 -0400
+++ Dockerfile.2 2019-10-28 07:56:14.000000000 -0400
@@ -15,5 +15,6 @@
WORKDIR /app
COPY . ./
+ENV RANDOM_SEED=1024
ENTRYPOINT ["python"]
-CMD ["my-script.py"]
+CMD ["my-script.py", "special", "arguments", "here"]
- Contrived Dockerfile
- Builds our image, but doesn't run
- Few common gotchas when making new images
- Different operating systems have different packages
- Alpine Linux uses a different libc (musl instead of glibc)
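
The `RANDOM_SEED` variable set in the Dockerfile only helps if the script reads it. Here is a minimal sketch of how `my-script.py` might do that, assuming it uses Python's standard `random` module; the helper name `seeded_rng` and the fallback value are hypothetical.

```python
import os
import random


def seeded_rng(env=os.environ, default=1024):
    """Build a random generator seeded from RANDOM_SEED.

    The fallback of 1024 mirrors the value in the Dockerfile; it is an
    assumption for this sketch, not part of the original script.
    """
    seed = int(env.get("RANDOM_SEED", default))
    return random.Random(seed)


# Two generators built from the same seed produce the same sequence.
first = seeded_rng({"RANDOM_SEED": "1024"})
second = seeded_rng({"RANDOM_SEED": "1024"})
print(first.random() == second.random())  # True
```

Passing the generator around explicitly, rather than seeding the global `random` state, keeps the seed visible at every call site.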
$ docker build .
$ docker build -t science .
$ docker build -t science:1.0.7 .
- This bit is Docker specific
- Singularity container names are based on file paths/names
- Image is built and tagged with a well-specified runtime
- Time to gather results and do some science
- Papers written, science complete
- Distribute container with source to reproduce experiments
- Docker images are normally distributed through registries (hubs); `docker save` can export a tarball, but that is less common
- Singularity generates a file that can be shared
- DCT (Docker Content Trust) initialized on registry and repository
- Admin generates delegate keys for user
- User pushes image with DCT enabled
- Signing and verification automatic
- Two commands
- Sign
- Verify
- Document kernel differences
- Document hardware used to produce results
- Don't modify code at runtime
- Don't dynamically download code or libraries at runtime
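
The kernel and hardware notes above can be partly automated by recording the environment alongside the results. This is a minimal sketch using only Python's standard library; the function name `describe_environment` and the choice of fields are assumptions, and a real project would likely add GPU or memory details by other means.

```python
import json
import platform
import sys


def describe_environment():
    """Collect kernel, hardware, and interpreter details as a plain dict."""
    return {
        "system": platform.system(),            # e.g. Linux
        "kernel_release": platform.release(),   # host kernel, shared by the container
        "kernel_version": platform.version(),
        "machine": platform.machine(),          # CPU architecture
        "processor": platform.processor(),      # may be empty on some platforms
        "python": sys.version.split()[0],
    }


# Write this next to the experiment output so each result carries its context.
print(json.dumps(describe_environment(), indent=2, sort_keys=True))
```

Because a container shares the host's kernel, the kernel fields describe the machine that ran the experiment, not the image.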