<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no" />
<title>reveal-md</title>
<link rel="stylesheet" href="./dist/reveal.css" />
<link rel="stylesheet" href="./dist/theme/white.css" id="theme" />
<link rel="stylesheet" href="./css/highlight/zenburn.css" />
</head>
<body>
<div class="reveal">
<div class="slides"><section ><section data-markdown><script type="text/template"># 2020-07-01 Software Carpentry Workshop: git
</script></section><section data-markdown><script type="text/template">
## Let me tell you a story...
</script></section><section data-markdown><script type="text/template">
### Once upon a time...
* Wolfman and Dracula have been hired to plan a Mars mission (obviously)
* Wolfman and Dracula live on different continents
* They work on the same plan at the same time
* How to manage this?
* take turns on each file?
* email copies?
* The solution? **Version Control**
</script></section><section data-markdown><script type="text/template">
### Advantages of version control
* Nothing that is committed is *ever* lost (unless you try…)
* We can record who made which changes, and when
* We can revert to previous versions.
* We can identify and correct conflicts
The lab notebook of code development.
</script></section></section><section ><section data-markdown><script type="text/template">
## Version control with `git`
</script></section><section data-markdown><script type="text/template">
### What lies ahead…?
![XKCD comic - A: This is git. It tracks collaborative work on projects through a beautiful distributed graph theory tree model; B: Cool. How do we use it?; A: No idea. Just memorise these shell commands and type them to sync up. If you get errors, save your work elsewhere, delete the project, and download a fresh copy.](img/git.png)
</script></section><section data-markdown><script type="text/template">
### Learning objectives
* Understand the basics of automated version control
* Understand the basics of `git`
</script></section><section data-markdown><script type="text/template">
### Do you recognise this?
(i.e. have you *ever* worked on the same document as someone else?)
![PhD comic: "Final".doc - files are called: Final.doc, Final_rev.2.doc, FINAL_rev.6.COMMENTS.doc, FINAL_rev.8.comments5.CORRECTIONS.doc, FINAL_rev.18.comment7.corrections9.MORE.30.doc, FINAL_rev.22.comments49.corrections.10.WHYDIDICOMETOGRADSCHOOL.doc](img/phd101212s_small.gif)
</script></section><section data-markdown><script type="text/template">
### How version control works
* Version control is like a 'recording' of history
![Three documents. The first has two paragraphs. The second has a modified paragraph. The third has an additional paragraph](img/play-changes.png)
* Rewind and play back changes
</script></section><section data-markdown><script type="text/template">
### Multiple editors (branching)
* Two people work on a document
* Each makes their changes: docs diverge
![Three documents. On the left is the original, and on the right are two versions of this with different, and conflicting, changes](img/versions.png)
* Changes are separate from the document
</script></section><section data-markdown><script type="text/template">
### Combining changes (merging)
* Several changes can be merged onto the same base document
* 'Merging'
![Three documents. On the left are the two modified documents from the previous slide. On the right is a single document that incorporates both of those changes](img/merge.png)
</script></section><section data-markdown><script type="text/template">
### What version control systems do
* Version control systems manage this process
* track changes
* store metadata (who, when)
* record 'versions' (a.k.a. *commits*)
* give you access to any of those versions
`git` is a version control system.
</script></section></section><section ><section data-markdown><script type="text/template">
## Setting up `git`
</script></section><section data-markdown><script type="text/template">
### Learning objectives
* Configure `git` for first use on a computer
* Understand `git config --global`
</script></section><section data-markdown><script type="text/template">
### Setting global options
* `git` needs to know who you are for metadata
* `git` wants your preferences for display/editing
**Live Presentation**
```
git config --global user.name "Vlad Dracul"
git config --global user.email "[email protected]"
git config --global color.ui "auto"
git config --global core.editor "nano -w"
```
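A minimal sketch of reading these settings back. The throwaway `HOME` is our addition, assumed here so that the example leaves your real `~/.gitconfig` untouched:

```shell
# Use a throwaway HOME so this sketch does not touch your real ~/.gitconfig
export HOME="$(mktemp -d)"
unset XDG_CONFIG_HOME   # make sure no other global config location is picked up

git config --global user.name "Vlad Dracul"
git config --global color.ui "auto"

# Read one value back, then list everything set at the global level
name="$(git config --global --get user.name)"
echo "user.name is: $name"
git config --global --list
```

On your own machine, `git config --global --list` alone is enough to check what `git` knows about you.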
</script></section></section><section ><section data-markdown><script type="text/template">
## Creating a repository
</script></section><section data-markdown><script type="text/template">
### Learning objectives
* Create a local `git` repository
* What is in a repository?
* files
* commits
* metadata
</script></section><section data-markdown><script type="text/template">
### Creating a `git` repository
* A fictional project about planets
* (Wolfman and Dracula…)
**Live Presentation**
```
git init
git status
```
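A short sketch of what `git init` leaves behind, run in a temporary directory (the `mktemp` location is an assumption for illustration; `planets` follows the project name used later in these slides):

```shell
# Work in a throwaway directory so nothing on your machine is changed
workdir="$(mktemp -d)"
mkdir "$workdir/planets"
cd "$workdir/planets"

git init -q     # creates the hidden .git directory that holds the repository
ls -a           # .git is now listed
git status      # reports that there are no commits yet
```

Everything `git` stores for this project lives in `.git`; deleting that directory deletes the history.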
</script></section></section><section ><section data-markdown><script type="text/template">
## Tracking changes
</script></section><section data-markdown><script type="text/template">
### Learning objectives
* Practice the modify-add-commit cycle
* Understand where information is stored in the `git` workflow
</script></section><section data-markdown><script type="text/template">
### My first untracked file
* We'll create a file, but do nothing with it
* "Is Mars suitable as a space base?"
**Live Presentation**
```
nano mars.txt
```
</script></section><section data-markdown><script type="text/template">
### My first `git` commit
* We tell `git` that it should *track* a file (watch for changes): `git add`
* We also `git commit` the file (keep a copy of the file in the *repository*, in its current state)
**Live Presentation**
```
git add mars.txt
git commit -m "start notes on Mars as a base"
git log
```
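The same steps as a self-contained sketch, showing how `git status` changes along the way. The temporary directory, the file contents, and the `vlad@example.org` address are placeholders, not part of the lesson:

```shell
repo="$(mktemp -d)"; cd "$repo"
git init -q
# Repository-local placeholder identity, so the sketch runs without global config
git config user.name "Vlad Dracul"
git config user.email "vlad@example.org"

echo "notes on Mars" > mars.txt
git status --short     # "?? mars.txt": the file is untracked
git add mars.txt
git status --short     # "A  mars.txt": the file is staged
git commit -q -m "start notes on Mars as a base"
git log --oneline      # one line per commit: short ID and message
```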
</script></section><section data-markdown><script type="text/template">
### The staging area
* We don't always want to commit all changes
* The *staging area* holds changes we want to commit
* (other files may also be changed, but we don't want to commit them)
![On the left is a modified document. On the right is a zone representing the data stored in `.git`. In that zone are two containers: a staging area, and a repository. Using `git add` places the document into the staging area. Using `git commit` moves the document from the staging area into the repository](img/git-staging-area.png)
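A sketch of the staging area in action: two files change, but only one is committed. `draft.txt` and the identity below are hypothetical, chosen for illustration:

```shell
repo="$(mktemp -d)"; cd "$repo"
git init -q
git config user.name "Vlad Dracul"
git config user.email "vlad@example.org"   # placeholder address

echo "notes on Mars" > mars.txt
echo "scratch work" > draft.txt    # hypothetical file we are not ready to commit

git add mars.txt                   # only mars.txt enters the staging area
git commit -q -m "start notes on Mars as a base"
git status --short                 # draft.txt is still listed as untracked (??)
```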
</script></section><section data-markdown><script type="text/template">
### modify-add-commit
* Now we want to add more information to the file
* Modify file
* Add file to *staging area* (`git add`)
* Commit changes
**Live Presentation**
```
nano mars.txt
git diff
git add mars.txt
git diff
git commit
```
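A sketch of why the second `git diff` above shows nothing: once a change is staged, it is `git diff --staged` that displays it. Repository location, file contents, and identity are placeholders:

```shell
repo="$(mktemp -d)"; cd "$repo"
git init -q
git config user.name "Vlad Dracul"
git config user.email "vlad@example.org"   # placeholder address
echo "notes on Mars" > mars.txt
git add mars.txt
git commit -q -m "start notes on Mars as a base"

echo "a second note" >> mars.txt    # modify
git diff                            # shows the new, unstaged line
git add mars.txt                    # add
git diff                            # prints nothing: the change is now staged
staged="$(git diff --staged --name-only)"
echo "staged: $staged"
git commit -q -m "add a second note on Mars"   # commit
```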
</script></section><section data-markdown><script type="text/template">
### Question
- Which command(s) below would save changes in `myfile.txt` to the local `git` repository?
1. `git commit -m "add recent changes"`
2. `git init myfile.txt; git commit -m "add recent changes"`
3. `git add myfile.txt; git commit -m "add recent changes"`
4. `git commit -m myfile.txt "add recent changes"`
</script></section><section data-markdown><script type="text/template">
### Challenge 1 (5min)
* Make a one-line change to `mars.txt`.
* Create file `earth.txt` with one-line comment on Earth.
* Commit both changes (*as a **single*** `commit`)
![On the left are two documents (FILE1.txt and FILE2.txt). On the right is a zone representing the `.git` directory. Arrows show the use of `git add` to place the two documents into the staging area, followed by a `git commit` to move both files simultaneously from the staging area to the repository](img/git-committing.png)
</script></section><section data-markdown><script type="text/template">
### The modify-add-commit lifecycle
![A UML-like diagram showing four potential states of a file, according to `git`: untracked, unmodified, modified, and staged. Arrows indicate the actions required to move a file from one state to another: untracked to staged, add the file; unmodified to modified, edit the file; modified to staged, stage/add the file; staged to unmodified, commit the file; unmodified to untracked, remove the file](img/lifecycle.png)
</script></section><section data-markdown><script type="text/template">
### In which I predict the future…
![XKCD comic: a list of commit messages for a repository that start well, but become progressively more like gibberish, titled "As a project drags on, my `git` commit messages get less and less informative"](img/git_commit.png)
</script></section></section><section ><section data-markdown><script type="text/template">
## Exploring history
</script></section><section data-markdown><script type="text/template">
### Is history bunk?
* How can I identify old versions of files?
* How do I review changes between commits?
* How can I recover old file versions?
</script></section><section data-markdown><script type="text/template">
### Learning objectives
* Understand what the `HEAD` of a repository is
* Identify and use `git` commit numbers
* Compare various versions of tracked files
* Restore old versions of files
</script></section><section data-markdown><script type="text/template">
### Commit history
* Most recent commit: `HEAD`
* Next-most recent: `HEAD~1`
* Next-next-most recent: `HEAD~2`
![Three documents. On the left is the original. In the middle is that document with one line changed. On the right is the middle document with an extra paragraph added. Arrows indicate that the documents are related in order of history.](img/play-changes.png)
</script></section><section data-markdown><script type="text/template">
### History with `git diff`
* We can use `git diff` to see what changed for a file at each commit
**Live Presentation**
```
git diff HEAD~1 mars.txt
git diff HEAD~2 mars.txt
```
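A sketch with three small commits, so that `HEAD~1` and `HEAD~2` have something to point at. The notes and identity are made up for the example:

```shell
repo="$(mktemp -d)"; cd "$repo"
git init -q
git config user.name "Vlad Dracul"
git config user.email "vlad@example.org"   # placeholder address

# Three commits, each adding one line to mars.txt
for n in 1 2 3; do
  echo "Mars note $n" >> mars.txt
  git add mars.txt
  git commit -q -m "add Mars note $n"
done

git log --oneline
git diff HEAD~1 mars.txt    # the line added since the previous commit
git diff HEAD~2 mars.txt    # the lines added since two commits ago
```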
</script></section><section data-markdown><script type="text/template">
### History with commit IDs
* We can use the unique ID for a commit in the same way
**Live Presentation**
```
git diff d22195b9ec3c8fb4c2ce0f52f344b95ce5d0d0e3 mars.txt
git diff d221 mars.txt
```
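Your commit IDs will differ from the ones on this slide. A sketch of looking them up with `git rev-parse`, in a throwaway repository with placeholder contents:

```shell
repo="$(mktemp -d)"; cd "$repo"
git init -q
git config user.name "Vlad Dracul"
git config user.email "vlad@example.org"   # placeholder address
for n in 1 2; do
  echo "Mars note $n" >> mars.txt
  git add mars.txt
  git commit -q -m "add Mars note $n"
done

full_id="$(git rev-parse HEAD~1)"            # full ID of the previous commit
short_id="$(git rev-parse --short HEAD~1)"   # abbreviated form of the same ID
git log --oneline                            # short IDs next to each message

git diff "$full_id" mars.txt
git diff "$short_id" mars.txt                # same comparison, less typing
```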
</script></section><section data-markdown><script type="text/template">
### Restoring older versions
* How can we restore older versions/backtrack?
* Let's say we accidentally overwrite a file…
**Live Presentation**
```
git checkout HEAD mars.txt
```
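A sketch of the accident and the recovery, in a throwaway repository (file contents and identity are placeholders):

```shell
repo="$(mktemp -d)"; cd "$repo"
git init -q
git config user.name "Vlad Dracul"
git config user.email "vlad@example.org"   # placeholder address
echo "notes on Mars" > mars.txt
git add mars.txt
git commit -q -m "start notes on Mars as a base"

echo "oops" > mars.txt        # accidentally overwrite the file
git checkout HEAD mars.txt    # restore the version from the last commit
cat mars.txt                  # the committed contents are back
```

Note that the uncommitted "oops" version is gone for good: `git` can only restore what was committed.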
</script></section><section data-markdown><script type="text/template">
### `git checkout`
* `git checkout` "checks out" files from the repo
* Can use any commit identifier
* Check out the commit *before* the edit you want to replace!
![On the left is a zone representing the `.git` directory, with three commits in a repository. One commit (HEAD~1, f22b25e) contains changes we want to recover. On the right are two files that are recovered. An arrow indicates two commands for recovery: `git checkout HEAD~1` and `git checkout f22b25e`](img/git-checkout.png)
</script></section><section data-markdown><script type="text/template">
### Question
- Which command(s) below will let Jennifer recover the last committed version of her Python script called `data_cruncher.py` (but no other files)?
1. `$ git checkout HEAD`
2. `$ git checkout HEAD data_cruncher.py`
3. `$ git checkout HEAD~1 data_cruncher.py`
4. `$ git checkout <unique ID of last commit> data_cruncher.py`
</script></section></section><section ><section data-markdown><script type="text/template">
## Ignoring things
</script></section><section data-markdown><script type="text/template">
### Learning objectives
* Configure `git` to ignore files and directories
* Understand why this is useful
</script></section><section data-markdown><script type="text/template">
### Not all files are useful
* Editor backup files
* Temporary files
* Intermediate analysis files
**Live Presentation**
```
mkdir results
touch a.dat results/a.out
```
</script></section><section data-markdown><script type="text/template">
### `.gitignore`
* `.gitignore` is a special file in your repository root
* It tells `git` to ignore specified files/directories
* It should be committed in your repository
**Live Presentation**
```
nano .gitignore
git status --ignored
git add -f a.dat
```
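A sketch of what might go into `.gitignore` for the files created on the previous slide. The two patterns are one possible choice, not the only one:

```shell
repo="$(mktemp -d)"; cd "$repo"
git init -q
mkdir results
touch a.dat results/a.out

# One pattern per line: ignore all .dat files and the whole results directory
printf '%s\n' '*.dat' 'results/' > .gitignore

git status --short             # only .gitignore itself is listed
git status --short --ignored   # ignored paths are marked with !!
git check-ignore -v a.dat      # shows which pattern matched a.dat
```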
</script></section></section><section ><section data-markdown><script type="text/template">
## Remotes in GitHub
</script></section><section data-markdown><script type="text/template">
### Learning objectives
* Understand what remote repositories are and why they are useful
* Clone a remote repository
* Push to and pull from a remote repository
</script></section><section data-markdown><script type="text/template">
### Remote repositories
* Version control is most useful for collaboration
* Easiest to have a single repository
* Repository may be hosted off-site (for at least one collaborator)
* Services available:
* GitHub, BitBucket, GitLab
* We're using GitHub
</script></section><section data-markdown><script type="text/template">
### GitHub Saved My Life!
![GitHub Saved My Life Tonight](img/github_saved_my_life.png)
</script></section><section data-markdown><script type="text/template">
### Log in to GitHub
* Register for an account, if you don't have one - then log in
![Screenshot of widdowquinn GitHub profile](img/lp_github.png)
</script></section><section data-markdown><script type="text/template">
### Create a remote repository
* Essentially, on GitHub's servers:
```
mkdir planets
cd planets
git init
```
**Live Presentation**
</script></section><section data-markdown><script type="text/template">
### A freshly-made GitHub repository
* There's nothing in the remote repository!
![Two repositories. At the top, the local `planets` repository (belonging to Vlad), which contains files in the staging area and repository. At the bottom, an empty repository, representing the 'clean' repository just created on `GitHub`](img/git-freshly-made-github-repo.png)
</script></section><section data-markdown><script type="text/template">
### Connecting local and remote repositories
* We tell the *local* repository that the GitHub repository is its *remote* repository.
* `origin` is a local nickname for the remote repo (a common choice)
* Once set up, we *push* changes/history to the remote repo
**Live Presentation**
```
git remote add origin https://github.com/widdowquinn/planets.git
git push origin master
```
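To try this without a GitHub account, a local *bare* repository can stand in for the remote. This is a sketch under that assumption; the paths and identity are placeholders, and the branch is named `master` to match the slides:

```shell
base="$(mktemp -d)"
git init -q --bare "$base/remote.git"       # local stand-in for the GitHub repo

git init -q "$base/planets"; cd "$base/planets"
git symbolic-ref HEAD refs/heads/master     # use the branch name from the slides
git config user.name "Vlad Dracul"
git config user.email "vlad@example.org"    # placeholder address
echo "notes on Mars" > mars.txt
git add mars.txt
git commit -q -m "start notes on Mars as a base"

git remote add origin "$base/remote.git"    # "origin" is just a nickname
git remote -v                               # lists the fetch and push URLs
git push -q origin master                   # copy the history to the remote
```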
</script></section><section data-markdown><script type="text/template">
### Remote GitHub repo after first *push*
* We only *push* the repository, not the staging area
![Two repositories. At the top, the local `planets` repository (belonging to Vlad), which contains files in the staging area and repository. At the bottom, the remote `GitHub` repository, which contains the same repository as the local repo - but *not* the staging area](img/github-repo-after-first-push.png)
</script></section><section data-markdown><script type="text/template">
### My first remote *pull*
* To synchronise the local repo with the remote repo, we *pull*
**Live Presentation**
```
git pull origin master
```
</script></section></section><section ><section data-markdown><script type="text/template">
## GitHub collaboration
</script></section><section data-markdown><script type="text/template">
### Learning objectives
* Collaborate by pushing changes to someone else's remote repository
</script></section><section data-markdown><script type="text/template">
### Starting a collaboration
* Pair up as 'owner' and 'collaborator'
- doing this remotely will be fine
* As *owner*: give your collaborator access to your GitHub repo
* As *collaborator*: clone the *owner's* repo
```
cd /tmp/
git clone https://github.com/<owner>/planets.git
git remote -v
```
</script></section><section data-markdown><script type="text/template">
### Make a collaborative change
**Collaborator:**
* Add a file called `pluto.txt` - (content your own)
* Commit that file and push your commits
```
cd planets
nano pluto.txt
git add pluto.txt
git commit -m "Notes on Pluto"
git push origin master
```
</script></section><section data-markdown><script type="text/template">
### Pull a collaborator's change
**Owner:**
* Change back to your **own** repository
* Check with `git remote -v`
* `git pull` the changes made by your collaborator
```
cd ~/planets/
git pull origin master
```
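The whole round trip (owner pushes, collaborator clones and pushes, owner pulls) can be rehearsed on one machine. A sketch, assuming a local bare repository in place of GitHub and placeholder identities:

```shell
base="$(mktemp -d)"
git init -q --bare "$base/remote.git"                        # stand-in for GitHub
git -C "$base/remote.git" symbolic-ref HEAD refs/heads/master

# Owner: create the project and push it
git init -q "$base/owner"; cd "$base/owner"
git symbolic-ref HEAD refs/heads/master
git config user.name "Owner"; git config user.email "owner@example.org"
echo "notes on Mars" > mars.txt
git add mars.txt
git commit -q -m "start notes on Mars as a base"
git remote add origin "$base/remote.git"
git push -q origin master

# Collaborator: clone, add pluto.txt, push
git clone -q "$base/remote.git" "$base/collab"; cd "$base/collab"
git config user.name "Collaborator"; git config user.email "collab@example.org"
echo "notes on Pluto" > pluto.txt
git add pluto.txt
git commit -q -m "Notes on Pluto"
git push -q origin master

# Owner: pull the collaborator's change
cd "$base/owner"
git pull -q origin master
ls                                  # pluto.txt has arrived
```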
</script></section></section><section ><section data-markdown><script type="text/template">
## Resolving `git` conflicts
</script></section><section data-markdown><script type="text/template">
### Learning objectives
* Understand what conflicts are, and when they occur
* Resolve conflicts resulting from a merge
</script></section><section data-markdown><script type="text/template">
### Why conflicts occur
* People working in parallel
* different changes to same part of a file
* not keeping local repo in sync before making local changes
* not keeping remote repo in sync after making local changes
* `git pull` before working; `git push` when done
</script></section><section data-markdown><script type="text/template">
### Seriously, `git push` when done…
![A sign: In case of fire 1. git commit, 2. git push, 3. leave building](img/git_fire_notice.jpg)
</script></section><section data-markdown><script type="text/template">
### Let's make a conflict
**in your pairs**
* As the *owner*: add a line to `mars.txt`
* Commit and push the change
**then**
* As the *collaborator*: add a line to `mars.txt`
* Commit and push the change
```
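# owner
cd ~/planets
nano mars.txt
git add mars.txt
git commit -m "Add a line (owner)"
git push origin master
# collaborator (this push is rejected)
cd /tmp/planets
nano mars.txt
git add mars.txt
git commit -m "Add a line (collaborator)"
git push origin master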
```
</script></section><section data-markdown><script type="text/template">
### The conflict message
```
To https://github.com/<owner>/planets.git
! [rejected] master -> master (fetch first)
error: failed to push some refs to 'https://github.com/<owner>/planets.git'
hint: Updates were rejected because the remote contains work that you do
hint: not have locally. This is usually caused by another repository pushing
hint: to the same ref. You may want to first integrate the remote changes
hint: (e.g., 'git pull ...') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.
```
</script></section><section data-markdown><script type="text/template">
### The conflicting changes
![Three text files. At the top is the original, below this are two versions with conflicting changes. Arrows point to a question mark: how will we resolve this?](img/conflict.png)
</script></section><section data-markdown><script type="text/template">
### Resolving a conflict
* `git` detects overlapping changes
* `git` defers to humans for how to resolve: **communicate**!
* To resolve:
* *pull* remote changes
* *merge* changes into our working copy
* *push* the merged changes
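The pull, merge, push steps above can be sketched end to end. This assumes a local bare repository in place of GitHub, placeholder identities, and `--no-rebase` to ask `git pull` for a merge explicitly; the resolution shown (keep both lines) is one possible outcome of talking to your collaborator:

```shell
base="$(mktemp -d)"
git init -q --bare "$base/remote.git"                        # stand-in for GitHub
git -C "$base/remote.git" symbolic-ref HEAD refs/heads/master

# Owner creates and pushes the project; collaborator clones it
git init -q "$base/owner"; cd "$base/owner"
git symbolic-ref HEAD refs/heads/master
git config user.name "Owner"; git config user.email "owner@example.org"
echo "notes on Mars" > mars.txt
git add mars.txt; git commit -q -m "start notes on Mars as a base"
git remote add origin "$base/remote.git"
git push -q origin master
git clone -q "$base/remote.git" "$base/collab"
git -C "$base/collab" config user.name "Collaborator"
git -C "$base/collab" config user.email "collab@example.org"

# Owner adds a line and pushes first
echo "owner's line" >> mars.txt
git add mars.txt; git commit -q -m "Add a line (owner)"
git push -q origin master

# Collaborator adds a different line in the same place; the push is rejected
cd "$base/collab"
echo "collaborator's line" >> mars.txt
git add mars.txt; git commit -q -m "Add a line (collaborator)"
git push -q origin master || echo "push rejected: pull first"

# Pull: git cannot merge the overlapping edits and marks the conflict in mars.txt
git pull --no-rebase origin master || echo "conflict in mars.txt"

# Resolve by hand (here: keep both lines), then commit and push the merge
printf '%s\n' "notes on Mars" "owner's line" "collaborator's line" > mars.txt
git add mars.txt
git commit -q -m "Merge changes from remote"
git push -q origin master
```

In a real conflict you would open `mars.txt` in an editor and edit between the `<<<<<<<`, `=======` and `>>>>>>>` markers rather than overwrite the file.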
</script></section></section><section ><section data-markdown><script type="text/template">
## Wrapping up
* GitHub/version control can be an open electronic lab book as part of Open Science workflows
* Collect data - store in OA repository (Zenodo/FigShare)
* Use GitHub to store work in progress: analysis lab book
* Post preprint to arXiv/bioRxiv
* Even if you don't work openly, it's more reproducible (and auditable)
</script></section><section data-markdown><script type="text/template">
### You're ready to leave this behind…
![PhD comic: A directory listing with filenames like data_2010.05.28_test.dat, data_2010.05.28_re-test.dat, data_2010.05.28_re-re-test.dat, data_2010.05.28_calibrate.dat, data_2010.05.29_aaarrrgh.dat, data_2010.05.29_WTF.dat, data_2010.05.29_USETHISONE.dat](img/phd052810s.png)
</script></section></section></div>
</div>
<script src="./dist/reveal.js"></script>
<script src="./plugin/markdown/markdown.js"></script>
<script src="./plugin/highlight/highlight.js"></script>
<script src="./plugin/zoom/zoom.js"></script>
<script src="./plugin/notes/notes.js"></script>
<script src="./plugin/math/math.js"></script>
<script>
function extend() {
var target = {};
for (var i = 0; i < arguments.length; i++) {
var source = arguments[i];
for (var key in source) {
if (source.hasOwnProperty(key)) {
target[key] = source[key];
}
}
}
return target;
}
// default options to init reveal.js
var defaultOptions = {
controls: true,
progress: true,
history: true,
center: true,
transition: 'default', // none/fade/slide/convex/concave/zoom
plugins: [
RevealMarkdown,
RevealHighlight,
RevealZoom,
RevealNotes,
RevealMath
]
};
// options from URL query string
var queryOptions = Reveal().getQueryHash() || {};
var options = extend(defaultOptions, {}, queryOptions);
</script>
<script>
Reveal.initialize(options);
</script>
</body>
</html>