Commit

deploy: 364e2a0
AmirMardan committed Apr 3, 2024
0 parents commit 733de34
Showing 63 changed files with 6,435 additions and 0 deletions.
4 changes: 4 additions & 0 deletions .buildinfo
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
# Sphinx build info version 1
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
config: 5921507015a185ef15126c8683f60e3b
tags: 645f666f9bcd5a90fca523b33c5a78b7
Binary file added .doctrees/environment.pickle
Binary file not shown.
Binary file added .doctrees/index.doctree
Binary file not shown.
Binary file added .doctrees/predict.doctree
Binary file not shown.
Binary file added .doctrees/readme.doctree
Binary file not shown.
Binary file added .doctrees/train.doctree
Binary file not shown.
Empty file added .nojekyll
Empty file.
Binary file added _images/fb_data.png
Binary file added _images/fb_data_preparing.png
Binary file added _images/fb_introducing_fb_segmentation.png
Binary file added _images/fb_predict_one.png
Binary file added _images/fb_preprocessed_data.png
Binary file added _images/fb_train.gif
Binary file added _images/github.png
Binary file added _images/waves.png
28 changes: 28 additions & 0 deletions _sources/index.rst.txt
@@ -0,0 +1,28 @@
.. First Break Picking documentation master file, created by
   sphinx-quickstart on Thu Mar 28 09:18:34 2024.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Welcome to First Break Picking's documentation!
===============================================

.. figure:: ./imgs/github.png
   :width: 20
   :alt: git
   :target: https://github.com/geo-stack/first_break_picking

.. toctree::
   :maxdepth: 1
   :caption: Quick Start:

   readme.md

.. toctree::
   :maxdepth: 1
   :caption: Train and Predict:

   train

   predict


19 changes: 19 additions & 0 deletions _sources/predict.rst.txt
@@ -0,0 +1,19 @@
Predict
=======


first\_break\_picking.train\_eval.ai\_tools
-------------------------------------------

.. automodule:: first_break_picking.train_eval.ai_tools
   :members: save_checkpoint, load_checkpoint
   :undoc-members:
   :show-inheritance:

first\_break\_picking.train\_eval.predict module
------------------------------------------------

.. automodule:: first_break_picking.train_eval.predict
   :members: predict
   :undoc-members:
   :show-inheritance:
194 changes: 194 additions & 0 deletions _sources/readme.md.txt
@@ -0,0 +1,194 @@
<a id="top"></a>
# First-Break Picking Using Deep Learning

<a id="1-introduction"></a>
## 1. Introduction

This repository implements first-break (FB) picking using deep learning.
For this purpose, we use a U-Net to segment the data into two regions: before and after the first arrivals.

- [First-Break Picking Using Deep Learning](#first-break-picking-using-deep-learning)
  - [1. Introduction](#1-introduction)
  - [2. Installation](#2-installation)
  - [3. First-Break Picking](#3-first-break-picking)
    - [3.1 Initial data files](#31-initial-data-files)
    - [3.2 Data preprocessing](#32-data-preprocessing)
    - [3.3 Training for FB picking](#33-training-for-fb-picking)
    - [3.4 Predicting the first break of one seismic shot](#34-predicting-the-first-break-of-one-seismic-shot)

In a seismic shot record, the first arrival is usually the direct wave from the source, followed by refractions (Figure 1). The travel time of the first-arriving seismic wave from the source to a geophone is called the first break. First breaks are an invaluable source of information in near-surface studies: we can use them to obtain a velocity model of the near surface. Beyond their importance for refraction inversion and for understanding the characteristics of the near surface, they also support reflection seismic processing and multichannel analysis of surface waves (MASW).

![Alt text](./imgs/waves.png)

<a id="2-installation"></a>
## 2. Installation
To install this package, install it directly from the GitHub repository with `pip`:
```console
pip install git+https://github.com/geo-stack/first_break_picking.git
```

<a id="3-first-break-picking"></a>
## 3. First-Break Picking
We treat first-break picking as a segmentation problem. Each shot is divided into two segments:
1. before FB,
2. after FB.

The FB is then picked as the interface between the two segments.
![segmentation](./imgs/fb_introducing_fb_segmentation.png)
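
To make the segmentation idea concrete, here is a minimal sketch of how a pick could be read off a binary mask. It is not the package's implementation: the function name `pick_first_breaks`, the mask layout (rows are time samples, columns are traces, `1` means "after FB"), and the use of `NaN` for traces without a pick are all assumptions made for illustration.

```python
import numpy as np


def pick_first_breaks(mask: np.ndarray, dt: float) -> np.ndarray:
    """Return one first-break time per trace from a binary mask.

    Hypothetical helper: `mask` has shape (n_samples, n_traces), with 0 for
    samples before the first break and 1 for samples at or after it.
    """
    # argmax gives the index of the first maximum along the time axis,
    # i.e. the first sample labelled 1 in each trace.
    first_idx = np.argmax(mask, axis=0).astype(float)
    # A trace that contains no 1 at all has no pick; flag it with NaN.
    first_idx[~mask.any(axis=0)] = np.nan
    # Convert sample indices to time with the sampling rate dt.
    return first_idx * dt


# Tiny example: 6 time samples, 3 traces.
mask = np.zeros((6, 3), dtype=int)
mask[2:, 0] = 1  # first break at sample 2
mask[4:, 1] = 1  # first break at sample 4
# trace 2 is left empty on purpose (no first break found)
fb_times = pick_first_breaks(mask, dt=0.5)
```

With `dt=0.5`, the first two traces give picks at `1.0` and `2.0`, and the empty trace gives `NaN`.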

In the next sections, we see how to prepare the dataset and which processing steps can improve the accuracy of the results.

<a id="31-initial-data-files"></a>
### 3.1 Initial data files
To use this package, the dataset must be prepared appropriately.
A single folder should contain the seismic data in `.npy` format and, for training, the corresponding FB files in `.txt` format.
An example of a first-break file is shown in the following figure.

![segmentation](./imgs/fb_data.png)
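
As a rough sketch of what such a folder could contain, the snippet below writes one synthetic shot and one first-break file with NumPy and reads them back. Everything specific here is an assumption for illustration: the file names, the array orientation (rows are time samples, columns are traces), the placeholder pick values, and the single-column text layout with one header line. The layout your project actually needs is the one shown in the figure above.

```python
import tempfile
from pathlib import Path

import numpy as np

n_samples, n_traces = 1000, 48

# Synthetic stand-in for a seismic shot: rows are time samples, columns are traces.
rng = np.random.default_rng(0)
shot = rng.standard_normal((n_samples, n_traces)).astype(np.float32)

# Placeholder first-break times, one value per trace (not real picks).
fb_times = np.linspace(0.010, 0.057, n_traces)

# Hypothetical folder and file names.
folder = Path(tempfile.mkdtemp())
np.save(folder / "shot_0001.npy", shot)
np.savetxt(folder / "shot_0001.txt", fb_times, header="fb_time", comments="")

# Read both files back to check that they round-trip.
loaded_shot = np.load(folder / "shot_0001.npy")
loaded_fb = np.loadtxt(folder / "shot_0001.txt", skiprows=1)
```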

<a id="32-data-preprocessing"></a>
### 3.2 Data preprocessing
After preparing the initial data files in `.npy` and `.txt` formats, we can perform some preprocessing steps using `save_shots_fb`. To explain the arguments of this function, let's look at the following figure.

<a id="Figure2"></a>
![data_preprocessing](./imgs/fb_data_preparing.png)

- The data are strongly imbalanced, which reduces accuracy. To deal with this problem, we crop the data (a) to generate the data presented in (b). For this purpose, `save_shots_fb` takes an argument called `time_window`, a list of two integers giving the beginning and the end of the cropping window (in samples, NOT in time). The first element of this list should normally be `0`. For example, I use `time_window = [0, 512]`.
- In the next step, we scale the data to make learning easier and more accurate. This step leads to image (c). To do so, use the two boolean arguments `scale` and `grayscale`, both of which should be set to `True`.
- For data augmentation, we divide each seismic shot into several subimages with a specific overlap (d and e). For this purpose, `save_shots_fb` takes `split_nt`, the number of columns in each subimage, and `overlap`, the overlap between subimages as a fraction between `0.0` and `1.0`. I usually use `overlap = 0.15`. For shots with 48 traces, I use `split_nt = 22`; for shots with more traces, a larger value of `split_nt` can be used.
- It is really important to provide `save_shots_fb` with the correct sampling rate as `dt`.
- The function also takes two arguments, `shot_ext` and `fb_ext`, that specify the extensions of the shot and first-break files. They make it easier to extend the code in case we want to load `.segy` or `.json` files.
- `save_shots_fb` saves the processed data in `dir_to_save`.
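
The three steps above (crop, scale, split) can be sketched with plain NumPy as follows. This is only an illustration of the idea, not what `save_shots_fb` does internally: the helper name `crop_scale_split`, the per-trace normalization, and the way the last subimage is aligned to the final trace are assumptions, and the `grayscale` option is not reproduced.

```python
import numpy as np


def crop_scale_split(shot, time_window, split_nt, overlap):
    """Hypothetical sketch: crop in time, scale each trace, cut into subimages."""
    # 1. Crop along the time axis (rows); the window is given in samples.
    start, stop = time_window
    cropped = shot[start:stop, :]

    # 2. Scale every trace by its largest absolute amplitude (assumed scheme).
    peak = np.abs(cropped).max(axis=0, keepdims=True)
    peak[peak == 0] = 1.0  # avoid dividing dead traces by zero
    scaled = cropped / peak

    # 3. Cut into subimages of `split_nt` columns with fractional `overlap`.
    step = max(1, int(round(split_nt * (1.0 - overlap))))
    n_traces = scaled.shape[1]
    starts = list(range(0, max(n_traces - split_nt, 0) + 1, step))
    # Add one extra subimage so that the last traces are always covered.
    if starts[-1] + split_nt < n_traces:
        starts.append(n_traces - split_nt)
    return [scaled[:, s:s + split_nt] for s in starts]


# Synthetic shot with 1000 samples and 48 traces, using the values quoted above.
rng = np.random.default_rng(0)
shot = rng.standard_normal((1000, 48))
subimages = crop_scale_split(shot, time_window=[0, 512], split_nt=22, overlap=0.15)
```

With these values, the step between subimages is 19 columns, so the sketch produces three subimages of shape `(512, 22)` starting at traces 0, 19, and 26.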


Here is how to call this function:

```Python
from first_break_picking.data import save_shots_fb

data_info = save_shots_fb(
    dataset_dir=path_to_load,
    dir_to_save=path_save,
    split_nt=split_nt,
    overlap=overlap,
    time_window=[0, n_time_samples],  # cropping window in samples, not time
    fbt_file_header=fbt_file_header,
    fbt_time_column=0,
    scale=True,
    grayscale=True,
    dt=dt_project,  # sampling rate of the project
    shot_ext=".npy",
    fb_ext=".txt" if phase == "train" else None,  # FB files are only needed for training
)

# Save the returned DataFrame; it is needed again during prediction.
data_info.to_csv(f"{path_save}_data_info.txt", index=False)
```
The function `save_shots_fb` returns a Pandas DataFrame, which should be saved for use during prediction.
Here is an example of the saved data for a project.

![files](./imgs/fb_preprocessed_data.png)

<div class="alert alert-block alert-warning">
<b>Warning:</b> Please be careful about choosing the appropriate sampling rate.
</div>

<a id="33-training-for-fb-picking"></a>
### 3.3 Training for FB picking
To train a network, we use the function `train`. Its main arguments are presented here.
- `train_data_path`: Path of the training dataset (can be a list of different datasets).
- `upsampled_size_row`: We upsample the data samples before sending them into the model. This variable defines the number of rows in the upsampled size (must be divisible by 16).
- `upsampled_size_col`: This variable defines the number of columns in the upsampled size (must be divisible by 16).
- `batch_size`: Number of data samples that are processed together to calculate the loss.
- `val_percentage`: A value between 0 and 1 specifying the fraction of the data used to test the generalizability of the algorithm.
- `epochs`: Number of training epochs.
- `learning_rate`: The learning rate.
- `path_to_save`: Path to a folder in which to save the checkpoints and loss values.
- `checkpoint_path`: To continue training from a pretrained network, specify the path of its checkpoint here.
- `step_size_milestone`: Defines a learning rate scheduler. Use this argument if you want to halve the learning rate at a specific number of epochs.
- `show`: A boolean specifying whether to display the learning procedure. If set to `True`, a figure like the following example is shown.
![files](./imgs/fb_train.gif)
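
To give a feeling for what a halving schedule does to the learning rate, here is a small sketch. It assumes the rate is halved every `step_size` epochs (a step schedule); the package may instead halve it only once at the given epoch, so treat the function `learning_rate_at` and its behaviour as hypothetical.

```python
def learning_rate_at(epoch: int, base_lr: float, step_size: int) -> float:
    """Learning rate at a given epoch, assuming it halves every `step_size` epochs."""
    # Integer division counts how many halvings have happened so far.
    return base_lr * 0.5 ** (epoch // step_size)


# Schedule for 45 epochs with the values used in the example call below.
schedule = [learning_rate_at(epoch, base_lr=1e-4, step_size=15) for epoch in range(45)]
```

Under this assumption, epochs 0 to 14 use `1e-4`, epochs 15 to 29 use `5e-5`, and epochs 30 to 44 use `2.5e-5`.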

Here is an example of calling this function:
```Python
from first_break_picking import train
from first_break_picking.tools import seed_everything

# Fix the random seed for reproducibility.
seed_everything(10)

train_data_path = [
    "path/to/train/dataset_0",
    "path/to/train/dataset_n",
]

train(
    train_data_path,
    upsampled_size_row=n_time_samples,  # must be divisible by 16
    upsampled_size_col=upsampled_size_col,  # must be divisible by 16
    batch_size=batch_size,
    val_percentage=val_percentage,
    epochs=num_epochs,
    learning_rate=1e-4,
    device="mps",
    path_to_save="path/to/save/results/checkpoints",
    save_frequency=num_epochs,
    loss_fn_name=loss_fn,
    model_name=model_name,
    checkpoint_path=None,  # set to a checkpoint path to continue training
    features=[16, 32, 64, 128],
    in_channels=1,
    out_channels=2,
    encoder_weight="imagenet",
    step_size_milestone=15,
    show=True,
)
```
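
Because `upsampled_size_row` and `upsampled_size_col` must be divisible by 16, it can be handy to round a desired size up to the next valid value. The helper below is a hypothetical convenience, not part of the package. The factor 16 presumably comes from the network halving the image size four times (2^4 = 16), which is typical for a U-Net with four encoder stages, but that reasoning is an assumption here.

```python
def round_up_to_multiple(n: int, base: int = 16) -> int:
    """Smallest multiple of `base` that is greater than or equal to `n`."""
    # Adding base - 1 before the integer division rounds up instead of down.
    return ((n + base - 1) // base) * base


upsampled_size_col = round_up_to_multiple(22)   # subimages with 22 columns
upsampled_size_row = round_up_to_multiple(512)  # already a multiple of 16
```

Here a subimage width of 22 is rounded up to 32, while 512 is left unchanged.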

<a id="34-predicting-the-first-break-of-one-seismic-shot"></a>
### 3.4 Predicting the first break of one seismic shot
If you want to predict the first breaks in numerous shots, you should create the dataset as described [here](#32-data-preprocessing).
However, if you need to predict the first break of only one shot (or of all shots in a loop, without saving a dataset), use the class `Predictor`.
This object can be created as follows:
```Python
from first_break_picking import Predictor

predictor = Predictor(
    path_to_save="path/to/save/results/checkpoints",
    checkpoint_path=checkpoint,
    split_nt=split_nt,
    overlap=overlap,
    upsampled_size_row=n_time_samples,
    upsampled_size_col=upsampled_size_col,
    dt=dt,
    smoothing_threshold=smoothing_threshold,
    model_name="unet_resnet",
)
```
- `path_to_save`: Path to a folder in which to save the result (existing content will be overwritten).
- `checkpoint_path`: Path of the checkpoint that was saved after training.
- `split_nt`: Number of columns in each subimage.
- `overlap`: Overlap of the subimages of one shot.
- `upsampled_size_row`: Number of rows in the upsampled image.
- `upsampled_size_col`: Number of columns in the upsampled image.
- `dt`: Temporal sampling rate.
- `smoothing_threshold`: An integer used to suppress artifacts generated above the true FB.

Once this object is created, we can pass the path of one seismic shot (as presented in [Figure 2a](#Figure2)) to the method `predict` and get the first break.

```Python
predictor.predict(
    path_data=path_data
)
```
![data_preprocessing](./imgs/fb_predict_one.png)
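
For the "all shots in a loop" case mentioned above, a loop over the shot files could look like the following sketch. The helper `predict_folder` is hypothetical and relies only on the `predict(path_data=...)` call shown above; whether `predict` expects a string or a path object, and what it returns, is not specified here, so the sketch passes a string and ignores the return value. A small stand-in class replaces the real `Predictor` so the example runs without the package.

```python
import tempfile
from pathlib import Path


def predict_folder(predictor, folder, shot_ext=".npy"):
    """Call `predictor.predict` on every shot file in `folder` (hypothetical helper)."""
    # Sort the paths so the shots are processed in a reproducible order.
    shot_paths = sorted(Path(folder).glob(f"*{shot_ext}"))
    for path in shot_paths:
        predictor.predict(path_data=str(path))
    return shot_paths


class _RecordingPredictor:
    """Stand-in for `Predictor`, used only to demonstrate the loop."""

    def __init__(self):
        self.seen = []

    def predict(self, path_data):
        # Record the file name instead of running a network.
        self.seen.append(Path(path_data).name)


with tempfile.TemporaryDirectory() as tmp:
    # Two shot files and one unrelated file that should be skipped.
    for name in ("shot_2.npy", "shot_1.npy", "notes.txt"):
        (Path(tmp) / name).touch()
    stub = _RecordingPredictor()
    processed = predict_folder(stub, tmp)
```

In a real project, `stub` would be replaced by the `predictor` object created above.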

<div class="alert alert-block alert-warning">
<b>Warning:</b> If you define the sampling rate incorrectly, you will not see the effect on the plot (the Y-axis is in time steps), but the times saved in the first-break folder will be wrong.
</div>
<br>
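
The following sketch shows why a wrong sampling rate goes unnoticed on the plot but corrupts the saved times. It assumes the saved time is simply the picked sample index multiplied by `dt`; the helper `samples_to_time` and the numbers are made up for illustration.

```python
def samples_to_time(sample_indices, dt):
    """Convert picked sample indices to times, assuming time = index * dt."""
    return [index * dt for index in sample_indices]


# The picks in samples are the same in both cases, so the plot looks identical.
picks = [40, 52, 67]
correct_times = samples_to_time(picks, dt=0.0005)  # true sampling rate (assumed)
wrong_times = samples_to_time(picks, dt=0.001)     # sampling rate set twice too large
```

With the wrong `dt`, every saved time is twice as large as it should be, even though the picks in samples are unchanged.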

<!-- ## Issues and Questions -->
**Acknowledgment:**<br>
This work, developed by [Amir Mardan](https://github.com/AmirMardan), was supported by Mitacs through the Mitacs Elevate Program.


[Top](#top)

11 changes: 11 additions & 0 deletions _sources/train.rst.txt
@@ -0,0 +1,11 @@
Train
=====

first\_break\_picking.train\_eval.train module
----------------------------------------------

.. automodule:: first_break_picking.train_eval.train
   :members:
   :undoc-members:
   :show-inheritance:

123 changes: 123 additions & 0 deletions _static/_sphinx_javascript_frameworks_compat.js
@@ -0,0 +1,123 @@
/* Compatibility shim for jQuery and underscore.js.
 *
 * Copyright Sphinx contributors
 * Released under the two clause BSD licence
 */

/**
 * small helper function to urldecode strings
 *
 * See https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/decodeURIComponent#Decoding_query_parameters_from_a_URL
 */
jQuery.urldecode = function(x) {
    if (!x) {
        return x
    }
    return decodeURIComponent(x.replace(/\+/g, ' '));
};

/**
 * small helper function to urlencode strings
 */
jQuery.urlencode = encodeURIComponent;

/**
 * This function returns the parsed url parameters of the
 * current request. Multiple values per key are supported,
 * it will always return arrays of strings for the value parts.
 */
jQuery.getQueryParameters = function(s) {
    if (typeof s === 'undefined')
        s = document.location.search;
    var parts = s.substr(s.indexOf('?') + 1).split('&');
    var result = {};
    for (var i = 0; i < parts.length; i++) {
        var tmp = parts[i].split('=', 2);
        var key = jQuery.urldecode(tmp[0]);
        var value = jQuery.urldecode(tmp[1]);
        if (key in result)
            result[key].push(value);
        else
            result[key] = [value];
    }
    return result;
};

/**
 * highlight a given string on a jquery object by wrapping it in
 * span elements with the given class name.
 */
jQuery.fn.highlightText = function(text, className) {
    function highlight(node, addItems) {
        if (node.nodeType === 3) {
            var val = node.nodeValue;
            var pos = val.toLowerCase().indexOf(text);
            if (pos >= 0 &&
                !jQuery(node.parentNode).hasClass(className) &&
                !jQuery(node.parentNode).hasClass("nohighlight")) {
                var span;
                var isInSVG = jQuery(node).closest("body, svg, foreignObject").is("svg");
                if (isInSVG) {
                    span = document.createElementNS("http://www.w3.org/2000/svg", "tspan");
                } else {
                    span = document.createElement("span");
                    span.className = className;
                }
                span.appendChild(document.createTextNode(val.substr(pos, text.length)));
                node.parentNode.insertBefore(span, node.parentNode.insertBefore(
                    document.createTextNode(val.substr(pos + text.length)),
                    node.nextSibling));
                node.nodeValue = val.substr(0, pos);
                if (isInSVG) {
                    var rect = document.createElementNS("http://www.w3.org/2000/svg", "rect");
                    var bbox = node.parentElement.getBBox();
                    rect.x.baseVal.value = bbox.x;
                    rect.y.baseVal.value = bbox.y;
                    rect.width.baseVal.value = bbox.width;
                    rect.height.baseVal.value = bbox.height;
                    rect.setAttribute('class', className);
                    addItems.push({
                        "parent": node.parentNode,
                        "target": rect});
                }
            }
        }
        else if (!jQuery(node).is("button, select, textarea")) {
            jQuery.each(node.childNodes, function() {
                highlight(this, addItems);
            });
        }
    }
    var addItems = [];
    var result = this.each(function() {
        highlight(this, addItems);
    });
    for (var i = 0; i < addItems.length; ++i) {
        jQuery(addItems[i].parent).before(addItems[i].target);
    }
    return result;
};

/*
 * backward compatibility for jQuery.browser
 * This will be supported until firefox bug is fixed.
 */
if (!jQuery.browser) {
    jQuery.uaMatch = function(ua) {
        ua = ua.toLowerCase();

        var match = /(chrome)[ \/]([\w.]+)/.exec(ua) ||
            /(webkit)[ \/]([\w.]+)/.exec(ua) ||
            /(opera)(?:.*version|)[ \/]([\w.]+)/.exec(ua) ||
            /(msie) ([\w.]+)/.exec(ua) ||
            ua.indexOf("compatible") < 0 && /(mozilla)(?:.*? rv:([\w.]+)|)/.exec(ua) ||
            [];

        return {
            browser: match[ 1 ] || "",
            version: match[ 2 ] || "0"
        };
    };
    jQuery.browser = {};
    jQuery.browser[jQuery.uaMatch(navigator.userAgent).browser] = true;
}