Commit

Update notes

Jonas1312 committed May 10, 2024
1 parent f241c12 commit 00ff607
Showing 3 changed files with 37 additions and 3 deletions.
@@ -45,7 +45,7 @@
- [T5](#t5)
- [Encoder, Decoder, or Encoder-Decoder Transformer?](#encoder-decoder-or-encoder-decoder-transformer)
- [Sentence Embeddings](#sentence-embeddings)
- - [Transformers in computer vision](#transformers-in-computer-vision)
+ - [Transformers in CV](#transformers-in-cv)
- [Adapting transformers to CV](#adapting-transformers-to-cv)
- [Patch embeddings and tokenization](#patch-embeddings-and-tokenization)
- [More](#more)
@@ -148,7 +148,7 @@ The dimensions of the matrices are:
- $O = S V \in \mathbb{R}^{n \times d_v}$: the **attention output matrix**:
  - Each row of $O$ is the weighted sum of the values for a token.

- Note that the attention outout is $O \in \mathbb{R}^{n \times d_v}$, so it's different from the input $X \in \mathbb{R}^{n \times d}$.
+ Note that the attention output is $O \in \mathbb{R}^{n \times d_v}$, so it's different from the input $X \in \mathbb{R}^{n \times d}$.

Thus, a final weight matrix $W^O \in \mathbb{R}^{d_v \times d}$ can be applied to the output to obtain the final output $O' \in \mathbb{R}^{n \times d}$.
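A minimal numpy sketch of these shapes, where random matrices stand in for learned weights and the dimensions are toy values not taken from the text:

```python
import numpy as np

n, d, d_k, d_v = 4, 16, 8, 8  # toy sequence length and dimensions

X = np.random.randn(n, d)      # input token embeddings
W_Q = np.random.randn(d, d_k)
W_K = np.random.randn(d, d_k)
W_V = np.random.randn(d, d_v)
W_O = np.random.randn(d_v, d)  # final projection back to dimension d

Q, K, V = X @ W_Q, X @ W_K, X @ W_V
scores = Q @ K.T / np.sqrt(d_k)               # (n, n)
scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
S = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)  # row-wise softmax
O = S @ V                                     # (n, d_v): attention output
O_final = O @ W_O                             # (n, d): same shape as the input X

print(O.shape, O_final.shape)                 # (4, 8) (4, 16)
```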

36 changes: 35 additions & 1 deletion base/science-tech-maths/machine-learning/metrics/metrics.md
@@ -76,7 +76,7 @@ $$CI = \bar{x} \pm z \frac{s}{\sqrt{n}}$$

Accuracy is the number of correct predictions $X = \sum{(\hat{y} == y)}$, divided by the test set size $n$.

- We consider each prediction of the model as a Bernouilli trial, and the number of correct predictions $X$ is a random variable following a binomial law $Bin(n,p)$:
+ We consider each prediction of the model as a Bernoulli trial, and the number of correct predictions $X$ is a random variable following a binomial law $Bin(n,p)$:

- $n$: the test set size
- $p$: the probability of success, i.e. the probability of a correct prediction ($\hat{y} == y$)
@@ -104,3 +104,37 @@ For a 95% CI:
ci_lower = np.percentile(test_accuracies, 2.5)
ci_upper = np.percentile(test_accuracies, 97.5)
```
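As a complement to the bootstrap percentiles above, a minimal sketch of the normal-approximation interval that follows from the binomial model described earlier (the counts are made up for illustration):

```python
import numpy as np

n = 1000        # test set size (illustrative)
correct = 874   # number of correct predictions X (illustrative)
acc = correct / n

z = 1.96  # z value for a 95% confidence level
# Under X ~ Bin(n, p), Var(X/n) = p(1-p)/n; p is estimated by the observed accuracy
half_width = z * np.sqrt(acc * (1 - acc) / n)

print(f"accuracy = {acc:.3f} +/- {half_width:.3f}")
```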

## NLP Metrics

### BLEU

Use case: translation

It calculates the **precision** of n-grams (sequences of n words) in the generated text that also appear in the reference text, adjusted by a brevity penalty so that overly short generations are not rewarded.

BLEU captures surface-level, word-by-word similarity. It is a good metric for comparing translations, but it has limitations, such as not accounting for synonyms or paraphrases.
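For illustration, a toy example using nltk's BLEU implementation (assuming nltk is installed; the sentences and the smoothing choice are arbitrary):

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = "the cat is on the mat".split()
candidate = "the cat sits on the mat".split()

# sentence_bleu expects a list of tokenized references and one tokenized candidate;
# smoothing avoids a zero score when some higher-order n-gram has no match
bleu = sentence_bleu([reference], candidate,
                     smoothing_function=SmoothingFunction().method1)
print(round(bleu, 3))
```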

### ROUGE

Use case: text summarization

ROUGE measures n-gram overlap between the generated text and the reference text; as its name suggests (Recall-Oriented Understudy for Gisting Evaluation), it emphasizes recall, i.e. how much of the reference is covered by the generation.

It calculates the F1 score of n-gram overlap between the generated text and the reference text, with different variants focusing on different n-gram lengths (ROUGE-1, ROUGE-2, etc.) or on the longest common subsequence (ROUGE-L).
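A simplified sketch of ROUGE-1 (unigram counts only, no stemming or other preprocessing):

```python
from collections import Counter

def rouge_1_f1(candidate: str, reference: str) -> float:
    """F1 of the (clipped) unigram overlap between candidate and reference."""
    cand, ref = Counter(candidate.split()), Counter(reference.split())
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

print(rouge_1_f1("the cat sat on the mat", "the cat is on the mat"))  # ~0.83
```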

### METEOR

Use case: translation, text generation

METEOR addresses some shortcomings of BLEU by considering synonyms, stemming, and paraphrasing, which makes it more flexible. It combines precision and recall (via a harmonic mean weighted towards recall) and aligns words between the generated and reference texts using exact, stem, synonym, and paraphrase matches.
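A small usage example with nltk's METEOR implementation (assuming a recent nltk with the WordNet data available; recent versions expect pre-tokenized inputs):

```python
import nltk
from nltk.translate.meteor_score import meteor_score

nltk.download("wordnet", quiet=True)  # METEOR uses WordNet for synonym matching

reference = "the cat is on the mat".split()
candidate = "the cat sits on the mat".split()

print(round(meteor_score([reference], candidate), 3))
```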

### BERTScore

Use case: summarization, translation, text similarity

BERTScore computes the similarity of tokens in candidate and reference texts based on their embeddings, capturing deeper semantic similarities that go beyond surface-level exact word matches.

![](./bert.png)

Optionally, each token can be weighted by its inverse document frequency (IDF), computed over the reference corpus, which prioritizes rare words that are more informative.
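A minimal usage sketch with the bert-score package (assuming it is installed; it downloads a pretrained model on first use, and the sentences are toy data):

```python
from bert_score import score  # pip install bert-score

candidates = ["the weather is freezing today", "he reads a book"]
references = ["it is very cold today", "he is reading a novel"]

# idf=True enables the IDF weighting mentioned above, computed over the references
P, R, F1 = score(candidates, references, lang="en", idf=True)
print(F1.mean().item())
```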
