I investigated a few libraries and am choosing:
BertViz: shows the attention between the search query and a SERP.
Others I considered but will hold off on for now are:
Attention: gives a correlation matrix and affords interactive attention tracing.
Captum: from Meta; uses an integrated gradients approach.
SERP = Search Engine Results Page
PL = Programming Language
Note that 'page' here is search/IR vernacular for some block of text. Historically it referred to a page from a document (say a book or website), whereas here it typically refers to a class/method/function from the PL.
There are pros and cons to each library, so I'll just pick one to implement and move forward, perhaps investigating the others afterwards to see whether they address anything that turns out to be a clear limitation. I'll also be able to contextualize those trade-offs better once the chosen library has seen some use.
For now I have some scratch code for BertViz I'm working to include as an inspection tool for a SERP.
I will know this issue is resolved when I can click on a SERP and see a BertViz of the query and the result.
Doing so lets someone see which tokens in the phrasing of their search query generate the selected response. This view lets the user dynamically and iteratively reformulate their query without needing to understand complicated vernaculars such as Lucene-style queries.
Limitations:
The query and the SERP (NL + PL) together need to fit within the context window length. While we could do more complicated things, we skip that for now to get something working.
This will not work 'out of the box' for models that aren't in the list of models that BertViz supports. If we choose a different dependency, it will have its own set of supported models. For now, we're using the CodeBERT model, which belongs to the RoBERTa family of models. With some limited hacking we can get this model to work.
The BertViz library dumps all the attentions into the data attribute of the HTML class of the output so that the d3 JavaScript library can render them (I think). It would be nice to read the embeddings in from sqlite-vec instead: given a chosen layer, read in the vectors for only that layer and render the visual.
I am unsure how actively the BertViz library is maintained. For example, the last release was 2 years ago, and there is a dependency issue I noticed and opened a PR for. Some response on the PR, even just a reply, would indicate some active maintainer effort.
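As a rough illustration of the context window limitation above, here is a minimal sketch of a pre-check that could run before a result is sent to the visualization. It assumes a 512-token window and the RoBERTa pair layout `<s> query </s></s> code </s>`, which adds four special tokens; the function name and the default limit are hypothetical and not taken from the app.

```python
from typing import Sequence

# Assumption: RoBERTa-style pair encoding adds 4 special tokens:
# <s> query </s></s> code </s>
ROBERTA_PAIR_SPECIAL_TOKENS = 4


def fits_context_window(
    query_tokens: Sequence[str],
    code_tokens: Sequence[str],
    max_len: int = 512,  # assumed window size; check the model config
) -> bool:
    """Return True if query + code + special tokens fit in the window."""
    total = len(query_tokens) + len(code_tokens) + ROBERTA_PAIR_SPECIAL_TOKENS
    return total <= max_len


if __name__ == "__main__":
    query = ["find", "user", "by", "email"]
    code = ["def", "get", "_user", "(", "email", ")", ":"]
    print(fits_context_window(query, code))
```

Results that fail the check could simply have the inspect view disabled, which matches the "skip it for now" approach.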
The BertViz library dumps all the attentions into the data attribute of the HTML class of the output so that the d3 JavaScript library can render them (I think). It would be nice to read the embeddings in from sqlite-vec instead: given a chosen layer, read in the vectors for only that layer and render the visual.
In the latest version of BertViz (1.4.0), the attentions are written as a substitution for a constant in the *.js versions of the views. Here is the line for the constant in the head_view, and the line which does the interpolation is here in the head_view.py module, under html_action=='view', which is the branch used within a Jupyter notebook. To embed the view into HTML outside a Jupyter notebook, use the 'return' branch of the function instead and then format the result appropriately for your templating engine. The Flask app uses Jinja, so I pass the data from this view through Markup() to mark it as safe; otherwise Jinja's autoescaping would escape the HTML and script tags.
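To make the substitution mechanism concrete, here is a minimal standard-library sketch of the general technique: serialize the tokens and attention weights as JSON and substitute them for a placeholder constant in a JavaScript template. This is not BertViz's actual code; the placeholder name `ATTN_PARAMS`, the template, and `render_script` are hypothetical stand-ins.

```python
import json
from typing import List, Sequence

# Hypothetical JS template; the placeholder is replaced with a JSON literal.
JS_TEMPLATE = "const params = ATTN_PARAMS;\nconsole.log(params.tokens.length);"


def render_script(tokens: Sequence[str], attention: List[List[float]]) -> str:
    """Embed tokens and attention weights into a <script> element."""
    params = {"tokens": list(tokens), "attention": attention}
    # Escape "</" so a token such as "</s>" can never terminate the script
    # element early; "<\/" is still valid JSON and valid JavaScript.
    payload = json.dumps(params).replace("</", "<\\/")
    return "<script>\n" + JS_TEMPLATE.replace("ATTN_PARAMS", payload) + "\n</script>"


if __name__ == "__main__":
    html = render_script(["<s>", "def", "</s>"], [[1.0, 0.0, 0.0]] * 3)
    print(html)
```

Because every weight ends up inline in the page, the payload grows with the number of layers, heads, and the square of the sequence length, which is why loading a single layer on demand is attractive.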
Also, here is a screenshot from one of the inspect views inside the app to show what I mean by writing all the attentions into the HTML (inside a <script> element).
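The per-layer loading idea from the limitations could look roughly like the sketch below. It uses the standard-library sqlite3 module as a stand-in for sqlite-vec, and the `attention` table, its columns, and the helper names are all hypothetical. Weights are stored as little-endian float32 blobs, one row per (document, layer, head), so a single query returns only the chosen layer.

```python
import sqlite3
import struct
from typing import List, Sequence


def pack_floats(values: Sequence[float]) -> bytes:
    """Pack floats into a little-endian float32 blob."""
    return struct.pack(f"<{len(values)}f", *values)


def unpack_floats(blob: bytes) -> List[float]:
    """Unpack a little-endian float32 blob into a list of floats."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


def load_layer(conn: sqlite3.Connection, doc_id: int, layer: int) -> List[List[float]]:
    """Read the stored weights for one layer only, ordered by head."""
    rows = conn.execute(
        "SELECT head, weights FROM attention "
        "WHERE doc_id = ? AND layer = ? ORDER BY head",
        (doc_id, layer),
    ).fetchall()
    return [unpack_floats(blob) for _, blob in rows]


# Hypothetical schema and sample data, in memory for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE attention (doc_id INTEGER, layer INTEGER, head INTEGER, weights BLOB)"
)
conn.executemany(
    "INSERT INTO attention VALUES (?, ?, ?, ?)",
    [
        (1, 0, 0, pack_floats([1.0, 0.0])),
        (1, 1, 0, pack_floats([0.5, 0.5])),
        (1, 1, 1, pack_floats([0.25, 0.75])),
    ],
)

if __name__ == "__main__":
    print(load_layer(conn, doc_id=1, layer=1))
```

The view would then receive only the selected layer's weights rather than the full set of attentions.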
For now the proof of concept is working OK, but the attention weight visualizations are less than ideal. This is because the BertViz library doesn't seem to provide an option to exclude the [CLS] and [SEP] tokens from the visualization and renormalize the remaining weights.
The author mentions this in a paper but, it seems, never updated the code; cf. Section 5.2.3, "Filtering Null Attention", of the paper.
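One possible reading of that filtering step, sketched in plain Python: drop the rows and columns belonging to special tokens, then rescale each remaining row so it sums to 1 again. This is an assumption about what such an option could do, not the paper's or BertViz's implementation; the function name and the token set (which also includes the RoBERTa markers `<s>` and `</s>`) are hypothetical.

```python
from typing import FrozenSet, List, Sequence, Tuple

# Assumed set of special tokens to hide (BERT and RoBERTa styles).
SPECIAL_TOKENS: FrozenSet[str] = frozenset({"[CLS]", "[SEP]", "<s>", "</s>"})


def filter_and_renormalize(
    tokens: Sequence[str],
    attn: Sequence[Sequence[float]],
    special: FrozenSet[str] = SPECIAL_TOKENS,
) -> Tuple[List[str], List[List[float]]]:
    """Remove special-token rows/columns from one head's attention matrix
    and rescale each remaining row to sum to 1."""
    keep = [i for i, tok in enumerate(tokens) if tok not in special]
    kept_tokens = [tokens[i] for i in keep]
    result: List[List[float]] = []
    for i in keep:
        row = [attn[i][j] for j in keep]
        total = sum(row)
        # Leave an all-zero row untouched to avoid dividing by zero.
        result.append([w / total for w in row] if total > 0 else row)
    return kept_tokens, result


if __name__ == "__main__":
    toks = ["[CLS]", "find", "user", "[SEP]"]
    weights = [
        [0.7, 0.1, 0.1, 0.1],
        [0.5, 0.25, 0.125, 0.125],
        [0.6, 0.1, 0.3, 0.0],
        [0.25, 0.25, 0.25, 0.25],
    ]
    print(filter_and_renormalize(toks, weights))
```

The filtered matrix would be applied per layer and per head before the weights are handed to the view.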
If I revisit this (or someone wants it), I'll make the necessary changes to BertViz, or fork the repo if no one merges my existing PR that fixes the dependency spec in setup_requires.