Greetings everyone, this is Divyansh, and I am going to present my work on the Barclays IB Quant case study.

These are the contents I am going to cover.

I will begin with my introduction.

I am a final-year undergraduate in the computer science department of IIT Kanpur. I am deeply interested in recent advancements in tech and in how these advancements are going to impact the future of health, finance and related sectors. Academically, I have pursued ML and systems, and mainly their applications. Apart from these, I also like to draw and read blogs online.

Now, I will move on to understanding the problem statement.

So, we have been provided with texts from news and blogs, and for each target mentioned in a text we need to predict the aspect and the sentiment score. Any text can have multiple targets.

Let's analyze the data to get more insight into the problem statement.

First, let's look at the distribution of the sentiment score.

New formats don't really impact the score.

The label makes it too trivial.

We preprocess the aspect labels and then look at the counts.
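
To make the counting step concrete, here is a minimal sketch. The exact preprocessing is not spelled out here, so this assumes hierarchical labels separated by "/" and keeps only the top level. The helper names and the sample labels are made up for illustration.

```python
from collections import Counter

def top_level_aspect(raw_label):
    # Assumption: labels look like "Stock/Price Action"; keep the first level only.
    return raw_label.split("/")[0].strip().lower()

def count_aspects(raw_labels):
    # Count how often each normalized aspect appears.
    return Counter(top_level_aspect(label) for label in raw_labels)

# Hypothetical labels, not taken from the case-study data.
raw = ["Stock/Price Action", "Corporate/Risks", " stock/Coverage "]
counts = count_aspects(raw)
print(counts.most_common())
```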

Aspect classification and sentiment score prediction do have a correlation between them.

Let's see the key challenge.

This is not a standard text classification problem, since we have targets.

Consider this text, where we have the RIVIAN and TESLA stocks.

Depending on the target, the aspect and sentiment score change.
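
One way to picture this: a text with several targets becomes one example per target, and each example gets its own aspect and score prediction. The sketch below only shows that expansion. The class name, function name and the sentence are hypothetical and are not the text from the slide.

```python
from dataclasses import dataclass

@dataclass
class TargetExample:
    text: str    # the full input sentence
    target: str  # the entity we want the aspect and sentiment for

def expand_targets(text, targets):
    """Turn one text with several targets into one example per target."""
    return [TargetExample(text=text, target=t) for t in targets]

# Hypothetical sentence, invented for illustration only.
examples = expand_targets(
    "Rivian shares rallied while Tesla slipped after earnings.",
    ["Rivian", "Tesla"],
)
for ex in examples:
    print(ex.target, "->", ex.text)
```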

Now, I will continue with the approach to the problem.

To give the text semantic meaning, we first have a tokenizer and then a BERT model tuned on a financial dataset. The embedding of the CLS token gives an embedding representation of the whole sentence. We call this the vectorizer module.
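
A minimal sketch of such a vectorizer, assuming PyTorch and a Hugging Face style tokenizer and encoder. The function name is hypothetical, and the checkpoint in the usage comment is an assumption, since the specific financial BERT is not named here.

```python
import torch

def cls_embedding(text, tokenizer, encoder):
    """Return the [CLS] vector of `text` as a (1, hidden_dim) tensor."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        outputs = encoder(**inputs)
    # Position 0 of the last hidden state corresponds to the [CLS] token.
    return outputs.last_hidden_state[:, 0, :]

# Usage sketch (assumed checkpoint, swap in whichever financial BERT is used):
# from transformers import AutoModel, AutoTokenizer
# tokenizer = AutoTokenizer.from_pretrained("ProsusAI/finbert")
# encoder = AutoModel.from_pretrained("ProsusAI/finbert")
# sentence_vec = cls_embedding("some news headline", tokenizer, encoder)
# target_vec = cls_embedding("Tesla", tokenizer, encoder)
```

The same function can be applied to the sentence and to the target string, which gives one vector for each.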

So, now we have vector representations for both the input sentence and the target, but what about the key challenge of target-based processing?

The targets need attention. Let's say this is the sentence representation and below it is the target representation. Now, from the sentence, we need to know the sentiment about this target specifically. So, attention heads make the embedding attend to the text about this target specifically and ignore the other parts. This gives the sentiment output for this target specifically. I call this the attention module.
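
As a rough sketch of this idea, the target vector can act as the attention query over the sentence. This is only one possible reading: it assumes PyTorch, uses nn.MultiheadAttention as a stand-in for the attention heads, and assumes token-level sentence embeddings are available as keys and values. The class name and the sizes are hypothetical.

```python
import torch
from torch import nn

class TargetAttention(nn.Module):
    """Let the target embedding attend over the sentence tokens (sketch)."""

    def __init__(self, dim=768, num_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, target_vec, sentence_tokens, padding_mask=None):
        # target_vec: (batch, dim); sentence_tokens: (batch, seq_len, dim)
        query = target_vec.unsqueeze(1)  # (batch, 1, dim)
        context, weights = self.attn(
            query, sentence_tokens, sentence_tokens,
            key_padding_mask=padding_mask,
        )
        # context: target-aware sentence summary, shape (batch, dim)
        # weights: attention over tokens (averaged over heads), (batch, seq_len)
        return context.squeeze(1), weights.squeeze(1)
```

The returned weights show which tokens the target focused on, which is handy for checking that a TESLA query really ignores the RIVIAN part of the sentence.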

We now have an embedding with all the context of the input sentence and target. For aspect classification we have a simple dense classifier, and for the sentiment score a simple dense regressor.
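
The two heads can be sketched as two linear layers on top of the shared embedding. This assumes PyTorch. The class name, the embedding size and the number of aspect classes are hypothetical.

```python
import torch
from torch import nn

class AspectSentimentHeads(nn.Module):
    """Dense classifier for aspects and dense regressor for the score (sketch)."""

    def __init__(self, dim=768, num_aspects=4):
        super().__init__()
        self.aspect_head = nn.Linear(dim, num_aspects)  # one logit per aspect
        self.score_head = nn.Linear(dim, 1)             # one sentiment score

    def forward(self, context):
        # context: (batch, dim) target-aware embedding
        aspect_logits = self.aspect_head(context)     # (batch, num_aspects)
        score = self.score_head(context).squeeze(-1)  # (batch,)
        return aspect_logits, score
```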

For backprop, we have MSE loss for the score and BCE-with-logits loss for aspect classification. The aggregated loss is then used to update the weights of the model.
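
A minimal sketch of the aggregated loss, assuming PyTorch and assuming the two losses are simply summed with equal weight, since the aggregation is not specified here. The function name is hypothetical, and the aspect labels are assumed to be multi-hot float vectors.

```python
import torch
from torch import nn

mse = nn.MSELoss()            # regression loss for the sentiment score
bce = nn.BCEWithLogitsLoss()  # classification loss on raw aspect logits

def combined_loss(score_pred, score_true, aspect_logits, aspect_true):
    # Assumption: unweighted sum of the two losses.
    return mse(score_pred, score_true) + bce(aspect_logits, aspect_true)

# Calling loss.backward() on the result sends gradients through both heads
# and every shared layer below them.
```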

Evaluation results