- This project uses a transformer-based architecture, specifically BERT (Bidirectional Encoder Representations from Transformers), for sentiment analysis. BERT, known for its bidirectional contextual understanding, is the foundation of the custom classifier we built for Yelp sentiment analysis. By fine-tuning the last four layers of BERT, we aim to increase its sensitivity to nuanced sentiment expressions.
- We use a custom classifier that includes the BERTsmall model as one of its components. We start from the pre-trained BERT model and update the weights of only its last few layers during training. On top of the BERT model, we add fully connected layers with batch normalization, dropout, and ReLU activation functions.
- Fine-tuning of the BERT base model: The BERT model serves as the backbone that captures contextualized representations of the input sequences in our classification model. We freeze the first 8 layers of the BERTsmall model, which prevents them from being updated during training and preserves the pre-trained contextual information. The last four BERT layers remain trainable, so their weights are fine-tuned on our data. Fine-tuning a pre-trained model has proven effective for sentiment analysis, which is why we adopt this setup in our classifier.
- Fully Connected and Batch Normalization Layers: Following the BERT layers, the model incorporates several fully connected layers [6]. These layers, with specified output dimensions (e.g., 128, 64, 32, 3), learn task-specific features by transforming the high-dimensional BERT output into a form suitable for classification. We apply batch normalization after each fully connected layer to stabilize and normalize activations, mitigating issues such as internal covariate shift during training.
- Dropout: To prevent overfitting, dropout layers are inserted after each fully connected layer and on the BERT output. Dropout randomly zeroes a fraction of the activations during training, which reduces overfitting and improves generalization.
- Activation layers: We use the rectified linear unit (ReLU) activation function after each fully connected layer to introduce non-linearity and enable the model to learn complex relationships within the data.
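
The architecture described in the list above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the class name `SentimentClassifier`, the helper `freeze_all_but_last_n_layers`, the dropout rate of 0.3, the Linear → BatchNorm → ReLU → Dropout ordering, the use of BERT's pooled output, the frozen embeddings, and the choice to leave the final 3-unit layer as raw logits are all assumptions. To keep the sketch runnable without downloading weights, it builds a tiny randomly initialized 12-layer BERT from a `BertConfig`; in practice one would load a pre-trained checkpoint with `BertModel.from_pretrained(...)` instead.

```python
import torch
import torch.nn as nn
from transformers import BertConfig, BertModel


def freeze_all_but_last_n_layers(bert: BertModel, n_trainable: int) -> None:
    """Freeze the embeddings and every encoder layer except the last n_trainable.

    Freezing the embeddings is an assumption; the text only mentions layers.
    """
    for param in bert.embeddings.parameters():
        param.requires_grad = False
    layers = bert.encoder.layer
    for layer in layers[: len(layers) - n_trainable]:
        for param in layer.parameters():
            param.requires_grad = False


class SentimentClassifier(nn.Module):
    """Hypothetical classifier: partially frozen BERT plus a fully connected head."""

    def __init__(self, bert, hidden_dims=(128, 64, 32), num_classes=3,
                 dropout=0.3, n_trainable=4):
        super().__init__()
        self.bert = bert
        freeze_all_but_last_n_layers(self.bert, n_trainable)
        # Dropout applied to the BERT output before the head.
        self.bert_dropout = nn.Dropout(dropout)
        blocks = []
        in_dim = bert.config.hidden_size
        for out_dim in hidden_dims:
            # Assumed ordering: Linear -> BatchNorm -> ReLU -> Dropout.
            blocks += [
                nn.Linear(in_dim, out_dim),
                nn.BatchNorm1d(out_dim),
                nn.ReLU(),
                nn.Dropout(dropout),
            ]
            in_dim = out_dim
        # Final layer emits one raw logit per sentiment class (assumption).
        blocks.append(nn.Linear(in_dim, num_classes))
        self.head = nn.Sequential(*blocks)

    def forward(self, input_ids, attention_mask):
        outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        pooled = outputs.pooler_output  # shape: (batch, hidden_size)
        return self.head(self.bert_dropout(pooled))


# Tiny random stand-in for a 12-layer BERT (sizes chosen only for speed).
torch.manual_seed(0)
config = BertConfig(vocab_size=1000, hidden_size=64, num_hidden_layers=12,
                    num_attention_heads=4, intermediate_size=128)
model = SentimentClassifier(BertModel(config))
model.eval()

input_ids = torch.randint(0, config.vocab_size, (2, 16))
attention_mask = torch.ones_like(input_ids)
with torch.no_grad():
    logits = model(input_ids, attention_mask)

# Indices of encoder layers that still receive gradient updates.
trainable_layers = [
    i for i, layer in enumerate(model.bert.encoder.layer)
    if any(p.requires_grad for p in layer.parameters())
]
print(logits.shape)      # torch.Size([2, 3])
print(trainable_layers)  # [8, 9, 10, 11]
```

When training a model set up this way, it is common to hand the optimizer only the parameters that still have `requires_grad=True`, so the frozen layers are skipped entirely.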
mohitsarin-tamu/Sentiment-Analysis-YELP-data