Used the Enron email dataset to represent a realistic scenario. Classified emails based on the content of each message. Extracted features with BERT, Google's NLP model, to obtain feature vectors, then fed those vectors into a standard artificial neural network that performed the classification. For the classification task, also compared several other models: Linear Support Vector Machine, Random Forest, SGD Classifier, and LSTM. For the machine learning approaches, tried different embeddings such as TF-IDF and CountVectorizer. Ran topic modelling with Latent Dirichlet Allocation to find the major topics of discussion in the dataset.
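To make the difference between the two embeddings concrete, here is a minimal pure-Python sketch of what a count vector and a TF-IDF vector compute. It is an illustration under stated assumptions, not the project's code: the toy `emails` list and the helper names (`tokenize`, `count_vector`, `tfidf_vector`) are hypothetical, and the weighting used is the plain `tf * log(N / df)` form. Library vectorizers such as scikit-learn's apply a smoothed IDF and L2 normalization by default, so their numbers will differ.

```python
import math
from collections import Counter

# Hypothetical toy corpus standing in for Enron emails (assumption).
emails = [
    "meeting schedule for the gas trading desk",
    "gas price forecast and trading report",
    "lunch schedule for friday",
]


def tokenize(text):
    """Lowercase and split on whitespace; real pipelines clean text further."""
    return text.lower().split()


def count_vector(tokens, vocab):
    """Raw term counts, one entry per vocabulary term."""
    counts = Counter(tokens)
    return [counts[term] for term in vocab]


def tfidf_vector(tokens, vocab, doc_freq, n_docs):
    """Term count scaled by log(N / df), so rarer terms get more weight."""
    counts = Counter(tokens)
    return [counts[term] * math.log(n_docs / doc_freq[term]) for term in vocab]


docs = [tokenize(email) for email in emails]
vocab = sorted({term for doc in docs for term in doc})

# Document frequency: in how many emails each term appears at least once.
doc_freq = Counter(term for doc in docs for term in set(doc))

count_features = [count_vector(doc, vocab) for doc in docs]
tfidf_features = [tfidf_vector(doc, vocab, doc_freq, len(docs)) for doc in docs]
```

Either feature matrix can then be passed to a classifier such as a linear SVM or a random forest. In this toy corpus, "lunch" appears in a single email and so receives a higher TF-IDF weight than "gas", which appears in two, even though both have a raw count of 1 where they occur.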
shaival2905/email-classification-using-BERT