Vietnamese Sentiment Analysis Assistant 🎭

A powerful hybrid sentiment analysis system for Vietnamese text that combines PhoBERT with a rule-based approach, achieving 95.80% accuracy on a test set of 1000 diverse prompts.

✨ Features

Hybrid Model: Combines deep learning (PhoBERT) with rule-based analysis for optimal accuracy
Advanced Preprocessing: Noise removal, teencode normalization, word segmentation
Streamlit Web App: Beautiful, user-friendly interface
History Management: SQLite-based storage with export/import capabilities
Multi-format Export: CSV, JSON, HTML, ICS calendar format
Comprehensive Testing: Validated on 1000+ diverse prompts
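
The preprocessing steps listed above (noise removal and teencode normalization) can be sketched roughly as follows. This is a minimal illustration, not the code in preprocessing.py: the `TEENCODE` mapping, the regexes, and the `clean_text` name are all assumptions made for the example. Word segmentation is left out; the Dependencies list suggests underthesea handles that step in the real project.

```python
import re
import unicodedata

# Hypothetical teencode lexicon; the project's real mapping is not shown in this README.
TEENCODE = {
    "ko": "không",
    "k": "không",
    "dc": "được",
    "đc": "được",
    "bt": "bình thường",
}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
REPEAT_RE = re.compile(r"(.)\1{2,}")  # the same character three or more times in a row


def clean_text(text: str) -> str:
    """Normalize Unicode, lowercase, drop URLs, collapse elongation, expand teencode."""
    text = unicodedata.normalize("NFC", text)  # compose Vietnamese diacritics
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = REPEAT_RE.sub(r"\1", text)  # "quááá" -> "quá", "!!!" -> "!"
    tokens = re.findall(r"\w+|[^\w\s]", text)  # words and single punctuation marks
    tokens = [TEENCODE.get(tok, tok) for tok in tokens]
    return " ".join(tokens)


print(clean_text("Phim này ko hay!!! https://example.com"))
```

Normalizing to NFC first matters for Vietnamese input, because decomposed diacritics would otherwise split words at the tokenization step.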

🚀 Quick Start

Local Development

# Clone repository
git clone https://github.com/d0ngle8k/Extract-Prompt-To-Emotion.git
cd Extract-Prompt-To-Emotion

# Create virtual environment
python -m venv .venv
.venv\Scripts\activate  # Windows
# source .venv/bin/activate  # macOS/Linux

# Install dependencies
pip install -r requirements.txt

# Run the app
streamlit run app.py

🌐 Deploy to Streamlit Cloud

Fork this repository on GitHub
Go to Streamlit Cloud
Connect your GitHub account
Select this repository
Deploy! (Streamlit will automatically detect app.py as the entry point)

⚠️ Streamlit Cloud Notes:

Free tier limitations: Model loading may take time on first run
Data persistence: History data doesn't persist between sessions (cloud limitation)
Memory usage: The PhoBERT model requires ~2GB of RAM, so make sure the deployment has adequate resources
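
One common way to keep the slow model load to a single occurrence per process is to cache the loaded object. The sketch below uses the standard library's `lru_cache` so that it runs without any dependencies; in a Streamlit app the `st.cache_resource` decorator plays a similar role. The `_load_model_from_disk` and `get_model` names are hypothetical, and the loader is a stand-in, not the project's actual loading code.

```python
from functools import lru_cache


def _load_model_from_disk(model_dir: str) -> dict:
    """Stand-in for the slow step (hypothetical); a real app would load
    the tokenizer and model weights from model_dir here."""
    return {"model_dir": model_dir}


@lru_cache(maxsize=1)
def get_model(model_dir: str = "phobert_finetuned") -> dict:
    """Load once per process and hand back the same object on later calls."""
    return _load_model_from_disk(model_dir)


first = get_model()
second = get_model()
print(first is second)  # the second call is served from the cache
```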

📊 Performance

Overall Accuracy: 95.80%
Positive: 85.71%
Negative: 98.72%
Neutral: 98.70%

Tested on 1000 diverse Vietnamese prompts including standard text, toxic language, slang, and edge cases.
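
The README does not state how the per-class figures are computed. Assuming each one is the share of prompts with that true label that were classified correctly, a report of this shape could be produced as in the sketch below. The `accuracy_report` function and the toy labels are illustrative only and are not taken from the test suite.

```python
from collections import Counter


def accuracy_report(y_true: list[str], y_pred: list[str]) -> dict[str, float]:
    """Return overall accuracy and per-class accuracy, both in percent."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    totals = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    report = {"overall": 100.0 * sum(correct.values()) / len(y_true)}
    for label, n in totals.items():
        report[label] = 100.0 * correct[label] / n
    return report


# Toy data, not the project's test set.
y_true = ["positive", "positive", "negative", "neutral"]
y_pred = ["positive", "negative", "negative", "neutral"]
print(accuracy_report(y_true, y_pred))
```

Under this definition the overall figure is a weighted average of the per-class figures, weighted by how many prompts each class contains.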

🏗️ Architecture

app.py              # Streamlit web interface
├── preprocessing.py # Text cleaning & segmentation
├── phobert_module.py # Hugging Face PhoBERT integration
├── rule_based.py    # Lexicon-based sentiment analysis
├── fusion.py        # Conditional model fusion
└── db_connector.py  # SQLite database operations
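
To make "conditional model fusion" concrete, here is one possible rule, sketched under assumptions: the model's prediction is kept unless its confidence is low and the rule-based prediction is confident. The `Prediction` type, the `fuse` function, and both thresholds are hypothetical; the actual logic in fusion.py may differ.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    label: str         # "positive", "negative", or "neutral"
    confidence: float  # in [0, 1]


def fuse(model_pred: Prediction, rule_pred: Prediction,
         model_threshold: float = 0.8, rule_threshold: float = 0.6) -> Prediction:
    """Keep the model's answer unless it is unsure and the rules are not."""
    if model_pred.label == rule_pred.label:
        # Agreement: report the label with the higher of the two confidences.
        return Prediction(model_pred.label,
                          max(model_pred.confidence, rule_pred.confidence))
    if model_pred.confidence < model_threshold and rule_pred.confidence >= rule_threshold:
        # Disagreement with an unsure model: defer to the lexicon-based result.
        return rule_pred
    return model_pred


print(fuse(Prediction("neutral", 0.55), Prediction("negative", 0.9)).label)
```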

📁 Project Structure

Extract-Prompt-To-Emotion/
├── app.py                 # Main Streamlit application
├── preprocessing.py       # Vietnamese text preprocessing
├── phobert_module.py      # PhoBERT sentiment analysis
├── rule_based.py          # Rule-based sentiment analysis
├── fusion.py              # Model fusion logic
├── db_connector.py        # Database operations
├── requirements.txt       # Python dependencies
├── .streamlit/config.toml # Streamlit configuration
├── test/                  # Test suite
│   ├── test_1000_prompts.py
│   └── test_1000_prompts_refined.txt
├── phobert_finetuned/     # Fine-tuned PhoBERT model files
└── README.md

🔧 Dependencies

streamlit: Web app framework
transformers: Hugging Face models
torch: PyTorch for deep learning
underthesea: Vietnamese NLP toolkit
pandas: Data manipulation
numpy: Numerical operations

🎯 Usage

Classification Tab: Input Vietnamese text and get instant sentiment analysis
History Tab: View past classifications with export options
Export Data: Download history in CSV, JSON, HTML, or ICS format
Import Data: Upload CSV/JSON files to add historical data
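
As an illustration of the CSV and JSON exports, the sketch below reads rows from a SQLite table and serializes them with the standard library. The `history` table, its columns, and the `export_history` function are assumptions made for the example; the real schema lives in db_connector.py and is not shown in this README.

```python
import csv
import io
import json
import sqlite3

COLUMNS = ["id", "text", "sentiment", "created_at"]  # hypothetical schema


def export_history(conn: sqlite3.Connection) -> tuple[str, str]:
    """Return the history table as (csv_text, json_text)."""
    cur = conn.execute(
        "SELECT id, text, sentiment, created_at FROM history ORDER BY id")
    rows = [dict(zip(COLUMNS, row)) for row in cur]
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    # ensure_ascii=False keeps Vietnamese characters readable in the JSON file.
    return buf.getvalue(), json.dumps(rows, ensure_ascii=False, indent=2)


# Demo on an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE history ("
    "id INTEGER PRIMARY KEY, text TEXT, sentiment TEXT, created_at TEXT)")
conn.execute(
    "INSERT INTO history (text, sentiment, created_at) VALUES (?, ?, ?)",
    ("Phim này rất hay", "positive", "2024-01-01T00:00:00"))
csv_text, json_text = export_history(conn)
print(csv_text)
print(json_text)
```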

🤝 Contributing

Fork the repository
Create a feature branch
Make your changes
Add tests if applicable
Submit a pull request

📄 License

This project is open source. Feel free to use and modify.

🙏 Acknowledgments

PhoBERT by VinAI Research
Underthesea Vietnamese NLP toolkit
Streamlit for the amazing web app framework

🤝 Collaborator

Special thanks: chotxx - Vu Quoc Bao
