AmirHossein Honardoust

Data Scientist & Machine Learning Engineer
I design data-centric, explainable, and interactive AI systems.

All Repositories | LinkedIn | Email



About Me

  • Data Scientist & ML Engineer with a Physics background and hands-on experience across:
    • Time series & forecasting
    • NLP and text classification
    • Computer vision & recommendation systems
    • Algorithmic trading & risk modeling
  • Comfortable with the full ML lifecycle:
    • Problem framing → data generation/collection → feature engineering → modeling → evaluation → deployment (Streamlit/FastAPI dashboards).
  • Strong focus on:
    • Synthetic data & robustness
    • Interactive dashboards & decision-support tools
    • LLMs, RAG & hybrid AI architectures
    • Explainability & human-centered AI
  • I enjoy writing deep, structured explanations; many repos are part code, part essay.

Tech Snapshot

Languages & Tools

  • Languages: Python, SQL, MQL5, a bit of Solidity
  • ML / DL: PyTorch, TensorFlow/Keras, scikit-learn, XGBoost, LightGBM
  • Time Series & Forecasting: ARIMA/SARIMA, Prophet, LSTMs, rolling windows
  • NLP: BERT, TF-IDF, classic ML (LogReg, Naive Bayes, SVM)
  • Apps & Services: Streamlit, FastAPI, Plotly, SQLite/SQLAlchemy
  • MLOps-ish / Analysis: SHAP, feature importance, evaluation frameworks, synthetic data benchmarks

Domains & Topics

  • Synthetic & tabular data quality
  • Forecasting & scenario simulation
  • Recommender systems
  • Smart contract risk analytics
  • Interactive data storytelling
  • LLMs & Retrieval-Augmented Generation (RAG)
  • Algorithmic trading & financial modeling

How to Navigate My Work

I organize my projects into a few clusters so you can jump directly to what interests you:

  1. Synthetic Data, Data Realism & Anomaly Detection
  2. Forecasting, Dashboards & Interactive ML
  3. LLMs, RAG & Hybrid AI Systems
  4. Core ML, Deep Learning & Portfolio Projects
  5. Explainability, Thinking Like a Data Scientist & AI Philosophy
  6. Smart Contracts, Security & Risk Analytics

Synthetic Data, Data Realism & Anomaly Detection

Generating realistic tables, probing data “authenticity”, and stress-testing models.

These projects focus on fidelity, coverage, privacy, and utility of synthetic data, plus anomaly detection in tabular domains.

  • Autocurator-Synthetic-Data-Benchmark
    A benchmarking toolkit for synthetic tabular data generators:

    • Compares different models (GANs, VAEs, copulas, etc.)
    • Evaluates distribution fidelity, feature coverage, privacy leakage, and downstream ML utility
    • Produces visual reports (PCA, correlations, histograms) to understand where generators succeed or fail
      Goal: make it easier to choose the right synthetic data approach for a business use case.
  • Synthetic-Data-Artist
    Deep dive into Gaussian Copula vs VAE for tabular data:

    • Side-by-side comparison of marginal and joint distributions
    • PCA visualizations of real vs synthetic embeddings
    • Correlation matrix similarity, pair plots, and coverage analysis
      Think of it as a “microscope” for synthetic tabular data.
  • Anomaly-Detection
    End-to-end anomaly detection on synthetic transactions/sales:

    • Data generation with realistic “weird” patterns injected
    • Uses Isolation Forest, Local Outlier Factor, and classic statistical methods
    • Visual diagnostics and confusion-matrix-style evaluations
      Useful for building intuition about anomalies in finance/ops data.
  • Market-Basket-Analysis
    Retail-style synthetic purchase data:

    • Apriori & FP-Growth frequent itemset mining
    • Association rules with support, confidence, and lift
    • Exportable rules + quick visual summaries
      Foundation for recommendation, cross-sell, and promo design.
  • Sales-Data-Analysis
    Lightweight but complete:

    • Synthetic sales dataset generation
    • Cleaning, aggregation, KPI dashboards
    • Time-based trend analysis and segmentation
      Great for explaining analytics pipelines to non-technical stakeholders.
  • Missing-Data-Doctor
    Toolkit for missingness profiling & imputation:

    • Visual missingness maps and patterns (by column, row, time)
    • Simple and advanced imputation strategies
    • Before/after comparisons for ML performance
      Focus: understanding how missing data distorts models.
  • Noise-Injection-Techniques
    Experiments on robustness via controlled noise:

    • Add noise to tabular features/labels during training
    • Explore how different noise types affect generalization
    • PyTorch-based training loops and results visualization
      Bridge between data augmentation and robustness in non-vision domains.
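
The controlled noise described under Noise-Injection-Techniques can be sketched in a few lines. This is a minimal, standard-library illustration, not the repository's PyTorch code: the function names `add_gaussian_noise` and `flip_labels`, the toy data, and the noise levels are all assumptions chosen for the example.

```python
import random


def add_gaussian_noise(rows, sigma, rng):
    """Return a copy of `rows` with zero-mean Gaussian noise added to every feature."""
    return [[value + rng.gauss(0.0, sigma) for value in row] for row in rows]


def flip_labels(labels, flip_rate, rng):
    """Return a copy of binary labels, each flipped with probability `flip_rate`."""
    return [1 - y if rng.random() < flip_rate else y for y in labels]


rng = random.Random(0)  # fixed seed so a noisy run can be repeated
features = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]  # hypothetical tabular rows
labels = [0, 1, 1]                                  # hypothetical binary targets

# Feature noise and label noise are applied separately so their effects
# on generalization can be studied one at a time.
noisy_features = add_gaussian_noise(features, sigma=0.1, rng=rng)
noisy_labels = flip_labels(labels, flip_rate=0.2, rng=rng)
```

Both helpers return new lists and leave the clean data untouched, which makes it easy to train on the noisy copy and evaluate on the original.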

Forecasting, Dashboards & Interactive ML

Treat forecasting and analytics as interactive tools, not static reports.

These projects focus on Streamlit dashboards, scenario analysis, and business-friendly UIs.

  • Forecast-Factory
    Forecasting & simulation app:

    • Streamlit UI to upload time series (sales, traffic, revenue)
    • Uses Prophet (and/or other models) for forecasting with confidence intervals
    • Lets users run “what if we change X?” simulations on key drivers
      Designed for business teams to explore future scenarios without touching code.
  • Market-IQ
    BI-style web app:

    • Ingests transactional/sales-like data
    • Computes core KPIs (revenue, retention, AOV, etc.)
    • Time-series charts, comparisons, and exportable reports
      Acts like a focused analytics tool for small/medium businesses.
  • Data-Storytelling-Dashboard
    End-to-end narrative dashboard:

    • E-commerce style dataset with customers, orders, and products
    • KPIs, cohort analysis, and retention curves
    • Visuals + narrative “takeaways” to interpret the charts
      Focuses on storytelling, not just plotting.
  • Beyond-Charts-Interactive-Storytelling
    Code + essay:

    • RFM segmentation, cohort tracking, user lifecycle
    • Interactive views that adapt to user selections
    • Conceptual guide on how to build narrative dashboards
      For people who want to turn dashboards into decision tools.
  • AI-Report-Factory
    Automated reporting:

    • Input: structured data + configuration
    • Output: KPIs, visualizations, and narrative sections in Markdown/HTML
    • Uses templating to make the reporting repeatable
      Ideal for recurring reports that still need a “human-readable” style.
  • AI-Personal-Study-Tracker
    Productivity & study analytics:

    • Streamlit interface for logging study sessions
    • SQLite backend for persistence
    • ML model (RandomForestRegressor) to predict productivity and surface patterns
      Example of feeding ML results back to the user as personal feedback.
  • Demand-Forecasting
    Classic time-series pipeline:

    • Synthetic demand & seasonality
    • ARIMA/SARIMA modeling workflow
    • Forecast evaluation and plots
      Template for demand planning and inventory decisions.
  • ML-Playground-Autodetect
    Auto ML playground:

    • Streamlit UI where you upload a dataset
    • Automatically detects classification vs regression
    • Builds sensible ML pipelines + evaluation
      Useful for teaching and quick sanity checks.
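
One way to picture the classification-vs-regression detection mentioned under ML-Playground-Autodetect is a simple heuristic on the target column. The rule below, the function name `detect_task`, and the `max_classes` threshold are assumptions for illustration; the repository may use a different rule.

```python
def detect_task(target_values, max_classes=20):
    """Guess 'classification' or 'regression' from a target column.

    Heuristic (an assumption, not the repository's rule):
    - string or boolean targets -> classification
    - numeric targets with any fractional value -> regression
    - whole-number targets with few distinct values -> classification
    """
    values = [v for v in target_values if v is not None]
    if not values:
        raise ValueError("target column has no usable values")
    if any(isinstance(v, (str, bool)) for v in values):
        return "classification"
    if any(not float(v).is_integer() for v in values):
        return "regression"
    return "classification" if len(set(values)) <= max_classes else "regression"


print(detect_task(["spam", "ham", "spam"]))  # classification
print(detect_task([1.5, 2.25, 3.0]))         # regression
```

A threshold like `max_classes` is a judgment call: integer targets such as counts or ages can look like class labels, so a real tool would likely let the user override the guess.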

LLMs, RAG & Hybrid AI Systems

Building explainable, data-grounded LLM systems with retrieval & graphs.

  • Graph-RAG-Engine
    A more structured take on RAG:

    • Vector search (FAISS) for semantic retrieval
    • Knowledge graph to add structure and relationships
    • FastAPI backend and optional Streamlit front-end
    • Emphasis on traceability and explaining why an answer was given
      Great for recommendation, research assistants, or domain-specific QA.
  • Designing-Hybrid-AI-Systems
    Conceptual + practical:

    • How to combine vector search, knowledge graphs, and LLMs
    • Design patterns for hybrid intelligence
    • Notes on failure modes and interpretability
      A “systems thinking” view for building LLM-powered apps.
  • RAG-vs-Fine-Tuning
    Decision framework:

    • When to use RAG, when to fine-tune, when to do both
    • Cost, latency, maintenance, and data constraints
    • Includes examples and architectural diagrams (where applicable)
      Helpful for teams deciding how to productionize LLMs.
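
The pairing of vector search with a knowledge graph, as described for Graph-RAG-Engine, can be illustrated with a toy retriever. This sketch swaps FAISS for a brute-force cosine search and uses hand-written 2-d "embeddings" and a tiny adjacency dict; every name and value here is hypothetical and only meant to show the shape of the idea.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_retrieve(query_vec, embeddings, graph, top_k=1):
    """Return (doc_id, reason) pairs: vector hits first, then their graph neighbors."""
    ranked = sorted(embeddings, key=lambda d: cosine(query_vec, embeddings[d]), reverse=True)
    hits = ranked[:top_k]
    results = [(doc, "vector match") for doc in hits]
    seen = set(hits)
    for doc in hits:
        for neighbor in graph.get(doc, []):
            if neighbor not in seen:
                seen.add(neighbor)
                results.append((neighbor, f"linked from {doc}"))
    return results


# Toy data (hypothetical): document embeddings and graph edges between documents.
embeddings = {
    "doc_lstm": [0.9, 0.1],
    "doc_arima": [0.8, 0.3],
    "doc_gan": [0.1, 0.9],
}
graph = {"doc_lstm": ["doc_arima"], "doc_gan": []}

results = hybrid_retrieve([1.0, 0.0], embeddings, graph, top_k=1)
print(results)  # [('doc_lstm', 'vector match'), ('doc_arima', 'linked from doc_lstm')]
```

Carrying a reason string with each result is one simple way to support the traceability goal: the caller can show why each document was included in the context.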

Core ML, Deep Learning & Portfolio Projects

Classic ML projects done with clean structure and clear evaluation.

  • Stock-LSTM-Forecasting
    Time-series forecasting with LSTMs:

    • Data preparation with sliding windows
    • PyTorch LSTM architecture
    • Loss curves + forecast vs actual plots
      Use case: financial time series, sensor data, or demand.
  • Image-Captioning-CNN-LSTM
    Computer vision + language:

    • Pretrained ResNet as image encoder
    • LSTM decoder generating captions word by word
    • BLEU score and qualitative examples
      Classic example of multimodal ML.
  • Sentiment-Analysis-BERT
    Transformer-based text classification:

    • Fine-tuning BERT on sentiment data (tweets)
    • Training/evaluation pipeline
    • Confusion matrix, ROC curves, and example predictions
      Template for other classification tasks with BERT-style models.
  • Sentiment-Analysis-NLP
    Classical ML for text:

    • Tokenization, stopword removal, lemmatization
    • TF-IDF vectorization
    • Models: Logistic Regression, Naive Bayes, Random Forest
      Shows how far you can go with “non-deep” NLP.
  • Movie-Recommendation-System
    Hybrid recommender:

    • Content-based filtering (TF-IDF + cosine similarity)
    • Collaborative filtering via matrix factorization
    • Evaluation with ranking metrics and examples
      Useful base for product/content recommendations.
  • LSTM-Time-Series-Forecasting
    Generic LSTM-forecast template:

    • Works on many univariate series
    • Clear code structure and visualizations
      Good starting point for experimenting with sequence models.
  • Handwritten-Digit-GAN
    Generative modeling:

    • DCGAN on MNIST
    • Training loop, generated samples, and latent space interpolation
      Intro to generative models in vision.
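
The sliding-window preparation mentioned under Stock-LSTM-Forecasting turns a single series into supervised (input, target) pairs. The sketch below uses plain Python lists; the function name `make_windows` and its `window` and `horizon` parameters are illustrative assumptions rather than the repository's API.

```python
def make_windows(series, window, horizon=1):
    """Split a series into (input window, target) pairs for supervised training.

    Each input holds `window` consecutive values; the target is the value
    `horizon` steps after the end of that window.
    """
    pairs = []
    for start in range(len(series) - window - horizon + 1):
        inputs = series[start:start + window]
        target = series[start + window + horizon - 1]
        pairs.append((inputs, target))
    return pairs


prices = [10, 11, 12, 13, 14]  # hypothetical closing prices
pairs = make_windows(prices, window=3)
print(pairs)  # [([10, 11, 12], 13), ([11, 12, 13], 14)]
```

In a PyTorch pipeline these pairs would typically be stacked into tensors before being fed to the LSTM, and the train/test split should be made by time so that no window overlaps the evaluation period.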

Explainability, Thinking Like a Data Scientist & AI Philosophy

Code + essays about how we think about and interpret AI systems.

  • Shap-Mini
    Minimal SHAP explainability demo:

    • Tabular ML model (tree-based)
    • Global and local SHAP plots
    • Good for teaching “why did the model decide this?”
      Brings explainability down to a small, digestible example.
  • Think-Like-a-Data-Scientist
    Long-form essay:

    • Framing questions, hypotheses, and experiments
    • How to move from raw data → insight → action
    • Balancing rigor with storytelling
      More about mindset than code.
  • Forecasting-The-Future-of-Forecasting
    Strategic perspective:

    • How forecasting tools shape decisions
    • Interactive foresight vs static point estimates
    • Reflections on feedback loops and reflexive systems
  • The-Future-of-Interactive-ML
    Why interactivity matters:

    • Benefits of human-in-the-loop ML
    • Examples with Streamlit-style interfaces
    • How UI/UX changes modeling choices
  • Algorithmic-Empath-Human-Fallibility
    Ethics & “algorithmic empathy”:

    • Modeling human mistakes, uncertainty, and disagreement
    • Thinking beyond accuracy: fairness, robustness, trust
      Explores what it means for algorithms to “understand” humans.
  • Measuring-The-Soul-of-Data
    Philosophy of data realism:

    • What makes data feel “alive” or “authentic”
    • Exploring relationships, diversity, and subtle patterns
    • Especially in the context of synthetic vs organic data
  • Quiet-Machines-Minimalist-AI
    Minimalist AI:

    • Preference for quiet, non-intrusive, human-respecting AI systems
    • Thoughts on attention, overload, and calm technology

Smart Contracts, Security & Risk Analytics

Applying ML & static analysis ideas to smart contracts and risk.

  • Python-Solidity-Feature-Engineering
    Feature extraction for Solidity contracts:

    • Parse Solidity code using Python tooling
    • Extract structural & semantic features (complexity, patterns)
    • Basis for risk models, security classification, or audit support
  • Smart-Contract-Risk-Analyzer
    Static analysis plus heuristics:

    • Identify risky patterns in smart contracts
    • Compute risk scores or categories
    • Designed as a step toward automated security triage
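
A rough sense of how feature extraction and risk scoring for Solidity code might fit together is sketched below. It uses regular expressions rather than a real Solidity parser, and the pattern list, weights, function names, and sample contract are all assumptions made up for this example, not the rule set of either repository.

```python
import re

# Illustrative patterns and weights (assumptions, not an audited rule set).
RISK_PATTERNS = {
    "tx_origin": (re.compile(r"\btx\.origin\b"), 3),
    "delegatecall": (re.compile(r"\.delegatecall\s*\("), 3),
    "selfdestruct": (re.compile(r"\bselfdestruct\s*\("), 2),
    "low_level_call": (re.compile(r"\.call\s*[\({]"), 1),
}


def extract_features(source):
    """Count simple structural features and risky patterns in Solidity source text."""
    features = {
        "n_lines": len(source.splitlines()),
        "n_functions": len(re.findall(r"\bfunction\s+\w+", source)),
        "n_payable": len(re.findall(r"\bpayable\b", source)),
    }
    for name, (pattern, _weight) in RISK_PATTERNS.items():
        features[name] = len(pattern.findall(source))
    return features


def risk_score(features):
    """Weighted sum of risky-pattern counts; higher means more to review."""
    return sum(features[name] * weight for name, (_pattern, weight) in RISK_PATTERNS.items())


# Hypothetical contract used only to exercise the functions above.
contract = """
contract Wallet {
    address owner;
    function withdraw(uint amount) public {
        require(tx.origin == owner);
        (bool ok, ) = msg.sender.call{value: amount}("");
        require(ok);
    }
    function deposit() public payable {}
}
"""
features = extract_features(contract)
print(features["tx_origin"], features["low_level_call"], risk_score(features))  # 1 1 4
```

Text matching like this also counts patterns inside comments and strings, so it is only suitable for triage; a parser-based approach would be needed for anything resembling an audit.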

Suggested Flagship Repositories

If you’re just browsing and want a quick sense of my work, start here:


“AI is not just about models; it’s about systems that solve real problems for real people.”

Pinned

  1. Image-Captioning-CNN-LSTM (Public)

    An end-to-end image captioning project using a CNN encoder (ResNet-50) and LSTM decoder in PyTorch. Includes vocabulary building, preprocessing, training with BLEU evaluation, and inference. Genera…

    Python 33

  2. Coffee-Shop-Profit-Predictor (Public)

    Predict the profitability of potential coffee shop locations using SQL and Python. Combines data engineering with feature-rich regression modeling, visual analytics, and business insights to suppor…

    Python 32

  3. Fake-News-Detector (Public)

    A complete NLP and Machine Learning project to detect fake and real news using TF-IDF and Logistic Regression. Includes full training pipeline, evaluation charts, and an interactive Streamlit web a…

    Python 33 2

  4. Stock-LSTM-Forecasting (Public)

    Predict stock prices using LSTM networks in PyTorch. This project covers data preprocessing, sliding window creation, model training with early stopping, and evaluation with RMSE/MAE/MAPE. Includes…

    Python 28

  5. Sentiment-Analysis-BERT (Public)

    End-to-end sentiment analysis of tweets using BERT. Includes preprocessing, training, and evaluation with classification reports, confusion matrices, ROC curves, and word clouds. Demonstrates fine-…

    Python 25

  6. Market-Basket-Analysis (Public)

    Python project for Market Basket Analysis. Generates synthetic retail transactions, mines frequent itemsets using Apriori & FP-Growth, derives association rules, and outputs CSVs + visualizations. …

    Python 25