Hi, I’m Matt Najarian 👋
I’m a data scientist specializing in machine learning, statistics, optimization, natural language processing (including RAG and NER), and full-cycle model deployment. Over the past few years, I’ve developed and deployed predictive models with Python, SQL, Apache Spark, and cloud platforms such as Azure Machine Learning, Snowflake, and Databricks. I enjoy collaborating with cross-functional teams—ranging from data engineers to business leaders—to transform complex problems into practical, high-impact solutions.
In this repo, I share some of my experience in the following areas:
Section1: AI
- Timeseries Forecasting
- SARIMAX, Linear Regression,
- XGBoost, LightGBM, CatBoost,
- Facebook Prophet,
- and pytorch-forecasting
-
- recommendation system (ALS)
- RDD, DataFrame, and MLlib
-
- customer segmentation
- customer lifetime value
- media mix model
- causal inference
- LLM, RAG, Chatbot
- PyTorch, TensorFlow, and Keras
Section2: Optimization
- optimization, which contains my optimization model implementations in Python, Java, and C++. It includes the following projects:
- Security-Constrained Unit Commitment: Unit commitment is the process of turning on (committing) resources to meet load and other market requirements. Security-Constrained Unit Commitment (SCUC) commits units (electricity generators) while respecting the limitations of the transmission system and of the units themselves. I have coded it in Java and Python.
- Maximizing Infrastructure Resiliency Under Budgetary Constraint: When investing in resiliency, it is crucial to distribute the budget among different resources in a way that maximizes the effect. See my paper at the following link: https://www.sciencedirect.com/science/article/abs/pii/S0951832019308336
- Component Importance: During recovery from a disaster, some components play a more important role than others. This is ongoing research of mine to identify those components. The code includes visualization (Cytoscape) and random graph generation.
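
To make the unit commitment idea above concrete, here is a minimal Python sketch. It is not the code from this repo: it covers a single hour, ignores the transmission (security) constraints and minimum up/down times that a real SCUC model includes, and simply enumerates every on/off combination. The generator data and the function names `dispatch` and `commit` are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical generators (illustrative numbers only):
# (name, p_min MW, p_max MW, fixed cost $/h when on, marginal cost $/MWh)
UNITS = [
    ("coal", 50, 200, 500.0, 20.0),
    ("gas", 20, 100, 200.0, 35.0),
    ("peaker", 10, 50, 50.0, 60.0),
]


def dispatch(committed, load):
    """Return the cheapest one-hour cost of serving `load` with the
    committed units, or None if they cannot serve it."""
    p_min_total = sum(u[1] for u in committed)
    p_max_total = sum(u[2] for u in committed)
    if not (p_min_total <= load <= p_max_total):
        return None
    # Every committed unit pays its fixed cost and runs at least at p_min.
    cost = sum(u[3] + u[4] * u[1] for u in committed)
    remaining = load - p_min_total
    # Fill the remaining load in merit order (cheapest marginal cost first).
    for u in sorted(committed, key=lambda unit: unit[4]):
        extra = min(remaining, u[2] - u[1])
        cost += u[4] * extra
        remaining -= extra
    return cost


def commit(load):
    """Brute-force search over all on/off combinations.
    Returns (cost, names of committed units), or None if infeasible."""
    best = None
    for flags in product([0, 1], repeat=len(UNITS)):
        committed = [u for u, on in zip(UNITS, flags) if on]
        cost = dispatch(committed, load)
        if cost is not None and (best is None or cost < best[0]):
            best = (cost, [u[0] for u in committed])
    return best


if __name__ == "__main__":
    for load in (150, 250, 400):
        print(load, commit(load))
```

Enumeration only works for a handful of units. Realistic instances are usually formulated as mixed-integer programs and handed to a solver, with network limits added as extra constraints.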
-
- SQL (PostgreSQL 14.0 and pgAdmin 4)
- NoSQL (Neo4j)
- Business Intelligence (Apache Superset, Tableau)
One of my hobbies is managing my personal computing cluster. It consists of three standard PCs and one high-end PC, on which I have installed several applications to support my projects and experiments.
I built my high-end PC using parts sourced from MicroCenter and a used Nvidia RTX 3090 Ti Founders Edition (24 GB) that I purchased for $800. The system is powered by a Ryzen 7 CPU and equipped with 32 GB of RAM (F5-6000J3238F16G).
The following run on this cluster:
- Hadoop and Apache Spark
- Ollama
- PostgreSQL 14.0 and pgAdmin 4
- Apache Airflow
Here is a list of useful links that Steve Nouri has shared on his Twitter account, plus two other links that I added. I put them here for ease of access, but please also read them on Twitter.

