Arjun-M-101/Arjun-M-101

Hello, I'm Arjun

Aspiring Data Engineer


๐Ÿ™‹โ€โ™‚๏ธ About Me

  • ๐Ÿ‘จโ€๐Ÿ’ป Iโ€™m currently working as a Database Administrator, building strong foundations in data management and reliability
  • ๐ŸŒฑ Transitioning into Data Engineering by designing endโ€‘toโ€‘end batch and streaming pipelines
  • ๐Ÿ› ๏ธ Passionate about building scalable, reliable data pipelines that turn raw data into actionable insights
  • ๐Ÿ‘ฏ Open to collaborating on Data Engineering & Open Source projects
  • ๐Ÿ‘จโ€๐Ÿ’ป Explore my work here: My Portfolio
  • ๐Ÿ“ซ Reach me at [email protected]
  • โšก Fun fact: I debug pipelines the way I play games โ€” with persistence and strategy

๐Ÿ› ๏ธ Tech Stack

๐Ÿ”น Languages

Python

๐Ÿ”น Data Engineering & Analytics

Pandas Spark Kafka Airflow Streamlit

๐Ÿ”น Databases

Postgres MySQL

๐Ÿ”น DevOps & Cloud

Linux Git Docker AWS

๐Ÿ”น Web Basics (optional)

HTML CSS JavaScript Bootstrap

๐Ÿ”น Tools

Postman


📂 Featured Projects

  • 🗄️ YouTube Data Engineering Pipeline (Batch Processing)
    End-to-end batch ETL pipeline implementing the Medallion Architecture (Bronze → Silver → Gold).

    • Orchestrated with Apache Airflow (3.x)
    • Transformations with Apache Spark
    • Data lake layers on local filesystem (Bronze/Silver/Gold)
    • Serving layer in Postgres (analytics-ready tables)
    • Interactive Streamlit + Altair dashboard via SQLAlchemy
    • Ingests raw YouTube trending data (CSV/JSON), cleans, enriches, and computes derived metrics for BI
  • 📊 StockPulse (Streaming Pipeline)
    Real-time streaming pipeline simulating stock ticks and processing them end-to-end.

    • Ingestion via Kafka producer publishing to stock_ticks topic
    • Processing with Spark Structured Streaming (schema enforcement + derived metrics)
    • Dual sinks: Postgres (serving layer) + Parquet (partitioned by index/date)
    • Interactive Streamlit + Altair dashboard for real-time visualization
    • Fully orchestrated with Apache Airflow
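
For readers new to the Medallion Architecture used in the YouTube pipeline, here is a minimal sketch of the Bronze → Silver → Gold flow. It is a pure-Python stand-in, not the project's Spark code: the column names (`video_id`, `category`, `views`, `likes`), the sample rows, and the `like_rate` metric are all assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical raw input; the columns and values are illustrative only.
RAW_CSV = """video_id,category,views,likes
a1,Music,1000,100
a2,Music,,50
a3,Gaming,500,25
a1,Music,1000,100
"""


def to_bronze(raw_text):
    """Bronze: land the raw rows unchanged, as dictionaries of strings."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def to_silver(bronze_rows):
    """Silver: drop incomplete rows, cast numeric columns, de-duplicate by video_id."""
    seen = set()
    silver = []
    for row in bronze_rows:
        if not all(row.values()):
            continue  # skip records with any empty field
        if row["video_id"] in seen:
            continue  # skip duplicates of an already-seen video
        seen.add(row["video_id"])
        silver.append({
            "video_id": row["video_id"],
            "category": row["category"],
            "views": int(row["views"]),
            "likes": int(row["likes"]),
        })
    return silver


def to_gold(silver_rows):
    """Gold: aggregate per category and add a derived like rate (likes / views)."""
    totals = defaultdict(lambda: {"views": 0, "likes": 0})
    for row in silver_rows:
        totals[row["category"]]["views"] += row["views"]
        totals[row["category"]]["likes"] += row["likes"]
    return {
        category: {**t, "like_rate": t["likes"] / t["views"] if t["views"] else 0.0}
        for category, t in totals.items()
    }


gold = to_gold(to_silver(to_bronze(RAW_CSV)))
```

In the pipeline described above, each of these steps would presumably be a Spark job scheduled by Airflow, with the Gold output loaded into Postgres; the sketch only shows how responsibilities divide across the three layers.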

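The StockPulse flow (simulate ticks, enforce a schema, compute derived metrics) can be sketched the same way. This is a standard-library approximation only: a generator stands in for the Kafka producer and a plain loop for Spark Structured Streaming. The ticker symbols, the field names (`symbol`, `price`, `ts`), and the `pct_change` metric are hypothetical, not taken from the project.

```python
import random
from datetime import datetime, timedelta, timezone

SYMBOLS = ("AAA", "BBB")  # hypothetical ticker symbols
SCHEMA = {"symbol": str, "price": float, "ts": str}  # assumed tick schema


def simulate_ticks(n, seed=0):
    """Yield n simulated ticks; stands in for a producer writing to stock_ticks."""
    rng = random.Random(seed)
    start = datetime(2024, 1, 1, tzinfo=timezone.utc)
    prices = {symbol: 100.0 for symbol in SYMBOLS}
    for i in range(n):
        symbol = SYMBOLS[i % len(SYMBOLS)]
        # Random walk: move the price by up to +/-1% per tick.
        prices[symbol] = round(prices[symbol] * (1 + rng.uniform(-0.01, 0.01)), 2)
        yield {
            "symbol": symbol,
            "price": prices[symbol],
            "ts": (start + timedelta(seconds=i)).isoformat(),
        }


def enforce_schema(tick):
    """Return True only if the tick has exactly the expected fields and types."""
    return tick.keys() == SCHEMA.keys() and all(
        isinstance(tick[field], kind) for field, kind in SCHEMA.items()
    )


def process(ticks):
    """Drop malformed ticks and add pct_change versus the symbol's previous price."""
    last_price = {}
    for tick in ticks:
        if not enforce_schema(tick):
            continue  # schema enforcement: discard records that do not conform
        previous = last_price.get(tick["symbol"])
        pct_change = 0.0 if previous is None else (tick["price"] - previous) / previous * 100
        last_price[tick["symbol"]] = tick["price"]
        yield {**tick, "pct_change": pct_change}


processed = list(process(simulate_ticks(6)))
```

Each record in `processed` is what a sink would receive; in the real pipeline the same records would go to both Postgres and partitioned Parquet files.
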
📜 Certifications

AWS Certified Cloud Practitioner Badge

📊 My GitHub Stats

Arjun's streak

Arjun M's Github Stats Arjun M's Top Languages

Note: Top languages is only a metric of the languages my public code consists of and doesn't reflect experience or skill level.


Arjun's Graph

๐ŸŒ Connect with Me


โค Views and Followers

GitHub Badge
