About Me
A full-time student and part-time chef (only in my own kitchen), who loves hosting dinners that somehow end up with everyone overstuffed and overly impressed. Born into a family where music flows through our veins, I grew up surrounded by melodies that shaped my path—from family jam sessions to my current role at Bose, where I merge my passion for sound with cutting-edge technology.
When I’m not juggling assignments or diving into data science projects in the media industry, you’ll find me soaking up the sun, hitting the beach, or attempting to pet every cat and dog I spot on the street. A globetrotter at heart, I once managed to visit four countries in a single month—I still claim it was for “research purposes.”
And Sundays? Those are sacred. They’re reserved for my unique brand of therapy: grocery shopping at Trader Joe’s followed by intense gaming sessions, all while curating the perfect soundtrack, of course. Life’s all about finding harmony in every sense of the word, right?
Education
- Northeastern University, Boston, MA
Master of Science in Data Science (Graduated May 2025)
Graduate Teaching Assistant, Khoury College of Computer Sciences
- Fall 2023: TA for DS 2000 Programming with Data, assisting 700+ students with foundational programming concepts, debugging, and solving real-world data challenges using Python.
- Fall 2024: TA for DS 3000 Advanced Programming with Data, guiding students through advanced topics such as APIs, data visualization, and efficient data manipulation.
- Mentored students during office hours and project consultations, strengthening their problem-solving skills and my own ability to explain complex concepts clearly.
- Collaborated with faculty on assignment design and grading, sharpening my organizational and time-management skills.
Relevant Courses: LLM Agents, Supervised & Unsupervised Machine Learning, Deep Learning, NLP, Database Systems, Data Processing & Visualization, Algorithms
- Vellore Institute of Technology, Vellore, India
Bachelor of Technology in Computer Science and Engineering (June 2016 – June 2020)
Relevant Courses: Probability Theory, Linear Algebra, Statistics, Calculus, Differential Equations, AI, Data Mining
Work Experience
Bose Corporation (Boston, MA)
Data Scientist NLP Intern (January 2024 – August 2024)
Key Focus Areas:
LLM Optimization (GPT-4, Llama 3.1), RAG Systems (AWS Kendra), NLP Pipelines (BERTopic, RoBERTa), Streamlit Development
- End-to-End ML Pipeline: Implemented unsupervised topic clustering to identify 120 key return drivers from customer feedback. Trained and deployed a RoBERTa model on AWS for multi-label classification (F1 score 0.92), integrating Spark for data preprocessing, model inference, and result cleaning.
- AI Interview Automation: Engineered an OpenAI-powered assistant with a Streamlit-Snowflake backend, automating 200+ daily interviews, reducing candidate dropout by 30%, and saving 50+ hours per week.
- Enterprise Chatbot: Deployed an AWS Kendra RAG system handling 2K+ daily interactions at 99.8% uptime, validated through red team testing and guardrail implementation.
- Cost-Optimized LLM Ops: Transitioned sentiment analysis from GPT-4 to Llama 3.1 via A/B testing, achieving a 97% cost reduction while maintaining performance on 2M+ records.
- GenAI Analytics: Automated product review analysis using Claude Sonnet on AWS Bedrock, enabling 65% faster insights with 98% accuracy, delivered via a Streamlit dashboard.
West Pharmaceutical Services (Bangalore, IN)
Data Scientist (November 2020 – December 2022)
Key Focus Areas:
Computer Vision (XceptionNet), Cloud Deployment (Azure), Production ML, Data Visualization
- Defect Detection: Architected XceptionNet model identifying 15 defect classes (92% accuracy, 37% improvement vs manual), deployed via Docker/Azure
- Production Analytics: Built Power BI dashboards enabling 43% faster issue identification through visual trend analysis
- Document Intelligence: Implemented BERT/ALBERT/RoBERTa pipelines improving document-keyword mapping by 23%
Associate Data Scientist (January 2020 – November 2020)
Key Focus Areas:
ETL Pipelines (Azure Data Factory), NLP (BERT/ALBERT), Data Storytelling
- Enterprise Data Integration: Designed Azure Data Factory pipelines ingesting/transforming documents from 10+ sources, reducing data prep time by 35%
- Document Intelligence: Implemented transformer-based NLP workflows improving document-keyword mapping by 23%, presenting insights through weekly stakeholder dashboards
- Process Optimization: Migrated legacy manual workflows to automated ETL/NLP pipelines, collaborating with 5+ factory teams to ensure operational alignment
Info Origin (Bangalore, IN)
Data Scientist Intern (May 2018 – August 2018)
Key Focus Areas:
Time Series Forecasting, ETL Development, Data Visualization
- Workplace Analytics: Built an ETL pipeline for conference room data covering 10K+ employees, enabling ARIMA forecasts (R² = 0.90) with Tableau dashboards tracking utilization trends
- Stakeholder Communication: Presented forecasting insights through weekly visual reports, helping the facilities team optimize room allocation and reduce scheduling conflicts by 28%
- Process Documentation: Created technical manuals for ARIMA model deployment, enabling knowledge transfer to the operations team
Projects
Visual Mood-Based Music Recommendation (In progress)
- Leveraged LLaVA for vision-language understanding, retrieved mood-aligned songs via RAG with contrastive learning on audio embeddings, and integrated personalized Spotify playlist filtering.
- Tools: LLaVA, RAG, Contrastive Learning, Spotify API, Streamlit
Automating Job Applications (October 2024)
- Built an LLM agent to scrape job descriptions, match resume embeddings, and generate tailored resumes/cover letters.
- Tools: OpenAI Embedding API, Groq Llama 3.1 API, Streamlit
- Developed a local LLM-powered agent to summarize and classify Reddit football data into short audio/text insights.
- Tools: Reddit API, LangChain, Qwen 2.5, Llama 3.1, PostgreSQL
ArguSense: Argument Essay Evaluation (September 2023)
- Developed an NLP pipeline using Longformer and BERT to classify argument structures in essays. Integrated MLflow on AWS for scalable model tracking, versioning, and containerized deployment.
- Tools: Streamlit, HuggingFace, Named Entity Recognition, Longformer, BERT, MLflow, AWS, Git LFS
- Deployed a Streamlit app for interactive analysis and visualization of 500+ football matches, integrating an ensemble model with Markov Chains, XGBoost, and Logistic Regression to predict the “Expected Threat” metric.
- Tools: Streamlit, Python, StatsBomb API, OpenAI API, PostgreSQL
Achievements
- Most Impactful Business Project Award - Bose Hackathon: Developed a semantic search retrieval system for 1M+ records.
Technical Skills
- Programming: Python, R, SQL, C#
- Data Science: Machine Learning, NLP, Deep Learning, Feature Engineering
- Libraries: NumPy, Pandas, Scikit-learn, TensorFlow, PyTorch, LangChain, HuggingFace
- Platforms: AWS, Microsoft Azure, Snowflake, Databricks
- Other: Docker, Kubernetes, LLMOps, Streamlit