Bob Hosseini bab-git

Bob Hosseini

Senior Data Scientist | GenAI & ML Systems | Team Lead
*Designing AI systems that scale and teams that ship.*

🔧 What I Do

Lead development of GenAI systems using LLMs, RAG, and agentic pipelines
Architect ML solutions from experimentation to scalable production
Drive data product strategy and cross-functional execution in e-commerce
Mentor data teams and implement best practices for ML/AI delivery
Translate complex business problems into real-world AI products

🧠 Generative AI Projects

Two-Stage RAG for Document QA 🟢 Production Ready - Backend + Frontend (App)
A scalable Retrieval-Augmented Generation (RAG) pipeline built on two-stage retrieval: keyword search followed by semantic search.
This approach improves precision and lowers computational cost, reducing retrieval overhead by more than 75% for enterprise-scale QA.
Read More: Two-Stage Consecutive RAG for Document QA on Medium
▶️ Access the Application (if in sleep mode, click the "Wake Up" button)
Tech: RAG, Sentence Transformers, Cross-Encoder Reranker, LangChain, ChromaDB, Docker, Poetry, Streamlit
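The two-stage idea described above can be sketched in a few lines. This is a minimal illustration, not the project's implementation: the function names (`keyword_filter`, `embed`, `two_stage_retrieve`) are hypothetical, and a toy term-count vector stands in for the Sentence Transformers embeddings and cross-encoder reranker listed in the tech stack. The point it shows is why a cheap keyword pass reduces work: only the surviving candidates are scored semantically.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_filter(query, docs, min_overlap=1):
    """Stage 1: keep documents that share at least `min_overlap` terms with the query."""
    query_terms = set(tokenize(query))
    return [d for d in docs if len(query_terms & set(tokenize(d))) >= min_overlap]


def embed(text):
    """Toy stand-in for an embedding model (assumption): a term-count vector."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def two_stage_retrieve(query, docs, top_k=2):
    """Stage 2: rank only the keyword-filtered candidates by vector similarity."""
    candidates = keyword_filter(query, docs)
    query_vec = embed(query)
    ranked = sorted(candidates, key=lambda d: cosine(query_vec, embed(d)), reverse=True)
    return ranked[:top_k]


docs = [
    "Refund policy: refunds are issued within 14 days of purchase.",
    "Shipping times vary by region and carrier.",
    "To request a refund, contact support with your order number.",
    "Our office is closed on public holidays.",
]
results = two_stage_retrieve("how do I request a refund", docs, top_k=2)
for doc in results:
    print(doc)
```

In this toy corpus the keyword pass discards half of the documents before any similarity is computed; with a real embedding model, that is where the savings would come from.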
LLM Agents for Clinical Trials 🔵 Development Notebooks
Agentic LLM pipeline automating clinical trial eligibility and patient matching.
Includes data analysis, compliance verification, hallucination grading, and human-in-the-loop workflows.
Tech: LangGraph, OpenAI, Agentic, Tool-calling, Pydantic, Gradio
LLM Tutorials & Applications 🔵 Development Notebooks - Educational
A collection of practical LLM architectures and end-to-end notebooks, with carefully selected case studies across domains such as healthcare, customer support, and product search.
Includes RAG, tool-using agents, clinical trial retrieval, chatbot workflows, and document QA with real-world data sources.
Tech: OpenAI, RAG, LangChain, ChromaDB, Pinecone, Streamlit

📈 Machine Learning & Data Science Projects

Social Sphere: Student Social-Media Analytics 🟡 Ongoing
As the team lead for this SuperDataScience community project, I am spearheading efforts to predict relationship conflicts and self-reported addiction levels from digital behavior, segment students into behavior-based clusters, and visualize trends in usage intensity, platform preference, and mental well-being.
🗂️ Data Source: A Kaggle dataset of nearly 700 students aged 16–25, spanning high school to graduate programs across multiple countries, collected in Q1 2025.
🖥️ MLflow Dashboard on Dagshub
Tech: Python, Scikit-Learn, XGBoost, Regression, Clustering, Data Visualization, MLflow, SHAP
Energy Forecasting with SARIMAX 🟢 Production Ready - Backend + Frontend
Community project with SuperDataScience to forecast building energy consumption using a synthetic Kaggle dataset.
Built a SARIMAX pipeline with uncertainty-injected exogenous inputs (random walks) and time-series CV.
App delivers EDA, forecast accuracy, and feature relevance visualizations.
▶️ Live App (if in sleep mode, click the "Wake Up" button)
Tech: Python, pandas, statsmodels, SARIMAX, Streamlit
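The two ideas named above, random-walk uncertainty on exogenous inputs and time-series cross-validation, can be sketched without any modelling library. This is an illustrative sketch under assumptions, not the project's code: `inject_random_walk` and `expanding_window_splits` are hypothetical helpers, the temperature series is made up, and the SARIMAX fit itself (done with statsmodels in the project) is left out. It assumes the random walk is added to a known exogenous series to mimic forecast-time uncertainty, and that CV uses an expanding training window.

```python
import random


def inject_random_walk(series, step_std=0.1, seed=0):
    """Add cumulative Gaussian noise (a random walk) to an exogenous series."""
    rng = random.Random(seed)
    drift = 0.0
    noisy = []
    for value in series:
        drift += rng.gauss(0.0, step_std)  # the error accumulates over time
        noisy.append(value + drift)
    return noisy


def expanding_window_splits(n_samples, n_splits, horizon):
    """Return (train_indices, test_indices) pairs where training always precedes testing."""
    splits = []
    for i in range(n_splits):
        train_end = n_samples - (n_splits - i) * horizon
        if train_end <= 0:
            raise ValueError("not enough samples for the requested splits")
        train_idx = list(range(train_end))
        test_idx = list(range(train_end, train_end + horizon))
        splits.append((train_idx, test_idx))
    return splits


# Hypothetical exogenous input: 12 time steps of outdoor temperature.
temperature = [20.0 + 0.5 * t for t in range(12)]
noisy_temperature = inject_random_walk(temperature, step_std=0.2, seed=42)

splits = expanding_window_splits(len(temperature), n_splits=3, horizon=2)
for train_idx, test_idx in splits:
    print(f"train: 0..{train_idx[-1]}  test: {test_idx}")
```

Each fold would fit the model on the training indices and forecast the test indices using the perturbed exogenous values, so the reported accuracy reflects imperfect inputs rather than perfect foresight.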
Data Science & ML Mini Tasks 🔵 Development Notebooks
Single-notebook projects showcasing applied machine learning and data science, including:
- Ad Response Prediction,
- Material Strength Prediction,
- Recipe Recommender,
- Customer Satisfaction Classification,
- Hotel Staff Size Prediction.
Tech: Python, scikit-learn, XGBoost, Streamlit, pandas, matplotlib

✍️ Writing

Guardrails in LLM Apps – Strategies for implementing ethical safeguards, ensuring compliance, and enhancing security in Large Language Model applications.
LLM Model Selection and Updates – Guidelines for selecting appropriate Large Language Models and managing their updates to balance quality, cost, and scalability in AI applications.
Two-Stage RAG for Document QA – An innovative approach to document-based question answering using a two-stage retrieval strategy to enhance precision and scalability in Retrieval-Augmented Generation systems.
Data Engineers: The Unsung Heroes Behind AI – An exploration of the pivotal role data engineers play in AI development, emphasizing their contributions to data quality, infrastructure, and the overall success of data science teams.

💬 Let’s Connect

Feel free to reach out for collaboration, leadership opportunities, or just to swap ideas on building better GenAI systems.

📫 Email • LinkedIn

“Build AI that works — and teams that last.”
