bab-git/README.md

Bob Hosseini

Senior Data Scientist | GenAI & ML Systems | Team Lead
Designing AI systems that scale and teams that ship.


🔧 What I Do

  • Lead development of GenAI systems using LLMs, RAG, and agentic pipelines
  • Architect ML solutions from experimentation to scalable production
  • Drive data product strategy and cross-functional execution in e-commerce
  • Mentor data teams and implement best practices for ML/AI delivery
  • Translate complex business problems into real-world AI products

🚀 Featured Projects

  • LLM Agents for Clinical Trials
    Agentic LLM pipeline for automating trial eligibility checks and patient-trial matching.
    Integrates data analysis, compliance verification, and hallucination grading with human-in-the-loop workflows.
    Tech: LangGraph, OpenAI, Pydantic, Gradio

  • Two-Stage RAG for Document QA
    Scalable RAG pipeline using two-stage retrieval (keyword + semantic search) that boosts precision and cuts compute cost. Achieved >75% reduction in retrieval overhead for enterprise-scale QA.
    Tech: LangChain, OpenAI, ChromaDB, Streamlit

  • Data Science & ML Mini Tasks
    A curated set of focused, single-notebook projects that demonstrate applied ML and data science problem-solving.
    Topics include:

    • Ad Response Prediction
    • Predictive Modeling for Manufacturing Material Strength
    • Recipe Recommender, System Design
    • Customer Satisfaction Classifier

    Tech: Python, scikit-learn, XGBoost, Streamlit, pandas, matplotlib

  • LLM Tutorials & Applications
    A collection of practical LLM architectures and end-to-end notebooks featuring carefully selected case studies across domains like healthcare, customer support, and product search.
    Includes RAG, tool-using agents, clinical trial retrieval, chatbot workflows, and document QA with real-world data sources.
    Tech: LangChain, OpenAI, RAG, ChromaDB, Pinecone, Streamlit
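
As a rough illustration of the two-stage retrieval idea behind the Document QA project above, here is a minimal, self-contained sketch. It is not the project's implementation: it assumes the keyword stage runs first, uses plain token overlap for that stage, and substitutes a bag-of-words cosine similarity for a real embedding model. The names `two_stage_retrieve`, `toy_embed`, `shortlist_size`, and `top_k` are hypothetical and chosen only for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_score(query, doc):
    """Stage 1 score: count of distinct query tokens present in the document."""
    return len(set(tokenize(query)) & set(tokenize(doc)))


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def two_stage_retrieve(query, docs, shortlist_size=3, top_k=1):
    """Keyword-filter the corpus, then rank only the shortlist by similarity."""
    # Stage 1 (assumed to run first): a cheap keyword filter builds a shortlist.
    shortlist = sorted(
        docs, key=lambda d: keyword_score(query, d), reverse=True
    )[:shortlist_size]
    # Stage 2: the costlier similarity scoring touches the shortlist only.
    q_vec = toy_embed(query)
    ranked = sorted(
        shortlist, key=lambda d: cosine(q_vec, toy_embed(d)), reverse=True
    )
    return ranked[:top_k]
```

Calling `two_stage_retrieve(query, docs, shortlist_size=2)` scores every document with the cheap keyword function but runs the similarity step on two documents only. Under these assumptions, that is where a saving in retrieval overhead would come from.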


✍️ Writing

  • Guardrails in LLM Apps: Strategies for implementing ethical safeguards, ensuring compliance, and enhancing security in Large Language Model applications.
  • LLM Model Selection and Updates: Guidelines for selecting appropriate Large Language Models and managing their updates to balance quality, cost, and scalability in AI applications.
  • Two-Stage RAG for Document QA: An approach to document-based question answering that uses a two-stage retrieval strategy to improve precision and scalability in Retrieval-Augmented Generation systems.
  • Data Engineers: The Unsung Heroes Behind AI: An exploration of the pivotal role data engineers play in AI development, emphasizing their contributions to data quality, infrastructure, and the overall success of data science teams.

💬 Let’s Connect

Feel free to reach out for collaboration, leadership opportunities, or just to swap ideas on building better GenAI systems.

📫 Email | LinkedIn


“Build AI that works — and teams that last.”

Pinned

  1. NNKSC (Public)

    Non-negative Kernel Sparse Coding algorithm for semantic dictionary learning in feature space.

    MATLAB 6 4

  2. llm-tutorials (Public)

    A collection of applications that can be used with Large Language Models (LLMs).

    Jupyter Notebook 6

  3. data-science-and-ml-mini-projects (Public)

    This repository represents a range of data analysis and machine learning exercises, typically completed within a single Jupyter notebook. Some of these may be derived from interview challenges I've …

    Jupyter Notebook

  4. two-stage-conrag (Public)

    A RAG pipeline that optimizes both precision and scalability by employing a sequential retrieval strategy that leverages the strengths of both keyword-based and semantic search while minimizing com…

    Python 1

  5. llm_pharma (Public)

    This is a tutorial of an agentic Large Language Model (LLM) application to automate the evaluation of patients for clinical trials. It leverages documents related to patients' medical histories, cl…

    Jupyter Notebook

  6. NQP (Public)

    An optimization algorithm for minimizing non-negative quadratic problems under a cardinality constraint.

    Python 3