Equip data scientists and ML engineers with hands-on skills to design, build, and deploy autonomous LLM-driven agents in Jupyter Notebooks that can handle the full model development cycle—from data ingestion through modeling to summarization.
Module 1 – Agentic AI Foundations
• Understand autonomy, orchestration, and planner–executor logic
• Introduce agent stacks: LangChain, LangGraph, CrewAI
• Build a simple reactive agent with tools and memory
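A minimal sketch of the Module 1 hands-on: a plan–act–observe loop over a tool registry with short-term memory. The llm() function here is a stub standing in for a real model call (LangChain, OpenAI, or similar); the calculator tool and decision format are illustrative.

```python
# Minimal reactive agent loop (plan -> act -> observe) with a tool registry
# and short-term memory. llm() is a stub in place of a real model call.
from typing import Callable

def calculator(expression: str) -> str:
    """Toy tool: evaluate a simple arithmetic expression (demo only)."""
    return str(eval(expression, {"__builtins__": {}}))

TOOLS: dict[str, Callable[[str], str]] = {"calculator": calculator}

def llm(prompt: str) -> str:
    """Stub planner: a real agent would call an LLM here.
    Returns either 'TOOL:<name>:<input>' or 'FINAL:<answer>'."""
    if "result" not in prompt:
        return "TOOL:calculator:2 + 2 * 10"
    return "FINAL:The computed value is in memory."

def run_agent(task: str, max_steps: int = 5) -> str:
    memory: list[str] = [f"task: {task}"]             # short-term memory
    for _ in range(max_steps):
        decision = llm("\n".join(memory))             # plan
        if decision.startswith("FINAL:"):
            return decision.removeprefix("FINAL:")
        _, name, tool_input = decision.split(":", 2)  # act
        observation = TOOLS[name](tool_input)
        memory.append(f"result of {name}({tool_input}): {observation}")  # observe
    return "step budget exhausted"

print(run_agent("What is 2 + 2 * 10?"))
```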
Module 2 – Jupyter-Centric Agent Integration
• Use tooling such as Jupyter Agent or LangChain callbacks to programmatically control notebooks
• Agents write, execute, and summarize notebook cells
• Prepare a scaffold for end-to-end notebook control (see the sketch below)
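One possible shape for the Module 2 scaffold, assuming nbformat, nbclient, ipykernel, and pandas are installed: agent-generated snippets are appended as code cells, executed in order, and written back to disk. The filename and snippets are illustrative.

```python
# Sketch: an agent appends generated code as notebook cells and executes
# them with nbclient.
import nbformat
from nbclient import NotebookClient

def agent_generated_code() -> list[str]:
    # In the course, these snippets would come from the LLM planner.
    return [
        "import pandas as pd",
        "df = pd.DataFrame({'x': [1, 2, 3]})",
        "df.describe()",
    ]

nb = nbformat.v4.new_notebook()
for snippet in agent_generated_code():
    nb.cells.append(nbformat.v4.new_code_cell(snippet))

NotebookClient(nb, timeout=60).execute()      # run all cells in order
nbformat.write(nb, "agent_scaffold.ipynb")    # persist the built notebook
print(nb.cells[-1].outputs)                   # inspect the last cell's output
```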
Module 3 – Automated Data Ingestion & Profiling
• Agents infer schema, detect types, and perform data cleaning
• Validate input assumptions
• Hands-on: A Jupyter notebook built incrementally by an agent
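A sketch of the profiling step an agent might run before proposing cleaning actions, using pandas only; the columns and the follow-up prompt are illustrative.

```python
# Collect the facts an agent needs in order to plan cleaning steps.
import pandas as pd

def profile(df: pd.DataFrame) -> dict:
    return {
        "shape": df.shape,
        "dtypes": df.dtypes.astype(str).to_dict(),
        "missing": df.isna().sum().to_dict(),
        "n_duplicates": int(df.duplicated().sum()),
    }

df = pd.DataFrame({"age": [25, None, 40], "city": ["NY", "NY", None]})
report = profile(df)
print(report)

# A planner prompt could then be built from the report, e.g.:
# "Given this profile: {report}, propose cleaning steps as Python code."
```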
Module 4 – Agentic EDA with Visualization-to-Action Feedback
• Agents create EDA charts and feed insight back into planning
• Use visual outputs (e.g., correlation heatmaps, trend lines) to influence downstream logic
• Hands-on: EDA + text summary of analysis + decision log from the agent
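One way to close the visualization-to-action loop, sketched with pandas and numpy: strong correlations are translated into text insights and a decision log that the planner can consume. The synthetic columns are illustrative.

```python
# Turn a correlation matrix into text insights and a decision log that the
# agent feeds back into its next planning prompt.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame({"price": rng.normal(size=200)})
df["size_sqft"] = df["price"] * 0.9 + rng.normal(scale=0.3, size=200)
df["noise"] = rng.normal(size=200)

corr = df.corr(numeric_only=True)
decision_log = []
for a in corr.columns:
    for b in corr.columns:
        if a < b and abs(corr.loc[a, b]) > 0.8:
            decision_log.append(
                f"{a} and {b} are highly correlated ({corr.loc[a, b]:.2f}); "
                "consider dropping one or combining them."
            )
print("\n".join(decision_log))  # fed into the next planning prompt
```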
Module 5 – Visual-Driven Feature Engineering
• Use LLM vision (e.g., GPT-4o) to detect patterns/anomalies in charts
• Build features based on visual insights
• Discuss agent decision-making and verification
• Hands-on: Dataset with newly engineered features
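A hedged sketch of the vision step, assuming the openai>=1.x SDK, an OPENAI_API_KEY in the environment, and a chart saved earlier by the agent; the model name and the sales_trend.png filename are illustrative.

```python
# Send a saved chart to a vision-capable model and ask for patterns worth
# turning into engineered features.
import base64
from openai import OpenAI

client = OpenAI()

with open("sales_trend.png", "rb") as f:   # chart produced earlier (hypothetical file)
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "List any seasonality, outliers, or regime changes in "
                     "this chart, and suggest one engineered feature for each."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)  # insights to verify, then implement
```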
Module 6 – Agent-Led Model Training & Evaluation
• Select, train, and evaluate models using agents
• Compare model performance using the agent's reasoning
• Hands-on: Model training, hyperparameter optimization, validation report
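A sketch of the training and evaluation step the agent would drive, using scikit-learn: two candidate models, cross-validated hyperparameter search, and a validation report on a held-out split. The candidates and grids are illustrative.

```python
# Compare two candidate models with cross-validated search and report the winner.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.metrics import classification_report

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

candidates = {
    "logreg": (LogisticRegression(max_iter=5000), {"C": [0.1, 1.0, 10.0]}),
    "rf": (RandomForestClassifier(random_state=42), {"n_estimators": [100, 300]}),
}

results = {}
for name, (model, grid) in candidates.items():
    search = GridSearchCV(model, grid, cv=5, scoring="f1")
    search.fit(X_train, y_train)
    results[name] = search                       # keep the fitted search

best_name = max(results, key=lambda n: results[n].best_score_)
best = results[best_name].best_estimator_
print(f"selected {best_name} (cv f1={results[best_name].best_score_:.3f})")
print(classification_report(y_test, best.predict(X_test)))  # validation report
```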
Module 7 – LLM API Nuances
• Compare OpenAI, Claude, and Perplexity APIs in terms of latency, context window, cost, and function/tool calling
• Analyze suitability for different agent orchestration tasks
• Hands-on: Implement the same task with different LLM APIs and compare response structure, latency, and token cost
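A hedged sketch of the comparison, assuming the openai>=1.x and anthropic SDKs with their API keys set; both model names are illustrative. Perplexity exposes an OpenAI-compatible endpoint, so the first helper can typically be reused by pointing the client's base_url at it.

```python
# Run the same prompt through two providers and compare latency and token usage.
import time
from openai import OpenAI
from anthropic import Anthropic

PROMPT = "Summarize the trade-offs of tree-based models in two sentences."

def call_openai() -> tuple[str, dict]:
    r = OpenAI().chat.completions.create(
        model="gpt-4o-mini", messages=[{"role": "user", "content": PROMPT}]
    )
    return r.choices[0].message.content, {
        "in_tokens": r.usage.prompt_tokens, "out_tokens": r.usage.completion_tokens
    }

def call_anthropic() -> tuple[str, dict]:
    r = Anthropic().messages.create(
        model="claude-sonnet-4-20250514", max_tokens=200,   # illustrative model ID
        messages=[{"role": "user", "content": PROMPT}],
    )
    return r.content[0].text, {
        "in_tokens": r.usage.input_tokens, "out_tokens": r.usage.output_tokens
    }

for name, fn in [("openai", call_openai), ("anthropic", call_anthropic)]:
    start = time.perf_counter()
    text, usage = fn()
    print(f"{name}: {time.perf_counter() - start:.2f}s, {usage}, {text[:60]}...")
```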
Module 8 – Summary Generation & Notebook Export
• Use LLMs to write structured summaries
• Export notebooks to scripts/APIs
• Emphasize reproducibility and explainability
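A sketch of the export step, assuming nbconvert and nbformat are available; the notebook filename carries over from the Module 2 sketch, and the summarizer is a stub in place of a real LLM call.

```python
# Export the agent-built notebook to a plain Python script, then draft a
# structured summary from its code cells.
import nbformat
from nbconvert import PythonExporter

NOTEBOOK = "agent_scaffold.ipynb"                   # produced in the Module 2 sketch

body, _ = PythonExporter().from_filename(NOTEBOOK)  # notebook -> .py source
with open("agent_scaffold.py", "w") as f:
    f.write(body)

nb = nbformat.read(NOTEBOOK, as_version=4)
code_cells = [c.source for c in nb.cells if c.cell_type == "code"]

def llm_summarize(cells: list[str]) -> str:
    """Stub: a real implementation would prompt an LLM for a structured
    summary (data sources, transformations, model, metrics, caveats)."""
    return f"Notebook contains {len(cells)} code cells."

print(llm_summarize(code_cells))
```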
Module 9 – Observability, Guardrails & Metrics
• Log hallucination rates, latency, and tool invocation success
• Introduce Langfuse and prompt-based validators
• Encourage safe experimentation and auditability
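A minimal, library-agnostic sketch of the instrumentation idea: a decorator records latency and success for each tool call, and the resulting records could be forwarded to Langfuse or any other tracing backend instead of the local list used here.

```python
# Record latency and success for every tool invocation.
import time
from functools import wraps

TRACE_LOG: list[dict] = []

def traced(tool_name: str):
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            record = {"tool": tool_name, "ok": True}
            try:
                return fn(*args, **kwargs)
            except Exception as exc:
                record.update(ok=False, error=str(exc))
                raise
            finally:
                record["latency_s"] = round(time.perf_counter() - start, 4)
                TRACE_LOG.append(record)
        return wrapper
    return decorator

@traced("sql_query")
def run_query(q: str) -> str:
    return f"rows for: {q}"

run_query("SELECT 1")
print(TRACE_LOG)   # e.g. [{'tool': 'sql_query', 'ok': True, 'latency_s': ...}]
```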
Module 10 – Agent Performance Evaluation
• Define quantitative metrics: latency, success rate, prompt coverage, hallucination index
• Hands-on: Add observability layer to track performance and evaluate agent execution trace logs using Langfuse or OpenTelemetry dashboards
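A sketch of the metric aggregation, using illustrative trace records in place of data pulled from Langfuse or an OpenTelemetry backend.

```python
# Aggregate agent trace records into the metrics listed above.
from statistics import mean

traces = [
    {"latency_s": 1.2, "success": True,  "hallucinated": False, "prompt_used": "eda"},
    {"latency_s": 0.8, "success": True,  "hallucinated": True,  "prompt_used": "train"},
    {"latency_s": 2.5, "success": False, "hallucinated": False, "prompt_used": "eda"},
]
PROMPT_CATALOG = {"eda", "train", "summarize"}   # prompts the agent could have used

metrics = {
    "avg_latency_s": round(mean(t["latency_s"] for t in traces), 2),
    "success_rate": sum(t["success"] for t in traces) / len(traces),
    "hallucination_index": sum(t["hallucinated"] for t in traces) / len(traces),
    "prompt_coverage": len({t["prompt_used"] for t in traces}) / len(PROMPT_CATALOG),
}
print(metrics)
```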
Module 11 – Wrap-Up & Next Steps
• Review key concepts and workflows
• Introduce additional self-paced learning resources
• Discuss pathways for production deployment of agentic pipelines