BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Silicon Valley Engineering Council - ECPv6.15.20//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-WR-CALNAME:Silicon Valley Engineering Council
X-ORIGINAL-URL:https://svec.org
X-WR-CALDESC:Events for Silicon Valley Engineering Council
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:America/Los_Angeles
BEGIN:DAYLIGHT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
TZNAME:PDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
TZNAME:PST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20250922T190000
DTEND;TZID=America/Los_Angeles:20250922T210000
DTSTAMP:20260428T044016Z
CREATED:20250809T190327Z
LAST-MODIFIED:20250809T190327Z
UID:76863-1758567600-1758574800@svec.org
SUMMARY:Deploying & Scaling LLMs in the Enterprise: Architecting Multi-Agent AI Systems
DESCRIPTION:*Deploying and Scaling Large Language Models in the Enterprise: Architecting Multi-Agent AI Systems Integrating Vision\, Data\, and Responsible AI* \nLOCATION ADDRESS (Hybrid\, in person or by Zoom\, you choose)\nValley Research Park\n319 North Bernardo Avenue\nMountain View\, CA 94043\nIf you want to join remotely\, you can submit questions via Zoom Q&A. The Zoom link:\nhttps://acm-org.zoom.us/\nJoin via YouTube:\nhttps://youtube.com/live/ \nAGENDA\n6:30 Door opens\, food and networking (we invite honor-system contributions)\n7:00 SFBayACM upcoming events\, introduce the speaker\n7:15 speaker presentation starts\n8:15 – 8:30 finish\, depending on Q&A \nJoin SF Bay ACM Chapter for an insightful discussion. \n**Abstract:**\nLarge Language Models (LLMs) are rapidly reshaping enterprise AI\, but real-world deployments demand far more than fine-tuning and API calls. They require sophisticated architectures capable of scaling inference\, integrating multi-modal data streams\, and enforcing responsible AI practices—all under the constraints of enterprise SLAs and cost considerations.\nIn this session\, I’ll deliver a deep technical dive into architecting multi-agent AI systems that combine LLMs with computer vision and structured data pipelines. 
We’ll explore: \n* **Multi-Agent System Design:** Architectural patterns for decomposing enterprise workflows into specialized LLM-driven agents\, including communication protocols\, context sharing\, and state management.\n* **Vision-Language Integration:** Engineering methods to fuse embeddings from computer vision models with LLM token streams for tasks such as visual question answering\, document understanding\, and real-time decision support.\n* **Optimization for GPU Inference:** Detailed strategies for memory optimization\, quantization\, mixed-precision computation\, and batching to achieve high throughput and low latency in LLM deployment on modern GPU hardware (e.g.\, NVIDIA A100/H100).\n* **Observability and Responsible AI:** Techniques for building observability layers into LLM pipelines—capturing token-level traces\, detecting drift\, logging model confidence—and implementing fairness audits and risk mitigation protocols at runtime. \nDrawing on practical examples from large-scale enterprise deployments across retail\, healthcare\, and finance\, I’ll discuss the engineering trade-offs\, tooling stacks\, and lessons learned in translating research-grade LLMs into production-grade systems.\nThis talk is designed for AI engineers and researchers eager to understand the technical complexities—and solutions—behind scaling multi-modal\, responsible AI systems that deliver real business value. \n**Speaker Bio:**\n**Dhanashree** is a Senior Machine Learning Engineer and AI Researcher with over a decade of experience designing and deploying advanced AI systems at scale. 
Her expertise spans architecting multi-agent solutions that integrate Large Language Models (LLMs)\, computer vision pipelines\, and structured data to solve complex enterprise challenges across industries including retail\, healthcare\, and finance.\nAt Albertsons\, Deloitte\, and Fractal\, Dhanashree has led the development of production-grade AI applications\, focusing on optimization\, model observability\, and responsible AI practices. Her work includes designing scalable inference architectures for LLMs on modern GPU infrastructures\, building hybrid pipelines that fuse vision and language models\, and engineering systems that balance performance with ethical and regulatory considerations. \nShe collaborates with research institutions such as the University of Illinois\, engages with the broader research community\, and frequently speaks on bridging advanced AI research and production systems.\n[https://www.linkedin.com/in/karanluniya](https://www.linkedin.com/in/karanluniya) \n— \nValley Research Park is a coworking research campus of 104\,000 square feet hosting 60+ life science and technology companies. VRP has over 100 dry labs\, wet labs\, and high-power labs sized from 125 to 15\,000 square feet. VRP manages all of the traditional office elements: break rooms\, conference rooms\, outdoor dining spaces\, and recreational spaces. \nAs a plug-and-play lab space\, VRP has 100+ labs ready for companies to expand into once they have secured their next milestone.\nhttps://www.valleyresearchpark.com/
URL:https://svec.org/event/deploying-scaling-llm-in-the-enterprise-architecting-mult-agent-ai-systems/
LOCATION:Valley Research Park\, 319 N Bernardo Ave\, Mountain View\, CA\, 94043\, United States
ATTACH;FMTTYPE=image/jpeg:https://svec.org/wp-content/uploads/2025/08/1024x576-pElGtp.jpg
END:VEVENT
END:VCALENDAR