
Snowflake AI Research is Advancing the Field of AI ❄️

Plus: AI Research Lead, Anupam Datta, on why he believes data agents, observability and inference optimization will be the most important foundations for enterprise-grade AI systems...

CV Deep Dive

Snowflake has become a key player in enterprise AI, with more than 5,200 customers running AI workloads each week. As companies scale AI into production, new challenges emerge: orchestrating agents across structured and unstructured data, ensuring transparency and trust in model decisions, and keeping inference performance under control.

To explore how Snowflake is approaching these challenges head-on, we spoke with Anupam Datta, AI Research Lead at Snowflake, about the team’s focus on three critical areas – agentic systems, observability, and inference optimization – and how these pillars unlock the ability to do more with your data.

Let’s dive in ⚡️

Read time: 8 mins

Our Chat with Anupam 💬

Anupam, welcome to Cerebral Valley! First off, introduce yourself and give us a bit of background on you and Snowflake’s AI Research team.

Hey there! I'm Anupam Datta, and I’m an AI Research Lead at Snowflake. I joined Snowflake about a year ago when a company I co-founded called TruEra was acquired. TruEra built open-source AI observability tooling that helps teams evaluate and monitor their models so they know what is working and what isn’t. Before that, I spent over a decade as a professor at Carnegie Mellon, digging into fairness, transparency, and trustworthy AI.

CV: For the founders reading this, why should they care about what you’re building at Snowflake?

We’re solving the exact problems they’re probably running into. Inference cost and latency are huge blockers when you’re trying to deploy an LLM-powered product. Agents are basically black boxes unless you have good tracing and evaluation tools. And getting structured and unstructured data to work together? That’s a pain most teams underestimate until they’re deep into it.

The Snowflake AI Research team is laser focused on making enterprise AI trustworthy, practical, and fast. For us, that boils down to three big areas:

  1. Agentic AI: building data agents that can orchestrate across structured and unstructured data.

  2. Inference optimization: LLMs can get expensive and slow, fast. We’re focused on making them efficient, responsive, and reliable enough to scale without burning your budget. 

  3. Observability: building tracing and eval tools so you always know what your agent is doing, why it made a decision, and where it might be going off track.

CV: Agentic AI is where a lot of the excitement is right now. What’s different about data agents you’re building?

A lot of teams are excited about agents, but most still struggle to get real answers from their data. Data is spread across dashboards, documents, CRMs, and chat threads, so even basic questions like “why are sales down?” turn into messy, time-consuming projects.

Our data agents are built for exactly that kind of complexity. They’re designed for enterprise workflows, where accuracy and traceability are critical. These agents can plan, choose the right tools, and pull from both structured sources like database tables and unstructured content like docs or PDFs to generate clear, grounded answers. This is the foundation of Cortex Agents, our agentic orchestration system. 

Within structured data, one area where we’re world-leading is text-to-SQL. Most general-purpose models struggle with real-world SQL. They're trained to mimic what SQL looks like, not to verify whether it actually works. We built Arctic-Text2SQL-R1, a state-of-the-art reasoning model that was post-trained with reinforcement learning (RL) specifically for accurate, executable queries based on enterprise data. Ranked #1 on the BIRD benchmark, it uses simple reward signals based on whether the query runs successfully, grounding the model in real outcomes instead of just syntax.

And it’s open source! So other developers can build on it and improve structured data workflows in their own systems.
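The exact reward shaping used for Arctic-Text2SQL-R1 isn’t detailed here, but the core idea – grading generated SQL on execution outcomes rather than surface syntax – fits in a few lines. This is a minimal sketch; the schema, reward values, and `execution_reward` helper are illustrative, not the model’s actual training code:

```python
import sqlite3

def execution_reward(sql: str, conn, expected_rows) -> float:
    """Toy execution-based reward: 1.0 if the query runs and returns the
    expected result set, 0.1 if it merely executes, 0.0 if it errors out."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return 0.0  # invalid SQL earns no reward
    return 1.0 if sorted(rows) == sorted(expected_rows) else 0.1

# Tiny in-memory database to exercise the reward.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INT)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("east", 100), ("west", 250)])

good = "SELECT region FROM sales WHERE amount > 200"   # correct query
close = "SELECT region FROM sales"                     # runs, wrong answer
bad = "SELECT region FROM salez"                       # table typo, errors
```

Because the signal comes from actually running the query, the model is rewarded for queries that work against the real schema rather than queries that merely look plausible.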

CV: Let’s talk about inference. You’ve called it the biggest bottleneck in AI right now.

Inference is quickly becoming the dominant workload in AI, but most systems force developers to choose between fast responses (i.e., low latency) and affordable deployment (i.e., high throughput).

Our team has invested heavily to eliminate those trade-offs through state-of-the-art optimizations built for real-world workloads. That includes unique speculative decoding for up to 4x faster generation, 2x higher throughput on long-context tasks with SwiftKV, GPU-level improvements that enable over 1.5M tokens per second on embedding workloads, and Shift Parallelism to keep latency low while providing high throughput even under spiky, unpredictable traffic.

These optimizations are already running in production inside Snowflake Cortex, powering live agentic workloads at scale. These are also available via Arctic Inference, our open-source vLLM plugin, so others can benefit from the same performance.
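The core loop behind speculative decoding is simple enough to sketch: a cheap draft model proposes several tokens, the expensive target model verifies them, and the longest verified prefix (plus one correction) is kept. The toy `target` and `draft` callables below stand in for real models, and this greedy version omits the probabilistic acceptance used in production systems:

```python
def speculative_decode(target, draft, prompt, k=4, max_len=12):
    """Toy greedy speculative decoding. The output is identical to decoding
    with `target` alone; the win is fewer sequential target-model steps."""
    seq = list(prompt)
    while len(seq) < max_len:
        # 1) Draft k tokens cheaply with the small model.
        proposed = []
        for _ in range(k):
            proposed.append(draft(seq + proposed))
        # 2) Verify with the target model (conceptually one batched pass),
        #    keeping tokens until the first disagreement.
        accepted = []
        for tok in proposed:
            if target(seq + accepted) == tok:
                accepted.append(tok)
            else:
                accepted.append(target(seq + accepted))  # target's correction
                break
        seq += accepted
    return seq[:max_len]

def target(seq):
    # "Big" model: next token is last token + 1.
    return seq[-1] + 1

def draft(seq):
    # Cheap model: usually agrees, but guesses wrong on every third token.
    return seq[-1] + 1 if seq[-1] % 3 else seq[-1] + 2
```

The key invariant is that the output matches what the target model would have produced on its own; the draft model only changes how many tokens get committed per expensive verification step.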

CV: Observability is another big focus for you. What does that mean in the context of agents?

When you’re working with agents, it’s not enough to see just the final output; you need to understand how the agent got there. Was it making the right decisions at each step? Did it go in circles? Did it use the right tool? Without that visibility, debugging is basically guesswork.

Our team at TruEra coined an observability concept called the RAG Triad. It’s a simple but powerful way to evaluate context relevance (whether an agent’s retrievals are on point), groundedness (whether its answers are supported by the retrieved data), and answer relevance (whether its response actually answers the user’s question).

At Snowflake, we’ve optimized our LLM Judges for the RAG Triad to make them state-of-the-art. We have also generalized this approach to give developers end-to-end traceability for agent workflows. We trace every step in an OpenTelemetry-compatible manner, apply RAG Triad evaluations to those traces, and use LLM-as-a-judge for higher-level scoring to catch issues like irrelevant context or incomplete reasoning.
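The RAG Triad structure can be sketched as three judged comparisons over the same question, context, and answer. The `rag_triad` and `toy_judge` functions below are illustrative, not Snowflake’s or TruLens’s actual API; in production the judge would be an LLM call rather than the lexical-overlap stand-in used here:

```python
def rag_triad(question, context, answer, judge):
    """Score the three legs of the RAG Triad. `judge` is any callable that
    maps a two-line prompt to a score in [0, 1] -- in a real system, an
    LLM-as-a-judge."""
    return {
        # Are the retrievals on point for the question?
        "context_relevance": judge(f"Question: {question}\nContext: {context}"),
        # Is the answer supported by the retrieved data?
        "groundedness": judge(f"Context: {context}\nAnswer: {answer}"),
        # Does the response actually answer the user's question?
        "answer_relevance": judge(f"Question: {question}\nAnswer: {answer}"),
    }

def toy_judge(prompt):
    """Stand-in judge: crude lexical overlap (Jaccard) between the prompt's
    two lines, just to make the sketch runnable without an LLM."""
    a, b = [set(line.lower().split()) for line in prompt.split("\n")]
    return len(a & b) / len(a | b) if a | b else 0.0

scores = rag_triad(
    "What is the capital of France?",
    "Paris is the capital of France.",
    "The capital of France is Paris.",
    toy_judge,
)
```

Running each leg separately is what makes the triad useful for debugging: a low groundedness score with high answer relevance, for example, points at hallucination rather than retrieval failure.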

CV: There’s a lot of debate about open versus closed models. Where do you land?

The real issue isn’t open vs closed, it’s whether the model is reliable, performant, and suited to the task. Startups need flexibility across use cases, but switching between providers or model types can be painful: different APIs, auth methods, rate limits, and latency profiles.

That’s why we built Cortex AI to abstract over those differences. You can access both open and closed models through a single REST API, so you don’t have to rewrite code to try something new. We’re inviting a number of design partners to help shape the REST API, so if you’re building with LLMs and want early access, we’d love to work with you!
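The value of that abstraction is that switching models becomes a one-string change rather than a client rewrite. This is a hedged sketch of the idea only; the field names and model identifiers below are hypothetical and do not reflect the actual Cortex REST API schema:

```python
import json

def build_completion_request(model: str, prompt: str) -> str:
    """Build one request body that works regardless of which model family
    sits behind the endpoint. Field names are illustrative, not the real
    Cortex API schema."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })

# Trying a different (hypothetical) model is a one-argument change:
open_req = build_completion_request("open-model-70b", "Summarize Q3 sales.")
closed_req = build_completion_request("closed-model-x", "Summarize Q3 sales.")
```

The point is the shape, not the names: one payload format and one auth path, with only the `model` string varying between open and closed providers.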

Our Arctic embedding models currently lead the MTEB leaderboard for small models (under 300M parameters) and power Cortex Search, which gives you fast, low-cost retrieval. And because they’re open, you can fine-tune or customize them as needed.

CV: What’s next for you and the research team?

We’re going to keep pushing on three fronts: open research, faster inference, and agentic AI frameworks. Expect more Arctic model releases, more open-source tools like TruLens, and more features in Cortex that make it easy to build and deploy data agents that our users can trust.

If you’re a founder or developer working on agentic AI, you can join our AI community to get regular updates on the latest innovations, use cases, and discover ways to collaborate with us. We see a huge opportunity to move this field forward together.

CV: Any final advice for builders who are just getting started with agents?

Start small and focus on trust. Build an agent that can solve one meaningful problem, and instrument it so you can see how it’s reasoning. Don’t get caught up in chasing the biggest models. Speed, cost, and accuracy matter a lot more when you’re trying to ship something people will use.

Conclusion

To stay up to date on the latest from Snowflake AI Research, follow them here.


If you would like us to ‘Deep Dive’ a founder, team or product launch, please reply to this email ([email protected]) or DM us on Twitter or LinkedIn.
