Tavus is revolutionizing AI-powered video 🎥

Plus: Founder Hassaan Raza on unlocking a new, more human era of computing through AI...

CV Deep Dive

Today, we’re talking with Hassaan Raza, Co-Founder and CEO of Tavus.

Tavus is an AI research company focused on building foundational models for human-centric AI. Designed as an infrastructure platform for developers, Tavus enables the creation of AI-powered video experiences, allowing companies to integrate AI avatars, video generation, and real-time conversational video into their products. By providing powerful models and APIs, Tavus helps developers build more interactive and immersive digital experiences that feel human – where AI has perception, empathy, and understanding. 

Tavus has already seen early adoption across industries like marketing, sales automation, and corporate training, with use cases expanding into education, e-commerce, and tele-health. The company’s cutting-edge technology is driving the future of human-computer interaction, offering users more personalized and natural communication experiences.

In this conversation, Hassaan shares how Tavus was founded, the challenges of building conversational video models, and how the company is unlocking a new, more human era of computing through AI.

Let’s dive in ⚡️

Read time: 8 mins

Our Chat with Hassaan 💬

Hassaan - welcome to Cerebral Valley! First off, tell us a bit about your background and what led you to found Tavus.

Hey, I'm Hassaan, co-founder and CEO at Tavus. My background is in product and engineering, and before Tavus, I worked at Google, Apple, and a few other companies. I also ran a product studio and have been working in the ML space, specifically in areas like avatars, 3D reconstruction, and lip sync, for about five or six years now. It’s been interesting to watch AI evolve during that time.

Tavus started to take shape back in 2019 or 2020. We had this idea around scaling digital likeness and presence. One key insight was how effective personalized, one-to-one videos were in engaging audiences, but it wasn't practical to record hundreds of videos every day. That’s when the lightbulb went off: why not use what was then called deepfake tech (now known as generative AI) to scale personalized video creation?

The focus was always on extending digital likeness and making digital experiences more immersive and human. Since then, we've evolved from personalized videos to full video generation, and now we're focused on real-time conversational video – and providing these as tools for developers. That’s the journey so far.

Give us a top-level overview of Tavus - how would you describe the startup to those who are less familiar with you?

Tavus is an AI research company building foundational models and tooling that make AI interactions more human. Essentially, we provide the infrastructure (models and APIs) that developers can use to create video experiences within their platforms. Whether they want to integrate AI replicas or avatars, generate video, or build real-time conversational video experiences, they can leverage our API and models to bring those capabilities to life. We also have lip syncing and word replacement APIs coming soon.

Talk to us about your users today - who’s finding the most value in what you’re building with Tavus? Any specific customer segments uniquely suited to incorporating Tavus? 

The majority of our usage at Tavus comes from our enterprise customers—product teams at software companies that want to build AI video features, like video editors or AI video platforms. However, we also serve a significant number of startups where our APIs are core to their product. So, we support a wide range of users, from startups to larger enterprises.

For us at Tavus, the earliest adopters of our technology have consistently been in marketing and sales – or anyone in customer engagement. These teams have always resonated with innovations like personalized video and full video generation. Now, with conversational video, we're seeing the same trend. Marketing and sales automation have greatly benefited from using our tools to effectively create, manage, and nurture their funnels.

Corporate training is another area that's shown significant interest, especially in personalizing the learning experience. For example, with role-playing and expert cloning through conversational video, companies can scale training without needing the physical presence of experts.

We're also seeing emerging use cases in sectors like live e-commerce, education, life coaching, and telehealth. One of our key clients, a major video sales platform with millions of users, uses our AI models to power their AI avatar workflows. This has significantly boosted engagement and increased the number of videos being produced, allowing their users to create more curated, personalized content.

How do you measure the impact that Tavus is having on your key customers? Any use-cases that you’d like to share?  

We look at it at two levels. First, is Tavus helping product development teams get to market faster with AI video, and does it enable high-value human services like expert mentoring or therapy to reach a wider audience through AI scalability? Second, what impact does our technology have on our customers’ users?

For conversational video, we’re seeing the impact in terms of whether end users are spending more time with AI agents compared to what they would have with conversational text or audio. The question is, are they more engaged? Is the quality of the conversation better? We’ve seen, time and time again, that people are more engaged, ask better questions, and get better responses because there’s more context and the exchange feels more human when using conversational video through Tavus, versus just audio or text.

For async video, it’s about seeing if there’s better engagement overall. Engagement can mean different things depending on the user, but we measure it by looking at whether people are using the platform more to generate videos and whether more people are watching them. For example, one of our customers said that embedding generated video into email increased response rates from nearly zero to an average of 3-5%, with peaks as high as 14%.

Ultimately, we measure impact by making sure that what we’re offering is driving more scalability, better engagement, more personalization, and greater adoption from our customers’ users.

Could you share a little bit about how Tavus works under the hood? 

Tavus revolves around what we call replica models. These models create a digital version of a person’s face and voice. A user records about one or two minutes of video training footage, and then the model learns how they move their face and mouth, creating a 3D model of them. We use something called Gaussian splatting to achieve this, and a neural renderer ensures that the textures on your face are accurate and closely resemble you.

A lot of our research is focused on building high-quality, immersive models that replicate people as accurately as possible. These models are used both for video generation and conversational video. The conversational video aspect is really interesting because it's a multimodal interface—it can see, hear, and respond like a human. For example, if you were interacting with my digital twin right now, it could see you, maybe comment on your hat, or notice something in your background. It uses facial expressions and other visual cues to drive the conversation naturally, just like a human would.

We've developed this sophisticated pipeline that can essentially mimic human interaction, and it's powered by the same Phoenix class of models used in our video generation, just at a much faster speed.

We’re really focused on what makes a conversation feel human: which non-verbal cues impact understanding and perception.
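For readers unfamiliar with Gaussian splatting, here is a minimal, generic sketch of the core idea: a scene is represented as many soft Gaussian blobs, and each pixel's color is obtained by blending the blobs that cover it from nearest to farthest. This is not Tavus's implementation. The `Splat` fields, the isotropic 2D simplification (real systems project 3D Gaussians with full covariance matrices), and all function names are assumptions made purely for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Color = Tuple[float, float, float]


@dataclass(frozen=True)
class Splat:
    """A hypothetical, already-projected 2D Gaussian (illustrative only)."""
    x: float        # center in pixel coordinates
    y: float
    depth: float    # distance from the camera; smaller is closer
    sigma: float    # isotropic standard deviation in pixels (a simplification)
    opacity: float  # peak opacity in [0, 1]
    color: Color    # RGB in [0, 1]


def splat_alpha(splat: Splat, px: float, py: float) -> float:
    """Opacity the splat contributes at pixel (px, py): peak opacity times Gaussian falloff."""
    dist_sq = (px - splat.x) ** 2 + (py - splat.y) ** 2
    return splat.opacity * math.exp(-dist_sq / (2.0 * splat.sigma ** 2))


def render_pixel(splats: List[Splat], px: float, py: float) -> Color:
    """Blend splats front to back with standard alpha compositing."""
    r = g = b = 0.0
    transmittance = 1.0  # fraction of light not yet blocked by closer splats
    for splat in sorted(splats, key=lambda s: s.depth):
        alpha = splat_alpha(splat, px, py)
        weight = transmittance * alpha
        r += weight * splat.color[0]
        g += weight * splat.color[1]
        b += weight * splat.color[2]
        transmittance *= 1.0 - alpha
    return (r, g, b)
```

In this toy setup, a half-transparent red splat in front of an opaque blue one yields an even mix of the two at their shared center. Production renderers add view-dependent color, anisotropic covariances, and GPU rasterization on top of the same blending rule.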

There’s been an explosion of interest in agentic workflows. How has that shaped the way you’re thinking about building Tavus? 

Our Conversational Video product brings agents to life by adding a human-like video and intelligent conversational layer. Most agents out there are text or audio-based, but with Tavus, you get a full video experience. This makes interacting with agents feel more natural. For example, instead of having to text with an AI chatbot, someone like my mom could now FaceTime with an AI that looks and feels human, making the experience much more comfortable and intuitive for her.

Not only are we providing a video layer and giving conversational AI a face, we are giving it human-like understanding and driving towards empathy. We are helping move toward a future where human-computer interfaces mimic how we interact with each other in the real world with perception and understanding of non-verbal cues. The goal is to have the computer understand us better, rather than us having to adjust to how it works.

What’s been the hardest technical challenge you’ve had to overcome to get Tavus to where it is today? 

Latency is a huge factor when it comes to building conversational video. It has to be low-latency, or it’s simply not going to work. We’ve dedicated a lot of our research and engineering time to making sure that our conversational video can process multimodal inputs, like voice and video, in a really short amount of time, and then generate responses quickly too.

It’s about creating a really tight integration of all the pieces, ensuring there’s no significant fidelity loss, while also optimizing each step in the pipeline for speed. We’re measuring in tens of milliseconds to make sure everything is as fast as possible. Our conversational video has the lowest latency of any interface out there, and it’s because we’ve hyper-optimized the entire process, from video generation that’s faster than real-time to speech recognition and the LLM itself. Response latency is less than one second, and often even faster than that. We’ve even built models that "think ahead" by predicting what’s being said while it’s still being spoken, which is key to achieving that fast response time.
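To make the idea of a per-stage latency budget concrete, here is a minimal sketch of how a team might check a pipeline against a sub-second target. The three stage names come from the answer above, but every timing value, the 1,000 ms budget constant, and all function names are hypothetical placeholders, not Tavus measurements or APIs.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Stage:
    name: str
    latency_ms: float  # hypothetical figure, for illustration only


# Assumed stage timings; real values would come from profiling.
PIPELINE: List[Stage] = [
    Stage("speech_recognition", 200.0),
    Stage("llm", 400.0),
    Stage("video_generation", 250.0),
]

BUDGET_MS = 1000.0  # the "less than one second" target


def total_latency_ms(stages: List[Stage]) -> float:
    """End-to-end latency if the stages run strictly one after another."""
    return sum(stage.latency_ms for stage in stages)


def slowest_stage(stages: List[Stage]) -> Stage:
    """The stage most worth optimizing first."""
    return max(stages, key=lambda stage: stage.latency_ms)


def within_budget(stages: List[Stage], budget_ms: float = BUDGET_MS) -> bool:
    """True when the sequential total fits inside the budget."""
    return total_latency_ms(stages) <= budget_ms
```

With these placeholder numbers the sequential total is 850 ms, leaving 150 ms of headroom. A sequential sum is the pessimistic case: techniques like the "think ahead" prediction described above overlap stages with the user's speech, so the delay a user actually perceives can be lower than the sum.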

How do you plan on Tavus progressing over the next 6-12 months? Anything specific on your roadmap that new or existing customers should be excited for? 

The next six to twelve months are going to be insane in terms of development. Right now, even though the video is already really good, there is still some uncanniness—like certain expressions or responses that don't fully land. It’s already better than just audio or text, but soon, we’ll reach a point where it will feel very fluid. 

The rendering, conversational awareness, and multimodal understanding will become so advanced that the experience will feel like you are talking to a real person. Of course, there will still be clear indicators that it’s AI when asked, and our philosophy is full disclosure, but people won’t care if it's AI because it will feel so natural. Right now, companies don’t quite realize that this technology is already here. For example, just last week Intercom was talking about the future of customer support AI agents evolving from text to voice to, eventually, video avatars. There was also a viral post on X debating whether a video ticket agent at a train station was AI. But the reality is, this tech is already here.

Once people recognize that, we're going to see much wider adoption, beyond the early adopters. Human-like AI agents will start showing up in kiosks, education, telehealth—you name it. Conversational avatars will be everywhere, and it will feel totally natural, and people will genuinely love them because they’ll be designed to be super helpful.

Lastly, tell us a little bit about the team and culture at Tavus. How big is the company now, and what do you look for in prospective team members that are joining?

At Tavus, we're really looking for a few key traits in the people we bring on: curiosity, adaptability, and a deep passion for building. Everyone here is genuinely excited about the future of human-computer interfaces and pushing the boundaries of digital communication. It’s also a low-ego environment—everyone is kind, collaborative, and hardworking.

One thing we tend to notice is that everyone at Tavus has a bit of a screw loose, in the best way. We’re all a little obsessive about certain details, and that kind of intensity is something we value. Whether it’s having something to prove or being hyper-focused on making things perfect, it’s a trait that really defines our team.

In terms of roles, we’re currently hiring for a growth marketer, an SDR, machine learning engineers, AI researchers, and full-stack engineers who are comfortable working in a machine learning setting.

Conclusion

To stay up to date on the latest with Tavus, learn more about them here.

If you would like us to ‘Deep Dive’ a founder, team or product launch, please reply to this email or DM us on Twitter or LinkedIn.