TensorStax is building Autonomous AI for MLOps and Data Engineering 🌐

Plus: Founder Aria on the exciting future of data science agents...

CV Deep Dive

Today, we’re talking with Aria Attar, Founder of TensorStax.

TensorStax is revolutionizing the way companies approach data science and machine learning by developing autonomous agents that handle complex data engineering and MLOps. Aria, who dove into machine learning straight out of high school and gained experience across various startups, has set his sights on addressing the shortage of skilled data scientists and ML engineers. TensorStax's goal is to create agents that can manage everything from data pipelines to training and deploying models, making these advanced processes more accessible and efficient for any team running data science workloads across its stack.

Since launching, TensorStax has seen success in helping startups and F500s alike streamline data transformation, build recommendation systems, and even fine-tune computer vision models. With a focus on solving real-world problems for both software engineers and data scientists, the platform is quickly becoming a go-to tool for enterprises looking to scale their data operations.

In this conversation, Aria discusses the journey of building TensorStax, the technical challenges they've overcome, and the exciting future of autonomous agents in data science and ML.

Let’s dive in ⚡️

Read time: 8 mins

Our Chat with Aria 💬

Aria - welcome to Cerebral Valley! First off, give us a bit about your background and what led you to found TensorStax? 

Back in 2018, I graduated high school and got really interested in machine learning and deep learning when Tesla started showing the viability of FSD through cameras. I wanted to dive straight into working with startups in the ML/AI space, so I started learning a lot of this stuff on my own. I wanted to drop out of university, but my parents weren't too happy about that, so I did a couple of semesters before finding the confidence to drop out.

After that, I worked at different startups in various roles, mostly in data science and ML. I even tried to start my own company around that time, but we had to shut it down pretty quickly. Eventually, I moved into some go-to-market roles to learn the other side of startups, and that’s kind of how I got to where I am today.

Give us a top level overview of TensorStax - how would you describe the startup to those who are maybe less familiar with you? 

We're building autonomous data science and ML engineers, where these agents can help with everything from data engineering and ETL jobs to training models, deploying them, and setting up observability. Our primary users fall into two categories. The first is software engineers who haven't spent much time with data science and ML, and we're helping them get to a baseline level of value with some solid processes in place. 

The second group is data scientists and ML engineers. They often have more work to do than they can manage, and we're aiming to speed up their processes significantly. The core thesis behind this is that there aren't nearly enough data scientists and ML engineers. If you look at the numbers, there's only about one machine learning engineer for every 250 software engineers, which is a pretty crazy ratio. And that's a big part of the problem we're trying to solve.

How would you recommend a new user get started with TensorStax? Any specific use-cases you'd like to highlight?

Right now, we're really focused on the data engineering side of things, particularly when it comes to complex use cases like full-on data analysis, data visualization, or building and deploying complex ETL pipelines. These are areas where we really shine at the moment, and we're also working on training some basic ML models on top of that. As we continue to build out the product, things will get more complex, but it's all about matching the use cases you're trying to achieve with the complex data engineering workflows we're good at. It also depends on where your data lives — whether it's in databases on AWS, GCP, Azure, or in warehouses or lakehouses like Snowflake or Databricks.

We’ve been keeping track of our success by looking at how accurate and reliable our agents are when it comes to building complex workflows. For example, if you need to set up a complex data pipeline and get it into production, we see how quickly the agent can handle that and how close it is to what you actually wanted. That’s been our main way of measuring things. 

Walk us through some of those use-cases, specifically on the ML side of things. 

So far, we’ve found success in a couple of key areas. On the data pipelining and engineering side, the most common use case we see is when users need to move data from an old Postgres instance to a new Snowflake warehouse. You might want to set up some transformations along the way, like converting third-party data into the format that fits your current datasets and internal use cases. We’re seeing a lot of success with data transformations in those scenarios.
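
To make that concrete, here is a minimal sketch of what such a migration step might look like under the hood, assuming pandas plus the standard Postgres and Snowflake connectors; the connection details, tables, and transformation are all hypothetical stand-ins, not TensorStax's actual implementation.

```python
# Hypothetical sketch of a Postgres -> Snowflake move with a transformation step.
import pandas as pd
import psycopg2
import snowflake.connector
from snowflake.connector.pandas_tools import write_pandas

# Extract from the legacy Postgres instance
with psycopg2.connect("postgresql://user:pass@legacy-host/appdb") as pg:
    df = pd.read_sql("SELECT id, vendor_ts, payload FROM vendor_events", pg)

# Transform: reshape third-party data into the internal format
df["event_time"] = pd.to_datetime(df["vendor_ts"], utc=True)
df = df.rename(columns={"payload": "raw_payload"}).drop(columns=["vendor_ts"])

# Load into the new Snowflake warehouse
sf = snowflake.connector.connect(
    account="...", user="...", password="...",
    warehouse="ETL_WH", database="ANALYTICS", schema="PUBLIC",
)
write_pandas(sf, df, table_name="VENDOR_EVENTS", auto_create_table=True)
```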

On the ML side, we’ve had great results with recommendation models. The agent can take your data, format it for a recommendation algorithm, and then go ahead and train and deploy the model. Another area where we’ve had a lot of success is anomaly detection, like spotting network anomalies. We’re also working with a government agency that’s fine-tuning computer vision models—specifically CLIP and YOLO models—through our agent. The workflow is pretty straightforward: they give the agent a bucket with labeled computer vision data, and the agent handles everything from spinning up the infrastructure to fine-tuning and deploying the model.
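
As a rough illustration of that recommendation workflow, here is what the "format the data, then train" step could look like with the open-source implicit library — an assumed tool choice for the sketch, not necessarily what the agent uses; the interaction log columns are hypothetical.

```python
# Sketch: turn raw interaction logs into a sparse user-item matrix and train
# a collaborative-filtering model (implicit >= 0.5 expects user-item order).
import pandas as pd
import scipy.sparse as sp
from implicit.als import AlternatingLeastSquares

logs = pd.read_csv("interactions.csv")  # hypothetical: user_id, item_id, clicks
users = logs["user_id"].astype("category")
items = logs["item_id"].astype("category")
matrix = sp.csr_matrix((logs["clicks"], (users.cat.codes, items.cat.codes)))

model = AlternatingLeastSquares(factors=64, iterations=15)
model.fit(matrix)

ids, scores = model.recommend(0, matrix[0], N=10)  # top-10 items for user 0
```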

There's a lot of excitement around AI agents today - how are you factoring that into TensorStax's product given you're building an AI data scientist?

Today, our agents function more like autonomous co-pilots: you give them a specific task, they execute the steps in sequence, and then they come back to you with the results. But looking forward, we see the future of work—and agents—evolving into higher levels of abstraction, especially around recurring task assignments. We're really focused on building out a system where an engineer, data scientist, or ML engineer can assign a task to an agent, and it'll just go ahead and do it the same way that one of their co-workers would.

Take, for example, a recommendation algorithm, where performance can degrade over time as user behavior shifts. Typically, you'd need to reprocess data, retrain the model, adjust hyperparameters, and finally redeploy the model when its live prediction metrics fall below a certain threshold. These are routine tasks for data scientists and ML engineers. We're building a world where you can offload recurring tasks to an agent. You could tell it, "Monitor the live prediction metrics for this recommendation model, and whenever they drop below a certain level, rerun last week's workflow." The agent would then manage this process continuously, 24/7, just as a data scientist would.
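
Stripped down, that kind of recurring assignment amounts to a loop like the one below, where get_live_metric and rerun_workflow are hypothetical stand-ins for whatever monitoring and orchestration layer is actually in place.

```python
import time

THRESHOLD = 0.80   # hypothetical floor for a live metric like precision@10
POLL_SECS = 3600   # check hourly

def monitor_and_retrain(get_live_metric, rerun_workflow):
    """Poll a live metric; rerun last week's workflow when it degrades."""
    while True:
        if get_live_metric("rec_model_prod") < THRESHOLD:
            # reprocess data, retrain, tune, redeploy: the same steps a
            # data scientist would otherwise run by hand
            rerun_workflow("weekly_retrain")
        time.sleep(POLL_SECS)
```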

As for whether knowledge workers, software engineers, or data scientists will be replaced by AI—my honest take is no. I think instead the nature of their work will shift dramatically. Individual contributors will take on roles more like engineering managers. They’ll be responsible for managing and assigning tasks to multiple agents, monitoring the work, and making adjustments as needed. We’re starting to see this play out now, and it’s an exciting direction for the future of work.

What’s the hardest technical challenge you’ve had to face whilst building TensorStax to where it is today? 

There’s been a lot we’ve learned along the way. One of the key insights we've had about agents—and this is becoming more widely understood—is that they struggle with very general tasks. However, when you narrow down the inputs and outputs that these LLM-based agents deal with, their reliability increases significantly.

For our use case, we built an infrastructure layer underneath the agent. This infrastructure is what the agent relies on to trigger complex data science and ML workflows. We try to offload as much of the critical work as possible to deterministic code and traditional software, making the agent more of an orchestrator for the underlying infrastructure. 
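
The pattern Aria is describing looks roughly like this (an illustrative sketch, not TensorStax's code): the agent emits a small, structured action, and everything critical runs as deterministic, tested functions.

```python
from typing import Any, Callable

def run_etl_pipeline(source: str, dest: str) -> str:
    # deterministic code path: provisioning, retries, schema checks live here
    return f"pipeline {source} -> {dest} submitted"

def train_model(dataset: str, algo: str) -> str:
    return f"training {algo} on {dataset}"

TOOLS: dict[str, Callable[..., str]] = {
    "run_etl_pipeline": run_etl_pipeline,
    "train_model": train_model,
}

def execute(action: dict[str, Any]) -> str:
    """The LLM only chooses a tool and arguments; it never free-forms the work."""
    if action["tool"] not in TOOLS:
        raise ValueError("unknown tool")  # narrow the agent's output space
    return TOOLS[action["tool"]](**action["args"])

print(execute({"tool": "run_etl_pipeline",
               "args": {"source": "postgres", "dest": "snowflake"}}))
```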

Another thing we noticed is that as agents run longer workflows or engage in extended conversations, their performance tends to degrade. They start hallucinating more and making more mistakes. We traced this back to an issue with compounding token errors in LLMs. Basically, the longer they run, the more they encounter out-of-distribution tokens, which can throw the LLM off track and lead to more errors. 

To tackle this, we found that minimizing the background information or tokens injected into the context before the agent enters an autonomous loop greatly improves accuracy and reliability. After each loop, we summarize the actions taken and feed that summary back to the orchestrator agent. This approach has really helped maintain reliability across complex, multi-step workflows.
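
In sketch form, the idea is something like the following, with llm as a hypothetical completion function:

```python
def run_multi_step_task(llm, steps: list[str]) -> list[str]:
    """Carry a compact summary between loops instead of the full transcript."""
    summaries: list[str] = []
    for step in steps:
        context = "\n".join(summaries)  # minimal background tokens injected
        result = llm(f"Prior work:\n{context}\n\nNow do: {step}")
        # compress this loop's actions before handing control back
        summaries.append(llm(f"Summarize the actions taken in one sentence:\n{result}"))
    return summaries
```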

On top of that, we've tackled some pretty big infrastructure challenges, like enabling agents to run large batch jobs. For example, when an agent needs to execute a big data job, it has to provision a large compute instance and manage that job efficiently. 
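
On AWS, for instance, that provisioning step could look like the boto3 sketch below — an assumption about the cloud layer, with the AMI and instance type as placeholders.

```python
import boto3

ec2 = boto3.resource("ec2")

def run_batch_job(bootstrap_script: str) -> str:
    """Spin up a large instance that runs the job and terminates itself."""
    (instance,) = ec2.create_instances(
        ImageId="ami-0123456789abcdef0",   # placeholder AMI
        InstanceType="r5.24xlarge",        # big-memory box for the batch job
        MinCount=1, MaxCount=1,
        UserData=bootstrap_script,         # job kickoff runs at boot
        InstanceInitiatedShutdownBehavior="terminate",
    )
    return instance.id
```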

How do you plan on TensorStax progressing over the next 6-12 months? Anything specific on your roadmap that new or existing customers should be excited for? 

The next wave of use cases we're excited to fully capture involves large-scale data transformations and pipelining with big data—think billion-scale data transformations. These are the kinds of tasks you’d typically see a whole data engineering team handling through a tool like Databricks.

Beyond that, we plan to focus on more complex tasks like model training and fine-tuning. Right now, the agent can handle training and fine-tuning with a limited set of models, but we're adding more layers of complexity to the types of models it can work with. Something we're particularly excited about is enabling the agent to fine-tune LLMs for customers. So rather than setting up your training infrastructure and writing code, you can just tell the agent to fine-tune something like Llama 70B using DPO on a specific dataset, and it'll take care of the rest.
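
To give a sense of what that would replace, here is roughly what a DPO fine-tuning run looks like today with the open-source TRL library — one plausible toolchain, not necessarily the agent's internals. In practice, a 70B model also needs multi-GPU sharding or quantization that this sketch omits.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_name = "meta-llama/Llama-3.1-70B"  # gated; swap in a small model to test
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# preference dataset with "prompt", "chosen", "rejected" columns
dataset = load_dataset("trl-lib/ultrafeedback_binarized", split="train")

trainer = DPOTrainer(
    model=model,
    args=DPOConfig(output_dir="llama-dpo", beta=0.1),
    train_dataset=dataset,
    processing_class=tokenizer,
)
trainer.train()
```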

Lastly, tell us a little bit about the team and culture at TensorStax. How big is the company now, and what do you look for in prospective team members?

Right now, we’re a small team of five—three full-time and two part-time. The two part-time engineers are still with us but will be transitioning to full-time once we close our next funding round. We're looking for engineers with a strong background in building complex infrastructure and developer tools at scale. We're also really interested in people who are excited about agentic architectures and can approach new problems in that space from a first-principles perspective.

From what we've seen, both ML engineers and backend engineers can thrive in this environment, as can DevOps engineers and those with experience in related fields. So really we're open to anyone with the right skills and passion for working on these cutting-edge problems.

Conclusion

To stay up to date on the latest with TensorStax, learn more about them on LinkedIn and Twitter.  

If you would like us to ‘Deep Dive’ a founder, team or product launch, please reply to this email ([email protected]) or DM us on Twitter or LinkedIn.