The goal of eval is to guide product development. The reason I'm a big fan of eval is that it helps you uncover opportunities and see where the product is doing well.
Chip Huyen
Core developer at Nvidia, AI researcher, Stanford instructor, Author
8 quotes across 1 episode
AI Engineering 101 with Chip Huyen
In a lot of the companies that I have seen, the biggest performance gains in their RAG solutions come from better data preparation, not from agonizing over which vector database to use.
I think there's a lot of value plays in... So before, we had a lot of disjointed teams. We had a very clear engineering team and product team, but then there's the question of who should write evals. Who should own the metrics? And it turns out eval is not a separate problem. It's a system problem, because you need to look into the different components and how they interact with each other.
We are in an idea crisis. Now we have all these really cool tools to do everything from scratch. They can give you a new design. They can write code for you. You can have a new website. So in theory, we should see a lot more, but at the same time, people are somehow stuck. They don't know what to build.
It's really hard to measure productivity. So you have to think about what actually drives productivity metrics for you.
I do ask people to ask their managers, 'Would you rather give everyone on the team very expensive coding agent subscriptions, or get an extra headcount?' Almost every time, the managers will say headcount. But if you ask someone at the VP level, or someone who manages a lot of teams, they would say, 'I want the AI assistants.' Because as a manager, you are still growing, so for you, having one more headcount is big. Whereas for executives, maybe you have more business metrics that you care about.
You don't have to be absolutely perfect, I think, to win. You just need to be good enough and be consistent about it.
One tip is to go look at your last week. For a week, just pay attention to what you do and what frustrates you. And when something frustrates you, think about it: is there anything we can do? Can it be done a different way?