Data Day Podcasts

Automating the Hard Parts of Data Science

February 10, 2026


In this episode, Greg Michaelson is joined by Razi Raziuddin for a wide-ranging conversation about how data science is evolving as automation and agent-based systems become more capable. They discuss why feature engineering and data preparation dominate the data science lifecycle, how organizations balance productivity with model performance, and where tools like AutoML, feature stores, and large language models fit in real-world practice. The conversation also explores go-to-market challenges, the gap between hype and adoption, and how teams are rethinking what it means to build and deploy predictive models at scale.

  • 0:00

    All [music]

    0:07

    [music] right, welcome back to Data Day with

    0:13

    Greg Michaelson. I'm joined here by Razi Raziuddin, my good friend from DataRobot, who's doing some really

    0:19

    incredible things in the feature engineering space and we actually haven't chatted in a while. So, I'm excited to catch up. Welcome.

    0:26

    Thanks, Greg. Good to see you. Yeah, it's good seeing you as well. You're up in Boston, right?

    0:31

    I am, man. I am. Yeah, I never moved. You're pretty pretty well stuck. Did I

    0:38

    remember you had one of your kids was at Yale? Do I remember that right? Or is it Cornell? Uh, Cornell. Yeah,

    0:43

    Cornell. And they've graduated since, I guess. Yeah, he's graduated, but you know, I think he likes Cornell a lot, so he went

    0:49

    back. So, [laughter] he's still there. That can happen. Grad school.

    0:55

    Grad school. Yeah. Yeah. Yeah. Yeah. Good times. So I have I have two boys. One on the east coast, one on the west coast. So I

    1:01

    have one in New York, one in LA. [laughter] So I I have both coasts covered.

    1:07

    All right. Excellent. Spanning nationwide nationwide family there. You're spreading the network.

    1:12

    Yes. How about yours? Ah, golly. My oldest has moved

    1:17

    back home, so you know, he's still kind of finding his way a bit. All right. Jack is about to join the army. He wants

    1:24

    to be a welder. So that should be exciting. And then the youngest two are in school in Reno and Vegas. So they're

    1:32

    off doing their matriculating. So I've almost got them off the payroll. There you go. Well, I keep telling myself that, but

    1:38

    I'm a long way from having them off the payroll, I think. [laughter] So man, fill me in. How's it going

    1:44

    at FeatureByte? Well, for folks that aren't familiar with FeatureByte, maybe just give a little background. I'd love to hear more about what you guys

    1:50

    Yeah, absolutely. So, we're a data science agent. And uh you know, this is something that Greg, you're going to be

    1:58

    super familiar with as a DataRobot alum and a data scientist

    2:04

    extraordinaire, right? The the the challenge with with data science and the data science life

    2:09

    cycle is not about building models. I mean, you know, we used to encounter that when we pitched DataRobot to data

    2:17

    scientists. They're like, "Yeah, we can build models. You know, maybe DataRobot, or tools like

    2:24

    DataRobot, any AutoML tool, will do a slightly better job." But at the end of the day, that's the

    2:31

    fun part. Yeah. [laughter] The really kind of crappy part is just dealing with all of the data, doing

    2:38

    feature engineering understanding uh the domain and bringing all of that

    2:43

    knowledge and dealing with messy data and that's like you know deploying all

    2:48

    of that stuff that's 90% of the work right for data scientists and um so we started

    2:55

    FeatureByte, and you know, my partner in crime is somebody that you know really

    3:01

    well. Xavier Conort used to be the chief data scientist at DataRobot; he used to be number one on Kaggle. So we,

    3:09

    when we stepped outside uh data robot we were like okay what problems haven't

    3:15

    been solved in the space, right? And one of them is, well, that's the biggest problem to go off and solve. So

    3:21

    we said okay well let's start there. So we started with feature engineering but at the end of the day as we started

    3:26

    going to market, you know, customers were like, okay, well, if you're taking care of 80-90% of the work, why

    3:33

    don't you just do the whole thing? So we added, you know, just added

    3:39

    light AutoML capabilities. But at the end of the day, it's an end-to-end

    3:44

    data science life cycle, data science agent that just basically starts with

    3:49

    data that's sitting in your data warehouse or data lake in its raw form, and then goes through all of the

    3:57

    steps that a really good data scientist would do to then build a model and help you get the model as well as all of the

    4:04

    feature pipelines deployed in production. So really cool, really cool. I I can't take credit for building it,

    4:11

    but you know, I love talking about it because it's so cool. [laughter]

    4:16

    And it's all agent based, so the user is not actually in there typing code and stuff. Yeah. I mean, it's it's very

    4:22

    transparent. Uh so at the end of the day, the user could decide if if uh you know, they want to go in and type code

    4:30

    or bring their own features or you know, just extract code, put it somewhere else

    4:35

    if they want to deploy it in their environment. So there's a lot of flexibility. You can pretty much think of this as as like an extension to the

    4:42

    data science team that's doing a lot of things on your behalf. So it is an agent in and of itself

    4:50

    where you can say, "Okay, well here's the data. Here's the problem I'm looking to go off and solve.

    4:56

    Show me what you can do with this data." Right? And then it just goes off and looks at all of the metadata that's out

    5:03

    there, tries to understand how it relates to the data science problem

    5:10

    and then does the you know the the the ideation of features as well as

    5:16

    statistical evaluation of features and then ultimately goes off and builds models. So it's as you know the the the

    5:22

    whole data science life cycle is very complex. It's not something that you can just vibe code [laughter] your way

    5:28

    through. So are you pointing it at a variable, or are you

    5:34

    giving it like a text description? So would you say hey I want to model churn or or are you saying here's my target or

    5:40

    kind of, what do you feed it? Yeah. So both, right? So you describe the use

    5:46

    case. So it's like a prompt to the system, to FeatureByte, to say,

    5:51

    I want to predict churn or I want to understand the risk associated with you know the set of customers or these

    5:58

    mortgages or whatever have you right any kind of classical ML use case. I hate

    6:04

    that word. Let's call it predictive AI use case. How about that? And then you give it a variable. uh you give it a

    6:10

    target to say okay well this is this is the ground truth that I'm looking to go off and now predict on and then you know

    6:18

    it does a lot of the the feature ideation the generation of historical

    6:23

    data sets that are point in time correct so it avoids target leakage and then you

    6:30

    know, just basically takes your data sets, splits them appropriately into

    6:35

    training and testing and validation. Those are the kinds of things that, you know, in many ways have been solved

    6:41

    with AutoML but it's all of the the front end of the data science life cycle

    6:47

    which is understanding the data you know is figuring out okay well you know how

    6:53

    um that data relates to the problem bringing in domain knowledge and all that and you know we've done some really

    6:59

    clever stuff to use GenAI as sort of the domain expert across many different

    7:05

    domains many different problems so it's a problem that was very vertical in

    7:10

    nature. We've been able to sort of you know make it uh make the solution horizontal which is kind of what you

    7:17

    know, this team did with DataRobot as well. Right. Right. Now, are you guys using one of

    7:24

    the foundational models at the at the core? Do you have your own kind of like secret sauce that's baked into your your

    7:30

    system, and you're using GenAI, like LLMs, on the side, or what,

    7:36

    what's the heart of it look like? Yeah, the answer is both. You know, we have our own IP

    7:44

    that allows us to generate features that are relevant uh to a given use case. So,

    7:49

    it's basically using semantics of the data. So understanding the semantics of the data and then using that to generate

    7:56

    features that are very specific and ultimately um give the model a lot more

    8:03

    uh power and um performance at the end of the day. uh but we also use Gen AI

    8:10

    and foundation models, like OpenAI and Anthropic Claude models, you know,

    8:16

    just to kind of um understand [clears throat] the data be able to explain at a semantic level what the

    8:23

    meaning of the data is, how the data relates to a given use case, and even

    8:30

    when we generate the features we're using uh LLMs to kind of guide the

    8:35

    system, guide the agent, to say, okay, well, which of these features are actually going to be meaningful and

    8:42

    relevant uh from an from a semantic or domain point of view

    8:48

    and then we do statistical evaluation of those features to understand okay you know based on the data and based on the

    8:53

    target what features actually are uh meaningful and work for the use case.
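
The pipeline described here — generating historical features that are point-in-time correct, then statistically screening them against the target — can be sketched in a few lines of pandas. Everything below (the event schema, the 30-day window, the correlation screen) is an illustrative stand-in, not FeatureByte's actual implementation:

```python
import pandas as pd

# Toy event data: each row is a customer transaction (hypothetical schema).
events = pd.DataFrame({
    "customer_id": [1, 1, 1, 2, 2],
    "ts": pd.to_datetime(["2024-01-05", "2024-01-20", "2024-02-10",
                          "2024-01-15", "2024-02-05"]),
    "amount": [100.0, 50.0, 75.0, 200.0, 20.0],
})

# Observation table: one row per (customer, cutoff date, label).
obs = pd.DataFrame({
    "customer_id": [1, 2],
    "cutoff": pd.to_datetime(["2024-02-01", "2024-02-01"]),
    "churned": [0, 1],
})

def point_in_time_feature(events, obs, window_days=30):
    """Total spend in the `window_days` strictly before each cutoff.
    Using only events with ts < cutoff is what prevents target leakage."""
    rows = []
    for _, o in obs.iterrows():
        mask = (
            (events["customer_id"] == o["customer_id"])
            & (events["ts"] < o["cutoff"])                          # no future data
            & (events["ts"] >= o["cutoff"] - pd.Timedelta(days=window_days))
        )
        rows.append(events.loc[mask, "amount"].sum())
    return pd.Series(rows, index=obs.index, name="spend_30d")

obs["spend_30d"] = point_in_time_feature(events, obs)

# Statistical screening: rank candidate features by |correlation| with the target.
candidates = ["spend_30d"]
scores = {c: abs(obs[c].corr(obs["churned"])) for c in candidates}
print(obs)
print(scores)
```

Note how the 2024-02-10 and 2024-02-05 transactions are excluded: they happen after the cutoff, so using them would leak the future into training.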

    9:00

    Gotcha. Yeah. So again, kind of how you would approach the problem as a data scientist, right? That's the workflow

    9:06

    we're mimicking. So you're out marketing to data science teams. We're marketing to data science teams.

    9:12

    We're also marketing to business executives who are clients of data

    9:17

    science teams because they are the ones who face the problem of you know I I need to build you know uh a whole array

    9:25

    of models to you know just get some kind of a

    9:30

    business outcome or objective and increasingly I mean you know this is something that we're seeing uh in the

    9:37

    early stages of which is if you're building some kind of an agentic workflow

    9:42

    Right? So let's say you you're building a retention agent or customer service

    9:48

    agent. At the end of the day, in order to make that agent more personalized, in

    9:54

    order to make it more, you know, smarter, um, as far as developing an

    10:00

    intuition about the business, a lot of that data is embedded in your historical

    10:06

    lakes, tables, and, you know, sitting in your data lakes and data warehouses, right? And you cannot just take an LLM

    10:14

    and and connect it to your data warehouse and go, well, here's 3 years worth of history. Just do whatever you

    10:20

    want with it. [laughter] I don't know if you've tried that. It doesn't go anywhere, right?

    10:27

    So, you need to extract a lot of meaning from that data in the form of

    10:32

    predictions, in the form of metrics, um, which inherently LLMs are not good

    10:38

    at. You know, that's not what they're designed to do; that's not what they're trained for. And so you need

    10:43

    almost a layer of these models, predictive models that serve as the

    10:49

    bridge or translation layer between historical data and um the agents and

    10:54

    and workflows that you're building. So we're seeing some really interesting use cases emerge out of that space.
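
The "translation layer" pattern described above — an agent consuming compact model outputs rather than three years of raw history — might look roughly like this. The tool wrapper, customer ids, scores, and threshold are all hypothetical stand-ins for a deployed predictive model:

```python
from dataclasses import dataclass

# Hypothetical stand-in for a deployed churn model; in practice this would
# call a real model served on top of your feature pipelines.
def predict_churn(customer_id: str) -> float:
    scores = {"c-101": 0.82, "c-202": 0.11}   # made-up scores
    return scores.get(customer_id, 0.5)

@dataclass
class Tool:
    name: str
    description: str
    fn: callable

# The translation layer: the agent never sees raw warehouse tables,
# only compact, model-derived signals it can reason over.
churn_tool = Tool(
    name="churn_risk",
    description="Return the churn probability (0-1) for a customer id.",
    fn=predict_churn,
)

def retention_agent(customer_id: str) -> str:
    """Sketch of a retention agent consulting the predictive tool."""
    risk = churn_tool.fn(customer_id)
    if risk > 0.7:
        return f"{customer_id}: high churn risk ({risk:.0%}), offer retention discount"
    return f"{customer_id}: low churn risk ({risk:.0%}), no action"

print(retention_agent("c-101"))
print(retention_agent("c-202"))
```
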

    11:00

    Do you guys maintain like a feature store once you've done some of that feature engineering type work? Yeah, we don't maintain a feature store,

    11:07

    but we integrate with a feature store. So for example, Databricks has a feature store. We integrate with that if,

    11:13

    you know, a client is using Databricks. Or we've integrated with Feast, which is an open source feature

    11:20

    store. And yeah, we don't want to reinvent the whole wheel.

    11:28

    no sense rebuilding all those components. Yes, exactly. And you know feature stores are part of the infrastructure.

    11:33

    So it just makes sense for us to leverage components that already exist. uh so for example even the compute layer

    11:41

    we do all of our computation in the data layer itself, so, you know, Databricks,

    11:48

    Snowflake, etc. We just push all the feature computation, all the

    11:54

    anything that's data related, down into the data platform, data warehouse, data lake,

    12:02

    because that's those capabilities are already built you know these uh these uh

    12:07

    companies have done an amazing job of developing the scale and the ability to

    12:12

    deal with massive volumes of data. So we're just leveraging all the tools available. Now, have you found, I'm

    12:19

    maybe, side note, when you're interacting with these feature stores, have you actually found people using

    12:25

    them because I I always hear people talking about them but then yeah like nobody actually uses them. Yeah,

    12:31

    really in you know if you if you think about where feature stores are the most

    12:37

    useful are if you have some real time use cases where you need to do real time

    12:43

    serving or real time uh creation or computation of the the use cases and serving of the use cases uh or of the

    12:50

    features for uh specific use case fraud is a very good example right um so

    12:56

    that's where we see uh feature stores being used

    13:01

    the all of the capabilities that exist in the real time stuff. Yeah. In the real time stuff in real

    13:06

    time serving or real time computational features but doing anything real time anyway, right?

    13:12

    Yeah. I mean, you know, there are if you think about like financial services, fraud, etc. That's there are definitely

    13:18

    tons of real-time use cases. But in 90, maybe even 95 to 98%, of

    13:26

    the cases, you know, the feature store, the way it was

    13:31

    designed, is kind of overkill. Yeah. But I mean, it's sort of a great idea to be able to have like

    13:39

    this list of useful variables and track who's been using which variables in which applications and then then you can

    13:45

    sort them by the most valuable and you know maintain quality and blah blah blah blah. Nobody's doing this though.

    13:51

    Yeah, not too many folks are doing that. You can think of feature stores as almost like a specialized

    13:57

    database, right, for I mean it's it's like a bunch of tables that are dedicated to just

    14:04

    maintaining your features and and that's really helpful especially if you have a

    14:09

    large team, you know, that's uh instead of having to reinvent the the wheel

    14:15

    every single time, create the same features by, you know, different folks uh or different parts of the

    14:22

    organization, you can reuse the same features and capabilities. At

    14:27

    the end of the day um you know that you can also accomplish that by

    14:32

    yeah, with just a bunch of tables, right? [laughter]
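
The "bunch of tables" view can be made concrete: a keyed table per entity plus a small registry of definitions, so different teams reuse the same features instead of re-deriving them. A minimal sketch with made-up feature names:

```python
import pandas as pd

# A "feature store" in its simplest form: one table per entity type,
# keyed by the entity id, plus a registry of definitions for discovery.
customer_features = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "spend_30d": [150.0, 200.0, 0.0],
    "n_orders_90d": [4, 2, 0],
}).set_index("customer_id")

registry = {
    "spend_30d": "Total spend in the 30 days before the observation date",
    "n_orders_90d": "Order count in the trailing 90 days",
}

def get_features(entity_ids, feature_names):
    """Shared retrieval path: teams ask for features by name instead of
    recomputing them, which is the reuse argument for a feature store."""
    unknown = set(feature_names) - set(registry)
    if unknown:
        raise KeyError(f"unregistered features: {unknown}")
    return customer_features.loc[entity_ids, feature_names]

train_X = get_features([1, 3], ["spend_30d", "n_orders_90d"])
print(train_X)
```
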

    14:38

    you know we're we're also seeing like uh you know some of the the same customers who've who've traditionally done

    14:45

    completely batch-oriented use cases moving towards these real-time

    14:52

    use cases. So it's an easy transition for them to go from, you know, offline

    14:58

    features to online features that can be served in real time, but that's not the norm.
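
The offline-to-online transition mentioned here often amounts to materializing the same batch-computed feature into a key-value store for low-latency reads. A toy sketch, with a plain dict standing in for something like Redis or DynamoDB:

```python
import pandas as pd

# Offline: features computed in batch, e.g. a nightly job over the warehouse.
offline = pd.DataFrame({
    "customer_id": [1, 2],
    "spend_30d": [150.0, 200.0],
})

# "Materialize" to an online store: a dict stands in for the key-value layer
# that would actually back real-time serving.
online_store = {
    row.customer_id: {"spend_30d": row.spend_30d}
    for row in offline.itertuples()
}

def serve(customer_id):
    """Low-latency read path used at prediction time (no warehouse scan)."""
    return online_store[customer_id]

print(serve(1))
```

The feature definition stays the same; only the storage and read path change between the batch and real-time worlds.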

    15:05

    Yeah, it it seems to me like data scientists are really good at inventing stuff that people don't actually use.

    15:10

    Like there's a lot there's a lot of buzz out there in the space. Uh I think more

    15:15

    than other kind of areas. Do you find that to be true? Like there's a lot... In the feature store space? No, just

    15:23

    in general like you had feature stores and then I don't know part of me thinks that this whole experiment tracking

    15:28

    thing is kind of like, who's actually doing that? You know, and like the whole Weights & Biases thing, like,

    15:35

    I'm like are these things that are real that people are actually using I don't know it seems like there's a lot of that

    15:41

    stuff in the in the data science space. Yeah, I think some of this, you know, it's if if you think about like uh some

    15:48

    of the tools that have emerged and I I saw this with big data as an example,

    15:54

    right? When um everyone was talking about big data, you had so many different tools available. If you

    16:00

    remember ZooKeeper from the Hadoop and Cloudera days,

    16:06

    throwback. Yeah. They're they're like, you know, all these different tools that had

    16:12

    animal names associated with them that were just emerging, mushrooming out of nowhere, right? And then a few of them

    16:18

    survived, right? That that are actually really useful. It's, you know, that's

    16:25

    that's kind of the way that um a collective kind of emergence of

    16:31

    these tools occurs from from my perspective. You know, even developer tools, right? There's just so many of

    16:37

    them that come and go and then a a really small handful of them stick and they're the ones that,

    16:44

    you know, kind of take a life of their own that win. Yeah. Yeah. That's totally. So I I mean I'm interested in it because

    16:51

    Zerve is a development environment, right? So we're also working with data analysts and people,

    16:57

    anybody that's interacting with code or with data using code, right? That that's who we sort of work with. Yeah.

    17:03

    And it seems like that's a I guess it's sort of a hard group to pin down to

    17:10

    define really because there's so many different kinds of people that are working with data.

    17:15

    Right. Right. So like, there's a

    17:21

    person at Zerve who's focused on operations and she always asks who's our ideal customer?

    17:26

    Yeah. And we're like, okay, good question. Anyone that works with data.

    17:32

    Yeah. And [laughter] so, but it's hard to like pin it down because you don't want to say data

    17:39

    scientists because there's not very, you know, there's not very many like data scientists that are actually doing modeling all the time and that's not

    17:45

    even the majority of work that people are doing with data. Like a lot of times people are just drawing a chart because

    17:50

    they need to answer a question or they need to fill up a slide for a meeting or something like that. So, there's such a

    17:55

    huge diversity of data work. Have you guys encountered that in the in your sort of go to market journey?

    18:01

    Yeah. So our our focus is very narrow, right? Um we're focused very much on

    18:09

    predictive AI and building more classical machine learning models. Yeah. Okay. And um the the way I see the

    18:18

    world now, Greg, is kind of how you described it: the data science world and

    18:25

    the world of data scientists is changing and evolving really significantly. Right? So if you

    18:33

    encounter you know 10 data scientists eight of them are thinking about okay do I become an AI engineer or do I you know

    18:41

    just focus exclusively on prompting and LLMs or doing something besides what I

    18:48

    was sort of trained to do or what I've been doing for the last 5 to 10 years right um but that is what's giving us a

    18:57

    lot of opportunity which is you know you we go to clients that have a big need to

    19:03

    go and refresh their models to deploy a whole lot more models in production and

    19:10

    without some kind of automation there's no way in hell they're going to be able to achieve that right I was uh I was

    19:17

    talking to a a mutual friend of ours just yesterday Ben Miller from from

    19:22

    Foran um and he was talking about the fact that you know the bulk of his time

    19:29

    just gets consumed building and maintaining features and feature pipelines, right? So, the automation

    19:37

    will allow him to do 5x basically what he was able to do. And you know, as as

    19:43

    we were talking about similar topics, he was like, look, any data scientist or

    19:48

    any data science team that's not thinking about automation is just going to be left behind. This is like, you

    19:55

    know, just like coding, right? If you're not using one of the coding assistants

    20:01

    or some kind of LLMs to help you code better faster. Yep. Y

    20:06

    it's it's not an option anymore. I mean, if you're a super coder that, you know,

    20:12

    can can be very productive even outside of using any of these coding agents or

    20:18

    tools, good for you. But you're in the top 1% of coders out there.

    20:23

    They're so good. I mean, you're completely right. You'd be insane not to use them. I just did,

    20:30

    like, a live coding event, like a webinar where everybody was kind of coding together, and we were

    20:36

    building an optimization for electric vehicle charging stations in a synthetic city, you know, it was just,

    20:43

    that was that was a problem we were working on and we we divided the problem up into

    20:49

    steps. So it was me, and one of the co-founders observed as we were doing it. Yeah. And so I divided the problem up into six

    20:55

    different steps and each one of those had a prompt. And so I was like, "Okay, do do step one, step two, step three.

    21:02

    Six steps." Remarkable. It worked. It was great. And Jason, my co-founder, he one-shotted

    21:08

    it. He put the entire thing in one prompt. And the agent just boom spat out all the

    21:14

    code you needed, everything, just done. Boom. In one prompt. Like, the level of quality coming out of these

    21:21

    models from, like, GPT-3.5 to today is

    21:26

    remarkable. It's crazy. It's remarkable. Yes. Yeah. And it's only going to improve. We all know that. And you know, similarly,

    21:34

    these tools, like, you know, what we're building at FeatureByte, I mean, the results that we've been able to show:

    21:41

    a process that takes 2 to 3 months to go off and build a good

    21:48

    production-ready ML model, we're able to get it down to 2 to 3 days. But the more

    21:55

    interesting part of of this whole thing is we're able to show an improvement in

    22:00

    the overall model performance by just building much better features cuz at the end of the day features are the ones

    22:06

    that are driving the performance of these ML models. And you know, we've seen anywhere from, you know, just three, four,

    22:12

    to all the way up to 18% improvement in performance, which is just mind-blowing, right? Can be huge. Yeah, absolutely.

    22:18

    and yeah so when you have these automated tools that give you both, you

    22:24

    know, the two dimensions of value that you care about, which is, you know, how fast can I have my teams go off and

    22:30

    execute on on these projects and and build these models and what's the quality of those models? I mean that's

    22:38

    we you know it's it's that's there's no argument to say okay well no no I I like

    22:44

    doing things my my own way writing a bunch of code and maybe instead of 3

    22:50

    months it's going to take two months now but you know it's uh yeah sometimes it's a

    22:57

    it's a mind-blowing exercise to try to justify you know why automation is good

    23:02

    And now you guys just kicked off a big contest, like a month or so ago. Yeah. Yeah. Yeah.

    23:08

    Do you want to talk about that? Absolutely. So, it is based on, you know, the kinds of results that we're seeing. We're like, you know, and

    23:16

    even with, you know, going back to Ben, when we did a POC with him, and you

    23:23

    know, I consider him, like, one of the top 1% of data scientists.

    23:29

    Yeah, he's great. At least at DataRobot, right? And they had a strong collection of some of

    23:36

    the really good data scientists out there and he was one of the best and we

    23:41

    were able to show a 10-12% improvement on the models that he had built within a

    23:47

    matter of a couple of days right and so I'm looking at that and going

    23:52

    okay, well, if you can do that with Ben's models, and here and there, and you know,

    23:59

    and you know this This is this is something that we,

    24:05

    you know, we'll just put a challenge out there and and just ask any data

    24:10

    scientist that wants to test their production models against our data science agent. We'll

    24:17

    just do an open competition because you think about it, Greg, you know, once you once you're behind a firewall, you've

    24:25

    got, you know, your proprietary data that you're you're using to build a production model, there's no way to

    24:31

    benchmark how good that model is, right? It's whatever your team has done

    24:36

    or whatever, you know, [laughter] yeah, some some somebody or some team has has built and you're like, okay, I I

    24:44

    guess it's it's as good as it gets, right? And if that model is important for you and it's going to get you know

    24:50

    it's typically it's not the only model you've got a suite of models and if those models are important for you you

    24:58

    at least need some way of benchmarking it. So we said, hey, look, this is a competition that's for business. It's

    25:06

    not like a Kaggle competition because you have Kaggle out there you know you can just um create some uh curated

    25:13

    engineered data sets. Nothing like that. This is real world. You bring your data, your model and test

    25:21

    it against feature bite uh model that feature bite builds and see how good or

    25:27

    you know, not so good it is. And if you can outperform the

    25:34

    FeatureByte model, we'll actually give you a cash prize of $10,000

    25:39

    for the winner. And so we're running that competition right now. You know, I encourage every data science

    25:47

    team every organization even you know business folks who are interested in knowing okay well how good their data

    25:55

    science team is. Mhm. Test it out. [laughter]

    26:00

    Now I'm curious about how you guys go to market, because

    26:06

    Zerve is a PLG play. So the way we talk to data scientists is like, sign up

    26:11

    for free, if you need more credits, you can buy them. That sort of thing. Uh that's our approach, but it sounds like you're you're more kind of B2B type.

    26:19

    Yeah, at this point we're very much B2B. We want to open it up,

    26:24

    um you know for more of a PLG model. I think the the the

    26:30

    based on kind of how FeatureByte works and where it adds value, we need to

    26:36

    connect into the data warehouse or data lake. We need access to the raw data in order

    26:43

    to do its thing, right? In order for for the agent to do its thing. And typically

    26:48

    that requires you know some somebody

    26:53

    at the executive level to say yes you know I I I want to bring this in and I want to you know I believe in the value

    27:00

    I at least want to see what it can do. So we can connect from the outside but still we need access to the data

    27:08

    repository, or we get installed inside, which again obviously requires InfoSec and IT and

    27:16

    all that. So yeah, we tend to go to the folks who are running

    27:23

    these data science teams uh who care about productivity and ultimately the

    27:29

    the performance of the models to say look we can help you on on both of those metrics.

    27:36

    And how so as these large language models get better and better how is that

    27:42

    going to impact the way that feature bite works? because you know they're they're good now but they're still

    27:49

    it's still not a set-it-and-forget-it kind of thing; you still need a lot of supervision. Yeah. I think, so again,

    27:55

    when it comes to tabular data in particular, right? Um so dealing with the large volumes of historical tabular

    28:04

    data or data that's primarily tabular, you know, it can have text, it can have images, etc. But uh the data is

    28:10

    primarily tabular. Um the classical

    28:15

    algorithms and models still outperform even the you know the the deep learning

    28:21

    ones, right? And um there are you know lots of research projects to try to

    28:27

    build tabular foundation models, in some ways, but they only work on a single data set. There are some limitations on

    28:34

    accuracy or limitations on scalability etc. Uh and even the accuracy it's you

    28:39

    know, it works well in certain situations and not so well in others.

    28:44

    The way we see it is, you know, as these tabular foundation models develop, we'll

    28:50

    just use them pretty much like how we utilize something like XGBoost or

    28:55

    LightGBM to build these models. What is a tabular foundation model? I've never heard of that before.

    29:00

    You'll have to just go and look it up, you know. So, it's the equivalent of a foundation model built

    29:08

    for tabular data in particular. So um you know it's those are research

    29:14

    projects right now. There are some open source libraries and open source models, but none of them

    29:22

    have you know crossed over or you know delivered the scalability performance

    29:28

    etc to you know become mainstream just yet. Right. Okay. And so yeah we you know as far as

    29:36

    we are concerned we see agents and an agentic framework as a much better way

    29:43

    of approaching data science and being able to model out data science as opposed to um you know just trying to

    29:51

    build one single model, or trying to somehow, sort of, just twist

    29:58

    and and uh turn the LLMs to work on tabular data.
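A side note for readers: the adoption path Razi sketches, plugging a tabular foundation model in "pretty much like how we utilize something like XGBoost or LightGBM," comes down to a shared fit/predict interface over rows and columns. A minimal illustration with a toy single-stump classifier standing in for the real libraries (all data and class names here are invented, not any real library's API):

```python
# Toy decision stump exposing the fit/predict shape shared by XGBoost,
# LightGBM, and (prospectively) tabular foundation models.
# An illustrative stand-in only -- not a real library's implementation.
class StumpClassifier:
    def fit(self, X, y):
        """Pick the single feature/threshold split with the fewest errors."""
        best_err = len(y) + 1
        for j in range(len(X[0])):                      # each column
            for thr in sorted({row[j] for row in X}):   # each candidate split
                left = [yi for row, yi in zip(X, y) if row[j] <= thr]
                right = [yi for row, yi in zip(X, y) if row[j] > thr]
                if not left or not right:
                    continue
                l_lab = max(set(left), key=left.count)   # majority vote per side
                r_lab = max(set(right), key=right.count)
                err = sum(v != l_lab for v in left) + sum(v != r_lab for v in right)
                if err < best_err:
                    best_err = err
                    self.feature, self.threshold = j, thr
                    self.left_label, self.right_label = l_lab, r_lab
        return self

    def predict(self, X):
        return [self.left_label if row[self.feature] <= self.threshold
                else self.right_label for row in X]

# Tabular rows: [age, income]; labels are made-up churn flags.
X = [[25, 40_000], [32, 52_000], [47, 61_000], [51, 90_000]]
y = [0, 0, 1, 1]
model = StumpClassifier().fit(X, y)
print(model.predict([[29, 45_000], [55, 80_000]]))  # → [0, 1]
```

Because the interface is shared, swapping in a stronger model is a one-line change, which is the point about adopting tabular foundation models as they mature.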

    30:04

    Got it. Yeah. No, that's fascinating. I'm not aware of those tabular foundation models. Do you have to train them?

    30:10

    Do you Yeah, you have to train them on different data sets and some of them are

    30:16

    pre-trained models. So, they're trained on tabular data that's, you know, it is

    30:21

    publicly available. One of the biggest challenges with tabular data is most of the data is proprietary,

    30:29

    right? It's sitting behind firewalls. And, um, there's no inherent kind of context

    30:35

    and structure, which is, you know, sort of, in

    30:43

    some ways, ironic, because it's called structured data. But at the end of the day, you know, 2, 3, 4 in one

    30:51

    table and one column within a table means something completely different than 2, 3, 4 in a different one, right? This same

    30:57

    number has a completely different, um, set of connotations

    31:04

    from you know one column to another, one table to another, one business to another, one industry to another. Right.
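The point about context is easy to demonstrate: the raw value alone tells a model nothing, because the meaning lives in the schema around it. A minimal sketch (tables, columns, and descriptions are all invented for illustration):

```python
# The same raw value denotes different things in different columns --
# the schema context, not the number itself, carries the meaning.
SCHEMA = {
    ("customers", "num_children"): "a count of dependents",
    ("customers", "support_tier"): "an ordinal service level",
    ("loans", "risk_grade"): "a categorical credit rating",
}

def meaning(table: str, column: str, value) -> str:
    """Resolve what a raw cell value denotes, given its schema context."""
    return f"{value} in {table}.{column} is {SCHEMA[(table, column)]}"

print(meaning("customers", "num_children", 3))
print(meaning("loans", "risk_grade", 3))
```

An LLM pretrained on raw cell values never sees this mapping, which is one reason free-text pretraining transfers poorly to proprietary tables.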

    31:10

    Sure. Yeah. And so it's not like you can just have a series of numbers and, you know,

    31:16

    train an LLM on it and get some meaningful output out of it. That's one of the biggest limitations out there

    31:21

    with tabular data. So yeah. Yeah. Wild. All right. What's next for you

    31:27

    guys? You got, uh, some big stuff on the horizon as far as roadmap goes. Uh, yeah, you know, it's building

    31:35

    out a bunch of different types of use cases. So we're just in the process of adding uplift models,

    31:43

    which we're starting to see becoming more and more mainstream. Uh so it's uh causal.

    31:49

    What's an uplift model? So it's being able to identify, you know, just

    31:55

    the causes, the features, for not just

    32:00

    correlations, but, uh, using these features for causal

    32:05

    modeling to understand yes um you know what actually causes a

    32:12

    certain outcome and a certain prediction which can be very useful for all kinds of use cases from modeling to risk

    32:19

    analysis to fraud, etc. Right? So, you know, we're doing that, um, you know,

    32:26

    just uh scale and um different types of use cases is you know something that we

    32:33

    continue to build out. So yeah and then you know just going off and uh raising

    32:39

    more money and scaling our go-to-market. So that's the next big thing for us.
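For readers unfamiliar with the term: one common way to estimate uplift is the two-model ("T-learner") approach, comparing outcome rates between a treated and a control group, segment by segment. This is a generic sketch, not necessarily the guest's product's method, and all data below is invented:

```python
# Two-model ("T-learner") uplift sketch: estimate how much a treatment
# (say, a discount offer) *changes* the conversion rate per segment,
# rather than merely predicting conversion. Data is invented.
from collections import defaultdict

def rate_by_segment(rows):
    """Conversion rate per customer segment for one arm of the experiment."""
    hits, totals = defaultdict(int), defaultdict(int)
    for segment, converted in rows:
        totals[segment] += 1
        hits[segment] += converted
    return {s: hits[s] / totals[s] for s in totals}

def uplift(treated, control):
    """Per-segment uplift: P(convert | treated) - P(convert | control)."""
    t, c = rate_by_segment(treated), rate_by_segment(control)
    return {s: round(t[s] - c[s], 3) for s in t if s in c}

treated = [("new", 1), ("new", 1), ("new", 0), ("loyal", 1), ("loyal", 1)]
control = [("new", 0), ("new", 1), ("new", 0), ("loyal", 1), ("loyal", 1)]
print(uplift(treated, control))  # → {'new': 0.333, 'loyal': 0.0}
```

Here the offer lifts conversion for "new" customers but does nothing for "loyal" ones, who convert regardless, which is exactly the causal-versus-correlational distinction uplift models target.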

    32:47

    Yeah. I don't want to talk about fundraising right now. [laughter] That's the fun part, right?

    32:52

    Oh, yeah. That's Yeah. Sign me up for more of that. All right. Well, hey, last question. Uh

    32:59

    talk to us about some, uh, some non-business related uses that you've got. Are you a ChatGPT guy? Do you

    33:07

    use Claude? What's your model of choice? I usually, I mean personally, I use

    33:12

    ChatGPT. Okay. Um, and what's the last thing you asked it that was non-business related? Uh, well, I do a lot with

    33:22

    ChatGPT on a regular basis. Uh, so I'm a regular user of it, you know, everything

    33:28

    from you know just uh analyzing resumes to helping you know brainstorm

    33:36

    ideas for uh blogs and LinkedIn posts to helping polish up you know whatever

    33:43

    messaging, business related. Yeah. Most of it is business. Come on, man. You know, you and I

    33:49

    are startup founders. Is there anything else to think about and do besides the business?

    33:54

    Uh I use it for cooking all the time. GPT is really good at recipes.

    33:59

    Uh, okay. So that's a good one. Um, I just did a, um,

    34:06

    uh, ChatGPT has a thing that will send reminders now. Like, you can send a

    34:11

    notification to the app on your phone. Okay. And so, like, if I've got something going during the day, I could

    34:17

    say, "Hey, every 30 minutes, send me a notification reminding me about this or something like that." So interesting.

    34:23

    That's been really useful. Okay. You know, I can set that up on my

    34:29

    calendar as well. [laughter] All right. All right. Details.

    34:36

    Well, hey, Razi, [laughter] thanks so much for taking the time. This was really fun to catch up. Great to

    34:43

    catch up as always, Greg. It's good to see you. All the best.

    34:49

    Wish you the best, too. Thank you very much. Excited to see where you guys end up. Likewise. Likewise.

    34:54

    All right. Cool. All right, man. Thanks again. We'll see you later. [music]

    35:04

    [music]
