Agentic Coding Live: Three Data Projects, One Hour, Infinite Chaos
November 11, 2025

Join us live as we take on an ambitious, slightly chaotic challenge: building three full data projects – Who Owns My Neighborhood, Corporate Landlords of Salt Lake County, and Tax Fairness Explorer – simultaneously, in one hour, using agentic coding. We're not doing this the polished way; we're doing it the real way. By experimenting, asking for help, and letting our AI agents do as much of the heavy lifting as possible.
Win, lose, or crash halfway through, we'll use Zerve's agentic canvas to explore, clean, model, and visualize Salt Lake County's parcel dataset…all at once. You'll see us juggle workflows, parallelize tasks, and let AI assistants write, refactor, and fix our code on the fly.
By the end of the hour, we'll have at least a first draft of each project, or a glorious mess to learn from. Either way, it's going to be fast, raw, and completely transparent: a live look at what happens when human intuition meets agentic automation in real-time data science.
Speakers: Greg Michaelson and David “Gonzo” González
0:56
Good morning everybody. Good morning Gonzo. Good morning Greg. Good to see you man.
1:02
Well we're gonna we're gonna have a lot of fun today. Um we're we've got a project uh three projects to work on
1:07
this topic, or the title of this live stream, is Agentic Coding Live: Three Data Projects, One Hour, Infinite Chaos.
1:15
Uh so we're uh we're pretty pumped to do it. I'll get into the details of what we're going to be working on here shortly. Uh but uh first introduction.
1:22
So I'm Greg Michaelson. I'm one of the co-founders here at Zerve. Uh and I'm joined by David Gonzalez, my good
1:28
friend Gonzo. Uh who is just an outrageously good fella. Fella. I said
1:34
fella. And uh I'm super pumped to have him with us. Say hi, Gonzo. Tell us a bit about
1:41
yourself. Yeah, Greg. Uh yeah, so happy to be here. Yeah, we've we've done some really
1:47
fun projects together. Um we've done some really hard projects together. Uh yeah, we've done a lot of work at like
1:53
3:00 am, which is always the best time to be working, right? Yeah, that's a fact.
1:59
So, yeah. Yeah, I'm super excited to be here. Excited about what you guys are building at Zerve. Nice. Well, we'll get into it and
2:06
hopefully everything will go well. This is definitely one of those tightrope-with-no-net type situations. But yeah, I
2:11
think some of some of my funnest times were working on uh pandemic problems uh in the middle of the night uh with you
2:18
at at Data Robot trying to figure out what in the world was going on and and how we could help and and all that stuff. So, we'll get into some uh some
2:26
interesting problems here today as well. Nothing pandemic related though, fortunately.
2:31
Yeah. Yeah, it'll be good. So, um pop quiz, why don't you tell us a
2:37
little bit about Zerve? We'll flip the tables. You don't work here. Yeah. So Zerve is a
2:45
um it's a solution for doing data science, right? And doing um we can call
2:52
it like loosely AI. Uh I I think the way to think about Zerve is what if uh a
3:00
data scientist uh decided to build something that would actually deliver
3:06
production code that others could engage with. Um, and I feel like you guys have done a
3:12
bang-up job. The the the solutions that kind of are in this arena, uh, have
3:17
typically skewed very notebook-centric, uh, very much running, you know, even if
3:23
it's not running, uh, strictly on your laptop. It kind of feels like it does. Um, and, uh, you know, I I'll try and
3:31
keep it PG on this, uh, you know, on this live streaming. Uh but uh the
3:37
typical reaction to data scientists work from people who aren't data scientists
3:42
is what the hell am I looking at? And I think Zerve is trying diligently
3:48
to uh be able to skirt that question and get to something a little more uh
3:54
approachable or quite a bit more approachable. Now your background is in engineering. Yeah. Or data science. Oh, I I wish I my
4:02
background like I I've become a a pretty workable engineer, but you know I I'm a
4:07
I'm a tragic case, Nick. Uh Greg, my my background is primarily stats. So
4:13
Oh, nice. Okay. A fellow stats guy. Yeah. So I spent a lot of time in uh got
4:19
all the way up to 800 level stats courses. Um didn't graduate with stats. I I did actually evolutionary ecology of
4:25
all things. Wow. Okay. I don't know what that means. What is evolutionary ecology? That's a topic
4:31
for another day. That's another day. Yeah, no one cares about that right now. All right, sharing my screen here.
4:39
Uh there it is. It's popped up. Uh we're going to jump into Zerve here. Uh we I've put together a data set. It's
4:45
interesting data. It's uh parcel data from Salt Lake City, or Salt Lake County.
4:51
Uh so Salt Lake City is a big city in Utah and Salt Lake County is the county that it's in. And so um I put together
5:00
this data set and we're going to try and solve three problems uh in this data set
5:06
today. Uh so let me see if I can explain what they are. So the first one is called who owns my neighborhood and
5:13
we're going to try and just explore this data set, filter it, aggregate it, visualize these property records. This
5:19
data set is uh at the uh parcel level. So like the the plot of land that a
5:25
house or or a business or a building might be on. Uh and so the first project is like who
5:31
owns stuff like you know what does the distribution of ownership look like? Got to kind of try and explore kind of an
5:37
EDA type uh project. Um the second one is called corporate landlords of Salt Lake County. So here we're trying to
5:43
look at are there big institutions or organizations that own lots of property
5:49
in uh in Salt Lake County and that we're going to try and do some like cleanup
5:54
and normalization of the owner name uh in and try and and get that working
5:59
good. And then the third one uh is called a tax fairness explorer. And
6:04
we're going to try and get a sense for uh who's paying higher taxes and how
6:10
does that vary, you know, based on like
6:16
where you are in the county and stuff like that. So, uh it's it's interesting data. Um we've intentionally not looked
6:23
at it before. So, what I've got here is three tabs that we're going to try and solve all three of those problems in.
6:28
The only work that I've done here is upload that data set. So, we've already uploaded that CSV file into Zerve.
6:35
And uh as soon as Gonzo gets in, clicks into these projects, we'll be able to see him in there. We'll see his mouse
6:41
moving around and and so on. But just to get started, I'm going to just talk to our agent a bit and say, uh, can you
6:48
load the CSV file and explore it and describe what's in it, please? just some
6:58
basic EDA. Now, uh Gonzo, have you gotten to play
7:03
around in Zerve very much or is this kind of a new uh This is new. Yeah, I I I went to a talk
7:09
with you guys. Um I did log in one time. Yeah, but this is
7:15
brand new. I I figured Yeah, we'll we'll we'll give her a run and it'll be it'll work out all right.
7:23
All right. So the way the agent works over here uh is that it's going to present a plan. So it's load it's
7:30
telling us what the plan is. I'm just going to say approve. We don't have time to go into the details, but we could
7:35
iterate with the the agent on the plan. Uh and so um the thing about you use
7:41
chat GPT fairly extensively. Yes, Gonzo. I do. Yeah. A lot of API work, but I do.
7:47
Yeah, I work even in the front end a lot. Okay, awesome. Um and any other uh
7:52
agentic or um, yeah, uh Anthropic's Claude? Oh, you use Claude Code?
7:58
I do use Claude Code. Um and then I do run not production stuff but I do dabble
8:06
with a lot of uh locally running things trying to get Yeah. trying to get some
8:12
like Gemma to work locally reasonably well for some projects. Yeah. Gotcha. All right. And what's your sense
8:18
of how this all this agentic coding stuff is impacting uh the way people do
8:23
their work in the data science space say? Well, I yeah I think it it we're it's kind of hitting a different strata,
8:29
right? So at the low level I'm seeing it uh create quite a bit of uh uncertainty
8:36
around junior talent. Um there's just a lot of froth in the system right now and
8:42
a lot of junior talent is you know ostensibly trained to go write code for a living and is finding themselves a lot
8:49
of them are finding themselves faced with do I want to be a product manager do I want to be some kind of you know
8:56
like working on product or working on analysis but not necessarily writing code because I think a lot of
9:02
organizations are are pausing and seeing well how much productivity can we get
9:07
out of AI with our current existing talent base. Um, and kind of the mid tier when we talk about uh, yeah, I
9:14
don't know, inexperienced engineer. Um, there's kind of the whole rash of
9:20
YouTube uh, and Tik Tok vibe coding, which
9:28
streams really well, looks really good on screen. Uh, and if you
9:33
follow that long enough or if you try it yourself uh you will invariably find yourself uh yelling and screaming at the
9:40
screen because it's it's hard to quote vibe code meaning
9:45
like you know throw your hat over the wall and hope you'll get an a production application on the other side that
9:51
that's pretty unreasonable do you like that term vibe code
9:58
like, there's coding and there's vibe. Yeah, I I mean I do like the term vibe
10:04
code because I think it aptly describes a way of doing things. And I'm sorry, my
10:09
screen's freezing, but I think you guys can still hear me. Um, yeah. So, I I think it's an apt description for not
10:16
being very thoughtfully engaged with what you're doing and hoping that magic works, right? It's a vibe, man.
10:23
Yeah. No, I definitely think there's it's such the first time I heard that that phrase, I thought, "Wow, that is
10:30
that is annoying." Yeah, it's And it is annoying. It's annoying to watch. It's annoying to do.
10:36
Um and it's annoying if your teammates are doing it and you're trying to get actual productivity out of what you're
10:41
doing. So I I do think that most people who have figured out a real productivity boost who've actually delivered anything
10:48
uh have left vibe coding in the past and are working very much at the agentic
10:54
level um either directly or with something that compartmentalizes and and
10:59
and discretizes a problem into you know finite steps and then focuses on
11:04
actually the the steps working not trying to just cross your fingers and pray that you know one prompt will get a
11:11
solution, right? Uh so what's the difference? I mean these terms are not very well uh
11:17
well defined in terms well so I'm looking at kind of what you're doing here and so um kind of the
11:23
rudiments of my path from going from like well
11:29
yeah I' I'd say about a year yeah it's been a little over a year a little over a year ago I think my engagement with
11:35
LLMs as a coding assistant. Before that, I would say that I got
11:42
and still to this day get a tremendous productivity boost from autocomplete, and autocomplete seems to
11:48
get better and better and better with each iteration and truly like today autocomplete is great um it writes more
11:55
lines, more cleanly. Um so I think autocomplete's a fantastic thing um and
12:01
it's kind of hard to tell what's behind the scenes on an autocomplete but it you know it is getting smarter and so
12:07
obviously there's a lot of uh a lot of systems where autocomplete is tied to an LLM Um, I think, you know, if you're
12:15
thinking about it almost like a a video game, that's like I would call that first person shooter versus like I'm
12:21
right here. You can see the gun. I'm I'm doing the thing, right? Um, I think what
12:27
people really want is like they actually just want to watch somebody else do it. They want streaming, right? They just
12:32
want to want they want to watch a a a Twitch stream of AI building software
12:39
for them, right? That's what they really want. And that's really hard to do. It's doable, but for me, the process has gone
12:45
from essentially replacing Stack Overflow with
12:50
like an LLM. Like that was a lot of my path was just like, yeah, instead of going to Stack Overflow and pouring
12:57
through a bunch of answers, sorry Stack Overflow, I do love you and I think you're a wonderful resource. Um, but
13:02
it's just a lot easier to go ask chat GPT or Claude and say, "Hey, how do I write this bash script or how do I how
13:09
do I do this thing or how do I do this, you know, flip of a data set?" Um, yeah,
13:14
you know, in in pandas or what have you. Um, currently I I'm pretty close to kind of
13:23
watching something build for me, but it's it's required a lot of work. Like I
13:28
have a lot of infrastructure. I have my kind of my own agentic wiring of I use I'm currently using a
13:36
lot of uh GPT-5 and Codex. I do think Anthropic's
13:43
Claude Code is a phenomenal solution. Um and I think a lot of people
13:49
are getting a lot of productivity out of Claude. Gotcha. All right. Well, the agent has done a fair bit of stuff here. Vibe coding is
13:55
definitely a bit more passive, and we've been a little passive here as we've been sort of chatting. We have. Right. Yeah. But it's
14:02
Yeah. It's breaking it into smaller problems. Yeah. Yeah. So like you know we're doing some basic data quality stuff. So uh it looks
14:08
like the street dir, maybe direction, I guess it runs north/south/east
14:15
or west, and street type. Those tend to be fairly missing, but everything else is relatively complete. So these are the
14:21
top 10 columns in terms of missing value percentages. Um, looks like we've got no
14:29
duplicate rows. We've done some kind of basic data quality checks. This looks like unique values for categorical
14:35
columns. Um so street direction there's only four values. So that probably is
14:40
north/south/east/west, though it's mostly missing. Um only one value for tax class
14:46
three. So, you know, some tax class one has 22 distinct values. So, it looks
14:51
like that's a nested structure in terms of class tax classifications maybe. Um,
14:57
we've got uh numeric ranges, so high and low values. They didn't print out very pretty here, but we can go we could go
15:04
and look at that in a bit. Uh, looking at is summarizing data issues. Let's see.
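The checks the agent ran here, missing-value percentages, duplicate rows, and distinct values per categorical column, boil down to a few pandas one-liners. A minimal sketch on a toy frame, since the real parcel columns may be named differently:

```python
import pandas as pd

# Toy stand-in for the parcel CSV; the column names here are assumptions.
df = pd.DataFrame({
    "parcel_id": [1, 2, 3, 4],
    "street_dir": ["N", None, "S", None],
    "owner": ["SMITH JOHN", "SALT LAKE CITY", "SMITH JOHN", "ACME LLC"],
})

# Percent missing per column, worst offenders first
missing_pct = df.isna().mean().sort_values(ascending=False) * 100

# Exact duplicate rows, and distinct values per categorical column
n_duplicates = df.duplicated().sum()
unique_counts = df.select_dtypes(include="object").nunique()
```

On the real file you'd start from `pd.read_csv` and eyeball `missing_pct.head(10)` for the top-10 view Greg is describing.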
15:10
Something I want to call out here, Greg, is like this is very introspectable, right? This is very different from
15:17
typing into ChatGPT or into Claude and saying do EDA on this data set, which you can do. We can do that. Um and
15:25
what you'll get is a essentially a trace of everything that it was trying. It's all obscured for the most part. You can
15:31
click into it but it's pretty hard to to look at. I I think it's I think it's actually
15:37
brilliant that the code that created it is right there right there on the left.
15:42
It's very introspectable. It means that you can also riff as an individual. You can think through and
15:48
reason through the problem a lot easier than just yammering at the LLM. Like do it differently and then
15:55
I do like a good yammer though. Oh yeah. Who doesn't, right? So uh so we've we've kind of gotten a
16:02
flavor for the data. It looks like it still wants to draw some visualizations. So we'll get a few blocks coming up here in a minute. Um but what what's your
16:08
sense of things? What's the next step here if we were So this first project is who owns my neighborhood and uh the
16:14
thing that we said we would do is to filter, aggregate and visualize property records to identify top owners,
16:21
clustered holdings, and surprising trends like multiple parcels owned by a single entity.
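As a sketch of that first step, counting parcels per owner and flagging entities that hold more than one, using a hypothetical owner column:

```python
import pandas as pd

# Hypothetical sample; the real dataset has one row per parcel.
parcels = pd.DataFrame({
    "parcel_id": range(6),
    "owner": ["SALT LAKE CITY", "SALT LAKE CITY", "SALT LAKE CITY",
              "SMITH JOHN", "SMITH JOHN", "DOE JANE"],
})

# Parcels per owner, biggest holders first
parcels_per_owner = parcels["owner"].value_counts()

# Entities holding multiple parcels: the "surprising trends" to dig into
multi_parcel_owners = parcels_per_owner[parcels_per_owner > 1]
```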
16:26
Yeah. So we should just ask the the agent to Yeah, let's just ask the agent to do
16:33
that and see what it does. All right. So I'll just start a new one
16:39
and I'll say, I just copied this right from
16:45
the description of the live stream. So by filtering, aggregating, and visualizing property records you can identify
16:51
can I guess I should make that a question or a command identify top owners clustered holdings and surprising
16:57
trends like multiple parcels owned by a single entity. Uh, I'm going to say include visualizations
17:05
and summarize findings.
17:10
Period. All right. Uh, so we now have two agents cooking here simultaneously. I we can
17:16
switch back and forth. Uh, but we'll stick with this one. It's going to come up with a plan here uh to do its work
17:22
and hopefully they won't step on each other. Uh, oh, we're doing some data visualization here. So that's let's dive
17:28
in here. All right. So, let's jump into
17:33
Let's get a better view of this guy. So, we've got parcel acres distribution.
17:40
Oh, that's interesting. Uh, tax class looks like it probably will be helpful.
17:46
Tax class. What? Where are you seeing that? The tax class here. I'm gonna click on your picture so that we can
17:52
Yeah. So, I'm over here. Tax class. Yeah. I'm just following you. I I don't know what those mean, but those are
17:57
probably map onto owner type pretty cleanly.
18:03
A fair number. It looks like that bottom one is missing values. Yeah. And then lot use.
18:12
Yeah. Interesting. Okay. Interesting.
18:17
All right. Cool. So, we'll let that one think. Let's jump to the next project. Um, I'm going to say read in the I
18:26
suppose we could have done this all in one project so we wouldn't have to read in the data set multiple times. Uh, read
18:31
in the CSV file and do the following. Uh, and I'll copy this one. The next one
18:37
is called Corporate Landlords of Salt Lake County. It says, "This project uncovers the largest institutional property owners by cleaning and
18:45
normalizing the messy owner name field and then grouping parcels by total
18:50
acreage or market value. Okay, interesting. So, I'm going to
18:55
paste that in here. Uh, so I'm going to change it to uncover the largest institutional property owners by
19:01
cleaning and normalizing the messy owner name field and group parcels by
19:07
total acreage or market value and visualize the results.
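Once the owner names are cleaned up, the grouping that prompt asks for is essentially a groupby-and-sort. A sketch with assumed column names (owner, acres, market_value):

```python
import pandas as pd

# Made-up rows; owner, acres, and market_value are assumed column names.
parcels = pd.DataFrame({
    "owner": ["ACME LLC", "ACME LLC", "SALT LAKE CITY",
              "SALT LAKE CITY", "DOE JANE"],
    "acres": [10.0, 5.0, 100.0, 50.0, 0.25],
    "market_value": [1_000_000, 500_000, 2_000_000, 3_000_000, 400_000],
})

# Total holdings per owner, ranked by total market value
holdings = (
    parcels.groupby("owner")
           .agg(total_acres=("acres", "sum"),
                total_value=("market_value", "sum"),
                n_parcels=("acres", "size"))
           .sort_values("total_value", ascending=False)
)
```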
19:13
So, we'll get that one cooking, right? And I'm I'm gonna answer some of these questions we've got coming in the
19:18
chat. Um, good. Yeah. So, I you're asking for like what skills and experiences should I focus on
19:24
developing, right? And uh yeah, I I I wish I had a much better answer right
19:31
now, but I I'm going to say kind of the honest answer is um it's really hard to
19:37
know right now what people want in junior talent. uh because we're trying
19:42
to figure out real time how much can we task AI with the task
19:50
that we used to task junior talent and I know that's not a very encouraging answer but that's I I think that's the
19:55
brass tacks reality right now and so the skills you can develop I think are and
20:01
and I I'll frame it this way um I feel like I have to repent uh about 11 years
20:07
of my career What do you mean by
20:12
Well, because I went on this quixotic quest to go democratize AI, right? I was like, I wonder if I can invent a thing
20:19
that'll do my job. And you know what? I invented a thing and then I sold that
20:25
thing to a company that invented another thing and it did a lot of data science
20:30
job, right? It built models. It, you know, so I'm talking about the company I started, which was called Zeff. Uh and
20:36
then sold that to data robot where I met Greg and data robot did a tremendous job
20:43
does a tremendous job of doing a lot of the work of data science uh but there's a bookend problem in data science which
20:49
is framing and scoping um and then uh
20:54
organizational adoption, and everything in between is really cool right it's building models it's
21:01
interrogating data it's understanding all that stuff but at the end of the day if you don't have a very well-framed and
21:08
scoped problem and you don't have a path to adoption it doesn't matter how cool your models are and that's why you know
21:15
depending on what stat you want to look at x% of data science and AI projects
21:21
fail because they don't they weren't really very well framed they weren't really very well scoped and they didn't
21:26
actually have anybody with the skills to do change management
21:32
Greg is a person who has tremendous skills of change management like astounding skills of change management.
21:38
He also happens to be a very reasonable data scientist. He's quite skilled that way. My wife anyway.
21:44
Yeah. But and and so the reason I bring that up AJ is like I think then's the skills,
21:51
right? I think the skills of understanding either change management or understanding how to properly scope
21:58
and frame problems um are the things that allow you to apply your trade and do your craft uh very valuably and
22:07
without Yeah. So entry level data scientists can punch way above their weight these days and they don't cost a fortune either.
22:14
Um yeah, that's fair. What about it's fair on the technical side, but it isn't I wouldn't say that it's fair on
22:20
the part that where the rubber hits the road and you're kind of competing at the same level of like well I can punch into
22:28
you know I can punch and deserve a prompt and get data out and I would say well if you really want to figure out
22:34
how to be employable figure out how to do one of the bookend problems. So,
22:40
you've got kids, kids. Where you What would you tell them to study if they were interested in this whole field?
22:46
Oh, no one wants to hear my advice to my children. Okay. Not your kids. My kids. Who Who uh
22:51
you you think people should study computer science anymore? Is it worth doing? Oh, I do think it's worth doing, but I
22:57
think it's worth doing. Like getting math, a math degree is worth doing. Like it really will challenge you. You you
23:02
will think much sharper. Um you'll have a very very strong perspective on how things work. Uh but you probably won't
23:10
use it that way. Okay. So it's kind of your chores type
23:16
thing and learn like programmatic thinking that sort of situation.
23:22
Can you become great like Greg from 2020 with agentic data science? No, you
23:27
can't become like Greg from 2020. Um not without
23:33
again. Now, I would I would argue that Greg's true value that he brings to the table on top of his very capable skills,
23:42
manipulating data, crafting, you know, good information streams, finding good
23:48
insights, building reasonable models, Greg really has paid the price in kind
23:54
of his temperament, his orientation, the way that he approaches things. Greg sees
24:00
that things get done. He really does. He's a person who's very very driven to see that things that get done stay done
24:07
and that things that get done were the right things to do. That wisdom, some of the data tells you that, but
24:14
most of it is Greg's, you know, temperament, wisdom, and
24:20
honestly just his proclivity to go, look, things that are shiny are nice, but if it doesn't stay done, who cares? Well,
24:27
you know, given that I actually was Greg from 2020 at one point, uh I'm not
24:34
I don't know if that's exactly something to aspire to anyway. Well, let's look at some results here.
24:40
Um let's look at some results. Started to get some interesting stuff going on. So, we've got about uh $6
24:47
billion worth of land uh that is owned by trust not identified. So, we've got some some missing data there. Oh, it's
24:54
probably it's probably the LDS church. Oh, you think so? Maybe if it's a lot of land, it's
25:01
probably the LDS church. Wow. Shocking. Well, that would be outrageous. Uh Salt Lake City and Salt
25:06
Lake County are the next two biggest land owners. And then the Utah Department of Transportation, West
25:12
Valley City, the federal government. Oh, this is LDS.
25:17
Okay. So they own $47 million worth of uh property I guess
25:24
in Salt Lake County, 570 parcels. And then Kennecott Utah Copper Corp. So it
25:31
looks like a mining company maybe. Then a couple of couple of more cities. Snowbird Resort.
25:37
There you go. Skiing place, right? That's $127 million.
25:43
Uh so okay, interesting results there. Let's get our third project kicked off. Uh, what do you think about the the
25:49
ownership stuff? Yeah, that was that was great. That was what, three prompts to get there. That's
25:56
that's pretty ridiculous, man. That's pretty good. All right, next one is tax fairness explorer. It says, "This analysis
26:02
explores fairness in property taxation by comparing taxable value against full
26:08
market value across neighborhoods and visualizing which areas pay more or less
26:13
tax per dollar of market value. Okay, let's look at that. That's
26:19
interesting." Yeah. Uh, Ready Player One. Yeah, absolutely. I think you're seeing like, you know, this CSV is what 188 megs.
26:26
It's a decent size file. Yeah, it's decent size, right? And you can go larger, right? Zerve can handle larger. So
26:34
yeah, definitely Zerve I I I think AJ again kind of coming back to your your
26:40
question because I think it's a very important question. Um so I'll I'll I'll share with you something that one of my
26:46
favorite engineers Lance said about three months ago.
26:51
Lance is a very capable full stack software developer. Not he's got a math
26:57
degree, right? who's very very sharp. Um, and he's been he's been doing software development for 27 years I
27:04
think. Um, three months ago, Lance had an epiphany
27:11
and his epiphany, Greg, was okay, so writing code by hand is a premature
27:17
optimization at this point and where we're at. And
27:23
you know, he he'd been suffering through a lot of our agentic work, a lot of our LLMs, and resisting it, frankly, just
27:30
kind of like, look, I know how to write code by hand. Uh, and he he was doing kind of the Stack Overflow thing. And
27:36
then he just, something clicked. We started working with uh PRDs and
27:42
decomposing our problems into smaller things and finding that the LLMs were doing a really good job at small
27:48
problems, right? Like piecewise. And think enterprise software development. So these are pretty sprawling problems.
27:55
Um analysis is a little more directional. Yeah. And so I think tools like Zerve
28:02
really step on the gas of writing code by hand is a premature optimization.
28:09
No longer is every keystroke precious. Um but you still need to understand and
28:14
kind of reason through the problem really smartly. And I do think that with a data science background
28:20
that that's going to be easier for a lot of folks. See,
28:28
okay, top 20. So, I've gone back to the first project. I've got the tax fairness project kicked off now. The name
28:34
normalization stuff is is cooking. And you can see some of the stuff that it's doing. It did um you know, it's like fixing uh
28:42
comma LLC and normalizing it to no comma, that sort of thing. So it's removing those kinds of suffixes and and
28:48
this one was replacing the ampersand with the word and. So okay. So we got a normalization. Yeah.
28:55
Yeah. So we're we're trying to um rationalize this data set a bit. Yeah. Spelled differently, that sort of thing.
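The kind of cleanup rules being applied here, stripping corporate suffixes like LLC and swapping the ampersand for the word and, can be sketched with a couple of regexes. The suffix list below is an illustrative assumption, not the agent's actual rule set:

```python
import re

# Illustrative suffix list; the agent's real rules likely cover more cases.
SUFFIX_RE = re.compile(r",?\s+(LLC|INC|CORP|CO|TRUST)\.?$")

def normalize_owner(name: str) -> str:
    name = name.upper().strip()
    name = name.replace("&", " AND ")       # ampersand -> the word "and"
    name = SUFFIX_RE.sub("", name)          # strip trailing corporate suffix
    name = re.sub(r"\s+", " ", name)        # collapse runs of whitespace
    return name.strip()

normalized = [normalize_owner(n)
              for n in ["Acme, LLC", "Smith & Jones Trust", "ACME   LLC"]]
```

Note how the first and third examples collapse to the same key, which is exactly what lets the groupby find the true largest owners.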
29:01
And then it's summarizing by the uh the biggest owners again. So, we're
29:06
seeing a lot of the same results, but I think once you get down into the the smaller, less less prominent ones that
29:12
you'd start to see some differences from the previous analysis. And then I kicked off the tax explorer one uh as well. Uh
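As a sketch of the ratio the tax fairness project is computing, taxable value per dollar of market value, averaged by area (district, taxable_value, and market_value are assumed column names):

```python
import pandas as pd

# Hypothetical parcels in two districts; column names are assumptions.
parcels = pd.DataFrame({
    "district": ["A", "A", "B", "B"],
    "taxable_value": [25_000, 75_000, 100_000, 100_000],
    "market_value": [100_000, 100_000, 100_000, 100_000],
})

# Taxable value per dollar of market value, per parcel
parcels["tax_ratio"] = parcels["taxable_value"] / parcels["market_value"]

# Average ratio by area: a lower number means a lighter effective tax base
fairness = parcels.groupby("district")["tax_ratio"].mean().sort_values()
```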
29:20
one thing I would say about the um uh the big data set question is that
29:26
this is all running in the cloud and it's all running serverlessly. So when each of these uh when each of these
29:33
cells actually or not cells these each each of these blocks actually runs uh it's spinning up compute on the fly and
29:40
you have some control over what compute is is used. So by default we use lambdas
29:46
which are the fastest. They're a bit limited. AWS uh has a hard limit of 15 minute
29:52
runtime on the lambdas uh which is more than enough for most most things that you're going to run. If you do need
29:57
bigger stuff, you can always pick uh you know larger compute types. Uh you can
30:03
use GPUs and and so on. So if you do need uh you know if you're training a neural network or if you've you know
30:09
you've got something that that needs GPUs, you can go down that road too. So you can get you know pretty chunky bits
30:15
of memory if you're if you're dealing with with bigger data. But then if you've got like huge huge data then you
30:21
have the same problem that you'd have anywhere else. So like if you tried to analyze, you know, 10 gigs of of data or
30:27
100 gigs of data on a on a Jupyter notebook locally, you know, you'd have you'd have a pretty hard problem. So you
30:33
end up doing sampling and stuff like that. Yeah, that's right. And then you actually have to reach for your toolkit, right?
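One common shape of that fallback is streaming the CSV in chunks with pandas and aggregating incrementally, so the whole file never sits in memory at once. A self-contained sketch using an in-memory stand-in for a big file:

```python
import io
import pandas as pd

# In-memory stand-in for a multi-gigabyte CSV; with a real file,
# you would pass its path to read_csv instead.
big_csv = io.StringIO("owner,market_value\n" + "\n".join(
    f"OWNER_{i % 3},{(i + 1) * 1000}" for i in range(100)))

# Stream in chunks and fold each chunk into running totals.
totals: dict = {}
for chunk in pd.read_csv(big_csv, chunksize=25):
    for owner, value in chunk.groupby("owner")["market_value"].sum().items():
        totals[owner] = totals.get(owner, 0) + value

grand_total = sum(totals.values())
```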
30:38
Exactly. Yeah. Yeah. Exactly.
30:43
All right. Let's see here. Um what do talk to me about production? So you've
30:49
done a fair bit of work uh Gonzo doing uh you know moving notebooks into
30:54
production, moving data science projects into Yeah. Yeah. How's that go for you?
31:00
Yeah. Well, let's, you know, let's kind of go through the experiences. Um, moving notebooks to
31:06
production sucks. Like the one of the worst handoffs in the world is uh I I've
31:12
built this notebook. It runs on my laptop. Can you get it in production? Um
31:17
yeah. Yeah. And it sucks because they're just not a terribly production-worthy
31:24
uh artifact. They're they're mostly I mean literally the name is notebooks.
31:31
Like that's that's what they are. They're a place to think through and reason through problems, right? Uh but
31:37
they haven't been designed and they still continue to struggle to be usable
31:42
as a production artifact. Uh, and so then, you know, and here's one of my my favorite horror stories of getting a
31:48
notebook into production, getting a notebook from one of my favorite data scientists and struggling for four hours
31:55
to get the damn thing to even run. Like, I'm looking at like, what what it is it was kind of a messy, right? It's kind of
32:01
a messy thing. And so, I'm trying to get the thing to run. I I finally get to a point where it's kind of running, but I
32:07
I still don't know what the dependencies are in this thing. And I'm trying to reason through it from the code like back into what these because it didn't
32:13
come with a requirements.txt file. Didn't come with any of that. Right. I finally went back to the data
32:18
scientist and I said, "Hey, can you help me, you know, figure this out?" And uh
32:23
he's a dear friend of mine, lovely human being, kind of scattered, right? A little scattered in his approach, right?
32:29
So, so he shows me it running and I'm like, "Okay." And then I see it's running in system Python. I'm like,
32:35
"Okay, not the best practice, but that's fine. whatever. And I said, "Would you
32:40
type in pip freeze um and output the result to a requirements.txt file?" And
32:47
folks, I've I've never seen pip freeze take a minute. Like, it's a very fast
32:53
process. And this thing kind of stalled for a second. I'm like, uh-oh, that's not a good sign. And the requirements.txt
32:59
file that came out had thousands of lines of dependencies. So he'd been working in a single environment for many
33:06
years and it just had a lot of cruft in there. And so we had to decide
33:12
how we were going to go about this. And that kind of leads to the second thing. It's like, well, how else can you deploy
33:17
something? Well, you could spin up a Docker container, build the whole thing in Docker, snapshot the Docker
33:22
container, and ship it over. Same kind of problem applies, right? I've received and I've, uh, embarrassingly
33:30
shipped Docker containers that were several gigs, tens of gigs, that probably
33:35
truly ran about a few hundred lines of code. And that's a disproportionate weight, right? 400
33:41
lines of code that takes 10 gigs to run. That's not a very efficient scenario. Um
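The fix for the ten-gig-container problem they're describing is usually a slim base image plus a pinned dependency list, rather than snapshotting a whole working environment. A minimal sketch; the file names `requirements.txt` and `score.py` are placeholders, not anything from the stream:

```dockerfile
# Start from a slim base image instead of a kitchen-sink environment
FROM python:3.12-slim

WORKDIR /app

# Install only the pinned dependencies the project actually needs
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy just the application code, not the whole workspace
COPY score.py .

CMD ["python", "score.py"]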
33:47
yeah, and then I've also shipped things using platforms like DataRobot. I've shipped things using um yeah some open
33:55
source tooling. I shipped a model in H2O years and years ago. Yeah, I think at the end of the
34:03
day, uh, the problem with shipping models is that it's kind of an
34:08
afterthought and so you end up finding yourself kind of hamstrung between,
34:14
you know, the individuals who did the analysis and did the work, um, and the people who are going to be
34:19
responsible for keeping it running. And even today, in 2025, most
34:26
DevOps engineers, SREs, and QAs are not
34:32
expert enough to know how to inherit something like a notebook, or even a model for that matter, because it
34:38
it's a probabilistic kind of thing. It comes up with a bracketed prediction,
34:44
you know, when that's kind of the deliverable or it comes up with yeah visualizations
34:50
that that don't run as React components or other. So there's there's a fair bit of issues, right, in getting things into
34:57
production. So what what's really nice is to have an opinionated architecture honestly like that's a really nice thing
35:03
to have. Do you think that the large language models will start to help with
35:09
um with with the deployment stuff because data scientists aren't trained in that?
35:14
Uh, so there is that kind of handoff. Like, you know, I have a PhD in statistics, and I
35:20
never even heard the word DevOps until I was working at DataRobot, right, which was many years after I
35:26
graduated and had been working out in the industry. Uh, so is that something... are there any tools out there for
35:33
you know, AI-ing DevOps? AI-ing DevOps? Not that I'm super
35:39
familiar with. I'm not a DevOps expert, right? Um, I I I know that you've got, you know, a
35:45
lot of the SaaS data science uh platforms
35:50
uh will have some kind of deployment strategy or some kind of opinionated architecture.
35:56
Um, yeah. Yeah. And then, you know, there's Google's notebook product, uh, I can never
36:01
remember what they call it. Colab. Colab. Yeah. There's Colab. There's SageMaker. A lot of those you can
36:08
essentially surface an endpoint after you've built something. You kind of tell it this is the cell and this is what I want and then it'll surface an API
36:15
endpoint. Um which is a pretty reasonable thing to do. I don't think it's the worst. um they they do have
36:23
some production problems like where's the logging, where's the failover,
36:29
right? Where's the stuff that we care about? So yeah, I don't
36:36
think the situation's awesome in 2025, but it's definitely better than it was 10 years ago.
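The "surface an endpoint" pattern they mention, plus the logging they say is usually missing, can be sketched in a few lines. This is a generic illustration using only the Python standard library, not how Colab, SageMaker, or Zerve actually do it; the `predict` function is a made-up stand-in for a real model:

```python
import json
import logging
from http.server import BaseHTTPRequestHandler, HTTPServer

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("model-endpoint")

def predict(features):
    # Stand-in for a real model; returns a toy mean-based score.
    return {"score": sum(features) / max(len(features), 1)}

class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Parse the request body, score it, and log every call --
        # the observability piece that notebook exports usually lack.
        body = self.rfile.read(int(self.headers["Content-Length"]))
        features = json.loads(body)["features"]
        result = predict(features)
        log.info("scored %d features -> %s", len(features), result)
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(json.dumps(result).encode())

# To actually serve requests:
# HTTPServer(("", 8000), Handler).serve_forever()
```

Real deployments would still need the failover and monitoring pieces discussed above; this only shows the endpoint-plus-logging shape.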
36:43
Okay. Um let me dive in. Oh, these two are both done now. I think it says it's finished
36:49
looking at both of these guys. Here's a summary. That's interesting. Uh, I don't want to read the summary. That's
36:54
too much reading. That's a TL;DR type situation. But I did see a plot here that had
37:01
um, it showed that, in fact, that "trust not found" guy, that was the Mormon
37:07
church. So that's interesting. That doesn't surprise me in the least. Yeah. How did it find that? The vision. Well, there was an owner name
37:14
field. I don't think it was using it here. Oh, okay. Zoom out here. Where are you at? We've got a
37:20
fair bit of visualizations here. Uh let me see if I can find
37:26
where that was. Here it is. Okay, I see. Yeah. Yeah. So, here's our plot.
37:33
So, yeah, the LDS church, uh, the Mormon church, is by far, by a factor of
37:40
almost 10, the biggest landowner by value, followed by 200 South owner. I
37:47
don't know what that is. Um, Fashion Place is a shopping mall, right?
37:52
That is a shopping mall. I think it is. City Creek is another shopping mall which is owned by the church as well.
37:59
City Creek. Is it really? Yeah, the mall. It's a billion dollar mall built by a church. It's fantastic.
38:04
Wow. It's a beautiful mall. That's wild. I had no idea.
38:10
So, so anyway, that's fun. Uh, so I want to skip to this one because I think that the agent's actually having a problem uh
38:17
figuring out what's going on here. Um, so we got in here and we
38:22
loaded in the data and then we've got the distributions here and we've
38:27
identified some columns. So like taxable value, total full market value, adjusted and assessed value, uh,
38:36
tax districts, that sort of thing. It's not quite done with its work yet. It's still working on fairness, comparison,
38:43
visualizations, and insights. Um, but the thing that is bugging me is that
38:48
all these values are the same. So like the total assessed, total adjusted, total full market value. Oh, they're not
38:55
the same either. So the averages are the same for adjusted versus assessed.
39:00
Um, so these two variables seem to be duplicates of each other. Uh, but these two are different. So full
39:06
market value versus taxable value, those are those are very different. So that's interesting. So there may not be a
39:12
problem there. Um this is just counting missing values.
39:18
Uh this is sample records. Maybe that's not so interesting. Um so we have tax burden calculated for 98.3% of uh
39:28
properties. So let's go into summary stats.
39:34
um, mean tax burden. So how is tax burden defined? Uh let
39:43
me dive into this mode here and just resize some windows.
39:50
Um so tax burden looks like it's full market value.
39:56
So you have to have a non-missing full market value greater than zero. Uh
40:02
and then you divide taxable value by total full market value.
40:08
Yeah. So taxable divided by total full market. We are gonna probably have some issues because
40:14
people aren't required to report their market value.
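The calculation being described is straightforward in pandas. A minimal sketch with toy rows and hypothetical column names (`taxable_value`, `total_full_market_value`), which may not match the actual Salt Lake County field names:

```python
import pandas as pd

# Toy parcel rows; the real dataset's columns and values will differ.
parcels = pd.DataFrame({
    "taxable_value":           [55_000, 0, 100_000, 250_000],
    "total_full_market_value": [100_000, 80_000, 100_000, None],
})

# A parcel needs a non-missing, positive full market value...
valid = (
    parcels["total_full_market_value"].notna()
    & (parcels["total_full_market_value"] > 0)
)

# ...and then tax burden is taxable value divided by total full market value.
parcels.loc[valid, "tax_burden"] = (
    parcels.loc[valid, "taxable_value"]
    / parcels.loc[valid, "total_full_market_value"]
)

print(parcels["tax_burden"].describe())
```

Rows that fail the filter simply end up with a missing tax burden, which is one way a computed metric like this could cover most, but not all, properties.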
40:20
But that's interesting. So the average taxable value is 55% of the total market value and the
40:28
standard deviation is actually pretty wide. It's really wide. Yeah. Yeah. So, you have some where they're the
40:33
same. You have some where there's zero taxable value. I guess those might be like government entities and that sort
40:39
of or just unreported. It could just be unreported, too. Yeah, that's true. That's true. I guess
40:45
I don't know. It seems like you'd have a taxable value. Um
40:50
But then for the biggest ones, it looks like our tax burden is one. So, that's interesting. Mhm.
40:58
So maybe for the biggest districts, those are all like owned by governments
41:03
and such. Um... Yeah, we just saw that most of the
41:10
property is owned by the LDS church, which is non-taxed. Oh, yeah, that's fair. But they would
41:15
have to pay property taxes still, huh? Don't you think? I don't know. I don't know how
41:20
churches work. Yeah, I don't know either. Oh, I can see also we're going to have an issue here because these neighborhoods are just numbers. So,
41:26
we're not going to know what they actually are, but there are some neighborhoods that um have
41:33
a very low tax burden, which is surprising. Wow. This is
41:39
actually pretty surprising, that the tax burden would be, like, you know, less than 10% for all of these guys,
41:45
whereas it's like 100% for this crowd up here. So, that's interesting.
41:52
I wonder what that difference is. Um, maybe one next step for another day would be to figure out the mapping
41:59
because there's no lat/longs in here. There's no way to actually plot the uh the parcels, to be able to kind of
42:06
figure out where, you know, draw a map and identify, like, the rich areas. This is great, this little summary
42:12
on the output. That's great. Yeah, this is properties in the neighborhood. So this would be the size of the neighborhood versus the average
42:18
tax burden. So I suppose that's interesting. There doesn't seem to be any correlation
42:24
there. Maybe more well certainly more variability for smaller neighborhoods
42:29
than for bigger ones, with the average being
42:35
pretty common. It looks like most of your properties are taxed at about I don't know 55%
42:42
of their of their total full market value. So yeah, and I think that's right. That's that's fair.
42:48
I suppose that's interesting. So 805 neighborhoods analyzed, highest burden
42:54
neighborhood is number 78.880, but there are a fair number that are at one. And
42:59
then there are some that have zero taxes. So that'd probably be worth digging into, probably a next step. Oh,
43:05
you know what we could do? Let's start a new one. Oh, well, it finished here. So for neighborhoods
43:14
that have 100% tax burden and 0% tax
43:21
burden, can you list out the largest or the most common
43:28
owners of these parcels?
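The question typed to the agent here boils down to a filter plus a `value_counts`. A hand-rolled sketch with made-up column names and owner strings (`tax_burden`, `owner_name`), just to show the shape of the query:

```python
import pandas as pd

# Toy parcel table; real owner names and field names will differ.
parcels = pd.DataFrame({
    "owner_name": ["LDS CHURCH", "LDS CHURCH", "STATE OF UTAH",
                   "J. SMITH", "STATE OF UTAH"],
    "tax_burden": [1.0, 1.0, 0.0, 1.0, 0.0],
})

# Most common owners among parcels with a 100% tax burden...
full_burden_owners = (
    parcels.loc[parcels["tax_burden"] == 1.0, "owner_name"].value_counts()
)

# ...and among parcels with a 0% tax burden.
zero_burden_owners = (
    parcels.loc[parcels["tax_burden"] == 0.0, "owner_name"].value_counts()
)

print(full_burden_owners.head())
print(zero_burden_owners.head())
```

Grouping by owner on the value column instead of the count (as we wish for later in the stream) would just swap `value_counts()` for a `groupby("owner_name")["taxable_value"].sum()`-style aggregation.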
43:35
That might be interesting to look at. That would certainly kind of verify your hypothesis that uh
43:41
maybe the uh the Mormon church is tax-exempt in some way. I suppose it would have been good research to like
43:47
try and figure out what the actual rules were around taxation before we uh... Or we could just infer them and see. Yeah,
43:53
totally. Say okay, so here's just a summary of the analysis. This is a markdown
44:00
Yep. Uh let's see, key findings, geographic disparities. Start that plan. Uh so there
44:07
are highest burden neighborhoods, meaning properties pay taxes equal to 100% of their market value annually. Lowest
44:14
burden neighborhoods show zero percent burden paying no tax relative to their full market value.
44:22
So the difference between highest and lowest burden is effectively infinite. It is
44:28
indicating fundamental inequities. Median tax burden 55%. The standard
44:34
deviation there. 21,000 zero-tax properties. So five percent
44:40
face no tax burden at all. Those have to be government entities.
44:45
Got to be, right? It's got to be like Utah State. And I wish it would have said um uh the
44:53
value here. Not just the count, but also the value and the percent by count. That would be interesting. We could ask that
44:58
after. Uh roughly balance split between above
45:04
median 411 and below median 417. Well, you'd expect that. It's called the
45:10
median. Uh neighborhoods facing higher tax burdens. Oh, here's your market values.
45:16
So, these are uh these are interesting.
45:21
And the five lowest. So, the numbers aren't wildly different. 20 million, 20 million, 300 million, 360 million,
45:30
499 million. And these are these Oh, 2.6 billion. That's a big neighborhood. That
45:35
has to be the church. I guess that has Yeah, that has to be right downtown or something.
45:40
Uh, fairness concerns, extremes, outliers, zero tax properties, geographic clustering. The wide
45:46
variation suggests that where you live significantly impacts your relative tax burden independent of property value.
45:52
Uh, oh, there's an inverse relationship possibility. Some lower burden neighborhoods have higher total
45:59
market values, suggesting higher value areas may face proportionally lower tax burdens. A regressive pattern. Well,
46:06
that doesn't seem fair. And it gives us some uh recommendations
46:11
to identify the extreme cases, which we actually just kicked off. So that might be showing up in there.
46:17
Uh there are... oh, review exemptions. So there are 21,000 zero-tax properties.
46:23
Uh geographic study. So this would be like the mapping that we already talked about.
46:28
Uh and uh the conclusion. So all this data, by the way, is completely public. I just downloaded it. I actually used
46:33
the API. Um, they have a a pretty slick API to download the stuff. Uh, and so
46:40
let's see here. While we're diving into
46:46
Okay, this is the 100% tax burden um stuff. Let me shrink our font a smidge.
46:53
Looks like uh Kennecott Copper Corporation,
46:58
Lakefront Gun Fur Reclamation Club. This is just a person.
47:05
So some people, it's not just the really big properties. Let's look at individuals. Here's the zeros. Okay. So yeah, these
47:13
are government entities. There you go. Division of state lands, uh, lands and forestry.
47:19
Yeah. So these are... Oh, NaN. NaN. Yeah.
47:24
So some data cleanup issues there. Um, yeah. So it looks like the zero uh zero
47:31
tax burden ones are government entities. Uh so the 100% tax total market value is
47:39
69 billion and the 0% tax is 47 billion. So that's interesting,
47:47
huh? Okay. So lots of cool stuff going on there. All right. Anyway, I
47:52
think it's pretty fair to say that we uh tackled these problems pretty successfully.
47:57
We did. Yeah. In short order. A little diving in to see what's going
48:03
on there. But I did want to have a look at some of these questions that have been coming in in the last 10 minutes that we've got. There's certainly some
48:09
interesting questions. Uh, let's see. Randle says, "Can I connect my own data sources or does everything have to
48:16
be uploaded?" Yeah, you absolutely can. You can connect to Snowflake or data bricks or MySQL or we have lots of
48:22
connectors uh that are already built or you can just connect by code. So like anything you could do with code you can
48:29
connect directly and use S3 or whatever it would be. Uh so it's really easy to connect other data sources.
48:37
Um and then we've got uh Nomi uh who says how does the agent decide on the
48:43
granularity i.e. how much code goes into one cell? Good question. Um it tends to
48:49
err on the side of uh more modularity than less, uh which is a
48:56
smart thing to do, uh, I think. One of the characteristics of the way that Zerve works is that it's like every
49:03
block is kind of like a checkpoint. So all the outputs are serialized and stored at every point. So you'll notice
49:10
like we were in my screen and Gonzo was in there as well. He didn't have to run that code in order to see those results.
49:15
He could just go in and look at any block and see the inputs and the outputs and visualize what they might look like
49:22
and all that kind of thing. Uh, and he could write code and run code and so on. So more modularity tends to be
49:27
better when you're uh when you are you know like debugging and when you're writing so that you can kind of look and
49:34
see along the way. So when I'm writing the code I generally say all right whenever I want to look at stuff that's
49:40
a block uh and then the next block might run some code and then I'd look at stuff again. That's how I think about it. Uh
49:47
depends on the day with the agent. I mean sometimes these agents are a bit unpredictable, uh, but the agent, like I
49:52
said, tends to err on the side of more modularity versus less. Also, the agent can see, you may have noticed as we were
49:59
watching this thing go the agent can see the output and the data and the code. So
50:04
it's got full context into the problem. So that's a big difference between, like, um, you know, a Cursor or something
50:11
like that where you have to teach it the data. Like if you were trying to do this in ChatGPT, you'd have to like paste in
50:17
the output and show it what actually happened and then it would get the variable names wrong and so you're like
50:22
correcting that stuff and so on. So you could do something like this with ChatGPT, but it doesn't have your
50:28
context. So it actually ends up being kind of a pain to uh to use.
50:34
Um let's see, next one. Do I need to install anything locally or is everything browser based? Yep, everything's browser based.
50:40
We're cloud-native. Uh, this was running in AWS but we can install uh on
50:46
any of the cloud environments. We're actually designed to be self-hosted. Uh people's data tends to be uh sensitive
50:54
and so they don't want to be shipping their data out to a third party in in some cases. So uh the install on AWS is
51:00
actually really straightforward to to do. So, you know, you can get self-hosted and then your data and your
51:06
compute live in your cloud and we help manage orchestration.
51:11
Um, okay. Do you see any interesting ones? Oh, are there academic or student licenses? Uh, yeah, actually there is.
51:18
Uh, you can get... well, everybody can sign up and use Zerve for free and get
51:23
five credits a month. Uh, the academic and student licenses if you verify your school email gets you double credits on
51:29
that free tier. So you get 10 credits instead of five. Uh which is great. So all the students should definitely jump
51:36
in. Um, you guys got git integration, don't you?
51:42
We do. Yeah, we integrate with git uh GitHub or or Bitbucket or any of the uh
51:47
the big providers of source control type stuff. So uh yeah. So there's version control?
51:53
Yeah. Oh yeah. Yeah. So we have that. Uh, we also got some uh rollback
52:00
features. So like if you were using the agent and you want to like see what it did and and undo those kinds of things.
52:06
Uh in the notebook view that we'll be releasing pretty soon, you can see those and kind of approve them before things go. So we used kind
52:13
of the scorched earth approach when working with the agent today. We were like just solve this problem. That's not
52:20
the only way to work with the agent. So you might want much more fine grain control and say look in this cell uh you
52:27
know this is what's happening this is what I want to happen just change this this cell or this block uh to do your to
52:33
do your bidding so like that uh that kind of stuff is completely doable uh
52:38
but we didn't do any of it today because we were just kind of, like, you know, YOLO, throwing a hand grenade
52:44
into the uh into the problem. Um let's see one more. How do you ensure
52:50
that agents understand the real impact and importance of visualization outputs while keeping stakeholders perspectives
52:57
and needs at the forefront? Gonzo, you want to take a crack at that one? Yeah.
53:04
Again, you're going to have to do that work, right? Like, you can have it cough up a bunch of stuff, but at the
53:10
end of the day, if you want stakeholders perspectives and needs at the forefront, then
53:17
then you're going to need to articulate what those needs are. I don't think you necessarily have to handcraft
53:23
every visualization but yeah you know if if they need to understand uh certain
53:29
dimensions of the data or if certain comparisons of information are going to be more insightful and drive to more uh
53:36
decision clarity yeah then you just need to make sure that that's in the prompt that you're doing I think right I think
53:43
it'll do a reasonable job, and you can see, right? Maybe it'll suck, but uh, watching what we watched
53:49
today, I think we can bet that it's going to do a pretty reasonable job if you uh
53:55
articulate any constraints. All right, and
54:03
we've got about five minutes to go. Uh I think we'll close on the questions there and just uh thank you Gonzo. This was
54:09
really fun. It really was fun. What are your closing thoughts on this? Did you expect it
54:17
to go the way it went? I mean, kind of. I kind of did expect it to go the way it went because, you know, I've been
54:23
working on this stuff for a while, and I've been really impressed with what you guys have built. And so, I kind of
54:29
thought, well, you know, if Greg's got any thumb on the scale at Zerve, and it seems like he does, uh, it's going to
54:35
do a lot of the work that is tedious to do. Um, and it did. It did a fantastic
54:40
job. I I love how modular everything is. I love how everything's broken out. Um,
54:46
I love that the code is available and visible and introspectable. Um, yeah, I I think you guys have built
54:53
something that uh, frankly, I would happily advise any
54:59
level of talent to try, um, and see if they can make it really workable in
55:04
this system, especially if the alternative is hammering on a notebook running locally. I'm like,
55:11
please just stop. Just stop doing that. it's not that helpful to you. It's not that helpful to others.
55:18
There's some exceptions, but they kind of prove the rule, right? And even if you're more experienced, it's like, look, if you're if you're
55:26
kind of doing VS Code data science, which a lot of more experienced data scientists are doing, right? They're no
55:31
longer in notebooks. a lot of them are coding in VS Code and you're using
55:36
I don't know, GitHub Copilot, or you're using, you know, Anthropic's Claude, or you're using Codex.
55:44
Greg's point and I think we all saw today you piecing together all of the context windows and chaining together
55:51
the context that's needed to prosecute a data science analysis correctly in code
55:56
that way. It's going to take you a long time to get there. You're going to build this kind of, you know, haystack of
56:03
operations. God, Zerve's done a great job. It's doing a really good job. I think your time
56:09
and iterations are better spent with a tool like Zerve than they are in trying to convince ChatGPT to be a decent data
56:15
scientist. Yeah. The the thing that we've seen when we've kind of like interviewed our users
56:20
is that the real power users tend to put an awful lot of work into building prompts. Uh, like the first four
56:27
or five uh user interviews that I did, I saw guys and gals pasting in like two-page prompts into, you know,
56:34
and like literally working with the GPT to write the prompt correctly, right? And building a two-page XML bracketed
56:42
prompt with, you know, correct reasoning. Yeah. Yeah. I don't know if that's
56:48
necessary. Uh, I mean, today you were... I mean, these were
56:53
well-intentioned, directed prompts, so they weren't bad prompts. Yeah,
56:59
but they they were definitely not two-page prompts.
57:04
Well, I think the uh I think the the future of agentic coding is is pretty
57:09
exciting and I'm I'm excited to see how it develops. What's the uh what's your closing remarks here? We'll let you uh
57:16
wrap us up. Thank you for being here with us today. This is I mean my my look I I'm I'm I'm told you
57:22
this before a few times, Greg. I am delighted with what you folks are working on. I think what you're doing um
57:28
is a real... it evidences a lot of
57:33
craftsmanship to build something that is compelling to use. And what you all have
57:39
done is that it's compelling to use. And, I'll use the term,
57:44
it feels lovingly crafted. Uh, this is not built by somebody who doesn't know what they're doing and hopes that
57:50
it'll kind of work. You know, it definitely has the flavor of Greg
57:55
Michaelson written all over it that says, like, look, this is a job worthy of doing and it's worthy of doing well
58:02
and I'll attest, Greg doesn't want to be in charge of even his
58:08
own code, thinking how do I carefully partition this notebook into three things and ship it over. Like, I think it
58:14
solves a lot of the problems that you don't want to deal with, Greg. Like, you don't want to be thoughtful about building a notebook, and especially
58:20
careful and you know and uh mindful of the DevOps who's going to inherit it.
58:26
You just want to get something done and move on. And so Zerve does a lot of the dirty work, right? Making things a
58:32
little cleaner. Killer. Killer. Well, it's always fun to
58:37
hang out with you, Gonzo, because you're always so complimentary. So, thanks for that.
58:44
I swear we're not paying him at all. I did not pay him for this. I'll pay you after. God, seriously,
58:50
bourbon. I take payment in bourbon. Yes. All right. Well, next time I'm in Salt
58:56
Lake City, I'm going to drop by and we're gonna hang out and have a drink together for sure. That would be fantastic. Thanks for jumping on. Thanks for
59:01
everybody listening. Uh feel free to send more questions in as you uh as you come to them afterwards uh through uh
59:08
through our Slack community. Uh you can send them in through our our website or or anything else. Happy to to answer
59:14
more questions. But I will close by saying you can get on Zerve and you can try it out. It's free to use. Uh get in
59:19
there and send us your feedback. We'd love to hear what you think. Thanks, Gonzo. And we will sign out.


