Earlier this week, we hosted a livestream on coding agents and why context changes the game for data science.
Two of our co-founders, Greg Michaelson and Jason Hillary, walked through real problems with generic coding assistants, showed a live build in Zerve, and answered practical questions on ETL, modeling, and scale. Here is the recap:
What we covered
- What coding agents are and how they work inside IDEs or terminals
- Why autocomplete feels useful but clunky for data work
- Where hallucinations creep in when tools cannot execute code or see results
- How context windows, tool calls, and retrieval shape an agent’s output
- Why data science needs execution, state, and results at every step
- A live demo of Zerve’s agent building an EDA workflow as a DAG
- Risks, guardrails, and sane permissions for agent autonomy
- What the team wants next from agent UX and reliability
Why context matters in data science
Software development follows stable patterns. Data science does not. You need to inspect data, run code, see outputs, and adjust. That loop creates context across variables, schema, charts, and interim findings. Without that context, an assistant guesses. You lose time verifying work and chasing errors. With it, you move faster from idea to impact.
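To make that loop concrete, here is a minimal sketch of the inspect-run-adjust cycle in pandas. The file name and column names are hypothetical; the point is that each decision comes from an output you actually saw.

```python
import pandas as pd

# Hypothetical dataset and columns, for illustration only.
df = pd.read_csv("orders.csv")

# Inspect: the schema and a quick summary become part of your working context.
print(df.dtypes)
print(df["order_total"].describe())

# Adjust based on what you actually saw, e.g. a parsing problem in a date column.
df["order_date"] = pd.to_datetime(df["order_date"], errors="coerce")
missing_dates = df["order_date"].isna().sum()
print(f"Rows with unparseable dates: {missing_dates}")

# The next step depends on these results, not on a guess about the data.
```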
The coding agent landscape
Autocomplete helps unblock you. Chat-based code generation can draft files. Both struggle with half-done projects and long-running analysis. The root issue is context. Many tools cannot execute notebooks, track state, or read outputs. They infer results from file names or popular datasets and slip into confident nonsense. Hallucinations show up as invented APIs, unexpected styles, and insights that never ran.
What we showed in Cursor
Cursor indexed a small project and produced a Streamlit app. It did well with planning, file edits, and boilerplate. It could not execute notebook cells, so it proposed EDA findings before running any code. That is fine for scaffolding. It is not enough for reliable analysis.

What we showed in Zerve
Jason ran the same EDA request in Zerve. The agent planned four steps, executed each block, and passed results forward in a DAG. Every code cell ran in the cloud. Outputs, charts, and data frames fed the next step. The agent updated its context after each run. It summarized real results, not guesses. You can schedule runs. You can version work in Git. You can scale in parallel and keep working while jobs run.
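Zerve's blocks, cloud execution, and scheduler are its own. The sketch below is only a plain-Python illustration of the underlying idea: each step runs, and its real outputs become the input for the next step. All file, function, and column names are hypothetical.

```python
import pandas as pd

# Step 1: load - the raw dataframe is the first piece of shared state.
def load(path: str) -> pd.DataFrame:
    return pd.read_csv(path)

# Step 2: profile - summaries computed from real data, not inferred from file names.
def profile(df: pd.DataFrame) -> dict:
    return {
        "rows": len(df),
        "nulls": df.isna().sum().to_dict(),
        "dtypes": df.dtypes.astype(str).to_dict(),
    }

# Step 3: clean - decisions here depend on the profile produced upstream.
def clean(df: pd.DataFrame, stats: dict) -> pd.DataFrame:
    mostly_null = [c for c, n in stats["nulls"].items() if n > 0.5 * stats["rows"]]
    return df.drop(columns=mostly_null)

# Step 4: summarize - the findings reflect what actually ran.
def summarize(df: pd.DataFrame) -> pd.DataFrame:
    return df.describe(include="all")

# Wire the steps as a simple linear DAG: each block's output feeds the next.
raw = load("orders.csv")  # hypothetical file
stats = profile(raw)
cleaned = clean(raw, stats)
report = summarize(cleaned)
print(report)
```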

Guardrails and safety
Power needs limits. Zerve narrows permissions, sandboxes execution, and supports read-only credentials for risky data sources. Agents can be judged by separate evaluators before changes are accepted. You see what ran. You see why it ran. You decide what ships.
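As a rough illustration of the permissions idea (not Zerve's actual implementation), an agent-proposed command can be checked against an allow list before it ever executes:

```python
# Illustrative only: a minimal allow-list gate for agent-proposed shell commands.
ALLOWED_COMMANDS = {"python", "pytest", "pip"}  # hypothetical policy

def is_allowed(command: str) -> bool:
    """Permit only commands whose executable is on the allow list."""
    executable = command.strip().split()[0]
    return executable in ALLOWED_COMMANDS

proposed = "pip install pandas"
if is_allowed(proposed):
    print(f"Would run in sandbox: {proposed}")
else:
    print(f"Blocked: {proposed!r} is not on the allow list")
```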
ETL and modeling
Agents fit ETL well. Discovery, transformation, joins, and validation benefit from repeatable plans and parallel blocks. Zerve’s DAG, scheduler, and Git flow help teams move from notebook experiments to production runs. For modeling, context windows and catalog links help with feature understanding and leakage checks. Expect stronger data discovery across large warehouses as the next wave.
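A hedged sketch of the validation idea in plain pandas (tables and columns are made up): a transform step followed by explicit checks that fail loudly instead of passing bad data downstream. Each function could run as its own block in a pipeline, on a schedule, or in parallel across partitions.

```python
import pandas as pd

# Hypothetical toy inputs, for illustration only.
orders = pd.DataFrame({"order_id": [1, 2], "customer_id": [10, 11], "total": [50.0, 75.0]})
customers = pd.DataFrame({"customer_id": [10, 11], "customer_name": ["Ada", "Grace"]})

def transform(orders: pd.DataFrame, customers: pd.DataFrame) -> pd.DataFrame:
    # Join on a shared key.
    return orders.merge(customers, on="customer_id", how="left")

def validate(df: pd.DataFrame) -> pd.DataFrame:
    # Fail loudly instead of passing bad data downstream.
    assert df["customer_name"].notna().all(), "orders with no matching customer"
    assert not df["order_id"].duplicated().any(), "duplicate order ids after join"
    return df

result = validate(transform(orders, customers))
print(result)
```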
Scaling context without overload
Context windows keep growing, but size is not the only answer. Zerve refreshes context at each step with what matters now. It reads state, outputs, and metadata, and compacts history when needed. The goal is enough signal for the next action, not a dump of everything.
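How Zerve compacts context internally is its own implementation; as a generic illustration of the principle, a step history can keep the most recent results verbatim and reduce older ones to short summaries. The step names and outputs below are toy placeholders.

```python
# Illustrative only: keep the last few step results in full, compact the rest.
def compact_context(history: list[dict], keep_last: int = 2) -> list[dict]:
    older, recent = history[:-keep_last], history[-keep_last:]
    compacted = [
        {"step": h["step"], "summary": h["output"][:80] + "..."}  # crude truncation stand-in
        for h in older
    ]
    return compacted + recent

history = [
    {"step": "load", "output": "loaded N rows and M columns; several columns mostly null ..."},
    {"step": "clean", "output": "dropped the mostly-null columns and coerced the date column ..."},
    {"step": "plot", "output": "weekly totals trend upward with one outlier week ..."},
]
for item in compact_context(history):
    print(item)
```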
Productivity and learning
Agents help experts move faster. They give newer users patient guidance and runnable examples. The best gains come when the tool can run code, see outputs, and adapt its plan. That is where Zerve invests in execution, orchestration, and a clean UX you can interrupt or guide.
Key takeaways
- Data science needs agents that execute, observe, and iterate with state
- Context must include code, data, outputs, and decisions across steps
- Cursor excels at scaffolding code but stalls on notebook execution
- Zerve’s context-first agent runs code in a DAG, updates its context, and summarizes real results
- Guardrails, permissions, and evaluation keep autonomy safe
- ETL and scheduled pipelines are strong early wins for agentic workflows
FAQs
What is a coding agent?
A coding agent is an AI that plans tasks, edits files, runs commands, and iterates toward a goal. It goes beyond chat and autocomplete by taking actions through tool calls.
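As a generic sketch (not any specific product's API), the plan-act-observe loop looks roughly like this. The planner here is a stub standing in for a real model, and `run_code` stands in for sandboxed execution.

```python
# Illustrative skeleton only: an agent loop that plans, calls a tool, observes the
# result, and re-plans until the goal is met.
def fake_planner(context: list[str]) -> tuple[str, str]:
    # A real agent would ask a model for the next action; this stub runs one
    # tool call and then finishes.
    if any(line.startswith("run_code ->") for line in context):
        return ("finish", "done: inspected the dataframe shape")
    return ("run_code", "print(df.shape)")  # pretend this inspects a dataframe

def run_code(source: str) -> str:
    # Stand-in for sandboxed execution; a real system would capture stdout safely.
    return "(120000, 14)"

TOOLS = {"run_code": run_code}

def run_agent(goal: str, max_steps: int = 5) -> str:
    context = [f"Goal: {goal}"]
    for _ in range(max_steps):
        name, arg = fake_planner(context)
        if name == "finish":
            return arg
        result = TOOLS[name](arg)
        context.append(f"{name} -> {result}")  # the observed output feeds the next plan
    return "stopped: step limit reached"

print(run_agent("describe the dataframe"))
```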
Why do generic tools hallucinate during EDA?
They often cannot execute notebook cells or read outputs. Without results, they guess from file names or common datasets, which yields confident answers that were never run.
How is Zerve different from Cursor for data science?
Cursor is strong at scaffolding code. Zerve executes each step as a DAG, reads outputs and state, and updates context before moving on. Findings reflect what actually ran.
Can I use Zerve for ETL and scheduling?
Yes. Build pipelines as blocks, run them in parallel, version in Git, and schedule with cron or built-in intervals.
How large is the context window in practice?
Zerve uses the underlying model's limits and typically operates in the tens of thousands of tokens. It updates and compacts context across steps to keep the right signal.
How does Zerve reduce risk from agent autonomy?
Sandboxed execution, least-privilege credentials, allow lists for commands, and evaluator checks before accepting changes.
