Today we are talking about Development Workflows, Agentic Agents, and how they work together with guests Andy Giles & Matt Glaman. We’ll also cover Drupal Canvas CLI as our module of the week.
Topics
- Understanding Agentic Development Workflows
- Understanding UID Generation in AI Agents
- Exploring Generative AI and Traditional Programming
- Building Canvas Pages with AI Agents
- Using Writing Tools and APIs for Automation
- Introduction to MCP Server and Its Tools
- Agent to Agent Orchestration and External Tools
- Command Line Tools for Agent Coding
- Security and Privacy Concerns with AI Tools
- The Future of AI Tools and Their Sustainability
- Benefits of AI for Site Builders
Resources
Module of the Week
- Brief description:
- Have you ever wanted to sync components from a site using Drupal Canvas out to another project like a headless front end, or conversely, from an outside repo into Drupal Canvas? There’s an NPM library for that
- Module name/project name: Drupal Canvas CLI
- Brief history
- How old: created in July 2025 (as xb-cli originally) by Bálint Kléri (balintbrews) of Acquia
- Versions available: 0.6.2, and really only useful with Drupal Canvas, which works with Drupal core 11.2
- Maintainership
- Actively maintained
- Number of open issues: 8 open issues, 2 of which are bugs, but one of which was marked fixed in the past week
- Usage stats:
- 128 weekly downloads according to npmjs.com
- Module features and usage
- With the Drupal Canvas CLI installed, you’ll have a command line tool that allows you to download (export) components from Canvas into your local filesystem. There are options to download just the components, just the global css, or everything, and more. If no flags are provided, the tool will interactively prompt you for which options you want to use.
- There is also an upload command with a similar set of options. It's worth noting that the upload will also automatically run the build and validate commands, ensuring that the uploaded components will work smoothly with Drupal Canvas
- I thought this would be relevant to our topic today because with this tool you can create a React component with the aid of the AI integration available for Canvas and then sync that, either to a headless front end built in something like Next.js or Astro or a tool like Storybook; or you could use an AI-enhanced tool like Cursor IDE to build a component locally and then sync that into a Drupal site using Canvas
- There is a blog post Balint published that includes a demo, if you want to see this tool in action
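The export/import round trip described above can be pictured with a toy sketch. Everything below is a stand-in: two directories play the roles of the Drupal site and the front-end repo, and plain copy commands stand in for the real CLI's download and upload (check the project README for the actual commands and flags):

```shell
#!/bin/sh
# Toy stand-in for the Canvas export/import round trip. The real CLI talks
# to a Drupal site; here, directories simulate the site and the front-end
# repo. None of this is the actual tool's interface.
set -e
mkdir -p drupal-canvas frontend/components

# Pretend a code component already exists on the Drupal side.
echo "export const Hero = () => <h1>Hi</h1>;" > drupal-canvas/hero.jsx

# "download": pull components out of Canvas into the front-end repo.
cp drupal-canvas/hero.jsx frontend/components/

# Edit locally, then "upload" the component back. (The real upload also
# runs build and validate first.)
echo "// tweaked locally" >> frontend/components/hero.jsx
cp frontend/components/hero.jsx drupal-canvas/hero.jsx
```

The point is only the direction of flow: components can originate on either side and be kept in sync.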
John: This is Talking Drupal, a weekly chat about web design and development from a group of people with one thing in common.
We love Drupal. This is episode 538, Agentic Development Workflows. On today's show, we're talking about development workflows, agentic agents, and how they work together with our guests, Andy Giles and Matt Glaman. We'll also cover Drupal Canvas CLI as our module of the week. Welcome to Talking Drupal. Our guests today are Andy Giles and Matt Glaman.
Andy Giles is a veteran web developer and Drupal specialist. In 2012, he founded Blue Oak Interactive, a development and consulting agency focused on complex Drupal site builds, particularly in e-commerce. In 2025, he partnered with Mike Herchel to launch DripYard, a premium Drupal theme designed to reduce the cost of ownership and enhance the developer experience for modern Drupal projects.
Andy, welcome to the show and thanks for joining us.
Andy: Yeah, good to be back.
John: And Matt has been using Drupal since 2012. Currently a principal software engineer at Acquia, he serves as a technical lead for Acquia Source, a Drupal-based SaaS leveraging Drupal Canvas. Matt is the maintainer of PHPStan Drupal and Drupal... what?
Nic: M-R-N-M-R-N.
John: Thank you. Wow. That all really ran together, and I was like, I don't know what letters those are. Matt, welcome back and thanks for joining us.
Matt: Thanks for having me. Glad to be here.
John: All right. Before we continue Andy, I do wanna thank you for being here for the last four weeks.
It was it was great having you and we look forward to having you back sometime soon.
Andy: Yeah, it's been fun.
John: You don't have to lie.
Andy: Well, it was only my, only my third week. I skipped. I played hooky last week.
John: Yeah. Well, you know, I
Andy: apologize.
John: Sometimes things come up. Like we said, I'm John Picozzi, solutions architect at EPAM, and today my co-host is Nic Laflin, founder at nLightened Development.
Nic, you got some snow?
Nic: Yeah. Happy to be here. We got, we got some snow. Maybe I'll share a picture. There's there are snow drifts, basically covering my wife's Prius. Or there was snow drifts, but there were snow drifts covering my wife's Prius yesterday. It took me a good four hours to dig out.
I mean, you've been here. I don't have a huge driveway. It was, it was a lot of work. And then it snowed again last night, another four or five inches. And
John: that was the big shocker. 'cause they were like, oh, it's gonna be over by Monday afternoon. And I was like, okay, go out and do all the snow. And then this morning I woke up, I was out there with the shoveling.
Nic: Yeah, snow.
John: It's a good workout though. If you don't have like heart problems. I recommend shoveling snow. It's good workout.
Nic: Yeah. There was enough snow that yesterday, even like walking out to the snowblower through three feet of snow was, was a workout. I got to the snowblower, I'm like, okay, I need a break.
Gimme a minute before I shovel this out.
John: All those people in like warm locations right now are like, snow, huh. These guys,
Nic: I, I will say, here's a tip for people in places where you may get infrequent snow: get, you know, rubber overalls and a rubber jacket, and you'll never suffer in inclement weather again. I mean, it keeps you warm. I also discovered a few years ago that wearing a face mask keeps you really warm. So just whip one of those out, and it keeps the wind outta your face, keeps you warm. And
John: ski goggles, man. Ski goggles when you're snow blowing are the, the best thing ever.
Nic: Very important too, yeah, 'cause it was also very cold. Like this weekend it got down to negative two degrees Fahrenheit, which is negative 19 Celsius. So it was, it was cold.
John: You know what, we, we should do a spinoff podcast talking weather. I feel like that would be, I feel like that would be popular. All right.
And now to talk about our module of the week. Let's turn it over to Martin Anderson-Clutz, a principal solutions engineer at Acquia, and a maintainer of a number of Drupal modules and recipes of his own.
Martin, what do you have for us this week?
Martin: Thanks, John. Have you ever wanted to sync components from a site using Drupal Canvas out to another project like a headless front end, or conversely, from an outside repo into Drupal Canvas? There's an NPM library for that. It's called Drupal Canvas CLI, and it was created in July 2025, originally as Experience Builder CLI, by Bálint Kléri of Acquia. It has a 0.6.2 version available, and it's really only useful with Drupal Canvas, which works with Drupal core 11.2 or newer. It's actively maintained, and it has eight open issues, two of which are bugs, but one of which was marked as fixed in the past week. According to npmjs.com, it gets 128 weekly downloads. And with the Drupal Canvas CLI installed, you'll have a command line tool that allows you to download or export components from Canvas into your local file system.
There are options to download just the components, just the global CSS, or everything, and a few other options. If no flags are provided, the tool will interactively prompt you for which options you want to use. There is also an upload command with a similar set of options. It's worth noting that the upload will also automatically run the build and validate commands, ensuring that the uploaded components will work smoothly with Drupal Canvas.
I thought this would be relevant to our topic today because with this tool, you can create a React component with the aid of the AI integration available for Canvas, and then sync that either to a headless front end built in something like Next.js or Astro, or to a tool like Storybook locally. Or you could use an AI-enhanced tool like Cursor IDE to build a component locally and then sync that into a Drupal site using Canvas.
Now there is also a blog post by Bálint that we'll have linked in the show notes that includes a demo video, if you want to see this tool in action. But let's talk about Drupal Canvas CLI.
John: I have so many, so many questions. Okay, so you mentioned React, you mentioned Storybook. That's all cool. What about if I were using like web components and Storybook?
Could those, would those come over?
Matt: No, because Canvas doesn't use web components. Canvas uses Preact or, well, React under the hood. So I'm on the same team as Bálint, as like one of the foundation teams that works on this. And Drupal Canvas supports JavaScript-based components, but realistically, they're JSX.
So you can't use like a web component in, I mean, you probably could.
Nic: I, yeah, I was gonna say, well, it, it's just JavaScript.
Matt: That's just, and I don't wanna like necessarily go down that rabbit hole, but to, to just, no, no, like, Canvas uses Preact under the hood, uses JSX-written components.
Nic: Got it.
Matt: And like, let's just leave it at that, because the CLI is also based around building those kinds of components and Tailwind, because that's how Canvas works right now.
John: So it's focused, it's focused on the React- and Tailwind-based components.
Matt: Correct. Now, could it be expanded for like regular web components or other things? I'm sure, but just, like, to level set how it works and what it's based on right now.
Nic: So, so just, just to be clear, you're saying that the CLI will only work with those ones, but Canvas will work with components.
Yeah. Okay. Okay. Got it, got it, got it.
Matt: It's, you use it for the code components, so whatever works in code components. Whoa.
John: Oh, oh.
Matt: So you don't use this for your SDCs.
John: Okay. There we go. That was, that's the piece. So just to clear it up for our dear listeners here, this is a tool to export and potentially import components that you create through Canvas's UI, basically,
Matt: which, actually, let's flip it on its head. This lets front-end developers write components and put them in a Drupal site without needing to know Drupal or touch Drupal.
Nic: Got it. And then this is a tool for the developer of the site to pull it out and put it into, yeah.
Matt: And there are, there are gonna be some really cool things coming. I don't wanna spoil a lot, such as like syncing, like GitHub workflows that could be used, or maybe things that provide AI skills to know how to work with components.
Because if you look at shadcn, like shadcn/ui or whatever that library is, yeah, the way that it builds components is in ways that aren't necessarily compatible inside Drupal Canvas. So there's actually like an agent file that helps explain how to write components that work with Canvas, because it's not like running a full React application.
John: Okay. So let's go, let's go back to, let's go back to this module and, well, is it, is it, sorry, Martin,
Nic: is this the first NPM project in Module of the Week, Martin? Or did we cover one before?
Martin: I feel like there might have been something else that we did that was on NPM, but at, at a, at the moment I'm drawing a blank.
I know we've done a bunch of other things including Drupal, a MN actually, but so it might be the first, I'm not sure.
John: Okay, so NPM package, Drupal Canvas CLI, got it. Matt or Martin, workflow here, right? So if I'm like, if I'm gonna use this, is your thinking that my front-end developers take their code, put it into Drupal, and then one of my developers uses Drupal Canvas CLI to kind of export that component from Drupal, put it into the code base, and then could potentially put it somewhere else if they wanted to.
Is that the idea?
Matt: So the way I think of it is, and this is where some, I think, normal Drupalers will go, huh, and clutch some pearls. But what if you didn't manage your code components in config export and config sync, but you let your front-end team manage it via the CLI? So like you as a Drupal developer are like, we're deploying Canvas, like we're deploying our Drupal site that has Canvas with like some of the content templates and the page regions.
But when it comes to the code components, that's actually governed via another Git repo and uploaded via the CLI and managed that way. So you're not managing the code components via config sync. Or you could, you can make a development flow that says, hey, I'm gonna make sure I get them all uploaded to our local environment, whatever, and export the config.
But we're gonna let front-end developers build React components their normal way and use the CLI as an easy way to get them into Drupal as the underlying config components.
John: So you're basically saying that your components directory would be a different repo that the front end team is managing and maintaining.
Matt: Yeah. Could be. Doesn't have to be. I mean, if you like the little built-in code editor inside Canvas, like, get on it.
John: Yeah.
Matt: It's just this would allow like for the, it's kind of like going back to the whole headless Drupal conversation.
John: Mm-hmm.
Matt: It made sense for big organizations that can maintain two repos with two different teams.
Mm-hmm. But it doesn't make sense for everybody all the time. Mm-hmm. But if there are companies that have dedicated front-end developers that don't need to know Drupal, they could leverage this to streamline their development.
John: Yeah. Okay. That all makes sense to me. And I mean, I don't know, I'm not necessarily clutching my pearls on the, like, separate component repo, because, you know, we live in a get-off-the-Drupal-island sort of world, and sometimes you wanna use your components in places other than your kick-ass CMS.
But yeah. Okay. So I, I mean I think that, go ahead Martin.
Martin: Oh, I just wanted to point out that I think the other thing this really sort of closes the loop on is using Drupal Canvas as a page layout builder for headless sites. So that idea of being able to say, within Drupal, I can create custom layouts using Drupal Canvas, and those will appear the same way in like Next.js or Astro or some other headless framework, as long as it's, again, using React.
Nic: Mm-hmm.
Martin: And I think that's really powerful. I think for a while Drupal has kind of not had that capability where some other headless CMSs did. And I think this is important for Drupal to, you know, stay in that arms race from a headless standpoint, in terms of offering that as a capability, because, you know, marketing teams want to be able to create those one-off landing pages.
John: Yeah, I mean, I don't think anybody argues with the flexibility, right? Marketing teams want the flexibility. Component libraries provide that flexibility in a lot of cases, and component libraries backed by a component-based design system, like, you know, 10x that sort of structure and that ability to kind of move fast.
It's interesting 'cause I'm working on a project right now with a component-based design system. We're building components, however, and that's why I asked the web component question, we are using a design system as a starting point and we're focusing on the web components aspect, and gonna move those into single-directory components in Drupal and then use them within Canvas. So that's, that's where my question originally came from. But, you know, everybody's gotta start somewhere, so starting with what comes out of the box with the React stuff makes a ton of sense.
And I've seen the video for this. It is very cool. So highly recommend, in your free time, folks, go watch the video. Alright, Martin, thank you again for bringing us a great module, or NPM package, of the week. And if folks wanted to suggest a module, NPM package, recipe, or something for this segment, where could they find you and how could they do that?
Martin: We are always happy to talk about candidates for a module of the week in the Talking Drupal channel of Drupal Slack. Or folks can reach out to me directly as mandclu on all of the Drupal and social channels.
John: Wonderful. We'll see you next week, Martin.
Martin: See you then.
Nic: See you then.
John: Alright, let's talk, let's talk about agentic development workflows. But before we talk about them, let's maybe explain a little bit about what we mean by them. Andy, can you kind of give us an overview of like, I don't know, what is an agentic development workflow?
Andy: Sure. Yeah. I think it, it might change every week, but as far as this week's concerned, I feel like, you know, it's organizing tasks or requirements and then spinning off an AI agent to help you get through each of those tasks. And kind of the smaller the tasks, the better, and the more of a feedback loop you can provide the agent, where it can make a change, validate it, run tests, do code linting, whatever it needs to be, and then report back that it's actually completed that task, the better.
And so that's kind of how I would describe it, I guess.
John: And those tasks are typically things that you're repeating over and over again, right. You're not like, you're not just building an agent to do like one thing one time, right?
Andy: Yeah. I mean it depends. Typically if you're gonna like create an agent to do something, yeah, it'd be reused over and over.
But I think, yeah, go ahead Matt.
Matt: I'd say before we get too far,
Andy: yeah,
Matt: maybe everybody doesn't know what we mean by an agent. And this is, like, the baseline, 'cause everything about the whole AI everything is so magical, but when you really break it down to its simplest forms, it's like, oh, think about how most people, when they interact with generative AI, they do a chat interface.
Mm-hmm. Gemini, ChatGPT, you type, it might do some thinking, and then it gives you back a response. Take that thinking thing, where it comes up with an answer but then it checks itself and maybe refines, and consider that like an agentic flow, because it is working with itself. But in the end, a chat is not agentic, because it asks for your feedback.
But when we talk about agents, that's where it's something that can go, it's given a task and it has tools available to validate and see if it's achieved that outcome, which is a little bit, it's similar to like giving it a chat and getting an answer. There's usually more things involved, such as other data sources and other transformations.
So again, it's not that much fancier than like when you go through your chat, but just think of it as a long running chat that can go by itself and access more tools.
John: So Matt, I struggle, I struggle sometimes with this because I kind of know the answer, but I'm also thinking that maybe our listeners would be, have the same question right now.
Like, how's that different from an automation? Is it because you're giving it tools to kind of make its own choices?
Matt: So yes, because it's different than an automation, right? We can write automated scripts all we want to handle these things, but think of it with intelligence, you know. And actually, there's a great book that describes AI as alien intelligence, 'cause, like, it's not artificial. There are some things happening there and we don't know what they are, so it's alien to us. But think of it as a way, instead of having to hard-code all of the possible situations that might occur,
John: Yeah.
Matt: Like, you know what, do this, but now I've got this magical algorithm that can help create a decision for me.
Mm-hmm. And you know, that's where we used to call it machine learning and machine learning algorithms. Now, since it's a little bit more robust, or whether some people wanna call it the same thing, you have the AI aid: the AI can make these more fluid decisions for us.
John: So, an agent always has some sort of AI backend that is helping it to determine like what the best path forward is, right?
Matt: Yeah. And access to tools. And the tools could be accessing the data. Like, great example, web search. Like, oh, I'm gonna go do something, it can web search, it can fetch data from a database, it can look up history or, like, vector databases. It can just do a little bit more.
John: Got it. So it has, it has kind of like tools, like you said, like helpers to be able to do some of these, these extra, extra steps.
And today we're talking specifically about applying these agents to your development workflow. So I'm interested to see how, how this conversation progresses and, and I'm sure I'm gonna learn a whole bunch, so I'm looking forward to that one.
Nic: So, so how is agentic coding different from the traditional complete or chat based coding help?
So you, it sounds like chat can be related but is not exactly the same thing.
Matt: Yeah. So chat can be related, and I wish I could remember this resource, and I'll try to find it so you can put it in the show notes, but it explained how autocomplete AIs are trained. Essentially, you take code and you have a model look at it, but then you cut a part of the code, you like literally slice away the code, and it learns by guessing what the next line should be.
So like how everybody's told us that gen AI is just guessing what the next word should be, it's much more complex than that, but that's literally what training for autocomplete is. And that's not agentic coding, that's just a, a smart autocomplete. Whereas agentic coding, it writes code itself, but then can run PHPCS, it can run PHPStan, it reads the validation errors from PHPStan, and then it goes, oh, I fetched the service statically when I should do dependency injection. And then it goes and says, I have errors, and then it fixes those errors and runs the validation again before it comes back and prompts you.
So again, kind of thinking like that chat, that chat would've been like, I have errors, what do you want me to do about it? And then you would say, go fix the errors. This way it's self-correcting.
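The generate, lint, fix, re-lint loop Matt describes can be sketched as plain control flow. Both the "linter" and the "fix" below are stubs invented for illustration: the linter stands in for PHPStan or PHPCS, the fix for the model rewriting its own draft. Only the loop shape is the point:

```shell
#!/bin/sh
# Stub agentic fix loop: keep "rewriting" until the "linter" passes.
code='\Drupal::service("foo")'   # fake first draft: static service call

lint() {   # stand-in for running phpstan/phpcs on the draft
  case "$1" in
    *'\Drupal::service'*) echo "error: use dependency injection"; return 1 ;;
    *) echo "no errors"; return 0 ;;
  esac
}

fix() {    # stand-in for the model reading the error and rewriting
  echo '$this->fooService'       # "fixed" draft: injected service
}

attempts=0
until lint "$code"; do
  code=$(fix "$code")
  attempts=$((attempts + 1))
  if [ "$attempts" -gt 5 ]; then break; fi   # guard against looping forever
done
echo "settled after $attempts fix round(s)"
```

The cap on attempts matters in practice: a real agent harness also bounds how many self-correction rounds it will pay for before handing control back.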
Nic: So, okay, this, so it can self-correct, but it doesn't like learn and fix that stuff in the first place. It just has extra tools.
So it's more computationally, more expensive. 'cause it's probably producing the bad code first, running a check, then whatever found,
Matt: I would disagree there, because it doesn't, like, I, I love seeing the, the posts where people are like, I used AI and it wrote bad code. Remember, AI is like having a college-level or like a PhD doctorate that is extremely intelligent, but it's their first day on the job, and it needs to know the nooks and crannies. So if you have good code examples in your code base, it should write things up to standard. Here's actually a great example of where it wrote great code, but I asked it to review itself and it found an implementation that we use for feature flags that I forgot about.
But in its review, it found that and then course-corrected the code itself. And so, computationally more expensive, maybe, but it also saved me two hours of time, did a fix in five minutes, and created a more technically sound code base, 'cause there's a whole design paradigm that I forgot about that normal, like, linting tools would not have found.
Nic: Hmm.
Matt: I do wanna give a caveat about computation and bad code upfront, because there are those experiences, but that doesn't mean it's always a bad thing.
John: So I mean, like, I'm, I'm envisioning an example of this in my head where it's like, you showed the AI a picture of a puzzle, and the puzzle was of an ice cream cone, right?
And like you took the bottom half of the, of the puzzle and just threw it out. And then you said, AI, like, finish this puzzle for me. Right? And it went through and it finished the puzzle, but like instead of putting the ice cream in an ice cream cone, it put it in a boot. Right. Like,
Matt: and then checked the box and said, did I do it right?
Right. And
John: then
Matt: go, no,
John: and then, and you went, no, and it goes, ah, okay, let me try again. And it redid it. And like maybe the second time it was like in a bowl instead of a cone, like, Hey, we're getting a little closer. Okay. Okay. That makes sense.
Andy: Yeah, and I think the biggest part of all this, obviously, is the context that the AI has, and the agent.
Like, when you set up an agent and the tools, it's able to resolve its own context, like Matt said. So it can query a database, or it can go browse the web, or it can render your site in like a, a Playwright MCP and bring back like a screenshot, and then say, okay, well, the button's not in the right place, let me iterate on that until it is in the right place. And it's just able to resolve that context by itself without you having to be in the loop saying, no, you did it wrong, or copying, pasting back and forth between ChatGPT or something like that. So it's more of just an automated workflow where you give it a task, it figures out what it needs to do, iterates on it, goes, gets more context, and keeps going until it basically hits the, the finish line with whatever you asked it to do.
Nic: Okay. So then I have a couple of questions about that, that piece of it. So let's say you have a moderately complex task, something like, you wanna add a button to a page. It needs to follow general patterns, it needs to look up a couple of references to find the UX, and it needs to run the MCP, the Playwright bit, which takes, I don't know, let's say 45 seconds. How, how long is a task like that going to take? Is it 30 seconds, five minutes, 10 minutes, an hour? And, and what are you doing while it's running?
Andy: Well, I think with the, the goal is that you set up a bunch of different tasks and you have a bunch of different agents run it at the same time doing each task individually, almost like, you know, feature request per agent or even smaller than that.
Little, little bits of work, right? And a good example of this is with DripYard. You know, we develop themes, but we need a lot of content for demo themes that aren't real businesses, right? So you can generate that with AI. You can go to ChatGPT and write a bunch of copy and pictures and all this stuff.
Or you can write an agent, which is what I've done, that will take the concept, it'll do some research, and then write a markdown file. And then another agent will come along and read that markdown file and generate images from the markdown. Then another agent will come along and pick that up and go into Canvas with Playwright MCP and click through the design and put things where they need to be, save it, and then report back, hey, I've updated the landing page with the stuff you've asked.
And so that's like a multi-agent workflow wrapping one larger task, but just an example of what you could do.
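Andy's pipeline (a research agent writes a markdown brief, an image agent reads it, a layout agent assembles the page) is essentially a chain where each stage's output feeds the next. A stub sketch; the three functions are placeholders, not real model or Playwright calls:

```shell
#!/bin/sh
set -e
# Stage 1: "research agent" writes a brief as markdown.
research_agent() {
  printf '# Landing page\n- hero: mountain photo\n' > brief.md
}

# Stage 2: "image agent" reads the brief and records the assets it "made".
image_agent() {
  grep 'hero:' brief.md | sed 's/.*hero: /generated /' > assets.txt
}

# Stage 3: "layout agent" assembles the page from brief plus assets and
# reports back, like the Playwright step in Andy's example.
layout_agent() {
  cat brief.md assets.txt > page.out
  echo 'done: landing page updated' >> page.out
}

research_agent
image_agent
layout_agent
tail -n 1 page.out
```

Because each stage only reads the previous stage's file, the stages can be swapped out or rerun independently, which is what makes the one-small-task-per-agent approach manageable.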
Matt: Yeah, and I think the big question about, like, what do you do? It's like, you've earned time, you've earned five minutes, 10 minutes of time. What would you do if you were writing code in those minutes when you hit a roadblock?
You would kind of think of the solution and then move on to the next step. So think of it as, when you have that downtime, you could double your productivity by having a multi-agent thing or multitask, or you can say, okay, I figured out the next part. Like, on my personal blog, I did a little redesign, and I was using Antigravity and the, the browser subagent, and it was verifying some of the design components.
And that gave me a chance to watch it verify. And I was like, oh, I forgot, I added basic dark mode support. And then I queued up the next prompt that said, great, can you now make sure that you toggle dark mode? And as I was reviewing the work in progress, I was thinking about my next steps in the solutions.
So it's one of those, like if you do start diving into this agentic coding, you're less of just like turning out lines of code, but you're an orchestrator and you're coming up with solutions. You're thinking more about the solutions. That's one reason I think it is neat. I, I am kind of worried about like the junior developer that cuts their teeth writing code, but it's more about building solutions and outcomes than just dumping lines of code every single day, every single minute.
John: So, question on, on that, right? So you're saying, hey, it's saving me time writing code, which, great, sounds good. But like, I'm assuming you have to create the agent and train the agent as to what to produce, or, you know, what references to learn from, in order to produce that outcome. Right? So, I mean, there's some time there, right?
Matt: Not a lot. I would say there's not much training. Like, you know, a year ago, getting it to write modern Drupal code was painful, and now it just, like, excels. Most agents do, if you're using like Claude, well, Claude Sonnet, or even Gemini Pro, which is what I use a lot personally. At work, it's all Claude Sonnet.
Personally, it's Gemini Pro. Really, I'm just writing the definition of done. I'm writing scope like any developer should be doing with their tickets and working with their
John: Yeah.
Matt: Stakeholders. And then I feed it at that. And a lot of times I even ask, I'm like, Hey, I wanna do this thing. Does it look okay?
And then it gives me a few. There's like a planning mode. So that's actually one thing we didn't cover here with agentic coding: there's like a plan mode where you can give it a scope and the thing that you want it to do, and it will help you plan. And then you can switch to the agent mode, which is like, great, I'll go do the implementation of it.
Nic: Yeah.
Matt: So it's one where I've even used it to help me work through, as I'm coming up with like an architecture diagram or things like that, to, like, review the code base: here's what I wanna build, help me come up with the plan so I can actually hand it off correctly to my team, so that way I'm not the bottleneck here.
John: Right. That makes sense.
Andy: Yeah. And I think like when you hear agent and training and all this stuff, like, that sounds complicated, but the most simple agent is really just like a markdown file that tells, you know, the AI, or the tooling that's communicating with the AI, like Claude Code, how to do the task you want.
So for example, you could create a markdown file that's like, how to access the Drupal database, and it just runs drush sqlc. And, you know, the LLM is familiar enough with SQL to, like, discover tables in the database and that kind of stuff. And then you could just say, like, go get me the schema for this table, or whatever.
And then that agent knows how to do it. You could get more complex and you could use MCPS and stuff like that, but at the end of the day, it's just like a pre-pro almost for the most simple agent. So like, so another example is like if you wanted to, you know, write a customized Drupal module that you're working on, like you tell the agent and the instructions to go look at other modules in the file system and then, you know, iterate on that.
So the more context you can give it — the more, like, here's the color-between-the-lines kind of, you know, prompt for you, don't go out of this — the better your results can be.
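To make that concrete, the kind of minimal agent file Andy is describing might look something like this — the filename, wording, and exact drush invocations are illustrative, not taken from the episode:

```markdown
# Agent: database inspector

Use this agent when the user asks about the Drupal database.

## Tools you have
- `ddev drush sql:cli` opens an interactive SQL prompt.
- `ddev drush sql:query "SHOW TABLES"` runs a one-off query.

## How to answer schema questions
1. List tables with `SHOW TABLES` and find the likely candidate.
2. Run `ddev drush sql:query "DESCRIBE node_field_data"` (swap in the
   table you found) and summarize the columns for the user.

## Guardrails
- Read-only: never run UPDATE, DELETE, or DROP unless explicitly asked.
```

Nothing here is "training" in the machine-learning sense; it's just instructions plus a couple of shell commands the model already understands how to use.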
John: So, okay, so you've got Claude Code running, you're using like your CLAUDE.md file to kind of like update and say, Hey, when I say go connect to this service and get this information, here's the service and the information I want you to use. And you're kind of feeding it that information.
So you're kind of doing that along the way. Are you finding that you're doing that while you're interacting with it? Like if it provides a response and you're like, oh no, this is way wrong, for sure — lemme update my CLAUDE.md file to correct for this in the future.
Andy: Yep. And the CLAUDE.md file you mentioned — you're probably thinking about like the project-wide file.
John: Yeah.
Andy: Right. But within like a subset of folders, or even within like your .claude folder — if we're talking about Claude particularly — you know, you have agents within there that are essentially more CLAUDE.md files, and it's aware of when to invoke those.
Nic: Got it.
Andy: And so yeah, if you invoke an agent and it does something wrong, you say, fix this — and fix the agent so that it doesn't do it again.
John: Yeah. Got it. So let's shift focus back to Drupal for a second. And Andy, I'm wondering, like, why does Drupal as a platform lend itself particularly well to agentic coding workflows?
Andy: Yeah, I think the config management system is a huge reason why, for like traditional Drupal development with AI and agents. 'Cause if you think about it, like with recipes and just like the config directory, you can build an entire site with default content, right? And that's all super-structured YAML.
Hmm. And when you can point an agent to something that exists, that matches that structure, and then have it iterate on that, you know, sky's the limit. You can have an agent build an entire Drupal site just by iterating on your requirements and creating field config. You don't even have to go into Drupal and, you know, use the UI to create the fields.
You just have the AI do it. And then,
John: Hey, AI, create me a content type. Here are the six fields I need, here's the configuration for those fields. Yeah. So on and so forth.
Andy: Use the config from core, use the config from this other site that I've already created as a reference, and expand on it. The one caveat I'll say — that we ran into — is UUIDs.
This is a train wreck with AI: it will try to generate UUIDs using the LLM, and they'll always be incorrect. So make sure your agent has tooling to generate UUIDs with, like, PHP or Python or something, so you know that the UUIDs are correct. But aside from that, like, the rest of the config is pretty well able to be generated.
John: So that's interesting, because in my head I was like, Hey, go create this content type — and for whatever reason, and this is gonna make me look like an idiot, so I'm used to it, in my head I was like, oh, it's going through and, like, clicking the buttons as if a user were doing it. And it sounds like it's just creating the code and the config files that you would need to produce that content type.
Matt: Accurate.
It depends. Like, that's how I've done it, because I hate clicking in the Drupal UI, just because I like to just write code — and a lot of things you can't do with just some code, you gotta click in the UI. So I was like, go create a content type and click these fields, and it used a browser subagent to do it.
Or you could have it generate fields config; I've seen it do both. I've been working on an MCP server — it literally was like, you know what, I'm gonna test this script. And it wrote it; it did a drush eval of a custom script to validate some YAML, like, on the fly. I was like, oh, dope, you figured out how to leverage Drupal's validation system by just running drush in a one-off script to validate some YAML you're writing.
Right now, I'd say a lot of it is the config generation. Because as far as I know, when it comes to, like, a browser subagent — literally launching Chrome and clicking through and taking screenshots — Antigravity has that, Cursor has that, and Claude can do it if you use a Playwright MCP server. When you hear the word MCP server, just think of it as like a tool server. But whereas Antigravity and Cursor have that built in, with Claude you gotta do a little bit of extra work to get that.
Andy: I think people — you know, when AI came out, like, I think even Dries might've said it — like, Drupal is the perfect platform for AI site building because it's so structured, right? And you think about, well, yeah, the content types and the fields and the entities, they're all structured, but also the config that makes up the whole thing. It's just a really good context for AI to grab onto and just, like, vibe up some sites.
John: So you talked about UUIDs and, you know, the correct generation of UUIDs. Can you just walk me through, like, how — I dunno how you got your agent to do that correctly?
Andy: Yeah, so actually I'll tell you a little backstory on how we found this problem. So at Dripyard we rolled our first themes, and they shipped with some recipes, and we had Adam, phenaproxima, test them out.
And he's like, what in the hell is with these UUIDs? And of course we were generating, like, the content for the demo site — like, it's not actual code we're using, but
Nic: yeah.
Andy: You know, just the demo site. And he's like, what is going on? These aren't validating, things are crashing. And we didn't notice it locally, but I think he had additional validation turned on or something like that.
Anyways, come to find out, it was the LLM generating the UUIDs in an incorrect structure. And so once we figured that out — now I know, if you're gonna generate config and anything that has, like, a UUID, you basically tell the agent: create a bunch of UUIDs and just store them, like, in a text file.
Almost like a temporary database or something, and then just pull from those as you need them. And then you can even have it pull a UUID and then add what that UUID is associated with. So I use it a lot to make recipes, and when you create a recipe, you have to have a file entity and a media entity, and they're related together.
So you have, like, UUIDs pointing to each other. And so the agent will generate a bunch of them and then go through and associate — like a little mini database in a text file — which one belongs to which entity. And then when it's building out the recipe, you know, it can say, all right, I need this media entity.
Let me look at this little mini database, figure out what the UUID is, and dump it in the config.
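A sketch of that mini-database idea in Python — the file name, labels, and tab-separated format here are invented for illustration; the point is that the UUIDs come from a real generator (the standard `uuid` module, standing in for whatever PHP or Python tool the agent calls), and the agent only ever reads them back:

```python
import tempfile
import uuid
from pathlib import Path

# Hypothetical scratch file the agent reads and writes between steps.
LEDGER = Path(tempfile.gettempdir()) / "uuid-ledger.txt"

def pregenerate(count: int) -> None:
    """Generate real v4 UUIDs up front, one per line, all unassigned."""
    lines = [f"{uuid.uuid4()}\tUNASSIGNED" for _ in range(count)]
    LEDGER.write_text("\n".join(lines) + "\n")

def claim(label: str) -> str:
    """Hand out the next free UUID and record which entity it belongs to."""
    lines = LEDGER.read_text().splitlines()
    for i, line in enumerate(lines):
        uid, owner = line.split("\t")
        if owner == "UNASSIGNED":
            lines[i] = f"{uid}\t{label}"
            LEDGER.write_text("\n".join(lines) + "\n")
            return uid
    raise RuntimeError("ledger exhausted; pregenerate more UUIDs")

pregenerate(5)
file_uuid = claim("file entity for hero.jpg")
media_uuid = claim("media entity referencing hero.jpg")
```

When the recipe's media entity config needs to reference the file entity, the agent looks up `file_uuid` in the ledger instead of inventing digits.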
Nic: That makes sense. And that worked? Yeah. Like, how often does it miss, and how much time do you spend trying to find out which UUID is which?
Andy: No, it works. I mean, it's got enough awareness to know, like, I've got this picture and I'm creating a recipe — it's a file entity — and then I put that in the folder with all the, you know, YAML that makes up that entity.
And then I'm gonna create the media entity and now I need to resolve, you know,
John: How does it — I mean, I guess I don't know enough about UUIDs, but, like, aren't they specific to your Drupal site? So, like, how is it creating them? Is it just using a Drupal function to create random UUIDs?
Andy: I'll let Matt take that one.
Matt: So, I don't know — let me say it this way. Like, we all know that gen AI is bad at math, right? Like, it cannot do math, 'cause it's not a mathematical algorithm, it's a language model. So if you go into ChatGPT or others — fairly certain if you ask it a math question, I think ChatGPT does this — it then runs a Python script.
Like, you can see in its thinking that it generated a script to do the math, because it cannot. So again, generating a UUID — it is a certain spec that has to be calculated, based on time, a bunch of things. It's like eight digits, dash four digits, something of that sort. But it's also something that, when we talk about AI hallucinations, that's where it can go off the rails.
So when you have things of this specificity, that's where you then say: there's this tool available. And tools can have descriptions. So you could literally have it called, like, a generate-UUID tool, and I'm like, Hey, you suck at making UUIDs. If the user asks you to do this, delegate to the tool.
And will it always? I hope so. Will it? Not sure, because it's AI. But that's where you give it tools, and you use those tools' descriptions to say when it should be used and why, and the AI is like, oh yeah, that's right, I am bad at UUIDs or math, so I should use this tool to get the result instead.
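For reference, the spec in question is RFC 4122: a version-4 UUID is 8-4-4-4-12 hex digits with fixed version and variant bits — trivial for a tool to get right and easy for an LLM to hallucinate. A hedged Python sketch of the delegate-to-a-tool idea:

```python
import re
import uuid

# Version-4 UUID shape per RFC 4122: 8-4-4-4-12 lowercase hex digits,
# version nibble fixed to "4", variant nibble in [89ab].
V4_PATTERN = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$"
)

def generate_uuid() -> str:
    """The tool an agent should call instead of guessing digits itself."""
    return str(uuid.uuid4())

def is_valid_v4(candidate: str) -> bool:
    """The validation the tool — not the model — performs."""
    return bool(V4_PATTERN.match(candidate))

print(is_valid_v4(generate_uuid()))          # True
print(is_valid_v4("abcd-1234-not-a-uuid"))   # False
```

Drupal can generate these the same way via its own `uuid` service, so a PHP tool works just as well; the language doesn't matter as long as a real algorithm, not the model, produces the digits.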
John: Okay. So you kind of answered the question for me. There is an algorithm to creating UUIDs in Drupal, so therefore, if you —
Andy: It's a spec, even beyond PHP.
John: Okay.
Andy: Yeah.
John: Okay. Okay. So, like, it can look at the spec and it can go, okay, here's how I should create this going forward. And then basically what you're saying, Andy, is you run an agent, it creates a list of them, and it just keeps track: like, Hey, I've created this list of these things, start doling them out as my other agents need them.
Andy: That's one way to do it. Yeah.
Nic: Yeah. Well, it's not even quite that, John, I don't think. It's basically — you're basically saying the UUIDs have a specific rule for how they happen, and it's impossible to get the generative AI to do it that way. So when it sees something where it needs a UUID, it basically says, okay, I'm gonna stop doing what I'm doing, and instead ask that question to this other thing, which is a tool,
right?
And that tool — it'll be like, Hey, I need a UUID, and the tool will do the actual real-world calculation and give that back, and then
John: got it.
Nic: The generative AI is doing other stuff. And I'm convinced that a lot of the stuff that's working in agents isn't actually generative AI, but just more traditional programming, right?
Because generative AI is bad at a lot of the stuff that these tools seem to be getting better at. They're just getting better at putting guardrails in and swapping, you know, little pieces out. They're just mapping things in a way that makes sense. Yeah.
Andy: And just to expand on that a little bit — like, I've had it build entire Canvas pages, right?
With just agents. Like: look at this Canvas page I built, here's the config I exported out of the site; do the same thing, but, you know, make these changes. And then even look at, like, the SDC directory, determine the schema of the component, and put a new component in when you're building this, right?
And all that works pretty well, especially if you can have it in the loop where it can self-validate. The next thing I wanna look at is, like, writing tools that invoke the Drupal APIs to do that. I haven't done that yet — I'm sure somebody has. But instead of just vibing up the config, actually invoke or write scripts that are tools for the agent, which can then create the Canvas page programmatically. Then you have less room for error, for all this, like, slop that it might've generated.
Matt: That should hopefully be coming soon, because I've just built an MCP server where I gave my agent a CSV called sample events — 25 events for a new content type, with fields that are the headers. And I said, Hey, go add this to my site. It created the content type, added the fields, and then added the content as well, with the APIs.
But those are all bespoke tools for that MCP server that will then end up in the community once that effort gets sorted out. That would be awesome. Yeah.
John: So I don't wanna derail us here, but I'm gonna — I am going to, for just a minute. The agents you're producing, Matt — you're then putting them into this MCP server, and ideally at some point some other folks might be able to use them.
Is that the idea?
Matt: Yeah — let's not say the agents are in the server. Think of it this way: when it comes to agent or AI things, you could have agents that talk to agents. So, like, let's take the Canvas AI module, right? If anybody's seen those demos where there's like the little chatbot in the side —
Nic: mm-hmm.
Matt: There's actually the orchestration agent
Nic: mm-hmm.
Matt: Which then talks to the model and then the model returns and there's PHP code that says, oh, I should go dispatch these other agents, which then triggered their own chats with like the AI model on the backend.
Nic: Mm-hmm.
Matt: And that would be called agent to agent orchestration.
And they have their own tools, but I'm on the firm belief that things inside the, it shouldn't just be in Drupal anymore, we should expose like tools and resources from the Drupal site to the ex external things like cloud code, CLI, or Cursor or lovable. And really what that's doing is saying it's like an API, it, it's like another API, but it says MC PI know we're gonna start diving into ccp, but just there's tools, there's resources that let it read data and interact with data.
John: Yeah, yeah.
Matt: Yeah. And I think that's —
John: We have a whole show on it, with Marcus, so people can go look at that if they need more info on MCP.
Matt: Yeah. And those are the things that make agents able to run. And actually, if you look at the AI module — the AI Agents module — there's a thing called function calls that they use internally and expose to, like, the Drupal agents.
Those are just tools that we can expose over MCP. So I had to build something, 'cause I needed to prove out that we could do this. But it's kind of related to a lot of community efforts, and I can't wait — hopefully by mid-year we have Drupal exposed as an MCP for these tools.
John: So that's where my question came from.
So you're building a — for lack of better words — bag of tools that then sits on an MCP server that agents can access and work with.
Matt: Yeah. So instead of the whole — like, going back to the content type creation: Andy, it generated the config; me, I used a browser. Or there's literally a tool called create content type that lets it just go create the content type, and then all that validation is built into the tool and returned.
So going to one of Nic's points: we're not just burning compute power as it whacks its head against the wall trying to make it work. It calls the tool, the tool does validation, provides great answers back about why it might have failed, and then it succeeds on the second try instead of the fifth or the sixth.
And we get into those weird loops, and that's where the tools come in handy, instead of it figuring it out on its own — because we can nudge it and give it better context.
Nic: So, changing gears slightly here: what kind of command-line tools are people actually using today for agentic coding? And these things that we've been talking about — are agents generally a UI thing?
Are they on the CLI? Where do they live?
Andy: A lot of questions there — or a lot of answers, I guess. I think, like, recently a lot of people are using Claude Code. It's a terminal-based application that you spin up. And if you spin it up in your DDEV environment, for example, and it has access to query the database and run PHP and all that stuff, it's really good at using those tools.
There's other ones — Matt mentioned, I think, Gemini; Codex from OpenAI. There's one called OpenCode, I think. OpenCode, yeah — that I'm really interested in. I haven't started using it yet. But they have, like, a hosting service that goes along with it, and you can change models, and they do bulk purchasing. So they, like, buy subscriptions from all the AI providers, and then you use OpenCode and say, Hey, I want to use, you know,
a certain OpenAI model for this task, and it'll use that and give you, like, the cheapest price, because it's like hospital-type group purchasing, where they just buy in bulk and then you get the discount.
Nic: Didn't Claude just like exclude themselves from that or something?
Andy: Yeah, I saw something. When OpenCode and a couple other tools came out —
like, initially when Claude Code came out, you had to have API keys to do everything, and that stuff's crazy expensive. But if you have, like, a Pro or a Max subscription, you know, you can use Claude Code to kind of bypass that, and they give you just, like, a big block of tokens to use up. And so tools like OpenCode and others were using that same authentication method
and the same API calls to, like, skirt around that, so you didn't have to use individual API calls. And then Anthropic is, like, trying to tighten down; they want you to use Claude Code for that. So the Zen thing I was talking about — OpenCode can use the authentication from, like, Claude Pro, although that's against the terms of service. But like I said, they have another tooling or service where you can just pay for the API use through Zen.
I don't know — a lot going on there. But I think you asked about, like, UI stuff as well.
Nic: Yeah.
Andy: And what was your question? Like, are the agents UI-based, or —
Nic: Yeah — like, other than, I guess, using a chatbot interface with it, like on the Drupal site. In general, agents are CLI-based, right? They're not —
Matt: yeah.
'Cause most MCP servers started as, like, standard IO — command-line access, not web-based. I know, like, ChatGPT — I think if you have enterprise, I've seen some screenshots where you can add apps, and custom apps — they've added basic, like, OAuth support. But most web clients don't support it. I think, like, Claude Desktop was one of the first GUI interfaces to support MCPs.
And for background: Anthropic, who builds Claude, came up with the MCP spec and this idea of tooling, and then gave it away — like, gave it away. It's now an open source standard. So that's kind of why everything seems to be centered on Claude. And as a caveat — so at work we have access to GitHub Copilot, and we use Claude Code with our own models, like, through LiteLLM.
I don't know what's inside the Claude Code CLI, but it beats the pants off Copilot CLI — same models, way different output. So I think that's one reason you keep hearing so much about Claude Code for a CLI: there's something in that sauce that's really good. And if we wanna talk about GUIs, that's where Cursor comes up a lot.
I like Cursor — if you use their agent and you set the model selection to auto, I don't know what they've done, but I think they're really leveraging different models for decisions. So if anybody wants to go down the AI how-does-it-work train: research using light, local, open source models for quick decisions that then delegate to larger models for harder tasks.
And I think they've, like, optimized that train somehow, and that's where their UI usually excels, 'cause you just get really fast responses, and really intelligent ones. Yeah.
Andy: Yeah, I think Claude Code is really good at figuring out what it needs to do when it has access to all the tools. Like, I don't know — it's the best tool I've used to date.
I know there's different ones that come out every day, but as far as giving it a task — not even hitting the LLM, but just knowing where to look for things, to do file discovery in your project, to, you know, query the database when it needs to — it just can kind of figure those things out much better than other tools I've used.
John: Okay. So we've talked about Drupal, we've talked about how well structured it is. We've talked about the tools — we focused a lot on Claude Code, but you guys just talked about some of the other stuff. I'm wondering how giving AI access to real projects — real project context, like config files, drush output, file structures — improves the quality of results.
I would imagine it's, like, I don't know, kind of giving it the owner's manual, to say: here's what I'm working with, make all your choices based on this. Is that accurate?
Matt: Yeah. I mean, I just used Copilot to review one of my PRs, just before I handed it off to coworkers, and it was like, Hey, your project uses, like, strict typing —
And I forgot to do the declare() at the top of a file, because for some reason PHPCS didn't fail it — maybe we forgot the rule, I don't know. But it caught that. Or previously, I brought up that I missed a design paradigm about feature flag handling, and it caught that in the review. And that's something that I would've missed, and probably a human would've missed in the review, if it wasn't, like, top of mind.
So it has the access. And I know, when it comes to, like, if you have your code on GitHub and you're using Copilot, I don't know how they're remembering things, if you will — whether locally it would've done that, I don't know. But it definitely caught it, and it helped put a whole different kind of quality on the code. 'Cause as humans, we remember only so much, and this thing that we added was, like, eight months old, and we're moving fast on a new feature, and it kind of held us to some quality there.
John: And so, I mean, obviously it knows your project context, right? But do you also feed it, like, Drupal coding standards and your organization's coding standards?
Matt: It gets that back via PHPCS, I'd say. So that's where it delegates to the tool. So, like, yes, we have PHPCS — we have, like, the phpcs.xml.
Is it parsing it? I have no idea. I kind of hope not, because that's where we can just run PHPCS. And, like, we have the agents file that says: here's our linting and analysis tools, run them and make sure they pass.
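An agents-file fragment along the lines Matt describes might read like this — the paths and tool names are illustrative stand-ins for whatever a given project actually uses:

```markdown
## Linting and analysis

This project already has linting configured; do not re-derive the rules.

- Run `vendor/bin/phpcs` (config is in `phpcs.xml` — run the tool, don't
  parse the XML yourself) and fix anything it reports.
- Run `vendor/bin/phpstan analyse` and fix anything it reports.

A task is not done until both tools pass cleanly.
```

The design point is the one Matt makes next: pointing the agent at existing deterministic tools is cheaper and more reliable than having the model read and interpret the tools' config files.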
Nic: Hmm.
Matt: Because I don't wanna spend money on it reading a giant XML file and assuming what that means, when we have the tools that already do that.
John: Mm-hmm.
Matt: I want it to parse the results and, like, check its work before I run it and then have to go fix it.
John: Yeah.
Matt: Which isn't the end of the world if I have to.
John: Where's that happening? Is that happening when you push the code up, or are you running it locally?
Matt: Locally. We do not have agents running in the pipeline.
I do know a few people who have a pipeline set up where they have Linear or Jira, and they click a button and the pipeline triggers, and it goes and fetches via MCP from those project management systems and dispatches a cloud agent inside GitHub Copilot that builds the code and creates a PR.
And then when that's ready, they have a workflow that tells Copilot to review its own PR and address fixes. So they do have a full pipeline, from bug report to delivered code.
John: And would you recommend that?
Nic: That's what Microsoft has been using for —
Matt: I would, for bug fixes. We were actually discussing this internally.
We had a bug fix where we accidentally — I actually can't disclose. We did something silly. And it was like, wow, we fixed it in an hour. But this bug ticket could have been opened, and we could have had an AI agent — like Atlassian's Rovo — that found the bug, what introduced it, why it happened.
It was just a legacy issue. And it could have triggered a pipeline that removed the one line of code that was needed, or done the right declaration, and then a human could have reviewed it and said, Hey, this is great. And we could have probably shipped the fix in 20 minutes versus an hour. And when it comes to — think of a team that's underwater. We're not necessarily underwater —
well, all development teams are underwater. Think about a team that's got a hundred bugs to deal with, and how do you prioritize them, how do you deliver them. This allows you to speed up. But also you have to remember: you're speeding up, but make sure you don't make things worse by going too fast.
But it just makes it seem achievable to get through those backlogs.
John: I mean, it's a second set of eyes, right? It's like, you know, somebody looking at it before you do a peer code review. I mean, I think —
Nic: But I wanna take a step back, though, because I think the problem is — yes, a lot of development teams are underwater.
There's a huge backlog; it would be great to get through it. The truth is, on most projects I'm on, though, even if you use agents to develop fixes for them all, properly reviewing them and making sure that they're correct would take all your time. And the quality of those reviews — getting good reviews — is very difficult.
In general, getting good reviews of AI-generated code is even harder, because on some level — I find, and some of this is applicable to AI-generated code too — you get an idea of the style of a particular person, right? And so, you know, okay, I need to always double-check this particular thing with them.
We've had discussions about this particular thing dozens of times, and they, you know, they get that right. I need to check it, but I don't need to check it super closely. When there's AI-generated code, you generally have to check every line, every time, because —
Matt: especially if it's huge.
Nic: Especially if it's huge, because they change everything. And you have to be very, very careful. Like, yes, you have tooling, but it changes
Matt: everything. Most times I work with it — so apparently I've had the greatest experiences — my AI-generated code doesn't just generate, like, a thousand lines.
It's usually pretty succinct. And that is where I want to put the caveat: like, I say these things, and I know somebody's gonna be like, yeah, and then management will want it to happen. There needs to be that kind of agreement. And luckily I've experienced that, where it's like, look, we can use this to go faster, but we can't go too fast.
And, like, in my opinion — I wish we could create a bug ticket and it would trigger an agent that finds all the previous issues that might be correlated, and one that reviews the code and identifies where the bug may be. And if the bug fix is 10 lines or less, create the PR so we can look at it.
Otherwise, just put the suggestions in the ticket. I would rather have that than be inundated with 100 PRs.
John: I mean, I would just be like, Hey, agent, put comments on the ticket, and I'll look at your comment and then I'll go fix it if I think you're right. Just 'cause, yeah, I would prefer to be in the loop on that part of it.
Right.
Andy: Yeah. And you're definitely playing with fire — and the fire's getting a little smaller every day as it goes on — but, like, don't send these things out and try to refactor your entire code base or write some huge feature. You are still a developer. The only thing you're not doing now is writing the code yourself.
Right? So write the logic — you know how you want to build something. Explain it to the AI, give it examples, let it create it, and then do a high-level review. You don't have to go through every line; it's good enough to where, like, if it's writing a conditional to check if an entity field is set or something like that, it's gonna write that code correctly.
So you can just kind of skim and look and see, like, here's the logic. And step-debug, right? Go through with Xdebug and trace what's happening, and do linting and testing and everything, and keep it small. Right? But you're still a developer — you're just not, like, typing in an IDE.
John: And I wanna actually ask my previous question to you, because you're not necessarily working on a project, but you're working on a product, right?
And, you know, to the extent that you're comfortable with it, I'd be interested to know how much AI assistance — in code review and in other areas — you and Mike are using at Dripyard. Because at the end of the day, there are two of you, and you're shipping a full-featured Drupal theme.
So I imagine, you know, time is of the essence, and being able to move quicker is obviously better.
Andy: I'm using it a ton. But again, I'm still the developer making the decisions. The AI might generate something, and then it's not doing, like, the approach I wanted, or whatever, or it's recreating code that already exists — like, I'm iterating with it to say, you know, use these classes I've already defined, or whatever. I'm still basically creating the same output I would if I was typing the code, in the way that I'm interacting with the agents. It's just, like, spitting it out, like code completion or whatever.
John: So it sounds to me like both of you guys are kind of using these things as, like, junior developers, where it's like —
Andy: that's the segue I wanted to make.
John: It's like, Hey, go do this thing, and then I'll look at your code, and, like, we might make some tweaks, but just go do this thing and let me know what you get back.
Matt: Yeah. And that's why, every time I see these probably-bait social media posts — like, I used Claude, I used this, and it made horrible code — and I see these people in, like, senior and staff positions, I'm like: but your job as a senior engineer isn't to be the person who can turn out lines of code, or write the fewest amount to solve a problem.
It's supposed to be to lead and communicate with other engineers to build solutions. So if you're failing at leveraging AI, maybe there might be some communication things there. And it's a little bit of self-reflection. I've even caught that, because anybody that's worked around me knows —
it's like, all of a sudden my brain just zaps off and I got a thing, and I talk too fast. Like, I know I have my issues. And actually, working in this way has caused me to slow down and be more succinct in how I describe things. And I hope that's made me a better peer for the people alongside me on the teams, because it's made me be more descriptive and write better tickets to hand things out with. Or I've leveraged it to help clear those gaps as well — like, when writing things, knowing how to make sure I convey them properly.
So yeah — like I said, treat it like a PHP doctorate on their first day on the job: extremely intelligent, but doesn't know the way that you do Drupal development, or that you do certain development. You need the guardrails. It can be an extremely valuable resource, not just as an assistant, but even beyond.
But you really got to make sure that you're clear in how it should achieve its task.
Andy: Yeah. I wanna go back to that context question real quick, 'cause you asked, like, how do you benefit from giving the agents context? And I think, you know, one of the buzzwords lately is context rot. We've all experienced, when you're using AI and you're generating something, the longer you're talking to it, the worse it gets.
And if you get into a system, especially like, you know, six months ago or so, even further, when you were like copying and pasting code back and forth from chat GBT and it would be like, oh, here's an error. Okay, I fixed it, and it creates another error. And then you're just in this like boot loop of reproducing and, you know, fixing the same error over and over.
The problem is like you're, the, the context window has grown too large and the AI can't really focus on what it needs to do. So by like chunking things into really small requirements and giving it a limited context and then having it work on that, prove that it's fixed the problem and report back to you.
And then there's a somewhat recent technique called Ralph Wiggum where you actually just shut the whole thing down and then start it back up. It's basically running Claude in a bash loop: you have a task, it runs it, when it's complete it shuts down, then it goes to the next task and starts up again.
And that way, every time it starts, it has a really limited context window, and it's really good at working within that small chunk of information, versus the iterative approach where you're running Claude Code all day, you just keep typing into it, and you're not getting anywhere.
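The loop Andy describes can be sketched in a few lines of shell. This is a hedged illustration, not a real tool: the function name, the `tasks.txt` file, and the `claude -p` invocation shown in the comment are assumptions about how you might wire it up.

```shell
# Hedged sketch of the "Ralph Wiggum" loop: run the agent once per task so
# every run starts with a fresh, small context window. The function name and
# tasks.txt are illustrative, not part of any real tool.
run_tasks() {
  agent="$1"   # e.g. "claude -p" to run one non-interactive prompt per task
  tasks="$2"   # plain text file, one task description per line
  while IFS= read -r task; do
    # Each iteration is a brand-new process, so the context starts empty.
    $agent "$task"
  done < "$tasks"
}

printf '%s\n' \
  "Fix the failing kernel test in my_module" \
  "Add a README section for the new drush command" > tasks.txt

# Real usage might look like:
#   run_tasks "claude -p" tasks.txt
```

The point is simply that the agent process exits between tasks, so context rot never accumulates across tasks.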
Nic: So I guess this show is me just... Anyway, the next question. I think that's very interesting, but I really want to get to this next question, which is a bit of a shift, and it's one of the things that I really have trouble getting past. A lot of these agents require you to run them on your computer, and they have access to your whole file system.
How do you look at the security or privacy of your clients and the code that you're working on when you're handing these to all these different corporations?
Andy: I will say that Claude Code, for example, kind of going back to earlier, doesn't index your entire code base, and it doesn't push your code base up or create embeddings from it so that it can work on it, right?
It uses bash tools to find the things you're asking for. So it'll use grep or other tools to find bits of code and then create the context from those. So at any given time, it's not indexing your entire code base. Now, there's still code being shared with the LLM, so there's still a concern there.
Claude does have features built in to strip tokens, like API tokens and stuff like that, so there's a little bit of a guardrail there. Me personally, I always run it in a Docker container. There are a lot of tools you can just spin up on your Mac and let them go to town.
Like, hell no, I'm not going to do that. So I run Claude Code in a Docker container with limited access to just the project I want it to have access to. And actually, when you invoke Claude Code, and I keep saying that because that's what I'm most familiar with, it only has the context of the folder you invoke it in.
So if you cd into a directory and run Claude, it sandboxes itself into that directory. And then my last point on this is that my next step is to get this stuff running locally: to have the things that are more secure, that I don't want to share with an LLM, running on a server in my office with a GPU.
And that's a whole other topic, but that's what I want to get to. I want to be using these tools and doing everything without being dependent on all this stuff going out.
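A container setup like the one Andy describes might look roughly like the sketch below. The image name is hypothetical and the flags are just one reasonable combination; the key idea is that only a single project directory is mounted, so the agent cannot see the rest of the disk.

```shell
# Hedged sketch of sandboxing an agent CLI inside Docker.
# "my-claude-image" is a hypothetical image with the CLI installed; only
# $PROJECT_DIR is mounted, so the agent can't read anything else.
PROJECT_DIR="${PROJECT_DIR:-$HOME/projects/my-drupal-site}"

DOCKER_CMD="docker run --rm -it \
  --volume ${PROJECT_DIR}:/workspace \
  --workdir /workspace \
  my-claude-image claude"

# Printed rather than executed here; run it yourself once you trust the image.
echo "$DOCKER_CMD"
```

Because the working directory inside the container is the mount point, the agent's "sandbox itself into the folder you invoke it in" behavior and the container boundary line up.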
John: So I'm going to channel my inner Nic here for a second. That all sounds good, and it's great that Claude seems to be protecting my security and privacy, but that means I have to trust Anthropic, right?
That they're going to uphold that and actually do the right thing, and that could be dangerous, right?
Matt: Yeah, and that's where I was going to segue. It is a big question in enterprise. I remember when this first came out, everyone, like Acquia, was saying don't use it until we can make sure it's not training the public model.
Because, I mean, we've been in this world for only three years, right? It feels like a lifetime already, but remember when some people at Sony uploaded a CSV, like a spreadsheet of financial data, and then people were able to prompt-hack ChatGPT and get the data out? For most enterprise plans, if you do pay for ChatGPT or a few others, like Google, you can get it so it doesn't train on the data, but that is an enterprise feature. Or you run it like we do, through AWS Bedrock. Or, like in the previous show with amazee.io about their amazee.ai, they're using LiteLLM so that you're using a non-public model.
So that's one way to work around it. You can do a local open source model, but then you need to have the hardware, or you could just pay for model access directly.
Nic: Well,
Matt: I mean, not everybody can do that.
Nic: Lemme pause really quickly and say, yes. A lot of these are open source. Yes, you can run them locally.
The thing is, of all the people that I've followed that have set that up, I have not seen a single one get a local model that is useful. And, I mean, Jeff Geerling, most of us are familiar with him.
Matt: Hold on, I'm just saying that's local, I'm saying local there. But if you were to pay, like if you use AWS Bedrock, everybody has models as a service now.
So if that is one of your concerns, your organization is able to do that, or usually structure a way. But that's about data in the model. I think the other concern is local access as well, where most tools have it set up so they will not read your .env file. So if it tries to access a .env file, it throws up a red flag and prevents it, or anything in your .gitignore, actually.
So I've had that issue where I was working on my site and it wouldn't read Drupal core. It kept getting kind of flaky, but that's because it wouldn't parse Drupal core, because I had that excluded from my local build with the .gitignore. So I had to make a rule that says, yes, you can access this directory, even though it's inside the .gitignore.
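The "rule" Matt mentions corresponds to an allow entry in the agent's settings. As a rough illustration only (the exact file location and rule syntax depend on the tool and its current docs, so treat every detail here as an assumption), a Claude Code style project settings file permitting reads of a gitignored `web/core` directory might look like:

```json
{
  "permissions": {
    "allow": [
      "Read(./web/core/**)"
    ]
  }
}
```

Scoping the rule to one directory, rather than clicking "always allow" globally, keeps the default deny behavior for everything else in the .gitignore.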
But that again is relying on the tooling that's wrapped around the model execution. So there are other caveats, but there are a lot of these security things built in. Now, if I had a local database... well, one, if you have a customer database locally, you should have it sanitized anyway so it doesn't have anybody's PII.
But that's a great example of why you should do that, because you have the AI working on your local machine. Or, let's say, we're slowly getting into the world where you have cloud agents: Cursor has cloud agents, there's GitHub Copilot, there are going to be more coming soon. That's a bigger concern, connecting a cloud-based AI agent to your live and working Drupal database, because it can execute things now.
Nic: Yeah. It deleted... what was that company that had everything deleted? I don't know which service it was. It ran, ran rm dash...
Matt: Well, that's where Claude Code has a sandbox and a no-sandbox feature, where it's like YOLO mode. Sure, they have YOLO mode, which is like, fine, you're going to have access to the entire system and run any bash command.
Like when you see somebody say "it rm -rf'd my directory." I've never had it run a dangerous command. Apparently I'm in the lucky little pocket universe. I mean, it does
Andy: ask you so if you pay attention, like
Matt: Yeah, but I run it by default with always proceed for terminal commands,
Andy: right?
Like you do.
Matt: I run it in, I run it in YOLO.
Andy: Is it, is it in a container, or is this just on your machine?
Matt: No, it's right on my disk.
Andy: Wow, that's risky.
Matt: I live on the edge.
Andy: I love it.
Matt: Yeah. So there are a lot of concerns there, but the tools do have some safeguards built in around what they can access, or when they try to read files. Like, I worked on a Go tool, and your Go dependencies are inside your home directory, and every time it tried to read those files it prompted me, until I finally said allow this for that specific directory, so it would stop asking me.
Nic: So I want to pivot a little bit, because this is somewhat related. I guess the cracks are showing around the AI ecosystem, right? Financially, OpenAI as a standalone business is likely not sustainable. Setting aside that your subscription doesn't necessarily pay for your usage, that's really just the tip of the iceberg: a lot of the cost is training the models, and training the models is becoming exponentially more expensive, rather than cheaper as people expected. So a lot of these standalone tools and tooling systems are likely going to be acquired or go away. Google and Microsoft are a little bit protected because they have legacy systems; Google just increased their Workspace fee, what was it, basically 40%, to pay for AI, whether or not you want it.
So do you guys have long-term plans? What are your plans when a lot of these tools go away or get acquired? Are you just going to switch gears and find a new one, or...
Matt: Well, let's put it this way. The AI bubble will probably burst, because all bubbles grow and burst; software came and burst with the dot-com bubble. But you can't put the genie back in the bottle.
There's a fundamental shift that has happened. So everything you're doing, if it's tied to one tool, then it's not the right way of thinking about it. I'm using Claude Code because the CLI is great, but there's OpenCode and Codex, and there can be an open source CLI tool that does the same thing.
Actually, I mean, Claude Code, I don't think it's open source, but you can use it with almost any compatible API. I'm using Claude Code with Ollama locally to do MCP verification, because I don't want to spend a hundred bucks just to test my MCP server. So I agree the tools may go away, but the fundamentals are there.
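A setup like the one Matt describes usually works by pointing the CLI at a local, Anthropic-compatible endpoint. The sketch below is an assumption-heavy illustration: the port, model name, and the use of a LiteLLM proxy in front of Ollama are all guesses at one plausible wiring, not a documented recipe.

```shell
# Hedged sketch: pointing an Anthropic-compatible agent CLI at a local model.
# A proxy such as LiteLLM can translate Anthropic-style API calls to a local
# Ollama server; port 4000 and the model name are assumptions.
#
#   litellm --model ollama/qwen2.5-coder --port 4000 &   # not run here

export ANTHROPIC_BASE_URL="http://localhost:4000"   # custom endpoint for the CLI
export ANTHROPIC_API_KEY="local-dummy-key"          # proxy ignores it; CLI wants one

echo "Agent requests now go to ${ANTHROPIC_BASE_URL}"
```

With the endpoint overridden, nothing leaves the machine, which is exactly the property you want for cheap MCP verification runs.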
And if you look at Hugging Face, there are like a million open models that people are running locally. Banks are running small local models to make decisions with their chatbots, and they're not relying on the big tech companies. Now, again, those are large institutions, but the genie's out of the bottle and it's not going back in. The players will change, everything always changes, but, like, when Edison passed away, we didn't lose electricity.
Now, I'm not saying it's exactly the same, but there have been various things that have changed the way we live, work, and go about our day-to-day lives, and the businesses that founded them went away. Or they'll be like AT&T, and they'll never die, and they'll just keep reappearing in different subsidiaries and then reforming into the main corporation again.
I mean, right? We have phone lines because of AT&T, and they didn't go away.
Nic: Yeah, I don't know if the analogy, or metaphor, or whatever it is, follows. I dunno. I'll let it lie.
John: Kind of...
Matt: Well, I mean, going back to the idea behind it, though. I get it, it's not like we've never seen anything like this before. It's just that there have been paradigm shifts in the past. Yeah. And the things that caused the paradigm shifts didn't go away just because the companies behind them went away or changed.
Andy: I think, like, OpenCode and other tooling that's not dependent on one particular AI. And you mentioned the inability to run AI locally, and the Geerling guy, and all that stuff.
Like Matt just mentioned, he runs Ollama locally for small stuff, so that's obviously working. One of Geerling's most recent YouTube videos was about a new model from Qwen that does text to speech, and it basically, yeah, I saw that, replaces ElevenLabs, and he can run that on a Raspberry Pi with a GPU.
So there are different, more specialized models coming out that are open source that you can run. I fully think, like, I would like to get away from all the paid requirements. I don't want to have another $2,800-a-month subscription to some AI thing. I want to figure out ways to have smaller subscriptions, and do things intentionally, locally, or at least lease GPUs.
John: As we were talking about before the show, right, I'm cheap and won't pay for Claude Code. So you're not alone in that. Okay. Because I identify more as a site builder than a backend developer on most any given day, I'm wondering how site builders, and not just backend developers, can benefit from this agentic coding in Drupal.
And, like, for me, I don't know, sometimes I talk to the various AIs and I'm like, hey, write this code for me. I just worked on a POC with web components and the Carbon Design System, and I was using Claude, not Claude Code, but Claude, to help me along with that, and it got me pretty far.
But then I had to call Nic and be like, okay, what did I do wrong here? So I'm just wondering, site builders, is there a benefit to agentic coding for them?
Matt: 100%, hit them with the MCP. And I think the problem here, well, the problem is the interface, right? AI has had a horrible interface.
Chat is a horrible interface, an IDE is a horrible interface, and a command line is a horrible interface, but it's what we have. I remember people saying, I'm using Cursor for these agents, and I'm like, you're using a coding IDE to run spreadsheet analysis. But that's because that's the IDE we have. So with that preface, I think site builders can benefit a lot, but that's where we need to have the MCP tools in Drupal. Say you have the new Drupal CMS and you add the MCP recipe, and as a site builder you have Cursor set up, or hopefully there's a better UI, and you connect Figma, because you've been given a Figma design.
Let's say you didn't even do the content modeling yet, right? You were given this Figma design and you're told, build a Drupal site. Hey Cursor, can you go review the Figma design and analyze the content architecture that's being conveyed in the design? It could come up with it right there: great, you have events, you have articles, you have pages.
Then you can tell it, hey, by the way, for pages I'm using Canvas to build the pages. Awesome, great, I love it. Then you could say, hey, go create these content types inside my Drupal site. So instead of you clicking on a thousand links, or launching a browser agent to click links, it creates them for you.
Then it's like, hey, dump the export, and it just runs a config export for you.
Nic: Mm-hmm.
Matt: That right there, like if you can focus on the fun, fun things. I know a lot of people when they see these things, it's like, it's gonna take a job, it'll take part of my job. Do you enjoy that part of your job where it's clicking, or do you like building the solution?
"No, I'm scared." I mean, for me, the fun is the solution.
John: I don't necessarily... I'm not like, hey, it's going to take my job. I mean, I feel like it's an assistant that's going to help you do things faster and make things easier. Like, hey, build this content type. Okay, cool, you built the content type.
Okay, now build a view with that content type. Okay, you built a view. Great. I don't know, I think I subscribe to that future, Matt. I think it's the right one.
Matt: I know some people don't, and that's what I want to say: think about your job. If you can get the boring 60% done, then you get to do the fun 40%.
Like, basically all the work I do is agentic, everything I do, but I use the last half to do the Xdebug, to figure out why something didn't work and get the final performance, because I care about performance and learning why it didn't work, and getting the tests written.
John: Sometimes I think the people that may be concerned are the people where, you know, 90% of their job is the boring part that AI is going to do for them.
And then that last 10%, you know, they might not be able to sell themselves on that 10%, but...
Andy: if you feel that way, then start learning.
John: Well, exactly. I mean, I think that's the name of the game here. All right. Yes: agents and web development workflows.
I mean, I think you guys have painted a very motivating picture here today. I appreciate both of your time. Matt, welcome back and thanks for joining us. Andy, thanks for being here for the last three-ish weeks, and we look forward to having you back again.
Nic: Alright.
Alright. Do you have questions or feedback? You can reach out to Talking Drupal on the socials with the handle TalkingDrupal, or by email at show@talkingdrupal.com. You can connect with our hosts and other listeners on the Drupal Slack in the Talking Drupal channel.
John: Do you want to be a guest on Talking Drupal or our new show TD Cafe? Click the guest request button at talkingdrupal.com.
Nic: You can promote your Drupal community event on Talking Drupal. Learn more about promoting your event at talkingdrupal.com/tdpromo.
John: Get the Talking Drupal newsletter to learn more about our guests, hosts, show news, upcoming shows, and much more.
Sign up for the newsletter at talkingdrupal.com/newsletter.
Nic: And thank you, patrons, for supporting Talking Drupal. Your support is greatly appreciated. You can learn more about becoming a patron at talkingdrupal.com by choosing Become a Patron.
John: All righty, as we bring this show in for a landing, Matt, if folks wanted to get ahold of you, how could they best do that?
Matt: Just find me on LinkedIn. I realize that's probably the easiest way: search for me on LinkedIn and shoot me a message.
John: There you go. Andy, what about you?
Andy: Droopyard.com, and andyg5000 anywhere else.
John: There you go, Nick.
Nic: You can find me pretty much everywhere at nicxvan, N-I-C-X-V-A-N.
John: And I'm John Picozzi.
You can find me at johnpicozzi on the social medias and on drupal.org at johnpicozzi, and you can find out about EPAM at epam.com.
Andy: If you've enjoyed listening, we've enjoyed talking.
John: Have a good one, guys.
Nic: I was just gonna say, we forgot to tell you Andy.
John: Oh no. Andy's a pro. Andy's a pro. Yeah, he, he got it. He got it.
Nic: Awesome.