MCP servers and the role tech writers can play in shaping AI capabilities and outcomes -- podcast with Fabrizio Ferri-Benedetti and Anandi Knuppel
Links mentioned
- Why I built an MCP server to check my docs (and what it taught me)
- Failing Well: A Practical Guide to Growth for Technical Writers
- Claude Skills are awesome, maybe a bigger deal than MCP + Hacker News thread
About the co-hosts and guest
Takeaways
Here are 10 key takeaways (AI-generated) from the podcast conversation:
- Writers’ domain is expanding: The role of the technical writer is moving beyond traditional documentation. As Anandi put it, “wherever there are words with regards to a technical product, that’s the technical writer’s domain.”
- Writers are critical for MCP performance: In MCP servers, the writer’s most critical job is optimizing the documentation inside the tools, such as the tool descriptions and input schemas.
- Clear descriptions guide the LLM: A well-written, unambiguous tool description is essential for the LLM to understand and select the correct tool for a user’s request, directly impacting the system’s accuracy and performance.
- IA and modular content are still vital: Information architecture isn’t obsolete in the age of AI. A well-organized, semantic, and modular documentation hierarchy is crucial for an LLM to process information effectively, as it struggles with large, unstructured “soups” of content.
- “Evals” are a new frontier for writers: Evaluating the performance of prompts and tool descriptions (known as “evals”) is an emerging, critical skill. This involves testing the AI’s output for accuracy and may use methods like “LLM as a judge.”
- Non-determinism is a core challenge: A major problem with agentic AI is that it’s non-deterministic, meaning it can give different answers to the same prompt. A key goal for writers and engineers is to figure out how to ensure “repeatable accuracy.”
- MCPs can be tools for writers: MCP servers aren’t just a product for end-users; they can also be a tool for writers. Fabrizio, for example, built an MCP server to “tame” an LLM to perform specific, predictable style checks on documentation.
- Writers must now edit “AI slop”: A new, time-consuming challenge for technical writers is editing “AI slop”—first drafts generated by non-writers that may be 70% correct but are filled with style violations, “soulless language,” and other issues that need fixing.
- AI tools have a sustainability problem: A long-term challenge is that many AI tools, prompts, and evals are being built and fine-tuned for specific models. This creates a significant maintenance burden, as these tools may break or behave differently when models are updated.
- AI accessibility is increasing: New tools like “Claude Skills” are being developed to make agentic capabilities more accessible to non-technical users, essentially packaging complex instructions into simple “documentation packages” or “recipes.”
Shorts
Here are some shorts pulled from the longer video.
Segments
The following are several “segments” from the longer video. They’re longer than shorts (3-5 minutes) but still just highlights from the full video.
Transcript
Note: This transcript was made more readable with AI. If you want the verbatim transcript, expand the details in the YouTube video.
Tom: Hi, welcome to another podcast from Fabrizio and Tom, and we have a special guest, Anandi Knuppel, with us today. We’re recording our second episode where we have this joint conversational format. The last episode was actually really popular, which was very encouraging. Thank you for all the feedback. It was great to start this effort. People really liked this more casual, conversational idea exchange that we’re having, so we’re going to continue that.
I’m Tom Johnson in Seattle. We’ve got Fabrizio, who’s calling in from Berlin where he was just giving a presentation at Write the Docs. Is that right?
Fabrizio: Well, that’s tomorrow, but yes.
Tom: So you’re in Berlin. Okay, all right.
Fabrizio: Yes, I was just confirming the location, but the talk is tomorrow.
Tom: I know what that’s like, being the day before you present. And at Write the Docs, you go up on center stage, usually, right? So you probably are… but you don’t get nervous, do you?
Fabrizio: Well, I don’t know. It’s the first in-person event in Europe after a while. So, yes, it’s going to be quite emotional.
Tom: Oh, nice. Okay. All right, Anandi, you’re our special guest today. Do you want to tell us a little bit about yourself, where you’re from, and what your background is? Sorry, I meant to say where you’re currently based. We just had a long discussion about how she’s from all over the place.
Anandi: Yes, all over the place on both of those fronts.
Yes, like I had a map and I was pointing to them. I’m in New York right now, and I’m at Amazon. My title is Programmer Writer, doing the things with the words here at Amazon. So that’s where I’m at. I’ve been a writer off and on, purely doing technical writing since about 2007. I stopped to go get a PhD in something totally different, but I was always doing digital research stuff in that space.
Tom: I saw that your PhD was in… was it visual anthropology and religious studies, or something like that? It’s very interesting.
Anandi: Yeah, it was fun. It was like a little side quest. But yes, I kept up doing digital scholarship, which is humanities research stuff, doing digital things. So I was still writing. I couldn’t escape it if I wanted to—writing documentation and doing coding and stuff while I was there.
Tom: That’s great. The topic that we’re going to focus on today is one that people mentioned being of interest from the last episode: MCP. Anandi, you actually reached out and said that you’ve been doing a lot of MCP-type stuff, optimizing documentation inside of MCP server tools and seeing how the next evolution of tech writing might involve a lot more emphasis on this. Can you tell us a little bit about what you’ve been doing with MCP and what your challenges or interests are there?
Anandi: Yes. So it’s interesting. I’ve worked in different domains: hardware and software back when, just doing SaaS stuff recently, and most recently doing API work. The evolution is weird because people are still writing for all of these different types of tools. It’s a new domain, though.
So, moving and shifting… what does it mean to be a writer here? I’m of the school that wherever there are words with regards to a technical product, that’s the technical writer’s domain. Whatever is selling the product, what’s teaching people about the product, how they’re learning about the product, and how they’re using the things… all of those domains are where a technical writer could and should be involved, in my view.
Not every place allows for that kind of moving across orgs, but I do try reaching out to marketing, support, education, and all this stuff. On the engineering side, working more in APIs, working more closely with developers… what does it mean? Those are all these discussions, right? What does it mean to be an API writer? How do you affect the final product?
In the design, you’re working your way back. In the design of the API, what are we naming things? How are we thinking about the information architecture of an API and of data models and stuff like that? So you’re abstracting away from the final thing that maybe the user is actually getting to. And of course, then there’s the documentation that people can access in public or internal documentation.
But then for MCP, engineers are dialing up tools, creating their tools, creating the infrastructure, creating the MCP servers. People are just throwing out tools. They’re deriving from APIs, either one-to-one or one-to-N. And so there’s this onslaught of tools. Lo and behold, I’m lucky that I’m in an org that recognizes there are words in MCP tools.
And guess who’s really, really good about obsessing over words and how to efficiently use them strategically, surgically? We have a whole writing team. So, great. Let’s see how this works. And we don’t know. It’s really fun. It’s a fun minute of figuring out paths forward with the MCP servers and tooling. What does it mean to write here? And then there’s a host of problems. It’s a fun minute to go read up on all of the problems.
If you have a simple API getting turned into a simple tool, simple enough, it does one thing. But you have these huge complex APIs, and then you’re running up against all of the limits: the context window limits, you’re running up against performance limits. I think just because it can take a million tokens doesn’t mean it should.
And then what does it mean? Where does the documentation live? Are you actually getting your public documentation into your server as a resource or whatever you're using? There are prompts as well. I've been focusing really specifically on the tooling. If a tool is pulling from an API, it's pulling in all this information and all this input schema. And then your API has to be really well documented. Are they always? No, no, they're not. If you're…
Fabrizio: So it feels like… well, you mentioned API design. I agree that API design is a domain where writers can bring lots of expertise and direct support. And then MCP servers appeared, and they are like an additional layer, you know, sometimes—very often—between an existing API (say, with an OpenAPI spec) and the LLM. In between, you have this MCP server.
So you mentioned prompts. How important do you think it is for a writer to be there and say, ‘Let’s create that additional interface between an almost-human user and the API’? Do you have any kind of collection of examples where this made an impact, or this was something that really made a difference there?
Anandi: So it’s interesting, there are other places in the MCP server tool domain where writers can make a big impact. One is just being aware of the words going into it and their formatting. So if, let’s say, a tool is being derived from an API, like a direct one-to-one. It’s like, we have this API that’s going to give you the weather, and we’re going to have this tool that’s going to answer the question, ‘What’s the weather?’ Right? So that’s the one-to-one, seems pretty straightforward and fine.
But if you’re ingesting these huge complex APIs, a writer is going to know what that knowledge base is. And if you’re going to end up with these context window problems where it’s just clogging the context window, we need to optimize the documentation that’s going in there.
That’s one place where it’s less about the finesse… well, it is about finessing content. There is a sense of understanding what critical information has to get into a tool for the LLM to choose the tool. And then there’s also finessing that to make sure it’s explicit, it’s clear, and there’s no ambiguity: ‘This tool does this one thing.’ The writers are going to obsess over these details.
So there’s that part of the tool description itself and what’s in the tool, what’s in the tool schema. That’s a great place for writers to make huge gains in the performance of the LLM and accuracy, and how the tool is actually going to run. If it doesn’t have the right input schema and that’s not written well, the LLM is not going to know what to do with it and it’s going to spit out the day’s date, not the weather, or something.
But the other thing is the prompting. There are tools, prompts, and resources. And the prompting itself, as well. You can have writers who are creating prompts and, again, obsessing over the writing to make sure it’s really clear that the tool knows what to do and how to spit those results or the response out for the LLM, for whatever that purpose is.
And then the resources as well. Your public documentation can be one of the resources. And then you're going to think about how to do that. Are we using llms.txt? Are we flattening it otherwise? Are we only using a subset of our documentation in here? So in all of the domains for MCP server work and MCP tooling, a writer can be heavily involved in making real performance improvements.
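To make that concrete, here is a minimal sketch of what a single MCP tool might look like, assuming the MCP Python SDK's FastMCP interface. The `get_weather` example and its wording are hypothetical, but they show where the tool description and input schema, the parts a writer would optimize, actually live.

```python
# Minimal sketch of an MCP tool, assuming the MCP Python SDK's FastMCP
# interface. The docstring becomes the tool description the LLM reads
# when deciding which tool to call, and the typed parameters become the
# input schema -- the two places where a writer's wording directly
# affects whether the right tool gets selected.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("weather-docs-demo")

@mcp.tool()
def get_weather(city: str, units: str = "metric") -> str:
    """Return the current weather for a single city.

    Use this tool only when the user asks about current conditions.
    Do not use it for forecasts or historical data.
    """
    # Hypothetical backend call; a real server would hit a weather API here.
    return f"Sunny, 21°C in {city} ({units})"

if __name__ == "__main__":
    mcp.run()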
Fabrizio: So maybe, Anandi and Tom, what do you think? Because some of the listeners might not know what MCP is. They might’ve heard about it. So let’s see if we can describe it in a way that is very simple. To me, MCP is essentially providing an LLM, which is kind of powerless—it just answers based on its training—it’s providing it with the ability of doing things in the world, like running a command or calling an API endpoint, et cetera. What’s the basic structure of an MCP server?
Anandi: Yeah. Do you all want to do that? I can chime in.
Fabrizio: Like there’s this file where you list the tools, right? But what you call tools really is like an entry point, almost like an endpoint if we’re talking about APIs, right? And you have a set of keywords that kind of trigger the command for an LLM, right? And then you have the prompts. Is that the basic structure?
Anandi: Yeah, it’s kind of this abstraction level. Like I said, if you have your API that’s going to get you the weather or something like that, you can make a tool so that the agent can… you know, you’re typing to a chatbot and say, “Give me the weather.” The agent is going to go and say, “Okay, well, this tool is called ‘get weather’ and it’s got a description: ‘I’m going to give you the local weather.’ Oh, great. We’ll choose this one.”
And so that’s where the tool description comes in. Then the prompt, the resource is going to go look up what it needs to know, what relevant documentation is available, or whatever else—resources could be any number of things. And then using the prompts, it’s going to give you perhaps insights on how to send the response, the API language, but send the answer back to the agent, to the end user.
It’s just this level of abstraction. In the one-to-one, that’s pretty simple. Even me, I’m like, “Well, why do we really need that tool? Why aren’t we just connecting straight to the endpoint?” Right? We’re just getting the weather.
What’s fascinating for the engineers, and where it’s the wild west and we don’t really know what’s coming or how things are going to look, is that you can take that further. If it’s a more complex request, like, “I want a series of flights for something”… flight examples, which I would never do anyway… you can have the one-to-N.
And so then you’re having one tool. It’s like, “Recommend me some flights for these dates, this area.” And it’s going to go call… the agent’s going to go look around, the LLM is going to look around for any number of tools that are going to help it do the thing based off of the descriptions and the names of the tools. And those might not be an API. It could be a subset of an API and just only the parameters it needs to find flights coming out of New York or something like that, like budget flights out of New York. Not all the flights out of New York for the whole year, but just the budget flights for today or something like that. And it’s only going to have like three input parameters as opposed to a hundred.
So you get this increase in efficiency and kind of exactness with that. And so then you’re going to have all these niche tools being created as these levels of abstractions over, or built on top of, APIs.
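Continuing the same hypothetical sketch, a narrow one-to-N tool might look like the following: the underlying flight-search API could accept dozens of parameters, but the tool exposes only the few the agent needs, so the description and schema stay unambiguous. Every name here (`find_budget_flights`, `search_flights_api`) is invented for illustration.

```python
# Hypothetical sketch of a narrow tool layered over a much broader
# flight-search API, again assuming the MCP Python SDK's FastMCP interface.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("flights-demo")

def search_flights_api(**params) -> list[str]:
    """Placeholder for the real, much larger flight-search API call."""
    return [f"Example flight matching {params}"]

@mcp.tool()
def find_budget_flights(origin: str, date: str, max_price_usd: int) -> list[str]:
    """Find budget flights leaving one airport on one date.

    Use this only for cheap-flight requests; it does not search full
    schedules, hotels, or multi-city itineraries.
    """
    # The tool fills in sensible defaults for every parameter it
    # deliberately does not expose to the agent.
    return search_flights_api(origin=origin, date=date, sort_by="price",
                              max_price=max_price_usd, limit=5)
```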
Tom: I find this fascinating, this whole step that the industry has taken with giving agents capabilities to actually perform actions. I remember the previous year it was just chat interfaces with a response, and you could copy and paste it and so on. And now routinely when I’m writing docs, I see that it’s executing a lot of commands, not just to read files and write files, but sometimes to do other things, create files and so on.
I like the flight example you gave, Anandi. My question is, most of these APIs that we’re documenting require authorization. So how do you handle API keys for these different endpoints that the agent is going to run?
Anandi: So this is a little adjacent to my domain, but I will say that the auth question is a big one. And getting it right takes time. It takes a lot of research and time for the engineers who are working on the auth question. I’m watching it unfold, and it takes a while to get it right and to make sure that it’s secure against prompt injection and things like that.
Tom: Because for sure, if you’re working in a version-controlled system and you’re just making updates to docs, it could go wild and you could just revert it and say, “Yeah, don’t commit this.” But if you’re making an API call and let’s say there’s cost associated with it or it’s doing something external outside of VCS, it does seem a little more scary. I hope it interprets things right.
Anandi: Yeah. Right. Exactly. The part where I don’t think anybody has an answer to this, and we can protect against it somewhat, is the non-deterministic part of doing these tasks and asking the agent for these tasks. You ask it twice and it’s going to do something different each time. Right. So, how can you build an MCP server and the tools to ensure repeatable accuracy from the LLM? That’s the…
Fabrizio: I mean, that’s very interesting because it brings up the question of, let’s say that something goes wrong in the MCP server. Maybe you tested it, et cetera. Like, who is liable? Is it the LLM that is incapable of understanding? Or is it the writer that hasn’t explained well enough in the prompt how to do things? It brings a whole new dimension for us writers because, yes, we are responsible, of course, for documentation when it goes to the user, but usually we don’t get direct feedback. With an MCP server, you do get direct feedback of what went wrong.
Anandi: The evals question, the evaluations question, is a really fun one. I’m starting to get into it for the work that I’m doing and trying to figure out how, as a writer, I can leverage eval frameworks to test the prompt. Well, primarily tool descriptions is what I’m most interested in—tool descriptions and input schema. So the bread and butter of getting the tool selected and working.
And then working from there, you know, is it 500 characters? What’s the sweet spot for character counts around there? What does it look like for my use case? Is it more or less description? How is it going to handle if markdown is still floating around from the front end, like the OpenAPI spec that this is pulling from or something like that? Leveraging the evals and getting writers involved in the fact that there are evaluations here, I think, is really critical to getting a handle on the impact of the words in the MCP server.
Tom: Anandi, are you actively executing on eval iterations? Like, are you doing some kind of evaluation test to measure your doc quality or task quality and then iterating on that to improve it? Is it something…
Anandi: That’s the end goal. We’re just at the very early stages of me trying to figure out where evals fit in my work and what type of eval framework is going to work best for the system. Right now I’m in the research stages of just taking all of it in and then figuring out with the dev team what we should implement to do this particular type of work.
And it’s cool because I’m really lucky to be involved with the dev teams where I’m able to talk to them about this stuff. What do the writers need? Because this is a joint effort of them getting the tools ready and writing tools, but then also the prompts and the descriptions and everything. It’s the thing that we’re testing with the evals. They have the structure in there and then I’m coming in with the words. It’s a thing together, so it’s a joint effort to figure out how evals are going to work for our particular use case.
I’m really interested in trying to do, I guess, some A/B testing around it. The “LLM as a judge” type of eval, where you’re having another LLM go in and grade your interaction there with the tools and the prompts and everything, seems… I don’t know, it’s an endless recursive cycle. You get your LLM as a judge to evaluate your prompt and response, but then you have to train that LLM to do that work, and you could keep moving back in the training. At some point… yeah. So the evals framework is something I’m still trying to wrap my head around, but I know it’s going to be necessary and it’s going to be a huge part of how we get better, accurate tools.
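As a rough illustration of the kind of A/B eval being described here (not any specific framework), a harness could replay the same test prompts many times against two variants of a tool description and score how often the expected tool gets selected. The `select_tool` call below is a placeholder for whatever client actually routes prompts to the MCP server.

```python
# Hypothetical sketch of a tool-selection eval: run the same test prompts
# repeatedly against two tool-description variants and measure how often
# the agent picks the expected tool ("repeatable accuracy").
from collections import defaultdict

TEST_CASES = [
    ("What's the weather in Lisbon right now?", "get_weather"),
    ("Find me cheap flights out of JFK on Friday", "find_budget_flights"),
]

DESCRIPTION_VARIANTS = {
    "A": "Return the current weather for a single city.",
    "B": "Return the current weather for a single city. Do not use for forecasts.",
}

def select_tool(prompt: str, weather_description: str) -> str:
    """Placeholder: ask the model which tool it would call for this prompt."""
    raise NotImplementedError

def run_eval(runs_per_case: int = 10) -> dict[str, float]:
    scores = defaultdict(int)
    total = len(TEST_CASES) * runs_per_case
    for variant, description in DESCRIPTION_VARIANTS.items():
        for prompt, expected_tool in TEST_CASES:
            for _ in range(runs_per_case):  # repeat to surface non-determinism
                if select_tool(prompt, description) == expected_tool:
                    scores[variant] += 1
    # Selection accuracy per description variant, e.g. {"A": 0.7, "B": 0.9}
    return {variant: hits / total for variant, hits in scores.items()}
```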
Tom: I’m really curious to know if there are any early conclusions or early realizations about how writing needs to change to optimize for the evals. For example, do I not need to worry about my sidebar structure and how I’m organizing files? Is it all just one big soup? And do I need to just load in a bunch of detail that normally I might say is a little too much? How’s it changing how we write?
Anandi: Yeah, so I am an information architecture person to my dying days. If you’re looking at the structure of a doc site, you said sidebar, and I’m thinking of the nav, right? Where is stuff and how are we categorizing stuff? Taxonomies and pretty abstracted things, getting down into the details of, okay, how are we listing our overviews and our responses and the specs and all the user guides and things.
The whole pipeline of information architecture, I think, is still really important. Even if you’re going to take your doc site and flatten it out and feed it into an LLM, there’s a hierarchy there that can be understood and a prioritization. This is how we organize information. This is where it is. If you’re looking for overviews, they’re all going to be in this one spot. If you’re looking for errors, they’re all going to be in this one spot, and user guides, et cetera.
So I think thinking about the organization of the knowledge base that’s going to go into the resources is critically important. I think if you throw it into the soup… I have thought about this. I’m like, “Well, why do we need this dev portal to still function in this way? Or can we just throw everything in there and figure it out?” And I think that probably your listeners here have tried something like that and realize, “Wow, the agent is just not good at this.”
So the clearer and more precise we can be as far as how information is organized and feeding that as a resource into the MCP server, the better we can do that. I think the better we equip the server and the tools to do what they need to do. I think it’s still important.
Fabrizio: That’s reassuring.
Anandi: I’d like to think so. No, absolutely. I mean, I think anytime you’re feeding tons and tons of information to an LLM and expecting any good writing to come out of it, it can’t handle huge chunks of information like that. It’s going to do well with the beginning and the end, and the stuff in the middle is just going to be effectively soup. So the best way you can modularize your documentation—modular documentation—is still really important. Organizing it is still really important. Thinking semantically about how that stuff is organized is still really important. And that feeds directly into the tooling.
Fabrizio: Yeah, I mean, it feels to me… I’ve been playing a bit with MCP servers lately. I built a very simple one to call a tool to check the style of the docs. And my impression, Anandi, I don’t know if it’s also your sensation, but I think this is one very promising direction of documentation tooling in the sense that it’s mixing the best of both worlds.
On one hand, you have the deterministic aspect of, say, defining a tool call, defining an API endpoint, or maybe you have a tool just to feed that information architecture or that taxonomy to the LLM in a more structured way. On the other hand, you have the enormous, but also non-deterministic, capabilities of the LLM. But it’s a way for me of taming the LLM a bit, to have something more structured in that layer of the MCP server, right?
Anandi: Right. Yeah. I think, just to note that, the MCP server that you kind of wrote about recently, I’ve already saved it. We’re going to talk about it as a team. Like, “Can we do this?” Because we wanted to have some additional checks on our work and this is fantastic.
What I’ve mostly been going on about here is how, as a writer, you’re working on MCP servers and tools as a product. But then there’s the flip side, what you’ve written about, which is cool. I’m literally looking at this every day and wasn’t thinking about how I can create an MCP server to better the docs themselves, which I was like, that’s really fun.
So yeah, I think that you have this great opportunity with good writing and good planning and good organization to reduce that non-deterministic part of the LLM agent experience. And for docs, it’s critical. Right. If an agent is part of that process, it cannot veer from the expected outcome every time you call it. Anything you can do there helps, and that’s why following best practices for the tool descriptions and schemas and outputs and things like that is really critical.
Tom: I would love to see… well, so many people would love to see the style problem solved. So Fabrizio, your experiment with the Vale server and trying to address style concerns in that way seems like a very much sought-after goal. The style thing seems like it would be a simple problem to solve. It’s not. I spent probably an hour just editing somebody’s doc that they sent me, and it was clearly generated by AI. And I’m like, my gosh, you know, I have to fix so many issues. Just sentence-style capitalization was the first one. And then we had other conventions that we had to fix. And I’m like, why is this still a problem? Can’t we implement something that would just address all these issues? But it’s difficult.
Anandi: Yeah, do we want to have a formal plug for this? I was really interested in it. Fabrizio, if you want to say some words about the Vale MCP server.
Fabrizio: Yeah, essentially, to me, the solution in the long run would be to either train a small model or fine-tune a model on the company writing, for example. Because using a model that has been trained on README files is always going to be problematic. If you ask for documentation by default from that model, it will have emojis, title casing, bullet points without a period in the end, always, always, always.
With that in mind, if we have to use a generic model, the advantage I see in using MCP is that it brings some predictability, structure, and especially repeatability to those interactions. The LLM is going to get a structured output with a prompt that is always going to be the same. Of course, you could do it without it, but it would never be as predictable and stable as if you were using an MCP server. Because every time you use it for style checking, your prompt will be slightly different, unless you copy and paste.
That’s what I was calling it. Is it glorified copy-pasting? Well, in a way, but it’s also like chaining instructions together so that you really try to reduce the noise and the randomness. I asked the LLM yesterday, because the Vale linter is able to output JSON, “What sort of output do you prefer? Do you prefer JSON or do you prefer just plain Markdown or plain text? Because, I mean, maybe it’s fewer tokens for you.” And the LLM surprised me by saying that it prefers JSON, for some reason, “because it’s clear to me where this ends” or whatever. I don’t know if it was being overly polite. Perhaps that’s the case.
But it’s interesting that while I was building the tool, I was also asking the LLM… it’s like repairing a robot, and you ask questions to the robot. Like, “Where do you want your arm?” “No, I want it that way.” So there’s also that interactive aspect to building the MCP server. But yeah, I think for me, the next step… I bought this book about building an LLM from scratch. It’s fascinating. I don’t know where I’ll get with that, but the budget perhaps is also a problem. I really think that the next goal, especially for style, is to really fine-tune or train a model from scratch on those tasks, like a very small model.
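As a sketch of the general idea only, not Fabrizio's actual implementation, an MCP tool wrapping Vale might shell out to the CLI and return its JSON report, keeping the style check itself deterministic while the LLM handles the fixes. This assumes the MCP Python SDK's FastMCP interface and a local `vale` binary configured for the project.

```python
# Illustrative sketch of wrapping the Vale prose linter in an MCP tool.
# The linting stays deterministic (Vale rules), and only the follow-up
# edits are left to the LLM. Assumes `vale` is installed and a .vale.ini
# applies to the path being checked.
import json
import subprocess
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("style-checker")

@mcp.tool()
def check_style(path: str) -> dict:
    """Run the Vale linter on a documentation file and return its findings as JSON."""
    result = subprocess.run(
        ["vale", "--output=JSON", path],
        capture_output=True, text=True,
    )
    # Vale exits non-zero when it finds issues, so don't treat that as a failure.
    return json.loads(result.stdout or "{}")
```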
Anandi: Yeah, yeah. It’s interesting. I would be so happy if I never saw another bulleted list replaced with emojis in any output. I would be so happy.
But yeah, I think, like you’re saying, chaining the instructions to each request and response interaction is really key there. And then going in and seeing where all the places are where you can make improvements in the actual protocol—the name and the description, the input schema and the output schema, the errors, the whole bit. Making sure that’s really clear so that you’re reducing that non-deterministic aspect of it. You can rely on it actually producing the type of writing or changes or whatever it is that you’re looking for.
Again, I was looking at this every day and I never thought, “Wait, we can apply this to actually writing as well.” And it’s brilliant. It’s great. I was really excited to see that.
Tom: Well, Anandi, it’s great to have you on this show. It’s great to just get different voices. You’ve got a lot of enthusiasm. You’re doing some innovative stuff. Do you like working in this new realm, all this new work that is probably different from the old style?
Anandi: Yeah! Yeah, it’s pretty different. I like it. I like thinking the thoughts. I like the challenge of trying to figure out what this new space looks like and how, again, going back to what I opened with: there are words, and that’s a place for the technical writer.
One thing, I guess, maybe I can leave you all with this thought experiment. I’m really curious if we’re… you’re saying you want to really train an LLM from scratch on this style stuff, so it has really strong guidance on how to write the way we want it to. But this is all relying on models. For MCP servers, we’re relying on models, we’re building to a certain model, and we’re evaluating on a certain model. Six months from now, what are we going to have to do to keep up to date with the latest?
That’s concerning because we’re building around certain expected responses. We are pretty deterministic beings. We’re writing around expectations of what we’re supposed to get back. And as the models are changing and we’re updating models, what does that mean for all of this work in this space?
Fabrizio: Yeah, imagine in the changelog you’re going to get something like, “Well, model version three has a different personality. It’s quirky.” And it’s like, my… how do I adapt to that?
Anandi: Yeah. Exactly. It’s like verbosity or how it’s going to change, and just all of the questions that come along with using different models. And then do we allow people to select their models? What happens when those models are changing? I haven’t seen anybody writing on that about how that’s going to affect these things. We’re just really trying to keep up to speed with the changes in the MCP space and like, okay, then what’s next? The new agentic reality.
So I’m kind of curious, yeah, okay, so six months, a year from now, and all this stuff is changing, then what? And what does sustainability look like there? I have all the questions. I have maybe one answer, but all the questions.
Tom: Well, thanks so much for coming on to our show today. I know you have a hard stop coming and other obligations, but if people want to reach out to you, can they just reach out to you on LinkedIn?
Anandi: Yes, absolutely. So LinkedIn, I am Anandi Silva Knuppel. You can find me and send me messages, and I love chatting with people. Wonderful. Thank you so much for having me on here. And I look forward to listening to the rest of the conversation. Thanks.
Tom: All right, thanks Anandi.
Fabrizio: Thank you.
Tom: All right, so the format that I was mentioning earlier, I really like this idea of having a guest on for the first half of the show, and then we continue the conversation afterwards. We’ll see how this works. It’s still an experiment.
And the other thing is I don’t think that any specific show should be dominated by just one single topic, because people have broader interests. They want to keep up with the news. Somebody might say, “You know what, MCP? Not doing anything with that, but I still want to keep up with what’s going on.”
So Fabrizio, just to kind of shift directions a little bit, tell us… I looked up your Berlin talk, your Write the Docs talk. And I thought I would see something related to the Vale server, AI, anything. Instead, it’s “Failing Well: A Practical Guide to Growth for Technical Writers.” Nothing on the surface to do with AI. So why are you focusing on this?
Fabrizio: Well, you know, that’s a good question, actually. There’s always a dimension of tinkering and a dimension of tooling. And I love that. But I felt like we are going through a time, tech writers everywhere, that feels full of uncertainty. The other day I mentioned getting into the technical writing subreddit and reading all those questions, remember? People feeling lost, not knowing what direction to take.
I’ve been writing a bit, it’s one of the recurring themes on my blog, about how to find answers, how to find motivation, how to get out of the pit of despair, or whatever you want to call it. Because I’m seeing this pretty much everywhere in the channels, in the community.
I think we need to take the reins a bit. That’s what the talk is about. Failure is a way of life, and failing is sort of ingrained into technical writing. We fail often because that’s what we do. We try to change the way an organization communicates or documents things. And more often than not, we are doomed to fail because those are hard changes. But it doesn’t mean we shouldn’t keep trying, because that’s also how we advance.
So what I advocate for in the talk tomorrow, what I’ll be presenting, is some strategies. And actually, I quote you in one of those. Remember you wrote about the value of speaking up? Remember that post you wrote?
Tom: Probably. Yeah, one comes to mind, but I’m not sure if it’s the same one.
Fabrizio: Yeah, it was the one about, you know, if you are in front of bullshit, just saying so. You wrote that pretty clearly. And that’s one of those things of, don’t stay silent if there’s something that you think is wrong, speak up. And in general, I think technical writers should be doing that. Anandi brought up a very interesting concept that I agree with wholly: ‘Wherever there are words, tech writers can help, can be there owning those words.’ But you do that only if you have a certain attitude and a certain take on things and you don’t feel like, “Well, this is not for me. I shouldn’t be doing this.”
Well, right now is the moment where tech writers should be more vocal, more present, and take a step forward and say, “We can own these words.”
Tom: I’m sorry. Now I’m remembering. Okay. It was the post that I was thinking of. It’s “Speaking up and calling out BS when you see it: some reflections on Jonathan Rauch’s book, The Constitution of Knowledge.” I was going to this book club called the Seattle Intellectual Book Club, and this was a book that we read for it. It was all about not being silent and speaking up about so many different aspects. It wasn’t focused on tech writing in general. It was more about how our whole knowledge system is constructed on this idea that you have various bodies that negotiate and find balance and criticize each other in order to come to some sort of truth, using the idea of the Constitution having separate bodies in place to provide that balance.
Coming back to the other part, you mentioned just a lot of uncertainty that people have and maybe people should speak up and be more open about that. Yeah, really interesting focus. I like that. I get the sense that AI has saturated so many conversations that people are fatigued by it, and there’s no clear thing that people need to always do. That’s part of the fatigue.
I see this. At my work, I’m constantly declining meeting after meeting where it’s like, “AI demo,” “Hour of AI,” “AI this,”… some kind of just constant AI training, tooling. And I’m giving these as well. So I know that people are probably doing the same thing.
Anyway, it’s an interesting time and it’s full of uncertainty. You never quite know. I do think there was something very, very encouraging that happened just a couple of weeks ago, and you posted about this on your LinkedIn profile. This discussion about Claude Skills. You were responding to a Hacker News thread. Let’s see. The Hacker News thread was regarding an article called “Claude Skills are awesome, maybe a bigger deal than MCP.” And you had posted an excerpt from it about how essentially, Claude Skills are documentation. Do you want to talk about that? Do you know much about Claude Skills that you could relay the gist of it here?
Fabrizio: Basically. So I’m still trying them out, but essentially you could picture them as a small package of documentation that tells Claude how to do things in a more systematic way. That’s the way you could describe it.
I think it’s a way of making things like MCP servers more intuitive. For example, I don’t think my parents will ever use MCP servers. Just the name is like… MCP? What’s that? It’s too technical. And they probably don’t need something like that. But Claude Skills, I think, is trying to bridge the gap, and I’m sure we’ll see other players doing something similar, in the sense that, “Well, I’m giving it a recipe. They’re recipes. I’m giving it a recipe and they’re going to follow it, and it’s going to be systematic.”
Now, how effective those are, I’m not sure. I think it’s probably going to be less structured and less predictable than an MCP server. Then again, I don’t expect anybody to use something like Claude Skills for anything critical. All the talk we had about security and prompt injection… Claude Skills is more for daily usage.
But it really speaks to me about something I’ve been discussing recently with more people: these technologies need to be more accessible. The access to them needs to be more democratized in a way. Yesterday I was having this dinner with more writers that are going to attend Write the Docs, and we were talking about… we should really get, maybe in a year, two years, three years, something like “train your own model.” I expect that all the steps that are involved in making these tools available right now are kind of firewalled.
The companies behind LLMs like Claude or Gemini, et cetera, are providing this service. But as it happened in the past with other technologies, I think people will be eventually empowered to create that technology themselves at home in one way or another. Of course, right now, training a model costs millions of dollars. So I don’t expect that to happen overnight. But things like Skills are a little step in the direction of “do it yourself.” Let’s make it very easy. Maybe in two years’ time, they will launch something like “Train-Gear” or “Modulo” or something that makes fine-tuning very easy.
Tom: “Train your lawnmower.” That’s an interesting example. But yeah, I agree. The accessibility of these tools to be trained is something that we’re seeing more and more for all these special skills that a generalist model won’t really know how to do. And it is encouraging to see documentation be a primary player in how this all evolves.
There’s so much in this space that is full of uncertainty, even from a product perspective. Coming back to this earlier example of the flights: you want to use LLMs to figure out flights. Well, a company’s primary concern is if somebody goes into Claude or Gemini or ChatGPT and they’re like, “Hey, build me a little app that will get the cheapest flights to Albuquerque,” or something. If you’re Kayak, you want it to use your APIs. If you’re Orbitz, you want it to use your APIs. If you’re something else, you want it to use your APIs, right? How do you influence which APIs the model chooses? Good luck. You hope that the user has an MCP that leans toward a certain API, but anyway, there’s a lot of just black-box-type stuff where who knows how it actually works.
Fabrizio: Yeah, and even opening up the black box doesn’t help sometimes. It really depends on the kind of audience that these players are targeting.
There’s another thing. Again, the importance of documentation, and somehow connecting it again to the theme of taking the reins, taking a bit more control of things. Last week, we launched at work… I wrote these AI and docs guidelines. Essentially what they are is, I guess you have similar guidelines at your job for general usage of AI, like what kinds of things you can do with AI. I’m seeing these guidelines sprouting more and more at companies in the technology sector. Usually, they’re just focused on coding or information processing or what kind of docs you can feed to them, et cetera.
In this particular case, what we’re trying to create is guidelines on how you create documentation through generative AI. Why did we create those? It’s quite common sense, but my thinking is that this wave is coming. The tsunami wave of AI and documentation and folks creating documentation through AI is coming. We are not able to stop it. We cannot stop it.
What we can do is surf the wave. And the way you surf the wave is by channeling all that energy that is going to come from people wanting to create docs in a way that is better for us. In the sense of, “It’s okay. Let’s send the message that it’s okay to create documentation using GenAI, but do it in this way so that everybody wins. And do it also in these other ways so that we don’t have discussions around style, for example.”
For example, one of the things we say is, “Use our agent instructions file that we prepared so that the agent you’re going to use will write documentation that requires fewer edits.” It’s kind of providing tools and instructions in a sneaky way so that the output is better. I see more companies creating their own libraries of agent instructions. I think that’s also a way of keeping the output a little more compliant and predictable and having the quality we expect.
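As a purely hypothetical example of what a few lines of such an agent instructions file might contain (these rules are invented for illustration, not the guidelines Fabrizio's team actually published):

```markdown
<!-- Hypothetical excerpt from a team's agent instructions file;
     the rules are invented for illustration. -->
## Style rules for generated documentation

- Use sentence-style capitalization in headings.
- Do not use emojis or decorative Unicode symbols.
- Introduce every list with a full sentence ending in a colon.
- Follow the terminology in the team glossary; never invent product names.
- Keep procedures to one action per numbered step.
```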
Tom: I’m glad you brought up this topic because just last week I had a glimpse of what I think could be a nightmare for technical writers in terms of this AI-generated content coming in. Let’s say you have some member of your team who’s not a writer who needs a documentation topic on something. So they use AI and they generate it. And to them, it looks decent. Maybe English isn’t even their first language. Anyway, they send it to you, and now you’ve got this… I don’t want to say “AI slop” to edit, but basically, you do have AI slop to edit, right? Because it just has so many things that have to be fixed, from your company’s own style guide and conventions to just your general writer’s intuition about how this should be changed, and so on.
Most of us, when we use AI to write something, we make a ton of edits to that content before we finalize it and publish it. But other people aren’t doing that. They’re expecting us to do that. And I just got my first one last week. I was actually pretty excited to receive it, because engineers so far hadn’t really started to use AI much and I was like, “Why aren’t you using it? It’s so much easier.” So somebody finally did, and they wrote about a topic that we definitely needed, and it was about 65-70% there. So it was great, but I was like, “I don’t want to sit here and have this AI slop editing task times 10 every week for all these incoming docs.”
You mentioned the agent markdown file. I think that’s a key strategy I need to implement so that whatever directory they’re working in, an agent markdown file is going to get rid of some of these easy wins.
Fabrizio: Yeah. But did they disclose AI usage, or you just kind of discovered it? How did you know?
Tom: I just… I mean, gosh, it just screams AI. So many different conventions that I’m used to seeing: the way lists are introduced, certain bold formatting, just the generalized soulless language and so on.
Fabrizio: So that’s important. I think docs contribution through AI in a company setting, the contributor should disclose that they used AI and how they used the AI. Why? Because it’s more honest, it’s sort of educational, and it’s also an incentive not to overdo it. In the sense of, “This guy created 30 AI PRs.”
We also have this rule that is applicable not just to LLM contributions, but also human contributions, but it’s a rule of “keep the PR small,” as if you were creating it without LLMs. Because using AI for producing documentation is not an excuse to flood the team with pull request reviews. You don’t have to produce 10x. Do it as you would do it.
The idea is that AI is augmenting your capabilities, acting as a co-writer, but you shouldn’t use it to dump lots of slop, as you call it, on the tech writing team. We’ll see if people will follow it. So far, so good. But I think this kind of helping people through guidelines, policies, and tooling is a way of channeling that trend a bit.
Tom: I personally am mixed about always flagging that I’ve used AI when I’m writing docs and requesting people to review them. Here’s why I’m mixed: I would pretty much be adding this little statement to almost every PR (we call them CLs for changelists) that I send out. A lot of times I am using AI, but it’s not like I’m just pressing a button and copying and pasting. I’m steering it through specific tasks and changing and tweaking and editing. So by the time I have something done, it’s been vetted and reviewed by me. But yeah, some people have called out in the past, like, “Hey, was this AI written?” and then “You should have flagged that.” I’m like, “Okay. Yeah. Sorry about that.” But it’s very uncommon.
Fabrizio: Well, I feel that. In the sense that, for example, if I use autocomplete, does that qualify as something you should disclose? Probably not. Or I do a content conversion, say converting a Markdown table to another format or whatever. Are those sufficient changes? Do they warrant a disclosure? Probably not. But I would say the golden rule is if you feel unsure about what the LLM has done, then flag it, because then you will have an extra set of eyes checking that together with your own criteria.
Tom: Yeah, it probably gets people to review things more carefully. And for sure, AI is great at bluffing and just coming across as super confident about something. If you’re a reviewer, it’s easy to rubber-stamp something because it just looks right.
So I was curious. There was another post you wrote that we’ve been talking about, but we haven’t actually called out explicitly. You said, “Why I built an MCP server to check my docs and what it taught me.” In this, you mentioned a few things. Now we’re talking about authoring. You mentioned a few things that you are hoping to do, like self-healing documentation. We already talked about checking docs for style.
I’m curious to know, because it’s always great to hear perspectives from people outside of my company, how’s the rest of the world using AI? What is your typical AI workflow? Are you using Claude Code? Are you using other tools? Are you using the CLI-type stuff?
Fabrizio: Well, I’m using Claude Code for coding. I’m one of the contributors of our documentation tooling at work. So I use Claude Code. It’s the best model for coding, really. But I just use it for coding.
For documentation, I’m using Cursor, usually, or Copilot. Maybe 60% of what I do is just autocomplete. I mean, it’s that good. For quick formatting, it’s fantastic. For example, you are editing headings in a long document. With a couple of examples, the autocomplete will be able to get what you’re trying to do. And then it’s just hitting the tab key and going through the doc to apply all those changes.
Other times I use it, for example, to… one of the things I do is, I point it to some changes in the code. It’s all open-source projects. And I tell it, “Look, this changed. And we have this README file. And we have these other docs that mentioned these things.” So I provide all these sources. I did this the other day. And I told it, “Create a document similar to these other documents that we have.” So I was already pointing at an existing structure, but with these caveats. I was essentially just instructing it as if I was telling an intern what to do.
And it created a very decent draft. Of course, it also had the agent instructions, so it wasn’t starting from scratch. But when it comes to following a very structured pattern and you provide it with good source information and context, it does a decent job, like 70% of the time. And then you jump in, you edit. That’s the most generative thing I’ve done. And then, well, I’m experimenting with adding Copilot, again with agent instructions, as a reviewer. But my hope is to leverage the MCP server and Vale to do that in a way that is a bit more deterministic.
Tom: There’s a lot of interest in Gemini CLI because of its ability to plug in more MCP servers.
Fabrizio: I guess you cannot tell us when 3.0 is going to go live. I can’t wait.
Tom: I have no idea. But I was really comparing Gemini CLI against Gemini Code Assist, which is like the side pane. Internally, it’s not called that, so I don’t know what the exact differences are externally. But basically, the form factor of using a CLI versus using a side pane was interesting to compare.
A lot of people are really on fire with the CLI because they’re like, “Man, I can give it this MCP and it can build my code and I can do this and that.” Whereas the side pane seems a little more fixed. But at the same time, the CLI seemed so challenged in terms of presenting a file diff about what it had changed, because you’re kind of trapped in the command line area and it’s hard to show visually what’s changed and give you little buttons to accept or reject. So I actually prefer the side pane because I want to go through and see the changes. I want to highlight the changes and be like, “Well, why did you change that? Persuade me on this,” or “Tell me why you deleted this block,” and have a conversation.
Fabrizio: Yeah, for docs, for docs for sure. I mean, it’s the same for me. The CLI is great for code because it’s better at calling tools, for example. If you’re developing a Node application, it’s just way more efficient. For some reason, it’s more efficient at calling NPM or whatever. But for docs, I really want to see the docs changing in real time while it does the thing.
Tom: Yeah, it’s nice to have multiple tools at your disposal. It sounds like you do quite a bit of coding. I don’t really do much coding. I’m mostly just working in the documentation space. I should do more coding, perhaps. Just to test out everything, but we’ve got QA people doing that too, so it often feels redundant.
At any rate, I like exploring different tools and I realized something else in comparing these tools. I was trying to provide a list of what tasks and capabilities you had and how they compared between the CLI tool versus the Code Assist tool. And I realized this is something we touched on earlier: what a tool can do highly depends on how it’s been customized, what extensions or MCP servers it has available to it, what prompts you give it, what other context you give it. Just so many customizations alone make it difficult.
Can it read a certain file? Can it read bugs? Can it read the comments on the bugs? It’s like, “Well, it depends if you have this extension. And if you do, did you start the tool with that extension enabled? And what were the settings in that extension?” It’s impossible.
Fabrizio: Yeah. And that, I think, makes it a little difficult to even just teach or explain these technologies. Because again, it’s similar to what happens with front-end development, where you have all the frameworks, the tools, and all the… “Are you using Node 24, 22, whatever?” Now with LLMs and agents, the first question is like, “Oh, so what CLI are you using? What IDE are you using? Have you enabled this, or are you using this other server? What model are you using?”
It’s a pure combinatorial nightmare where you have to figure out what the best combo is. Again, I think it’s a sign of how young this technology is. But eventually, this also shall stabilize a bit, hopefully.
Tom: Yeah. One other variable is the agent markdown file, right? It’s like, “Well, your output might be totally different depending upon what agent markdown file is instructing it to do.” So this idea that you have, or that you brought up, of how do we teach others to use AI tools effectively and empower them? It’s a formidable challenge because you have to communicate all this info and the setup, and then sample prompts and what you can hope to get back and so on.
Fabrizio: And then the model changes, as Anandi said, and you’re back to square one.
Tom: Yeah. Somebody compared the models to a family of friends, and you never quite know who you’re getting. Maybe you get the really smart aunt one day, and the next day you get like a dumb uncle. They’re all kind of highly related, but sometimes you get a much more capable session. I’m always toggling the model. Sometimes it’s giving me bogus info and I’m like, “I had the wrong model selected.” Change it back to Pro.
Fabrizio: Yeah, or you literally read comments in Slack, at work, or on Hacker News like, “Claude is having a bad day. Maybe they were partying last night.” I don’t know, but it’s just the reality we’re living in.
Tom: Yeah. Well, Fabrizio, I know you’re gearing up for your Write the Docs talk, so I hope that goes well. I’m sure you’re going to be a very popular speaker there. How many people are at that conference? Like 100?
Fabrizio: So Eric, the organizer, told me there were like 200 in person. Yeah, and lots of online tickets as well. It’s going to be fun.
Tom: 200, my goodness. Okay, yeah, well, thanks for carving out some time to do this podcast. Really appreciate it. Any last parting thoughts before we wind up here and close up?
Fabrizio: Well, I loved having Anandi on the show, and again, I would reinforce the message that if you want to join us as a guest, I think we are quite open to that.
Tom: Yeah, it was fun. It was great to get a fresh voice and see enthusiasm and real expertise. I mean, she’s smart. She’s got a PhD. She’s definitely in this space, brings a different perspective. And that’s part of the interest for me in this: just to get outside my work environment and space and people and see how others are using these tools, what they’re doing.
So yeah, definitely we want to have more guests. We want to bring them on the first half of the show so that it’s not just about that guest and topic, it’s still broader, but it still integrates this multiplicity of voices and perspectives. So yeah, reach out to us, and also if you have topics that you want us to cover, let us know, and we’ll consider it. So thanks.
All right, thanks, Fabrizio, and good luck.
Fabrizio: Thanks, Tom.
About Tom Johnson
I'm an API technical writer based in the Seattle area. On this blog, I write about topics related to technical writing and communication — such as software documentation, API documentation, AI, information architecture, content strategy, writing processes, plain language, tech comm careers, and more. Check out my API documentation course if you're looking for more info about documenting APIs. Or see my posts on AI and AI course section for more on the latest in AI and tech comm.
If you're a technical writer and want to keep on top of the latest trends in tech comm, be sure to subscribe to email updates below. You can also learn more about me or contact me. Finally, note that the opinions I express on my blog are my own points of view, not those of my employer.


