AI for Feedback Over Generation // BrXnd Dispatch vol. 46
A conversation with Nathan Baschez from Lex
You’re getting this email as a subscriber to the BrXnd Dispatch, a (roughly) weekly email at the intersection of brands and AI. Wednesday, May 8, was the second BrXnd NYC Marketing X AI Conference, and it was amazing. If you missed it, I have shared my talk and Tim Hwang’s talk, with more coming soon.
I’ve been having fun doing some interviews lately (please let me know if you’re having fun reading them), and one of the people I was excited to talk to was Nathan Baschez. I’m not exactly sure where I first ran into Nathan, but I assume it was in the early days of Every, the company/publisher he co-founded, where he also wrote the strategy newsletter Divinations. If you’ve known me for a while, you know that I love digging into the history of business frameworks (my dive into the 2x2 is probably my personal favorite). So when I saw that Nathan also enjoyed this nerdy pastime, I knew we saw eye-to-eye.
Nathan ran product at Every and, a few years in, decided to spin out Lex, an AI writing tool he had been working on internally. What I find interesting about Lex’s mission is that it’s less focused on generating writing than on editing and feedback. Focusing on reaction over generation is a framing I’ve found helpful for flipping a brand’s thinking about where to put its energy with AI. Anyway, I asked Nathan if he’d be up for a conversation, and we recorded one a few weeks ago. It ran about thirty minutes, and this is a lightly edited transcript. Hope you enjoy it!
Noah Brier: Why don't we start by giving me the five-sentence version of Lex.
Nathan Baschez: Sure. It's a word processor that lives on a website. You can send a link to a document to anybody and collaborate on it with them. But also, there's AI built in so you can collaborate with AI if you don't have anybody in the moment who is down to help. The feedback, advice, and edits that the AI gives are getting better by the day, given the progress in models and our progress in figuring out how to incorporate those models in a way that's useful to writers. The goal is just to be a really amazing creative tool that hopefully can become a standard for writers, editors, and teams, whether that's for public-facing editorial content, marketing content, academic papers - anything really where you want to string together the right sequence of words and put effort into making it really good.
Noah: On the small team thing, before we get into the bigger conversation, it's cool how you guys spun it out. It seems like you've kept the mindset of being writers first. Can you talk a little about how it was born and why you come at it from a writer-/editor-first perspective, not software first?
Nathan: That's very much right. I do think that's one of the things that makes Lex unique amongst the many people building AI stuff these days. Lex is very much writing first, with AI only to the extent it's useful, rather than starting with AI and trying to make it useful.
The origin of Lex is that most of my career has been at the intersection of media and tech, but on the tech side, like working at Gimlet building stuff to support audience growth. I was the first employee at Substack, designing and programming, serving writers but not employing them directly.
But before Lex, I started a company called Every with my co-founder Dan Shipper. The goal was to create a "magazine" that was kind of like a bundle of Substacks under one umbrella. We were really focused on commentary on big business and tech, analysis, and such.
Because most of my career has been on the tech side, I'm used to using tools like GitHub, Figma, and IDEs for programming. I found that the tools, workflows, and patterns that had evolved for building software were much better than anything I could tell most people were working with for writing.
The writing tools available didn't support much beyond scattered versions and one main doc with conflicting tracked changes—it was a mess. Imagine if, when you're working on code in a PR [pull request], you were stuck in the diff view and couldn't get out—that's kind of what suggesting changes is like for writers. There were so many small ways to improve a writing tool, even without AI.
Then separately, in the fall of 2022, I was curious what AI could do as the models kept getting better, but I didn't know how it could be useful since we weren't using AI in our writing process at Every.
So Lex started as a side project with Dan because I was annoyed at Google Docs. The first version was really simple: the big flashy feature was if you typed +++, the AI would generate a paragraph of what could come next. This was just a reflection of the models at the time, pre-ChatGPT, where it was just text-in, text-out.
The problem it solved was that when you're writing and stuck, it's useful to have an idea of something to say next, even if you don't go with it, because it sparks ideas. It went way better than I imagined, because GPT-3 was way better than anybody realized at the time, and this was the first time a lot of people played with it.
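[A rough illustration of what that text-in, text-out pattern might look like, in case it's useful. This is a sketch, not how Lex is actually built; it assumes the OpenAI Python SDK and uses the current chat API, whereas the original feature ran on a pre-ChatGPT completion model. The function and model names are illustrative.]

```python
# Sketch of a "+++" continuation feature: send everything written so far,
# get back one plausible next paragraph. Illustrative only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def continue_draft(text_before_cursor: str) -> str:
    """Ask the model for one possible next paragraph of the draft."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; any capable model works
        messages=[
            {"role": "system", "content": "Continue the user's draft with one plausible next paragraph. Match their tone."},
            {"role": "user", "content": text_before_cursor},
        ],
        max_tokens=200,
    )
    return response.choices[0].message.content

# In an editor, typing "+++" would trigger something like:
#   suggestion = continue_draft(document_text_up_to_cursor)
# and the suggestion would be shown inline to keep, edit, or discard.
```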
From there, I loved working on it so much that we decided to spin it out and make it its own company. We've been building it into a more full-featured word processor, but still feel like we're 2% of the way towards the features we've imagined. As we keep going, we keep imagining more, so it's fun - I don't see us running out of ideas anytime soon.
Noah: Do you think people understand how good the models are now? Your comment about GPT-3 is funny to me because sometimes I wonder the same thing about GPT-4. Do you think they've wrapped their heads around it yet or is there still an information gap?
Nathan: I definitely think people don't realize. There are kind of two groups. The enthusiasts mostly understand how good they are and maybe even overestimate how good they are to an extent. It's kind of like Gell-Mann Amnesia—you ask it about something you know deeply and can see the ways it's wrong, then you ask about something else and you're like "Oh that's cool, didn't know that." So enthusiasts with access to the latest models may overestimate a bit.
But for everyone else who's not an existing AI enthusiast using the latest models all the time, I think most people don't understand how good the models are, especially writers. There are questions about whether it's plagiarism, how you would even use it, whether it's going to write the draft for you (which will probably suck). So there's not a lot of understanding of what it would be useful for.
We're in a situation where Lex is really competing more with Google Docs and to some extent Word than other AI tools, which is great. Even though Google Docs has some AI stuff built in, people don't really use it. Our mission is figuring out how to actually make it useful to writers, which is partly about building the right features and integrating it the right way, but also about figuring out what's useful and telling people "Hey, you could use it this way and it works."
The answer is almost never to just have it write the thing for you that you publish. It's not like an image generator, where you type a prompt, get an image, and maybe modify the prompt if you don't like it. With writing, the workflow is totally different.
I think there's a lot more precision required. I have medium confidence that for writing, when it's a little bit wrong, it feels very wrong. If you're asking it to write about something based on specialized knowledge you have and it's a little off, it's almost not useful at all and takes a lot of work to fix. For things with tone, it can't come close to the tone you want. You could never actually publish it under your name.
There's also the plagiarism issue. There aren't many corners of the world where you really want to publish something written by AI and say you did it. Maybe in some corners where you're taking in SEC information and generating a little article and you're transparent that it was AI generated—I don't think that's unethical. But the whole plagiarism thing is real.
Whereas for images, there are way more use cases than for writing where it's stylistically in the ballpark—maybe not exactly what you would have done, but you couldn't come close to doing it yourself anyway. There's less precise information that could be wrong, besides things like finger placement. A lot of times the point is just to illustrate some simple thing—"What if Spotify made a Walkman?"—it kind of looks like a Walkman with a Spotify logo, you're not saying you did it, it's clearly AI-generated, not plagiarism, and you're not really telling a story, just visualizing a thing you thought of.
I think a lot of the differences stem from the job to be done for images versus text: how okay is it for the output not to be 100% faithful to what you want? Obviously for visual artists, The New Yorker covers, movie posters, etc., the same reasons you wouldn't want to just use AI for text apply. But there are more use cases for images where that specificity isn't required.
I think there are also fewer people who can create imagery at all compared to a lot of people who can hack together writing, and that's kind of okay. It may take more work than they'd like, but we all write emails, text messages, etc. Writing is more broadly distributed, so more people can do it a little bit themselves, whereas, with images, a lot of people have no idea where to start and can't get close to what AI can do.
Noah: One thing I've been thinking about is why AI is so bad at tone.
Nathan: I think it's because tone is a product of context: who am I writing for, who do I want to reach, what specific language and idioms do we use that signify we're part of a shared culture? If you just type "generate a post about XYZ" with none of that context, it's going to take a generic "view from nowhere" tone, sort of like the tone of a local evening news story. It's more a lack of tone than a specific tone, or it's the generic tone they used in the RLHF [reinforcement learning from human feedback] process, when they trained the people generating the data to aim for a certain style.
But I think it will do a better job at tone going forward, especially as products find ways to incorporate things like a company style guide. It's not going to tell you if it's a great piece of writing overall, but it can tell you at a tactical level if there are pieces that don't match - like if Disney's style guide says to use "cast members" not "employees." Those kinds of things are easy for an AI if you just tell it what to do.
For fuzzier, more judgment-oriented things, you can give it a bunch of examples. You could have a fine-tuned model for a very specific type of thing. Like when I worked at General Assembly, part of our brand style guide was that there's a lot of credibility and seriousness but there's a "wink" every once in a while. Hard to define exactly what a "wink" meant, but we could show 15 examples of a "wink" in our past copy. Today's models will do an amazing job with 15 examples and a description of what a "wink" means.
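[A hedged sketch of what that might look like in practice: put the explicit rules and a handful of example passages into the prompt, and ask the model to flag mismatches rather than rewrite anything. It assumes the OpenAI Python SDK; apart from the "cast members" rule mentioned above, the rules, example sentences, and function names are invented for illustration.]

```python
# Sketch: checking a draft against explicit style-guide rules plus a few
# examples of a fuzzier quality (the "wink"). Rules and examples below are
# placeholders, not any real brand's style guide.
from openai import OpenAI

client = OpenAI()

STYLE_RULES = [
    'Say "cast members", never "employees".',
    "Use sentence case in headings.",
]

WINK_EXAMPLES = [
    "Yes, we teach JavaScript. No, we can't explain why it's named that.",
    "Our instructors have shipped real products. Some even worked.",
]

def check_style(draft: str) -> str:
    """Return a list of places where the draft breaks a rule or misses the tone."""
    prompt = (
        "You are a copy editor. Check the draft against these rules:\n"
        + "\n".join(f"- {r}" for r in STYLE_RULES)
        + "\n\nWe also aim for an occasional 'wink': mostly serious, with a rare dry aside. Examples:\n"
        + "\n".join(f"- {e}" for e in WINK_EXAMPLES)
        + "\n\nList every place the draft breaks a rule or misses the tone, quoting the passage. Do not rewrite the draft.\n\nDRAFT:\n"
        + draft
    )
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```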
I think AI will get there on tone, but it's better in the short run to isolate it to focus on one specific element of your style versus a holistic "generate a post in our voice and tone." It's a better tool for making sure content from 50 different freelancers has less variance in fitting your tone, even if it's not going to outright produce articles at the level you need.
Noah: Something I've been circling around is this idea that maybe the bottleneck for a lot of teams isn't generating more insights or content, it's packaging those insights, selling them in, and all the other things you need to do with them. The real bottleneck is often the feedback process. Why haven't more people recognized there's so much power on the feedback side of things with AI? What's going on there and why do you think it would be better at telling you if a piece of content fits a style guide than writing a piece of content in that style?
Nathan: I find the AI feedback incredibly useful but I think what happens is the standard we use to judge a human's ability to give good feedback is totally different than for AI. If you sent me a draft and I left a few comments that indicated I had no idea what you were going for or who you were trying to speak to, you'd discount my feedback, like "he doesn't get it."
But with an AI model, because of how its intelligence is created so differently from humans, it might have 3-4 things out of every 10 that are actually really helpful but also 3-4 real stinkers that any capable human editor would never say. There's a wider variety of quality it's going to give you compared to humans - a human's feedback is either all kind of high quality or all kind of low quality.
So with AI, I think about it more like I'm generating a bunch of stuff that's going to have varying degrees of quality, but every time there's some stuff in there that matters to me a lot. Even if it doesn't directly tell me how to improve something, it's useful.
Like if I'm looking for ideas on how to improve an intro and make it transition better to the body, it might say, "you don't talk enough about the big idea, you need to make it more clear at the top." That's valid. Then it might give three example sentences that suck. People see those sentences and think it's not useful, but actually, if you think more critically about what it's giving you and what you would do with that insight versus what it literally outputted, it can be really helpful, almost like a checklist.
You can develop your own checklist for what you like in writing, what you like about your best pieces or writing you admire, create a prompt based on that, and have the AI use it against your draft. Guaranteed you're going to get really valuable stuff each time. Is the hit rate going to be as good as a really talented human editor? No, but it's a lot more convenient and easy to access.
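[One possible version of that checklist-as-prompt idea, as a sketch. It assumes the OpenAI Python SDK; the checklist items are placeholders you'd replace with whatever you value in your own best writing. The prompt asks for feedback against the checklist, not a rewrite.]

```python
# Sketch: running a personal editing checklist against a draft and asking the
# model for feedback, not a rewrite. Checklist items are placeholders.
from openai import OpenAI

client = OpenAI()

CHECKLIST = [
    "Is the big idea clear at the top?",
    "Does the intro transition cleanly into the body?",
    "Is there a concrete example for every abstract claim?",
    "Which sentences could be cut without losing anything?",
]

def get_feedback(draft: str) -> str:
    """Return checklist-by-checklist feedback on the draft."""
    prompt = (
        "Act as a blunt but constructive editor. Evaluate the draft against each "
        "checklist item: quote the relevant passage, say what is or isn't working, "
        "and suggest a direction. Do not rewrite the draft.\n\n"
        "CHECKLIST:\n"
        + "\n".join(f"{i + 1}. {item}" for i, item in enumerate(CHECKLIST))
        + "\n\nDRAFT:\n"
        + draft
    )
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```

[As Nathan says, expect a mix of genuinely useful notes and stinkers; the value is in the hit rate and the convenience, not in any single suggestion.]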
It's also hard to find a really good human editor for a number of reasons. People are worried about hurting your feelings when they edit you, so often they give you terrible feedback or no feedback at all, just smile and say, "it's good" when it's terrible instead of telling you the thing you need to hear. Some of the best feedback I've ever gotten is "I have no idea what you're talking about." Sometimes that's what you need.
Part of what's interesting about AI feedback is it doesn't care who you are, it's not worried about hurting your feelings or telling the boss their writing is bad. It just gives it to you straight.
Noah: I've been obsessed with the idea that the build vs. buy decision is shifting with AI. The models are so powerful that maybe it makes more sense to build rather than buy. Does that resonate with you? How does that intersect with the in-housing trend you've been part of?
Nathan: On the in-housing piece, it's driven by concerns about AI. One big concern is whether your data is being used to train AI that could benefit your competitors. With in-housing, people are inside the client's organization on their systems, not in an external agency where you don't know what's happening with the data.
Humans can complicate technology. If you can create 100 versions of an ad in 2.5 minutes, but it takes 2 weeks to present them and get feedback, or a creative writer needs a month to add their special touch, you lose the speed advantage. The proximity of having people inside client organizations is a strength.
On the bigger picture, what brands and clients look for has changed significantly in the last 18 months. Initially there was excitement about using AI models and making sure you could bring your own models and not be locked in. Now they're asking, "what should we be doing?" We set up a "brand tech brain" to lead and control this and it's really played out over the last 18 months.
Noah: For people, especially at brands, who are trying to wrap their head around this and feel like they're staring over the edge of a cliff, what advice are you giving them on where to go next?
Nathan: This is going to be the single biggest disruption I've seen in my career, bigger than the internet and mobile. You can't do anything to stop it, so jump in. Play with every tool, experiment, and get hands-on. People are scared of technology and job loss, but history shows that's not how it plays out.
I asked ChatGPT and Anthropic's Claude to write essays on why humans shouldn't worry about AI taking their jobs, and it's fascinating to see the different perspectives.
Experimenting is easier than with something like social media because you don't have to post it for the world to see. The speed of change is unprecedented. There's a quote that we always overestimate what can happen in 2 years and underestimate what can happen in 10 - maybe not this time. Every major tech company is spending billions on AI saying it will change everything. It might become a self-fulfilling prophecy.
As long as you're experimenting and staying engaged, you'll be fine. You can't know everything, there's too much happening too fast. Normal humans might know 3 things you don't and you'll know 3 things they don't. It's about not being paranoid and making sure you're aware of the developments.
That’s it for now. This was a long one. Thanks for reading, subscribing, and supporting. As always, if you have questions or want to chat, please be in touch.
Thanks,
Noah