I can highly recommend these talks to get your eyes slightly opened to how stuck we are in a local minimum.
Whether you call yourself an engineer, developer, programmer, or even a coder is mostly a localized thing, not an evaluation of expertise.
We're confusing everyone when we pretend a title reflects how good we are at the craft, especially titles we already use to refer to ourselves without judgement. At least use script kiddie or something.
Vibe coders can deliver happy-path results pretty fast, but I've already seen that within 2 months it starts to fall apart quickly and has to be extensively refactored, which ultimately takes more time than if it had been done with quality in mind in the first place.
And supposedly the free market makes companies “efficient and logical”
I think those “fall apart in 2 months” kinds of projects will still keep happening, but some of us had that experience and are refining our use of the tools. So I think in the future we will see a broader spread of “percent generated code” and degrees of success
No code -> no software.
Certainly there’s no simple F(num_lines_changed) value function. There are many other parameters. But to suggest, as many here somehow do, that lines of code touched is independent of effective development, is plain ludicrous.
Vibe coding is going to make this so much worse; the tech debt of load-bearing code that no one really understands is going to be immense.
A question: what if all those activities are to build a feature that will harm user retention or a product no one wants?
A follow-up question: what if we could have known that up front, or there was a simple way to learn that?
Because so often we build stuff that shouldn't have been built in the first place (appalling startup success rate is probably a good enough statistical measure of that). And yes, there are ways to learn that we're building the wrong thing, other than building a fully-fledged version of it.
Two/three months to code everything ("It's maximum priority!"), about four to QA, and then about a year to deploy to individual country services by the ops team.
During test and deploy phases, the developers were just twiddling thumbs because ops refused to allow them access and product refused to take in new projects due to possibility of developers having to go back to code.
It took the CEO to intervene and investigate the issues, and the CTO's college best friend that was running DevOps was demoted.
This is often a CTO putting pressure on a dev manager when the bottleneck is ops, or product, or putting pressure on product when the bottleneck is dev.
The normal rationalization is that "you should be putting pressure on them".
The actual reason is that they are putting pressure on you as a show of force, rather than actually wanting it to go faster.
This is why the only response to a bad manager is to run away.
Just to quote one little bit from the piece regarding Google: "In other words, there have been numerous dead ends that they explored, invalidated, and moved on from. There's no knowing up front."
Every time you change your mind or learn something new and you have to make a course correction, there's latency. That latency is just development velocity. The way to find the right answer isn't to think very hard and miraculously come up with the perfect answer. It's to try every goddamn thing that shows promise. The bottleneck for that is 100% development speed.
If you can shrink your iteration time, then there are fewer meetings trying to determine prioritization. There are fewer discussions and bargaining sessions you need to do. Because just developing the variations would be faster than all of the debate. So the amount of time you waste in meetings and deliberation goes down as well.
If you can shrink your iteration time between versions 2 and 3, between versions 3 and 4, etc., the advantage compounds over your competitors. You find promising solutions earlier, which lead to new promising solutions earlier. Over an extended period of time, this is how you build a moat.
With LLMs, you can type so much faster! So we should be going faster! It feels faster!
(We are not going faster.)
But your definition, the right one, is spot on. The pace of learning and decisions is exactly what drives development velocity. My one quibble is that if you want to learn whether something is worth doing, implementing it isn't always the answer. Prototyping vs. production-quality implementation is different, even within that. But yeah, broadly, you need to test and validate as many _ideas_ as possible, in order to make as many correct _decisions_ as possible.
That's one place I'm pretty bullish on AI: using it to explore/test ideas, which otherwise would have been too expensive. You can learn a ton by sending the AI off to research stuff (code, web search, your production logs, whatever), which lets you try more stuff. That genuinely tightens the feedback loop, and you go faster.
I wrote a bit more about that here: https://tern.sh/blog/you-have-to-decide/
That’s what slows me down with AI tools, and why I ended up sticking with GitHub Copilot, which does not do any of that unless I prompt it to.
It’s very rare to not touch up code, even when writing new features. Knowing where to do so in advance (and planning to not have to do that a lot) is where velocity is. AI can’t help.
We could go with that perception, however, only if we assume that whatever is in the backlog is actually the right thing to build: that every feature has value to the customers and (even better) that they are sorted from the most valuable to the least valuable.
In reality, many features have negative value, i.e., they hurt performance, customer satisfaction, any key metric a company employs.
The big question: can we check some of these before we actually develop a fully-fledged feature? The answer, very often, is positive. And if we follow up with an inquiry about how to validate such ideas without development, we will find a way more often than not.
Teresa Torres' Continuous Discovery Habits is an entire book about that :)
One of her recurring patterns is the Opportunity Solution Tree, which is a way of navigating across all the possible experiments to focus on the right ones (and ignore, i.e., not develop, all the rest).
Maybe the real Skynet will kill us with ticking-time-bomb software bugs we blindly accepted.
GPT-2 was barely capable of writing two lines of code. GPT-3.5 could write a simple code snippet, and be right more often than it was wrong. GPT-4 was a leap over that, enabling things like "vibe coding" for small simple projects, and GPT-5 is yet another advancement in the same direction. Each AI upgrade brings forth more capabilities - with every upgrade, the AI can go further before it needs supervision.
I can totally see the amount of supervision an AI needs collapsing to zero within our lifetimes.
Because they generate so much code that often passes initial tests, looks reasonable, and fails in nonhuman ways, in a pretty opinionated style tbh.
I have less context (and need to spend much more effort and supervision time to get up to speed) to fix, refactor, and integrate the solutions than if I were only trusting short, few-line windows at a time.
That is because you are trained in the old way of writing code: manual crafting of software line by line, slowly, deliberately, thoughtfully. New generations of developers will not use the same workflow as you, just like you do not use the same workflow as folks who programmed punch cards.
The only way these tools can possibly be faster for non-trivial work is if you don't give a shit enough about the output to not even read it. And if you can do that and still achieve your goal, chances are your goal wasn't that difficult to begin with.
That's why we're now consistently measuring individuals to be slower using these tools even though many of them feel faster.
This is /especially/ true in software in 2025, because most products are SaaS or subscription based, so you have a consistent revenue stream that can cover ongoing development costs which gives you the necessary runway to iterate repeatedly. Development costs then become relatively stable for a given team size and the velocity of that team entirely determines how often you can iterate, which determines how quickly you find an optimal solution and derive more value.
This has been my experience as well :/
The current trend in anti-vibe-coding articles is to take whatever the vibe coding maximalists are saying and then stake out the polar opposite position. In this case, vibe coding maximalists are claiming that LLM coding will dramatically accelerate time to market, so the anti-vibe-coding people feel like they need to claim that development speed has no impact at all. Add a dash of clickbait (putting "development speed" in the headline when they mean typing speed) and you get the standard LLM war clickbait article.
Both extremes are wrong, of course. Accelerating development speed is helpful, but it's not the only factor that goes into launching a successful product. If something can accelerate development speed, it will accelerate time to market and turnaround on feature requests.
I also think this mentality appeals to people who have been stuck in slow moving companies where you spend more time in meetings, waiting for blockers from third parties, writing documents, and appeasing stakeholders than you do shipping code. In some companies, you really could reduce development time to 0 and it wouldn't change anything because every feature must go through a gauntlet of meetings, approvals, and waiting for stakeholders to have open slots in their calendars to make progress. For anyone stuck in this environment, coding speed barely matters because the rest of the company moves so slow.
For those of us familiar with faster moving environments that prioritize shipping and discourage excessive process and meetings, development speed is absolutely a bottleneck.
We have literally one half-hour-long sync meeting a week. The rest is as lightweight as possible, typically averaging below 10 minutes daily with clients (when all the decisions happen on the fly).
I've worked in the corpo world, too, and it is anything but.
We do use vibe coding a lot in prototyping. Depending on the context, we sometimes have a lot of AI-agent-generated code, too.
What's more, because of working on multiple projects, we have a fairly decent pool of data points. And we don't see much of speed improvement from a perspective of a project (I wrote more on it here: https://brodzinski.com/2025/08/most-underestimated-factor-es...).
However, developers sure report their perception of being more productive. We do discuss how much these perceptions are grounded in reality, though. See this: https://metr.org/blog/2025-07-10-early-2025-ai-experienced-o... and this: https://substack.com/home/post/p-172538377
So, I don't think I'm biased toward bureaucratic environments, where developers code in MS Word rather than VS Code.
But these are all just one dimension of the discussion. The other is a simple question: are there ways of validating ideas before we turn them into implemented features/products?
The answer has always been a wholehearted "yes".
If development pace were all that counted, the Googles and Amazons of this world would be beating the crap out of every aspiring startup in any niche big tech cared about, even remotely. And that simply is not happening.
Incumbents are known to be losing ground, and old-school behemoths that still kick butt (such as IBM) do so because they continuously reinvent their businesses.
I like this metaphor. Looking at a map, we may get a pretty good understanding of whether it's a place we'd like to spend time, say, on vacation.
We don't physically go to a place to scrutinize it.
And we don't limit ourselves to maps only. We check reviews, ask friends, and what have you. We do cheap validation before committing to a costly decision.
If we planned vacations the way we build software products, we'd just go there (because the map is not the territory), learn that the place sucks, and then we'd complain that finding good vacation spots is costly and time-consuming. Oh, and we'd mention that traveling is a bottleneck in finding good spots.
Check out all of the bullshit “AI” companies that YC is funding.
BigTech is not “losing ground”; all of them are reporting increasing revenues and profits.
Or on a smaller scale, what's the last genuine Atlassian success?
Yet, when it comes to product innovation, the momentum is always on the side of the new players. Always has been.
Project management/work organization software? Linear. Async communication? Slack. Social Media? TikTok. One has to be curious how Zoom is doing so well, given that all the big competition actually controls the channels for setting up meetings. Self-publishing? Substack. Even with AI, everyone plays catch-up with Sam Altman, and many of the most prominent companies are newcomers.
We could go on and on.
Yes, Big Techs will survive because they have enough momentum to survive events such as the Ballmer-era MS. But that doesn't mean they lead product innovation.
And it's expected. Conflicting priorities, growing bureaucracies, shareholders' expectations, old business lines (and more), all make them less flexible.
An innovative product is one where customers in aggregate are willing to pay more for it than it costs to create and run. Any idiot can sell a bunch of dollar bills for 95 cents.
Going back to the latest batch of YC companies, their value play can easily be duplicated by any company in their vertical, either by throwing a few engineers at it or by creating a statement of work for the consulting company I work for; I can pull together a few engineers and knock it out in a few months, and they will already have customers to sell it to.
There was one recent YC company (of course one of the BS AI companies) that was hiring a “founding full stack engineer” for $150K. It looks like they were two non-technical “serial entrepreneurs” without even an MVP that YC threw money at.
You can’t imagine how many times some harebrained, underfunded startup reached out to me to be a “CTO” that paid less than I made as a mid level employee at BigTech with the promise of Monopoly money “equity”.
Claude crapped out a workable landing page in ~30 seconds of prompting. I updated the copy on the page, total time less than an hour.
The odds of me spending more than an hour just picking a color theme for the page or finding the SVG icons it used are pretty much 100%.
------------
I had a bug in some async code, it hit rarely but often enough it was noticeable. I had narrowed down what file it was in, but after over an hour of staring at the code I wasn't finding it.
Popped into cursor, asked it to look for async bugs in the current file. "You forgot to clean up a resource on this line here."
Bug fixed.
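For what it's worth, that class of bug usually has a shape like this (a minimal, hypothetical sketch in Python's asyncio, not the actual code from the story): a per-request background task that leaks unless it's cancelled on every exit path.

```python
import asyncio

async def watchdog(conn_id: str) -> None:
    # Hypothetical per-request background task.
    while True:
        await asyncio.sleep(1.0)

async def do_work(conn_id: str) -> str:
    await asyncio.sleep(0.01)
    return f"done: {conn_id}"

async def handle_request(conn_id: str) -> str:
    task = asyncio.create_task(watchdog(conn_id))
    try:
        return await do_work(conn_id)
    finally:
        # The one-line fix: without this finally block, any exception in
        # do_work() leaks the watchdog task -- it hits rarely, but often
        # enough to be noticeable.
        task.cancel()

if __name__ == "__main__":
    print(asyncio.run(handle_request("conn-1")))
```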
------------
"Here is my nginx config, what is wrong with the block I just added for this new site I'm throwing up?"
------------
"Write a regex to do nnnnnn"
------------
"This page isn't working on mobile, something is wrong, can you investigate and tell me what the issues may be?"
Oh that won't go well, all of the models get super confused about CSS at some point and end up in doom spirals applying incorrect fixes again and again.
> Googles and Amazons of this world would be beating the crap out of every aspiring startup in any niche the big tech cared about, even remotely. And that simply is not happening.
This is already a well explored and understood space, to the extent that big tech cos have at times spun teams off to work independently to gain the advantage of startup-like velocities.
The more infra you have, the more overhead you have. Deploying a company's first service to production is really easy, no infra needed, no dev-ops, just publish.
Deploying the 5th service, eh.
Deploying the 50th service, well by now you need to have a host of meetings before work even starts to make sure you aren't duplicating effort and that the libraries you use mesh with the department's strategic technical vision. By the time those meetings are done, a startup will have already put 3 things into prod.
The communication overhead within large orgs is also famously non-linear.
I spent 10 years working at Microsoft, then 3 years at HBO Max (a lean tech company, 200 engineers, amazing dev ops), and now I'm working at startups of various sizes.
At Microsoft, pre-Azure, it could take weeks just to get a machine provisioned to test an idea out on. Actually getting a project up and running in a repo was... hard at times. Build systems were complex, tooling was complex, and you sure as hell weren't getting anything pushed to users without a lot of checks in place. Now many of those checks were in place for damn good reasons: wrongly drawn lines on a map inside Windows are a literal international incident[1], and we had separate localizations for different variants of English around the world. (And I'd argue that Microsoft's agility at deploying software around the entire world at the same time is unmatched; the people I worked with there were amazing at sorting through the cultural and legal problems!)
Also if Google launches a new service and it goes down from too much traffic, it is embarrassing. Everything they do has to be scalable and load balanced, just to avoid bad press. If a startup hits the front page of HN and their website goes down from being too popular, they get to write a follow up blog post about how their announcement was so damn popular their site crashed! (And if they are lucky, hit the front page of HN again!)
The differences in designing for levels of scale are huge.
At Microsoft it was "expect potentially a billion users." At HBO it was "expect tens of millions of users." At many startups it is "if we hit 10k users we'll turn a profit and we can figure out how to scale out later."
10K DAU is a load balancer and 3 instances of NodeJS (for rolling updates), each running on a potato of a CPU.
> So, I don't think I'm biased toward bureaucratic environments, where developers code in MS Word rather than VS Code.
I've worked in those environments, and the level of engineering quality can be much higher. The number of bugs that can be hammered out and avoided in spec reviews is huge. Technology designs end up being serviceable for years to decades instead of "until the next rewrite". The actual code tends to flow much faster as well, or at least as fast as it can flow in the large sprawling code bases that exist at big tech companies. At other times, those specs are needed so that one has a path forward while working through messy legacy code bases.
Both styles have their place - sometimes you need to iterate quickly and get lots of code down and see what works, other times it is worth thinking through edge cases, usage scenarios, and performance characteristics. Heck, I've done memory bus calculations for different designs; when you are working at that level you don't just "write code and see what works", you first spend a few days (or a week!) with some other smart engineers and try to narrow down the potential field of what you should even be trying to do!
[1] https://www.upi.com/Archives/1995/09/09/Microsoft-settles-In...
Paul Buchheit's stories about Gmail and AdSense are good examples. I was an early Gmail user when it was invitation-only and invitations were distributed sparingly (only as fast as the infrastructure could handle).
So, while I understand the difference in PR costs, it's not like they don't have tools to run smaller experiments.
I agree with the huge bureaucracy cost. On the other hand, they really have (relatively) infinite resources if they care to deploy them. And sometimes they do. And they still fail.
They often fail even when they try a Skunk Works-like approach. Google Wave was famously developed as a corporate Lean Startup (before there was Lean Startup). It was a disaster. Precisely because they did close to zero validation pre-release.
A side note: huge flop as it was (although Buzz and Google+ were bigger), it didn't hurt them long term in PR or reputation.
People criticize Microsoft's historical fiefdom model, and it had its issues, but it also allowed orgs to find what worked for them and basically run independently. Of course it also had orgs fighting with each other and killing off good products.
Xbox was also a skunk works project at Microsoft (a few good books have been written about it!) and so was Microsoft Band. Xbox succeeded, Band failed for a number of reasons not related to the product or execution itself. (Politics and some historical corporate karma).
IMHO the only company good at deploying infinite resources quickly is Apple. $1 billion developing the first Apple Watch (Microsoft spent under $50 million on two generations of Band!), and then they kept going after the market, even though the first version was kinda meh. In comparison, Google Wear was on-again, off-again for years until they finally took it seriously recently. I'm sure they spent lots of $, but the end result is nowhere near what Apple pulled off.
Perhaps I've just misunderstood the point, but it seems like a nonsensical argument.
Do we always have to build it before we know that it will work (or, in 9 cases out of 10, that it will not work)?
Even more so, do we have to build a fully-fledged version of it to know?
If yes, then I agree, development is the bottleneck.
The effort it takes to implement a feature makes it more likely you think twice before you start.
If the effort goes to zero, so does the thinking.
We will turn from programmers to just LLM customers sooner or later.
Because testing whether it works can be done by non-programmers.
It is telling that, while the article's theme is product management (and its relationship with the pace of development), that context is largely ignored in some comments. It's as if the article's scope was purely what happens within the IDE and/or AI agent of choice.
The whole point is that the perspective necessarily should be broader. Otherwise, we make it a circular argument, really: development is a bottleneck of development.
Well, hard to disagree on that.
The whole Lean Startup was about figuring out how to validate ideas without actually developing them. And it is as relevant as ever, even with AI (maybe, especially with AI).
In fact, it's enough to look at the appalling rate of product success. We commonly agree that 90% of startups fail. The majority of that cohort have built things that shouldn't have been built at all in the first place. That's utter waste.
If only, instead of focusing on building more, they stopped and reevaluated whether they were building the right thing in the first place. Yet, most startups are completely immersed in the "development as a bottleneck" principle. And I say this from our own 20+ years of experience helping such companies build their early-stage products. The biggest challenge? Convincing them to build less, validate, learn, and only then go back to further development.
When it comes to existing products, it gets even more complex. The quote from Leah Tharin explicitly mentions waiting weeks/months until they were able to get statistically significant data. What follows is that within that part of experimentation, they were blocked.
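To make the weeks/months concrete, here's the rough arithmetic behind that waiting (illustrative numbers, not Leah Tharin's): with a modest baseline conversion rate and a small minimum detectable effect, the required sample size alone dictates roughly a month per experiment.

```python
# Rule-of-thumb sample size per arm for a two-sided test at alpha=0.05
# with 80% power: n ~= 16 * p * (1 - p) / delta^2.
baseline = 0.05        # assumed baseline conversion rate (5%)
delta = 0.005          # minimum detectable effect: 0.5 percentage points
n_per_arm = 16 * baseline * (1 - baseline) / delta ** 2

users_per_day = 1_000  # assumed traffic routed into each arm

print(f"~{n_per_arm:,.0f} users per arm")                        # ~30,400
print(f"~{n_per_arm / users_per_day:.0f} days to significance")  # ~30
```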
Another angle to take a look at it is the fundamental difference in innovation between Edison/Dyson and Tesla.
The first duo was known for "I have not failed. I found 10,000 ways that don't work." They were flailing around with ideas till something eventually clicked.
Tesla, in contrast, would be at the Einstein end of the spectrum with "If I had an hour to solve a problem, I'd spend 55 minutes thinking about the problem and 5 minutes thinking about [or in Tesla's case, making] solutions."
While most of the product companies would be somewhere in between, I'd argue that development is a bottleneck only if we are very close to Edison/Dyson's approach.
It is about designing good experiments, validating, and learning, so that when we're down to development, we build something that's way more likely to succeed.
The fact that we were advised to build non-technical experiments is but a small part. And with the current AI capabilities, we actually have a new power tool for prototyping that falls neatly into the whole puzzle.
Here's a bit more elaborate argument (sorry for a LinkedIn link): https://www.linkedin.com/posts/pawelbrodzinski_weve-already-...
Move faster and move better (to move faster) are the same thing. You reduce costs by going faster, and with lean you go faster by avoiding time wasters.
I use Python differently because uv made many things faster, less costly. Stuff I used to do in bash is now in Python. Stuff I wouldn't do at all, because 3rd party modules were an incompressible expense, I now do because the cost is low.
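For instance, uv's inline script metadata (PEP 723) is what makes third-party modules cheap for throwaway scripts; a minimal sketch, where the dependency and the URL-fetching task are purely illustrative:

```python
# /// script
# requires-python = ">=3.12"
# dependencies = ["httpx"]
# ///
# Run with `uv run fetch.py <url>`: uv resolves httpx into an ephemeral
# environment on the fly, with no manual venv or pip install step.
import sys

import httpx

def main() -> None:
    response = httpx.get(sys.argv[1], follow_redirects=True)
    print(response.status_code, len(response.content), "bytes")

if __name__ == "__main__":
    main()
```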
Same with AI.
Every week, there was a small tool I actively chose not to develop because I knew it would save less time by automating the thing than it would take to code it.
E.g., I regularly send documents from my hard drive, or forward emails, to a specific address for accounting. It would be nice to be able to do those in one click. But developing a Nautilus script or Thunderbird extension to save at most a minute a day doesn't make sense.
Except now with claude code, it does. In a week, they paid off. And now I'm racking up the minutes.
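Such a Nautilus script can be as small as this (a sketch: the addresses and the local SMTP relay are placeholders, not my actual setup):

```python
#!/usr/bin/env python3
# Dropped into ~/.local/share/nautilus/scripts/, this emails the selected
# files to accounting in one right-click.
import os
import smtplib
from email.message import EmailMessage
from pathlib import Path

SENDER = "me@example.com"              # placeholder
ACCOUNTING = "accounting@example.com"  # placeholder

def main() -> None:
    # Nautilus hands over the selection as newline-separated paths.
    selected = os.environ.get("NAUTILUS_SCRIPT_SELECTED_FILE_PATHS", "")
    msg = EmailMessage()
    msg["From"], msg["To"] = SENDER, ACCOUNTING
    msg["Subject"] = "Documents for accounting"
    for line in selected.splitlines():
        path = Path(line)
        if path.is_file():
            msg.add_attachment(
                path.read_bytes(),
                maintype="application",
                subtype="octet-stream",
                filename=path.name,
            )
    with smtplib.SMTP("localhost") as smtp:  # assumes a local relay
        smtp.send_message(msg)

if __name__ == "__main__":
    main()
```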
Now each week, I'm getting a new tool that is not only saving me minutes, but also reducing context switching. Those turn into hours, which turn into days. These compound.
And of course, getting an MVP or a new feature demo out of the door quickly allows you to get feedback faster.
In general, AI lets you get a shorter feedback loop. Trash bad concept sooner. Get crucial info faster.
Those do speed up a project.
But with LLMs I'm not so sure. I feel like I can skip the effort of typing, which is still effort, despite years of coding. I feel like I actually did end up spending quite a lot of time doing trivial nonsense like figuring out syntax errors and version mismatches. With an LLM I can conserve more of my attention on the things that really matter, while the AI sorts out the tedious things.
This in turn means that I can test more things at the top architectural level. If I want to do an experiment, I don't feel a reluctance to actually do it, since I now don't need to concentrate on it, rather I'm just guiding the AI. I can even do multiple such explorations at once.
With the LLM I really can spend most of my time on the verification problem.
Depending on your subject matter you might only need an idea or two per 100 LOC generated. So much of what I used to do turns out to be grunt work that was simply pattern matching on simple heuristics, but I can churn out 5-10 good ideas per hour it seems, so I’m definitely rate limited on coding.
Similar to your comment on architectural experiments, one thing I have been observing is that the critical path doesn’t go 10x faster, but by multiplexing small incidental ideas I can get a lot more done. Eg “it would be nice if we had a new set of integration tests that stub this API in some slightly tedious way, go build that”.
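The stub work being delegated there is mechanical but real; it's roughly this shape (hypothetical names throughout):

```python
from dataclasses import dataclass, field

@dataclass
class StubPaymentsAPI:
    # Canned responses keyed by invoice id, plus a record of calls,
    # so integration tests run offline and can assert on usage.
    responses: dict[str, str] = field(default_factory=dict)
    calls: list[str] = field(default_factory=list)

    def charge(self, invoice_id: str) -> str:
        self.calls.append(invoice_id)
        return self.responses.get(invoice_id, "declined")

def settle_invoice(api, invoice_id: str) -> bool:
    # Stand-in for the code under test.
    return api.charge(invoice_id) == "ok"

def test_settle_invoice_records_the_call():
    api = StubPaymentsAPI(responses={"inv-42": "ok"})
    assert settle_invoice(api, "inv-42")
    assert api.calls == ["inv-42"]
```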
It's basically the wetware equivalent of page thrashing.
My experience is that I write better code faster by turning off the AI assistants and trying to configure the IDE to as best possible produce deterministic and fast suggestions, that way they become a rapid shorthand. This makes for a fast way of writing code that doesn't lead to mental model thrashing, since the model can be updated incrementally as I go.
The exception is using LLMs to straight up generate a prototype that can be refined. That also works pretty well, and largely avoids the expensive exchanges of information back and forth between human and machine.
My new paradigm is something like:
- write a few paragraphs about what is needed
- have the bot take in the context and produce a prototype solution outside of the main application
- have the bot describe main integration challenges
- do that integration myself — although I’m still somewhat lazy about this and keep trying to have the bot do it after the above steps; it seems to only have maybe a 50% success rate
- obviously test thoroughly
Cognitively, these are very different tasks. With the former, we actively drive technical decisions (decide on architecture, implementation details, even naming). The latter offers all these decisions made, and we first need to untangle them all before we can scrutinize the details.
What's more, often AI-generated code results in bigger PRs, which again adds to the cognitive load.
And some developers fall into a rabbit hole of starting another thing while they wait for their agent to produce the code. Adding context switching to an already taxing challenge basically fries brains. There's no way for such a code review to consistently catch the issues.
I see development teams defining healthy routines around working with generated code, especially around limiting context switching, but also taking back tasks to be done by hand.
You even have CEOs of car companies getting fired because they mess this up. Sonos lost a lot of value and got their CEO fired because they messed up and couldn't fix it in time.
Speed is not everything. Developing the right features (what users want) and quality are the most important things, but development speed allows you to test features, fix things fast, and course correct.
Accompanying many early-stage startups in their journey, I see how often the development (which we're responsible for) takes a back seat. Sometimes the pivotal role will be customer support, sometimes it will be business development, and often product management will drive the whole thing.
And there's one more follow-up thought to this observation. Products that achieved success, inevitably, get into a spiral of getting more features. That, in turn, makes them more clunky and less usable, and ultimately opens a way for new players who disrupt the niche.
At some point, adding more features in general makes things worse--too complicated, too overwhelming, making it harder to accomplish the core task. And yet, adding new stuff never ceases.
In the long run, the best tactic may actually be to go slower (and stop at some point), but focus on the meaningful changes.
But there are people with great product taste who can know by trying a product whether it meets a real user need - some of these are early-adopter customers, sometimes they are great designers, sometimes PMs. And they really do need to try a product (or prototype) to really know whether it works. I was always frustrated as a junior engineer when the PM would design a feature in a written spec, we would implement it, and then when trying it out before launch, they would want to totally redesign it, often in ways which required either terrible hacks or significant technical design changes to meet the new requirements. But after 15 years of seeing some great ideas on paper fall flat with our users, and noticing that truly exceptional product people could tell exactly what was wrong after the feature was built but before it was released to users, I learned to be flexible about those sorts of rewrites. And it’s exactly that sort of thing that vibecoding can accelerate
Also, in the past I've done interactive maps and charts for different media organizations, and people would often debate for a considerable amount of time whether to, for example, make a bar or line chart (the actual questions and visualizations themselves were usually more sophisticated).
I remember occasionally suggesting prototyping both options and trying them out, and intuitively that usually struck people as impractical, even though it would often take less time than the discussions and yield more concrete results.
And don't take that as a complaint. It's a basic behavioral observation. What we say we do is different from what we really do. By the same token, what we say we want is different from what we really want.
At the risk of being a bit sarcastic: we say we want regular exercise to keep fit, but we really want doomscrolling on a sofa with a beer in hand.
In the product development context, we have a very different attitude towards an imagined (hell, even wireframed) solution than an actual working piece of software. So it's kinda obvious we can't get it right on the first attempt.
We can be working toward the right direction, and many product teams don't even do that. For them, development speed is only a clock counting time remaining before VCs pull the plug.
When we were writing a compiler at Sycor, there were teams waiting for us to finish our development. We were successful, ending up about an order of magnitude faster than the effort we replaced.
And just because Google cancels products doesn't suggest anything about development speed.
If I were an LLM advocate (having much fun currently with gemini), I would let the criticism roll and make book using LLMs.
VP of Product put all the pressure on dev teams to deliver all the features against the specs. Then they release the new product/new version with plenty of fanfare.
And then literally no one measures which parts have actually delivered any value. I'd bet a big part of that code added no value, so it's pure waste. Some other parts were actually harmful. They frustrated users, drove key metrics down, or what have you. They are worse than waste.
But no one cared to check. Good product people, and there are precious few of them, would follow up with validation on what worked and what did not. They would argue against "major" releases whenever possible.
And seriously, if Amazon can avoid major releases, almost anyone could.
Suddenly, we might flip the script and have a VP of Product not asking "when will it be done?" but rather trying to figure out what the next most sensible experiments are.
The context of the article is product development, with a bias toward the commercial part of the ecosystem. And of course, as any picture painted with broad strokes, some generalizations were inevitable.
As a scientist, you definitely are familiar with the weight (or lack thereof) of anecdotal evidence. Unless the claim is "it can never work" or "it always works," my individual experience is just that--an individual experience.
But fine, let's take the subset of features / projects that can be tested or somehow validated. In my experience (having worked for 13+ years at companies that prefer to A/B almost everything), more than half of the tests fail. People initially might think the solution is to have better ideas, cook them more, do better analysis. That's usually wrong. I've seen PhDs with 20+ years of experience in a given industry (Search) launch experiments, and they still fail.
The solution is to have some sort of "just enough" analysis like user studies, intuition, and business needs, and launch as fast and as many as you can. Therefore, development speed is A bottleneck (there's no Silver Bullet so it's not THE bottleneck).
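For concreteness, the "did this test win?" check behind each of those launches is typically something like a two-proportion z-test (illustrative numbers below):

```python
import math

def two_proportion_z(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    # Pooled two-proportion z-statistic for an A/B conversion test.
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# Illustrative numbers: an apparent 8% relative lift that still lands
# under the 1.96 threshold for 95% confidence -- i.e., one of the
# majority of experiments that "fail".
z = two_proportion_z(conv_a=500, n_a=10_000, conv_b=540, n_b=10_000)
print(f"z = {z:.2f}")  # ~1.27, not significant
```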
Development speed absolutely is a bottleneck. But coding speed? Like, typing? Yeah, I can definitely type faster than I can think about code, or anything really (typing at 100wpm is a fun party trick but not super useful in the end). Many times over... Even single-finger typists who peck at the keyboard probably can; auto-complete has existed for a long time...
They don't understand that this AI was built decades ago and has been improved on several times over: Compilers & Interpreters. Furthermore, you don't need billion-dollar neural-network supercomputers, just a vanilla laptop.
It's because of how you talk about the job, though. We automate every other kind of "coding" - why can't we automate yours?
LLMs are a tool that added a new dimension to explore. While I, like many, haven't felt actual gains, others are finding them, and time will allow us to better judge whether those can lead to long-term impacts on the economy.
Just based on what I've been reading and experiencing:
- Short-term POCs can reach the validation stage faster.
- Mature cloud software needs a lot of extra tooling (LLMs don't understand the codebase, lack places to derive good context from, and so on).
- Anything in between for cloud seems to be hit or miss, with people mostly trading first-iteration time for more refactoring later down the line.
From another perspective, areas of software where things are a lot more about numbers (CPU time, memory consumption, and so on) may benefit a lot from faster development/coding, as the validation phase is either shorter or can be executed in parallel.
The key reality here is that I've been observing higher expectations for deliveries without proof that we actually got better at coding in general. Which means that sacrifices are being made somewhere.
With a more complex code base (and a less popular tech stack), the perceived gains quickly diminish. Beyond a certain level of tech debt, AI-generated code is utterly useless. It's no surprise that we see people who vibe-coded their products with no technical knowledge whatsoever, and now they call professional engineers to untangle the mess.
A software agency I know well responded to the rise of AI with something along the lines of "Now, we'll have plenty of work to clean up all that mess!" Admittedly, they always specialized in complex/rescue engineering gigs.
However, the "development as a bottleneck" discussion was set here in a broader context. It's not only how efficiently we are able to deliver bits of functionality, but primarily whether we should be building these things in the first place.
For early-stage startups and established products alike, so many features are built because someone said so. At the end of the day, they don't deliver any value (if we're lucky) or are plain harmful (if we're out of luck).
In such cases, it would have been better if developers actually sipped coffee and read Hacker News rather than coded/developed/engineered stuff.
Other than the fact that the discovery process of what you should build is the hardest and costliest part, the main conclusion from the article seems to be that if you outsource the first iterations to AI via vibe-coding, you will have a much harder time changing and evolving it from there (iterating); to this, I agree.
Go ahead and code as much as you want. Unless you can communicate the utility of that code to a paying customer it has no value or relevance.
Nobody wants to believe it, but just try compiling C++ on Windows and again in a Linux VM. Linux in a VM on the same host compiles at least twice as fast. It's insanity. I tried a script that rsync's the project files to my server from 2013, runs the build, and rsync's the artifacts back. Running the build on a Xeon 2500 is still far faster with Linux than Windows on my two-year-old i9, even with the overhead of sending binaries over the internet. Absolutely disgusting.
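That remote-build loop is simple enough to sketch (hypothetical host alias and paths; it assumes the remote CMake build directory is already configured):

```python
#!/usr/bin/env python3
# Push sources to the Linux box, build there, pull artifacts back.
# Sketch only: "builder" is a hypothetical ssh alias and the paths
# are placeholders.
import subprocess

HOST = "builder"
REMOTE = f"{HOST}:~/build/proj/"

def run(*cmd: str) -> None:
    # Fail loudly if any step of the pipeline breaks.
    subprocess.run(cmd, check=True)

def main() -> None:
    run("rsync", "-az", "--delete", "--exclude", "build/", "./", REMOTE)
    run("ssh", HOST, "cmake --build ~/build/proj/build -j$(nproc)")
    run("rsync", "-az", f"{REMOTE}build/bin/", "./build/bin/")

if __name__ == "__main__":
    main()
```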