Is prompt engineering a ‘fad’ hindering AI progress?

Is the art and science of prompt engineering, the refinement of instructions for generative AI, a good thing or a bad thing? Surprisingly, there isn’t universal agreement. 

Prompt engineering emerged by 2024 as an increasingly important user interface tool after the runaway success of ChatGPT in 2022 and 2023. The realization that shaping and crafting instructions for large language models and related technologies could achieve better or worse results made prompt engineering its own field of vibrant exploration.

Motivated by the belief that “a well-crafted prompt is essential for obtaining accurate and relevant outputs from LLMs,” aggressive AI users, such as ride-sharing service Uber, have created whole disciplines around the topic.

And yet, there is a reasoned argument to be made that prompts are the wrong interface for most users of gen AI, including experts. 

“It is my professional opinion that prompting is a poor user interface for generative AI systems, which should be phased out as quickly as possible,” writes Meredith Ringel Morris, principal scientist for Human-AI Interaction for Google’s DeepMind research unit, in the December issue of computer science journal Communications of the ACM.

Prompts are not really “natural language interfaces,” Morris points out. They are “pseudo” natural language, in that much of what makes them work is unnatural. 

“The fact that variations in prompting that would be irrelevant to a human interlocutor (for example, swapping synonyms, minor rephrasings, changes in spacing, punctuation, or spelling) result in major changes in model behavior should give us all pause,” writes Morris, “and serve as a further reminder that prompts are still quite far from being a natural-language interface.”

Those variations, she notes, are confusing to the average user, who can’t predict what a given phrasing will produce.

Natural language between humans has elements that don’t ever enter into prompting, Morris points out. “When people converse with each other, they work together to communicate, forming mental models of a conversation partner’s communicative intent based not only on words but also on paralinguistic and other contextual cues, theory-of-mind abilities, and by requesting clarification as needed.”

In contrast, “arcane prompts tend to produce better results than those in plain language,” she says, writing that the “subtle differences between prompting and true natural-language interactions lead to confusion for typical end users of AI systems” and “results in the need for specially trained ‘prompt engineers’ as well as prompt marketplaces such as PromptBase.” Even prompt engineering can produce inconsistent, unreliable results, Morris adds. 

It’s not just average users who suffer from prompting’s shortcomings: The use of prompts is poisoning AI research. The research papers trumpeting each new breakthrough don’t reliably report on how many prompts they use to achieve a result, an omission Morris calls “prompt-hacking.”

For example, prompt-hacking may mean that benchmark tests of new AI models, the standard way to evaluate advances, are inconsistent and, therefore, invalid.

“While models are ostensibly testing on the same set of benchmarks,” writes Morris, “in practice, these metrics may not be comparable due to variations in how each organization operationalizes the benchmarking—that is, the format of prompts used to present the tests to the model.”
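To make that point concrete, here is a minimal Python sketch of how one benchmark question can be "operationalized" as two different prompts. Everything in it is a hypothetical illustration rather than anything from Morris's article or a real benchmark harness: the `BenchmarkItem` class, the two formatting functions, and the sample question are all assumed for the example.

```python
from dataclasses import dataclass
from string import ascii_uppercase


@dataclass(frozen=True)
class BenchmarkItem:
    """One multiple-choice test question (hypothetical structure)."""
    question: str
    choices: tuple


def format_lettered(item: BenchmarkItem) -> str:
    """Render the item as a lettered list, one choice per line."""
    lines = [f"Question: {item.question}"]
    for letter, choice in zip(ascii_uppercase, item.choices):
        lines.append(f"{letter}. {choice}")
    lines.append("Answer:")
    return "\n".join(lines)


def format_inline(item: BenchmarkItem) -> str:
    """Render the same item as a single sentence with numbered options."""
    options = "; ".join(
        f"({number}) {choice}"
        for number, choice in enumerate(item.choices, start=1)
    )
    return f"{item.question} Options: {options}. Reply with the option number."


# Illustrative item, not taken from any published benchmark.
item = BenchmarkItem(
    question="Which planet is closest to the Sun?",
    choices=("Venus", "Mercury", "Mars", "Earth"),
)

# Same underlying test, two different prompts sent to a model.
prompts = {
    "lettered": format_lettered(item),
    "inline": format_inline(item),
}

for name, text in prompts.items():
    print(f"--- {name} ---\n{text}\n")
```

Both prompts carry identical content, yet two organizations that each pick a different format are, in effect, administering different tests. Scores are only comparable if the format used is reported alongside the result.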

In place of prompting, Morris suggests a variety of approaches. These include more constrained user interfaces with familiar buttons to give average users predictable results; “true” natural language interfaces; or a variety of other “high-bandwidth” approaches such as “gesture interfaces, affective interfaces (that is, mediated by emotional states), direct-manipulation interfaces (that is, directly manipulating content on a screen, in mixed reality, or in the physical world).”

Morris contends that all of those approaches, rather than the arcana of prompts, are easier methods of interacting with AI “since they require no learning curve and are extremely expressive.”

AI is “at a critical juncture,” she writes. “Our acceptance of prompting as a ‘good enough’ simulacrum of a natural interface is hindering progress.

“I expect we will look back on prompt-based interfaces to generative AI models as a fad of the early 2020s—a flash in the pan on the evolution toward more natural interactions with increasingly powerful AI systems.”




