The duality of AI’s impact is coming into sharper focus as AI tools become more capable.
The Promise and Limitations of Large Language Models
We glimpsed the huge potential of large language models (LLMs) with the release of ChatGPT. But the unreliable factual accuracy of the content they generate limited the usefulness of LLMs for high-fidelity factual use cases. Early versions also could not perform web searches or external lookups; their answers were based only on the data they were trained on, so their knowledge had a cutoff date. While LLMs may be great for brainstorming or drafting ad copy, we need to cross-verify whenever numbers are involved or whenever we need generated text to be grounded in truth and to reflect the most up-to-date information available.
Rise of New Capabilities: Citations, Search, System 2 Thinking, and AI Agents
These problems have been mitigated, not fully but to a meaningful extent, by giving LLMs the ability to perform external searches and to provide sources and references for their output. At the least, it is now easier to follow a link to the source and cross-check the claims if you are diligent. Another advance is inference-time scaling (also known as test-time compute).
Let’s understand its key benefit. Older ChatGPT models generated responses right away, as soon as you finished entering your prompt. But for many real-world questions, particularly in a business context, you may not need an answer immediately; you can wait a few hours or, in some cases, even days. That is the idea behind inference-time scaling: what if AI could generate better-quality responses by working through a problem step by step (chain-of-thought reasoning) and “thinking” for a longer duration (in non-anthropomorphic terms, by allocating more compute)? Newer AI models such as OpenAI o1, OpenAI o3-mini, Google’s Gemini 2.0 Flash, Anthropic’s Claude 3.7 Sonnet, and DeepSeek R1 support this capability. A rough analogy is to the System 1 (involuntary and fast) and System 2 (deliberative and slower) modes of thought popularized by psychologist Daniel Kahneman in his book Thinking, Fast and Slow; the newer models support slower but more logical and analytical System 2 thinking.
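To make the idea concrete, here is a minimal sketch of one common inference-time scaling technique, self-consistency: sample several chain-of-thought answers and keep the majority vote. The `ask_llm` function is a hypothetical placeholder for whatever completion API you use; the pattern of spending more compute per question, not the API, is the point.

```python
from collections import Counter

def ask_llm(prompt: str, temperature: float = 0.8) -> str:
    """Hypothetical stand-in for a provider's completion API."""
    raise NotImplementedError("wire this to your LLM client")

def answer_with_more_compute(question: str, n_samples: int = 10) -> str:
    # Chain-of-thought prompt: ask the model to reason before answering.
    prompt = f"{question}\nThink step by step, then put the final answer alone on the last line."
    finals = []
    for _ in range(n_samples):  # more samples = more compute = often better accuracy
        reasoning = ask_llm(prompt)
        finals.append(reasoning.strip().splitlines()[-1])  # keep only the final line
    # Majority vote across independent reasoning chains.
    return Counter(finals).most_common(1)[0][0]
```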
Another important construct is the AI agent. Think of this as an AI tool that can complete a given task (mostly) independently and to a reasonable standard of quality. Unlike traditional LLMs, which provide one-shot, siloed responses, AI agents can execute multi-step workflows and refine their outputs based on additional data.
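In code, the core of an agent is a simple loop: the model picks an action, observes the result, and repeats until it decides the task is done. The sketch below assumes hypothetical `ask_llm` and `web_search` helpers; real agent frameworks add tool schemas, memory, and error handling on top of this skeleton.

```python
import json

def ask_llm(prompt: str) -> str:
    """Hypothetical LLM call; replace with your provider's client."""
    raise NotImplementedError

def web_search(query: str) -> str:
    """Hypothetical search tool; replace with a real search API."""
    raise NotImplementedError

TOOLS = {"search": web_search}

def run_agent(task: str, max_steps: int = 8) -> str:
    history = [f"Task: {task}"]
    for _ in range(max_steps):
        # Ask the model to choose the next action as structured JSON.
        decision = json.loads(ask_llm(
            "\n".join(history)
            + '\nReply with JSON: {"action": "search" or "finish", "input": "..."}'
        ))
        if decision["action"] == "finish":
            return decision["input"]  # the agent's final answer
        # Execute the chosen tool and feed the observation back into context.
        observation = TOOLS[decision["action"]](decision["input"])
        history.append(f"Action: {decision}\nObservation: {observation}")
    return "Stopped after max_steps without finishing."
```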
Rise of Deep Research AI Assistants
Now, all of these refinements and enhancements (external information search, citations, reasoning models, inference-time scaling, and agents) have come together in the form of AI research assistant tools. OpenAI, Perplexity, and Gemini each offer a tool called Deep Research (all three have chosen the same name), while xAI’s Grok 3 tool is called Deep Search. Another AI agent, Manus, built on top of Anthropic’s Claude 3.5 Sonnet and Alibaba’s Qwen models, offers similar functionality.
OpenAI’s Deep Research tool comes with the company’s Pro plan ($200 a month, with a limit of 120 queries; 10 Deep Research queries per month are included with the $20-a-month Plus plan), while Perplexity’s is available at $20 per month.
What Are AI Research Assistants Capable Of?
Give these tools a prompt (a research or analysis task) and they search for online sources with relevant information, analyze and synthesize what they find, and prepare a report. Before proceeding, the assistant presents a research plan that you can confirm or modify. What normally takes human analysts many hours or even a few days is now done in minutes to an hour. Based on early tests, the quality of the research and analysis seems reasonable.
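The flow these assistants follow can be summarized in a few lines of Python. As before, `ask_llm` and `web_search` are hypothetical placeholders; the real products add browsing, citation tracking, and far more elaborate synthesis.

```python
def ask_llm(prompt: str) -> str:
    """Hypothetical LLM call."""
    raise NotImplementedError

def web_search(query: str) -> str:
    """Hypothetical search tool."""
    raise NotImplementedError

def deep_research(task: str) -> str:
    # 1. Draft a research plan and let the user confirm or edit it.
    plan = ask_llm(f"Draft a step-by-step research plan for: {task}")
    print(plan)
    revised = input("Press Enter to accept the plan, or type a revision: ").strip()
    steps = (revised or plan).splitlines()
    # 2. Gather sources for each step of the plan.
    findings = [web_search(step) for step in steps if step.strip()]
    # 3. Synthesize everything into a report with references.
    return ask_llm(
        f"Write a report on {task!r}, citing sources, based on:\n" + "\n".join(findings)
    )
```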
The usual caveats apply: you can’t rely on the information completely and still must cross-verify it; important information may be missing; the sources used may be of low quality; and so on. Important insights and data may sit in subscriber-only or paid repositories that the tool cannot access. There isn’t yet a way to specify or provide the reports and papers that should be part of the analysis, though we can expect some of this capability to be added in the near future.
Implications for Knowledge Work
Net-net, what do we have? We have a research analyst for $200 per month. What are the implications for knowledge work?
While the number of dedicated research analyst and researcher jobs may be modest (I’d estimate a few million worldwide, and analysts do several other tasks as part of their jobs besides producing reports), some amount of research and analysis is part of practically every knowledge work function: strategy, product management, marketing, finance, policy, and R&D. So analyst jobs appear to be in for some changes. Here are two scenarios for how this could play out as AI research assistant tools are widely adopted.
Scenario 1: Analyst as Fact-Checker
With AI doing the heavy lifting on producing analysis reports, will the analyst jobs morph into fact-checking and hallucination-whacking roles? Will there be downward pressure on wages as this work directly competes with AI? Will the senior researchers and analysts be assisted by multiple AI agents instead of junior human researchers? Where will the next generation of senior analysts come from when there are no juniors who progress to more senior roles?
Scenario 2: New Vistas of Demand and Growth
Or will the research and analysis field flourish and grow faster, because demand for research and analysis goes up when analysts become super productive and the cost of producing research drops significantly?
A Fork in the Road?
The course of automation never did run smoothly, and it could go either way. If your job (like mine) involves a lot of research, analysis, and writing, pay attention to how these AI research tools and assistants are shaping up.