ChatGPT for Industry Research: Not Ready for Prime Time
Putting ChatGPT to the test on a subject I am intimately familiar with.
Since its unveiling in November 2022, ChatGPT, OpenAI's generative AI chatbot, has generated a lot of buzz. Although similar projects exist, such as Google's Bard, ChatGPT got an enormous boost because it was free, easy to use, and remarkably versatile: from simple user prompts it could answer questions; write essays, poems, and song lyrics; suggest recipes; and even write software code.
One inevitable question, of course, was to what extent ChatGPT would threaten jobs.
For me, it's an important question. I direct a team of IT and industry research analysts at Avasant. Is ChatGPT a disruptive threat to our business? Would competitors with less-skilled workers be able to generate research as good as ours, or in greater quantity? Would clients even need research firms like ours at all if they could generate their own reports using these tools? Or could we leverage a tool like ChatGPT to make ourselves more productive?
Putting ChatGPT to the Test
To find out, I did a series of tests to see if ChatGPT could write a short research note.
One of my tests was to ask ChatGPT to write about something I am intimately familiar with: the history of the IT research firm Computer Economics, which my partner and I acquired in 2005 and which I sold to Avasant in 2020. (I still manage this business as part of Avasant Research.)
Here is the transcript of my prompts and the responses from ChatGPT, lightly edited for length, along with my evaluation of ChatGPT.
Prompt: “Provide a history of the IT research firm Computer Economics.”
ChatGPT responded:
Computer Economics is a research and consulting firm specializing in the field of information technology (IT). The company was founded in 1982 in Newport Beach, California, by John L_________, who currently serves as the firm's President and CEO.
There are four factual errors in the second sentence of this opening paragraph! Computer Economics was founded in 1979, not 1982, as documented in its Wikipedia entry. (Wikipedia is a primary source for ChatGPT's training data, so how did it miss this?) Moreover, the firm was founded in New York City, not Newport Beach, and by Bruno Bassi, not John L. Finally, John L. was never the company president; he was our VP of Research from 2007 until he retired in 2017. ChatGPT just made it all up.
But wait, there's more! A couple of paragraphs later, ChatGPT writes:
Computer Economics … has received several awards for its work, including recognition from Forrester Research as a “Leading IT Research and Advisory Firm” and from Gartner as a “Top IT Research and Consulting Firm.”
But neither Forrester nor Gartner has any such award program. In fact, it would be hilarious if they did, since they are both IT research and consulting/advisory firms themselves! ChatGPT just made up that part. So, I gave ChatGPT another prompt and a chance to correct itself.
Prompt: “Please provide more information about Gartner's award program for Top IT Research and Consulting firms. When was this program established and who were some of the most recent winners of this award?”
ChatGPT responded:
[Screenshot: ChatGPT's response, captioned "ChatGPT Hallucinations" in the original post]
Apparently, ChatGPT is not aware of the First Law of Holes: When you find yourself in one, stop digging.
My prompt asked who some recent award winners were. Now it says the winners are not publicly available. What kind of award keeps the winners secret? Moreover, if the winners are secret, how does it know Computer Economics was one of them? At the same time, the winners must not be secret, because they “can be found in Gartner’s annual report on the market for IT research and consulting services” (which, of course, does not exist).
Risks in the Use of ChatGPT for Research
In summary, here are some observations on the risks of using ChatGPT as a virtual research analyst.
Fiction parading as fact. As shown above, ChatGPT is prone to simply making things up. When it does, it declares its fabrications with confidence, a behavior some have called hallucination. Whatever savings a research firm might gain in analyst productivity, it might lose in fact-checking, since you can't trust anything ChatGPT says. If ChatGPT says the sun rises in the east, you might want to go outside tomorrow morning to double-check it.
Lack of citations. Fiction parading as fact might not be so bad if ChatGPT would cite its sources, but it refuses to say where it got its information, even when asked to do so. In AI terms, it violates the four principles of explainability.
Risk of plagiarism. The lack of citations means you can never be sure whether ChatGPT is committing plagiarism. It never uses direct quotes, so it is most likely paraphrasing from one or more sources. But this can be difficult to spot. More concerning, it might be copying an original idea or insight from another author, opening the door to the misappropriation of copyrighted material.
Possible Limited Uses for ChatGPT
We are still in the early days of generative AI, and it will no doubt improve in the coming years. So there may be some limited uses for ChatGPT in writing research. Here are two ideas.
The first use might be simply to help overcome writer’s block. We all know what it’s like to start with a blank sheet of paper. ChatGPT might be able to offer a starting point for a blog post or research note, especially for the introduction, which the analyst could then refine.
A second use case might be asking ChatGPT to propose a structure for a research note. To test this, I thought about writing a blog post on the recent layoffs in the tech industry. I had some ideas about what to write but wanted to see whether ChatGPT could come up with a coherent structure. So, I gave it a list of tech companies that had recently announced layoffs. Then I gave it some additional prompts:
What do these companies have in common? Or are the reasons for the layoffs different for some of them?
As a counterpoint, include some examples of tech companies that are hiring.
Talk about how these layoffs go against the concept of a company being a family. Families do not lay off family members when times are tight.
Point out that many employees in the tech industry have never experienced a downturn and this is something that they are not used to dealing with.
The result was not bad. With a little editing, rearranging, and rewriting, it could make a passable piece of news analysis. As noted earlier, however, the results would need to be carefully fact-checked, and citations might need to be added.
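For readers who want to experiment with this kind of iterative prompting themselves, here is a minimal sketch of how the same workflow could be scripted against OpenAI's API rather than typed into the chat window. It assumes the openai Python package (version 1 or later) and an OPENAI_API_KEY environment variable; the model name is illustrative, and the company list is a placeholder since I am not reproducing my full list here.

```python
# A minimal sketch of the multi-prompt workflow described above, using the
# openai Python package (v1+). Model name and company list are illustrative.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Start the conversation with the raw material: a list of companies.
messages = [
    {"role": "user",
     "content": "Here is a list of tech companies that recently announced "
                "layoffs: <company list goes here>. Draft a short news "
                "analysis based on this list."},
]

# Follow-up prompts that steer the structure of the note, one at a time.
follow_ups = [
    "What do these companies have in common? Or are the reasons for the "
    "layoffs different for some of them?",
    "As a counterpoint, include some examples of tech companies that are "
    "hiring.",
    "Talk about how these layoffs go against the concept of a company being "
    "a family. Families do not lay off family members when times are tight.",
    "Point out that many employees in the tech industry have never "
    "experienced a downturn and are not used to dealing with one.",
]

for prompt in follow_ups:
    messages.append({"role": "user", "content": prompt})
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # illustrative; substitute any chat model
        messages=messages,
    )
    draft = response.choices[0].message.content
    # Keep each answer in the history so later prompts refine the same draft.
    messages.append({"role": "assistant", "content": draft})

print(draft)  # the final draft, ready for human editing and fact-checking
```

The key design point is keeping the assistant's replies in the conversation history: that is what lets each follow-up prompt build on the same evolving draft rather than start over from scratch.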
One word of warning, however: to learn, young writers need to struggle a little, whether by staring at a blank sheet of paper or by constructing a narrative. I am concerned that overuse of tools like ChatGPT could deny junior analysts the experience they need to learn to write and think for themselves.
The larger lesson here is that you can’t just ask ChatGPT to come up with a research note on its own. You must have an idea and a point of view and give ChatGPT something to work with. In other words, treat ChatGPT as a research assistant. You still need to be the analyst, and you need to make the work product your own.
I will be experimenting more with ChatGPT in the near future. Hopefully, improvements in the tool will mitigate the problems and risks.
Update Feb. 20, 2023: Jon Reed and I exchanged several long comments on this post. See them in the comments section on the original post.