Help! I'm getting poor results.

Good prompt engineering is essential to getting the best results from GPTcsv. A well-designed prompt can make the difference between results that feel magical and results that are inaccurate or misleading.

A prompt for closed-ended classification

Let's look at an example of a prompt that works well for evaluating the quality of a user research response contained in a column named "What would you change about Spotify?":

You are a user research expert. I need help rating the quality of a response to a user research question. I will provide a user's answer to the question: "If you could wave a magic wand and change one thing about Spotify what would it be?"

Rate the response to "What would you change about Spotify?" either "Good", "OK", or "Poor". A "Poor" response contains no useful information. An "OK" response mentions a specific feature. A "Good" response mentions a specific feature and provides an explanation.

For example, a "Good" response is:
"Better integration with Apple CarPlay. I would like to be able to more easily select songs from my playlist while I'm driving."

For example, a "Poor" response is: "Haven't used it enough yet to provide feedback" or "Nothing"

Respond only with "Good", "OK", or "Poor".

This prompt does a few things:

Establishes the role of the AI as an expert, which, strangely enough, helps the AI generate better responses.
Provides useful context to help the AI understand what you are trying to accomplish with the task.
Mentions the specific column by name: "What would you change about Spotify?"
Gives clear criteria for each of the valid outputs ("Good", "OK", and "Poor") and clear examples of a "Good" and a "Poor" response.
Gives clear instructions to respond with only one of the three labels.
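If you export the labelled results and want to check how closely the AI followed the "respond only with" instruction, a small script can flag any row whose output is not one of the three labels. The following is a minimal sketch in Python, not part of GPTcsv; the names parse_label and find_invalid are hypothetical, and it assumes you have the AI's outputs as a list of strings.

```python
# Hypothetical helpers for checking closed-ended outputs; not a GPTcsv API.
ALLOWED_LABELS = ("Good", "OK", "Poor")

def parse_label(model_output):
    """Return the matching label, or None if the output is not one of the three."""
    # Tolerate surrounding whitespace, quotes, and a trailing period.
    cleaned = model_output.strip().strip("\"'.").strip()
    for label in ALLOWED_LABELS:
        if cleaned.lower() == label.lower():
            return label
    return None

def find_invalid(outputs):
    """Return (row_index, raw_output) pairs that need a manual look."""
    return [(i, raw) for i, raw in enumerate(outputs) if parse_label(raw) is None]
```

If many rows are flagged, that usually suggests the final instruction in the prompt needs to be stated more firmly.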

A prompt for open-ended classification

The approach above works well for classification tasks where the set of valid outputs is fixed. Sometimes, we want to loosen the reins a bit:

You are a user research expert. I need help labelling user research responses for Spotify.

I will provide a user's answer to the question: "What change would you make to Spotify?"

Label the Survey Response with a 1 word topic in all lower case, for example, 'none', 'playlists', 'lyrics', 'performance', etc. Output 'none' if the Survey Response column is 'none', 'not sure', 'idk', 'don't know', '?', or any other unspecific response.

Again, we establish a clear role for the AI, which produces better results. Here, we give the AI some examples of valid outputs, but we also encourage it to generate new labels by ending our examples with "etc."

However, we don't want the AI to be too creative, lest we end up with too many different labels to be useful for our analysis. For example, the AI might generate "shuffling playlists", "adding new songs to playlists", and "sharing my playlists", which may be accurate but are too detailed. We just want to know whether the survey response has to do with playlists. For this reason, we specify that the output should be a "1 word topic", which gives the AI some room for creativity but also constrains it enough that the output is a useful classification.
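To see whether the AI stayed within the "1 word topic" constraint, you can tally the labels it produced. The sketch below is an illustration in Python rather than a GPTcsv feature; normalize_topic and count_topics are hypothetical names, and the list of "none"-like answers simply mirrors the one in the prompt.

```python
from collections import Counter

# Answers the prompt says should collapse to 'none' (plus the empty string).
NONE_LIKE = {"none", "not sure", "idk", "don't know", "?", ""}

def normalize_topic(model_output):
    """Lower-case the label; return None when it is more than one word."""
    cleaned = model_output.strip().strip("\"'.").strip().lower()
    if cleaned in NONE_LIKE:
        return "none"
    words = cleaned.split()
    if len(words) != 1:
        return None  # too detailed, e.g. "shuffling playlists"
    return words[0]

def count_topics(outputs):
    """Count single-word topics, skipping outputs that broke the constraint."""
    topics = (normalize_topic(raw) for raw in outputs)
    return Counter(t for t in topics if t is not None)
```

A long tail of labels that each appear only once is a hint that the prompt is giving the AI too much freedom.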

A prompt for content generation

Lastly, we may not want to constrain the AI at all. This is particularly useful for content generation. For example:

You are an expert travel agent tasked with helping travelers determine the best time to visit Destination. Output an overview of the best time to visit per the example below.

Example:
Destination: Barcelona, Spain
Output: Summer (June to August) and fall (September to November): Summer is fiesta time in Barcelona, when the city hosts some of Europe's biggest music festivals, including Sonar and Primavera Sound. Average temperatures in summer have a high of 82°F (28°C) and a low of 71°F (22°C).

While soaring temperatures send summer visitors to the beach, the cooler months of fall are ideal for exploring Barcelona's colorful neighborhoods. In November, the scent of roasting chestnuts fills the air during the Catalan festival of La Castanyada. Average temperatures in fall have a high of 68°F (20°C) and a low of 60°F (16°C).

In this case, our example encourages the AI to be detailed and creative. Instead of asking it to output just the months, we show it the level of detail we want by including a well-written example.
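The structure of this prompt (a role, one worked example, then the new input) can be sketched as a small template. The Python below is only an illustration of that structure under stated assumptions: build_travel_prompt is a hypothetical name, the function is not how GPTcsv assembles prompts internally, and "Lisbon, Portugal" in the usage note is a made-up input.

```python
def build_travel_prompt(destination, example_destination, example_output):
    """Assemble a one-example prompt that ends where the AI should start writing."""
    return (
        "You are an expert travel agent tasked with helping travelers "
        "determine the best time to visit Destination. Output an overview "
        "of the best time to visit per the example below.\n\n"
        "Example:\n"
        f"Destination: {example_destination}\n"
        f"Output: {example_output}\n\n"
        f"Destination: {destination}\n"
        "Output:"
    )
```

Calling build_travel_prompt("Lisbon, Portugal", "Barcelona, Spain", barcelona_text), where barcelona_text holds the example output above, yields a prompt whose last lines name the new destination, so the AI continues in the same style and level of detail as the example.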
