Expected Behavior
The code should properly process a DataFrame containing consecutive agent responses (double agent responses) without throwing a KeyError.
Current Behavior
When processing a DataFrame with consecutive agent responses, the code throws a KeyError: 'utterance_pair' during the evaluation process. The error occurs specifically in the run_detect_intent_queries method when trying to access the 'utterance_pair' column.
Possible Solution
The error suggests that the 'utterance_pair' column is missing from the DataFrame. Possible solutions include:
- Modify the code to handle cases where consecutive agent responses exist without requiring the utterance_pair column
- Ensure the 'utterance_pair' column is properly initialized in the DataFrame before processing
- Add a fallback mechanism when 'utterance_pair' is not present
Steps to Reproduce
- Create a DataFrame containing consecutive agent responses
- Call eval_results = evals.run_query_and_eval(sample_df)
- The error occurs during execution of run_detect_intent_queries()
- Specifically fails at: utterance_idx = int(row["utterance_pair"] or index)
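The failing access can be reproduced in isolation, without the evaluation library. The sketch below is only an illustration: the column names `action_type` and `action_input`, the sample rows, and the helper `lookup_utterance_indices` are hypothetical, and the loop merely mimics the quoted line rather than the real run_detect_intent_queries().

```python
import pandas as pd

# Hypothetical sample data: two agent responses in a row, no 'utterance_pair' column.
sample_df = pd.DataFrame(
    {
        "action_type": ["User Utterance", "Agent Response", "Agent Response"],
        "action_input": ["Hi", "Hello!", "How can I help?"],
    }
)


def lookup_utterance_indices(df):
    """Mimic the failing line from the report for every row of df."""
    indices = []
    for index, row in df.iterrows():
        # Raises KeyError when the column is absent, before 'or index' is evaluated.
        indices.append(int(row["utterance_pair"] or index))
    return indices


try:
    lookup_utterance_indices(sample_df)
    error_message = None
except KeyError as exc:
    error_message = str(exc)

# The message names the missing column.
print(error_message)
```

If this matches the real failure, the `or index` fallback in the original line never gets a chance to run, because the column lookup raises first.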
Context (Environment)
The issue occurs when trying to evaluate agent responses in a DataFrame. The evaluation process is meant to analyze the quality and similarity of responses but fails when encountering consecutive agent responses in the data structure.
Detailed Description
The error occurs in the DataFrame processing pipeline, specifically during the intent detection phase. The stack trace shows that the code attempts to access an 'utterance_pair' column that doesn't exist in the DataFrame. This happens when processing rows with consecutive agent responses: the code appears to expect one agent response per utterance (1:1) rather than two (2:1), which suggests that the data structure handling for such cases needs to be revised.
The error originates from the following line of code:
utterance_idx = int(row["utterance_pair"] or index)
which fails when the 'utterance_pair' column is not present in the DataFrame: the lookup raises the KeyError before the `or index` fallback is evaluated.
Possible Implementation
- Add a check for the existence of the 'utterance_pair' column:
  utterance_idx = int(row["utterance_pair"]) if "utterance_pair" in row else index
- Or ensure the column is properly initialized:
  def add_response_columns(self, df):
      if 'utterance_pair' not in df.columns:
          df['utterance_pair'] = df.index
      # ... rest of the column initialization
      return df
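The two options above can also be combined into a small helper that tolerates both a missing column and empty cells. This is a sketch under assumptions, not the library's code: `resolve_utterance_idx` is a hypothetical name, the sample columns are made up, and treating missing, NaN, or empty values as "fall back to the row index" is an assumed policy.

```python
import pandas as pd


def resolve_utterance_idx(row, index):
    """Return the row's 'utterance_pair' as int, or index when it is unusable."""
    # Series.get returns None instead of raising KeyError when the label is absent.
    value = row.get("utterance_pair")
    # Assumed policy: missing, NaN, and empty-string values fall back to the index.
    if value is None or pd.isna(value) or value == "":
        return int(index)
    return int(value)


# Hypothetical data with consecutive agent responses and no 'utterance_pair' column.
df = pd.DataFrame({"action_type": ["Agent Response", "Agent Response"]})
missing = [resolve_utterance_idx(row, i) for i, row in df.iterrows()]

# Same rows once the column exists but is only partly filled in.
df["utterance_pair"] = ["3", None]
mixed = [resolve_utterance_idx(row, i) for i, row in df.iterrows()]

print(missing, mixed)
```

Unlike the one-line check, this variant also avoids the ValueError that int() would raise on a NaN cell, at the cost of silently substituting the row index.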