[BUG] Double Agent Response DataFrame Processing Error #272

@gmchueh

Expected Behavior
The code should properly process a DataFrame containing consecutive agent responses (double agent responses) without throwing a KeyError.

Current Behavior
When processing a DataFrame with consecutive agent responses, the code throws a KeyError: 'utterance_pair' during the evaluation process. The error occurs specifically in the run_detect_intent_queries method when trying to access the 'utterance_pair' column.

Possible Solution
The error suggests that the 'utterance_pair' column is missing from the DataFrame. Possible solutions include:

  • Modify the code to handle cases where consecutive agent responses exist without requiring the utterance_pair column
  • Ensure the 'utterance_pair' column is properly initialized in the DataFrame before processing
  • Add a fallback mechanism when 'utterance_pair' is not present

Steps to Reproduce

  1. Create a DataFrame containing consecutive agent responses
  2. Call eval_results = evals.run_query_and_eval(sample_df)
  3. The error occurs during execution of run_detect_intent_queries()
  4. Specifically fails at: utterance_idx = int(row["utterance_pair"] or index)

Context (Environment)
The issue occurs when trying to evaluate agent responses in a DataFrame. The evaluation process is meant to analyze the quality and similarity of responses but fails when encountering consecutive agent responses in the data structure.

Detailed Description
The error occurs in the DataFrame processing pipeline, specifically during the intent detection phase. The stack trace shows that the code attempts to access an 'utterance_pair' column that doesn't exist in the DataFrame. This happens when processing rows with consecutive agent responses: the code appears to expect an utterance pair for each row (a 1:1 mapping of agent response to utterance, rather than 2:1), which suggests that the data structure handling for such cases needs to be revised.

The error originates from the following line of code:
    utterance_idx = int(row["utterance_pair"] or index)
which fails when the 'utterance_pair' column is not present in the DataFrame.
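
A minimal sketch of why the `or index` fallback does not help here, assuming pandas rows produced by `iterrows()`. The sample column names (`action_type`, `action_input`) and the helper `first_failure` are hypothetical and are not taken from the library. The subscript `row["utterance_pair"]` raises before `or` is ever evaluated, so the fallback only covers falsy values in a column that exists:

```python
import pandas as pd

# Hypothetical sample: two consecutive agent responses and no 'utterance_pair' column.
sample_df = pd.DataFrame(
    {
        "action_type": ["User Utterance", "Agent Response", "Agent Response"],
        "action_input": ["Hello", "Hi there!", "How can I help?"],
    }
)


def first_failure(df):
    """Return (index, missing_key) for the first row whose lookup raises KeyError, else None."""
    for index, row in df.iterrows():
        try:
            # Same expression as the failing line in the issue.
            int(row["utterance_pair"] or index)
        except KeyError as err:
            return index, err.args[0]
    return None


# Column missing: the lookup itself raises on the very first row.
missing_column_failure = first_failure(sample_df)

# Column present but falsy (0): `or index` kicks in and nothing raises.
falsy_column_failure = first_failure(sample_df.assign(utterance_pair=0))
```

In this sketch the failure does not depend on which row holds the second agent response; it depends only on whether the column exists at all.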

Possible Implementation

  • Add a check for the existence of 'utterance_pair' column:
    utterance_idx = int(row["utterance_pair"]) if "utterance_pair" in row else index

  • Or ensure the column is properly initialized:
    def add_response_columns(self, df):
        if 'utterance_pair' not in df.columns:
            df['utterance_pair'] = df.index

        # rest of the column initialization

        return df
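
The two options above could be sketched as standalone helpers, as follows. This is an illustration under assumptions, not the project's actual fix: `resolve_utterance_idx` and `ensure_utterance_pair` are hypothetical names, and the choice to treat NaN like a missing value is an assumption added here. The first helper keeps the original `or index` behavior for falsy values such as 0 or an empty string.

```python
import pandas as pd


def resolve_utterance_idx(row, index):
    """Return int(row['utterance_pair']) when usable, otherwise fall back to index."""
    # Series.get returns None instead of raising when the label is absent.
    value = row.get("utterance_pair")
    # Treat missing, NaN, and falsy values (0, "") as "use the row index".
    if value is None or pd.isna(value) or not value:
        return int(index)
    return int(value)


def ensure_utterance_pair(df):
    """Return a copy of df that is guaranteed to have an 'utterance_pair' column."""
    out = df.copy()
    if "utterance_pair" not in out.columns:
        # Default each row's pair id to its own index, as proposed in the issue.
        out["utterance_pair"] = out.index
    return out


# Hypothetical data: no 'utterance_pair' column at all.
df = pd.DataFrame({"action_input": ["Hello", "Hi there!", "How can I help?"]})
fallback_idxs = [resolve_utterance_idx(row, i) for i, row in df.iterrows()]

# Hypothetical data: column present, with one missing (NaN) entry.
df_partial = df.assign(utterance_pair=[1.0, float("nan"), 1.0])
partial_idxs = [resolve_utterance_idx(row, i) for i, row in df_partial.iterrows()]

initialized = ensure_utterance_pair(df)
```

Resolving the index per row leaves the DataFrame untouched, while initializing the column up front keeps the existing lookup line unchanged; which one fits better depends on how the rest of the pipeline uses 'utterance_pair'.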
