Vista previa

AWS AI Practitioner

A company is evaluating several large language models (LLMs) for a text summarization task. The company needs to select a metric to evaluate the quality of the summaries that the LLMs generate. Which metric will meet this requirement?

A Recall

B Area under the ROC curve (AUC)

C Recall-Oriented Understudy for Gisting Evaluation (ROUGE) ✓ Correcta

D Mean squared error (MSE)