AWS AI Practitioner
A data scientist is explaining the pre-training process for large language models (LLMs). Which description BEST represents what happens during pre-training?
A
The model weights are initialized with human feedback and fine-tuned on labeled data
B
The model weights are randomly initialized and then fitted to a large dataset through a language modeling objective
✓ Correct
C
The model is trained exclusively on task-specific labeled datasets
D
The model uses reinforcement learning to optimize for human preference
Explanation
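During pre-training, LLM weights are randomly initialized and then trained on large amounts of unlabeled text using a self-supervised language modeling objective, such as predicting the next token or reconstructing masked tokens. This lets the model pick up language patterns, factual knowledge, and some reasoning ability from the training corpus. Options A and D describe later alignment stages such as reinforcement learning from human feedback, and option C describes supervised fine-tuning, all of which typically start from an already pre-trained model.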
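The idea of "random initialization, then fitting to text with a language modeling objective" can be illustrated with a toy example. The sketch below is not how a real LLM is built: it assumes a character-level bigram model (one logit table instead of a transformer), a tiny made-up corpus, and plain full-batch gradient descent. The function names train_bigram_lm and most_likely_next are hypothetical and chosen only for this illustration.

```python
import math
import random


def train_bigram_lm(text, steps=200, lr=1.0, seed=0):
    """Fit a toy next-character model to `text`.

    Returns (vocab, W, losses), where W[i][j] is the logit for
    character j following character i, and losses holds the average
    next-token cross-entropy measured at each training step.
    """
    rng = random.Random(seed)
    vocab = sorted(set(text))
    index = {ch: i for i, ch in enumerate(vocab)}
    size = len(vocab)

    # Random initialization: the model starts knowing nothing about the corpus.
    W = [[rng.gauss(0.0, 0.1) for _ in range(size)] for _ in range(size)]

    # Self-supervised targets: each character's label is simply the next one.
    pairs = [(index[a], index[b]) for a, b in zip(text, text[1:])]
    n = len(pairs)

    losses = []
    for _ in range(steps):
        grad = [[0.0] * size for _ in range(size)]
        total = 0.0
        for x, y in pairs:
            row = W[x]
            top = max(row)  # subtract the max for a numerically stable softmax
            exps = [math.exp(v - top) for v in row]
            norm = sum(exps)
            probs = [e / norm for e in exps]
            total -= math.log(probs[y])  # cross-entropy for the true next char
            for j in range(size):
                grad[x][j] += probs[j] - (1.0 if j == y else 0.0)
        losses.append(total / n)
        # Gradient descent step on the averaged loss.
        for i in range(size):
            for j in range(size):
                W[i][j] -= lr * grad[i][j] / n
    return vocab, W, losses


def most_likely_next(vocab, W, ch):
    """Return the character the model rates most likely to follow `ch`."""
    row = W[vocab.index(ch)]
    return vocab[row.index(max(row))]


if __name__ == "__main__":
    corpus = "the cat sat on the mat. the cat ate."
    vocab, W, losses = train_bigram_lm(corpus)
    print("first loss:", losses[0], "last loss:", losses[-1])
    print("after 'h' the model predicts:", repr(most_likely_next(vocab, W, "h")))
```

No labels or human feedback appear anywhere in this sketch: the only training signal is the text itself, which is what separates pre-training from the fine-tuning and preference-optimization stages described in the other options.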