Summary: This study tested how different implementations of explaining and drawing activities affect learning from a multimedia science lesson. After studying a multimedia slideshow about the human respiratory system, college students (n = 145) were randomly assigned to one of four learning activity conditions: write explanations before drawing pictures (explain‐then‐draw group), draw pictures before writing explanations (draw‐then‐explain group), study provided explanations before drawing pictures (provided explanation‐then‐draw group), or study provided drawings before writing explanations (provided drawing‐then‐explain group). One week after the learning activity, all students completed post‐tests of their understanding. Results from the learning activity supported the scaffolding hypothesis: students generated better explanations when they used provided (rather than their own) drawings, and they generated higher‐quality drawings when they used provided (rather than their own) explanations. However, this advantage in learning activity performance did not translate into higher performance on the delayed post‐tests. We discuss implications for how best to sequence and scaffold generative activities.