Research intern at Microsoft Research working on spatial and mathematical reasoning for vision-language models.