The camera captures only images of the pan. Vision models analyze these images to track the recipe's progress and determine whether each step is complete. The camera does not record any audio.