Smart rooms are used for a growing number of practical applications. They are often equipped with microphones and cameras that allow acoustic and visual tracking of persons. For this, the geometry of the sensors must be calibrated. A method is introduced that calibrates the microphone arrays using the visual localization of a speaker at a small number of fixed positions. By matching these positions to the direction-of-arrival (DoA) estimates of the microphone arrays, the absolute position and orientation of each array are derived. Data from a reverberant smart room shows that the proposed method estimates the absolute geometry with a precision of about 0.1 m in position and 2 degrees in orientation. This calibration is accurate enough for acoustic and multimodal tracking applications and, by using the tracking data itself, eliminates the need for dedicated calibration measurements.
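The core geometric step described above, matching known speaker positions to DoA estimates to recover an array's pose, can be sketched as a small least-squares problem. The following is a minimal 2D illustration under stated assumptions, not the paper's actual implementation: it assumes the speaker positions (e.g. from visual tracking) are known in room coordinates, DoA angles are measured in the array's local frame, and a nonlinear least-squares solver refines position and orientation from an initial guess at the centroid of the speaker positions. All function and variable names are hypothetical.

```python
import math

import numpy as np
from scipy.optimize import least_squares


def wrap(a):
    """Wrap an angle (radians) into (-pi, pi]."""
    return (a + math.pi) % (2 * math.pi) - math.pi


def calibrate_array(speaker_positions, doas):
    """Estimate a microphone array's 2D position (x, y) and
    orientation phi from DoA angles observed while a speaker
    stood at known positions (bearings-only resection).

    speaker_positions: list of (x, y) room coordinates
    doas: list of DoA angles (radians) in the array's local frame
    """
    def residuals(params):
        x, y, phi = params
        # Angular mismatch between predicted and measured DoA
        return [
            wrap(math.atan2(py - y, px - x) - phi - theta)
            for (px, py), theta in zip(speaker_positions, doas)
        ]

    # Initialize the position at the centroid of the speaker positions
    px0, py0 = np.mean(speaker_positions, axis=0)
    sol = least_squares(residuals, x0=[px0, py0, 0.0])
    return sol.x  # (x, y, phi)


# Simulated check: array at (4.0, 2.5), rotated 30 degrees,
# speaker visually localized at four fixed positions.
true_x, true_y, true_phi = 4.0, 2.5, math.radians(30)
positions = [(1.0, 1.0), (2.0, 4.0), (6.0, 1.5), (5.5, 4.5)]
doas = [
    wrap(math.atan2(py - true_y, px - true_x) - true_phi)
    for px, py in positions
]

est_x, est_y, est_phi = calibrate_array(positions, doas)
```

In practice each DoA measurement is noisy and reverberation adds outliers, so many more observations (and a robust loss) would be used; the sketch only shows that a handful of matched position/DoA pairs suffices to make the array pose identifiable.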