Since 2006 Google Earth has included textured 3D building models for urban areas. Initially these were crowd-sourced from enthusiastic members of the user community, who modeled them by hand with the aid of SketchUp (sold to Trimble in 2012) or the simpler Google Building Maker (retired in 2013). As the video above shows, from 2012 onward Google has instead been using aerial imagery captured at a 45 degree angle and employing photogrammetry to automate the generation of 3D building and landscape models. In the following video from the Nat and Friends YouTube channel, Google employees help explain the process.
As the video explains, Google Earth’s digital representation of the world is created with the aid of different types of imagery. For the global view, 2D satellite imagery is captured from above and wrapped around Google Earth’s virtual globe. The 3D data that appears when users zoom in to the globe is captured via aircraft.
Each aircraft carries five cameras. One faces directly downward while the others are aimed to the front, back, left and right of the plane at a 45 degree angle. By flying in a stripe-like pattern and taking consecutive photos with multiple cameras, the aircraft captures each location it passes over from multiple directions. However, the need to obtain cloud-free images means that multiple flights are required, so the images captured for any single location may be taken days apart. The captured imagery is colour-corrected to account for the different lighting conditions, and for some areas finer details like cars are even removed from the images.
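The colour-correction step can be pictured as equalising exposure between overlapping photos taken on different days. The following is a deliberately simplified sketch — a per-channel gain match between two hypothetical overlapping patches — not a description of Google's actual pipeline:

```python
import numpy as np

def match_exposure(image, reference):
    """Toy colour correction: scale each channel of `image` so its
    per-channel mean matches that of `reference`. A drastically
    simplified stand-in for correcting photos captured under
    different lighting conditions."""
    gains = reference.mean(axis=(0, 1)) / image.mean(axis=(0, 1))
    return np.clip(image * gains, 0.0, 255.0)

# An under-exposed patch brought in line with its neighbour.
dark = np.full((4, 4, 3), 60.0)     # hypothetical overlap from flight A
bright = np.full((4, 4, 3), 120.0)  # same ground area from flight B
corrected = match_exposure(dark, bright)
```

Production systems use far more sophisticated radiometric models, but the principle — estimating a correction from overlapping regions and applying it across an image — is the same.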
The photogrammetry process employed by Google combines the different images of a location to generate a 3D geometric surface mesh. Computer vision techniques identify common features within the different images so that they can be aligned. A GPS receiver on the aircraft also records the position from which each photograph was taken, enabling the calculation of the distance between the camera on the plane and any given feature within the photograph. This facilitates the creation of depth maps, which can be stitched together using the common features identified earlier to form a combined geometric surface mesh. The process is completed by texturing the mesh with the original aerial imagery. For regularly shaped objects like buildings this can be done very accurately with the aid of edge-detection algorithms, which identify the edges of buildings in the imagery and help align them with the corresponding edges in the mesh. For organic structures this is more challenging.
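The depth calculation rests on simple geometry: once a feature has been matched across photos, each photo defines a ray from the GPS-recorded camera position towards that feature, and the feature must lie where those rays meet. The sketch below triangulates a point as the least-squares intersection of such rays; it is an illustrative reconstruction of the idea, not Google's implementation:

```python
import numpy as np

def triangulate(origins, directions):
    """Least-squares intersection of rays (camera origin + bearing).

    Each ray contributes the constraint that the point lies on it;
    with noisy data the system is solved in the least-squares sense
    by summing the projection matrices perpendicular to each ray.
    """
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for o, d in zip(origins, directions):
        d = d / np.linalg.norm(d)
        proj = np.eye(3) - np.outer(d, d)  # projects onto plane perpendicular to the ray
        A += proj
        b += proj @ o
    return np.linalg.solve(A, b)

# Hypothetical example: two cameras 1000 m up observe the same rooftop
# point at the origin, one from ahead of it and one from behind
# (roughly the nadir-plus-oblique geometry described above).
origins = [np.array([0.0, -1000.0, 1000.0]), np.array([0.0, 1000.0, 1000.0])]
directions = [-o for o in origins]  # bearings toward the point
point = triangulate(origins, directions)
depth = np.linalg.norm(point - origins[0])  # camera-to-feature distance
```

The recovered `depth` for each camera is exactly the camera-to-feature distance mentioned above; computing it per pixel yields a depth map.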
Google Earth includes imagery for many different levels of detail, or zoom. According to the video the number of images required is staggering, on the order of tens of millions. While the zoomed-out global view in Google Earth is only fully updated once every few years, the aerial imagery for particular urban areas may be updated in under a year. Gathered over time, this imagery enables users to observe changes, and this can be leveraged for analysis with the aid of Google’s Earth Engine.
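Those levels of detail form a tile pyramid in which each zoom level quarters the tiles of the level above it, so tile counts grow exponentially. The sketch below assumes the standard web-map quadtree with 256-pixel tiles; Google's internal pyramid parameters are not public:

```python
def tiles_at_zoom(z):
    """Number of tiles covering the globe at zoom level z in a
    quadtree pyramid: each level splits every tile into four."""
    return 4 ** z

def zoom_for_resolution(metres_per_pixel, tile_px=256):
    """Smallest zoom level whose ground resolution at the equator
    meets the target, assuming the globe spans one tile at zoom 0."""
    earth_circumference = 40_075_017  # metres, at the equator
    z = 0
    while earth_circumference / (tile_px * 2 ** z) > metres_per_pixel:
        z += 1
    return z
```

For instance, `zoom_for_resolution(0.15)` shows that serving 15 cm aerial imagery requires roughly zoom level 20 — which is why detailed urban coverage demands such an enormous number of images.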