One of the most important elements of Augmented Reality is the ability to seamlessly mesh 3D graphics with the real world. Current AR technology simply overlays graphics on top of video–even when tracking and recognizing objects like cards and markers. The AR SDK gives the position and orientation of the tracked object to a 3D engine which then renders geometry on top of the video frame coming from the device’s camera.
New technologies like Google’s Tango Tablet use Kinect-style depth cameras to store not only the color of each pixel, but the depth and position, too. (Well, sort of–the depth camera’s resolution is much lower than that of the color camera). This means that you can build a 3D model out of what the tablet’s camera sees as you move around an environment.
This feature has huge ramifications. Tango uses this data to do what is called “localization.” This means once an area is scanned, the tablet can compare the internal 3D model of the current environment it has stored with what the camera is currently seeing. When fused with compass and gyro data, the Tango tablet can compute its precise location inside the scanned area. This doesn’t take long either. Tango starts building the model immediately. Walk back to where you started using the tablet, and Tango knows where it is.
Usually this 3D data is stored as a point cloud. This is basically a 3D point for every position the 3D camera records. Hence, a sufficiently complicated area will look like a cloud of dots–a point cloud. You can see an example of the Tango building a point cloud with the Room Scanner Tango app.
These point clouds are important for not only localization, but AR graphical effects such as occluding rendered 3D objects with the real world. This is because a 3D mesh can be built out of these points which can be used for occlusion, collision and other features. Having objects in between you and the augmentation occlude the 3D render is essential to nailing the feeling that an AR object is really there.
Point clouds are awesome, but building them can be frustrating. Current point cloud scanners are bulky and slow, not to mention their accuracy issues can lead to jitter and other artifacts. Also, some depth cameras run at a frame rate low enough to make it hard to create point clouds without moving very slowly through an environment. Who wants to play a game where you have to walk around and meticulously scan a room before you can start?
In order for AR games and apps to succeed, devices need to effortlessly be able to sense and detect the 3D geometry of their surroundings. Yet, quick and instant generation of point clouds is far beyond the capabilities of current mobile sensor technology.
That’s where the public point cloud comes in.
A truly great Augmented Reality platform needs to upload point clouds generated by devices to the cloud. Then, when a user uses some hot new wearable AR glasses, it can pull down a pre-made point cloud for the current location off of a server and use that until the glasses can update it from its own sensors. The device will then upload a fresh point cloud which can be used to refine the version stored online.
You can kind of see this already–Google and Apple Maps’ 3D satellite mode use similar point cloud reconstruction techniques presumably from aerial photos and other sources. Whereas these 3D models often look like something you’d see on the original PlayStation, the public point cloud will have to be much more detailed. As sensors on mobile devices become more advanced, the crowdsourced point cloud data will become incredibly detailed.
A massive, publicly accessible point cloud is not just necessary for the next generation of AR wearable devices. But also for self driving cars, drone navigation, and robotics (which is indeed where many of these algorithms came from in the first place). Privacy implications do exist, but perhaps not more so than Google Maps’ street view, or other current technologies that give you very precise information about your location.
In the near future, almost every public place on the planet will be stored in the cloud as 3D reconstructed geometry–passively built up and constantly refined by sensors embedded on countless mobile and wearable devices, perhaps without the user even knowing.