A seminal work in SLAM is the 1986 research of R.C. Smith and P. Cheeseman on the representation and estimation of spatial uncertainty. Other pioneering work in this field was conducted in the early 1990s by the research group of Hugh F. Durrant-Whyte, which showed that solutions to SLAM exist in the infinite data limit. This finding motivates the search for algorithms that are computationally tractable and approximate the solution.

In recent years, visual SLAM\footnote{Simultaneous Localization and Mapping} has progressed rapidly and has been applied in many fields, such as VR, computer games, and 3D model construction. The core problem is to construct or update a map of an unknown environment while simultaneously keeping track of the location of one or more agents within it. In this setting, the agents are sensors that observe the environment and reconstruct it as a set of 3D points called a point cloud. Each time an agent senses the environment, the new measurements are compared with the current state of the point cloud: points that have not been seen before are created, and existing points may be updated.
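
The create-or-update step described above can be sketched as follows. This is only an illustrative sketch, not the method used in this project: it assumes the incoming points are already expressed in a common world frame (that is, the sensor pose is known), and the `PointMap` class, its voxel grid, and the `voxel_size` parameter are hypothetical names introduced for the example. Each voxel keeps the running mean of the points that fall into it.

```python
import math
from typing import Dict, Iterable, List, Tuple

Voxel = Tuple[int, int, int]
XYZ = Tuple[float, float, float]


class PointMap:
    """Hypothetical point-cloud map: one averaged point per voxel."""

    def __init__(self, voxel_size: float = 0.05) -> None:
        self.voxel_size = voxel_size
        # voxel index -> [mean_x, mean_y, mean_z, number_of_observations]
        self.cells: Dict[Voxel, List[float]] = {}

    def _key(self, p: XYZ) -> Voxel:
        # Index of the voxel that contains point p.
        s = self.voxel_size
        return (math.floor(p[0] / s), math.floor(p[1] / s), math.floor(p[2] / s))

    def integrate(self, points: Iterable[XYZ]) -> Tuple[int, int]:
        """Merge newly sensed points; return (created, updated) counts."""
        created = updated = 0
        for p in points:
            key = self._key(p)
            cell = self.cells.get(key)
            if cell is None:
                # Unseen region of space: create a new map point.
                self.cells[key] = [p[0], p[1], p[2], 1]
                created += 1
            else:
                # Known region: refine the stored point with a running mean.
                n = cell[3] + 1
                for i in range(3):
                    cell[i] += (p[i] - cell[i]) / n
                cell[3] = n
                updated += 1
        return created, updated
```

Calling `integrate` once per sensor frame returns how many map points were created and how many were refined, which mirrors the two cases in the paragraph above. A real SLAM system would also have to estimate the sensor pose before this step.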

One of the main requirements for such a system is the sensor. There are several approaches to 3D scanning, ranging from structured light scanners to CT scanners. However, most of these scanners are industrial- or clinical-grade instruments and are generally very expensive and bulky. Structured light scanners need calibration and are inherently expensive because they require a laser projector and a high-end camera to capture the images. The Kinect has traditionally been used for gesture recognition in gaming and computer graphics, and more recently for 3D scanning. It is more accessible and less expensive than these other tools, and it is better supported by the IT community.

As a state-of-the-art technology, computer vision has broad application in real life. Consider self-driving cars: the method can be used to reconstruct the environment around a vehicle as it travels along the road. Another application is building a map of a real location for a computer game; imagine a complete scan of the university used as a map in a shooter game such as Counter-Strike. Similarly, a place of interest can be captured and converted into a 3D model, which the public can then explore through either a VR interface or a traditional 2D interface. This allows users to explore locations that are inconvenient to travel to. Robotics is another application area, and a robot vacuum cleaner is a good example: by sensing the environment it cleans, it constructs a map, and any changes in the environment can be detected and the map updated on later runs.

The main challenges in this problem lie in mapping and modeling the data. The Kinect provides both depth and color data from an environment and has its own software development kit. The point clouds it produces can be used directly for measurement and visualization in architecture and construction. Other systems can also be used for this purpose; they will be discussed in future work.
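
To illustrate how depth and color data become a point cloud, the sketch below back-projects a depth image through a pinhole camera model. It is a minimal example under stated assumptions, not the Kinect SDK's own routine: the function name `depth_to_points` is hypothetical, the depth values are assumed to be in millimetres with zero marking a missing measurement, the color image is assumed to be already registered to the depth image, and the intrinsics `fx`, `fy`, `cx`, `cy` would have to come from calibration.

```python
from typing import List, Sequence, Tuple

RGB = Tuple[int, int, int]
ColoredPoint = Tuple[float, float, float, RGB]


def depth_to_points(
    depth_mm: Sequence[Sequence[int]],
    color: Sequence[Sequence[RGB]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[ColoredPoint]:
    """Back-project a depth image into colored 3D points (camera frame, metres)."""
    points: List[ColoredPoint] = []
    for v, row in enumerate(depth_mm):      # v: pixel row
        for u, d in enumerate(row):         # u: pixel column
            if d <= 0:
                continue                    # assumed: zero depth means no reading
            z = d / 1000.0                  # millimetres to metres
            x = (u - cx) * z / fx           # pinhole model, horizontal axis
            y = (v - cy) * z / fy           # pinhole model, vertical axis
            points.append((x, y, z, color[v][u]))
    return points


if __name__ == "__main__":
    # Tiny 2x2 frame with made-up values, for illustration only.
    depth = [[0, 1000], [2000, 0]]
    rgb = [[(0, 0, 0), (255, 0, 0)], [(0, 255, 0), (0, 0, 0)]]
    for point in depth_to_points(depth, rgb, fx=2.0, fy=2.0, cx=0.5, cy=0.5):
        print(point)
```

The points are expressed in the camera frame of a single capture; merging several captures into one cloud additionally requires aligning them.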


Here is a short demo of the work done during this project. The program captures RGB and depth data from the Kinect sensor and constructs a point cloud online. Some parts remain, such as loop closure detection and point cloud alignment, which will be added to the current program in the future.