Conceptually, a 3D Object Retrieval framework (see Fig. 1) tries to identify a template model (the query) among a large number of shapes. The technique emerged in response to the rapid growth in the number of 3D object representations available on the internet. It is used mainly for object identification and surface reconstruction.
For large object databases, the algorithm cannot offer real-time model extraction, but it compensates with precision and reliability. Each model involved in this process has to be described in a common frame, also known as a descriptor. As depicted in Fig. 1, the overall structure is coarsely divided into two sections: online and offline. The first sub-structure processes only variable information, such as template descriptor extraction or correspondence identification and validation, whereas the second sub-structure deals with static data, such as the database models or database descriptor extraction. Since the template is only a partial representation, the similarity between the 3D shape models can be evaluated only through partial matching. Finally, the database shape with the highest confidence (similarity) is used to represent or identify the query model.
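The offline/online split described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the framework's actual implementation: the names `build_offline_index`, `retrieve`, `describe` and the `max_dist` threshold are hypothetical, and the confidence score (the fraction of query descriptors that find a close database descriptor) is only one plausible choice for partial matching.

```python
import math

def l2(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_offline_index(database, describe):
    """Offline stage: compute the descriptors of every database model once."""
    return {name: describe(points) for name, points in database.items()}

def retrieve(query_points, index, describe, max_dist=0.5):
    """Online stage: describe the query, score each model, return the best one.

    The score is the fraction of query descriptors whose nearest database
    descriptor lies within `max_dist` (a hypothetical threshold), so a
    partial query can still reach a high score.
    """
    query_desc = describe(query_points)
    scores = {}
    for name, model_desc in index.items():
        hits = sum(
            1 for q in query_desc
            if min(l2(q, m) for m in model_desc) <= max_dist
        )
        scores[name] = hits / len(query_desc)
    best = max(scores, key=scores.get)
    return best, scores

# Toy stand-in for a real descriptor: every point is its own descriptor.
identity_descriptor = lambda points: [list(p) for p in points]

database = {
    "model_a": [[0.0, 0.0], [1.0, 0.0]],
    "model_b": [[5.0, 5.0], [6.0, 6.0]],
}
index = build_offline_index(database, identity_descriptor)
best, scores = retrieve([[0.1, 0.0]], index, identity_descriptor)
```

Because the index is built once, only the query description and the comparison run for each new query.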
Since a similarity cannot be determined directly by simply comparing two point distribution models, a method that embeds surrounding information in a common frame representation has to be used. A descriptor collects surrounding feature information (Euclidean distance, color, etc.) and embeds it in a local representation, which can then be used to identify similarities with other regions or models. The data is stored in a histogram that encodes the geometric information of the neighboring points. Several encoding principles can be mentioned: SHOT (Signature of Histograms of OrienTations), Spin Images, Point Feature Histograms, etc.
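To make the idea of a histogram-based local descriptor concrete, the sketch below bins the Euclidean distances from one point to its neighbors inside a support radius. It is a deliberately simplified toy descriptor and not SHOT, Spin Images or Point Feature Histograms; the function name `distance_histogram` and the parameters `radius` and `bins` are assumptions made for illustration.

```python
import math

def distance_histogram(points, center_idx, radius=1.0, bins=5):
    """Normalized histogram of distances from one point to its neighbors.

    Only neighbors closer than `radius` contribute; each distance falls
    into one of `bins` equal-width intervals covering [0, radius).
    """
    center = points[center_idx]
    hist = [0] * bins
    for i, p in enumerate(points):
        if i == center_idx:
            continue
        d = math.dist(center, p)
        if d < radius:
            # Clamp to the last bin to guard against rounding at the edge.
            hist[min(int(d / radius * bins), bins - 1)] += 1
    total = sum(hist)
    # Normalize so that regions with different point densities are comparable.
    return [h / total for h in hist] if total else [0.0] * bins

cloud = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.5, 0.0, 0.0),
         (0.9, 0.0, 0.0), (2.0, 0.0, 0.0)]
descriptor = distance_histogram(cloud, center_idx=0)
```

Computing one such histogram per point yields the set of local descriptors that the matching stage compares.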
The matching process tries to determine similar regions between two given models. Since one of the models (the query) can present large occluded areas (because of the sensing principle or the limited number of viewpoints), the matching process occurs at a local level. As a direct result, only local descriptor representations can be used. Similarity is identified by brute-force matching, comparing each point of the query independently with each point of the database models. Several similarity measures can be used, such as the L1-norm, L2-norm, Bhattacharyya distance, correlation coefficient, histogram intersection or chi-squared distance. Of these, the L2-norm is the most widely used. Fig. 2 plots a series of correspondences between two similar point distribution models.
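A brute-force matcher of this kind can be sketched in a few lines. The function names below are hypothetical, and the chi-squared form shown is one common variant for histograms (some definitions add a factor of 1/2); the example is meant only to illustrate the exhaustive comparison, not to reproduce the authors' code.

```python
import math

def l1_norm(a, b):
    """Sum of absolute differences between two histograms."""
    return sum(abs(x - y) for x, y in zip(a, b))

def l2_norm(a, b):
    """Euclidean distance between two histograms."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def chi_squared(a, b):
    """Chi-squared histogram distance; bins empty in both inputs are skipped."""
    return sum((x - y) ** 2 / (x + y) for x, y in zip(a, b) if x + y > 0)

def brute_force_match(query_desc, model_desc, distance=l2_norm):
    """Compare every query descriptor with every model descriptor.

    Returns one (query_index, model_index, distance) tuple per query
    descriptor, pointing at its nearest model descriptor.
    """
    matches = []
    for qi, q in enumerate(query_desc):
        dists = [distance(q, m) for m in model_desc]
        mi = min(range(len(dists)), key=dists.__getitem__)
        matches.append((qi, mi, dists[mi]))
    return matches

query = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
model = [[0.0, 0.0, 1.0], [0.9, 0.1, 0.0]]
matches = brute_force_match(query, model)
```

The cost grows with the product of the two descriptor counts, which is consistent with the earlier remark that retrieval over a large database is not real-time.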
In this stage, incorrect correspondences are eliminated. Because different regions can have similar histogram representations, multiple correspondences can be found for the same point. To avoid this, only points describing unique geometric regions are used to validate the similarity between two models. Consequently, flat or regular objects are difficult to match. A correspondence is considered valid only if the ratio between the closest similarity measure and the second closest similarity measure is above 0.7, as expressed in the next equation. This value ensures that only distinctive point correspondences are selected.
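The validation step can be illustrated with a nearest-neighbor ratio test. The sketch below rests on an assumption about the orientation of the ratio: it treats the similarity measure as a distance (smaller means more similar) and, as is common for such measures, keeps a correspondence when the closest distance divided by the second closest distance falls below the threshold. If the ratio is defined the other way round, or on a score where larger means more similar, the comparison flips. The name `ratio_test_matches` and the parameter `max_ratio` are hypothetical.

```python
import math

def ratio_test_matches(query_desc, model_desc, max_ratio=0.7):
    """Keep only distinctive correspondences between two descriptor sets.

    For each query descriptor, the two nearest model descriptors are found
    with the L2-norm. The match is accepted only when the nearest one is
    clearly closer than the runner-up (d1 / d2 < max_ratio).
    """
    accepted = []
    for qi, q in enumerate(query_desc):
        ranked = sorted((math.dist(q, m), mi) for mi, m in enumerate(model_desc))
        if len(ranked) < 2:
            continue  # The test needs at least two candidates.
        (d1, best), (d2, _) = ranked[0], ranked[1]
        if d2 > 0 and d1 / d2 < max_ratio:
            accepted.append((qi, best, d1))
    return accepted

model = [[0.0, 0.0], [1.0, 0.0], [1.0, 0.01]]
query = [[0.05, 0.0],    # close to exactly one model descriptor
         [1.0, 0.005]]   # equally close to two model descriptors
valid = ratio_test_matches(query, model)
```

In this toy input the second query descriptor is rejected because two model descriptors are almost equally close to it, which mirrors why flat or regular surfaces produce few valid correspondences.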
T.T. Cociaș, S.M. Grigorescu and F. Moldoveanu, "Multiple-Superquadrics based Object Surface Estimation for Grasping in Service Robotics," 13th International Conference on Optimization of Electrical and Electronic Equipment, Brasov, Romania, 24-26 May 2012, pp. 1471-1477.
T.T. Cociaș, S.M. Grigorescu and F. Moldoveanu, "3DOR based Global Pose Estimation for Service Robotics," Fifth Győr Symposium & First Hungarian-Polish Joint Conference On Computational Intelligence, Győr, Hungary, 2012.