A Comparison of 2D-3D Pose Estimation Methods
Author
Petersen, Thomas
Term
4th term
Publication year
2008
Pages
76
Abstract
Pose estimation is the process of determining the position and orientation of a camera or object from images. It is widely used in tracking and augmented reality, where both real-time speed and reliable accuracy matter. For example, software such as ARToolKit tracks a flat marker and draws 3D objects on top of it; speed is prioritized there so that the overlay appears stable to the eye, and small accuracy errors are acceptable. Although 2D–2D correspondences are also an active research area, this thesis focuses on 2D–3D correspondences. There is no common standard for comparing pose estimation methods, so the thesis conducts a fair, side-by-side evaluation of four approaches. Each method estimates the pose from known matches between 2D image points and 3D points in a point cloud (2D–3D correspondences). The tested methods are CPC, PosIt, PosIt for coplanar points, and DLT, a linear baseline chosen for its simplicity. The thesis examines practical limitations, such as the minimum number of point matches each method requires and its sensitivity to noise (random measurement errors), which can make results unpredictable. Synthetic data is used because it provides controlled ground truth and makes it possible to test extreme cases. The tests cover added noise, varying numbers of points, planarity issues (points lying on a single plane), distance to the object, and different initial guesses. The results show that the methods behave quite differently, so the choice of method depends on the intended application and the available data. The thesis lists the benefits and drawbacks of each method to support an informed choice.
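The abstract does not describe how the thesis generates its synthetic data, so the following is only a minimal sketch of what a synthetic 2D–3D correspondence with added noise could look like. It assumes an ideal pinhole camera with 3D points already expressed in camera coordinates (so the ground-truth pose is the identity), and the function names, focal length, principal point, and noise model are all hypothetical choices made for illustration.

```python
import random


def project_point(point_3d, focal_length=800.0, principal_point=(320.0, 240.0)):
    """Project a 3D point (camera coordinates) with an ideal pinhole model.

    focal_length and principal_point are illustrative values, not taken
    from the thesis.
    """
    x, y, z = point_3d
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    u = focal_length * x / z + principal_point[0]
    v = focal_length * y / z + principal_point[1]
    return (u, v)


def make_correspondences(num_points, noise_sigma, distance, seed=0):
    """Return a list of (noisy_2d_point, 3d_point) pairs.

    Points are drawn uniformly from a 2 x 2 x 2 cube centred `distance`
    units in front of the camera (assumes distance > 1). Gaussian noise
    with standard deviation `noise_sigma` (in pixels) is added to each
    image coordinate. This is a hypothetical generator.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(num_points):
        point_3d = (
            rng.uniform(-1.0, 1.0),
            rng.uniform(-1.0, 1.0),
            distance + rng.uniform(-1.0, 1.0),
        )
        u, v = project_point(point_3d)
        noisy_2d = (u + rng.gauss(0.0, noise_sigma), v + rng.gauss(0.0, noise_sigma))
        pairs.append((noisy_2d, point_3d))
    return pairs


if __name__ == "__main__":
    # Six correspondences, 1-pixel noise, object roughly 5 units away.
    for image_point, world_point in make_correspondences(6, 1.0, 5.0):
        print(image_point, world_point)
```

Varying `num_points`, `noise_sigma`, and `distance` in such a generator corresponds loosely to the test dimensions listed in the abstract; a pose estimator would then be scored against the known ground-truth pose.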
[This abstract was generated with the help of AI]
