6dpose.html

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
    <!-- Global site tag (gtag.js) - Google Analytics -->
    <script async src="https://www.googletagmanager.com/gtag/js?id=UA-61937787-1"></script>
    <script>
      window.dataLayer = window.dataLayer || [];
      function gtag(){dataLayer.push(arguments);}
      gtag('js', new Date());

      gtag('config', 'UA-61937787-1');
    </script>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <!-- TemplateBeginEditable name="doctitle" -->
    <title>Edge Enhanced Implicit Orientation Learning With Geometric Prior for 6D Pose Estimation</title>
    <!-- TemplateEndEditable -->
    <!-- TemplateBeginEditable name="head" -->
    <!-- TemplateEndEditable -->
    
    <style type="text/css">
        
        body {
            font: 100%/1.4 Verdana, Arial, Helvetica, sans-serif;
            background-color: #42413C;
            margin: 0;
            padding: 0;
            color: #000;
            background-attachment: fixed;
            font-size: 17px;
        }

        /* ~~ Element/tag selectors ~~ */
        ul, ol, dl { /* Due to variations between browsers, it's best practices to zero padding and margin on lists. For consistency, you can either specify the amounts you want here, or on the list items (LI, DT, DD) they contain. Remember that what you do here will cascade to the .nav list unless you write a more specific selector. */
            padding: 0;
            margin: 0;
        }

        h1, h2, h3, h4, h5, h6, p {
            margin-top: 0; /* removing the top margin gets around an issue where margins can escape from their containing div. The remaining bottom margin will hold it away from any elements that follow. */
            padding-right: 15px;
            padding-left: 15px; /* adding the padding to the sides of the elements within the divs, instead of the divs themselves, gets rid of any box model math. A nested div with side padding can also be used as an alternate method. */
            font-size: 130%;
            text-align: justify;
        }

        a img { /* this selector removes the default blue border displayed in some browsers around an image when it is surrounded by a link */
            border: none;
        }

        /* ~~ Styling for your site's links must remain in this order - including the group of selectors that create the hover effect. ~~ */
        a:link {
            color: #42413C; /* unless you style your links to look extremely unique, it's best to provide underlines for quick visual identification */
            font-family: Verdana, Arial, Helvetica, sans-serif;
        }

        a:visited {
            color: #6E6C64;
        }

        a:hover, a:active, a:focus { /* this group of selectors will give a keyboard navigator the same hover experience as the person using a mouse. */
            text-decoration: none;
        }

        /* ~~ this fixed width container surrounds all other divs~~ */
        .container {
            width: 1000px;
            background-color: #FFF;
            margin: 0 auto; /* the auto value on the sides, coupled with the width, centers the layout */
            overflow: hidden; /* this declaration makes the .container understand where the floated columns within ends and contain them */
        }

        /* ~~ These are the columns for the layout. ~~

        1) Padding is only placed on the top and/or bottom of the divs. The elements within these divs have padding on their sides. This saves you from any "box model math". Keep in mind, if you add any side padding or border to the div itself, it will be added to the width you define to create the *total* width. You may also choose to remove the padding on the element in the div and place a second div within it with no width and the padding necessary for your design.

        2) No margin has been given to the columns since they are all floated. If you must add margin, avoid placing it on the side you're floating toward (for example: a right margin on a div set to float right). Many times, padding can be used instead. For divs where this rule must be broken, you should add a "display:inline" declaration to the div's rule to tame a bug where some versions of Internet Explorer double the margin.

        3) Since classes can be used multiple times in a document (and an element can also have multiple classes applied), the columns have been assigned class names instead of IDs. For example, two sidebar divs could be stacked if necessary. These can very easily be changed to IDs if that's your preference, as long as you'll only be using them once per document.

        4) If you prefer your nav on the right instead of the left, simply float these columns the opposite direction (all right instead of all left) and they'll render in reverse order. There's no need to move the divs around in the HTML source.

        */
        .sidebar1 {
            float: left;
            width: 180px;
            height: 800px;
            background-color: #EADCAE;
            padding-bottom: 10px;
        }

        .content {
            padding: 10px 0px 10px 0px;
            width: 1000px;
            float: center;
            text-align: center;
        }

        .sidebar2 {
            float: left;
            width: 180px;
            background-color: #EADCAE;
            padding: 10px 0;
        }

        /* ~~ This grouped selector gives the lists in the .content area space ~~ */
        .content ul, .content ol {
            padding: 0 15px 15px 40px; /* this padding mirrors the right padding in the headings and paragraph rule above. Padding was placed on the bottom for space between other elements on the lists and on the left to create the indention. These may be adjusted as you wish. */
        }

        /* ~~ The navigation list styles (can be removed if you choose to use a premade flyout menu like Spry) ~~ */
        ul.nav {
            list-style: none; /* this removes the list marker */
            border-top: 1px solid #666; /* this creates the top border for the links - all others are placed using a bottom border on the LI */
            margin-bottom: 15px; /* this creates the space between the navigation on the content below */
        }

            ul.nav li {
                border-bottom: 1px solid #666; /* this creates the button separation */
            }

            ul.nav a, ul.nav a:visited { /* grouping these selectors makes sure that your links retain their button look even after being visited */
                padding: 5px 5px 5px 15px;
                display: block; /* this gives the link block properties causing it to fill the whole LI containing it. This causes the entire area to react to a mouse click. */
                width: 160px; /*this width makes the entire button clickable for IE6. If you don't need to support IE6, it can be removed. Calculate the proper width by subtracting the padding on this link from the width of your sidebar container. */
                text-decoration: none;
                background-color: #C6D580;
            }

                ul.nav a:hover, ul.nav a:active, ul.nav a:focus { /* this changes the background and text color for both mouse and keyboard navigators */
                    background-color: #ADB96E;
                    color: #FFF;
                }

        /* ~~ miscellaneous float/clear classes ~~ */
        .fltrt { /* this class can be used to float an element right in your page. The floated element must precede the element it should be next to on the page. */
            float: right;
            margin-left: 8px;
        }

        .fltlft { /* this class can be used to float an element left in your page. The floated element must precede the element it should be next to on the page. */
            float: left;
            margin-right: 8px;
        }

        .clearfloat { /* this class can be placed on a <br /> or empty div as the final element following the last floated div (within the #container) if the overflow:hidden on the .container is removed */
            clear: both;
            height: 0;
            font-size: 1px;
            line-height: 0px;
        }

        .container .content .content {
            font-size: 100%;
            text-align: center;
        }
        
    </style>
    
</head>

<body>
    <div class="container">

        <div class="content">
            <table width="80%" align="center" cellspacing="10">
                <tr>
                    <td align="center" style="font-size:120%"><strong>Edge Enhanced Implicit Orientation Learning With Geometric Prior for 6D Pose Estimation</strong></td>
                </tr>
                <tr>
                    <td align="center" style="font-size:90%"><a href="" style="text-decoration:none;">Yilin Wen</a><font size="2"><sup>1</sup></font>, <a href="http://haopan.github.io/" style="text-decoration:none;">Hao Pan</a><font size="2"><sup>2</sup></font>, <a href="https://scholar.google.co.in/citations?user=JAzFhXQAAAAJ&hl=vi" style="text-decoration:none;">Lei Yang</a><font size="2"><sup>1</sup></font>, <a href="http://i.cs.hku.hk/~wenping/" style="text-decoration:none;">Wenping Wang</a><font size="2"><sup>1</sup></font></td>
                </tr>
                <tr>
                    <td align="center" style="font-size:90%"><font size="2"><sup>1</sup></font>The University of Hong Kong, <font size="2"><sup>2</sup></font>Microsoft Research Asia</td>
                </tr>
                <tr>
                    <td align="center" style="font-size:90%">IEEE Robotics and Automation Letters &amp; IROS 2020</td>
                </tr>
                <tr>
                    <td align="center" style="font-size:100%">&nbsp;</td>
                </tr>
                <tr>
                    <td align="left" style="font-size:100%"><strong>Abstract</strong></td>
                </tr>
                <tr>
                    <td align="justify" style="font-size:75%">
                    Estimating 6D poses of rigid objects from RGB images is an important but challenging task. This is especially true for textureless objects with strong symmetry, since they have only sparse visual features to be leveraged for the task and their symmetry leads to pose ambiguity. The implicit encoding of orientations learned by autoencoders has demonstrated its effectiveness in handling such objects without requiring explicit pose labeling. In this paper, we further improve this methodology with two key technical contributions. First, we use edge cues to complement the color images with more discriminative features and reduce the domain gap between the real images for testing and the synthetic ones for training. Second, we enhance the regularity of the implicitly learned pose representations by a self-supervision scheme to enforce the geometric prior that the latent representations of two images presenting nearby rotations should be close too. Our approach achieves the state-of-the-art performance on the T-LESS benchmark in the RGB domain; its evaluation on the LINEMOD dataset also outperforms other synthetically trained approaches. Extensive ablation tests demonstrate the improvements enabled by our technical designs. Our code is publicly available for research use.
                    </td>
                </tr>
                <!--<tr>
                    <td align="left" style="font-size:110%"><strong>Paper and data</strong></td>
                </tr>-->
                <tr align="left">
                    <td>
                        <table>
                            <tr>
                                <td><a href="papers/6dpose.pdf"><img src="images/6dpose_paper_thumbnail.jpg" width="190" /></a></td>
                                <td>&nbsp;</td>
                                <td style="font-size:75%">
                                    <strong>Preprint</strong> [<a href="papers/6dpose.pdf">PDF</a>] <br/><br />
                                    <strong>Code and data</strong> [<a href="https://github.com/fylwen/EEGP-AAE">Link</a>] <br /><br />
                                    <strong>Supplemental video</strong> [<a href="data/6dpose_supplemental.zip">ZIP</a>] <br /><br />
                                    <strong>Citation</strong><br/>
                                    Y. Wen, H. Pan, L. Yang and W. Wang, "Edge Enhanced Implicit Orientation Learning With Geometric Prior for 6D Pose Estimation," in IEEE Robotics and Automation Letters, vol. 5, no. 3, pp. 4931-4938, July 2020, doi: 10.1109/LRA.2020.3005121.<br/>
                                    <a href="6dpose_bibtex.bib">(bibtex)</a>
                                </td>
                            </tr>
                        </table>
                    </td>
                </tr>
                <tr>
                    <td align="center">&nbsp;</td>
                </tr>
                <tr>
                    <td align="left" style="font-size:100%"><strong>Algorithm overview</strong></td>
                </tr>
                <tr>
                    <td align="center">&nbsp;</td>
                </tr>
                <tr>
                    <td><img src="images/6dpose_training_pipeline.jpg" width="100%" /></td>
                </tr>
                <tr>
                    <td align="justify" style="font-size:75%">
                        <b>Training stage pipeline.</b> Given a pair of the augmented color image and its edge map, the encoder maps the concatenated image pair to a code in the latent space. The code is compared against a set of reference codes to impose the geometry prior of the rotation space. Meanwhile, the code is passed through the color and edge decoders to reconstruct the canonical color image (lower branch) and edge map (upper branch), respectively. The reconstruction loss and the geometric prior loss together help the autoencoder to learn an implicit orientation encoding that is more aware of the discriminative edge cues and closer to the rotation space geometry.
                    </td>
                </tr>
                <tr>
                    <td align="center">&nbsp;</td>
                </tr>
                <tr>
                    <td><img src="images/6dpose_test_pipeline.jpg" width="100%" /></td>
                </tr>
                <tr>
                    <td align="justify" style="font-size:75%">
                        <b>Test stage pipeline.</b> Given a query RGB image and the object of interest as input, the object in the image is detected by a 2D detector. The detected object region is cropped and resized to feed to the trained encoder. Using the offline generated codebook consisting of encodings of the representative rotations, we estimate the rotation of the query instance via nearest code retrieval, and infer the translation based on the bounding box scale ratio between the retrieved pose template and the detected 2D bounding box.
                    </td>
                </tr>
                <tr>
                    <td align="center">&nbsp;</td>
                </tr>
                
                <tr> <td align="center">&nbsp;</td> </tr>
<!--            <tr>
                    <td align="left"><b>Video</b><br/><br />
                        <iframe width="640" height="360" src="" frameborder="0" allowfullscreen=allowfullscreen></iframe>
                    </td>
                </tr> -->
                <tr>
                    <td align="center">&nbsp;</td>
                </tr>
                <tr>
                    <td align="center">&nbsp;</td>
                </tr>
                <tr>
                    <td align="center" style="font-size:80%">&copy;Hao Pan. Last update: Aug 6, 2020.</td>
                </tr>
            </table>
        </div>
        <!-- end .container -->
    </div>
</body>
</html>