please use █ instead of ■ when converting {aligned} into docx #209

Open
ZhuangQu opened this issue Mar 13, 2023 · 6 comments

ZhuangQu commented Mar 13, 2023

I use pandoc 3.1.1 on Windows 11. When converting

\begin{equation*}
    \begin{aligned}
        1= & 2 &  & 3 \\
        =  & 4 &  & 5 \\
    \end{aligned}
\end{equation*}

from LaTeX into docx, we get

■(1=&2&&3@=&4&&5) 

in Word. We can see that {aligned} is converted to ■, which is wrong. The correct output is █.
In UnicodeMath, ■ (U+25A0) represents a matrix, while █ (U+2588) represents an aligned structure.
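
To make the difference concrete, here is a minimal Python sketch that builds the UnicodeMath linear-format string for a grid of cells. The function name `to_unicodemath` and the constant names are hypothetical, chosen only for illustration; this is not how pandoc or texmath produce docx output.

```python
MATRIX = "\u25A0"    # ■, the UnicodeMath matrix operator
EQ_ARRAY = "\u2588"  # █, the UnicodeMath aligned-structure operator

def to_unicodemath(rows, operator):
    """Join cells with '&' and rows with '@', wrapped as operator(...)."""
    body = "@".join("&".join(cells) for cells in rows)
    return f"{operator}({body})"

# Cells of the {aligned} example; the empty string is the empty cell.
rows = [["1=", "2", "", "3"], ["=", "4", "", "5"]]

print(to_unicodemath(rows, MATRIX))    # ■(1=&2&&3@=&4&&5)
print(to_unicodemath(rows, EQ_ARRAY))  # █(1=&2&&3@=&4&&5)
```

The two strings differ only in the leading operator character, which is the whole point of this report.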

@ZhuangQu ZhuangQu added the bug label Mar 13, 2023
@ZhuangQu ZhuangQu changed the title use █ instead of ■ when converting {aligned} into docx please use █ instead of ■ when converting {aligned} into docx Mar 13, 2023
@jgm jgm transferred this issue from jgm/pandoc Mar 13, 2023
Owner

jgm commented Mar 13, 2023

Transferring to jgm/texmath which does our math conversion.

Note: we don't use UnicodeMath; we use Word's XML representation of math.
The above aligned environment is translated as

<m:oMathPara>
  <m:oMathParaPr>
    <m:jc m:val="center" />
  </m:oMathParaPr>
  <m:oMath>
    <m:m>
      <m:mPr>
        <m:baseJc m:val="center" />
        <m:plcHide m:val="1" />
        <m:mcs>
          <m:mc>
            <m:mcPr>
              <m:mcJc m:val="right" />
              <m:count m:val="1" />
            </m:mcPr>
          </m:mc>
          <m:mc>
            <m:mcPr>
              <m:mcJc m:val="left" />
              <m:count m:val="1" />
            </m:mcPr>
          </m:mc>
          <m:mc>
            <m:mcPr>
              <m:mcJc m:val="right" />
              <m:count m:val="1" />
            </m:mcPr>
          </m:mc>
          <m:mc>
            <m:mcPr>
              <m:mcJc m:val="left" />
              <m:count m:val="1" />
            </m:mcPr>
          </m:mc>
        </m:mcs>
      </m:mPr>
      <m:mr>
        <m:e>
          <m:r>
            <m:t>1</m:t>
          </m:r>
          <m:r>
            <m:rPr>
              <m:sty m:val="p" />
            </m:rPr>
            <m:t>=</m:t>
          </m:r>
        </m:e>
        <m:e>
          <m:r>
            <m:t>2</m:t>
          </m:r>
        </m:e>
        <m:e />
        <m:e>
          <m:r>
            <m:t>3</m:t>
          </m:r>
        </m:e>
      </m:mr>
      <m:mr>
        <m:e>
          <m:r>
            <m:rPr>
              <m:sty m:val="p" />
            </m:rPr>
            <m:t>=</m:t>
          </m:r>
        </m:e>
        <m:e>
          <m:r>
            <m:t>4</m:t>
          </m:r>
        </m:e>
        <m:e />
        <m:e>
          <m:r>
            <m:t>5</m:t>
          </m:r>
        </m:e>
      </m:mr>
    </m:m>
  </m:oMath>
</m:oMathPara>

Please suggest more appropriate OMML.

Author

ZhuangQu commented Mar 17, 2023

Sorry, I don't know what OMML is.
I only know that █ is correct and ■ is wrong.
Maybe you can convert UnicodeMath to OMML to get more appropriate OMML.

Owner

jgm commented Mar 17, 2023

Experimenting with Word: using U+25A0, I get
[Screenshot: Word's rendering of the U+25A0 input]

and XML

      <m:oMathPara>
        <m:oMathParaPr>
          <m:jc m:val="center" />
        </m:oMathParaPr>
        <m:oMath>
          <m:m>
            <m:mPr>
              <m:plcHide m:val="1" />
              <m:mcs>
                <m:mc>
                  <m:mcPr>
                    <m:count m:val="1" />
                    <m:mcJc m:val="right" />
                  </m:mcPr>
                </m:mc>
                <m:mc>
                  <m:mcPr>
                    <m:count m:val="1" />
                    <m:mcJc m:val="left" />
                  </m:mcPr>
                </m:mc>
                <m:mc>
                  <m:mcPr>
                    <m:count m:val="1" />
                    <m:mcJc m:val="right" />
                  </m:mcPr>
                </m:mc>
                <m:mc>
                  <m:mcPr>
                    <m:count m:val="1" />
                    <m:mcJc m:val="left" />
                  </m:mcPr>
                </m:mc>
              </m:mcs>
              <m:ctrlPr>
                <w:rPr>
                  <w:rFonts w:ascii="Cambria Math"
                  w:hAnsi="Cambria Math" />
                </w:rPr>
              </m:ctrlPr>
            </m:mPr>
            <m:mr>
              <m:e>
                <m:r>
                  <w:rPr>
                    <w:rFonts w:ascii="Cambria Math"
                    w:hAnsi="Cambria Math" />
                  </w:rPr>
                  <m:t>1</m:t>
                </m:r>
                <m:r>
                  <m:rPr>
                    <m:sty m:val="p" />
                  </m:rPr>
                  <w:rPr>
                    <w:rFonts w:ascii="Cambria Math"
                    w:hAnsi="Cambria Math" />
                  </w:rPr>
                  <m:t>=</m:t>
                </m:r>
              </m:e>
              <m:e>
                <m:r>
                  <w:rPr>
                    <w:rFonts w:ascii="Cambria Math"
                    w:hAnsi="Cambria Math" />
                  </w:rPr>
                  <m:t>2</m:t>
                </m:r>
              </m:e>
              <m:e />
              <m:e>
                <m:r>
                  <w:rPr>
                    <w:rFonts w:ascii="Cambria Math"
                    w:hAnsi="Cambria Math" />
                  </w:rPr>
                  <m:t>3</m:t>
                </m:r>
              </m:e>
            </m:mr>
            <m:mr>
              <m:e>
                <m:r>
                  <m:rPr>
                    <m:sty m:val="p" />
                  </m:rPr>
                  <w:rPr>
                    <w:rFonts w:ascii="Cambria Math"
                    w:hAnsi="Cambria Math" />
                  </w:rPr>
                  <m:t>=</m:t>
                </m:r>
              </m:e>
              <m:e>
                <m:r>
                  <w:rPr>
                    <w:rFonts w:ascii="Cambria Math"
                    w:hAnsi="Cambria Math" />
                  </w:rPr>
                  <m:t>4</m:t>
                </m:r>
              </m:e>
              <m:e />
              <m:e>
                <m:r>
                  <w:rPr>
                    <w:rFonts w:ascii="Cambria Math"
                    w:hAnsi="Cambria Math" />
                  </w:rPr>
                  <m:t>5</m:t>
                </m:r>
              </m:e>
            </m:mr>
          </m:m>
        </m:oMath>
      </m:oMathPara>

while with U+2588, I get
[Screenshot: Word's rendering of the U+2588 input]

and XML

      <m:oMathPara>
        <m:oMath>
          <m:eqArr>
            <m:eqArrPr>
              <m:ctrlPr>
                <w:rPr>
                  <w:rFonts w:ascii="Cambria Math"
                  w:hAnsi="Cambria Math" w:cs="Arial" />
                  <w:color w:val="24292F" />
                  <w:sz w:val="21" />
                  <w:szCs w:val="21" />
                  <w:shd w:val="clear" w:color="auto"
                  w:fill="FFFFFF" />
                </w:rPr>
              </m:ctrlPr>
            </m:eqArrPr>
            <m:e>
              <m:r>
                <w:rPr>
                  <w:rFonts w:ascii="Cambria Math"
                  w:hAnsi="Cambria Math" />
                </w:rPr>
                <m:t>1</m:t>
              </m:r>
              <m:r>
                <m:rPr>
                  <m:sty m:val="p" />
                </m:rPr>
                <w:rPr>
                  <w:rFonts w:ascii="Cambria Math"
                  w:hAnsi="Cambria Math" />
                </w:rPr>
                <m:t>=</m:t>
              </m:r>
              <m:r>
                <m:rPr>
                  <m:sty m:val="p" />
                </m:rPr>
                <w:rPr>
                  <w:rFonts w:ascii="Cambria Math"
                  w:hAnsi="Cambria Math" />
                </w:rPr>
                <m:t>&amp;</m:t>
              </m:r>
              <m:r>
                <w:rPr>
                  <w:rFonts w:ascii="Cambria Math"
                  w:hAnsi="Cambria Math" />
                </w:rPr>
                <m:t>2</m:t>
              </m:r>
              <m:r>
                <m:rPr>
                  <m:sty m:val="p" />
                </m:rPr>
                <w:rPr>
                  <w:rFonts w:ascii="Cambria Math"
                  w:hAnsi="Cambria Math" />
                </w:rPr>
                <m:t>&amp;</m:t>
              </m:r>
              <m:r>
                <m:rPr>
                  <m:sty m:val="p" />
                </m:rPr>
                <w:rPr>
                  <w:rFonts w:ascii="Cambria Math"
                  w:hAnsi="Cambria Math" />
                </w:rPr>
                <m:t>&amp;</m:t>
              </m:r>
              <m:r>
                <w:rPr>
                  <w:rFonts w:ascii="Cambria Math"
                  w:hAnsi="Cambria Math" />
                </w:rPr>
                <m:t>3</m:t>
              </m:r>
              <m:ctrlPr>
                <w:rPr>
                  <w:rFonts w:ascii="Cambria Math"
                  w:hAnsi="Cambria Math" />
                </w:rPr>
              </m:ctrlPr>
            </m:e>
            <m:e>
              <m:r>
                <m:rPr>
                  <m:sty m:val="p" />
                </m:rPr>
                <w:rPr>
                  <w:rFonts w:ascii="Cambria Math"
                  w:hAnsi="Cambria Math" />
                </w:rPr>
                <m:t>=&amp;</m:t>
              </m:r>
              <m:r>
                <w:rPr>
                  <w:rFonts w:ascii="Cambria Math"
                  w:hAnsi="Cambria Math" />
                </w:rPr>
                <m:t>4</m:t>
              </m:r>
              <m:r>
                <m:rPr>
                  <m:sty m:val="p" />
                </m:rPr>
                <w:rPr>
                  <w:rFonts w:ascii="Cambria Math"
                  w:hAnsi="Cambria Math" />
                </w:rPr>
                <m:t>&amp;&amp;</m:t>
              </m:r>
              <m:r>
                <w:rPr>
                  <w:rFonts w:ascii="Cambria Math"
                  w:hAnsi="Cambria Math" />
                </w:rPr>
                <m:t>5</m:t>
              </m:r>
              <m:ctrlPr>
                <w:rPr>
                  <w:rFonts w:ascii="Cambria Math"
                  w:hAnsi="Cambria Math" />
                </w:rPr>
              </m:ctrlPr>
            </m:e>
          </m:eqArr>
        </m:oMath>
      </m:oMathPara>

The first (current behavior) is actually closer in appearance to what pdflatex gives us, which is
[Screenshot: pdflatex rendering]
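
For reference, the skeleton of the second XML (an `m:eqArr` with one `m:e` per row and the `&` characters kept as literal text) can be sketched in Python with the standard library. This is only an illustration: the helper names `m` and `eq_array` are hypothetical, the font and style run properties shown above are omitted, and putting each row into a single run is a simplification that has not been checked against Word.

```python
import xml.etree.ElementTree as ET

# Namespace URI of Office Math Markup Language (the "m:" prefix).
M_NS = "http://schemas.openxmlformats.org/officeDocument/2006/math"
ET.register_namespace("m", M_NS)

def m(tag):
    """Return the namespace-qualified name for an m: element."""
    return f"{{{M_NS}}}{tag}"

def eq_array(rows):
    """Build a bare <m:eqArr> holding one <m:e> per row string."""
    eq_arr = ET.Element(m("eqArr"))
    for row in rows:
        e = ET.SubElement(eq_arr, m("e"))
        r = ET.SubElement(e, m("r"))
        t = ET.SubElement(r, m("t"))
        t.text = row  # '&' is escaped as &amp; on serialization
    return eq_arr

xml_text = ET.tostring(eq_array(["1=&2&&3", "=&4&&5"]), encoding="unicode")
print(xml_text)
```

Unlike the `m:m` form, there is no column list (`m:mcs`) here; the alignment information lives entirely in the `&` characters inside each row.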

Author

ZhuangQu commented Mar 17, 2023

No, the second is closer!
Your 2 and 3 are crowded together because there are no spaces added. Please try:

█(1=&2&  &3@=&4&  &5) 

I maintain that ■ corresponds to {matrix} and █ corresponds to {aligned}, because of what & means in each.
With both ■ in docx and {matrix} in LaTeX, every & separates columns.
With both █ in docx and {aligned} in LaTeX, each odd-numbered & is an aligning-point and each even-numbered & is a padding-point.
Do you notice that in your first case the space between 2 and 3 is too wide?
That is because the 2nd & is treated as starting a new, empty column, not as an aligning-point.
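
The odd/even rule described above can be sketched in a few lines of Python. The function name `classify_ampersands` is hypothetical, and the sketch ignores escaped ampersands (`\&`) for simplicity.

```python
def classify_ampersands(row):
    """Label each '&' in an aligned row by its 1-based position:
    odd-numbered -> "align" (aligning-point),
    even-numbered -> "pad" (padding-point)."""
    labels = []
    count = 0
    for ch in row:
        if ch == "&":
            count += 1
            labels.append("align" if count % 2 == 1 else "pad")
    return labels

print(classify_ampersands("1=&2&&3"))  # ['align', 'pad', 'align']
```

In a matrix, by contrast, all three ampersands in this row would simply be column separators, which is what produces the empty third column.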

Author

ZhuangQu commented Mar 17, 2023

I understand that format conversion is not always perfect or exact. If the cost of the change is too high, please close this issue.

Owner

jgm commented Mar 17, 2023

I'll keep this open. It would not be a small change, because currently we don't have an AST element for aligned environments that is separate from that for matrices -- we use the same form for both. That's not ideal.
