
java - How to convert some limited cases of *.tex files to plain text *.txt

Question:

I've tried using tokenizers, but I can only figure out how to replace or remove single delimiters in Java.

For example, given this input:

\box { Boxed words }

{\boldface This line in bold. }

I want to be able to remove \box, and I also have to follow some other guidelines, which are:

The rules that we are going to apply are very simple.

  1. Remove all commands: a backslash followed by one or more lowercase letters and terminated with a blank.
  2. Remove all braces: } or {.
  3. Substitute every math display (the characters between $ signs) with the words FORMULA 1, FORMULA 2, etc.
  4. The environment (a special command)

    \begin{enumerate}

    \item First item, \fer and only this.

    \item Second line \iterate and maybe more. \item Third.

    ...

    \end{enumerate}

    puts everything between \item commands in a new paragraph with a number. So the above should look like:

    1. First item and only this.
    2. Second line and maybe more.
    3. Third.

Answer:

The sensible way (IMO) is to use a stand-alone TeX-to-text (or TeX-to-HTML) converter. That should:

  • Save you a lot of work in implementing your own converter.
  • Do a better job ... assuming you pick a decent converter.
  • Insulate you from having to deal with a stream of special cases where your heuristic / pattern-based approach fails.
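
If a hand-rolled converter is still required (for example because the four rules in the question are the whole assignment), the rules map fairly directly onto java.util.regex. What follows is only a minimal sketch under assumptions that are not in the question: the names TexToText, convert, numberFormulas and numberItems are made up for illustration, formulas are assumed to be delimited by $...$ or $$...$$ with no escaped \$ inside, and enumerate environments are assumed not to be nested. Formulas are replaced first so that commands and braces inside them are swallowed with the formula rather than processed by the later rules.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; all names here are illustrative assumptions.
final class TexToText {
    // Rule 3: one or more '$', some non-'$' characters, one or more '$'.
    private static final Pattern MATH = Pattern.compile("\\$+[^$]+\\$+");
    // Rule 4: a (non-nested) enumerate environment; group 1 is its body.
    private static final Pattern ENUMERATE = Pattern.compile(
            "\\\\begin\\{enumerate\\}(.*?)\\\\end\\{enumerate\\}", Pattern.DOTALL);
    // "\item" not followed by another lowercase letter (so "\itemize" is not split on).
    private static final Pattern ITEM = Pattern.compile("\\\\item(?![a-z])");
    // Rule 1: backslash, one or more lowercase letters, then a single blank.
    private static final Pattern COMMAND = Pattern.compile("\\\\[a-z]+ ");
    // Rule 2: any opening or closing brace.
    private static final Pattern BRACE = Pattern.compile("[{}]");

    // Applies the rules in the order 3, 4, 1, 2.
    static String convert(String tex) {
        String text = numberFormulas(tex);
        text = numberItems(text);
        text = COMMAND.matcher(text).replaceAll("");
        return BRACE.matcher(text).replaceAll("");
    }

    // Replaces each formula with "FORMULA n", counting from 1.
    static String numberFormulas(String text) {
        Matcher m = MATH.matcher(text);
        StringBuilder out = new StringBuilder();
        int last = 0;
        int count = 0;
        while (m.find()) {
            out.append(text, last, m.start()).append("FORMULA ").append(++count);
            last = m.end();
        }
        return out.append(text, last, text.length()).toString();
    }

    // Turns each \item of an enumerate body into its own numbered paragraph.
    static String numberItems(String text) {
        Matcher m = ENUMERATE.matcher(text);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(text, last, m.start());
            int number = 0; // numbering restarts for every environment
            for (String piece : ITEM.split(m.group(1))) {
                // Collapse line breaks and runs of blanks inside one item.
                String item = piece.trim().replaceAll("\\s+", " ");
                if (!item.isEmpty()) { // skips the whitespace before the first \item
                    out.append("\n\n").append(++number).append(". ").append(item);
                }
            }
            out.append("\n\n");
            last = m.end();
        }
        return out.append(text, last, text.length()).toString();
    }

    public static void main(String[] args) {
        String sample = "\\box { Boxed words } costs $x^2$.\n"
                + "\\begin{enumerate}\n"
                + "\\item First item, \\fer and only this.\n"
                + "\\item Third.\n"
                + "\\end{enumerate}\n";
        System.out.println(convert(sample));
    }
}
```

Note that rule 1 as stated only removes commands that are followed by a blank, so something like \foo{bar} loses its braces but keeps \foo. That is exactly the kind of special case a real converter would handle for you.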