Ben Humphreys


Print Wiki Parse Tree with `mwlib`

It’s hard to see how mwlib parses wiki markup. You can use this small snippet to print the parse tree of an article and see what you’re dealing with. It’s recursive, so a very deeply nested document could in principle run into Python’s recursion limit, although I haven’t hit that yet.
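
Here’s a minimal sketch of that kind of recursive printer. It assumes each node exposes its children through a list-like `children` attribute (I believe mwlib’s parse nodes do, but check your version). The name `print_tree` and the stand-in `Node`, `Section`, `Paragraph` and `Text` classes are made up for illustration, so the example runs without mwlib installed.

```python
import sys


def print_tree(node, depth=0, out=sys.stdout):
    """Write the class name of `node`, then recurse into its children."""
    out.write("  " * depth + type(node).__name__ + "\n")
    # Assumption: children live in a list-like `children` attribute.
    for child in getattr(node, "children", None) or []:
        print_tree(child, depth + 1, out)


# Hypothetical stand-ins for parser node classes (not mwlib's own).
class Node(object):
    def __init__(self, *children):
        self.children = list(children)


class Section(Node):
    pass


class Paragraph(Node):
    pass


class Text(Node):
    pass


if __name__ == "__main__":
    tree = Section(Node(Text()), Node(Paragraph(Text())))
    print_tree(tree)
```

With mwlib installed, you’d pass the root node returned by the parser instead of the stand-in tree (for example from `mwlib.uparser.parseString`, assuming that entry point exists in your version).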

Output is something like:

Section
  Node
    Text
  Node
    Paragraph
      Text
    Paragraph
      Style
        Node
          Node
            ArticleLink
            Text
            ArticleLink
            Text
...
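
If the recursion ever does become a problem on a very deep tree, the same indented output can be produced with an explicit stack instead. This is only a sketch under the same assumption as before (nodes expose a `children` attribute); `print_tree_iterative` and the stand-in `Node` class are hypothetical names, not part of mwlib.

```python
import sys


def print_tree_iterative(root, out=sys.stdout):
    """Print an indented tree of class names without using recursion."""
    stack = [(root, 0)]
    while stack:
        node, depth = stack.pop()
        out.write("  " * depth + type(node).__name__ + "\n")
        # Assumption: children live in a list-like `children` attribute.
        children = getattr(node, "children", None) or []
        # Push in reverse so the first child is popped (and printed) first.
        for child in reversed(children):
            stack.append((child, depth + 1))


# Hypothetical stand-in for a parser node class (not mwlib's own).
class Node(object):
    def __init__(self, *children):
        self.children = list(children)


if __name__ == "__main__":
    print_tree_iterative(Node(Node(), Node(Node())))
```

The trade-off is a little more bookkeeping in exchange for depth being limited by memory rather than by the interpreter’s recursion limit.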

There’s probably a function within mwlib for doing this, but I couldn’t find it.

    • #programming
    • #python
    • #mwlib
    • #wiki
  • 1 month ago
About

Computational linguistics researcher at Kyoto University, focussing on machine translation. Also learning Japanese, Korean, French and other badassery.

