Difference between pages "Contagion Nullifier" and "Converting documents to mediawiki markup"
imported>KyrinFireheart
imported>Floobity
Revision as of 18:54, 7 April 2009
Intro
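There appear to be few existing tools for automatically converting other document formats (e.g. .doc, .xls, .ppt) to MediaWiki markup. The simplest approach right now appears to be converting documents to HTML and then converting the HTML to wiki markup, since there are good HTML-to-wiki converters. The Wikipedia editing tools page (http://en.wikipedia.org/wiki/Wikipedia:Tools/Editing_Tools) has some good links.
More tools and techniques, including ones for converting PDFs, have been developed at Appropedia (the sustainable development wiki): Porting formatted content to MediaWiki (http://www.appropedia.org/Appropedia:Porting_formatted_content_to_MediaWiki) and Help:Porting PDF files to MediaWiki (http://www.appropedia.org/Help:Porting_PDF_files_to_MediaWiki).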
HTML documents
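The html2wiki converter (http://diberri.dyndns.org/wikipedia/html2wiki/) is based on the HTML::WikiConverter Perl module (http://search.cpan.org/dist/HTML-WikiConverter/).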
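To illustrate what an HTML-to-wiki conversion involves, here is a minimal sketch in Python. It is not the html2wiki tool or the HTML::WikiConverter module; the class name MiniHtmlToWiki and the function html_to_wiki are hypothetical, and only headings, bold, italics, links, paragraphs, and lists are handled. Images, tables, and alignment are ignored.

```python
import re
from html.parser import HTMLParser


class MiniHtmlToWiki(HTMLParser):
    """Hypothetical, minimal HTML -> MediaWiki converter (not html2wiki itself)."""

    # Inline tags and the wiki markup that both opens and closes them.
    INLINE = {"b": "'''", "strong": "'''", "i": "''", "em": "''"}
    # Heading tags and the run of "=" placed on each side of the title.
    HEADINGS = {"h1": "=", "h2": "==", "h3": "===", "h4": "===="}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []    # pieces of output text, joined at the end
        self.lists = []  # stack of list markers: "*" for <ul>, "#" for <ol>

    def handle_starttag(self, tag, attrs):
        if tag in self.INLINE:
            self.out.append(self.INLINE[tag])
        elif tag in self.HEADINGS:
            self.out.append("\n" + self.HEADINGS[tag])
        elif tag == "a":
            href = dict(attrs).get("href") or ""
            self.out.append("[" + href + " ")
        elif tag in ("ul", "ol"):
            self.lists.append("*" if tag == "ul" else "#")
        elif tag == "li":
            # Nested lists stack their markers, e.g. "**" or "*#".
            self.out.append("\n" + ("".join(self.lists) or "*") + " ")
        elif tag == "p":
            self.out.append("\n\n")

    def handle_endtag(self, tag):
        if tag in self.INLINE:
            self.out.append(self.INLINE[tag])
        elif tag in self.HEADINGS:
            self.out.append(self.HEADINGS[tag] + "\n")
        elif tag == "a":
            self.out.append("]")
        elif tag in ("ul", "ol") and self.lists:
            self.lists.pop()

    def handle_data(self, data):
        # HTML treats any run of whitespace as a single space.
        self.out.append(re.sub(r"\s+", " ", data))


def html_to_wiki(html):
    """Return wiki markup for the subset of HTML handled above."""
    parser = MiniHtmlToWiki()
    parser.feed(html)
    parser.close()
    lines = [line.strip() for line in "".join(parser.out).split("\n")]
    # Collapse runs of blank lines left behind by block-level tags.
    return re.sub(r"\n{3,}", "\n\n", "\n".join(lines)).strip()
```

Calling html_to_wiki('<h2>Intro</h2><p>Some <b>bold</b> text.</p>') yields an "==Intro==" heading followed by a paragraph containing '''bold'''. A real converter has to cope with far messier input, which is why the dedicated tools above are the better choice for actual documents.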
Word Documents
Saving a relatively simple Word document (no images or tables) as HTML and then running it through the converter at http://diberri.dyndns.org/html2wiki.html produced good MediaWiki formatting. A document that included images, tables, and centered text did not work as well: the images would need to be added to the wiki separately, the table didn't come out quite right, and the centered text was no longer centered.
A direct converter can be found at http://216.239.37.104/translate_c?u=http://www.homeopathy.at/wiki/index.php%3Ftitle%3DWord2Wiki.
A series of Word macros for doing simple conversions (including tables) is available at http://www.infpro.com/Word2MediaWiki.aspx; they seem to work reasonably well but aren't designed for sophisticated layouts.
Also, with the release of OpenOffice 2.4 (http://www.openoffice.org/), OpenOffice can now export documents to MediaWiki format. Since OpenOffice can also read MS Word documents, this allows OpenOffice to serve as a Word-to-MediaWiki converter.
Images
I have had good success with the following steps for porting images embedded in Word documents to MediaWiki on a Mac:
1. Click on the image in the Word document and choose Edit->Copy from the menu (cmd-C).
2. Go to the application GraphicConverter and choose File->New->Image with clipboard (cmd-J).
3. Choose File->Save as and save as a JPEG/JFIF format (.jpg) file with 100% Quality.
Alternatively, to capture an image together with its associated text boxes, it seems to come out well if you take a screenshot of a selection with Grab (in the /Applications/Utilities folder), save it as a .tiff (your only option), then open it in GraphicConverter and save it as a JPEG as described above.
Excel Documents
- If you can export the data in comma-separated values (CSV) format, a converter exists at http://area23.brightbyte.de/csv2wp.php.
- A simpler, less feature-rich script supporting "copy and paste" conversion: Excel to Wiki Table Converter (http://people.fas.harvard.edu/~sdouglas/table.cgi)
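For small spreadsheets, the CSV route can also be scripted directly. The following is a minimal sketch in Python, not the code behind either converter above; the function name csv_to_wikitable and the choice of the "wikitable" class are assumptions made for illustration. It treats the first row as a header row and does not escape cells that contain a pipe character, which MediaWiki would misread.

```python
import csv
import io


def csv_to_wikitable(csv_text, header=True):
    """Convert CSV text to MediaWiki table markup (hypothetical helper)."""
    # csv.reader handles quoted fields, so "Pears, ripe" stays one cell.
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    lines = ['{| class="wikitable"']
    for index, row in enumerate(rows):
        lines.append("|-")  # start of a table row
        # "!" marks a header cell, "|" marks an ordinary cell.
        marker = "!" if header and index == 0 else "|"
        for cell in row:
            # The space keeps cells starting with "-" or "+" from being
            # read as row or caption syntax.
            lines.append(marker + " " + cell.strip())
    lines.append("|}")
    return "\n".join(lines)
```

Export the sheet from Excel as CSV, pass the file contents to csv_to_wikitable, and paste the result into the wiki edit box.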