This question about Using an extension: Task filed

Why do images become corrupted in the PDF?

Most images become corrupted in the generated PDF, as in this example:

corrupted_pdf.png

I've tried both .jpg and .png images, but nothing helps. The distortion always looks the same for a particular image, which makes me think this is a systematic algorithmic error in htmldoc, but why? Have you experienced such artifacts?

-- AndreyStoliarov - 25 Nov 2009

What version of htmldoc are you using? And what OS and what version of the GenPDFAddOn? I'm not seeing any distortion like that on my test or production systems.

-- GeorgeClark - 25 Nov 2009

I'm using the htmldoc 1.9.1586 binary from indiecodelabs.com on Windows XP SP2. The GenPDFAddOn version is the latest one (I can't give the exact number because I installed it manually, and for some reason it isn't shown as installed in FindMoreExtensions).

-- AndreyStoliarov - 26 Nov 2009

I'm sorry for the delay in responding. The GenPDFAddOn doesn't report a version in InstalledPlugins or FindMoreExtensions because it's not a plugin, so it doesn't get picked up by the API.

As far as image generation goes, all the GenPDFAddOn does to the images is save each one as a file using a binary copy function. If you enable debugging by adding the setting below to your SitePreferences, the working files and saved images should be left in the working/tmp directory, with each image saved as a file named GenPDFImg[longrandomstring], so you can open them in an image viewer to make sure that the add-on did not modify them. Note that the files have no file type suffix; however, most image software will determine the file type from the signature internal to the file.
  • Set GENPDFADDON_DEBUG = 1
The other thing you could do is create a simple HTML document outside of Foswiki, including some embedded images, and run htmldoc externally to test the PDF generation.

Note that the 1.9.x versions of htmldoc are listed as "pre-release snapshots - not recommended for production usage". So you might have better luck with the released 1.8.27 version of htmldoc.
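
The check described above (confirming that the saved GenPDFImg files are still intact copies of the original images) could be scripted roughly as follows. This is only a sketch in Python, not part of GenPDFAddOn: the function names are made up for illustration, and it assumes you know where the working/tmp directory and the original attachment live on your server. It reads the leading signature bytes of each file, since the saved files have no suffix, and compares a saved copy with the original byte for byte.

```python
import hashlib
from pathlib import Path

# Leading signature ("magic") bytes of common image formats.
SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
}


def sniff_image_type(data: bytes) -> str:
    """Guess the image type from the first bytes of a file."""
    for magic, name in SIGNATURES.items():
        if data.startswith(magic):
            return name
    return "unknown"


def report(tmp_dir: str) -> dict:
    """Map each GenPDFImg* file in tmp_dir to its detected image type."""
    result = {}
    for path in sorted(Path(tmp_dir).glob("GenPDFImg*")):
        if path.is_file():
            with path.open("rb") as handle:  # binary mode matters on Windows
                result[path.name] = sniff_image_type(handle.read(16))
    return result


def files_identical(path_a: str, path_b: str) -> bool:
    """True if both files contain exactly the same bytes."""
    digest_a = hashlib.sha256(Path(path_a).read_bytes()).hexdigest()
    digest_b = hashlib.sha256(Path(path_b).read_bytes()).hexdigest()
    return digest_a == digest_b
```

If `files_identical` returns True for a saved image and its original attachment, the copy step is not where the corruption happens.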

-- GeorgeClark - 01 Dec 2009

Hi George,

Now it's my turn to apologize for the late response :) Besides being short of time to try what you suggested, I then had problems logging into the wiki.

Nevertheless, I've run the experiments you described, and I'm now even more confused. The results are:

1. htmldoc run on its own works fine on different pages, even if it ignores some tags it does not recognize.

2. I used the temporary HTML page and images, and even copied them to another folder to rule out interference, and htmldoc once again worked fine.

So where in the communication between GenPDFAddOn and htmldoc might the problem be?

-- AndreyStoliarov - 10 Dec 2009

I found that it helped to modify /lib/Foswiki/Contrib/GenPDF.pm (down near the end, where all the other arguments are added) to add an extra argument.
   push @htmldocArgs, "--no-compression";       # Work-around for image corruption problem

-- SeanMorgan - 10 Dec 2009

Excellent. Thanks Sean. I'll open a bug report and will add an option to disable compression if image corruption occurs. I hope to release a new version of GenPDFAddOn soon. There are a number of fixes that have been saved up.

Tasks.Item2492 filed.

-- GeorgeClark - 10 Dec 2009

Thanks a lot, guys.

Strangely, I am now getting perfectly clear temporary PDFs, but for some reason they are corrupted when opened or saved through the browser %) I've tried Firefox and IE, and every time the temporary PDF is fine but the one obtained from the browser is bad. Any ideas on that?

But I do have a correct PDF, at least as a temporary file, thanks :)

Also, it looks like the temporary files are still correct even if I remove the additional "--no-compression" argument. Something strange is going on that I cannot understand :D
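
One way to narrow this down is to compare the good temporary PDF with the one saved from the browser, byte for byte. The following is a minimal diagnostic sketch in Python; the function names are hypothetical and it makes no claim about the cause. It only reports the two file sizes and where the files first diverge.

```python
from pathlib import Path


def first_difference(a: bytes, b: bytes):
    """Return the offset of the first differing byte, or None if identical."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    # One input may simply be a prefix of the other.
    return None if len(a) == len(b) else min(len(a), len(b))


def compare_files(temp_pdf: str, downloaded_pdf: str) -> dict:
    """Summarize how the downloaded PDF differs from the temporary one."""
    a = Path(temp_pdf).read_bytes()
    b = Path(downloaded_pdf).read_bytes()
    return {
        "temp_size": len(a),
        "downloaded_size": len(b),
        "first_difference": first_difference(a, b),
    }
```

A size mismatch or a very early difference suggests the file is being altered on its way to the browser rather than by htmldoc.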

-- AndreyStoliarov - 11 Dec 2009

If you are still seeing some cases of corruption, you might also try adding the --no-jpeg argument, which disables additional JPEG compression of large images. This is different from --no-compression: JPEG compression is "lossy" and might degrade images, whereas --no-compression disables "Flate" (zip) compression, which is supposed to be lossless. Note that as long as you are editing the command line, you can control compression with the options below. I have not found what the htmldoc defaults are.
  • --compression=[1-9] Sets compression level from 1 (least compression) to 9 (maximum).
  • --no-compression Disables Flate (zip) compression
  • --jpeg=xx Sets image quality for jpeg compression.
  • --no-jpeg Disables jpeg image compression
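
The options listed above can be combined when running htmldoc by hand to see which setting affects the corruption. Below is a small sketch in Python that only builds the argument list. The function name and parameter conventions are hypothetical, the 1-100 range for JPEG quality is an assumption, and the `--webpage` and `-f` options come from the htmldoc command-line documentation rather than from this thread.

```python
def htmldoc_args(html_path, pdf_path, compression=None, jpeg_quality=None):
    """Build an htmldoc command line with explicit compression settings.

    compression:  None leaves the htmldoc default, 0 adds --no-compression,
                  1-9 adds --compression=N.
    jpeg_quality: None leaves the htmldoc default, 0 adds --no-jpeg,
                  1-100 adds --jpeg=N (range assumed, not from the thread).
    """
    args = ["htmldoc", "--webpage"]

    if compression is not None:
        if compression == 0:
            args.append("--no-compression")
        elif 1 <= compression <= 9:
            args.append(f"--compression={compression}")
        else:
            raise ValueError("compression must be 0 or between 1 and 9")

    if jpeg_quality is not None:
        if jpeg_quality == 0:
            args.append("--no-jpeg")
        elif 1 <= jpeg_quality <= 100:
            args.append(f"--jpeg={jpeg_quality}")
        else:
            raise ValueError("jpeg_quality must be 0 or between 1 and 100")

    # -f names the output file; the input document comes last.
    args += ["-f", pdf_path, html_path]
    return args
```

The resulting list can be passed to `subprocess.run(args)` on a machine where htmldoc is on the PATH, for example once with `compression=0, jpeg_quality=0` and once with both left as None, to compare the two PDFs.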
-- GeorgeClark - 12 Dec 2009

Tasks.Item2492 filed. A test version with parameters to control compression is attached to GenPDFAddOn.

QuestionForm

Subject Using an extension
Extension GenPDFAddOn
Version Foswiki 1.0.7
Status Task filed
Topic revision: r11 - 10 Jan 2010, GeorgeClark