MantisBT - v2.3 Release (Closed)
View Issue Details
ID: 0000326
Project: v2.3 Release (Closed)
Category: [All Projects] General
View Status: public
Date Submitted: 2009-12-08 03:45
Last Update: 2011-03-24 09:52
Reporter: Kyle Skrinak
Assigned To: pedroa
Priority: urgent
Severity: major
Reproducibility: always
Status: closed
Resolution: fixed
Platform:
OS:
OS Version:
Product Version:
Target Version:
Fixed in Version: 2.3
Summary: 0000326: Encoding issues from web report view to PDF
Description: Your encoding for HTML is UTF-8 but your encoding for your PDFs is ANSI, which I presume is the Windows Latin-1 / ISO/IEC 8859-1 character set, a subset of UTF-8 in terms of what characters it can reliably handle.

So, when using non-ASCII characters, they shift unpredictably. You can replicate this problem by using any modern text editor:
1) First make sure you have "extended" characters where they will appear in the report. For example, "?" or "†" will suffice.
2) View the report; all should be well.
3) Save the source of a rendered report page. This will be a UTF-8 encoded file.
4) Next, reload that same file but using 8859 encoding. You'll see the characters shift most unpleasantly, and this replicates what happens in the PDF view of the report.

I've attached a screenshot showing the utf-8 and the re-encoding into 8859-1
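
The reproduction steps in the description can also be approximated without a text editor. The snippet below is only a minimal sketch, assuming the PDF path effectively treats UTF-8 bytes as ISO-8859-1; the sample string and variable names are made up for illustration and are not taken from web2project.

```python
# Sample text with a non-ASCII character, as it might appear in a report
# (hypothetical example string).
original = "Milestone † reached"

# The HTML view stores and serves the text as UTF-8 bytes.
utf8_bytes = original.encode("utf-8")

# Reading the same bytes as ISO-8859-1 turns every multi-byte UTF-8
# sequence into several unrelated single-byte characters.
misread = utf8_bytes.decode("iso-8859-1")

print(repr(original))
print(repr(misread))

# The damage is reversible only while the misread text is still intact.
recovered = misread.encode("iso-8859-1").decode("utf-8")
print(recovered == original)
```

The dagger occupies three bytes in UTF-8, so the misread string comes out two characters longer than the original, which matches the "shifting" seen in the screenshot.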
Additional Information: There's a brilliant discussion on the matter here: http://www.phpwact.org/php/i18n/charsets
Tags: No tags attached.
has duplicate 0000490closed  Pending Requests UTF8 not in PDF files 
has duplicate 0000374closed  Pending Requests Incorrect rendering of the cyrillic text in the pdf-reports 
related to 0000345closed  v2.0 Release (Closed) UTF8 hanling pb in PDF generated reports 
Attached Files
utf-8-8859.png (9,102) 1969-12-31 16:00
https://bugs.web2project.net/file_download.php?file_id=67&type=bug

utf8PDF.jpg (62,296) 1969-12-31 16:00
https://bugs.web2project.net/file_download.php?file_id=82&type=bug

Notes
(0001003)
caseydk   
2010-06-12 12:59   
I've been hacking at this one for hours and found out some useful info.

Apparently, the PDF generation class we're using - Cezpdf - isn't very good at supporting non-ANSI character sets short of manually setting each and every mapping. While this will work, it's time-consuming, and we need to clean up how PDFs are generated first, or else we'll have to put these mappings all over the place.

Resulting todo list:
* I'll refactor PDFs in 2.1/2.2; AND
* Someone else needs to take a lead on how to do the mappings. If you look at the last few pages of lib/ezpdf/readme.pdf, you can see what the format will be; OR
* Someone needs to find a PDF library that will support the character sets we need.

Any takers on the second or third points?
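
To get a feel for how many manual mappings the second option would involve, one can list the characters in a piece of report text that a single-byte encoding cannot represent. This is only a rough sketch: the `unmappable` helper is hypothetical, it does not use or imitate the Cezpdf API, and ISO-8859-1 is assumed as the target encoding.

```python
def unmappable(text, encoding="iso-8859-1"):
    """Return the distinct characters in text that the encoding cannot represent."""
    missing = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            # Keep first-seen order and skip repeats.
            if ch not in missing:
                missing.append(ch)
    return missing


# Western European text fits into ISO-8859-1 as-is.
print(unmappable("café"))

# Cyrillic text and typographic symbols would each need a manual mapping.
print(unmappable("Отчёт †"))
```

Every character the helper reports would need its own slot in a custom mapping, and a single-byte encoding only has 256 slots to share between all languages in use.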
(0001011)
Kyle Skrinak   
2010-06-14 06:04   
I'd advise against 1-to-1 char mapping. I can volunteer to find a PDF library that supports UTF-8.
(0001645)
robertbasic   
2011-02-17 13:09   
Will work on Unicode issues with PDFs. Most likely we will need to switch to TCPDF; we're currently using it successfully on a project where languages like Russian, Romanian, Hungarian and Serbian are used.
(0001656)
pedroa   
2011-02-22 01:33   
Due to my experience with the reporting subject and FPDF/TCPDF I am assigning this to myself.

That also keeps me in the loop with robertbasic.
Now, let's see what we can come up with.

I know this is marked as 'urgent', but believe me this may take some time.

Thanks,

Pedro A.
(0001705)
caseydk   
2011-03-07 10:32   
I'm working to review & merge Robert's stuff now -
https://github.com/caseysoftware/web2project/pull/21
It looks promising.
(0001711)
caseydk   
2011-03-07 17:05   
Merged RobertBasic's PDF generation cleanup
credit: https://github.com/robertbasic/web2project/tree/pdfgeneration
Resolved in r1716, will be in pending v2.3 later this month
(0001780)
caseydk   
2011-03-24 09:52   
Closed in preparation for v2.3 release.

Issue History
Date Modified | Username | Field | Change
2009-12-08 03:45 | Kyle Skrinak | New Issue
2009-12-08 03:45 | Kyle Skrinak | File Added: utf-8-8859.png
2009-12-08 11:48 | caseydk | Project | v1.1 Release (Closed) => v1.3 Release (Closed)
2009-12-11 20:42 | caseydk | Priority | normal => urgent
2010-01-05 20:01 | caseydk | Relationship added | related to 0000345
2010-02-18 04:38 | caseydk | Project | v1.3 Release (Closed) => v2.0 Release (Closed)
2010-05-01 16:32 | caseydk | Relationship added | parent of 0000374
2010-05-04 18:20 | caseydk | Category | Reports => PDF Generation
2010-05-04 18:20 | caseydk | Description Updated
2010-05-12 22:18 | caseydk | File Added: utf8PDF.jpg
2010-06-12 12:53 | caseydk | Project | v2.0 Release (Closed) => Pending Requests
2010-06-12 12:59 | caseydk | Note Added: 0001003
2010-06-14 06:04 | Kyle Skrinak | Note Added: 0001011
2010-06-14 17:34 | caseydk | Relationship added | has duplicate 0000490
2011-02-17 13:09 | robertbasic | Note Added: 0001645
2011-02-22 01:30 | pedroa | Status | new => assigned
2011-02-22 01:30 | pedroa | Assigned To | => pedroa
2011-02-22 01:33 | pedroa | Note Added: 0001656
2011-02-22 01:36 | pedroa | Relationship deleted | parent of 0000374
2011-02-22 01:36 | pedroa | Relationship added | has duplicate 0000374
2011-03-07 10:32 | caseydk | Note Added: 0001705
2011-03-07 17:02 | caseydk | Project | Pending Requests => v2.3 Release (Closed)
2011-03-07 17:05 | caseydk | Note Added: 0001711
2011-03-07 17:05 | caseydk | Status | assigned => resolved
2011-03-07 17:05 | caseydk | Resolution | open => fixed
2011-03-24 09:52 | caseydk | Note Added: 0001780
2011-03-24 09:52 | caseydk | Status | resolved => closed
2011-03-24 09:52 | caseydk | Fixed in Version | => 2.3