View Issue Details
ID: 0000326
Project: v2.3 Release (Closed)
Category: [All Projects] General
View Status: public
Last Update: 2011-03-24 09:52
Reporter: Kyle Skrinak
Assigned To: pedroa
Priority: urgent
Severity: major
Reproducibility: always
Status: closed
Resolution: fixed
Product Version:
Target Version:
Fixed in Version: 2.3
Summary: 0000326: Encoding issues from web report view to PDF
Description: Your encoding for HTML is UTF-8 but your encoding for your PDFs is ANSI, which I presume is the Windows Latin-1 / ISO/IEC 8859-1 character set, a subset of UTF-8 in terms of what characters it can reliably handle.

So non-ASCII characters come out garbled in the PDF. You can replicate this problem with any modern text editor:
1) First make sure you have "extended" characters where they will appear in the report. For example, "?" or "†" will suffice.
2) View the report; all should be well.
3) Save the source of a rendered report page. This will be a UTF-8 encoded file.
4) Next, reload that same file but using ISO-8859-1 encoding. You'll see the characters shift most unpleasantly, and this replicates what happens in the PDF view of the report.

I've attached a screenshot showing the UTF-8 view and the re-encoding into ISO-8859-1.
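
The replication steps above can also be sketched outside a text editor. This is a minimal illustration, not code from the project: the sample string is made up, and it only assumes that the report text is UTF-8 and that the PDF side reads it as ISO-8859-1 or Windows-1252, as the description presumes.

```python
# Simulate the mismatch: text is stored as UTF-8 but read back as a
# single-byte Western encoding. The sample string is hypothetical.
original = "caf\u00e9 \u2020"  # "café †"

utf8_bytes = original.encode("utf-8")

# ISO-8859-1 assigns a character to every byte, so decoding never fails;
# each multi-byte UTF-8 sequence simply turns into several wrong characters.
as_latin1 = utf8_bytes.decode("iso-8859-1")

# Windows-1252 ("ANSI") differs from ISO-8859-1 only in bytes 0x80-0x9F.
as_cp1252 = utf8_bytes.decode("cp1252")

print(repr(original), len(original))
print(repr(as_latin1), len(as_latin1))
print(repr(as_cp1252), len(as_cp1252))

# The damage is reversible as long as the bytes were not altered.
recovered = as_latin1.encode("iso-8859-1").decode("utf-8")
```

Each non-ASCII character expands into two or three characters after the wrong decode, which is why the surrounding text appears to shift.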
Additional Information: There's a brilliant discussion on the matter here: http://www.phpwact.org/php/i18n/charsets
Tags: No tags attached.
Attached Files

Relationships
has duplicate 0000490 (closed, Pending Requests): UTF8 not in PDF files
has duplicate 0000374 (closed, Pending Requests): Incorrect rendering of the cyrillic text in the pdf-reports
related to 0000345 (closed, v2.0 Release (Closed)): UTF8 hanling pb in PDF generated reports

Notes

~0001003

caseydk (administrator)

I've been hacking at this one for hours and found out some useful info.

Apparently, the PDF generation class we're using, Cezpdf, isn't very good at supporting non-ANSI character sets short of manually setting each and every mapping. While this will work, it's time-consuming, and we need to clean up how PDFs are generated first, or else we'll have to put these mappings all over the place.

Resulting todo list:
* I'll refactor PDFs in 2.1/2.2; AND
* Someone else needs to take a lead on how to do the mappings. If you look at the last few pages of lib/ezpdf/readme.pdf, you can see what the format will be; OR
* Someone needs to find a PDF library that will support the character sets we need.

Any takers on the second or third points?
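
To get a feel for how much manual mapping the second option would involve, one could list the characters in a report that the single-byte target encoding cannot represent. The sketch below is only illustrative: the helper name `chars_needing_mapping`, the sample text, and the choice of `cp1252` as the "ANSI" encoding are all assumptions, and it does not produce Cezpdf's actual mapping format.

```python
import unicodedata


def chars_needing_mapping(text, encoding="cp1252"):
    """Return the sorted distinct characters that `encoding` cannot represent."""
    missing = set()
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            missing.add(ch)
    return sorted(missing)


# Hypothetical report text: a Cyrillic word followed by "café †".
sample = "\u041e\u0442\u0447\u0451\u0442: caf\u00e9 \u2020"

for ch in chars_needing_mapping(sample):
    # Print the code point and its Unicode name for each unmappable character.
    print(f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNKNOWN')}")
```

Every distinct character reported this way would need its own hand-written entry, and a single-byte encoding has at most 256 slots, so the list grows quickly once reports contain Cyrillic or other non-Western scripts.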

~0001011

Kyle Skrinak (reporter)

I'd advise against 1-to-1 char mapping. I can volunteer to find a PDF library that supports UTF-8.

~0001645

robertbasic (developer)

I will work on the Unicode issues with PDFs. We will most likely need to switch to TCPDF; we're currently using it successfully on a project where languages like Russian, Romanian, Hungarian and Serbian are used.

~0001656

pedroa (administrator)

Given my experience with reporting and with FPDF/TCPDF, I am assigning this to myself.

That also keeps me in the loop with robertbasic.
Now, let's see what we can come up with.

I know this is marked as 'urgent', but believe me, this may take some time.

Thanks,

Pedro A.

~0001705

caseydk (administrator)

I'm working to review and merge Robert's changes now:
https://github.com/caseysoftware/web2project/pull/21
It looks promising.

~0001711

caseydk (administrator)

Merged RobertBasic's PDF generation cleanup.
Credit: https://github.com/robertbasic/web2project/tree/pdfgeneration
Resolved in r1716; it will be in the pending v2.3 release later this month.

~0001780

caseydk (administrator)

Closed in preparation for v2.3 release.

Issue History
Date Modified | Username | Field | Change
2009-12-08 03:45 | Kyle Skrinak | New Issue |
2009-12-08 03:45 | Kyle Skrinak | File Added: utf-8-8859.png |
2009-12-08 11:48 | caseydk | Project | v1.1 Release (Closed) => v1.3 Release (Closed)
2009-12-11 20:42 | caseydk | Priority | normal => urgent
2010-01-05 20:01 | caseydk | Relationship added | related to 0000345
2010-02-18 04:38 | caseydk | Project | v1.3 Release (Closed) => v2.0 Release (Closed)
2010-05-01 16:32 | caseydk | Relationship added | parent of 0000374
2010-05-04 18:20 | caseydk | Category | Reports => PDF Generation
2010-05-04 18:20 | caseydk | Description Updated |
2010-05-12 22:18 | caseydk | File Added: utf8PDF.jpg |
2010-06-12 12:53 | caseydk | Project | v2.0 Release (Closed) => Pending Requests
2010-06-12 12:59 | caseydk | Note Added: 0001003 |
2010-06-14 06:04 | Kyle Skrinak | Note Added: 0001011 |
2010-06-14 17:34 | caseydk | Relationship added | has duplicate 0000490
2011-02-17 13:09 | robertbasic | Note Added: 0001645 |
2011-02-22 01:30 | pedroa | Status | new => assigned
2011-02-22 01:30 | pedroa | Assigned To | => pedroa
2011-02-22 01:33 | pedroa | Note Added: 0001656 |
2011-02-22 01:36 | pedroa | Relationship deleted | parent of 0000374
2011-02-22 01:36 | pedroa | Relationship added | has duplicate 0000374
2011-03-07 10:32 | caseydk | Note Added: 0001705 |
2011-03-07 17:02 | caseydk | Project | Pending Requests => v2.3 Release (Closed)
2011-03-07 17:05 | caseydk | Note Added: 0001711 |
2011-03-07 17:05 | caseydk | Status | assigned => resolved
2011-03-07 17:05 | caseydk | Resolution | open => fixed
2011-03-24 09:52 | caseydk | Note Added: 0001780 |
2011-03-24 09:52 | caseydk | Status | resolved => closed
2011-03-24 09:52 | caseydk | Fixed in Version | => 2.3