#897 closed defect (wontfix)
two incorrect characters instead of one correct in PDF output
Reported by: | anonymous | Owned by: | Alec Thomas |
---|---|---|---|
Priority: | normal | Component: | PageToPdfPlugin |
Severity: | normal | Keywords: | UTF-8 |
Cc: | | Trac Release: | 0.10 |
Description
Hi
I've checked out this plugin from the Subversion repository and it can't handle UTF-8 encoded pages. It generates two incorrect characters instead of one correct character. I've read previous posts on this topic and saw that it had been fixed, but it does not work for me. Thanks.
Attachments (0)
Change History (12)
comment:1 follow-up: 2 Changed 18 years ago by
What is your default_charset in trac.ini?
comment:2 Changed 18 years ago by
Replying to coderanger:
What is your default_charset in trac.ini?
Hi. My trac.ini contains:
    [trac]
    default_charset = UTF-8

    [pagetopdf]
    charset = UTF-8
comment:4 Changed 18 years ago by
Replying to coderanger:
I think that should be utf-8 (note the lower case).
Unfortunately, it doesn't work with lowercase, either.
Environment:
- CentOS 4.3 linux
- htmldoc 1.8.27
- trac-0.10
- Python 2.3.4
The text is in Hungarian, with accented characters. The Trac wiki itself displays it correctly.
comment:6 Changed 18 years ago by
Replying to coderanger:
What encoding are you actually using for the text?
I'm not sure I understand your question correctly... What do you mean? I use utf-8 as the default_charset in trac.ini, and UTF-8 is the default on my Linux box. The wiki pages are stored as UTF-8 text in Trac:
    [root@dev tmp]# trac-admin /opt/trac/dia wiki export TestPage test
    [root@dev tmp]# file test
    test: UTF-8 Unicode text, with CRLF line terminators
    [root@dev tmp]#
comment:7 follow-up: 8 Changed 18 years ago by
Trac uses Unicode strings internally, but this doesn't mean your browser is actually sending UTF8. Not sure how you check this on a Linux box, though I would hope it takes the system charset.
comment:8 follow-up: 9 Changed 18 years ago by
Replying to coderanger:
Trac uses Unicode strings internally, but this doesn't mean your browser is actually sending UTF8. Not sure how you check this on a Linux box, though I would hope it takes the system charset.
UTF-8 is the default on Linux boxes. HTMLDOC converts HTML to PDF; Trac, I think, renders the wiki page to HTML and hands it to HTMLDOC. The client's charset doesn't affect this process, as far as I know.
pagetopdf.py fragment:
    hfile, hfilename = mkstemp('tracpdf')
    codepage = self.env.config.get('trac', 'default_charset', 0)
    page = wiki_to_html(source, self.env, req).encode(codepage)
    page = re.sub('<img src="(?!\w+://)',
                  '<img src="%s://%s:%d' % (req.scheme, req.server_name, req.server_port),
                  page)
    os.write(hfile, '<html><body>' + page + '</body></html>')
    os.close(hfile)
Trac logs this:
    2006-11-12 16:47:04,174 Trac[pagetopdf] DEBUG: --right 1.5cm --bottom 1.5cm --webpage --top 1.5cm --format pdf14 --size A4 --charset utf-8 --left 1.5cm
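For illustration only, a minimal Python sketch (not taken from the plugin) of what appears to be happening; treating cp1252 as the 8-bit charset HTMLDOC falls back to is an assumption:

    # Sketch only: one accented Hungarian letter becomes a two-byte UTF-8
    # sequence; read back through an 8-bit charset (cp1252 here, as an
    # assumption about how htmldoc interprets the bytes), it shows up as
    # two separate characters.
    text = 'ő'                               # U+0151, a single character
    utf8_bytes = text.encode('utf-8')        # b'\xc5\x91', two bytes
    mojibake = utf8_bytes.decode('cp1252')   # 'Å' followed by a quotation mark
    print(len(text), len(mojibake))          # prints: 1 2
    print(mojibake)                          # two characters instead of one

That would match the symptom in the ticket title: two incorrect characters in the PDF where the wiki shows one.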
comment:9 Changed 18 years ago by
UTF-8 is the default on Linux boxes. HTMLDOC converts HTML to PDF; Trac, I think, renders the wiki page to HTML and hands it to HTMLDOC. The client's charset doesn't affect this process, as far as I know.
I changed the code to test another encoding (ISO-8859-2):

    page = wiki_to_html(source, self.env, req).encode('iso-8859-2')

and

    htmldoc_args = {'webpage': None,
                    'format': 'pdf14',
                    'left': '1.5cm',
                    'right': '1.5cm',
                    'top': '1.5cm',
                    'bottom': '1.5cm',
                    'charset': '8859-2'}
I left default_charset as utf-8, since I want UTF-8 on my wiki; only the PDF generation uses the Latin-2 encoding.
This way it works fine for ISO Latin-2 accented characters (UTF-8 would be better, but it will do for now). So HTMLDOC can't handle UTF-8, yet it reportedly works for others somehow?
Well, this is a workaround, but the limitation being worked around is in HTMLDOC, not in Trac.
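For reference, a standalone sketch of the transcoding step behind this workaround; the sample text and the errors='replace' behaviour are illustrative assumptions, not what the plugin actually does:

    # Sketch only: transcode the rendered wiki HTML from Unicode to a
    # charset HTMLDOC understands before writing the temp file;
    # errors='replace' substitutes '?' for anything ISO-8859-2 lacks
    # instead of raising UnicodeEncodeError.
    html = '<html><body>árvíztűrő tükörfúrógép</body></html>'  # sample Hungarian text
    pdf_charset = 'iso-8859-2'   # matches the --charset 8859-2 argument passed to htmldoc
    page = html.encode(pdf_charset, errors='replace')
    print(page)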
comment:10 Changed 18 years ago by
Resolution: | → wontfix |
---|---|
Status: | new → closed |
UTF-8 is not supported by htmldoc. You must use one of the supported encodings.
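If one of the 8-bit encodings has to be chosen anyway, a quick standalone check (a sketch, not plugin code) of whether a page's text is representable in the chosen charset:

    # Sketch: report whether Unicode text fits the 8-bit charset chosen
    # for htmldoc (ISO-8859-2 here, following the workaround above).
    def fits_charset(text, charset='iso-8859-2'):
        try:
            text.encode(charset)
            return True
        except UnicodeEncodeError:
            return False

    print(fits_charset('árvíztűrő tükörfúrógép'))  # True: all characters exist in Latin-2
    print(fits_charset('Ω'))                       # False: Greek omega is not in Latin-2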
comment:12 Changed 14 years ago by
Keywords: | UTF-8 added; utf8 removed |
---|---|