
I have a page defined in Symphony titled “Why I’m Running”.

Given my page’s title, some part of the system ultimately sets $page-title to Why I&apos;m Running. Internet Explorer 8 doesn’t recognize &apos; and passes the character entity through to the viewer; Firefox and Chrome handle it fine. Research I’ve done so far indicates that this entity is part of the XML 1.0 specification, but not of HTML 4.
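To make the XML-versus-HTML 4 difference concrete, here is a small sketch in Python (used purely for illustration; it is not part of Symphony). It assumes only the standard library: `html.entities.name2codepoint` holds the HTML 4.01 named entities, and `xml.etree.ElementTree` is an XML parser that resolves the predefined XML entities.

```python
import xml.etree.ElementTree as ET
from html.entities import name2codepoint  # HTML 4.01 named entities

# An XML parser resolves the predefined entity &apos; to an apostrophe.
title = ET.fromstring("<title>Why I&apos;m Running</title>").text

# HTML 4.01 defines &quot; as a named entity, but has no entry for &apos;.
html4_knows_apos = "apos" in name2codepoint
html4_knows_quot = "quot" in name2codepoint

print(title)
print(html4_knows_apos, html4_knows_quot)
```

This matches the behaviour described above: a browser that treats the response strictly as HTML 4 has no definition for the entity, while XML-aware parsers do.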

As a workaround, I’ve modified the page’s title to use &#39; and added disable-output-escaping where necessary. Fields that use Markdown seem to be handling apostrophes correctly.

In my xsl:output, the doctype is XHTML 1.0 Strict and encoding is UTF-8. Content type of HTTP response is text/html. Most of my database tables are latin1_swedish_ci rather than utf8_unicode_ci collation, but I don’t believe that’s the problem.

Ideas?

Most of my database tables are latin1_swedish_ci rather than utf8_unicode_ci collation, but I don’t believe that’s the problem.

This will definitely cause problems. How did this happen? Did you manually change the collation to latin1? Symphony should create tables with utf8 collation.

Try putting the following in your xslt for the page with the problems (near the top):

<!DOCTYPE stylesheet [
<!ENTITY apos  "&#39;" >
]>

Just to check that it is being declared properly. This declares the entity reference to the xslt processor and should generate apostrophes for the browser. If that fails, try the following:

<!DOCTYPE stylesheet [
<!ENTITY apos 
"<xsl:text disable-output-escaping='yes'>&amp;#39;</xsl:text>">
]>

This method lets the entity pass through the xslt processor intact and lets the browser generate the apostrophes.

It may not work if IE8 is having real issues, but it’s worth a try. This is the kind of case that ‘disable output escaping’ is for, though it should be used sparingly…

Hope this helps!

PS: Michael is right, your db should really be utf-8, any reason it isn’t? Or is it something you can’t control? As long as your Symphony install was specified to use compatibility, it may be ok, but it could lead to other issues with character encoding…

How did this happen? Did you manually change the collation to latin1? Symphony should create tables with utf8 collation.

Annoyingly Symphony won’t create utf8 collation if your database default is something different. For example with MAMP my MySQL instance is latin1_swedish_ci by default. When I create a new database (using the Sequel Pro client) this is the default presented to me, so I must change it to utf8_unicode_ci before installing Symphony.

Interesting, I use MAMP, will check that when I get home… What MySQL admin client are you using?

For example with MAMP my MySQL instance is latin1_swedish_ci by default.

Same here using wampserver.

It is possible to set the default to utf8 in phpmyadmin though.

How do you check what the default is? I use phpmyadmin and it is set as utf8_general_ci

How do you check what the default is?

If you create a new database, and it is set to utf8 automatically, I guess? Can’t find the place for setting the default anymore…

Using Sequel Pro this is what I see using Database > Show Server Variables for my MAMP instance.

[Screenshot: Sequel Pro server variables for the MAMP instance]

I’ve got that and will look, just out of interest!

Edit: Oops, no I haven’t ;) Going to check the command line…

Edit: I had to modify my PATH variable for MAMP doing the following

PATH=/Applications/MAMP/Library/bin:/Applications/MAMP/bin/php5.x/bin:$PATH

Then in terminal: mysql -u root -p

then status

This told me it is set to latin1

Thank you for the helpful responses.

My tables are now all utf8; likewise for the defaults. When I set up the database, I had forgotten to specify the collation and character encoding.

I tried defining the entity in my page as suggested, but without success, even after getting it in the right place.

Also, I’ve found that the XML from my Navigation data source is correctly outputting an apostrophe: it shows up fine in IE8 when I debug the page and on my rendered page. The server returns the correct Content-Type, and my stylesheets and database tables/fields are all UTF-8. Shouldn’t the apostrophe be allowed to pass through to the browser rather than being converted into an entity reference specific to XML? Why is the value of $page-title escaped?

W/r/t the conservative use of disable-output-escaping, if $page-title is going to insist on coming out as &apos;, how can I get the value without it being escaped to &amp;apos;?
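The double encoding is what you would expect if a value that already contains the literal text &apos; is escaped a second time on output. The Python sketch below reproduces that effect outside Symphony; it is only an illustration, the helper name `escape_once` is hypothetical, and it assumes $page-title really does hold the pre-escaped string.

```python
from html import unescape
from xml.sax.saxutils import escape


def escape_once(value: str) -> str:
    """Decode existing entity references, then escape exactly once."""
    return escape(unescape(value))


already_escaped = "Why I&apos;m Running"  # assumed content of $page-title

# Naive output escaping turns the existing '&' into '&amp;'.
double = escape(already_escaped)

# Decoding first leaves a plain apostrophe, which needs no escaping in text.
single = escape_once(already_escaped)

print(double)
print(single)
```

Decoding before escaping is only safe when the value is known to be pre-escaped; applied to text that legitimately contains something like `&amp;lt;`, it would change the meaning.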

As another solution, I found a character, the ‘RIGHT SINGLE QUOTATION MARK’, that’ll do the job and won’t get escaped. Downside is that it’s not available on my keyboard.
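Since the character is hard to type, the substitution could be scripted. This is a minimal Python sketch of the idea, not something Symphony provides: the helper `curl_apostrophes` is hypothetical, and it blindly replaces every ASCII apostrophe (U+0027) with U+2019, which is fine for a title like this one but would be wrong for apostrophes used as opening quotes.

```python
import unicodedata
from xml.sax.saxutils import escape


def curl_apostrophes(text: str) -> str:
    """Replace each ASCII apostrophe (U+0027) with U+2019."""
    return text.replace("'", "\u2019")


title = curl_apostrophes("Why I'm Running")

# U+2019 is not one of the characters XML escaping touches,
# so the escaped title is identical to the unescaped one.
escaped_title = escape(title)

print(unicodedata.name("\u2019"))
print(title == escaped_title)
```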

Why.xsl

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE stylesheet [
<!ENTITY apos  "&#39;" >
]>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:import href="../utilities/master.xsl"/>

<xsl:template match="/data">
    <h2><xsl:value-of select="$page-title" disable-output-escaping="yes"/></h2>
    <xsl:copy-of select="why-im-running/entry/body/*" />
</xsl:template>
</xsl:stylesheet>

After you changed your database tables to UTF-8, did you try re-saving the page?

After a little reading, it turns out that it is just a ‘browser specific’ thing, which is a shame, so nothing you will be able to do with XML or XSLT will fix it.

May I suggest logging this as a bug with the Core team and maybe they will be kind enough to change all instances of &apos; in the system for &#39; (if they even can).
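The suggested change amounts to swapping the named entity for its numeric equivalent, which HTML 4 browsers do understand. As a rough illustration in Python (the function name `apos_to_numeric` is hypothetical, and where such a step would hook into Symphony is not something established here):

```python
from html import unescape


def apos_to_numeric(markup: str) -> str:
    """Swap the XML-only named entity &apos; for the numeric reference &#39;."""
    return markup.replace("&apos;", "&#39;")


page = apos_to_numeric("<h2>Why I&apos;m Running</h2>")

# Both forms decode to the same apostrophe; only the spelling differs.
decoded = unescape(page)

print(page)
print(decoded)
```

A plain string replacement like this would also rewrite occurrences inside scripts or CDATA sections, so it is a sketch of the idea rather than a drop-in fix.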

I didn’t think it would be a db character encoding issue to be honest, as it is being correctly output as &apos; in the xml, but it was worth a try to be safe. This is after all what escapes are for.

[EDIT]: Deleted. I was using a “different” (a.k.a wrong) apostrophe in my tests…

I am a bit confused why the apostrophe, returning correctly from the database to the XML, is output as &amp;apos; by the XSLT processor. Maybe Allen can help?

Yeah, let’s hope so…

My previous answer is a little naive, I thought it was just the <title> element, but it isn’t. This is a little weird…

Apologies if my answer is off the mark or if it’s covered already. I saw “Allen” and my ears perked up (stop doing that!)

Apostrophes get encoded inside attributes (for a good reason). This is part of the XML specification.

Well, Allen, but how come it’s double-encoded in the end?

 &amp;apos;

Beat me to it. And it’s happening in everything, not just attributes.

Oh, that’s the problem? I’m not sure about that.
