“Multilanguage from the core” – Forum Thread – Discuss

- guillem_l
- 17 Aug 11, 6:14 pm
- Comment #41

+1 for nick

- Nitriques
- 24 Aug 11, 10:32 am
- Comment #42

+1 For nick, again

Symphony should only offer hooks to make it easier to build extensions.

+2 if a language is always present and is the default one. That way, the data pulled from the DB is always associated with a culture (since this is true in the real world).

@nick: Since you are interested in how people use translation, this is how I do it 99% of the time. **Some** of the fields are translated. It never depends on the type of field, but on the will of the client. Some people upload a PDF that features all the languages of their site; some upload a file per language.

Another thing: ML support is ALWAYS needed in fields that relate to other fields' values.

- moonoo2
- 24 Aug 11, 6:48 pm
- Comment #43

+1. I always work in 2 languages, so as @nitriques said, fields that relate to other fields (e.g. SBL) are required to be translatable.

- DavidOliver
- 24 Aug 11, 9:34 pm
- Comment #44

I don't possess the technical awareness to be able to fully understand or contribute to this discussion in a detailed way, but it's interesting for me to read about the general directions of different approaches.

One vague and not-well-formed thought that has popped into my mind while reading through (and that was further stimulated by Nick's post): could it be that rather than implementing multilingual support, a kind of many-to-many multiversion capability could be built which could be used for (among other things) multilingual sites.

Other things it could be used for might include complexity (simple/complex versions of content), user contributions (different users' versions of a piece of content), or whatever the data-modeller requires. If these version-types could be linked with each other, perhaps you could have both English and German versions of both simple and complex (or whatever) versions of content. (I'm thinking inheritance would play an important role.)

In this way, it might follow the Symphony approach of not making any more assumptions than necessary by providing a framework/tool which can be used for more situations than merely multilingual, but that also provides a method which is very well suited for multilingual.

It might be a fundamental (core?) part of Symphony, but shouldn't be exposed to users who don't need it (as Nick said).

I don't know whether this makes any sense and is practical and possible or not, but I thought I'd try to get it written out anyway.

- nickdunn
- 26 Aug 11, 7:19 pm
- Comment #45

could it be that rather than implementing multilingual support, a kind of many-to-many multiversion capability could be built which could be used for (among other things) multilingual sites

The theory is sound, definitely. But I found when building Entry Versions that version control of content is incredibly difficult. Basic, single dimensional content is easy (text), but throw in files and related entries, you enter a whole new world of pain. If you replace an uploaded file, should previous versions of the entry retain the old file? What if you edit a related entry (say, an image entry uploaded via a SSM field to a parent article entry), should older versions retain the previous version of that related entry? If the answer is yes, you enter into very complex logic. This is why I brought up the idea of version control but quickly back-tracked and decided it wasn't appropriate.

It feels like the need for multilingual in the core is more immediate than content versioning, so I wouldn't want an implementation of the latter to hold up the former. But I agree that the two share similar concepts, and with some forethought, the delegates and implementation of multilingual could potentially make content versioning easier in the future.

DanMan has raised the point again that section/field labels themselves might need to be translated. This is another complex facet of multilingual sites that some people will require, others not:

translation of system copy in the backend (this is implemented and working well, translations provided via extensions)
translation of backend structure (sections and fields), which are navigation and labels that are provided to authors (Dan's point)
translation of content (this thread up to now)

There is a disconnect here. Backend translation is currently done via extensions — text labels both in the core, and from other extensions, are bundled into translation files and served to the user. This is based on the author's language selection, I presume. This makes Symphony "multilingual" because it means a webdesign agency in Spain can use Symphony without any English at all:

they install the Spanish language pack
they ensure extensions have Spanish translations
they name their sections/fields in Spanish when building site structure

The problem comes when building a multilingual frontend site, but when the authors do not all speak the same language. On some occasions all authors will speak one language (e.g. English) but be publishing content in several (English, German, Spanish). What Dan is asking is whether we could/should support a Spanish author signing in and seeing the entire backend in Spanish only.

Would you want to limit this Spanish user to seeing only Spanish entries, and to creating entries only in Spanish? And if a French author signs in, should they see only French entries?

This would require an additional set of translations to be applied: the site developers would need to translate the section and field names that make up the content model.

This makes my head hurt. Remie points out that some CMS keep things very separate. How do systems such as Drupal handle this conundrum?

- nickdunn
- 26 Aug 11, 7:49 pm
- Comment #46

Hmm, re-reading my ramble above, the TL;DR version:

There are two use cases: one for pure content translation, another for pure backend localisation. In some cases they are independent, in other cases they cross over.

- creativedutchmen
- 26 Aug 11, 8:02 pm
- Comment #47

Nick, like you, I think there is a lot of overlap between entry versions and ML. In fact, I think the similarities are even bigger than that. All the "problems" you mention with respect to versions also apply to ML.

What if you edit a related entry (say, an image entry uploaded via a SSM field to a parent article entry), should older versions retain the previous version of that related entry?

What if you have artwork (with text) for each article, and the artwork is entered in a SSM, should different languages require different images, or should they share the same set of images?

If you replace an uploaded file, should previous versions of the entry retain the old file?

Here, too. I can think of usecases where an image has to be uploaded per language, creating roughly the same problem.

All in all, I think creating a symphony-like system for entry versions and ML is a tough job, but it surely isn't impossible!

[offtopic]

The problem you mentioned with the versioned relations isn't as hard as you think: if you let each entry keep track of its revisions (much like GitHub's commits), you can then reference that entry+revision from another entry (think submodules). This would not require any additional logic except in datasources, and maybe some visual clues as to why the newest entry isn't showing.
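
A minimal sketch of that entry+revision idea, written in Python purely for illustration (it is not Symphony code). The `Entry`, `PinnedRef` and `resolve` names are hypothetical, and an in-memory dict stands in for the database:

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """An entry that keeps every saved revision of its data (hypothetical model)."""
    entry_id: int
    revisions: list = field(default_factory=list)

    def save(self, data: dict) -> int:
        """Store a new revision and return its 1-based revision number."""
        self.revisions.append(dict(data))
        return len(self.revisions)

    def at(self, revision: int) -> dict:
        """Return the data exactly as it was at the given revision."""
        return self.revisions[revision - 1]


@dataclass(frozen=True)
class PinnedRef:
    """A reference to one specific revision of another entry, like a submodule pin."""
    entry_id: int
    revision: int


def resolve(ref: PinnedRef, store: dict) -> dict:
    """What a datasource would do: follow the pin instead of taking the newest revision."""
    return store[ref.entry_id].at(ref.revision)


image = Entry(entry_id=7)
first = image.save({"file": "cover-v1.jpg"})
store = {image.entry_id: image}

# The article revision records which image revision it was saved against.
article_ref = PinnedRef(entry_id=image.entry_id, revision=first)

# Replacing the image later does not change what the old article revision sees.
image.save({"file": "cover-v2.jpg"})
pinned = resolve(article_ref, store)
latest = image.at(len(image.revisions))
```

Here the older article revision still resolves to the first image, while anything asking for the newest revision gets the replacement. The "visual clue" mentioned above would be shown whenever `pinned` and `latest` differ.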

[/offtopic]

- nickdunn
- 26 Aug 11, 8:07 pm
- Comment #48

I can think of usecases where an image has to be uploaded per language, creating roughly the same problem.

True, yes. Although this also depends on the way in which you require translations. We've already seen that in some use cases you only want to translate maybe an image caption (per-field translations), but in another case you might want to translate the entire entry (per-entry translations). We have extensions that do both of these at the moment.

Another problem to the mix: the translation of section/field names, allowing a completely localised backend, would not be complete without the ability to translate static values for fields also. For example you may have a Select Box with values in English, but if a German author logs in would you want her to see German values?

In this instance, translation goes beyond the publishing interface, and needs to be considered at the section/field configuration interface also.

- creativedutchmen
- 26 Aug 11, 8:31 pm
- Comment #49

We've already seen that in some use cases you only want to translate maybe an image caption (per-field translations), but in another case you might want to translate the entire entry.

Which is exactly your point with the entry versions and images. If each field would have multiple instances of itself (either versions or translations), the entry could then simply reference the field ID's combined with version number.

This is just the very low level concept, on which extensions could build. The best part is here that a field supporting ML will also support versions and vice versa.
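
A rough sketch of that low-level concept, in Python for illustration only. The `FieldStore` class and its `variant` label are assumptions made for the example, not an existing Symphony API; the variant can be a language code, a revision number, or anything else an extension chooses:

```python
class FieldStore:
    """Hypothetical store holding several instances of each field value."""

    def __init__(self, default_variant):
        self.default_variant = default_variant
        self.rows = {}  # (entry_id, field_id, variant) -> value

    def write(self, entry_id, field_id, value, variant=None):
        """Save one instance of a field; without a variant, save the default one."""
        if variant is None:
            variant = self.default_variant
        self.rows[(entry_id, field_id, variant)] = value

    def read(self, entry_id, field_id, variant=None):
        """Return the requested variant, or the default one if it does not exist."""
        if variant is None:
            variant = self.default_variant
        wanted = (entry_id, field_id, variant)
        fallback = (entry_id, field_id, self.default_variant)
        return self.rows.get(wanted, self.rows.get(fallback))


store = FieldStore(default_variant="en")
store.write(1, "caption", "A red bridge")
store.write(1, "caption", "Eine rote Brücke", variant="de")
store.write(1, "image", "bridge.jpg")  # shared by all languages

caption_de = store.read(1, "caption", "de")
image_de = store.read(1, "image", "de")
```

Because only the caption has a German instance, this behaves like a per-field translation: the German view gets its own caption and falls back to the shared image. Writing every field with `variant="de"` would give the per-entry case with the same storage.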

Another problem to the mix: the translation of section/field names, allowing a completely localised backend, would not be complete without the ability to translate static values for fields also. For example you may have a Select Box with values in English, but if a German author logs in would you want her to see German values?

Good point, there will always be edge cases which don't play nice. The worst thing with them is that they normally pop up long after the entire thing has been created.

- DanMan
- 27 Aug 11, 4:02 am
- Comment #50

What remie describes is what I meant with "half-assed solution" in my thread. It boils down to: set up all content for one language, copy'n'paste the whole thing, and translate all entries. Repeat. Then watch the content slowly drift apart, as users forget to update the other languages when they change one of them. Bad.

I couldn't have explained it better than creativedutchmen did with the different methods. I tend to think that there should be some kind of basic language awareness in the core, to make things like translated field names even possible.

For example you may have a Select Box with values in English, but if a German author logs in would you want her to see German values?

I would expect that from a through-and-through multilingual CMS, yes.

When building the I18n part of our company's online shop software, I was inspired a lot by how Gettext works. I've also come to the conclusion that we needed different means of translation. One for static text like headings and descriptive text (using Gettext), one for short, user generated, re-usable text like navigation texts (own development, Gettext inspired, but DB-based), and one for long, singular content blobs like complete articles (DB table with one row per language).

I guess it would be a good start to define what needs to be translatable, so you can think about how to do it individually.

- remie
- 28 Aug 11, 4:08 am
- Comment #51

@DanMan: you are describing a social problem when you state that your authors will forget to translate / update content in different languages. Having multilingual support in the core does not solve this problem.

In addition, having your authors only see the backend in their own language, both navigation and content, will even encourage the content to drift apart.

The benefit I see from supporting multiple languages in the core is that, to some extent, you can detect when content has drifted apart. If the English version is updated, and the French, German or Dutch versions do not follow, you can have notifications about this in your UI. That is as far as you can go.
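
As a sketch of how such a notification could be derived (Python, illustration only): assume each translation of an entry stores the time it was last saved, with `None` for a translation that does not exist yet. The function name and data layout are hypothetical.

```python
from datetime import datetime


def stale_translations(updated_at, source_lang):
    """Return the languages whose translation is older than the source language's.

    updated_at maps a language code to the time that translation was last saved;
    a missing translation is represented by None and always counts as stale.
    """
    source_time = updated_at[source_lang]
    return sorted(
        lang
        for lang, stamp in updated_at.items()
        if lang != source_lang and (stamp is None or stamp < source_time)
    )


article = {
    "en": datetime(2011, 8, 28, 9, 0),
    "fr": datetime(2011, 8, 20, 14, 30),
    "de": datetime(2011, 8, 29, 8, 15),
    "nl": None,
}
needs_attention = stale_translations(article, "en")
```

With this data the French and Dutch versions would be flagged, and the German one would not. A timestamp comparison only shows that a translation is older than its source, not that its content is actually wrong, which fits the point that the rest is a social problem.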

I have no experience with Drupal, but I do have to work with Umbraco frequently and we experience the pain of missing translations on a daily basis. So the multilingual implementation is not perfect, but again, this is mainly due to a social problem: we are so focussed on our main language that we forget to maintain the others. That is something we need to sort out (perhaps we just need to fire some people).

As both Nick and DanMan point out there are multiple levels on which multilingual support can be defined. For me those include

Symphony Backend + extensions
Frontend website - static content
Frontend website - dynamic content

I believe Symphony has already come a long way on the first point. The second and third points are doable if you put some time and effort into them.

For me I think it would be valuable if the Symphony core would provide extension developers the structure they need to make the appropriate UI changes to make multilingual support easier for authors. A solution to implement this has already been discussed in this thread: create meta information tables to hold customizable information about articles. This can be used to link articles and to provide language codes or other meta information required to group content and make a nice backend representation.

You can argue that this approach technically does not mean that Symphony will support multiple languages in the core: it simply gives extension developers the hooks that they need to make fantastic improvements to their own liking. It also allows us to have several different implementations of multilingual support, based on different views. That is exactly what I like about Symphony: it leaves room for multiple ways of doing the same thing, due to its awesome framework.

- DanMan
- 31 Aug 11, 4:03 am
- Comment #52

You can argue that this approach technically does not mean that Symphony will support multiple languages in the core: it simply gives extension developers the hooks that they need to make fantastic improvements to their own liking. It also allows us to have several different implementations of multilingual support, based on different views. That is exactly what I like about Symphony: it leaves room for multiple ways of doing the same thing, due to its awesome framework.

The downside is that more flexibility always comes with more complexity, and I'd argue there's enough of that in Symphony from a developer's POV. Fragmentation has always been a problem for Open Source software. People would rather start forking than discuss. IMHO there's more value in few but good solutions than in lots of mediocre ones.

- remie
- 31 Aug 11, 7:21 am
- Comment #53

IMHO there's more value in few but good solutions than in lots of mediocre ones.

Based on the quality of many of the current extensions, and the active community to support them, I do not think this is fair to Symphony developers.

For instance, the URL Router extension was created by a user who needed a specific implementation. It was done with some haste and was messy and buggy. Because it was considered to be such a valuable addition to Symphony, it was picked up by the community after it was abandoned by the initial user, and it is currently supported by the Symphony team.

In addition, creating the appropriate hooks does not mean that the Symphony team cannot create their implementation of multilingual support. There can always be a supported method, just like the Members extension. However, if I want to stick with the low-level FrontEnd Authentication extension, this is also fine. That's my call as a developer, and that is what is great about Symphony.

//end of preach :)

- DanMan
- 13 Sep 11, 3:41 am
- Comment #54

Another problem to the mix: the translation of section/field names, allowing a completely localized backend, would not be complete without the ability to translate static values for fields also. For example you may have a Select Box with values in English, but if a German author logs in would you want her to see German values?

That's exactly the use case for our 2nd system I've described. For such short terms I abstract an ID (strip everything but numbers and letters) and put that, along with the actual text and the language-ID, into a table. Then you can generate a *.po file for that whenever that table changes and include that into the system at runtime.

To translate terms into other languages, you provide users with a form where they can fill them in.
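
A small sketch of that table-to-*.po step, in Python for illustration rather than the shop software's actual code. The function names are made up, a list of tuples stands in for the database table, and "letters" is assumed to mean ASCII letters only:

```python
import re


def term_id(text):
    """Derive an ID from the source text by keeping only ASCII letters and digits."""
    return re.sub(r"[^0-9A-Za-z]", "", text)


def po_escape(text):
    """Escape backslashes, double quotes and newlines for a .po string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def build_po(rows, language):
    """Render the rows of one language as .po text.

    rows is an iterable of (term_id, language, text) tuples, standing in for
    the database table described above.
    """
    lines = [
        'msgid ""',
        'msgstr ""',
        '"Content-Type: text/plain; charset=UTF-8\\n"',
        "",
    ]
    for tid, lang, text in rows:
        if lang != language:
            continue
        lines.append(f'msgid "{po_escape(tid)}"')
        lines.append(f'msgstr "{po_escape(text)}"')
        lines.append("")
    return "\n".join(lines)


source = "In stock (2-3 days)"
table = [
    (term_id(source), "en", source),
    (term_id(source), "de", "Auf Lager (2-3 Tage)"),
]
german_po = build_po(table, "de")
```

Regenerating the file whenever the table changes, as described above, would then just mean calling `build_po` once per language and writing the result to disk. Note that stripping down to letters and digits can make two different terms collide on the same ID, so a real table would need a uniqueness check.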

- xander_group
- 13 Dec 11, 10:50 pm
- Comment #55

Bump ...

Anything else going on here? I would really appreciate the multilingual stuff going in the core ... The multilingual versions for other extensions begin to add up :)

- nickdunn
- 13 Dec 11, 11:57 pm
- Comment #56

Not actively. The team are working hard on the Symphony 2.3 release for January. After that I hope this is something to put on the agenda.

- vladG
- 09 Mar 12, 7:04 pm
- Comment #57

After re-reading this thread and thinking about the multilingual approach, here's my long 2 cents.

These are good:

Symphony Backend + extensions

Frontend website - static content

Frontend website - dynamic content

1. Symphony Backend + extensions

@todo

Sections' and Fields' labels
navigation groups

First step has been made for Fields: as of Symphony 2.3, there are separate inputs for a Field's label and handle.

Regarding a Section's label and handle, this should be supported by the core as well, even though there already is an extension doing this.

Technically, retrieving the name of a section is as simple as $section->get('name'). But regarding multilingual names, it's a pain in the ***. With the current implementation, the only way for an extension to translate these labels is to post-process a page after it has been created: either by altering the resulting XMLElements or with some fancy JS post-load. Neither is good. => +1 for multilingual in the core.

2. Frontend website - static content

This task is covered independently and differently by 2 extensions:

gettext
- static strings translation;
Frontend Localisation
- static strings translation;
- architectural design to scale well for many different Translations;
- backend UI accessible by Authors to translate the strings;

3. Frontend website - dynamic content

First of all, let's keep this in mind:

What we should do is take inventory of the barriers there are now for multilanguage support, for example, and think of modular solutions to 'make it all work'. Just like there were some slight modifications to get the Members extension working, we should also look at what needs to be done for this matter.

Note A: As others pointed out, multilingual stuff and entries versions requirements overlap pretty much: in the end, it's all about providing different versions of data from one field for each entry, based on some arbitrary meta_info. In the multilingual case, the meta_info could be the lang_code etc.

@kanduvisla offered a technical solution regarding the database structure to this issue. This will have to be backed-up by the core (field data access etc).

My approach to multilingual_* extensions is rather different than multilingual_field: I tried to derive the field classes from other fields so the majority of the inner workings stays the same. (e.g. Multilingual Upload Field extends the regular Upload Field). This will scale pretty well, BUT on this road, I encountered several issues:

_1. The first problem is described by @kanduvisla here.

Abstraction of the generation of tables for fields is one thing that I think needs to be done. Although it would require a lot of extensions to alter their code (however, I wouldn't expect that this is quite easy), it opens new windows for more flexibility for extensions.

This would be great because a derived field could use its parent's DB structure to maintain compatibility.

If the meta_info would exist and would be implemented, there shouldn't be a need for extra language_columns. => + 1 for meta_info.

_2. The second problem is about the way fields access and process data. I had to make this kind of data-forgery™ to keep re-using the parent class code. If fields are to be aware of the meta_info required, this type of masquerade won't be necessary any more.

+1 again for meta_info.

_3. Maintenance overhead, duplicated code, hacks.

How I'd tackle this from a developer's point of view, keeping in mind what @nickdunn said here

no multilang at all, nothing exposing this concept should be visible to users
one field in an entry needs to be translated, but not others
all fields in an entry need translation without exception

#1 Create the meta_info for fields & entries.

It will be good for other purposes than multilingual needs. Existing fields will need updates.

#2 Build the Frontend Language mechanism into the core (see this pull request for a discussion about frontend language).

Currently, Frontend Language is centralised in Frontend Localisation. How much does it cost someone to set only English or French as its main (and only) frontend language on the Preferences page and then forget about it? It won't affect anything visually in the frontend or backend, but will make an enormous difference behind the scenes.
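
To illustrate why a single configured language costs nothing for a monolingual site, here is a hedged Python sketch of how a core-level frontend language could be resolved. The `resolve_language` function and its parameters are hypothetical and are not taken from Frontend Localisation or the pull request:

```python
def resolve_language(requested, configured, main):
    """Pick the language to serve for a request.

    requested  - language code from the URL or browser, or None
    configured - language codes enabled in the (hypothetical) preferences
    main       - the main language, which must be one of the configured ones
    """
    if main not in configured:
        raise ValueError("main language must be one of the configured languages")
    if requested in configured:
        return requested
    # Also accept a regional code such as "fr-CA" when only "fr" is configured.
    if requested and requested.split("-")[0] in configured:
        return requested.split("-")[0]
    return main


# A monolingual site: whatever is requested, the one configured language is served.
single = resolve_language("de", ["en"], "en")

# A bilingual site: a regional request maps onto the configured base language.
multi = resolve_language("fr-CA", ["en", "fr"], "en")
unspecified = resolve_language(None, ["en", "fr"], "en")
```

On the monolingual site every request resolves to the main language, so nothing changes visibly, yet every piece of data now has a known language attached to it.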

Depending on the level of multilingual implementation in the core, there are 2 ways to supply dynamic multilingual content from fields:

#3.a Create multilingual_* versions for fields (Bad for long-term)

Pros:
+ Modifications to existing fields won't require more changes than those from #1.

Cons:
+ multilingual_* versions will be required for every desired field; ( anyone care to make Reflection field multilang? :) I had enough )
+ duplicated code (DRY will not be possible);
+ maintenance overhead. Rewrite of parent field will require rewrite of child;

#3.b Adapt existing fields (Good for long-term)

Pros:
+ Remove cons from #3.a

Cons:
+ More initial work.

In conclusion

Multilang could be a mix of core management and extension handling. Certainly, the core must supply some sort of meta_info and frontend_language awareness. If both #1 and #2 are met, localised Labels for Fields and Sections become a breeze => completely localised Backend.

For dynamic content, the options are #3.a and #3.b.

- DanMan
- 22 Jun 12, 12:43 am
- Comment #58

Now that you're planning to leave PHP 5.2 behind, it may be worth considering to use intl: http://php.net/intl, which is part of the core as of 5.3. It goes far beyond internationalizing text blobs. It does numbering, dates, transliteration and more. Just sayin'
