Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      I know this is a big one, and as far as I know it has so far been intentionally left out of the spec. Right now JCR does not cover i18n at all. There are obviously different strategies, such as namespacing the property names for different locales, storing the translatable content pieces in child nodes, or referencing nodes that hold the translations. Another topic is fallback rules, i.e. it might make sense to show content in English if there is no text in German for the given node/property.

      For PHPCR we have actually created a solution that abstracts this inside PHPCR ODM (a Hibernate-like layer on top of PHPCR):
      http://docs.doctrine-project.org/projects/doctrine-phpcr-odm/en/latest/reference/multilang.html

      But it's obviously not so trivial to then deal with this in the context of search:
      http://doctrine-project.org/jira/browse/PHPCR-84

      At any rate, the question is whether i18n is a topic that could eventually be covered in the core spec.

        Activity

        Peeter Piegaze added a comment -

        Yes, i18n is definitely something we are interested in. In fact, we had a container issue (JSR-333-11) which I closed due to lack of interest. But since there is now interest, we can continue the discussion. I will leave the old issue closed and use this one.

        Could you describe the approach of PHPCR in more detail? What aspects of the API would be changed to accommodate i18n?

        lsmith77 added a comment -

        Just to clarify, it's not something we did on the PHPCR level.
        It's just something we did in the optional data mapper, PHPCR ODM.

        Basically, you can define which properties are translatable and which strategy to use for storing the translations.

        So let's say you want to store a class instance with the properties "title", "body" and "date". The date will be the same for all languages, but the title and body need to be translated. If you use the "attribute" strategy for the locale "en", the data is persisted as:

        • property: phpcr_locale:en-title
        • property: phpcr_locale:en-body
        • property: date

        When using the "child" strategy it would be stored like so:

        • child: phpcr_locale:en
          • property: title
          • property: body
        • property: date
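
The two layouts can be sketched as plain data transformations. This is only an illustration, not the PHPCR ODM implementation: the function names, the dictionary shape used for a node and the sample values are assumptions. Only the "phpcr_locale" prefix and the property names come from the example above.

```python
LOCALE_NS = "phpcr_locale"  # prefix taken from the example above


def to_attribute_layout(fields, translatable, locale):
    """Keep everything on the node; prefix translated properties with the locale."""
    props = {}
    for name, value in fields.items():
        if name in translatable:
            props[f"{LOCALE_NS}:{locale}-{name}"] = value
        else:
            props[name] = value
    return {"properties": props, "children": {}}


def to_child_layout(fields, translatable, locale):
    """Move translated properties into one child node per locale."""
    own = {n: v for n, v in fields.items() if n not in translatable}
    translated = {n: v for n, v in fields.items() if n in translatable}
    return {"properties": own, "children": {f"{LOCALE_NS}:{locale}": translated}}


# Sample values are made up for the illustration.
doc = {"title": "Hello", "body": "World", "date": "2012-06-01"}
attr = to_attribute_layout(doc, {"title", "body"}, "en")
child = to_child_layout(doc, {"title", "body"}, "en")
```

In both cases "date" stays on the node itself; only the location of "title" and "body" differs, which is what later matters for search.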

        When fetching the content, the user either defines the locale they want, or a session default locale is used. If the content in the requested locale doesn't exist, an optional fallback chain can be triggered, which again is configured on the session.
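
A minimal sketch of that lookup order, assuming the fallback chain is a mapping from a locale to an ordered list of substitutes. The function name `resolve_locale` and its parameters are hypothetical and are not part of PHPCR or PHPCR ODM.

```python
def resolve_locale(available, requested=None, session_default="en", fallbacks=None):
    """Return the first locale in the lookup order that has content, or None.

    available:       locales the node actually has translations for
    requested:       explicit locale passed by the caller, if any
    session_default: locale used when the caller passes none
    fallbacks:       optional mapping, e.g. {"de": ["en"]}
    """
    first = requested or session_default
    order = [first] + list((fallbacks or {}).get(first, []))
    for locale in order:
        if locale in available:
            return locale
    return None
```

For example, `resolve_locale({"en"}, "de", fallbacks={"de": ["en"]})` falls back to `"en"`, while the same call without a configured chain returns `None`, so the caller decides how to handle missing content.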

        This works pretty well in that most frontend code can get away with no i18n-specific code. Only in authoring may you need to deal with i18n, by presenting a form for each locale and ensuring that the content provided is persisted for the correct locale.

        Where it starts to fall flat, however, is search. At that point it matters which strategy is used.

        Speaking of the strategies, each approach obviously has advantages and drawbacks:

        • the attribute strategy keeps our node structure "as is"; however, if there are a lot of different translations, reading a node can add a lot of overhead
        • the child strategy scales nicely with more translations; however, querying the child nodes becomes more cumbersome, and reading the child node (especially for fallbacks) adds overhead (especially in a client-server setup).

        Another strategy we have not implemented yet would be to use references to interlink the different locale variations.

        lsmith77 added a comment -

        What I would prefer instead is to simply define the locale for the session, with an optional locale parameter when fetching a node to override the session default. Likewise, when storing a node I would like to be able to provide a locale. There would need to be some way to change the locale of a node in memory. Similar to the "depthHint", we might then also need to define how aggressively the different locale variations should be loaded. I will try to cook up some pseudo API examples over the weekend. But in many ways the following link, describing how things work in the PHPCR ODM, shows what I would like to see for the node API:
        http://docs.doctrine-project.org/projects/doctrine-phpcr-odm/en/latest/reference/multilang.html#full-example

        When querying, I would then want to be able to specify the locales to consider, and each node would be returned only once, regardless of the number of locales matched.
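
The intended query behaviour can be sketched over an in-memory stand-in for the repository. Everything here is an assumption for illustration: the `search` function, the `{path: {locale: text}}` data shape and the substring match are not an existing JCR or PHPCR query API.

```python
def search(nodes, term, locales):
    """Return each matching node path once, however many locales matched.

    nodes:   mapping of node path -> {locale: text}
    term:    case-insensitive substring to look for
    locales: locales to consider for this query
    """
    needle = term.lower()
    hits = []
    for path in sorted(nodes):
        translations = nodes[path]
        # any() stops at the first matching locale, so a path is added at most once.
        if any(needle in translations.get(loc, "").lower() for loc in locales):
            hits.append(path)
    return hits
```

A node whose "en" and "de" texts both match still appears only once in the result, and a node that matches only in a locale outside `locales` is not returned.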

        Peeter Piegaze added a comment -

        One of the reasons we left i18n out of the spec is because we at Day/Adobe have always pushed the locale division to the top of the content tree:

        /content/mysite/en/foo
        /content/mysite/en/bar
        /content/mysite/fr/foo
        /content/mysite/fr/bar

        and therefore management of i18n becomes much more an app-level concern (i.e., app on top of JCR repo platform).
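
With this layout, locale handling reduces to path manipulation in the application. A minimal sketch, assuming the locale is always the first path segment below a site root such as /content/mysite; both helper names are hypothetical.

```python
def split_locale_path(path, site_root="/content/mysite"):
    """Split '/content/mysite/en/foo' into ('en', 'foo')."""
    prefix = site_root.rstrip("/") + "/"
    if not path.startswith(prefix):
        raise ValueError(f"{path!r} is not below {site_root!r}")
    locale, _, rest = path[len(prefix):].partition("/")
    return locale, rest


def sibling_in_locale(path, locale, site_root="/content/mysite"):
    """Map a page path to the same page under another locale branch."""
    _, rest = split_locale_path(path, site_root)
    return f"{site_root.rstrip('/')}/{locale}/{rest}"
```

Here `sibling_in_locale("/content/mysite/en/foo", "fr")` yields `"/content/mysite/fr/foo"`; whether that node exists, and what to show if it does not, is left to the application.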

        In our experience, having the translations down in the fine-grained content led to more trouble than it was worth. We also found that "falling back to English" led to a pretty crappy experience for users, at least in the types of websites our customers deploy.

        For these reasons this has never been something we see as repo-level.

        Now, there may well be contexts in which it does make sense to do things this way, but even then, would this not be better handled as a matter of app best practices or something?

        Peeter Piegaze added a comment -

        Resolving as wontfix


          People

          • Assignee:
            Unassigned
            Reporter:
            lsmith77
          • Votes:
            0
            Watchers:
            0

            Dates

            • Created:
              Updated:
              Resolved: