Specs handling text direction

r12a edited this page Jul 29, 2016 · 25 revisions

Activity Streams

Notes

  • this is JSON
  • basic object sent as single item
  • structured objects
  • only some strings are natural language
  • includes a mechanism for localised text
  • name property has no markup
  • summary and content properties do support HTML markup
  • an object can contain several natural language strings, which may have different base directions
  • either a summaryMap should become several objects, or first-strong (FS) detection will need to be used
  • the name property admits no markup, so control codes must be used there; elsewhere, use markup for inline changes

Current solution proposed by Social WG:

for the name property (no markup allowed) add control codes at start and end of value for overall base direction and inline control codes for inline changes

{
  "@context": {
    "@value": "http://www.w3.org/ns/activitystreams",
    "@language": "he"
  },
  "name": "\u202Bפעילות הבינאום, W3C\u202C",
  "type": "Note",
  "summary": "<span dir=\"rtl\">פעילות הבינאום, W3C</span>"
}

for summary and content properties, use markup with dir attributes to establish overall base direction and inline changes

{
  "@context": {
    "@value": "http://www.w3.org/ns/activitystreams",
    "@language": "he"
  },
  "name": "\u202Bפעילות הבינאום, W3C\u202C",
  "type": "Note",
  "summaryMap": {
    "he": "<span dir=\"rtl\">פעילות הבינאום, W3C</span>",
    "en": "'<span dir=\"rtl\">نشاط التدويل, W3C</span>' is how you say 'i18n Activity, W3C' in Arabic.",
    "ar": "<span dir=\"rtl\">نشاط التدويل، W3C</span>"
  }
}
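
The two conventions above can be sketched in code. This is only an illustration under stated assumptions, not part of the Social WG proposal: the helper names `wrap_plain` and `wrap_markup` are hypothetical, and the use of embedding controls follows the `\u202B` ... `\u202C` pattern shown in the name values.

```python
import json

# Unicode embedding controls, matching the \u202B ... \u202C pattern above.
RLE = "\u202B"  # RIGHT-TO-LEFT EMBEDDING
LRE = "\u202A"  # LEFT-TO-RIGHT EMBEDDING
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING


def wrap_plain(text, direction):
    """Wrap a markup-free string (e.g. name) in embedding control codes."""
    opener = {"rtl": RLE, "ltr": LRE}[direction]
    return opener + text + PDF


def wrap_markup(html, direction):
    """Wrap an HTML string (e.g. summary, content) in a span with a dir attribute."""
    if direction not in ("rtl", "ltr"):
        raise ValueError(direction)
    return '<span dir="{}">{}</span>'.format(direction, html)


text = "פעילות הבינאום, W3C"
note = {
    "type": "Note",
    "name": wrap_plain(text, "rtl"),       # control codes, no markup
    "summary": wrap_markup(text, "rtl"),   # markup, no control codes
}

# A JSON serialiser escapes the quotes around the dir value as \" automatically.
print(json.dumps(note, ensure_ascii=False, indent=2))
```

Note that the producer has to know which of the two helpers applies to each property, which is the source of several of the problems listed next.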

Problems

  • Arabic, Hebrew, Divehi, Urdu, Persian, etc. users can't be expected to add control characters or markup to set the default direction of every natural language string
  • if name has multiple lines, or summary/content have multiple paragraphs, each line/paragraph needs to be annotated with directional information
  • users are expected to use different approaches for name vs summary/content, which is confusing (and must be done correctly, e.g. no control codes before <p>, no control codes inside inline markup, etc.)
  • all the usual problems with control codes (e.g. difficult to use, may not be available on the keyboard, even harder to edit, etc.)

Alternative suggestions

  • specify that the default is LTR
  • use one property per object to establish the base direction for RTL text
  • user only needs to revert to control codes/markup for exceptional text
  • if the property value is auto, first-strong (FS) analysis is applied, which may reduce the need for user intervention even further
  • setting a property may also be more helpful when dealing with input from HTML forms, etc., where the direction information is carried separately from the text (dirname)

{
  "@context": {
    "@value": "http://www.w3.org/ns/activitystreams",
    "@language": "he"
  },
  "direction": "rtl",
  "name": "פעילות הבינאום, W3C",
  "type": "Note",
  "summary": "פעילות הבינאום, W3C"
}

{
  "@context": {
    "@value": "http://www.w3.org/ns/activitystreams",
    "@language": "he"
  },
  "direction": "rtl",
  "nameMap": {
    "he": "פעילות הבינאום, W3C",
    "en": "\u2066'\u2067نشاط التدويل, W3C\u2069' is how you say 'i18n Activity, W3C' in Arabic.\u2069",
    "es": "Actividad de internacionalización, W3C",
    "ar": "نشاط التدويل، W3C"
  },
  "type": "Note",
  "summaryMap": {
    "he": "פעילות הבינאום, W3C",
    "en": "<span dir=\"ltr\">'<span dir=\"rtl\">نشاط التدويل, W3C</span>' is how you say 'i18n Activity, W3C' in Arabic.</span>",
    "es": "Actividad de internacionalización, W3C",
    "ar": "نشاط التدويل، W3C"
  }
}

{
  "@context": {
    "@value": "http://www.w3.org/ns/activitystreams",
    "@language": "he"
  },
  "direction": "auto",
  "nameMap": {
    "he": "פעילות הבינאום, W3C",
    "en": "\u200E'\u2067نشاط التدويل, W3C\u2069' is how you say 'i18n Activity, W3C' in Arabic.",
    "es": "Actividad de internacionalización, W3C",
    "ar": "نشاط التدويل، W3C"
  },
  "type": "Note",
  "summaryMap": {
    "he": "פעילות הבינאום, W3C",
    "en": "&lrm;'<span dir=\"rtl\">نشاط التدويل, W3C</span>' is how you say 'i18n Activity, W3C' in Arabic.",
    "es": "Actividad de internacionalización, W3C",
    "ar": "نشاط التدويل، W3C"
  }
}
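
How a consumer might resolve the base direction under these suggestions can be sketched as follows. This is a minimal sketch under assumptions: the function names are hypothetical, the `direction` property is the one suggested above and not an existing Activity Streams term, and the first-strong scan skips isolated runs in the way the Unicode Bidirectional Algorithm does when it determines paragraph direction.

```python
import unicodedata

ISOLATE_INITIATORS = {"\u2066", "\u2067", "\u2068"}  # LRI, RLI, FSI
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE


def first_strong(text, default="ltr"):
    """Return the direction of the first strong character outside isolates."""
    depth = 0
    for ch in text:
        if ch in ISOLATE_INITIATORS:
            depth += 1
        elif ch == PDI:
            depth = max(0, depth - 1)
        elif depth == 0:
            bidi = unicodedata.bidirectional(ch)
            if bidi == "L":
                return "ltr"
            if bidi in ("R", "AL"):
                return "rtl"
    return default  # no strong character found


def base_direction(obj, text):
    """Resolve the base direction of one natural language string in obj."""
    declared = obj.get("direction", "ltr")  # suggested default is LTR
    if declared == "auto":
        return first_strong(text)
    return declared


obj = {
    "direction": "auto",
    "nameMap": {
        "he": "פעילות הבינאום, W3C",
        "en": "\u200E'\u2067نشاط التدويل, W3C\u2069' is how you say 'i18n Activity, W3C' in Arabic.",
    },
}
directions = {lang: base_direction(obj, s) for lang, s in obj["nameMap"].items()}
print(directions)
```

With auto, the Hebrew string resolves to rtl from its first letter, and the English string resolves to ltr because U+200E LEFT-TO-RIGHT MARK is itself a strong left-to-right character.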

Web Annotation

Notes

  • this is JSON
  • basic object sent as single item
  • structured objects
  • each structure has only one natural language string
  • text property allows markup
  • direction may be specified for a string body, or for a target body whose text is to be found elsewhere

Current solution

  • textDirection property indicates base direction
  • textDirection values can be rtl, ltr, or auto

{
  "@context": "http://www.w3.org/ns/anno.jsonld",
  "id": "http://example.org/anno2",
  "type": "Annotation",
  "body": {
    "id": "http://example.org/analysis1.mp3",
    "format": "audio/mpeg",
    "language": "fr"
  },
  "target": {
    "id": "http://example.gov/patent1.pdf",
    "format": "application/pdf",
    "language": ["en", "ar"],
    "textDirection": "ltr",
    "processingLanguage": "en"
  }
}

{
  "@context": "http://www.w3.org/ns/anno.jsonld",
  "id": "http://example.org/anno5",
  "type":"Annotation",
  "body": {
    "type" : "TextualBody",
    "text" : "<p>פעילות הבינאום, W3C</p>",
    "format" : "text/html",
    "language" : "he",
    "textDirection" : "rtl"
  },
  "target": "http://example.org/photo1"
}
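
A consumer of such an annotation could carry the textDirection value over to the markup that displays the body. The sketch below is illustrative only: `render_textual_body` is a hypothetical helper, and it assumes the body uses the text, format, language and textDirection properties as described in the notes above.

```python
import html

ALLOWED_DIRECTIONS = {"ltr", "rtl", "auto"}


def render_textual_body(body):
    """Wrap a TextualBody's text in a div carrying lang and dir attributes."""
    attrs = ""
    language = body.get("language")
    if isinstance(language, str):
        attrs += ' lang="{}"'.format(html.escape(language, quote=True))
    direction = body.get("textDirection")
    if direction is not None:
        if direction not in ALLOWED_DIRECTIONS:
            raise ValueError("unexpected textDirection: {!r}".format(direction))
        attrs += ' dir="{}"'.format(direction)
    text = body["text"]
    if body.get("format") != "text/html":
        text = html.escape(text)  # plain text is escaped before embedding
    return "<div{}>{}</div>".format(attrs, text)


body = {
    "type": "TextualBody",
    "text": "<p>פעילות הבינאום, W3C</p>",
    "format": "text/html",
    "language": "he",
    "textDirection": "rtl",
}
rendered = render_textual_body(body)
print(rendered)
```

Because each structure holds only one natural language string, a single dir attribute on the wrapper is enough here; inline changes inside the text would still rely on markup.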