
Yi Documentation Contribution Guide

GloriaLee01 edited this page Feb 23, 2024 · 55 revisions

Doc Repo

Yi Repository Structure
│
├── README // Readme platform headers of Hugging Face, ModelScope, WiseModel
│
├── VL // Readme and images of Yi vision language model
│
├── assets // Images used in readme of Yi large language model
│
├── docs // Documentation files of Yi large language model
│
├── README.md // English readme
│
└── README_CN.md // Chinese readme

Using full web addresses (absolute paths) in documents

Why this is important:

  • Universal compatibility: our documents are hosted on GitHub and are also shared with platforms like Hugging Face, ModelScope, and WiseModel. These platforms display the content directly from GitHub.

  • Avoid broken links: using shorter web addresses (relative paths) can lead to broken links on platforms other than GitHub. To ensure all links and images work correctly everywhere, we use full web addresses (absolute paths).

Steps for using full web addresses:

  • Locate your image: go to the GitHub page where your image is located.

  • Copy the image address: right-click the image and choose "Copy Image Address". Ensure that the URL ends with raw=true, for example, https://github.com/01-ai/Yi/blob/main/assets/img/yi_34b_chat_web_demo.gif?raw=true. This is how GitHub handles direct links to images.

  • Use the full address: in your document, use this full address for images and links. This ensures they display correctly on every platform where the document is available.

For more information about full web addresses, see How to get "raw" links from GitHub (getting the raw image link).
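
The steps above can also be sketched in code. The following Python snippet is a minimal illustration, not part of the Yi tooling: the function name `to_absolute_image_url` and the constant `REPO_BLOB_BASE` are hypothetical, and the base address assumes the `01-ai/Yi` repository and the `main` branch that appear in the example URL above.

```python
from urllib.parse import quote

# Assumption: repository and branch are taken from the example URL in this guide.
REPO_BLOB_BASE = "https://github.com/01-ai/Yi/blob/main"


def to_absolute_image_url(relative_path: str, base: str = REPO_BLOB_BASE) -> str:
    """Turn a repo-relative image path into a full GitHub address ending in raw=true."""
    cleaned = relative_path.strip()
    # Drop leading "./" segments and slashes so the path joins cleanly with the base.
    while cleaned.startswith("./"):
        cleaned = cleaned[2:]
    cleaned = cleaned.lstrip("/")
    # Percent-encode characters such as spaces, but keep the "/" path separators.
    return f"{base}/{quote(cleaned)}?raw=true"


print(to_absolute_image_url("assets/img/yi_34b_chat_web_demo.gif"))
```

If your image lives in a different repository or branch, pass a different `base` value.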

Limitations of Managing Emoji Usage in Hugging Face Website Document Titles

  • Limitation 1: Most users are unaware of Hugging Face's built-in Table of Contents (TOC) feature, which limits its usefulness.

  • Limitation 2: Hugging Face website URLs may be of limited use because Hugging Face does not support real-time synchronization with the GitHub readme document.

  • Limitation 3: Removing emojis from document titles may reduce document readability.

Yi Writing Style Guide

Language and grammar

Abbreviations

General guidelines

Avoid abbreviations when the following conditions are true:

  • Their meanings are unclear.
  • They make the information more difficult to understand.
  • They occur infrequently in the information, for example, only two or three times in a large amount of content.
  • They are derived from Latin.
  • They abbreviate names or entities that are owned by other companies, and the owning company does not use those abbreviations.
  • They might be a registered trademark for a different product or entity.

Latin abbreviations

Do not use Latin abbreviations; use their English equivalents instead. Latin abbreviations are sometimes misunderstood.

Do Not use Latin abbreviation ❌ Use English equivalent ✅
e.g. ❌ for example ✅
etc. ❌ and so on ✅ (Use and so on when you list a clear sequence of elements, such as “1, 2, 3, and so on” or “Monday, Tuesday, Wednesday, and so on.” Otherwise, rewrite the sentence to replace etc. with something more descriptive, such as “and other output.”)
i.e. ❌ that is ✅

Active Voice

Whenever possible, use active voice instead of passive voice.

  • In active voice, the subject of the sentence is the doer of the action. This is especially important when describing an action that should specify who or what is doing the action.
  • In passive voice, it's easy to neglect to indicate who or what is performing a particular action. In this kind of construction, it's often hard for readers to figure out who's supposed to do something (such as the reader, the computer, the server, an end user, or a visitor to a web page).
Example:
  • Send a query to the service. The server sends an acknowledgment. ✅
  • The service is queried, and an acknowledgment is sent. ❌

Tense

  • Write in the simple present tense as much as possible if you are covering facts that were, are, and forever shall be true.
Example:
  • When you open the latch, the panel slides forward. ✅
  • When you open the latch, the panel will slide forward. ❌
  • Use past or future tense only when you cannot use present tense or it does not make sense to use the present tense.
Example:
  • If you select New in the previous window, the current window displays the recommended default values. ✅
  • If you select New in the previous window, the current window will display the recommended default values. ❌

Pronouns

Personal pronouns

In technical information, follow these general guidelines for personal pronouns:

  • Use the second-person pronoun (you, your, yours, or yourself) as much as possible. The subject of an imperative sentence is understood to be you.
  • Avoid the first-person pronouns I and we, except in these situations:
    • In the question portion of frequently asked questions (FAQs)
    • In articles, white papers, or documents that have listed authors and in which the authors describe their own actions or opinions
  • Avoid third-person pronouns that are gender-specific.
Example:
  • You can now move your cursor in four directions. ✅
  • We can now move his cursor in four directions. ❌
  • The user can now move his cursor in four directions. ❌

Gender-neutral pronouns

Many terms and titles, such as customer engineer, programmer, teacher, or administrative assistant, do not apply exclusively to one gender. Therefore, to make your writing inclusive, avoid using gender-specific pronouns. Do not use he, him, himself, or his unless the person you refer to is male, and do not use she, her, herself, or hers unless the person you refer to is female.

For more information, see Microsoft bias-free writing guidelines and Google inclusive doc writing guide.

Contractions

  • Use simple and common contractions to keep sentences from feeling out-of-touch, robotic, or overly formal. ✅
Example:
  • What's ✅
  • We'll ✅
  • You'll ✅
  • You're ✅
  • You've ✅
  • We're ✅
  • They're ✅
  • Doesn't ✅
  • Didn't ✅
  • Don’t ✅
  • Isn't ✅
  • Aren't ✅
  • Can't ✅
  • Avoid contractions formed from nouns and verbs.
Example:
  • The browser is fast, simple, and secure. ✅
  • The browser's fast, simple, and secure. ❌
  • Avoid double contractions (contractions that contain two contracted words). ❌
Example:
mightn't've ❌ might not have ✅
mustn't've ❌ must not have ✅
wouldn't've ❌ would not have ✅
shouldn't've ❌ should not have ✅
  • Avoid unusual contractions.
Example:
  • there’d ❌
  • It’ll ❌
  • they’d ❌

Contractions

  • Do not use contractions in technical information.
    • Contractions can cause difficulty for translation and for users whose primary language is not English. For example, it’s can be interpreted as it is or it has.
    • Contractions can also contribute to an overly informal tone.
    • The contraction what’s in the phrase what’s new is acceptable in content that presents new and changed items between versions of products or information.

Capitalization

In general, use a lowercase style in text and use sentence-style capitalization for headings.

Capitalization styles

  • Items such as headings, captions, labels, or interface elements generally follow sentence-style capitalization.
  • Sentence-style capitalization: This style is predominantly lowercase; capitalize only the initial letter of the first word in the text and other words that require capitalization, such as proper nouns.
Examples of sentence-style capitalization ✅
  • Business models
  • Creating Boolean expressions
  • Planning network architectures
  • Properties and settings for printing
  • Requirements for Linux and UNIX operating systems

Capitalization in general text

  • Capitalize proper nouns correctly.

Examples of proper nouns include the names of specific people, places, companies, languages, protocols, and products.

Example:
Pulsar ✅ pulsar ❌
BookKeeper ✅ Bookkeeper ❌
HTTP ✅ http ❌
API ✅ api ❌
NAR ✅ nar ❌
URI ✅ uri ❌
TTL ✅ ttl ❌
ID ✅ Id ❌

Miscellaneous

a/an

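  • Use a before a word that begins with a consonant sound.
Example: use a with the following terms, which begin with a consonant sound.
  • .mbtest file (D, as in dot)
  • one (W, as in won)
  • ROM (R, as in romp)
  • unit (Y, as in you)
  • x4 (B, as in by four)
  • Use an before a word that begins with a vowel sound.
Example: use an with the following terms, which begin with a vowel sound.
  • HTTP (A, as in ache)
  • LU (E, as in elf)
  • MVS (E, as in empty)
  • RPQ (A, as in are)
  • SOA, SQL (E, as in estimate)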

that/which (restrictive clause)

In American English, "that" is used to start a restrictive clause. "Which" is used to start a non-restrictive clause (put a comma before "which"). When the information is essential to the meaning (a restrictive clause), use "that".

Example:
  • Set the “service_url” to the value which is obtained through Step 1. ❌
  • Set the “service_url” to the value that is obtained through Step 1. ✅

Punctuation

Commas

Commas between items in a series

  • Use commas to separate items in a series of three or more. Use a comma before the conjunction that precedes the final item.
  • Present the items in a meaningful order, such as alphabetically, numerically, or chronologically.
Example ✅
  • A message window describes an error, explains how to correct it, and provides the controls to correct it.

Exclamation points

  • Do not punctuate sentences with exclamation points because their tone can be interpreted negatively, for example, as aggressive, condescending, or overly informal.

  • Convey urgency or emphasis with the appropriate words, not with exclamation points. To call attention to important hints, tips, guidance, restrictions, or advice that might be overlooked, consider using a note that has a meaningful label.

Example:
You must complete this step first. ✅ Complete this step first! ❌
Important: You must change the default settings. ✅ You must change the default settings! ❌
You completed the first lesson in the tutorial. ✅ You completed the first lesson in the tutorial! ❌

Period

Use a period (rather than a colon) at the end of a sentence.

Formatting and organization

Lists

Ordered lists

Use an ordered (numbered) list when the sequence is significant, for example, when writing procedures or ranking items. If the items in a list represent rules or other types of information that you want to refer to, you can refer to them by number. For example, in a list of rules, the item numbered 1 is implied to be Rule 1.

Example:

Write comment statements according to the following rules:

  1. Use an asterisk in the first column.
  2. Do not exceed 80 characters.
  3. Do not place a comment statement between an instruction and its continuation line.

Tables

To keep tables accessible and scannable, do not leave any table cells empty. If there is no otherwise meaningful value for a cell, consider entering N/A (for “not applicable” or “not available”) or None.

Numbers and measurements

Use a space between a number and the abbreviation of a unit of measurement, regardless of whether the value is used as a noun or as an adjective.

Example ✅
  • The unit weighs 48.2 kg (106.3 lb).
  • a 12 m (39 ft) cable
  • an 8 ft clearance
  • a 1 GB memory module

Multiplier prefixes

  • In combination with a prefix, use B (uppercase) to mean bytes, and use b (lowercase) to mean bits.
  • Prefixes are never used alone or with units that you spell out.
Examples (incorrect) ❌ Examples (correct) ✅
16 K ❌ 16 KB ✅
16 K bytes ❌ 16 KB ✅
  • Use the abbreviation KB or Kb, not the spelled-out form. If your audience needs to be told the spelled-out form of other abbreviations, include the spelled-out form parenthetically on first use only, and use only the abbreviation in all other occurrences.
Examples (incorrect) ❌: 64 kilobytes; 4 gigabits; bytes/s
Examples (correct) ✅: 64 KB; 4 Gb (gigabits); 24 kbps (kilobits per second); byte per second, or B/s
For more information, see https://en.wikipedia.org/wiki/Bit_rate
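
If you generate sizes programmatically for documentation, the spacing and prefix rules above can be applied in code. The following Python snippet is only a sketch: the function name `format_bytes` is hypothetical, and it assumes that 1 KB equals 1024 bytes, which this guide does not specify.

```python
# Assumption: 1 KB = 1024 bytes. This guide does not state which convention to use.
UNITS = ["B", "KB", "MB", "GB", "TB"]


def format_bytes(num_bytes: int) -> str:
    """Format a byte count with a space before the unit and an uppercase B for bytes."""
    value = float(num_bytes)
    unit_index = 0
    # Move to the next larger unit while the value is still 1024 or more.
    while value >= 1024 and unit_index < len(UNITS) - 1:
        value /= 1024
        unit_index += 1
    # Keep one decimal place at most, and drop a trailing ".0".
    number = f"{value:.1f}".rstrip("0").rstrip(".")
    return f"{number} {UNITS[unit_index]}"


print(format_bytes(16 * 1024))   # 16 KB
print(format_bytes(1024 ** 3))   # 1 GB
```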

Dates

Be consistent in the way that you write dates and express them in a way that is understandable internationally.

To avoid confusion about the meaning of a date, do not use an all-numeric representation in text. In the United States, 12/1/07 means 1 December 2007; in many other countries, it means 12 January 2007.

  • Use one or two digits for the day first, then spell out the name of the month, followed by the year using all four digits. Use one space between the day and the month and between the month and the year.
  • Do not use a forward slash (/) or hyphen (-) in dates, nor the abbreviations st, nd, rd, and th.
  • If you want to include the day of the week before a date, use a comma between the day of the week and the date.
Examples (incorrect) ❌ Examples (correct) ✅
10/1/12 or 1-10-12 ❌ 1 October 2012 ✅
The 1st of August, 2010 or August 1st, 2010 ❌ 1 August 2010 ✅
December 2, December 2nd, or 2nd December ❌ 2 December ✅
Monday May 24th ❌ Monday, 24 May ✅
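
When dates in a document come from a script, the date rules above can be applied with a small helper. The following Python snippet is a minimal sketch: the function name `format_date` and its `with_weekday` parameter are hypothetical. Month and weekday names are spelled out in the code so that the output does not depend on the system locale.

```python
from datetime import date

MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]
# date.weekday() returns 0 for Monday through 6 for Sunday.
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def format_date(d: date, with_weekday: bool = False) -> str:
    """Format a date as day, spelled-out month, and four-digit year."""
    text = f"{d.day} {MONTHS[d.month - 1]} {d.year}"
    if with_weekday:
        # Use a comma between the day of the week and the date.
        text = f"{WEEKDAYS[d.weekday()]}, {text}"
    return text


print(format_date(date(2012, 10, 1)))                     # 1 October 2012
print(format_date(date(2010, 5, 24), with_weekday=True))  # Monday, 24 May 2010
```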

Computer interface

Graphical user interface elements

Location of interface elements

When you refer to the location of an element in an interface, use the following terms:

  • For nouns, use the terms upper left, upper right, lower left, and lower right. Do not use left hand or right hand.
  • For adjectives, use the hyphenated terms upper-left, upper-right, lower-left, and lower-right. Do not use left-hand or right-hand.

Menu instructions and navigation

  • Use bold for menus, menu items, separator symbols, items in the navigation tree, and the greater than symbol. If a menu contains variable menu items, use lowercase bold italic.
Example:
  • Click File > Tools > User preferences > required preference. ✅
  • Expand Performance > Advisor types > usertype > Diagnostics. ✅

Miscellaneous

Placeholder text

You might want to provide a command or configuration that uses specific values. In these cases, use < and > to call out where a reader must replace text with their own value.

Example:

cp <your_source_directory> <your_destination_directory>
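
To show how a reader replaces placeholder text, the following Python snippet fills the placeholders in the command above. It is an illustrative sketch only: the function name `fill_placeholders` and the sample values `docs` and `backup` are hypothetical, and the pattern assumes that placeholder names contain only letters, digits, and underscores.

```python
import re
from typing import Mapping

# Assumption: a placeholder is a name made of letters, digits, and underscores inside < and >.
PLACEHOLDER = re.compile(r"<([A-Za-z_][A-Za-z0-9_]*)>")


def fill_placeholders(template: str, values: Mapping[str, str]) -> str:
    """Replace each <name> placeholder in the template with the reader's own value."""

    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in values:
            # Fail loudly so that no placeholder is left in the final command.
            raise KeyError(f"No value supplied for placeholder <{name}>")
        return values[name]

    return PLACEHOLDER.sub(substitute, template)


command = fill_placeholders(
    "cp <your_source_directory> <your_destination_directory>",
    {"your_source_directory": "docs", "your_destination_directory": "backup"},
)
print(command)  # cp docs backup
```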

Writing for diverse audiences

Style

  • Do not use please and thank you in technical information. Technical information requires an authoritative tone. Terms of politeness are superfluous, convey the wrong tone for technical material, and are not regarded the same way in all cultures. In marketing information, terms of politeness might be appropriate. Use the imperative mood in the first sentence of each step.
Example:
  • Click Install program. ✅
  • Please click Install program. ❌

Accessibility

  • Don’t convey information with color alone. For example, use both color and underlined text for links, and use pattern and color to differentiate information in charts and graphs.
  • Don’t hard-code colors. They can become illegible in high-contrast themes.
  • Provide clear descriptions that don’t require pictures, or provide both. Make sure the reader can get the whole story from either the picture or the written description.
  • Spell out words like and, plus, and about. Screen readers can misread text that uses special characters like the plus sign (+) and tilde (~).
  • Don’t force line breaks (also known as hard returns) within sentences and paragraphs. They may not work well in resized windows or with enlarged text.
  • Use SVG instead of PNG if available. SVGs stay sharp when you zoom in on the image. For more benefits of using SVG rather than PNG, refer to here.
  • Use ALT text (less than 100 characters) to describe an image as concisely as possible. Image alternatives add valuable information for screen reader users who are blind or have low vision.

Glossaries

A

Admonition

You can use Tip, Note, Caution, and more admonitions.

  • Tip: provides helpful hints for completing a task. Do not use a tip to give essential information.
  • Note: contains additional information to emphasize or supplement important points of the main text.
  • Caution: indicates deprecated features or provides a warning about procedures that have the potential for data loss.

Usage rules

  • Use admonitions to call attention to information.
  • Use them sparingly, and never have an alert box immediately follow another alert box.
  • Too many notes can make topics difficult to scan. Instead of adding a note:
    • Re-write the sentence as part of a paragraph.
    • Put the information into its own paragraph.
    • Put the content under a new subheading.

L

Later

When referring to a range of version numbers:

  • Do not use above / up version ❌ Reason: version numbers should be essentially viewed as timeline markers rather than quantity markers.
  • Use earlier / later version ✅
  • “higher” version is not wrong, and it is used in some product manuals. However, it is recommended to use “later” in Pulsar documentation to standardize on one expression for greater consistency in presenting the information to users.❗️

R

Recommend

Avoid if possible. A phrase such as “We recommend that you take the following action” could create a potential marketing or legal problem.

Grammar

Use prepositions with relative pronouns

  • The name of the topic that the message is published to. ✅
  • The name of the topic to which the message is published.

Reference

  • Pulsar Documentation Writing Style Guide
  • IBM Style Guide