Enhancement: add / fix turndown rules, improve CLI #43

Merged: 5 commits merged on Jul 28, 2022
34 changes: 21 additions & 13 deletions README.md
@@ -16,30 +16,38 @@ $ npm i -g wp-handbook-converter
$ yarn global add wp-handbook-converter
```

If you want to run the command without installing the package, use this: `$ npx wp-handbook-converter <team>`
If you want to run the command without installing the package, use this: `$ npx wp-handbook-converter`

## `wp-handbook-converter` command

```bash
$ wp-handbook-converter <team>
```

### options
## `wp-handbook-converter` command options

- `-t, --team` &lt;team&gt; Specify team name.
- `-b, --handbook` &lt;handbook&gt; Specify handbook name. (Default "handbook")
- `-s, --sub-domain` &lt;sub-domain&gt; Specify subdomain name. e.g. "developer" for developer.w.org, "w.org" for w.org (Default "make")
- `-o, --output-dir` &lt;output-dir&gt; Specify the directory to save files (default `en/`)
- `-r, --regenerate` &lt;regenerate&gt; If this option is supplied, the directory specified as the output directory is deleted first, and all files in it are then regenerated

### Example
Review comment (Contributor Author): Possibly because some handbook endpoints have changed, some handbooks could no longer be fetched with the commands in the Example section. In addition to fixing those commands, I added more command examples.


Get Meetup Handbook
Get Core Contributor Handbook

```bash
$ wp-handbook-converter community --handbook meetup-handbook
$ wp-handbook-converter --team core
```

Get theme developer Handbook
Get Meetup Organizer Handbook

```bash
$ wp-handbook-converter --team community --handbook meetup-handbook
```

```bash
$ wp-handbook-converter '' --handbook theme-handbook --sub-domain developer
Get theme Handbook

```bash
$ wp-handbook-converter --handbook theme-handbook --sub-domain developer
```

Get plugin Handbook

```bash
$ wp-handbook-converter --handbook plugin-handbook --sub-domain developer
```
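
How these flags combine into a REST endpoint can be sketched as follows. This is a minimal illustration modeled on the URL template changed in `cli.js`, not the package's own code: the helper name `buildEndpoint` is hypothetical, and joining the subdomain to the domain with a dot is an assumption, because that part of the source is not shown in this diff.

```javascript
// Hypothetical helper, not part of the package: mirrors the URL template in cli.js.
function buildEndpoint({ team, handbook, subDomain } = {}) {
  // An omitted team yields no path segment; otherwise add a trailing slash.
  const teamPath = team ? `${team}/` : ''
  const handbookName = handbook || 'handbook'
  // 'w.org' means the bare wordpress.org domain; the default subdomain is 'make'.
  // Assumption: the subdomain is joined to the domain with a dot.
  const host = subDomain === 'w.org' ? '' : `${subDomain || 'make'}.`
  return `https://${host}wordpress.org/${teamPath}wp-json/wp/v2/${handbookName}/`
}

console.log(buildEndpoint({ team: 'core' }))
console.log(buildEndpoint({ handbook: 'theme-handbook', subDomain: 'developer' }))
```

Because the team segment carries its own trailing slash, leaving out `--team` no longer produces a double slash before `wp-json`, which is what the old `${team}/wp-json` template did when the team was empty.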
143 changes: 126 additions & 17 deletions cli.js
@@ -8,23 +8,132 @@ const { program } = require('commander')
const mkdirp = require('mkdirp')
const del = require('del')
const WPAPI = require('wpapi')
const turndown = require('turndown')
const turndownService = new turndown({
const TurndownService = require('turndown')
const { tables } = require('turndown-plugin-gfm')

// Languages that can be specified in the code markdown
const codeLanguages = {
css: 'css',
bash: 'bash',
php: 'php',
yaml: 'yaml',
xml: 'xml',
jscript: 'javascript',
}

// Rules that remove escapes in code blocks
const unEscapes = [
[/\\\\/g, '\\'],
[/\\\*/g, '*'],
[/\\-/g, '-'],
// Match an escaped plus ("\+ ") at the start of the content
[/^\\\+ /g, '+ '],
[/\\=/g, '='],
[/\\`/g, '`'],
[/\\~~~/g, '~~~'],
[/\\\[/g, '['],
[/\\\]/g, ']'],
[/\\>/g, '>'],
[/\\_/g, '_'],
[/\&quot;/g, '"'],
[/\&lt;/g, '<'],
[/\&gt;/g, '>'],
]
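
As a rough illustration of how such a table of pattern/replacement pairs is applied, the sketch below runs a subset of the pairs over a sample string. The names `sampleUnEscapes` and `unescapeCode` are hypothetical and exist only for this example.

```javascript
// A subset of the unEscapes pairs from the diff: [pattern, replacement]
const sampleUnEscapes = [
  [/\\\\/g, '\\'],
  [/\\\*/g, '*'],
  [/\\_/g, '_'],
  [/&lt;/g, '<'],
  [/&gt;/g, '>'],
]

// Apply every pair in order, feeding each result into the next replacement
const unescapeCode = (text) =>
  sampleUnEscapes.reduce(
    (accumulator, [pattern, replacement]) =>
      accumulator.replace(pattern, replacement),
    text
  )

console.log(unescapeCode('\\_\\_init\\_\\_ = a \\* b'))
console.log(unescapeCode('&lt;b&gt;bold&lt;/b&gt;'))
```

The order of the pairs matters: the double-backslash rule runs first, so later rules see the already reduced text.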

const turndownService = new TurndownService({
headingStyle: 'atx',
codeBlockStyle: 'fenced',
emDelimiter: '*',
})
turndownService.use(tables)

// Remove Glossary
turndownService.addRule('glossary', {
// Review comment (Contributor Author): Replacing the glossary removal with a
// turndown addRule, instead of string replacement, made the removal work correctly.

filter: (node) => {
const classList = node.getAttribute('class')
if (classList) {
return classList === 'glossary-item-hidden-content'
}
return false
},
replacement: () => {
return ''
},
})

// Remove code trigger anchor
turndownService.addRule('code-trigger-anchor', {
// Review comment (Contributor Author): Removes the "#" links used to expand and
// collapse code blocks (Expand full source code / Collapse full source code).

filter: (node) => {
const classList = node.getAttribute('class')
if (classList) {
return (
classList.includes('show-complete-source') ||
classList.includes(`less-complete-source`)
)
}
return false
},
replacement: () => {
return ''
},
})
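
The two filters above match class names differently: the glossary rule compares the whole `class` attribute, while the code-trigger rule checks for a substring. The sketch below restates both predicates against stub objects to show the difference. The `stubNode` helper is hypothetical and models only `getAttribute`; real turndown filters receive DOM nodes.

```javascript
// Stub standing in for a DOM node; only getAttribute is modeled (assumption)
const stubNode = (className) => ({
  getAttribute: (name) => (name === 'class' ? className : null),
})

// Same predicate as the 'glossary' rule: exact match on the whole class attribute
const isGlossary = (node) => {
  const classList = node.getAttribute('class')
  return classList ? classList === 'glossary-item-hidden-content' : false
}

// Same predicate as the 'code-trigger-anchor' rule: substring match
const isCodeTrigger = (node) => {
  const classList = node.getAttribute('class')
  return classList
    ? classList.includes('show-complete-source') ||
        classList.includes('less-complete-source')
    : false
}

console.log(isGlossary(stubNode('glossary-item-hidden-content')))
console.log(isGlossary(stubNode('glossary-item-hidden-content extra')))
console.log(isCodeTrigger(stubNode('button show-complete-source')))
```

With an exact comparison, an element that carries any additional class is not treated as glossary content, so the rule relies on the handbook markup using that single class.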

// Transform dt tag to strong tag
turndownService.addRule('dt-to-strong', {
filter: ['dt'],
replacement: (content, node, options) => {
return options.strongDelimiter + content + options.strongDelimiter
},
})

// Transform pre code block to code markdown
turndownService.addRule('precode to code', {
// Review comment (Contributor Author):
// This rule is probably the most involved one. It works as follows:
//   • The filter detects code blocks (pre tags, whose class name always contains "brush:")
//   • Detect the code language contained in the class name (to append after the opening ```)
//   • Characters such as _ and * get escaped, so unescape them
//   • Unneeded p and br tags sometimes slip in, so remove them

filter: (node) => {
const classList = node.getAttribute('class')
const isCode =
node.nodeName === 'PRE' && classList && classList.includes('brush:')
return isCode
},
replacement: (content, node) => {
const classList = node.getAttribute('class')

// Search for a language that matches the list of code languages
const codeLanguage = Object.keys(codeLanguages).reduce(
(currentLanguage, language) => {
if (classList.includes(language)) {
return codeLanguages[language]
}
return currentLanguage
},
undefined
)

// Unescape contents
let newContent = unEscapes.reduce((accumulator, unEscape) => {
return accumulator.replace(unEscape[0], unEscape[1])
}, content)

// Remove br tag
newContent = newContent.replace(/^<br \/>\n\n|<br \/>\n/g, '\n')
// Remove first and last paragraph tag
newContent = newContent.replace(/^<\/p>|<p>$/g, '')
// Remove first new line
newContent = newContent.replace(/^\n/, '')
// Convert to language-aware markdown
newContent = '```' + (codeLanguage ?? '') + '\n' + newContent + '```'

return newContent
},
})
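
The language detection and fence building in the rule above can be tried in isolation. The following is a simplified sketch, not the rule itself: `detectLanguage` and `toFencedBlock` are hypothetical names, the unescape and tag-removal steps are left out, and `find` is used in place of the `reduce` from the diff, so the first matching key wins rather than the last.

```javascript
// Same mapping as codeLanguages in the diff
const languages = {
  css: 'css',
  bash: 'bash',
  php: 'php',
  yaml: 'yaml',
  xml: 'xml',
  jscript: 'javascript',
}

// Pick the mapped language whose key appears in the class attribute, if any
const detectLanguage = (classList) => {
  const key = Object.keys(languages).find((language) =>
    classList.includes(language)
  )
  return key ? languages[key] : undefined
}

// Wrap the content in a fenced block, adding the language when one was found
const toFencedBlock = (classList, content) => {
  const fence = '`'.repeat(3)
  const body = content.replace(/^\n/, '') // drop a single leading newline
  return fence + (detectLanguage(classList) ?? '') + '\n' + body + fence
}

console.log(toFencedBlock('brush: php; title: ; notranslate', '\n<?php echo 1;\n'))
```

A class such as `brush: jscript` maps to `javascript`, and a class with no known language produces a fence without a language tag.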

const getAll = (request) => {
return request.then((response) => {
if (!response._paging || !response._paging.next) {
return response
}
// Request the next page and return both responses as one collection
return Promise.all([
response,
getAll(response._paging.next),
]).then((responses) => responses.flat())
return Promise.all([response, getAll(response._paging.next)]).then(
(responses) => responses.flat()
)
})
}
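
To see the recursion in `getAll` at work without a network call, the sketch below restates the function and feeds it mock pages. The mock shape (an array carrying a `_paging.next` thenable) is an assumption based on how the diff reads the response; it is not taken from the wpapi documentation.

```javascript
// Restated from the diff: follow _paging.next until no further page exists
const getAll = (request) =>
  request.then((response) => {
    if (!response._paging || !response._paging.next) {
      return response
    }
    // Request the next page and return both responses as one collection
    return Promise.all([response, getAll(response._paging.next)]).then(
      (responses) => responses.flat()
    )
  })

// Mock pages (assumption): arrays with an optional _paging.next promise
const lastPage = Promise.resolve(['c'])
const firstPage = Promise.resolve(
  Object.assign(['a', 'b'], { _paging: { next: lastPage } })
)

getAll(firstPage).then((allItems) => console.log(allItems))
```

Each level of recursion resolves to a flat array, so a single `flat()` per level is enough to merge any number of pages.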

@@ -35,6 +144,7 @@ const generateJson = async (
outputDir,
regenerate
) => {
team = team ? `${team}/` : ''
handbook = handbook ? handbook : 'handbook'
subdomain = `${
subdomain ? (subdomain === 'w.org' ? '' : subdomain) : 'make'
@@ -60,12 +170,13 @@ const generateJson = async (
})

const wp = new WPAPI({
endpoint: `https://${subdomain}wordpress.org/${team}/wp-json`,
endpoint: `https://${subdomain}wordpress.org/${team}wp-json`,
})

wp.handbooks = wp.registerRoute('wp/v2', `/${handbook}/(?P<id>)`)

console.log(
`Connecting to https://${subdomain}wordpress.org/${team}/wp-json/wp/v2/${handbook}/`
`Connecting to https://${subdomain}wordpress.org/${team}wp-json/wp/v2/${handbook}/`
)

getAll(wp.handbooks()).then(async (allPosts) => {
@@ -83,18 +194,16 @@ const generateJson = async (
rootPath = `https://${subdomain}wordpress.org/${team}/${handbook}/`
}
}

for (const item of allPosts) {
const path = item.link.split(rootPath)[1].replace(/\/$/, '') || 'index'
const filePath =
path.split('/').length > 1
? path.substring(0, path.lastIndexOf('/')) + '/'
: ''
// remove <span class='glossary-item-hidden-content'>inner contents
const preContent = item.content.rendered.replace(
/<span class=\'glossary-item-hidden-content\'>.*?<\/span><\/span>/g,
''
)
const markdownContent = turndownService.turndown(preContent)

const content = item.content.rendered
const markdownContent = turndownService.turndown(content)
const markdown = `# ${item.title.rendered}\n\n${markdownContent}`

await mkdirp(`${outputDir}/${filePath}`)
@@ -156,8 +265,8 @@ const generateJson = async (

program
.version(packageJson.version)
.arguments('<team>')
.description('Generate a menu JSON file for WordPress.org handbook')
.option('-t, --team <team>', 'Specify team name')
.option(
'-b, --handbook <handbook>',
'Specify handbook name (default "handbook")'
@@ -174,9 +283,9 @@ program
'-r --regenerate',
'If this option is supplied, the directory specified as the output directory is deleted first, and all files in it are then regenerated'
)
.action((team, options) => {
.action((options) => {
generateJson(
team,
options.team,
options.handbook,
options.subDomain,
options.outputDir,
1 change: 1 addition & 0 deletions package.json
@@ -16,6 +16,7 @@
"del": "^6.0.0",
"mkdirp": "^1.0.0",
"turndown": "^7.0.0",
"turndown-plugin-gfm": "^1.0.2",
"wpapi": "^1.1.2"
},
"devDependencies": {
5 changes: 5 additions & 0 deletions yarn.lock
@@ -486,6 +486,11 @@ to-regex-range@^5.0.1:
dependencies:
is-number "^7.0.0"

turndown-plugin-gfm@^1.0.2:
version "1.0.2"
resolved "https://registry.yarnpkg.com/turndown-plugin-gfm/-/turndown-plugin-gfm-1.0.2.tgz#6f8678a361f35220b2bdf5619e6049add75bf1c7"
integrity sha512-vwz9tfvF7XN/jE0dGoBei3FXWuvll78ohzCZQuOb+ZjWrs3a0XhQVomJEb2Qh4VHTPNRO4GPZh0V7VRbiWwkRg==

turndown@^7.0.0:
version "7.1.1"
resolved "https://registry.yarnpkg.com/turndown/-/turndown-7.1.1.tgz#96992f2d9b40a1a03d3ea61ad31b5a5c751ef77f"