-
-
Notifications
You must be signed in to change notification settings - Fork 785
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor docs to use ExDocs search_data model for global HexDocs indexing of Gleam packages #3904
base: main
Are you sure you want to change the base?
Refactor docs to use ExDocs search_data model for global HexDocs indexing of Gleam packages #3904
Conversation
e7f6392
to
fae2bc9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!! I've left a few small comments inline, mostly around abbreviations, which we avoid in this codebase.
It would be great to have a test for the output format also.
#[serde(rename = "type")] | ||
type_: SearchItemType, | ||
#[serde(rename = "parentTitle")] | ||
parent_title: String, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's a parent title? Could you add ///
documentation comments for each of these fields pleaes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please have a look at my comments above:
- In one I pasted a screenshot to visualize how it's being used currently;
- and in another asked whether you might have an idea for a better name?
Please also have a look at my Notes.
compiler-core/src/docs.rs
Outdated
doc: String, | ||
struct SearchData { | ||
items: Vec<SearchItem>, | ||
proglang: SearchProgLang, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
proglang: SearchProgLang, | |
programming_language: SearchProgLang, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with the naming and would prefer it too. Yet it is named differently in ExDocs so I went for consistency.
See here for the summarized ExDocs data format and more Info.
So shall we use programming_language
internally and have a separate function that transforms the json output of search-data.json
for HexDocs indexing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
compiler-core/src/docs.rs
Outdated
url: String, | ||
doc: String, | ||
#[serde(rename = "ref")] | ||
ref_: String, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ref_: String, | |
reference: String, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with the naming and would prefer it too. Yet it is named differently in ExDocs so I went for consistency.
See here for the summarized ExDocs data format and more Info.
So shall we use reference
internally and have a separate function that transforms the json output of search-data.json
for HexDocs indexing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
compiler-core/src/docs.rs
Outdated
title: String, | ||
content: String, | ||
url: String, | ||
doc: String, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doc: String, | |
content: String, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with the naming and would prefer it too. Yet it is named differently in ExDocs so I went for consistency.
See here for the summarized ExDocs data format and more Info.
So shall we use content
internally and have a separate function that transforms the json output of search-data.json
for HexDocs indexing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
compiler-core/src/docs.rs
Outdated
|
||
#[derive(Serialize, PartialEq, Eq, PartialOrd, Ord, Clone)] | ||
#[serde(rename_all = "lowercase")] | ||
enum SearchProgLang { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
enum SearchProgLang { | |
enum SearchProgrammingLanguage { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with the naming and would prefer it too. Yet it is named differently in ExDocs so I went for consistency.
See here for the summarized ExDocs data format and more Info.
So shall we use SearchProgrammingLanguage
internally and have a separate function that transforms the json output of search-data.json
for HexDocs indexing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
this.add({ | ||
id: i, | ||
// type: entry.type, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another comment here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this comment to discuss, or keep it as a reminder that we have this property in the new data model but are not making use of it yet.
const index = lunr(function () { | ||
this.ref("id"); | ||
// this.field("type"); | ||
// this.field("parentTitle"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this unfinished?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this comment to discuss, or keep it as a reminder that we have this property in the new data model but are not making use of it yet.
Yes, test are a good idea! :) Lets clarify the other topics first though: whether we keep two separate data models. One for Gleam internal use and another for HexDocs indexing? |
I think just one is fine, it's only used to render that JSON object. |
fae2bc9
to
ab689e5
Compare
Ok, I don't know what you want me to do. This isn't a good use for either your, nor my time. Not sure how familiar you are with this part of the code base and when you looked into it last. From my own experience, I know that running a project while overseeing a larger code base with lots of different contributors is quite something. So I understand that revisiting parts of the code base that others have contributed to and you haven't reviewed recently takes getting used to. That's why I added context information, including screenshots in the initial post and comments regarding key changes to make it easier to follow. Do you see them or are they just visible to me? Is there anything else I can do to communicate better? |
Open Questions
Option: 2 Models
/// for HexDocs search of Gleam projects
#[derive(Serialize, PartialEq, Eq, PartialOrd, Ord, Clone)]
struct SearchIndex {
#[serde(rename = "parentTitle")]
parent_title: String,
title: String,
content: String,
reference: String,
}
/// for global HexDocs indexing
#[derive(Serialize, PartialEq, Eq, PartialOrd, Ord, Clone)]
struct SearchData {
items: Vec<SearchItem>,
proglang: SearchProgLang,
}
#[derive(Serialize, PartialEq, Eq, PartialOrd, Ord, Clone)]
struct SearchItem {
#[serde(rename = "type")]
type_: SearchItemType,
title: String,
doc: String,
#[serde(rename = "ref")]
ref_: String,
}
#[derive(Serialize, PartialEq, Eq, PartialOrd, Ord, Clone)]
#[serde(rename_all = "lowercase")]
enum SearchItemType {
Constant,
Function,
Module,
Page,
Type,
}
#[derive(Serialize, PartialEq, Eq, PartialOrd, Ord, Clone)]
#[serde(rename_all = "lowercase")]
enum SearchProgLang {
// Elixir,
// Erlang,
Gleam,
} Option: 1 Model variant A (current implementation)
|
ab689e5
to
9725204
Compare
Hello! Sorry for the delay, my inbox is overwhelming! I'm a bit confused by your notes here. Are you suggesting we propose a change upstream to Hex for them to change the format that all BEAM languages use? Personally I am happy with the format Hex is using so if the comments from my previous PR are applied then we can merge. Thank you |
I give up, @lpil – our miscommunication is too frustrating. Above, there are two questions that are still unanswered since the initial post of this PR:
Above, there are three options mapped out – including a draft implementation, and their pros and cons. Have you had a look at them and understand the impact of each? These are the three options to choose from:
Feel free to close this unless you plan on addressing the above. |
I would like the specific field renaming that I asked for above please 🙏 I am not looking to discuss any alternative designs presently. If you would like to do just the tests and leave the renaming to me that would also be fine. Thank you |
enum SearchProgLang { | ||
// Elixir, | ||
// Erlang, | ||
Gleam, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you want to have the comments removed or kept for documentation?
#[serde(rename = "type")] | ||
type_: SearchItemType, | ||
#[serde(rename = "parentTitle")] | ||
parent_title: String, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have an idea of how to name this better?
See comment regarding this below.
resultDoc.appendChild(resultDocTitle); | ||
let resultDocOrSection = resultDocTitle; | ||
if (doc.doc != doc.title) { | ||
if (searchItem.parentTitle != searchItem.title) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Constant, | ||
Function, | ||
Module, | ||
Page, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are all of the SearchItemType
s being used.
@ruslandoga propsed Extra
in #3811 (comment) for README.md
and such things.
In the sidebar of the UI and the rest of the code it's Page
. As this is not used for indexing, it might make more sense to stick to SearchItemType::Page
.
See comment regarding this below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Found more information regarding Extras
:
here: https://hexdocs.pm/ex_doc/Mix.Tasks.Docs.html#module-configuration
:extras
- List of paths to additional Markdown (.md extension), Live Markdown (.livemd extension), Cheatsheets (.cheatmd extension) and plain text pages to add to the documentation. You can also specify keyword pairs to customize the generated filename, title and source file of each extra page; default: []. Example: ["README.md", "LICENSE", "CONTRIBUTING.md": [filename: "contributing", title: "Contributing", source: "CONTRIBUTING.mdx"]]
and here: https://hexdocs.pm/ex_doc/Mix.Tasks.Docs.html#module-groups
extras: [
"guides/introduction/foo.md",
"guides/introduction/bar.md",
...
"guides/advanced/baz.md",
"guides/advanced/bat.md"
]
groups_for_extras: [
"Introduction": Path.wildcard("guides/introduction/*.md"),
"Advanced": Path.wildcard("guides/advanced/*.md")
]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After familiarizing myself more with HexDocs for Elixir and Erlang, I noticed that they use "Pages" as the headline in the sidebar too.
So it might make sense to stay consistent with others are doing in the BEAM
ecosystem and adopt their naming: SearchItemType::Extra
.
compiler-core/src/docs.rs
Outdated
title: config.name.to_string(), | ||
content, | ||
url: page.path.to_string(), | ||
doc: content, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
docs_pages: &[DocsPage],
for page in docs_pages {
let content = fs.read(&page.source).unwrap_or_default();
...
}
SearchItemType::Page
seems more appropriate than SearchItemType::Extra
.
Or refactor everything that's Page
to Extra
?
See comment regarding this above.
compiler-core/src/docs.rs
Outdated
url: String, | ||
doc: String, | ||
#[serde(rename = "ref")] | ||
ref_: String, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with the naming and would prefer it too. Yet it is named differently in ExDocs so I went for consistency.
See here for the summarized ExDocs data format and more Info.
So shall we use reference
internally and have a separate function that transforms the json output of search-data.json
for HexDocs indexing?
compiler-core/src/docs.rs
Outdated
title: String, | ||
content: String, | ||
url: String, | ||
doc: String, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with the naming and would prefer it too. Yet it is named differently in ExDocs so I went for consistency.
See here for the summarized ExDocs data format and more Info.
So shall we use content
internally and have a separate function that transforms the json output of search-data.json
for HexDocs indexing?
compiler-core/src/docs.rs
Outdated
|
||
#[derive(Serialize, PartialEq, Eq, PartialOrd, Ord, Clone)] | ||
#[serde(rename_all = "lowercase")] | ||
enum SearchProgLang { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with the naming and would prefer it too. Yet it is named differently in ExDocs so I went for consistency.
See here for the summarized ExDocs data format and more Info.
So shall we use SearchProgrammingLanguage
internally and have a separate function that transforms the json output of search-data.json
for HexDocs indexing?
this.add({ | ||
id: i, | ||
// type: entry.type, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this comment to discuss, or keep it as a reminder that we have this property in the new data model but are not making use of it yet.
const index = lunr(function () { | ||
this.ref("id"); | ||
// this.field("type"); | ||
// this.field("parentTitle"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this comment to discuss, or keep it as a reminder that we have this property in the new data model but are not making use of it yet.
Same here, @lpil – sorry! Glad we figured this out – I am relieved. You should be able to see my comments from weeks back now. |
3bef08f
to
cc0ba6f
Compare
Renamed the abbreviations. You should be able to see all my replies to your comments from weeks ago now.
Added tests to ensure that the output matches the ExDocs specification. |
ee888b7
to
c386647
Compare
Rebased and resolved conflicts. |
…xing of Gleam packages
…ches ExDocs specification
c386647
to
761bd5a
Compare
This PR is regarding issue #3811.
Data Model
The search data model json changed from
SearchIndex
:to
SearchData
,SearchItem
,SearchItemType
andSearchProgrammingLanguage
:Notes:
parentTitle
doesn't seem to have an equivalent in ExDocs from what I have seen so far. It is being populated by:module.name.to_string()
in the context of types"module" | "type" | "function" | "constant"
config.name.to_string()
in the context of type"page"
(README.md
)doc
matches ExDocs, except for headline partitioningref
matches ExDocs, except for aritymodule.html#title/arity
Neither of those is an issue for global HexDocs indexing of Gleam packages.
Output Files
As I'm still exploring how everything works in the codebase, I didn't want to change too much at once. So I added
search-data.json
in addition tosearch-data.js
to keep the existing mechanism working as is for now.search-data.js
Wrapped
search_data_json
inwindow.Gleam.initSearch({});
for use indocumentation_layout.html
to executeinitSearch({})
indocs-js/index.js
which in turn useslunr.js
.search_data.json
Bare
search_data_json
for use in global HexDocs.