Overhaul /repositories
Increase performance on instances with large number of repositories by
using the same limiting semantics we use on the search page. The
difference is that this limiting is done entirely in neogrok, as zoekt
has no scoring/truncating parameters for repository search.  So this
limit does nothing to reduce API bandwidth, only to help render
performance. Rendering an HTML table with 5000 rows is simply too
expensive to do on every keypress.

Make the table sortable by columns.

Clarify the table layout and expand the documentation to describe what
shards are.
isker committed Nov 23, 2023
1 parent ace6aa0 commit 33a5fe6
Showing 12 changed files with 394 additions and 136 deletions.
5 changes: 5 additions & 0 deletions .changeset/slimy-eyes-kneel.md
@@ -0,0 +1,5 @@
---
"neogrok": minor
---

Enhance the repositories list page, making it more performant on instances with large numbers of repositories, and make columns sortable by clicking on their headers
15 changes: 8 additions & 7 deletions src/lib/server/zoekt-list-repositories.ts
@@ -47,15 +47,18 @@ export async function listRepositories(

 const statsSchema = v
 .object({
+Shards: v.number(),
 Documents: v.number(),
 IndexBytes: v.number(),
 ContentBytes: v.number(),
 })
-.map(({ Documents, IndexBytes, ContentBytes }) => ({
+.map(({ Shards, Documents, IndexBytes, ContentBytes }) => ({
+shardCount: Shards,
 fileCount: Documents,
 indexBytes: IndexBytes,
 contentBytes: ContentBytes,
 }));
+export type RepoStats = v.Infer<typeof statsSchema>;

 const dateSchema = v.string().chain((str) => {
 const date = new Date(str);

const dateSchema = v.string().chain((str) => {
const date = new Date(str);
@@ -125,10 +128,10 @@ const listResultSchema = v.object({
 })),
 Stats: statsSchema,
 })
-.map(({ Repository, IndexMetadata: { lastIndexed }, Stats }) => ({
+.map(({ Repository, IndexMetadata, Stats }) => ({
 ...Repository,
-lastIndexed,
-stats: Stats,
+...IndexMetadata,
+...Stats,
 })),
 ),
 )
@@ -147,7 +150,5 @@ const listResultSchema = v.object({
 const toISOStringWithoutMs = (d: Date) =>
 d.toISOString().replace(/\.\d{3}Z$/, "Z");

-export type ListResults = ReadonlyDeep<
-v.Infer<typeof listResultSchema>["List"]
->;
+export type ListResults = v.Infer<typeof listResultSchema>["List"];
 export type Repository = ListResults["repositories"][number];
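
Note on the reshaping above: each list entry now spreads `Repository`, `IndexMetadata`, and `Stats` into one flat object instead of nesting the stats under a `stats` key. This presumably lets the table address every column by a single top-level property name, as the `sortColumn` props later in this commit do. A minimal sketch of the resulting shape, written without the validation library; `RawStats`, `RawEntry`, `flattenEntry`, and the two-field `Repository` are hypothetical stand-ins, and the real schema carries more fields:

```typescript
// Hypothetical model of the stats object as zoekt reports it.
interface RawStats {
  Shards: number;
  Documents: number;
  IndexBytes: number;
  ContentBytes: number;
}

// Mirrors the renaming done by statsSchema's .map() in the diff.
const toRepoStats = ({ Shards, Documents, IndexBytes, ContentBytes }: RawStats) => ({
  shardCount: Shards,
  fileCount: Documents,
  indexBytes: IndexBytes,
  contentBytes: ContentBytes,
});

// Hypothetical, reduced list entry; only fields needed for the illustration.
interface RawEntry {
  Repository: { name: string; url: string | null };
  IndexMetadata: { lastIndexed: string };
  Stats: ReturnType<typeof toRepoStats>;
}

// Same spread order as the new .map() callback: later spreads win on key clashes.
const flattenEntry = ({ Repository, IndexMetadata, Stats }: RawEntry) => ({
  ...Repository,
  ...IndexMetadata,
  ...Stats,
});
```

One consequence worth keeping in mind: with a flat object, a key that appears in more than one of the three sources would be silently overwritten by the later spread.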
64 changes: 50 additions & 14 deletions src/routes/about/+page.svelte
@@ -107,20 +107,56 @@
the repositories indexed in the backing zoekt instance, including a variety
of data about them.
</p>
<p>
Note that the search input on this page has the same semantics as the
search input on the main search page: you are writing a full <Link
to="/syntax">zoekt query</Link
>, but instead of getting normal search results, you get repositories that
contain any results matching the query. So, <ExampleQuery
query="r:linux"
page="repositories"
/> filters the table to repositories with "linux" in their name, while <ExampleQuery
query="linux"
page="repositories"
/> filters the table to repositories with linux in their
<em>contents</em>.
</p>
<section class="space-y-2">
<Heading element="h3" id="repo-search">Repository search</Heading>
<p>
Note that the search input on this page has the same semantics as the
search input on the main search page: you are writing a full <Link
to="/syntax">zoekt query</Link
>, but instead of getting normal search results, you get repositories
that contain any results matching the query. So, <ExampleQuery
query="r:linux"
page="repositories"
/> filters the table to repositories with "linux" in their name, while <ExampleQuery
query="linux"
page="repositories"
/> filters the table to repositories with linux in their
<em>contents</em>.
</p>
<p>
To improve page performance on deployments with large numbers of
repositories, there is a <i>repos</i> input that limits the number of
displayed repositories in the same way that the <i>files</i> and
<i>matches</i> inputs on the search page do.
</p>
</section>
<section class="space-y-2">
<Heading element="h3" id="repository-stats">Repository stats</Heading>
<p>
The tabulated data includes links to the repository and its indexed
branches, the times the repository was last indexed and that it was last
committed to, and data about the index <i>shards</i> and their contents.
The table can be sorted by clicking on column headers: the first click will
sort in ascending order, the second in descending, and the third will restore
the status quo.
</p>
<p>
<i>Shards</i> are what zoekt calls the files emitted from its indexer, and
they're all that's used by the zoekt-webserver backing neogrok to handle
neogrok's API requests; they contain the above-described repository metadata,
indexes used to quickly search repository content, and the repository content
itself (file names and contents). Indexing a repository typically results
in a single shard, but zoekt limits shard files to be about 100MiB in size,
so big repositories get more than one shard.
</p>
<p>
When you search repository contents (i.e. make a non-<code>repo:</code>
query), you are in fact searching repository <em>shards</em>, and so for a
repository with more than one shard, you will see that the counts of
shards and associated data in the table go down when you enter a query
that matches content in only some of its shards.
</p>
</section>
</section>

<section class="space-y-2">
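
Note on the sorting behaviour documented above (ascending on the first click, descending on the second, unsorted on the third): the `./table-sorting` module and `sortable-column-header.svelte` are not shown on this page, so the following is only a sketch of how such a three-state cycle and its comparator might look. The `direction` field, `nextSortBy`, the `Row` type, and this body of `createComparator` are all assumptions, as is the choice to start at ascending when a different column is clicked.

```typescript
type SortColumn = { prop: string; kind: "string" | "number" };
// Assumed shape: the diff only shows `prop` and `kind`; `direction` is hypothetical.
type SortBy = SortColumn & { direction: "ascending" | "descending" };
type Row = Record<string, string | number>;

// Three-state cycle for one column: none -> ascending -> descending -> none.
const nextSortBy = (current: SortBy | null, clicked: SortColumn): SortBy | null => {
  if (current === null || current.prop !== clicked.prop) {
    return { prop: clicked.prop, kind: clicked.kind, direction: "ascending" };
  }
  if (current.direction === "ascending") {
    return { prop: clicked.prop, kind: clicked.kind, direction: "descending" };
  }
  return null; // third click restores the original order
};

// Hypothetical comparator factory; numbers compare numerically, strings by code unit.
const createComparator =
  ({ prop, kind, direction }: SortBy) =>
  (a: Row, b: Row): number => {
    let order: number;
    if (kind === "number") {
      order = Number(a[prop]) - Number(b[prop]);
    } else {
      const x = String(a[prop]);
      const y = String(b[prop]);
      order = x < y ? -1 : x > y ? 1 : 0;
    }
    return direction === "ascending" ? order : -order;
  };
```

Plain string comparison is enough for the `lastIndexed` and `lastCommit` columns if they are ISO 8601 timestamps in a single time zone, since those sort chronologically when compared as text.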
2 changes: 1 addition & 1 deletion src/routes/repositories/+page.svelte
@@ -32,6 +32,6 @@
 ? data.listOutcome.results
 : previousListResults ?? {
 repositories: [],
-stats: { fileCount: 0, contentBytes: 0, indexBytes: 0 },
+stats: { shardCount: 0, fileCount: 0, contentBytes: 0, indexBytes: 0 },
 }}
 />
23 changes: 23 additions & 0 deletions src/routes/repositories/branches.svelte
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
<script lang="ts">
import Link from "$lib/link.svelte";
import type { Repository } from "$lib/server/zoekt-list-repositories";
export let branches: Repository["branches"];
export let commitUrlTemplate: string | null;
// Abbreviate git hashes. Helps make the very wide table a bit narrower.
const abbreviateVersion = (v: string) =>
/^[a-z0-9]{40}$/.test(v) ? v.slice(0, 8) : v;
</script>

{#each branches as { name: branchName, version }}
{branchName}@<span class="font-mono">
{#if commitUrlTemplate}
<Link to={commitUrlTemplate.replaceAll("{{.Version}}", version)}
>{abbreviateVersion(version)}</Link
>
{:else}
{abbreviateVersion(version)}
{/if}
</span>
{/each}
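
Note on `branches.svelte`: the two string operations in this component can be exercised outside Svelte. The sketch below copies `abbreviateVersion` from the diff and wraps the template substitution in a helper named `commitUrl`, which is hypothetical, as is the example URL in the comment. It uses `split`/`join` instead of the component's `replaceAll` only so that it compiles against older library targets; for a literal search string the two are equivalent.

```typescript
// Copied from the diff: shorten anything that looks like a 40-character hash.
const abbreviateVersion = (v: string): string =>
  /^[a-z0-9]{40}$/.test(v) ? v.slice(0, 8) : v;

// Hypothetical helper around the substitution done inline in the component.
// Example template (an assumption): "https://example.com/repo/commit/{{.Version}}"
const commitUrl = (template: string, version: string): string =>
  template.split("{{.Version}}").join(version);
```

Two things follow from the pattern as written: it accepts any lowercase alphanumeric string of length 40, not only hexadecimal digits, and a version of any other length (a branch name, or a 64-character hash) is displayed unabbreviated. The link target always uses the full version, so abbreviation affects display only.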
165 changes: 140 additions & 25 deletions src/routes/repositories/repositories-list.svelte
Original file line number Diff line number Diff line change
@@ -1,41 +1,156 @@
<script lang="ts">
import prettyBytes from "pretty-bytes";
import type { ListResults } from "$lib/server/zoekt-list-repositories";
import Repository from "./repository.svelte";
import type {
ListResults,
RepoStats,
} from "$lib/server/zoekt-list-repositories";
import RepositoryName from "./repository-name.svelte";
import Branches from "./branches.svelte";
import Sortable from "./sortable-column-header.svelte";
import { createComparator, type SortBy } from "./table-sorting";
import { routeListQuery } from "./route-list-query";
import Link from "$lib/link.svelte";
export let results: ListResults;
$: ({
stats: { fileCount, indexBytes, contentBytes },
repositories,
} = results);
let sortBy: SortBy | null = null;
$: sorted =
sortBy === null
? results.repositories
: Array.from(results.repositories).sort(createComparator(sortBy));
$: truncated = sorted.slice(0, $routeListQuery.repos);
$: limited = results.repositories.length > $routeListQuery.repos;
$: truncatedStats = limited
? truncated.reduce<RepoStats>(
(acc, val) => {
acc.shardCount += val.shardCount;
acc.fileCount += val.fileCount;
acc.indexBytes += val.indexBytes;
acc.contentBytes += val.contentBytes;
return acc;
},
{
shardCount: 0,
fileCount: 0,
indexBytes: 0,
contentBytes: 0,
},
)
: results.stats;
</script>

<!-- FIXME perf is bad on an instance with thousands of repos. -->
<!-- FIXME the file count/ram size data is for repo shards matching the query,
but that's not understandable from the UI -->
<h1 class="text-xs py-1">
{repositories.length}
{repositories.length === 1 ? "repository" : "repositories"} containing
{fileCount} files consuming
{prettyBytes(indexBytes + contentBytes, { space: false })} of RAM
<h1 class="text-xs flex flex-wrap py-1">
<span>
zoekt: {results.repositories.length}
{results.repositories.length === 1 ? "repository" : "repositories"} /
{results.stats.shardCount}
{results.stats.shardCount === 1 ? "shard" : "shards"} /
{results.stats.fileCount}
{results.stats.fileCount === 1 ? "file" : "files"} /
{prettyBytes(results.stats.indexBytes + results.stats.contentBytes, {
space: false,
binary: true,
})} RAM
</span>
<span class="ml-auto"
>neogrok: <span class:text-yellow-700={limited}
>{truncated.length}
{truncated.length === 1 ? "repository" : "repositories"}</span
>
/
{truncatedStats.shardCount}
{truncatedStats.shardCount === 1 ? "shard" : "shards"} /
{truncatedStats.fileCount}
{truncatedStats.fileCount === 1 ? "file" : "files"} /
{prettyBytes(truncatedStats.indexBytes + truncatedStats.contentBytes, {
space: false,
binary: true,
})} RAM
</span>
</h1>

<div class="overflow-x-auto">
<table class="border-collapse text-sm w-full text-center">
<thead>
<tr class="border bg-slate-100">
<th class="p-1">Repository</th>
<th class="p-1">File count</th>
<table class="border-collapse text-sm w-full text-center h-fit">
<thead class="border bg-slate-100">
<tr>
<th></th>
<th></th>
<th class="p-1 border-x" colspan="4"
>Index <Link to="/about#repository-stats">shard files</Link></th
>
<th></th>
<th></th>
</tr>
<tr class="h-full">
<th class="p-1"
><Sortable bind:sortBy sortColumn={{ prop: "name", kind: "string" }}
><div>Repository</div></Sortable
></th
>
<th class="p-1">Branches</th>
<th class="p-1">Content size in RAM</th>
<th class="p-1">Index size in RAM</th>
<th class="p-1">Last indexed</th>
<th class="p-1">Last commit</th>
<th class="p-1 border-l"
><Sortable
bind:sortBy
sortColumn={{ prop: "shardCount", kind: "number" }}
>Shard count</Sortable
></th
>
<th class="p-1"
><Sortable
bind:sortBy
sortColumn={{ prop: "fileCount", kind: "number" }}
>Contained files</Sortable
></th
>
<th class="p-1"
><Sortable
bind:sortBy
sortColumn={{ prop: "indexBytes", kind: "number" }}
>Index size in RAM</Sortable
></th
>
<th class="p-1 border-r"
><Sortable
bind:sortBy
sortColumn={{ prop: "contentBytes", kind: "number" }}
>Content size in RAM</Sortable
></th
>
<th class="p-1"
><Sortable
bind:sortBy
sortColumn={{ prop: "lastIndexed", kind: "string" }}
>Last indexed</Sortable
></th
>
<th class="p-1"
><Sortable
bind:sortBy
sortColumn={{ prop: "lastCommit", kind: "string" }}
>Last commit</Sortable
></th
>
</tr>
</thead>
<tbody>
{#each repositories as repository}
<Repository {repository} />
{#each truncated as { name, url, branches, commitUrlTemplate, shardCount, fileCount, indexBytes, contentBytes, lastIndexed, lastCommit }}
<tr class="border">
<td class="p-1 border-x"><RepositoryName {name} {url} /></td>
<td class="p-1 border-x"
><Branches {branches} {commitUrlTemplate} /></td
>
<td class="p-1 border-x text-right">{shardCount}</td>
<td class="p-1 border-x text-right">{fileCount}</td>
<td class="p-1 border-x text-right"
>{prettyBytes(indexBytes, { space: false, binary: true })}</td
>
<td class="p-1 border-x text-right"
>{prettyBytes(contentBytes, { space: false, binary: true })}</td
>
<td class="p-1 border-x">{lastIndexed}</td>
<td class="p-1 border-x">{lastCommit}</td>
</tr>
{/each}
</tbody>
</table>
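
Note on the limiting logic in `repositories-list.svelte`: as the commit message says, zoekt still returns every matching repository, and the component only cuts down what it renders. The reactive statements boil down to "slice, then re-sum the stats if anything was cut". A framework-free sketch of that idea follows; `sumStats` and `limitRows` are hypothetical names, and `limit` stands in for `$routeListQuery.repos`.

```typescript
interface RepoStats {
  shardCount: number;
  fileCount: number;
  indexBytes: number;
  contentBytes: number;
}

// Sum the per-repository stats of the rows that will actually be rendered.
const sumStats = (rows: ReadonlyArray<RepoStats>): RepoStats =>
  rows.reduce<RepoStats>(
    (acc, row) => ({
      shardCount: acc.shardCount + row.shardCount,
      fileCount: acc.fileCount + row.fileCount,
      indexBytes: acc.indexBytes + row.indexBytes,
      contentBytes: acc.contentBytes + row.contentBytes,
    }),
    { shardCount: 0, fileCount: 0, indexBytes: 0, contentBytes: 0 },
  );

// Hypothetical helper: keep the first `limit` rows and report stats for them.
// When nothing is cut, the totals reported by zoekt are reused unchanged.
const limitRows = <T extends RepoStats>(
  rows: ReadonlyArray<T>,
  limit: number,
  allStats: RepoStats,
) => {
  const truncated = rows.slice(0, limit);
  const limited = rows.length > limit;
  return { truncated, limited, stats: limited ? sumStats(truncated) : allStats };
};
```

Because truncation happens after sorting in the component, which repositories count toward the "neogrok" totals in the header depends on the active sort order, while the "zoekt" totals do not.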
10 changes: 10 additions & 0 deletions src/routes/repositories/repository-name.svelte
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
<script lang="ts">
import Link from "$lib/link.svelte";
export let name: string;
export let url: string | null;
</script>

<div class="text-left">
{#if url}<Link to={url}>{name}</Link>{:else}{name}{/if}
</div>
45 changes: 0 additions & 45 deletions src/routes/repositories/repository.svelte

This file was deleted.
