Include user tags in Beatmapset search #11763

Merged
2 changes: 1 addition & 1 deletion app/Jobs/EsDocument.php
@@ -17,7 +17,7 @@ class EsDocument implements ShouldQueue
{
use Dispatchable, InteractsWithQueue, Queueable;

- private array $modelMeta;
+ protected array $modelMeta;

/**
* Create a new job instance.
20 changes: 20 additions & 0 deletions app/Jobs/EsDocumentUnique.php
@@ -0,0 +1,20 @@
+ <?php
+
+ // Copyright (c) ppy Pty Ltd <[email protected]>. Licensed under the GNU Affero General Public License v3.0.
+ // See the LICENCE file in the repository root for full licence text.
+
+ declare(strict_types=1);
+
+ namespace App\Jobs;
+
+ use Illuminate\Contracts\Queue\ShouldBeUnique;
+
+ class EsDocumentUnique extends EsDocument implements ShouldBeUnique
+ {
+ public int $uniqueFor = 600;
+
+ public function uniqueId(): string
+ {
+ return "{$this->modelMeta['class']}-{$this->modelMeta['id']}";
+ }
+ }
3 changes: 3 additions & 0 deletions app/Libraries/Search/BeatmapsetQueryParser.php
@@ -82,6 +82,9 @@ public static function parse(?string $query): array
case 'source':
$option = static::makeTextOption($op, $m['value']);
break;
+ case 'tag':
+ $option = [static::makeTextOption($op, $m['value'])];
+ break;
case 'title':
$option = static::makeTextOption($op, $m['value']);
break;
27 changes: 27 additions & 0 deletions app/Libraries/Search/BeatmapsetSearch.php
@@ -14,6 +14,7 @@
use App\Models\Beatmapset;
use App\Models\Follow;
use App\Models\Solo;
+ use App\Models\Tag;
use App\Models\User;
use Ds\Set;

@@ -60,6 +61,12 @@ public function getQuery()
->should(['term' => ['_id' => ['value' => $this->params->queryString, 'boost' => 100]]])
->should(QueryHelper::queryString($this->params->queryString, $partialMatchFields, 'or', 1 / count($terms)))
->should(QueryHelper::queryString($this->params->queryString, [], 'and'))
+ ->should([
+ 'nested' => [

Collaborator: Shouldn't this also be shared with the beatmap-specific $nested query down below?

Collaborator (Author): The scoring and matching of the query string are not the same in the other nested filter.

Collaborator: Would this match a tag that isn't part of the beatmap-specific field match? For example, when searching by this and difficulty.

Collaborator (Author): Searching for potato foo difficulty=mayday returns beatmapsets with a mayday difficulty, with the relevance score based on how the fields in the query match potato foo.

+ 'path' => 'beatmaps',
+ 'query' => QueryHelper::queryString($this->params->queryString, ['beatmaps.top_tags'], 'or', 0.5 / count($terms)),
+ ],
+ ])
);
}

@@ -82,6 +89,7 @@ public function getQuery()
$this->addPlayedFilter($query, $nested);
$this->addRankFilter($nested);
$this->addRecommendedFilter($nested);
+ $this->addTagsFilter($nested);

$this->addSimpleFilters($query, $nested);
$this->addCreatorFilter($query, $nested);
@@ -398,6 +406,25 @@ private function addTextFilter(BoolQuery $query, string $paramField, array $fiel
$query->must($subQuery);
}

+ private function addTagsFilter(BoolQuery $query): void
+ {
+ if ($this->params->tags === null) {
+ return;
+ }
+
+ $tagSet = new Set(array_map('mb_strtolower', $this->params->tags));
+ $tags = Tag::whereIn('name', $this->params->tags)->limit(10)->pluck('name');

Collaborator: Why the limit?

Collaborator (Author): So you can't put hundreds of tags in. The limit should be in the parser, but I haven't decided whether it should throw or how the UI is supposed to handle it.

Collaborator: Well, the remaining hundreds of tags will still be looked up anyway... presumably with the slower text query instead of keyword.

+ $tagSet->remove(...$tags->map(fn ($name) => mb_strtolower($name))->toArray());

Collaborator: toArray not needed?

+
+ foreach ($tagSet as $tag) {
+ $query->filter(QueryHelper::queryString($tag, ['beatmaps.top_tags'], 'and'));

Collaborator: Apparently searching for tag="a b" will also include beatmaps that have tags a and b, in addition to a b...? Not sure if that's intentional.

Also, tag="a" will also include a b, and I can't find a way to filter for just a (although there isn't such a tag at the moment).

Collaborator (Author): Well, that's partly a parsing issue, unless we want to start using tag=""a"" for exact matches... creator="a" already includes a b and a c, so it's not really doing anything different at the moment.

Collaborator: creator=a does only include user a though 🤔 (it uses a user id lookup if the user exists).

+ }
+
+ foreach ($tags as $tag) {
+ $query->filter(['term' => ['beatmaps.top_tags.raw' => $tag]]);
+ }
+ }
+
private function getPlayedBeatmapIds(?array $rank = null)
{
$query = Solo\Score
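
To make the review discussion about limit(10) concrete, here is a minimal, self-contained sketch of how addTagsFilter() splits the requested tags. It is not the code from this PR: partitionTags() and its $knownNames argument are hypothetical stand-ins for the Tag lookup, and case-insensitive name matching is assumed. Tags that resolve to a known name become exact keyword filters; everything else falls through to the analysed text query.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical stand-in for the lookup in addTagsFilter(): $knownNames plays
 * the role of the tags table, and name matching is assumed case-insensitive.
 *
 * @return array{exact: string[], text: string[]}
 */
function partitionTags(array $requested, array $knownNames, int $limit = 10): array
{
    $requestedLower = array_map('mb_strtolower', $requested);

    // Known tags, capped at $limit, would become exact `term` filters.
    $exact = [];
    foreach ($knownNames as $name) {
        if (count($exact) >= $limit) {
            break;
        }
        if (in_array(mb_strtolower($name), $requestedLower, true)) {
            $exact[] = $name;
        }
    }

    // Anything not resolved above falls back to the text query.
    $exactLower = array_map('mb_strtolower', $exact);
    $text = array_values(array_unique(array_diff($requestedLower, $exactLower)));

    return ['exact' => $exact, 'text' => $text];
}

$result = partitionTags(['Tech', 'made-up'], ['tech', 'streams']);
// $result['exact'] is ['tech'] and $result['text'] is ['made-up']
```

With the limit lowered to 1, partitionTags(['tech', 'streams'], ['tech', 'streams'], 1) puts streams in the text list even though it is a known tag, which is the behaviour the reviewer points out.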
1 change: 1 addition & 0 deletions app/Libraries/Search/BeatmapsetSearchParams.php
@@ -45,6 +45,7 @@ class BeatmapsetSearchParams extends SearchParams
public bool $showSpotlights = false;
public ?string $source = null;
public ?string $status = null;
+ public ?array $tags = null;
public ?string $title = null;
public ?array $statusRange = null;
public ?array $totalLength = null;
1 change: 1 addition & 0 deletions app/Libraries/Search/BeatmapsetSearchRequestParams.php
@@ -226,6 +226,7 @@ private function parseQuery(): void
'source' => 'source',
'stars' => 'difficultyRating',
'status' => 'statusRange',
+ 'tag' => 'tags',
'title' => 'title',
'updated' => 'updated',
];
4 changes: 2 additions & 2 deletions app/Models/Beatmap.php
@@ -6,7 +6,7 @@
namespace App\Models;

use App\Exceptions\InvariantException;
- use App\Jobs\EsDocument;
+ use App\Jobs\EsDocumentUnique;
use App\Libraries\Transactions\AfterCommit;
use App\Traits\Memoizes;
use Illuminate\Database\Eloquent\Builder;
@@ -247,7 +247,7 @@ public function afterCommit()
$beatmapset = $this->beatmapset;

if ($beatmapset !== null) {
- dispatch(new EsDocument($beatmapset));
+ dispatch(new EsDocumentUnique($beatmapset));
}
}

4 changes: 2 additions & 2 deletions app/Models/Beatmapset.php
@@ -10,7 +10,7 @@
use App\Exceptions\ImageProcessorServiceException;
use App\Exceptions\InvariantException;
use App\Jobs\CheckBeatmapsetCovers;
- use App\Jobs\EsDocument;
+ use App\Jobs\EsDocumentUnique;
use App\Jobs\Notifications\BeatmapsetDiscussionLock;
use App\Jobs\Notifications\BeatmapsetDiscussionUnlock;
use App\Jobs\Notifications\BeatmapsetDisqualify;
@@ -1508,7 +1508,7 @@ public function refreshCache(bool $resetEligibleMainRulesets = false): void

public function afterCommit()
{
- dispatch(new EsDocument($this));
+ dispatch(new EsDocumentUnique($this));
}

public function notificationCover()
10 changes: 6 additions & 4 deletions app/Models/Traits/Es/BaseDbIndexable.php
@@ -82,12 +82,13 @@ public function esRouting()

public function esDeleteDocument(array $options = [])
{
- $document = array_merge([
+ $document = [
'index' => static::esIndexName(),
'routing' => $this->esRouting(),
'id' => $this->getEsId(),
'client' => ['ignore' => 404],
- ], $options);
+ ...$options,
+ ];

return Es::getClient()->delete($document);
}
@@ -98,12 +99,13 @@ public function esIndexDocument(array $options = [])
return $this->esDeleteDocument($options);
}

- $document = array_merge([
+ $document = [
'index' => static::esIndexName(),
'routing' => $this->esRouting(),
'id' => $this->getEsId(),
'body' => $this->toEsJson(),
- ], $options);
+ ...$options,
+ ];

return Es::getClient()->index($document);
}
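
The change in this file swaps array_merge() for array unpacking. As a quick check that the two forms are interchangeable for this kind of input, the sketch below compares them on string-keyed arrays. The sample keys and values are made up for illustration, and unpacking string-keyed arrays requires PHP 8.1 or later.

```php
<?php

declare(strict_types=1);

// Illustrative values only; not taken from the PR.
$options = ['id' => 2, 'refresh' => true];

// Old form: defaults first, then caller-supplied options override them.
$merged = array_merge([
    'index' => 'beatmapsets',
    'id' => 1,
    'client' => ['ignore' => 404],
], $options);

// New form: later string keys from the unpacked array overwrite earlier ones.
$spread = [
    'index' => 'beatmapsets',
    'id' => 1,
    'client' => ['ignore' => 404],
    ...$options,
];

var_dump($merged === $spread); // bool(true)
```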
46 changes: 34 additions & 12 deletions app/Models/Traits/Es/BeatmapsetSearch.php
@@ -31,18 +31,30 @@ public static function esSchemaFile()
return config_path('schemas/beatmapsets.json');
}

+ private static function esBeatmapTags(Beatmap $beatmap): array
+ {
+ $tags = app('tags');
+
+ return array_reject_null(
+ array_map(
+ fn ($tagId) => $tags->get($tagId['tag_id'])?->name,
+ $beatmap->topTagIds()
+ )
+ );
+ }
+
public function esShouldIndex()
{
return !$this->trashed() && !present($this->download_disabled_url);
}

public function toEsJson()
{
- return array_merge(
- $this->esBeatmapsetValues(),
- ['beatmaps' => $this->esBeatmapsValues()],
- ['difficulties' => $this->esDifficultiesValues()]
- );
+ return [
+ ...$this->esBeatmapsetValues(),
+ 'beatmaps' => $this->esBeatmapsValues(),
+ 'difficulties' => $this->esDifficultiesValues(),
+ ];
}

private function esBeatmapsetValues()
@@ -78,12 +90,17 @@ private function esBeatmapsValues()
foreach ($this->beatmaps as $beatmap) {
$beatmapValues = [];
foreach ($mappings as $field => $mapping) {
- $beatmapValues[$field] = $beatmap->$field;
+ $value = match ($field) {
+ 'top_tags' => $this->esBeatmapTags($beatmap),
+ // TODO: remove adding $beatmap->user_id once everything else also populated beatmap_owners by default.
+ // Duplicate user_id in the array should be fine for now since the field isn't scored for querying.
+ 'user_id' => $beatmap->beatmapOwners->pluck('user_id')->add($beatmap->user_id),
+ default => $beatmap->$field,
+ };
+
+ $beatmapValues[$field] = $value;
}

- // TODO: remove adding $beatmap->user_id once everything else also populated beatmap_owners by default.
- // Duplicate user_id in the array should be fine for now since the field isn't scored for querying.
- $beatmapValues['user_id'] = $beatmap->beatmapOwners->pluck('user_id')->add($beatmap->user_id);
$values[] = $beatmapValues;

if ($beatmap->playmode === Beatmap::MODES['osu']) {
@@ -96,11 +113,16 @@ private function esBeatmapsValues()
$convert->playmode = $modeInt;
$convert->convert = true;
$convertValues = [];
- foreach ($mappings as $field => $mapping) {
- $convertValues[$field] = $convert->$field;
+ foreach ($mappings as $field => $_mapping) {
+ $convertValues[$field] = match ($field) {
+ // just add a copy for converts too.
+ 'top_tags',
+ 'user_id' => $beatmapValues[$field],
+
+ default => $convert->$field,
+ };
}

- $convertValues['user_id'] = $beatmapValues['user_id']; // just add a copy for converts too.
$values[] = $convertValues;
}
}
8 changes: 8 additions & 0 deletions config/schemas/beatmapsets.json
@@ -84,6 +84,14 @@
"playmode": {
"type": "byte"
},
+ "top_tags": {
+ "type": "text",
+ "fields": {
+ "raw": {
+ "type": "keyword"
+ }
+ }
+ },
"total_length": {
"type": "long"
},
3 changes: 3 additions & 0 deletions tests/Libraries/Search/BeatmapsetQueryParserTest.php
@@ -46,6 +46,9 @@ public static function queryDataProvider()
['ranked>="2020-07-21 12:30:30 +09:00"', ['keywords' => null, 'options' => ['ranked' => ['gte' => static::parseTime('2020-07-21 03:30:30')]]]],
['ranked="2020-07-21 12:30:30 +09:00"', ['keywords' => null, 'options' => ['ranked' => ['gte' => static::parseTime('2020-07-21 03:30:30'), 'lt' => static::parseTime('2020-07-21 03:30:31')]]]],
['ranked="invalid date format"', ['keywords' => 'ranked="invalid date format"', 'options' => []]],
+ ['tag=hello', ['keywords' => null, 'options' => ['tag' => ['hello']]]],
+ ['tag=hello tag=world', ['keywords' => null, 'options' => ['tag' => ['hello', 'world']]]],
+ ['tag="hello world"', ['keywords' => null, 'options' => ['tag' => ['hello world']]]],

// multiple options
['artist=hello creator:world', ['keywords' => null, 'options' => ['artist' => 'hello', 'creator' => 'world']]],