
Commit 5edbbee

twitter-team committed
Open-sourcing Representation Scorer
Representation Scorer (RSX) serves as a centralized scoring system, offering SimClusters or other embedding-based scoring solutions as machine learning features.
1 parent 43cdcf2 commit 5edbbee

41 files changed: +2544 -0 lines changed

representation-scorer/BUILD.bazel

+1
@@ -0,0 +1 @@
# This prevents SQ query from grabbing //:all since it traverses up once to find a BUILD

representation-scorer/README.md

+5
@@ -0,0 +1,5 @@
# Representation Scorer #

**Representation Scorer** (RSX) serves as a centralized scoring system, offering SimClusters or other embedding-based scoring solutions as machine learning features.

The Representation Scorer acquires user behavior data from the User Signal Service (USS) and extracts embeddings from the Representation Manager (RMS). It then calculates both pairwise and listwise features. These features are used at various stages, including candidate retrieval and ranking.
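
As a rough illustration of the pairwise/listwise distinction in the README text, the sketch below scores one candidate against a single embedding (pairwise) and then aggregates that score over a list of embeddings (listwise). The sparse-dict embedding layout, the function names, and the choice of max and mean as aggregations are assumptions made for illustration; none of them are taken from the RSX code.

```python
from statistics import mean

# Assumed layout: an embedding is a sparse map from cluster id to weight.
Embedding = dict[int, float]


def dot(a: Embedding, b: Embedding) -> float:
    """Pairwise score: sparse dot product over the cluster ids both embeddings share."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller embedding
    return sum(w * b[k] for k, w in a.items() if k in b)


def listwise_features(candidate: Embedding, history: list[Embedding]) -> dict[str, float]:
    """Listwise features: aggregate the pairwise score over a list of embeddings."""
    scores = [dot(candidate, h) for h in history]
    if not scores:
        return {"max": 0.0, "mean": 0.0}
    return {"max": max(scores), "mean": mean(scores)}


# Hypothetical data: one candidate and three embeddings from a user's history.
candidate = {1: 0.5, 7: 0.5}
history = [{1: 1.0}, {7: 0.5, 9: 0.5}, {3: 1.0}]
features = listwise_features(candidate, history)
```

A pairwise feature depends on exactly two entities, while a listwise feature summarizes many pairwise scores into a fixed number of values, which keeps the feature vector the same size regardless of how long the history is.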
@@ -0,0 +1,8 @@
#!/bin/bash

export CANARY_CHECK_ROLE="representation-scorer"
export CANARY_CHECK_NAME="representation-scorer"
export CANARY_CHECK_INSTANCES="0-19"

python3 relevance-platform/tools/canary_check.py "$@"

representation-scorer/bin/deploy.sh

+4
@@ -0,0 +1,4 @@
#!/usr/bin/env bash

JOB=representation-scorer bazel run --ui_event_filters=-info,-stdout,-stderr --noshow_progress \
  //relevance-platform/src/main/python/deploy -- "$@"
@@ -0,0 +1,66 @@
#!/bin/bash

set -o nounset
set -eu

DC="atla"
ROLE="$USER"
SERVICE="representation-scorer"
INSTANCE="0"
KEY="$DC/$ROLE/devel/$SERVICE/$INSTANCE"

while test $# -gt 0; do
  case "$1" in
    -h|--help)
      echo "$0 Set up an ssh tunnel for $SERVICE remote debugging and disable aurora health checks"
      echo " "
      echo "See representation-scorer/README.md for details of how to use this script, and go/remote-debug for"
      echo "general information about remote debugging in Aurora"
      echo " "
      echo "Default instance if called with no args:"
      echo " $KEY"
      echo " "
      echo "Positional args:"
      echo " $0 [datacentre] [role] [service_name] [instance]"
      echo " "
      echo "Options:"
      echo " -h, --help show brief help"
      exit 0
      ;;
    *)
      break
      ;;
  esac
done

if [ -n "${1-}" ]; then
  DC="$1"
fi

if [ -n "${2-}" ]; then
  ROLE="$2"
fi

if [ -n "${3-}" ]; then
  SERVICE="$3"
fi

if [ -n "${4-}" ]; then
  INSTANCE="$4"
fi

KEY="$DC/$ROLE/devel/$SERVICE/$INSTANCE"
read -p "Set up remote debugger tunnel for $KEY? (y/n) " -r CONFIRM
if [[ ! $CONFIRM =~ ^[Yy]$ ]]; then
  echo "Exiting, tunnel not created"
  exit 1
fi

echo "Disabling health check and opening tunnel. Exit with control-c when you're finished"
CMD="aurora task ssh $KEY -c 'touch .healthchecksnooze' && aurora task ssh $KEY -L '5005:debug' --ssh-options '-N -S none -v '"

echo "Running $CMD"
eval "$CMD"

representation-scorer/docs/index.rst

+39
@@ -0,0 +1,39 @@
Representation Scorer (RSX)
###########################

Overview
========

Representation Scorer (RSX) is a StratoFed service which serves scores for pairs of entities (User, Tweet, Topic...) based on some representation of those entities. For example, it serves User-Tweet scores based on the cosine similarity of SimClusters embeddings for each of these. It aims to provide these with low latency and at high scale, to support applications such as scoring for ANN candidate generation and feature hydration via feature store.


Current use cases
-----------------

RSX currently serves traffic for the following use cases:

- User-Tweet similarity scores for Home ranking, using SimClusters embedding dot product
- Topic-Tweet similarity scores for topical tweet candidate generation and topic social proof, using SimClusters embedding cosine similarity and CERTO scores
- Tweet-Tweet and User-Tweet similarity scores for ANN candidate generation, using SimClusters embedding cosine similarity
- (in development) User-Tweet similarity scores for Home ranking, based on various aggregations of similarities with recent faves, retweets and follows performed by the user

Getting Started
===============

Fetching scores
---------------

Scores are served from the recommendations/representation_scorer/score column.

Using RSX for your application
------------------------------

RSX may be a good fit for your application if you need scores based on combinations of SimCluster embeddings for core nouns. We also plan to support other embeddings and scoring approaches in the future.

.. toctree::
   :maxdepth: 2
   :hidden:

   index
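
The use cases listed in docs/index.rst mention two scoring functions over SimClusters embeddings: dot product and cosine similarity. The sketch below contrasts them on sparse embeddings. The dict-based representation (cluster id mapped to weight) and the function names are assumptions for illustration and may not match how RSX stores or scores embeddings.

```python
import math

# Assumed sparse layout: cluster id -> weight.
Embedding = dict[int, float]


def dot_product(a: Embedding, b: Embedding) -> float:
    """Sum of products over cluster ids; ids missing from b contribute zero."""
    return sum(w * b.get(k, 0.0) for k, w in a.items())


def cosine_similarity(a: Embedding, b: Embedding) -> float:
    """Dot product divided by both L2 norms; defined as 0.0 for an empty or all-zero embedding."""
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot_product(a, b) / (norm_a * norm_b)


# Hypothetical embeddings.
user = {10: 3.0, 42: 4.0}
tweet = {42: 2.0}

dot_score = dot_product(user, tweet)
cos_score = cosine_similarity(user, tweet)
# Scaling the tweet embedding changes the dot product but not the cosine.
scaled_dot = dot_product(user, {42: 20.0})
scaled_cos = cosine_similarity(user, {42: 20.0})
```

The dot product is sensitive to embedding magnitude, whereas cosine similarity depends only on direction, which is one reason a caller might prefer one over the other for a given use case.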

representation-scorer/server/BUILD

+22
@@ -0,0 +1,22 @@
jvm_binary(
    name = "bin",
    basename = "representation-scorer",
    main = "com.twitter.representationscorer.RepresentationScorerFedServerMain",
    platform = "java8",
    tags = ["bazel-compatible"],
    dependencies = [
        "finatra/inject/inject-logback/src/main/scala",
        "loglens/loglens-logback/src/main/scala/com/twitter/loglens/logback",
        "representation-scorer/server/src/main/resources",
        "representation-scorer/server/src/main/scala/com/twitter/representationscorer",
        "twitter-server/logback-classic/src/main/scala",
    ],
)

# Aurora Workflows build phase convention requires a jvm_app named with ${project-name}-app
jvm_app(
    name = "representation-scorer-app",
    archive = "zip",
    binary = ":bin",
    tags = ["bazel-compatible"],
)
@@ -0,0 +1,9 @@
resources(
    sources = [
        "*.xml",
        "*.yml",
        "com/twitter/slo/slo.json",
        "config/*.yml",
    ],
    tags = ["bazel-compatible"],
)
@@ -0,0 +1,55 @@
{
  "servers": [
    {
      "name": "strato",
      "indicators": [
        {
          "id": "success_rate_3m",
          "indicator_type": "SuccessRateIndicator",
          "duration": 3,
          "duration_unit": "MINUTES"
        }, {
          "id": "latency_3m_p99",
          "indicator_type": "LatencyIndicator",
          "duration": 3,
          "duration_unit": "MINUTES",
          "percentile": 0.99
        }
      ],
      "objectives": [
        {
          "indicator": "success_rate_3m",
          "objective_type": "SuccessRateObjective",
          "operator": ">=",
          "threshold": 0.995
        },
        {
          "indicator": "latency_3m_p99",
          "objective_type": "LatencyObjective",
          "operator": "<=",
          "threshold": 50
        }
      ],
      "long_term_objectives": [
        {
          "id": "success_rate_28_days",
          "objective_type": "SuccessRateObjective",
          "operator": ">=",
          "threshold": 0.993,
          "duration": 28,
          "duration_unit": "DAYS"
        },
        {
          "id": "latency_p99_28_days",
          "objective_type": "LatencyObjective",
          "operator": "<=",
          "threshold": 60,
          "duration": 28,
          "duration_unit": "DAYS",
          "percentile": 0.99
        }
      ]
    }
  ],
  "@version": 1
}
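
To make the `objectives` entries in the SLO file concrete, here is a minimal sketch of how such a definition could be checked against measured indicator values. The `evaluate_objectives` function, the `measured` values, and the reading of the latency threshold as milliseconds are all assumptions for illustration; the file itself does not state units, and the real SLO tooling is not part of this commit.

```python
import operator

# Comparison operators that appear in the "operator" field of the SLO file.
OPERATORS = {">=": operator.ge, "<=": operator.le}


def evaluate_objectives(server: dict, measured: dict[str, float]) -> dict[str, bool]:
    """Return, per indicator id, whether the measured value satisfies its objective."""
    results = {}
    for objective in server["objectives"]:
        indicator_id = objective["indicator"]
        compare = OPERATORS[objective["operator"]]
        results[indicator_id] = compare(measured[indicator_id], objective["threshold"])
    return results


# Short-term objectives copied from the "strato" server entry.
server = {
    "name": "strato",
    "objectives": [
        {"indicator": "success_rate_3m", "operator": ">=", "threshold": 0.995},
        {"indicator": "latency_3m_p99", "operator": "<=", "threshold": 50},
    ],
}

# Hypothetical measurements over the last 3 minutes (latency assumed to be in ms).
measured = {"success_rate_3m": 0.997, "latency_3m_p99": 62.0}
status = evaluate_objectives(server, measured)
```

With these made-up numbers the success-rate objective is met and the p99 latency objective is not, since 62 exceeds the threshold of 50.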
@@ -0,0 +1,155 @@
enableLogFavBasedApeEntity20M145KUpdatedEmbeddingCachedStore:
  comment: "Enable to use the non-empty store for logFavBasedApeEntity20M145KUpdatedEmbeddingCachedStore (from 0% to 100%). 0 means use EMPTY readable store for all requests."
  default_availability: 0

enableLogFavBasedApeEntity20M145K2020EmbeddingCachedStore:
  comment: "Enable to use the non-empty store for logFavBasedApeEntity20M145K2020EmbeddingCachedStore (from 0% to 100%). 0 means use EMPTY readable store for all requests."
  default_availability: 0

representation-scorer_forward_dark_traffic:
  comment: "Defines the percentage of traffic to forward to diffy-proxy. Set to 0 to disable dark traffic forwarding"
  default_availability: 0

"representation-scorer_load_shed_non_prod_callers":
  comment: "Discard traffic from all non-prod callers"
  default_availability: 0

enable_log_fav_based_tweet_embedding_20m145k2020_timeouts:
  comment: "If enabled, set a timeout on calls to the logFavBased20M145K2020TweetEmbeddingStore"
  default_availability: 0

log_fav_based_tweet_embedding_20m145k2020_timeout_value_millis:
  comment: "The value of this decider defines the timeout (in milliseconds) to use on calls to the logFavBased20M145K2020TweetEmbeddingStore, i.e. 1.50% is 150ms. Only applied if enable_log_fav_based_tweet_embedding_20m145k2020_timeouts is true"
  default_availability: 2000

enable_log_fav_based_tweet_embedding_20m145kUpdated_timeouts:
  comment: "If enabled, set a timeout on calls to the logFavBased20M145KUpdatedTweetEmbeddingStore"
  default_availability: 0

log_fav_based_tweet_embedding_20m145kUpdated_timeout_value_millis:
  comment: "The value of this decider defines the timeout (in milliseconds) to use on calls to the logFavBased20M145KUpdatedTweetEmbeddingStore, i.e. 1.50% is 150ms. Only applied if enable_log_fav_based_tweet_embedding_20m145kUpdated_timeouts is true"
  default_availability: 2000

enable_cluster_tweet_index_store_timeouts:
  comment: "If enabled, set a timeout on calls to the ClusterTweetIndexStore"
  default_availability: 0

cluster_tweet_index_store_timeout_value_millis:
  comment: "The value of this decider defines the timeout (in milliseconds) to use on calls to the ClusterTweetIndexStore, i.e. 1.50% is 150ms. Only applied if enable_cluster_tweet_index_store_timeouts is true"
  default_availability: 2000

representation_scorer_fetch_signal_share:
  comment: "If enabled, fetches share signals from USS"
  default_availability: 0

representation_scorer_fetch_signal_reply:
  comment: "If enabled, fetches reply signals from USS"
  default_availability: 0

representation_scorer_fetch_signal_original_tweet:
  comment: "If enabled, fetches original tweet signals from USS"
  default_availability: 0

representation_scorer_fetch_signal_video_playback:
  comment: "If enabled, fetches video playback signals from USS"
  default_availability: 0

representation_scorer_fetch_signal_block:
  comment: "If enabled, fetches account block signals from USS"
  default_availability: 0

representation_scorer_fetch_signal_mute:
  comment: "If enabled, fetches account mute signals from USS"
  default_availability: 0

representation_scorer_fetch_signal_report:
  comment: "If enabled, fetches tweet report signals from USS"
  default_availability: 0

representation_scorer_fetch_signal_dont_like:
  comment: "If enabled, fetches tweet don't like signals from USS"
  default_availability: 0

representation_scorer_fetch_signal_see_fewer:
  comment: "If enabled, fetches tweet see fewer signals from USS"
  default_availability: 0

# To create a new decider, add here with the same format and caller's details : "representation-scorer_load_shed_by_caller_id_twtr:{{role}}:{{name}}:{{environment}}:{{cluster}}"
# All the deciders below are generated by this script - ./strato/bin/fed deciders ./ --service-role=representation-scorer --service-name=representation-scorer
# If you need to run the script and paste the output, add only the prod deciders here. Non-prod ones are being taken care of by representation-scorer_load_shed_non_prod_callers

"representation-scorer_load_shed_by_caller_id_all":
  comment: "Reject all traffic from caller id: all"
  default_availability: 0

"representation-scorer_load_shed_by_caller_id_twtr:svc:frigate:frigate-pushservice-canary:prod:atla":
  comment: "Reject all traffic from caller id: twtr:svc:frigate:frigate-pushservice-canary:prod:atla"
  default_availability: 0

"representation-scorer_load_shed_by_caller_id_twtr:svc:frigate:frigate-pushservice-canary:prod:pdxa":
  comment: "Reject all traffic from caller id: twtr:svc:frigate:frigate-pushservice-canary:prod:pdxa"
  default_availability: 0

"representation-scorer_load_shed_by_caller_id_twtr:svc:frigate:frigate-pushservice-send:prod:atla":
  comment: "Reject all traffic from caller id: twtr:svc:frigate:frigate-pushservice-send:prod:atla"
  default_availability: 0

"representation-scorer_load_shed_by_caller_id_twtr:svc:frigate:frigate-pushservice:prod:atla":
  comment: "Reject all traffic from caller id: twtr:svc:frigate:frigate-pushservice:prod:atla"
  default_availability: 0

"representation-scorer_load_shed_by_caller_id_twtr:svc:frigate:frigate-pushservice:prod:pdxa":
  comment: "Reject all traffic from caller id: twtr:svc:frigate:frigate-pushservice:prod:pdxa"
  default_availability: 0

"representation-scorer_load_shed_by_caller_id_twtr:svc:frigate:frigate-pushservice:staging:atla":
  comment: "Reject all traffic from caller id: twtr:svc:frigate:frigate-pushservice:staging:atla"
  default_availability: 0

"representation-scorer_load_shed_by_caller_id_twtr:svc:frigate:frigate-pushservice:staging:pdxa":
  comment: "Reject all traffic from caller id: twtr:svc:frigate:frigate-pushservice:staging:pdxa"
  default_availability: 0

"representation-scorer_load_shed_by_caller_id_twtr:svc:home-scorer:home-scorer:prod:atla":
  comment: "Reject all traffic from caller id: twtr:svc:home-scorer:home-scorer:prod:atla"
  default_availability: 0

"representation-scorer_load_shed_by_caller_id_twtr:svc:home-scorer:home-scorer:prod:pdxa":
  comment: "Reject all traffic from caller id: twtr:svc:home-scorer:home-scorer:prod:pdxa"
  default_availability: 0

"representation-scorer_load_shed_by_caller_id_twtr:svc:stratostore:stratoapi:prod:atla":
  comment: "Reject all traffic from caller id: twtr:svc:stratostore:stratoapi:prod:atla"
  default_availability: 0

"representation-scorer_load_shed_by_caller_id_twtr:svc:stratostore:stratoserver:prod:atla":
  comment: "Reject all traffic from caller id: twtr:svc:stratostore:stratoserver:prod:atla"
  default_availability: 0

"representation-scorer_load_shed_by_caller_id_twtr:svc:stratostore:stratoserver:prod:pdxa":
  comment: "Reject all traffic from caller id: twtr:svc:stratostore:stratoserver:prod:pdxa"
  default_availability: 0

"representation-scorer_load_shed_by_caller_id_twtr:svc:timelinescorer:timelinescorer:prod:atla":
  comment: "Reject all traffic from caller id: twtr:svc:timelinescorer:timelinescorer:prod:atla"
  default_availability: 0

"representation-scorer_load_shed_by_caller_id_twtr:svc:timelinescorer:timelinescorer:prod:pdxa":
  comment: "Reject all traffic from caller id: twtr:svc:timelinescorer:timelinescorer:prod:pdxa"
  default_availability: 0

"representation-scorer_load_shed_by_caller_id_twtr:svc:topic-social-proof:topic-social-proof:prod:atla":
  comment: "Reject all traffic from caller id: twtr:svc:topic-social-proof:topic-social-proof:prod:atla"
  default_availability: 0

"representation-scorer_load_shed_by_caller_id_twtr:svc:topic-social-proof:topic-social-proof:prod:pdxa":
  comment: "Reject all traffic from caller id: twtr:svc:topic-social-proof:topic-social-proof:prod:pdxa"
  default_availability: 0

"enable_sim_clusters_embedding_store_timeouts":
  comment: "If enabled, set a timeout on calls to the SimClustersEmbeddingStore"
  default_availability: 10000

sim_clusters_embedding_store_timeout_value_millis:
  comment: "The value of this decider defines the timeout (in milliseconds) to use on calls to the SimClustersEmbeddingStore, i.e. 1.50% is 150ms. Only applied if enable_sim_clusters_embedding_store_timeouts is true"
  default_availability: 2000
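
The decider comments suggest two ways a `default_availability` value is read: as a rollout percentage ("from 0% to 100%") and, for the `*_timeout_value_millis` entries, as a raw millisecond count ("1.50% is 150ms"). The sketch below illustrates both readings. It assumes availability is an integer on a 0 to 10000 scale (so 10000 means 100%), which is inferred from those comments rather than confirmed by this commit; the helper names are hypothetical and do not come from the decider library.

```python
import random

# Assumed scale: availability is stored in hundredths of a percent, 10000 == 100.00%.
MAX_AVAILABILITY = 10_000


def is_enabled(availability: int, rng: random.Random) -> bool:
    """Gate one request: pass with probability availability / MAX_AVAILABILITY."""
    return rng.randrange(MAX_AVAILABILITY) < availability


def as_percent(availability: int) -> float:
    """Convert the raw value to the percentage shown in the decider comments."""
    return availability / 100.0


def timeout_millis(availability: int) -> int:
    """For *_timeout_value_millis deciders, the raw value is used directly as milliseconds."""
    return availability


rng = random.Random(0)
always_on = all(is_enabled(10_000, rng) for _ in range(1000))  # availability 10000
always_off = not any(is_enabled(0, rng) for _ in range(1000))  # availability 0
```

Under this reading, a raw value of 150 is displayed as 1.50% and means a 150 ms timeout, and the default of 2000 corresponds to a 2000 ms timeout.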
