-
Notifications
You must be signed in to change notification settings - Fork 2.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support collect and display the heartbeat thread status #17473
base: master-2.x
Are you sure you want to change the base?
Conversation
Thanks for the work! Would you mind fixing the tests? |
78f0c0d
to
fb2e430
Compare
c2929a1
to
4d6c067
Compare
4d6c067
to
499db76
Compare
pending review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a few comments, PTAL thanks!
core/common/src/main/java/alluxio/heartbeat/HeartbeatThread.java
Outdated
Show resolved
Hide resolved
core/common/src/main/java/alluxio/heartbeat/HeartbeatThread.java
Outdated
Show resolved
Hide resolved
mExecutor.heartbeat(limitTime); | ||
long endHeartbeatTime = System.currentTimeMillis(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
mStartTickTime = 0L; | ||
mStartHeartbeatTime = 0L; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why reset to 0 every time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
0 stand for that it is not start to heartbeat this period.
List<HeartbeatThread> list = HEARTBEAT_THREAD_INDEX_MAP.get(key); | ||
if (list == null) { | ||
list = new LinkedList<>(); | ||
HEARTBEAT_THREAD_INDEX_MAP.put(key, list); | ||
} | ||
HEARTBEAT_THREAD_MAP.put(heartbeatThread.getThreadName(), heartbeatThread); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
map.computeIfAbsent(key, (k) -> new ArrayList<>());
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
private static final Map<String, HeartbeatThread> HEARTBEAT_THREAD_MAP | ||
= new ConcurrentHashMap<>(); | ||
private static final Map<Object, List<HeartbeatThread>> HEARTBEAT_THREAD_INDEX_MAP | ||
= new ConcurrentHashMap<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you use synchronized methods so no need for ConcurrentHashMap here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
} | ||
|
||
@Override | ||
public boolean equals(Object o) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where do you need equals() and hashCode()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I am lazy copy the style from existing code, I just reference the file under alluxio/core/common/src/main/java/alluxio/wire
, like AlluxioWorkerInfo
BlockInfo
and LogInfo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think you can remove equals
and hashCode
mExecutor.heartbeat(limitTime); | ||
long endHeartbeatTime = System.currentTimeMillis(); | ||
mPreviousReport = String.format("#%d [%s - %s - %s] ticked(s) %d, run(s) %d.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't create report on run(), you may update the states but generate reports on demand
@jiacheliu3 Thanks for your review, I've addressed your comments, PTAL~ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the work! I added some comments for discussion.
mExecutor.heartbeat(limitTime); | ||
mEndHeartbeatTime = CommonUtils.getCurrentMs(); | ||
updateState(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
every tick you are resetting to zero? looks like a bug?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, it looks need a structure to store the previous and current state
} | ||
|
||
@Override | ||
public boolean equals(Object o) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think you can remove equals
and hashCode
public final class WebUIHeartbeatThreads implements Serializable { | ||
private static final long serialVersionUID = -2903043308252679410L; | ||
|
||
private boolean mDebug; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this useful?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove it
return RestUtils.call(() -> { | ||
WebUIHeartbeatThreads response = new WebUIHeartbeatThreads(); | ||
|
||
response.setDebug(Configuration.getBoolean(PropertyKey.DEBUG)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will the output be different if you set PropertyKey.DEBUG
? I dont think this is relevant?
*/ | ||
public static Future<?> submit(ExecutorService executorService, | ||
HeartbeatThread heartbeatThread) { | ||
HeartbeatThreadManager.register(executorService, heartbeatThread); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't think it's a good idea to use a thread pool as key...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, what can be the better key? Thread name?
* @return the heartbeat threads info | ||
*/ | ||
public static synchronized Map<String, HeartbeatThreadInfo> getHeartbeatThreads() { | ||
SortedMap<String, HeartbeatThreadInfo> heartbeatThreads = new TreeMap<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
double check if the thread is still alive, and make sure this ref is removed if the thread is already dead
What changes are proposed in this pull request?
Support collect and display the heartbeat thread status
Why are the changes needed?
With this feature, we can insight the background heartbeat thread status and detect the issue before encountered an accident.
Does this PR introduce any user facing changes?
http://localhost:19999/api/v1/master/webui_heartbeat_threads
http://localhost:30000/api/v1/worker/webui_heartbeat_threads
http://localhost:20002/api/v1/job_master/webui_heartbeat_threads
http://localhost:30003/api/v1/job_worker/webui_heartbeat_threads
Base on this, we can create a web ui form for this heartbeat thread info.