What version of the product are you using? On what operating system?
PFAC 1.0, on RHEL 6
Please provide any additional information below.
I measured the time it takes for PFAC_matchFromHostReduce and the equivalent
steps when using PFAC_matchFromDeviceReduce. Both functions take about the same
time to complete when the size of the input string is 100MB.
Timing for PFAC_matchFromHostReduce: 56 ms
Timing for equivalent steps using PFAC_matchFromDeviceReduce:
cudaMalloc: 0.3 ms
cudaMemcpy(d_input_string, h_input_string, input_size, cudaMemcpyHostToDevice): 18 ms
PFAC_matchFromDeviceReduce: 26 ms
cudaMemcpy of d_pos and d_match_result back to CPU: 0.3 ms
cudaFree of d_input_string, d_pos and d_match_result: 11 ms
Total: 57 ms
Original issue reported on code.google.com by [email protected] on 29 Apr 2011 at 1:49
PFAC_matchFromHostReduce() has to allocate and free its working space
(d_input_string, d_matched_result and d_pos) on every call:
[code]
// allocate working space on the device
cudaMalloc((void **) &d_input_string, n_hat*sizeof(int) );
cudaMalloc((void **) &d_matched_result, input_size*sizeof(int) );
cudaMalloc((void **) &d_pos, input_size*sizeof(int) );
// copy the input string to the device
cudaMemcpy(d_input_string, h_input_string, input_size, cudaMemcpyHostToDevice);
// ... same steps as PFAC_matchFromDeviceReduce() ...
// copy the matched positions and results back to the host
cudaMemcpy(h_pos, d_pos, (*h_num_matched)*sizeof(int), cudaMemcpyDeviceToHost);
cudaMemcpy(h_match_result, d_match_result_zip, (*h_num_matched)*sizeof(int), cudaMemcpyDeviceToHost);
// free working space on the device
cudaFree(d_input_string);
cudaFree(d_matched_result);
cudaFree(d_pos);
[/code]
In my tests, cudaFree() takes 12 ms for a 100 MB input string and 24 ms for a
200 MB input string.
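If you are not a beginner, I suggest using PFAC_matchFromDeviceReduce(), which
lets you manage the device buffers yourself instead of allocating and freeing
them on every call.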
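To make the trade-off concrete, here is a minimal sketch in plain C that plugs the step timings reported above (100 MB input) into a simple cost model. The struct `step_timings` and the function `amortized_ms_per_call` are hypothetical names introduced only for this illustration and are not part of PFAC. The model assumes the device buffers are allocated once, reused for every match, and freed once, and that each step costs the same on every call.

```c
#include <assert.h>

/* Per-step timings in milliseconds, taken from the report for a 100 MB input.
 * This struct is a hypothetical helper, not a PFAC type. */
typedef struct {
    double malloc_ms;  /* cudaMalloc of the working buffers */
    double h2d_ms;     /* cudaMemcpy of the input string, host to device */
    double match_ms;   /* PFAC_matchFromDeviceReduce */
    double d2h_ms;     /* cudaMemcpy of the results, device to host */
    double free_ms;    /* cudaFree of the working buffers */
} step_timings;

/* Average cost per match when the buffers are allocated once, reused for
 * `calls` matches, and freed once (assumption: timings do not vary). */
double amortized_ms_per_call(step_timings t, int calls) {
    assert(calls > 0);
    double every_call = t.h2d_ms + t.match_ms + t.d2h_ms;  /* paid each time */
    double one_time   = t.malloc_ms + t.free_ms;           /* paid once */
    return every_call + one_time / calls;
}
```

With the reported timings `{0.3, 18.0, 26.0, 0.3, 11.0}`, a single call costs the sum of all listed steps (about 55.6 ms), while over 100 calls the average approaches the 44.3 ms spent on the two copies and the match itself. Under these assumptions, managing the buffers yourself only pays off when the same buffers serve several matches.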