# Finalize feluda operator system requirement #29
@aatmanvaidya @duggalsu

---

I had an issue setting up feluda on my machine. Flagging this as something that might trip us up: Python 3.9.18 on Ubuntu 20.04.2 LTS, using the c35079 …

---

This shouldn't have happened, but it is happening because we have not upgraded boto3 (and other packages) to the latest versions in feluda core. So there are dependency mismatches when generating …
---

A note on trying to push the operator to its limit.

---

A neat thing: I eventually got the operator to run on this 1-hour video without running into an out-of-memory error! I figured the cause of the out-of-memory error was this:

```python
# v is an OpenCV VideoCapture; Image is PIL.Image
def extract_frames(self, v):
    # print("extracting frames")
    images = []
    for i in range(self.n_frames):
        success, image = v.read()
        if image is None:
            continue
        else:
            if i % self.sampling_rate == 0:
                images.append(Image.fromarray(image))
    # print("extracted frames")
    return images
```

Every sampled frame of the video is appended to the `images` list, so memory grows with the length of the video. I tried a rudimentary trick to convert this into a generator and return 100 frames at a time:

```python
def extract_frames(self, v):
    # print("extracting frames")
    # process the video in chunks of 100 frames so memory stays bounded
    for _ in range(0, self.n_frames, 100):
        images = []
        # the original inner loop reused i, shadowing the chunk index;
        # renamed to j for clarity
        for j in range(100):
            success, image = v.read()
            if image is None:
                continue
            else:
                if j % self.sampling_rate == 0:
                    images.append(Image.fromarray(image))
        yield images
    # print("extracted frames")
```

and the corresponding change in the `analyze` method:

```python
def analyze(self, video):
    # print("analyzing video")
    for frames in self.extract_frames(video):
        feature_matrix = self.extract_features(frames)
        # proof of concept: each chunk overwrites the previous chunk's
        # keyframe indices and features
        self.keyframe_indices = self.find_keyframes(feature_matrix)
        self.keyframe_features = feature_matrix[:, self.keyframe_indices]
    # print("analysed video")
```

Result: the function took 625.5453 seconds to run.
---

Current status: we know RAM usage depends on the length of the video file. Given my proof of concept above, it looks like we can process long files by chunking the processing of frames, which gives us a decent upper limit on RAM consumption.

For the next milestone our priority is to support processing of video files that are a few minutes long, and right now we don't want to support really long files anyway, so we can assume the file length, and hence the RAM usage, to be bounded.

In today's call Aurora mentioned that, looking at the code we use for inference, trying out a GPU won't be worth it either. So we are parking all GPU-related tests for later as well. This leaves us with compute-optimized EC2s as the category of instances to try.

One more thing we can check: since our cores and memory aren't used at full capacity, Kubernetes should be able to schedule multiple pods on the same node, getting us more value for money from every node we provision.
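A rough back-of-envelope for why chunking bounds RAM; all numbers below are illustrative assumptions, not measurements:

```python
# Illustrative upper bound on raw frames held per chunk (assumed numbers).
frame_bytes = 1920 * 1080 * 3   # one uncompressed 1080p RGB frame, ~6.2 MB
chunk_size = 100                # frames read per generator chunk
sampling_rate = 10              # assumed; the real value may differ
frames_kept = chunk_size // sampling_rate
print(f"~{frames_kept * frame_bytes / 1e6:.0f} MB of sampled frames per chunk")
# This bound is independent of total video length, since each chunk's
# frames are released before the next chunk is read.
```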
---

Documentation on memory and CPU profiling is here: https://github.com/tattle-made/feluda/wiki/Optimization

I've selected some EC2 instances for the first round of tests. I've included the hourly and daily cost because we might scale the nodes up and down, and might not need a large node to stay on throughout.
---

@aatmanvaidya @duggalsu When we deploy the container to Kubernetes, we can specify the command it should run when launched. For the sake of this test I was thinking we can create scripts inside the container. So our benchmark scripts could be something like this:

`script1.sh`

```sh
python test.py
tail -f /dev/null
```

`script2.sh`

```sh
python3 -m memray run -o vid_vec_rep_resnet.bin vid_vec_rep_resnet.py
tail -f /dev/null
```

So let's create appropriate scripts like these. Then we can deploy the container, change the command that's executed on container start, and run these tests in the cluster.
---

Sharing the Kubernetes deployment file for reference. We'll simply change the replica count and command to run different containers.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: feluda-operator-vidvec
  labels:
    app.kubernetes.io/name: feluda-operator-vidvec
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: feluda-operator-vidvec
  template:
    metadata:
      labels:
        app.kubernetes.io/name: feluda-operator-vidvec
    spec:
      containers:
        - name: feluda-operator-vidvec
          image: tattletech/feluda-operator-vid-vec:f6bb56c
          imagePullPolicy: Always
          command: ["python"]
          args: ["test.py"]
          # per-pod resource requests and limits
          resources:
            requests:
              cpu: "1000m"
              memory: "4000Mi"
            limits:
              cpu: "4000m"
              memory: "8000Mi"
```
---

We'll rely on GitHub Actions to push new Docker images of our operators to Docker Hub. Reference implementation: https://github.com/tattle-made/feluda/blob/9f425587f93e02005554b496c059144c90e19f74/.github/workflows/prod-deploy.yml#L44-L50
---

Workflow:

We are charged hourly for EC2 instance usage, so once an instance is spun up we have no reason to shut it down immediately. We can run a few tests in one go within that hour and learn all we need before shutting it down. We then repeat these steps for every EC2 instance we care to test on.
---

A trivial bit of feedback on using …

I also notice that the largest thing in the Docker image is the torch library, at around 800 MB. It doesn't seem like there's much we can do to reduce it. What is your opinion?
Further optimized Dockerfiles
---

Scenario planning:

Goal: offer an acceptable response time (let's assume < 5 minutes for now) for every possible scenario.

Scenarios:
---

Question to focus on: Why do we get slow performance on multicore Intel machines (the c7i* family) when we increase the number of pod replicas (containers), especially when cores > 4?
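One angle worth checking, purely a hypothesis on my part and not a confirmed cause: PyTorch defaults to using all visible cores per process, so several replicas on one node can over-subscribe threads and contend with each other. A sketch of capping threads per pod:

```python
import torch

# Hypothesis, not a confirmed cause: cap intra-op threads so that multiple
# replicas on one node don't all try to use every core at once.
torch.set_num_threads(2)           # e.g. roughly match the pod's CPU request
torch.set_num_interop_threads(1)   # must be set before inter-op work starts
print(torch.get_num_threads())     # verify the cap took effect
```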