
added 'Human Activity Understanding' page
fasiddiky committed Mar 8, 2024
1 parent b3510f6 commit 3b764dc
Showing 1 changed file with 72 additions and 0 deletions.
content/page/softwaretools/HumanActivityUnderstanding/index.md (72 additions, 0 deletions)
@@ -0,0 +1,72 @@
---
title: "Video-Enhanced Human Activity Understanding"
date: 2023-11-03T10:35:35-05:00
subtitle: ""
tags: ["Subsystem"]
dropCap: false
displayInMenu: false
displayInList: true
draft: false
---

In the field of robotics and AI, teaching robots human-like daily activities, especially handling
and manipulating objects, is challenging due to the variety of tasks and conditions. This software
platform, "Video-Enhanced Human Activity Understanding," harnesses advanced machine learning algorithms
alongside MetaHuman avatars within Unreal Engine simulations. This combination enables robots to grasp
and execute tasks by interpreting video-generated activity instructions and then simulating those
activities in a virtual world governed by physical parameters. As a component of the Physics-enabled Virtual
Demonstration (PVD) framework, the platform offers a lifelike and effective training environment for robots,
leveraging physical laws to ensure safer and more productive learning outcomes. This approach significantly
enhances robotic competence in complex activities, narrowing the gap between theoretical
learning and practical application.

<!--more-->


DeepActionObserver: Refining Instructions for Object Manipulation Actions
---

Robotic agents are tasked with learning diverse manipulation actions, a challenge compounded
by the variability in object interactions, tool usage, task contexts, and operational environments.
To address the complexity of determining how these actions should be executed, the DeepActionObserver
framework empowers robots to interpret text instructions and analyze corresponding video demonstrations.
This process generates symbolic action descriptions, enriched and clarified by video content, that align
closely with advanced cognition-enabled robotic control schemes.
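
To make the notion of a symbolic action description concrete, here is a minimal Python sketch of what
such a description could contain. The schema (fields like `verb`, `preconditions`, and `effects`) is an
illustrative assumption chosen for this sketch, not DeepActionObserver's actual output format.

```python
from dataclasses import dataclass, field

@dataclass
class SymbolicAction:
    """Illustrative container for one video-refined action description.

    Field names are hypothetical; the actual DeepActionObserver
    schema is not specified on this page.
    """
    verb: str                               # e.g. "pour", "cut"
    target_object: str                      # object being manipulated
    tool: str | None = None                 # tool used, if any
    preconditions: list[str] = field(default_factory=list)
    effects: list[str] = field(default_factory=list)

# A text instruction like "pour the milk into the bowl", disambiguated
# by the video (which container, where it is held), might yield:
action = SymbolicAction(
    verb="pour",
    target_object="milk_carton",
    preconditions=["grasped(milk_carton)", "above(milk_carton, bowl)"],
    effects=["contains(bowl, milk)"],
)
print(action.verb, action.effects)
```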

DeepActionObserver synergizes two advanced learning and reasoning paradigms: a Multi-Task
Network and Markov Logic Networks. The Multi-Task Network uses convolutional architectures
to recognize objects and hand positions and to predict poses and movements. Markov Logic Networks,
in turn, strengthen the framework's reasoning by using joint probabilities over instructional content
to resolve ambiguities and enrich action descriptions. Together, these complementary components
enhance the framework's overall performance.
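
As a rough illustration of the multi-task idea, one shared backbone feeding several prediction heads,
a minimal PyTorch sketch might look like the following. The backbone depth, head sizes, and exact task
set are assumptions made for brevity, not the network actually used in the framework.

```python
import torch
import torch.nn as nn

class MultiTaskNet(nn.Module):
    """Minimal sketch: a shared convolutional backbone feeding separate
    heads for object class, hand position, and pose. Layer sizes and the
    task heads are illustrative assumptions."""
    def __init__(self, num_objects: int = 20, pose_dim: int = 6):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.object_head = nn.Linear(64, num_objects)  # object recognition
        self.hand_head = nn.Linear(64, 2)              # 2-D hand position
        self.pose_head = nn.Linear(64, pose_dim)       # pose / movement

    def forward(self, frames: torch.Tensor) -> dict[str, torch.Tensor]:
        feats = self.backbone(frames)                  # shared features
        return {
            "objects": self.object_head(feats),
            "hand": self.hand_head(feats),
            "pose": self.pose_head(feats),
        }

# One 224x224 RGB video frame in, per-task predictions out.
out = MultiTaskNet()(torch.randn(1, 3, 224, 224))
print({name: t.shape for name, t in out.items()})
```

Sharing one backbone across tasks is the standard motivation for multi-task networks: the tasks
regularize each other and the per-frame feature extraction cost is paid once.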

DeepActionObserver
---

<figure class="video_container">
<iframe width="560" height="315" src="https://www.youtube.com/embed/zRAmyKp8CiY?si=Z_MC4DT_PSAHG_kA" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>
</figure>

Physics-enabled Virtual Demonstration
---

We have been developing the Physics-enabled Virtual Demonstration (PVD) framework, crafted to
enhance robotic manipulation activities. PVD functions as an instructional tool for both robots and virtual humans (MetaHumans),
using advanced machine learning within a controlled, simulated environment to teach essential principles of physics.
This virtual environment faithfully replicates real-world physics, encompassing crucial aspects such as gravity and object interactions.
The primary aim of PVD is to help robots and MetaHumans adapt and learn through practical exercises within these
highly realistic simulations, significantly improving their efficiency and safety when confronted with real-world scenarios.

Within the PVD framework, the Human Demonstration element serves as a comprehensive guide, systematically breaking
down human actions within controlled settings. It distills these actions into instructions robots can understand,
improving their grasp of actions, conditions, movements, and the forces involved. This knowledge equips robots
to plan their actions more effectively, resulting in improved outcomes and reduced errors across various tasks.
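
As a loose illustration of how a decomposed demonstration might be recorded, the sketch below models each
step with its action, condition, movement, and force. Every field name and value here is a hypothetical
example constructed for this page, not PVD's actual data format.

```python
from dataclasses import dataclass

GRAVITY = 9.81  # m/s^2, matching the real-world physics PVD replicates

@dataclass
class DemonstrationStep:
    """One decomposed step of a human demonstration.

    Fields are illustrative assumptions about what such a record could
    carry; the framework's actual representation is not given here.
    """
    action: str              # what the demonstrator does
    condition: str           # precondition observed before the step
    movement: str            # motion description
    applied_force_n: float   # force involved, in newtons

demo = [
    DemonstrationStep("grasp", "cup_reachable", "reach_and_close_fingers", 5.0),
    # lifting a 0.3 kg cup: counter its weight plus keep the grip (illustrative)
    DemonstrationStep("lift", "cup_grasped", "move_up_0.2m", 0.3 * GRAVITY + 5.0),
]
for step in demo:
    print(f"{step.action}: requires {step.condition}, ~{step.applied_force_n:.1f} N")
```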

Virtual Demonstrations through Human Manipulation Observation
---

<figure class="video_container">

<iframe width="560" height="315" src="https://www.youtube.com/embed/pATzTwBOfUs?si=SV3J_niVKi9RPRXv" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>

</figure>
