<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<meta name="generator" content="jemdoc, see http://jemdoc.jaboc.net/" />
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
<link rel="stylesheet" href="jemdoc.css" type="text/css" />
<title>Yiping Wang 王宜平</title>
</head>
<body>
<table summary="Table for page layout." id="tlayout">
<tr valign="top">
<td id="layout-menu">
<div class="menu-category">Yiping Wang</div>
<div class="menu-item"><a href="index.html" class="current">Home</a></div>
<div class="menu-item"><a href="pub.html">Publications</a></div>
<div class="menu-item"><a href="miscellaneous.html">Miscellaneous</a></div>
<div class="menu-item"><a href="fun.html">Fun</a></div>
<div class="menu-item"><a href="CV_YipingWang_phd.pdf">CV</a></div>
</td>
<td id="layout-content">
<div id="toptitle">
<h1>Yiping Wang 王宜平</h1>
</div>
<table class="imgtable"><tr><td>
<!-- <img src="photos/sunshine2.png" alt="alt text" width="190px" height="240px" /> </td> -->
<img src="photos/bio_01_25.jpg" alt="alt text" width="240px" height="320px" /> </td>
<td align="left"><p>Yiping Wang<br />
Ph.D. student<br /> <a href="https://www.cs.washington.edu/">Paul G. Allen School of Computer Science &amp; Engineering</a>, <br />
<a href="https://www.washington.edu/">University of Washington</a><br />
Email: [email protected] <br /><br />
<a href="https://scholar.google.com/citations?user=IuMFxFUAAAAJ&hl=en&oi=ao">Google Scholar</a> / <a href="https://twitter.com/ypwang61">Twitter</a> / <a href="https://github.com/ypwang61">Github</a> / <a href="https://www.linkedin.com/in/yiping-wang-323647294/">LinkedIn</a><br /></p>
</td></tr></table>
<h2>About me</h2>
<p>I'm a second-year Ph.D. student at the Paul G. Allen School of Computer Science &amp; Engineering, University of Washington.
I feel very fortunate to have worked under the guidance of <a href="https://simonshaoleidu.com/index.html">Prof. Simon Shaolei Du</a> since the summer of 2022.</p>
<p>My research interests spread broadly across <b>machine learning theory</b> and <b>foundation models</b>.
On the theoretical side, I care about understanding the foundations of deep learning and representation learning, especially the <b>training dynamics of</b> basic components like the <b>Transformer</b>.
On the empirical side, I am keen on developing efficient algorithms backed by strong theoretical guarantees or insightful observations. Currently, I'm working on <b>data selection/scheduling for multimodal pretraining</b> and on improving the inference efficiency of LLMs. I'm also working on projects related to video generation.
In addition, I have always held a strong enthusiasm for understanding the essence of intelligence and exploring the intersections of mathematics, physics, and AGI, such as using LLMs for mathematical proof and seeking scientific truth.</p>
<p>I'm grateful to all my collaborators and mentors along the way.
I'm privileged to have been working closely with <a href="http://yuandong-tian.com/">Dr. Yuandong Tian</a> since spring 2023.
I have also been interning at Microsoft since June 2024, fortunate to be advised by <a href="https://scholar.google.com/citations?user=S6OFEFEAAAAJ">Yelong Shen</a> and <a href="https://sites.google.com/site/shuohangsite/">Shuohang Wang</a>.
During my undergraduate studies, I was fortunate to work closely with <a href="https://www.huaxiuyao.io/">Prof. Huaxiu Yao</a> and <a href="https://linjunz.github.io/">Prof. Linjun Zhang</a>.</p>
<p>Previously, I studied Computer Science and Mathematics at <a href="https://www.zju.edu.cn/english/">Zhejiang University</a>, where I received an honors degree from <a href="http://ckc.zju.edu.cn/ckcen/_t1906/main.psp">Chu Kochen Honors College</a>.</p>
<h2>News</h2>
<ul>
<li><p>
02/2025: One paper (<a href="https://arxiv.org/abs/2412.16211">StoryEval</a>) was accepted to CVPR 2025!
</p></li>
<li><p>
12/2024: Released a new video generation benchmark, <a href="https://ypwang61.github.io/project/StoryEval/">StoryEval</a>!
</p></li>
<li><p>
12/2024: Attending NeurIPS 2024 in Vancouver and presenting our <a href="https://arxiv.org/abs/2405.19547">CLIPLoss</a> paper!
</p></li>
<li><p>
09/2024: Attending MoDL 2024 in New York sponsored by Simons Foundation, and presenting our <a href="https://arxiv.org/abs/2405.19547">CLIPLoss</a> poster!
</p></li>
<li><p>
09/2024: Our <a href="https://arxiv.org/abs/2405.19547">CLIPLoss</a> paper was accepted to NeurIPS 2024 as a spotlight!
</p></li>
<li><p>
06/2024: Started my internship at Microsoft!
</p></li>
<li><p>
01/2024: One paper (<a href="https://arxiv.org/abs/2310.00535">JoMA</a>) was accepted to ICLR 2024!
</p></li>
<li><p>
12/2023: Attended NeurIPS 2023 in New Orleans!
</p></li>
<li><p>
09/2023: One paper (<a href="https://arxiv.org/abs/2305.16380">Scan&amp;Snap</a>) was accepted to NeurIPS 2023!
</p></li>
<li><p>
09/2023: Became a Husky at UW!
</p></li>
</ul>
<!-- <h2>My Favourite Papers</h2> -->
<h2>Research directions and Selected Papers</h2>
<!-- <p><span class="preserve-space">(* denotes equal contribution or alphabetic ordering.)</span> <br /><br /></p> -->
<br>
<p><span class="topic-head">
Data Selection Algorithm
</span></p>
<p><div class="boxed">
We studied how to efficiently select data for multimodal pretraining tasks, drawing inspiration from both empirical observations and theoretical insights.
</p>
<table class="imgtable"><tr><td>
<img src="photos/negcliploss.png" alt="alt text" width="300px" height="120px" /> </td>
<td align="left"><p><a href="https://arxiv.org/abs/2405.19547">
CLIPLoss and Norm-Based Data Selection Methods for Multimodal Contrastive Learning
</a>
<br>
<b>Yiping Wang</b>*, Yifang Chen*, Wendan Yan, Alex Fang, Wenjing Zhou, Kevin Jamieson, Simon Shaolei Du
<br>
<i> NeurIPS 2024 (<font color="red">Spotlight</font>)</i>
<br>
<a href="https://arxiv.org/abs/2405.19547" style="color: #666666">[Arxiv]</a>
<a href="https://github.com/ypwang61/negCLIPLoss_NormSim" style="color: #666666">[Code]</a>
<a href="./pdfs/Poster_negCLIPLoss_NormSim.pdf" style="color: #666666">[Poster]</a>
<a href="https://twitter.com/ypwang61/status/1798396572516151612" style="color: #666666">[Twitter]</a>
<a href="https://arxiv.org/abs/2402.02055" style="color: #666666">[Previous Versions]</a>
<br><br>
<!-- tl;dr: We design universal data selection methods for CLIP pretraining and achieve near SOTA results with less than 10% of preprocessing resources. It can obtain a new SOTA in <a href="https://www.datacomp.ai/dcclip/leaderboard.html">DataComp benchmark</a> when combined with other approaches.</p> -->
tl;dr: We design simple but efficient data selection methods for CLIP pretraining and achieve a new SOTA on the <a href="https://www.datacomp.ai/dcclip/leaderboard.html">DataComp benchmark</a>.</p>
</td></tr></table>
<!-- <table class="imgtable"><tr><td>
<img src="photos/L1_A_MTRL.png" alt="alt text" width="400px" height="140px" /> </td>
<td align="left"><p><b><a href="https://arxiv.org/abs/2306.02556">
Improved Active Multi-Task Representation Learning via Lasso
</a></b> <span class="preserve-space"> </span>
<a href="https://arxiv.org/abs/2306.02556">[Arxiv]</a> <br />
<b>Yiping Wang</b>, Yifang Chen, Kevin Jamieson, Simon S. Du <br />
📍<i>ICML 2023</i> <br /><br />
tl;dr: We improve the sample complexity of active multi-task representation learning by proposing a new LASSO-based strategy.</p>
</td></tr></table> -->
<p></div></p>
<br>
<p><span class="topic-head">Video Generation Evaluation</span></p>
<p><div class="boxed">
We examine common issues in today's top video generative models.
</p>
<table class="imgtable"><tr><td>
<img src="photos/storyeval.gif" alt="alt text" width="300px" height="180px" /> </td>
<td align="left"><p><a href="https://arxiv.org/abs/2405.19547">
Is Your World Simulator a Good Story Presenter? A Consecutive Events-Based Benchmark for Future Long Video Generation
</a>
<br>
<b>Yiping Wang</b>, Xuehai He, Kuan Wang, Luyao Ma, Jianwei Yang, Shuohang Wang, Simon Shaolei Du, Yelong Shen
<br>
<i>CVPR 2025</i>
<br>
<a href="https://arxiv.org/abs/2412.16211" style="color: #666666">[Arxiv]</a>
<a href="https://github.com/ypwang61/StoryEval" style="color: #666666">[Code]</a>
<a href="./pdfs/poster_storyEval_final.pdf", style="color: #666666">[Poster]</a>
<a href="https://x.com/ypwang61/status/1877079012742144276" style="color: #666666">[Twitter]</a>
<a href="https://ypwang61.github.io/project/StoryEval/" style="color: #666666">[Website]</a>
<br><br>
<!-- tl;dr: We design universal data selection methods for CLIP pretraining and achieve near SOTA results with less than 10% of preprocessing resources. It can obtain a new SOTA in <a href="https://www.datacomp.ai/dcclip/leaderboard.html">DataComp benchmark</a> when combined with other approaches.</p> -->
tl;dr: Current top video generative models cannot present multi-event stories like "How to Put an Elephant in a Refrigerator".
</td></tr></table>
<p></div></p>
<br>
<p><span class="topic-head">
Theory of Transformer Dynamics
</span></p>
<p><div class="boxed">
We analyze the training dynamics of transformers mathematically.<br /></p>
<table class="imgtable"><tr><td>
<img src="photos/scan.png" alt="alt text" width="300px" height="120px" /> </td>
<td align="left"><p><a href="https://arxiv.org/abs/2305.16380">
Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer
</a>
<br>
Yuandong Tian, <b>Yiping Wang</b>, Beidi Chen, Simon Shaolei Du
<br>
<i>NeurIPS 2023</i>
(<font color="red">Oral presentation</font> @ ICML2023-HiDL)
<br>
<a href="https://arxiv.org/abs/2305.16380" style="color: #666666">[Arxiv]</a>
<a href="./pdfs/poster_scan_snap.pdf" style="color: #666666">[Poster]</a>
<a href="https://twitter.com/tydsh/status/1663611845603885056" style="color: #666666">[Twitter]</a>
<br><br>
tl;dr: We analyze a 1-layer transformer trained with the next-token-prediction loss and rigorously characterize its training dynamics.</p>
</td></tr></table>
<table class="imgtable"><tr><td>
<img src="photos/joma.png" alt="alt text" width="300px" height="120px" /> </td>
<td align="left"><p><a href="https://arxiv.org/abs/2310.00535">
JoMA: Demystifying Multilayer Transformers via JOint Dynamics of MLP and Attention
</a>
<br>
Yuandong Tian, <b>Yiping Wang</b>, Zhenyu Zhang, Beidi Chen, Simon Shaolei Du <br />
<i>ICLR 2024</i>
<br>
<a href="https://arxiv.org/abs/2310.00535" style="color: #666666">[Arxiv]</a>
<a href="https://twitter.com/tydsh/status/1709785496056930654" style="color: #666666">[Twitter]</a>
<br><br>
tl;dr: We analyze the training dynamics of multilayer transformers, characterizing the roles of self-attention and MLP nonlinearity.</p>
</td></tr></table>
<p></div></p>
</td>
</tr>
</table>
</body>
</html>