diff --git a/index.html b/index.html
index 4abf58d..fd62696 100644
--- a/index.html
+++ b/index.html
@@ -99,7 +99,16 @@

A Multimodal Automated Interpretability Agent

Understanding an AI system can take many forms. For instance, we might want to know when and how the system relies on sensitive or spurious features, identify systematic errors in its predictions, or learn how to modify the training data and model architecture to improve accuracy and robustness. Today, answering these types of questions often involves significant effort on the part of researchers: synthesizing the outcomes of different experiments that use a variety of tools.



Can an interpretability agent automate this process of experimenting on a system to explain its behavior?

@@ -110,15 +119,6 @@

A Multimodal Automated Interpretability Agent


Answering an interpretability query often involves synthesizing the outcomes of different experiments that use a variety of tools.



Can an interpretability agent automate this process of experimenting on a system to explain its behavior?

@@ -144,7 +144,7 @@

MAIA

-MAIA Experiments
+MAIA Tools

MAIA composes interpretability subroutines into Python programs to answer user queries about a system. What kinds of experiments does MAIA design? Below we highlight example usage of individual tools to run experiments on neurons inside common vision architectures (CLIP, ResNet, DINO). These are experimental excerpts intended to demonstrate tool use (often, MAIA runs many more experiments to reach its final conclusion!). For full experiment logs, check out our interactive [neuron viewer].
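To illustrate the pattern of composing interpretability subroutines into a Python program, here is a minimal sketch. The tool names, signatures, and dummy return values below are hypothetical stand-ins for illustration, not MAIA's actual API:

```python
# Hypothetical sketch of tool composition: generate exemplars, form a
# hypothesis, synthesize targeted test images, and measure activations.
# All tool names and values are illustrative stand-ins, not MAIA's real API.

def dataset_exemplars(system, n=5):
    """Stand-in tool: the n dataset images that most activate the unit."""
    return [f"exemplar_{i}" for i in range(n)]

def text2image(prompts):
    """Stand-in tool: synthesize one test image per text prompt."""
    return [f"image({p})" for p in prompts]

def activations(system, images):
    """Stand-in tool: the unit's activation on each image (dummy scores)."""
    return [float(len(img)) for img in images]

def experiment(system):
    # 1. Inspect what the unit already responds to.
    exemplars = dataset_exemplars(system)
    # 2. Form a hypothesis and synthesize targeted test images.
    test_images = text2image(["a dog", "a dog wearing a hat"])
    # 3. Measure activations to confirm or refute the hypothesis.
    scores = activations(system, test_images)
    # Report the test image that drove the unit hardest.
    return max(zip(scores, test_images))

print(experiment("clip_neuron_42"))
```

In practice the agent would iterate this loop, logging each experiment's result and conditioning the next round of tool calls on what it observed.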