Test consistency #239

Merged
merged 7 commits into from
Apr 30, 2024
1 change: 1 addition & 0 deletions autotx/agents/SendTokensAgent.py
@@ -26,6 +26,7 @@
You use the tools available to assist the user in their tasks.
Your job is to only prepare the transactions by calling the prepare_transfer_transaction tool and the user will take care of executing them.
NOTE: There is no reason to call get_token_balance after calling prepare_transfer_transaction as the transfers are only prepared and not executed.
NOTE: A balance of a token is not required to perform a send, if there is an earlier prepared transaction that will provide the token.

Example 1:
User: Send 0.1 ETH to vitalik.eth and then swap ETH to 5 USDC
64 changes: 28 additions & 36 deletions autotx/agents/SwapTokensAgent.py
@@ -23,46 +23,31 @@
...
Other agent messages
...
Call prepare_bulk_swap_transactions with args:
{{
"tokens": "ETH to 5 USDC"
}}
Call prepare_bulk_swap_transactions: "ETH to 5 USDC"

Example 2:
User: Swap ETH to 5 USDC, then swap that USDC for 6 UNI
Call prepare_bulk_swap_transactions with args:
{{
"tokens": "ETH to 5 USDC\nUSDC to 6 UNI"
}}

Example 3:
Example 1:
User: Buy 10 USDC with ETH and then buy UNI with 5 USDC
Call prepare_bulk_swap_transactions with args:
{{
"tokens": "ETH to 10 USDC\n5 USDC to UNI"
}}

Example 4 (Mistake):
User: Swap ETH for 5 USDC, then swap that USDC for 6 UNI
Call prepare_bulk_swap_transactions with args:
{{
"tokens": "ETH to 5 USDC\n5 USDC to 6 UNI"
}}
Invalid input. Only one token amount should be provided. IMPORTANT: Take another look at the user's goal, and try again.
To fix the error run:
Call prepare_bulk_swap_transactions with args:
{{
"tokens": "ETH to 5 USDC\nUSDC to 6 UNI"
}}
Example 5:
Call prepare_bulk_swap_transactions: "ETH to 10 USDC\n5 USDC to UNI"

Example 2:
User: Buy UNI, WBTC, USDC and SHIB with 0.92 ETH
Call prepare_bulk_swap_transactions with args:
{{
"tokens": "0.23 ETH to UNI\n0.23 ETH to WBTC\n0.23 ETH to USDC\n0.23 ETH to SHIB"
}}
Call prepare_bulk_swap_transactions: "0.23 ETH to UNI\n0.23 ETH to WBTC\n0.23 ETH to USDC\n0.23 ETH to SHIB"

Example 3:
User: Swap ETH to 5 USDC, then swap that USDC for 6 UNI
Call prepare_bulk_swap_transactions: "ETH to 5 USDC\nUSDC to 6 UNI"

Example of a bad input:
User: Swap ETH to 1 UNI, then swap UNI to 4 USDC
Call prepare_bulk_swap_transactions: "ETH to 1 UNI\n1 UNI to 4 USDC"
Prepared transaction: Swap 1.0407386618866115 ETH for at least 1 UNI
Invalid input: "1 UNI to 4 USDC". Only one token amount should be provided. IMPORTANT: Take another look at the user's goal, and try again.
In the above example, you recover with:
Call prepare_bulk_swap_transactions: "UNI to 4 USDC"

Above are examples, NOTE these are only examples and in practice you need to call the prepare_bulk_swap_transactions tool with the correct arguments.
Take extra care in ensuring you have the right amount next to the token symbol.
Take extra care in ensuring you have the right amount next to the token symbol. NEVER use more than one amount per swap; the other amount will be calculated for you.
The swaps are NOT NECESSARILY correlated, focus on the exact amounts the user wants to buy or sell (leave the other amounts to be calculated for you).
Only call tools, do not respond with JSON.
"""
)
@@ -95,7 +80,10 @@ def swap(autotx: AutoTx, token_to_sell: str, token_to_buy: str) -> list[Prepared
raise InvalidInput(f"Invalid input: \"{token_to_sell} to {token_to_buy}\". Only one token amount should be provided. IMPORTANT: Take another look at the user's goal, and try again.")

if len(sell_parts) < 2 and len(buy_parts) < 2:
raise InvalidInput("Invalid input: \"{token_to_sell} to {token_to_buy}\". Token amount is missing.")
raise InvalidInput(f"Invalid input: \"{token_to_sell} to {token_to_buy}\". Token amount is missing. Only one token amount should be provided.")

if len(sell_parts) > 2 or len(buy_parts) > 2:
raise InvalidInput(f"Invalid input: \"{token_to_sell} to {token_to_buy}\". Too many token amounts or token symbols provided. Only one token amount and two token symbols should be provided per line.")

token_symbol_to_sell = sell_parts[1] if len(sell_parts) == 2 else sell_parts[0]
token_symbol_to_buy = buy_parts[1] if len(buy_parts) == 2 else buy_parts[0]
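
The validation order in this hunk can be tried in isolation. The following is a minimal sketch, not the project's code: `validate_swap_line` is a hypothetical helper, `InvalidInput` is a local stand-in for the project's exception, and both the whitespace split into `sell_parts`/`buy_parts` and the condition guarding the first check are assumptions inferred from the lines shown in the diff.

```python
class InvalidInput(Exception):
    """Stand-in for the project's InvalidInput exception (assumed to be an Exception subclass)."""


def validate_swap_line(line: str) -> tuple[str, str]:
    """Validate one "<sell> to <buy>" line and return (sell_symbol, buy_symbol)."""
    token_to_sell, separator, token_to_buy = line.partition(" to ")
    sell_parts = token_to_sell.split()
    buy_parts = token_to_buy.split()
    quoted = f'Invalid input: "{line}".'

    # Assumption: a line needs the " to " separator and a token on each side.
    if not separator or not sell_parts or not buy_parts:
        raise InvalidInput(f"{quoted} Expected '<sell> to <buy>'.")

    # Both sides carry an amount, e.g. "1 UNI to 4 USDC".
    if len(sell_parts) == 2 and len(buy_parts) == 2:
        raise InvalidInput(f"{quoted} Only one token amount should be provided.")

    # Neither side carries an amount, e.g. "ETH to USDC".
    if len(sell_parts) < 2 and len(buy_parts) < 2:
        raise InvalidInput(f"{quoted} Token amount is missing.")

    # Extra words on either side, e.g. "1 ETH to my USDC token".
    if len(sell_parts) > 2 or len(buy_parts) > 2:
        raise InvalidInput(f"{quoted} Too many token amounts or token symbols provided.")

    # Same symbol selection as in the diff: the symbol follows the amount if one is present.
    sell_symbol = sell_parts[1] if len(sell_parts) == 2 else sell_parts[0]
    buy_symbol = buy_parts[1] if len(buy_parts) == 2 else buy_parts[0]
    return sell_symbol, buy_symbol
```

With this sketch, `"ETH to 5 USDC"` and `"0.23 ETH to UNI"` pass, while `"1 UNI to 4 USDC"` is rejected, which matches the "bad input" example in the agent prompt.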
@@ -167,6 +155,10 @@ def run(

if all_errors:
summary += "\n".join(str(e) for e in all_errors)
if len(all_txs) > 0:
summary += f"\n{len(all_errors)} errors occurred. {len(all_txs)} transactions were prepared. There is no need to re-run the transactions that were prepared."
else:
summary += f"\n{len(all_errors)} errors occurred."

return summary
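
To see what the agent receives after a partially failed bulk swap, the summary logic can be reproduced standalone. This is a rough sketch under assumptions: `build_summary` is a hypothetical function, and the `Prepared transaction:` prefix for successful swaps is inferred from the prompt example rather than from the body of `run`, which the diff does not show in full.

```python
def build_summary(prepared: list[str], errors: list[Exception]) -> str:
    """Combine prepared-transaction lines and error messages into one tool response."""
    # Assumption: each prepared transaction is reported on its own line.
    summary = "".join(f"Prepared transaction: {tx}\n" for tx in prepared)

    if errors:
        summary += "\n".join(str(e) for e in errors)
        if len(prepared) > 0:
            # Tell the agent which work is already done so it only retries the failed lines.
            summary += (
                f"\n{len(errors)} errors occurred. {len(prepared)} transactions were prepared. "
                "There is no need to re-run the transactions that were prepared."
            )
        else:
            summary += f"\n{len(errors)} errors occurred."

    return summary
```

The explicit counts are what let the agent recover with only the failed line, as in the prompt's recovery example.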

4 changes: 3 additions & 1 deletion autotx/helper_agents/user_proxy.py
@@ -19,11 +19,13 @@ def build(user_prompt: str, agents_information: str, get_llm_config: Callable[[]

Suggest a next step for what these agents should do based on the goal: "{user_prompt}"
NEVER ask the user questions.

NEVER make up a token, ALWAYS ask the 'research-tokens' agent to first search for the token.

If the goal has been achieved, FIRST reflect on the goal and make sure nothing is missing, then end the conversation with "TERMINATE".
Consider the goal met if the other agents have prepared the necessary transactions and all user queries have been answered.
If the user's goal involves buying tokens, make sure the correct number of tokens are bought.
If you encounter an error, try to resolve it (either yourself or with other agents) and only respond with "TERMINATE" if the goal is impossible to achieve.
If a token is not supported, ask the researcher agent to find a supported token (if it fits within the user's goal).
"""
),
description="user_proxy is an agent authorized to act on behalf of the user.",
49 changes: 49 additions & 0 deletions autotx/tests/agents/token/research/test_advanced.py
@@ -0,0 +1,49 @@
from autotx.utils.ethereum.eth_address import ETHAddress

def test_research_and_swap_many_tokens_subjective_simple(configuration, auto_tx):
(_, _, _, manager) = configuration
uni_address = ETHAddress(auto_tx.network.tokens["uni"])

uni_balance_in_safe = manager.balance_of(uni_address)
assert uni_balance_in_safe == 0

starting_balance = manager.balance_of()

prompt = f"I want to use 3 ETH to purchase 3 of the best projects in: GameFi, AI, and MEMEs. Please research the top projects, come up with a strategy, and purchase the tokens that look most promising. All of this should be on ETH mainnet."

result = auto_tx.run(prompt, non_interactive=True)

ending_balance = manager.balance_of()

# Verify the balance is lower by max 3 ETH
assert starting_balance - ending_balance <= 3
# Verify there are exactly 3 transactions
assert len(result.transactions) == 3
# Verify there are only swap transactions
assert all([tx.summary.startswith("Swap") for tx in result.transactions])
# Verify the tokens are different
assert len(set([tx.summary.split(" ")[-1] for tx in result.transactions])) == 3

def test_research_and_swap_many_tokens_subjective_complex(configuration, auto_tx):
(_, _, _, manager) = configuration
uni_address = ETHAddress(auto_tx.network.tokens["uni"])

uni_balance_in_safe = manager.balance_of(uni_address)
assert uni_balance_in_safe == 0

starting_balance = manager.balance_of()

prompt = f"I want to use 3 ETH to purchase exactly 10 of the best projects in: GameFi, NFTs, ZK, AI, and MEMEs. Please research the top projects, come up with a strategy, and purchase the tokens that look most promising. All of this should be on ETH mainnet."

result = auto_tx.run(prompt, non_interactive=True)

ending_balance = manager.balance_of()

# Verify the balance is lower by max 3 ETH
assert starting_balance - ending_balance <= 3
# Verify there are exactly 10 transactions
assert len(result.transactions) == 10
# Verify there are only swap transactions
assert all([tx.summary.startswith("Swap") for tx in result.transactions])
# Verify the tokens are different
assert len(set([tx.summary.split(" ")[-1] for tx in result.transactions])) == 10
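
The "tokens are different" assertion depends on the last whitespace-separated word of each summary being the bought token's symbol. A small sketch of that check follows; `bought_symbols` is a hypothetical helper and the summaries are made-up strings modeled on the `Swap ... for at least ...` format shown in the agent prompt.

```python
def bought_symbols(summaries: list[str]) -> set[str]:
    """Collect the last word of each summary, assumed to be the bought token symbol."""
    return {summary.split(" ")[-1] for summary in summaries}


# Hypothetical summaries: two of the three swaps buy the same token.
summaries = [
    "Swap 1 ETH for at least 100 UNI",
    "Swap 1 ETH for at least 0.01 WBTC",
    "Swap 1 ETH for at least 5 UNI",
]

distinct = bought_symbols(summaries)
```

Here `distinct` has two members, so a test expecting three distinct tokens would fail. The check is only as reliable as the summary format: a summary with trailing text after the symbol would be counted under the wrong name.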
78 changes: 20 additions & 58 deletions autotx/tests/agents/token/research/test_research_and_swap.py
@@ -1,86 +1,48 @@
from autotx.utils.ethereum import load_w3
from autotx.utils.ethereum.eth_address import ETHAddress
from autotx.utils.ethereum.get_erc20_balance import get_erc20_balance

def test_research_and_swap_meme_token(configuration, auto_tx):
def test_research_and_buy_one(configuration, auto_tx):
(_, _, _, manager) = configuration

shib_address = ETHAddress(auto_tx.network.tokens["shib"])
shib_balance_in_safe = manager.balance_of(shib_address)
assert shib_balance_in_safe == 0

prompt = (
f"Swap 1 ETH for the meme token with the largest market cap in ethereum mainnet"
f"Buy 1 ETH worth of a meme token with the largest market cap in ethereum mainnet"
)

auto_tx.run(prompt, non_interactive=True)

shib_balance_in_safe = manager.balance_of(shib_address)
assert shib_balance_in_safe > 1000

def test_research_swap_and_send_governance_token(configuration, auto_tx, test_accounts):
def test_research_and_buy_multiple(configuration, auto_tx):
(_, _, _, manager) = configuration
web3 = load_w3()

uni_address = ETHAddress(auto_tx.network.tokens["uni"])
uni_balance_in_safe = manager.balance_of(uni_address)

assert uni_balance_in_safe == 0
receiver = test_accounts[0]

prompt = f"Swap 1 ETH for the governance token with the largest market cap in ethereum mainnet and send 100 units of the bought token to {receiver}"

auto_tx.run(prompt, non_interactive=True)
shib_address = ETHAddress(auto_tx.network.tokens["shib"])
shib_balance_in_safe = manager.balance_of(shib_address)
assert shib_balance_in_safe == 0

uni_balance_in_safe = manager.balance_of(uni_address)
assert uni_balance_in_safe > 90
receiver_balance = get_erc20_balance(web3, uni_address, receiver)
assert receiver_balance == 100

def test_research_and_swap_many_tokens_subjective_simple(configuration, auto_tx):
(_, _, _, manager) = configuration
uni_address = ETHAddress(auto_tx.network.tokens["uni"])

uni_balance_in_safe = manager.balance_of(uni_address)
assert uni_balance_in_safe == 0

starting_balance = manager.balance_of()

prompt = f"I want to use 3 ETH to purchase 3 of the best projects in: GameFi, AI, and MEMEs. Please research the top projects, come up with a strategy, and purchase the tokens that look most promising. All of this should be on ETH mainnet."
old_eth_balance = manager.balance_of()

result = auto_tx.run(prompt, non_interactive=True)

ending_balance = manager.balance_of()

# Verify the balance is lower by max 3 ETH
assert starting_balance - ending_balance <= 3
# Verify there are at least 5 transactions
assert len(result.transactions) == 3
# Verify there are only swap transactions
assert all([tx.summary.startswith("Swap") for tx in result.transactions])
# Verify the tokens are different
assert len(set([tx.summary.split(" ")[-1] for tx in result.transactions])) == 3

def test_research_and_swap_many_tokens_subjective_complex(configuration, auto_tx):
(_, _, _, manager) = configuration
uni_address = ETHAddress(auto_tx.network.tokens["uni"])
prompt = f"""
Buy 1 ETH worth of a meme token with the largest market cap
Then buy the governance token with the largest market cap with 0.5 ETH
This should be on ethereum mainnet
"""

uni_balance_in_safe = manager.balance_of(uni_address)
assert uni_balance_in_safe == 0

starting_balance = manager.balance_of()

prompt = f"I want to use 3 ETH to purchase 10 of the best projects in: GameFi, NFTs, ZK, AI, and MEMEs. Please research the top projects, come up with a strategy, and purchase the tokens that look most promising. All of this should be on ETH mainnet."
auto_tx.run(prompt, non_interactive=True)

new_eth_balance = manager.balance_of()

result = auto_tx.run(prompt, non_interactive=True)
assert old_eth_balance - new_eth_balance == 1.5

ending_balance = manager.balance_of()
shib_balance_in_safe = manager.balance_of(shib_address)
assert shib_balance_in_safe > 1000

# Verify the balance is lower by max 3 ETH
assert starting_balance - ending_balance <= 3
# Verify there are at least 5 transactions
assert len(result.transactions) == 10
# Verify there are only swap transactions
assert all([tx.summary.startswith("Swap") for tx in result.transactions])
# Verify the tokens are different
assert len(set([tx.summary.split(" ")[-1] for tx in result.transactions])) == 10
uni_balance_in_safe = manager.balance_of(uni_address)
assert uni_balance_in_safe > 90
93 changes: 93 additions & 0 deletions autotx/tests/agents/token/research/test_research_swap_and_send.py
@@ -0,0 +1,93 @@
from autotx.utils.ethereum import get_erc20_balance, load_w3
from autotx.utils.ethereum.eth_address import ETHAddress

DIFFERENCE_PERCENTAGE = 0.01

def test_research_buy_one_send_one(configuration, auto_tx, test_accounts):
(_, _, _, manager) = configuration
web3 = load_w3()

receiver = test_accounts[0]

shib_address = ETHAddress(auto_tx.network.tokens["shib"])
Contributor:
we have a lot of tests that fetch the meme coin with largest market cap - since this is dynamic, should we get the data from coingecko in each test?

Contributor Author:
You're right, I just made an issue for it here: #247
Seems a bit lower prio though since the highest market cap coins won't likely change that soon

shib_balance_in_safe = manager.balance_of(shib_address)
assert shib_balance_in_safe == 0

prompt = (
f"Buy 1 ETH worth of a meme token with the largest market cap in ethereum mainnet then send it to {receiver}"
)

auto_tx.run(prompt, non_interactive=True)

shib_balance_in_safe = manager.balance_of(shib_address)

receiver_balance = get_erc20_balance(web3, shib_address, receiver)
assert receiver_balance > 10000

assert shib_balance_in_safe / receiver_balance < DIFFERENCE_PERCENTAGE

def test_research_buy_one_send_multiple(configuration, auto_tx, test_accounts):
(_, _, _, manager) = configuration
web3 = load_w3()

receiver_1 = test_accounts[0]
receiver_2 = test_accounts[1]

shib_address = ETHAddress(auto_tx.network.tokens["shib"])
shib_balance_in_safe = manager.balance_of(shib_address)
assert shib_balance_in_safe == 0

prompt = (
f"Buy 1 ETH worth of a meme token with the largest market cap in ethereum mainnet then send 10,000 of it to {receiver_1} and 250 of it to {receiver_2}"
)

auto_tx.run(prompt, non_interactive=True)

shib_balance_in_safe = manager.balance_of(shib_address)

receiver_1_balance = get_erc20_balance(web3, shib_address, receiver_1)
assert receiver_1_balance == 10000

receiver_2_balance = get_erc20_balance(web3, shib_address, receiver_2)
assert receiver_2_balance == 250

assert shib_balance_in_safe > 10000

def test_research_buy_multiple_send_multiple(configuration, auto_tx, test_accounts):
(_, _, _, manager) = configuration
web3 = load_w3()

receiver_1 = test_accounts[0]
receiver_2 = test_accounts[1]

shib_address = ETHAddress(auto_tx.network.tokens["shib"])
shib_balance_in_safe = manager.balance_of(shib_address)
assert shib_balance_in_safe == 0

uni_address = ETHAddress(auto_tx.network.tokens["uni"])
uni_balance_in_safe = manager.balance_of(uni_address)
assert uni_balance_in_safe == 0

old_eth_balance = manager.balance_of()

prompt = f"""
Buy 1 ETH worth of a meme token with the largest market cap
Then buy the governance token with the largest market cap with 0.5 ETH
This should be on ethereum mainnet.
Send all of the meme token to {receiver_1} and all of the governance token to {receiver_2}
"""

auto_tx.run(prompt, non_interactive=True)

new_eth_balance = manager.balance_of()

assert old_eth_balance - new_eth_balance == 1.5

shib_balance = get_erc20_balance(web3, shib_address, receiver_1)
assert shib_balance > 10000

uni_balance = get_erc20_balance(web3, uni_address, receiver_2)
assert uni_balance > 90

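# Re-read the safe balances after the run; the values read before the run are 0,
# which would make the ratio checks below pass trivially.
shib_balance_in_safe = manager.balance_of(shib_address)
uni_balance_in_safe = manager.balance_of(uni_address)

assert shib_balance_in_safe / shib_balance < DIFFERENCE_PERCENTAGE
assert uni_balance_in_safe / uni_balance < DIFFERENCE_PERCENTAGE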
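
The tests in this file make two kinds of numeric checks: a ratio check ("almost everything was forwarded to the receiver") and an exact comparison of the ETH spent (`== 1.5`). The sketch below shows both as standalone helpers. The names `mostly_forwarded` and `spent_about` are hypothetical, and treating balances as Python floats is an assumption about what `manager.balance_of` returns.

```python
import math

DIFFERENCE_PERCENTAGE = 0.01  # same value as in the test module: at most 1% left behind


def mostly_forwarded(left_in_safe: float, received: float,
                     max_ratio: float = DIFFERENCE_PERCENTAGE) -> bool:
    """True if the amount left in the safe is a small fraction of what the receiver got."""
    if received <= 0:
        # Nothing arrived, so the transfer cannot count as forwarded (also avoids dividing by zero).
        return False
    return left_in_safe / received < max_ratio


def spent_about(old_balance: float, new_balance: float, expected: float,
                rel_tol: float = 1e-9) -> bool:
    """Compare the balance difference to the expected spend with a relative tolerance."""
    return math.isclose(old_balance - new_balance, expected, rel_tol=rel_tol)
```

If float rounding ever makes the exact `== 1.5` comparison flaky, a tolerance-based check like `spent_about(old_eth_balance, new_eth_balance, 1.5)` would be one option; whether it is needed depends on how the balances are represented.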
13 changes: 8 additions & 5 deletions benchmarks.json
@@ -17,15 +17,18 @@
"autotx/tests/agents/token/test_swap_and_send.py::test_send_and_swap_simple": "100.00",
"autotx/tests/agents/token/test_swap_and_send.py::test_send_and_swap_complex": "100.00",
"autotx/tests/agents/token/research/test_research.py::test_get_top_5_memecoins_in_optimism": "100.00",
"autotx/tests/agents/token/research/test_research.py::test_get_top_5_memecoins": "100.00",
"autotx/tests/agents/token/research/test_research_and_swap.py::test_research_and_swap_meme_token": "100.00",
"autotx/tests/agents/token/research/test_research_and_swap.py::test_research_swap_and_send_governance_token": "100.00",
"autotx/tests/agents/token/research/test_research.py::test_get_top_5_memecoins": "90.00",
"autotx/tests/agents/token/research/test_research.py::test_price_change_information": "100.00",
"autotx/tests/agents/token/research/test_research.py::test_get_top_5_tokens_from_base": "100.00",
"autotx/tests/agents/token/research/test_research.py::test_get_top_5_most_traded_tokens_from_l1": "100.00",
"autotx/tests/agents/token/research/test_research.py::test_get_token_exchanges": "100.00",
"autotx/tests/agents/token/research/test_research_and_swap.py::test_research_and_swap_many_tokens_subjective_complex": "33.33",
"autotx/tests/agents/token/research/test_research_and_swap.py::test_research_and_swap_many_tokens_subjective_simple": "100.00"
"autotx/tests/agents/token/research/test_research_and_swap.py::test_research_and_buy_one": "100.00",
"autotx/tests/agents/token/research/test_research_swap_and_send.py::test_research_buy_one_send_one": "100.00",
"autotx/tests/agents/token/research/test_advanced.py::test_research_and_swap_many_tokens_subjective_complex": "10.00",
"autotx/tests/agents/token/research/test_research_swap_and_send.py::test_research_buy_multiple_send_multiple": "100.00",
"autotx/tests/agents/token/research/test_research_and_swap.py::test_research_and_buy_multiple": "100.00",
"autotx/tests/agents/token/research/test_advanced.py::test_research_and_swap_many_tokens_subjective_simple": "90.00",
"autotx/tests/agents/token/research/test_research_swap_and_send.py::test_research_buy_one_send_multiple": "100.00"
},
"iterations": 10
}
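
The benchmark values appear to be pass rates in percent, stored as strings. A short sketch for listing the tests that do not pass every time is given below. `below_threshold` is a hypothetical helper, and since the diff does not show the key that holds this mapping in `benchmarks.json`, the sketch takes the mapping directly; the sample entries are copied from the hunk above.

```python
def below_threshold(pass_rates: dict[str, str], threshold: float = 100.0) -> dict[str, float]:
    """Return the tests whose pass rate (a percentage stored as a string) is under the threshold."""
    return {
        name: float(rate)
        for name, rate in pass_rates.items()
        if float(rate) < threshold
    }


# Sample entries taken from the diff above.
sample = {
    "autotx/tests/agents/token/research/test_research.py::test_get_top_5_memecoins": "90.00",
    "autotx/tests/agents/token/research/test_research_and_swap.py::test_research_and_buy_one": "100.00",
    "autotx/tests/agents/token/research/test_advanced.py::test_research_and_swap_many_tokens_subjective_complex": "10.00",
}

unstable = below_threshold(sample)
```

In this sample, `unstable` contains the two entries at 90.00 and 10.00 and omits the one at 100.00.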