
[WIP] DO NOT MERGE #4

Open: wants to merge 35 commits into base: master
4e70d91
adding adapter
Apr 13, 2023
6d8aa0c
Ff
Apr 13, 2023
b6b8b95
fixing gatherer
Apr 15, 2023
bb675c4
fixing minor interface issues
Apr 15, 2023
0123355
fixing list scaffold
Apr 15, 2023
1267dd4
adding test-starting
Apr 16, 2023
a3a7b98
fixed twitter handlers
Apr 16, 2023
5b0f87b
debugging search iterators
Apr 16, 2023
66c0a71
adding data model and fixing last bugs, now just need to tune for the…
Apr 16, 2023
b4c1497
adding arweave
Apr 18, 2023
ec2d662
debugging arweave logic now
Apr 18, 2023
26ddd5f
fixed looping
Apr 18, 2023
d4439dd
fixing main loop
Apr 19, 2023
d0e6339
fixing main loop
Apr 19, 2023
f9e74bb
arweave code review
SomaKoii Apr 26, 2023
0f46e77
use puppeteer log in to twitter
SomaKoii Apr 27, 2023
3f7d4c8
add twitter verify option
SomaKoii Apr 27, 2023
d6bc4ad
add twitter stuff in env
SomaKoii Apr 27, 2023
b5f63c9
update scrape code
SomaKoii Apr 27, 2023
6b3ea66
getting comments, likes, shares and views
SomaKoii Apr 27, 2023
0173f0f
update scraping script
SomaKoii Apr 27, 2023
b2fa4eb
store results of twitter scrape to local leveldb
aelavell May 2, 2023
d353405
update leveldb class
SomaKoii May 2, 2023
644a3cc
Merge pull request #2 from aelavell/feat/twitter-leveldb
somali0128 May 2, 2023
23a056c
fix and update leveldb function
SomaKoii May 2, 2023
367188e
format leveldb
SomaKoii May 2, 2023
8ddf6a1
Merge pull request #3 from GET-Store-CAT/@feature/leveldb
somali0128 May 2, 2023
252c0c7
update twitter scraping code work with db-model
SomaKoii May 2, 2023
5819ee9
update some calls to db.getPending to become db.getPendingList
aelavell May 2, 2023
05e9b60
Merge pull request #1 from GET-Store-CAT/@feature/twitter-scrape
somali0128 May 2, 2023
b10b30b
Merge pull request #4 from aelavell/feature/arweave
somali0128 May 2, 2023
20ef0b4
add TODO twitter scrape scrollPage()
SomaKoii May 2, 2023
3d00069
autoscroll with randomized "natural" scrolling, plus test
aelavell May 3, 2023
ea44428
Merge pull request #5 from aelavell/feature/autoscroll
somali0128 May 3, 2023
aa106fe
fixing readme, wip
alexander-morris May 16, 2023
50 changes: 0 additions & 50 deletions .env-local

This file was deleted.

4 changes: 3 additions & 1 deletion .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -10,4 +10,6 @@ namespace/*
config/*
.env
taskStateInfoKeypair.json
localKOIIDB
localKOIIDB
my_test_db
twitter-db
117 changes: 17 additions & 100 deletions README.md
@@ -1,113 +1,30 @@
# K2-Task-Template
# Data Gatherer Template
The Data Gatherer is a standardized subset of [Koii Tasks](https://docs.koii.network/develop/microservices-and-tasks/what-are-tasks/), which allows a developer to hire thousands of automated worker nodes from the [Koii Network](https://docs.koii.network).

Tasks run following a periodic structure of 'rounds':
In general, a data gatherer is any task which is designed to crowdsource information from a pool of nodes. Because many Data Gatherer tasks follow a standardized incentive and fraud-detection mechanism, it is possible to abstract much of the complexity of task design, and the developer need only provide an Adapter class (see `adapters/` for some examples).

![Screenshot_20230307-091958](https://user-images.githubusercontent.com/66934242/223565192-3ecce9c6-0f9a-4a58-8b02-2db19c61141f.png)
For more information on how the broader Task Flow works, check out [Koii's runtime environment docs](https://docs.koii.network/develop/microservices-and-tasks/what-are-tasks/gradual-consensus#why-is-it-gradual).

Each round is set by a specific time period, and nodes participate by uploading data to IPFS, posting CIDs to the K2 settlement layer, and sending messages across REST APIs and WebSockets.
# Requirements

For more information on how the Task Flow works, check out [the runtime environment docs](https://docs.koii.network/develop/microservices-and-tasks/what-are-tasks/gradual-consensus#why-is-it-gradual).

If this is your first time writing a Koii Task, you might want to use the [task organizer](https://www.figma.com/community/file/1220194939977550205/Task-Outline).

## Requirements

- [Node >=16.0.0](https://nodejs.org)
- [Node >=16.0.0](https://nodejs.org)
- Yarn is preferred over NPM
- [Docker compose](https://docs.docker.com/compose/install/docker)

## What's in the template?

`index.js` is the hub of your app, and ties together the other pieces. This will be the entrypoint when your task runs on Task Nodes

`NamespaceWrappers.js` contains the interfaces to make API calls to the core of the task-node. It contains all the necessary functions required to submit and audit the work, as well as the distribution lists

`coreLogic.js` is where you'll define your task, audit, and distribution logic, and controls the majority of task functionality. You can of course break out separate features into sub-files and import them into the core logic before web-packing.

## Runtime Options

There are two ways to run your task when doing development:

1. With GLOBAL_TIMERS="true" (see `.env-local`) - When the timer is true, IPC calls are made by calculating the average time slots of all the tasks running on your node.

2. With GLOBAL_TIMERS="false" - This allows you to make manual calls to K2 and disables the triggers for round management on K2. Transactions are only accepted during the correct period. A guide for manual calls is in `index.js`.

# Modifying CoreLogic.js

Task nodes will trigger a set of predefined functions during operation.

There are in total 9 functions in CoreLogic which you can modify according to your needs:

1. _task()_ - The logic for what your task should do goes here. Each round includes a window dedicated to doing work, and the code in task() is executed in that window.

2. _fetchSubmission()_ - After the task completes, the results are stored somewhere such as IPFS or a local LevelDB. This function is where you write the logic to fetch that work. It is called from submitTask(), which does the actual submission on K2.

3. _submitTask()_ - Makes the call to the namespace function of the task-node using the wrapper.

4. _generateDistributionList()_ - You have full freedom to prepare your reward distributions as you like, and the logic for that goes here. We have provided sample logic that rewards 1 KOII to all the nodes that made a correct submission for that round. This function is called in submitDistributionList().

5. _submitDistributionList()_ - Makes a call to the namespace function of the task-node to upload the list and, on successful upload, performs the transaction to update the state.

6. _validateNode()_ - This function is called to verify a submission value; based on the value received from the task-state, we can vote on the submission.

7. _validateDistribution()_ - The logic to validate the distribution list goes here; the function receives the distribution list submitted from task-state.

8. _auditTask()_ - Makes a call to the namespace of the task-node to raise an audit against the submission value if validation fails.

9. _auditDistribution()_ - Makes a call to the namespace of the task-node to raise an audit against the distribution list if validation fails.
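As a rough sketch of how the first two functions relate (the `storeSet`/`storeGet` helpers below are placeholders standing in for the real namespace wrapper, not the template's actual API), a minimal `task()` might do its work and cache the result so `fetchSubmission()` can retrieve it at submission time:

```javascript
// Minimal sketch of a task() round: do the work, then cache the result
// under a round-scoped key so fetchSubmission() can retrieve it later.
// storeSet/storeGet stand in for the real namespace storage wrapper.
const store = new Map();
const storeSet = async (key, value) => { store.set(key, value); };
const storeGet = async (key) => store.get(key);

async function task(round) {
  // the "work": here just a deterministic placeholder value
  const result = `value-for-round-${round}`;
  await storeSet(`result-${round}`, result);
  return result;
}

async function fetchSubmission(round) {
  // retrieve whatever task() stored for this round
  return storeGet(`result-${round}`);
}

module.exports = { task, fetchSubmission };
```

The round number in the key keeps submissions from different rounds from overwriting each other.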

# Testing and Deploying

Before you begin this process, be sure to check your code and write unit tests wherever possible to verify individual core logic functions. Testing using the docker container should be mostly used for consensus flows, as it will take longer to rebuild and re-deploy the docker container.

## Build

Before deploying a task, you'll need to build it into a single file executable by running
`yarn webpack`

## Deploy your bundle

Complete the following to deploy your task on the k2 testnet and test it locally with docker compose.

### To get a web3.storage key

If you have already created an account on [web3.storage](https://web3.storage/docs/#quickstart) you'll just need to enter the API key after the prompts in the deploy process.

### Find or create a k2 wallet key

If you have already generated a Koii wallet on your filesystem you can obtain the path to it by running `koii config get`, which should return something similar to the following:

![截图 2023-03-07 18-13-17](https://user-images.githubusercontent.com/66934242/223565661-ece1591f-2189-4369-8d2a-53393da15834.png)

The `Keypair Path` will be used to pay gas fees and fund your bounty wallet by inputting it into the task CLI.

If you need to create a Koii wallet you can follow the instructions [here](https://docs.koii.network/develop/koii-software-toolkit-sdk/using-the-cli#create-a-koii-wallet). Make sure to either copy your keypair path from the output, or use the method above to supply the task CLI with the proper wallet path.

### Deploy to K2

To test the task with the [K2 Settlement Layer](https://docs.koii.network/develop/settlement-layer/k2-tick-tock-fast-blocks#docusaurus_skipToContent_fallback) you'll need to deploy it.

To publish tasks to the K2 network, use `npx @_koii/create-task-cli`. You have two options for creating your task: using `config-task.yml` or using the CLI interactively. Check out the sample `config-task.yml` attached in this repo; by default the CLI will look for both `config-task.yml` and `id.json` in your current directory, and if they are not detected you will have the option to enter your path. Tips on this flow and the detailed meaning of each task parameter can be found [in the docs](https://docs.koii.network/develop/koii-software-toolkit-sdk/create-task-cli). One important thing to note: when you're presented with the choice of ARWEAVE, IPFS, or DEVELOPMENT, you can select DEVELOPMENT and enter `main` in the following prompt. This tells the task node to look for a `main.js` file in the `dist` folder, which you can create locally by running `yarn webpack`.

## Run a node locally

If you want to get a closer look at the console and test environment variables, you'll want to use the included docker-compose stack to run a task node locally.

1. Link or copy your wallet into the `config` folder as `id.json`
2. Open `.env-local` and add the TaskID you obtained after deploying to K2 into the `TASKS` environment variable.
3. Run `docker compose up` and watch the output of the `task_node`. You can exit this process when your task has finished, or at any time if your task is long-running.

### Redeploying
# What's in the template?
This repo provides a number of example Data Gatherers, which together give samples of how the standard can be used to navigate OAUTH, large Web3 Node Networks, and a variety of other common applications. This is an ongoing project, so feel free to contribute templates of your own.

You do not need to publish your task every time you make modifications, but you do need to restart the `task_node` for the latest code to be used. To prepare your code, run `yarn webpack` to create the bundle. If you have a `task_node` running already, you can exit it and then run `docker compose up` to restart (or start) the node.
## Designing Adapters
The Adapter class, defined in `model/adapter.js`, is the core of the project, and can be extended to build your very own Data Gatherer for your particular application. To extend a Data Gatherer with an adapter, you'll need to define six key functions:

### Environment variables
1. NegotiateSession()

Open the `.env-local` file and make any modifications you need. You can include environment variables that your task expects to be present here, in case you're using [custom secrets](https://docs.koii.network/develop/microservices-and-tasks/task-development-kit-tdk/using-the-task-namespace/keys-and-secrets).
2. ParseOne()

### API endpoints
3. ParseMany()

By default your APIs will be exposed on the base URL: http://localhost:8080/task/{TASKID}
4. NewSearch()

You can check out the state of your task using the default API : http://localhost:8080/task/{TASKID}/taskState
5. CheckSession()

`TASKID` is the id that you get when you create your task using `npx`
6. ListFromSearch()
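The six functions above can be sketched as a small subclass. Note that the base class below is only a stand-in for `model/adapter.js` (which is not shown here), and every method body is a hypothetical placeholder; a real adapter would drive a browser or API client instead:

```javascript
// Sketch of a custom Data Gatherer adapter. The Adapter base class is a
// stand-in for model/adapter.js, and the method bodies are placeholders.
class Adapter {
  constructor(credentials, maxRetry) {
    this.credentials = credentials || {};
    this.maxRetry = maxRetry || 3;
  }
}

class ExampleGatherer extends Adapter {
  async negotiateSession() { return true; }            // open/authenticate a session
  async checkSession() { return true; }                // verify the session is still valid
  async newSearch(query) { return [`item:${query}`]; } // start a search, return raw hits
  async listFromSearch(results) { return results; }    // normalize hits into a pending list
  async parseOne(item) { return { id: item, data: item }; } // parse one pending item
  async parseMany(items) {                             // parse a batch of pending items
    return Promise.all(items.map((i) => this.parseOne(i)));
  }
}

module.exports = { ExampleGatherer };
```

A gatherer loop would then call `newSearch()`, feed the hits through `listFromSearch()` into the pending list, and drain that list with `parseOne()`/`parseMany()`.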
142 changes: 142 additions & 0 deletions adapters/arweave/arweave.js
@@ -0,0 +1,142 @@
// TODO - produce map of arweave
// Import required modules
require('dotenv').config();
const superagent = require('superagent');
const Data = require(__dirname + '/../../model/data.js');
const Adapter = require(__dirname + '/../../model/adapter.js');
const { Builder, Browser, By, Key, until } = require('selenium-webdriver');

// request timeouts in ms (placeholder values, tune as needed)
const superagentdelays = {
  peers: { response: 5000, deadline: 10000 },
  txfetch: { response: 5000, deadline: 10000 },
};

class Arweave extends Adapter {
  constructor(credentials, maxRetry, db, txId) {
    super(credentials, maxRetry, txId);
    this.credentials = credentials || {};
    this.maxRetry = maxRetry || 3;
    this.txId = txId;
    this.shims = {
      parseOne: async (node) => {
        // TODO - fetch one arweave node from the pending list, and see if it is online
        let healthCheck = await this.checkNode(node);

        // if it is online, then parse it and add its peers to the pending list
        if (healthCheck) {
          this.parseNode(node);
        }
      },
      checkNode: async () => {
        // TODO check if the session is valid
      },
    };
    this.db = db;
  }

  getNextPage = async (query) => {
    // there are only 1000 results per page in this model, so we don't need a second page
    return null;
  };

  parseNode = async (node) => {
    let peers = await this.getPeers(node);

    let txCheck = await this.checkTx(node, this.txId);

    // TODO - add db updates here
    // 1. Remove from pending
    // 2. update db to reflect node status?

    return this;
  };

  getPeers = async (node) => {
    let peers = [];
    try {
      const payload = await superagent.get(`${node}/peers`).timeout({
        response: superagentdelays.peers.response,
        deadline: superagentdelays.peers.deadline,
      });
      const body = JSON.parse(payload.text);
      if (body) {
        peers = body;
      }
    } catch (err) {
      console.error("can't fetch peers from " + node, err);
    }
    return peers;
  };

  checkTx = async function (node, txId) {
    let containsTx = false;
    try {
      const payload = await superagent.get(`${node}/${txId}`).timeout({
        response: superagentdelays.txfetch.response,
        deadline: superagentdelays.txfetch.deadline,
      });
      // if the response parses, the node returned the transaction
      JSON.parse(payload.text);
      containsTx = true;
    } catch (err) {
      // node is offline or does not hold this transaction
    }
    return containsTx;
  };

  negotiateSession = async () => {
    return true; // only leaving this here so it doesn't throw errors in gatherers
  };

  getNextPendingItem = async () => {
    return this.db.getPendingList(1);
  };

  checkNode = async () => {
    // TODO - need a clean way to reintroduce this, for now it's wasting API credits
    if (this.session) this.session.isValid = true;
    return true;
  };

  getPendingItems() {
    return this.db.getPendingItems();
  }

  storeListAsPendingItems(list) {
    // store the list of nodes as pending items using db
    for (let node of list) {
      // the main difference with this adapter is that the node's IP address is
      // the data for each item, so the ID === VALUE
      this.db.addPendingItem(node, node);
    }
    return true;
  }

  newSearch = async (query) => {
    console.log('fetching peer list');
    let newNodes = [];

    let driver = await new Builder().forBrowser(Browser.FIREFOX).build();

    try {
      await driver.get('https://arweave.net/peers');
      let l = await driver.findElement(By.tagName('body'));
      let text = (await l.getText()).toString();
      let items = text.replace(/['"\[\]\n]+/g, '');
      newNodes = items.split(',');
    } finally {
      await driver.quit();
    }

    return newNodes;
  };
}

module.exports = Arweave;
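The pending-list flow used by `storeListAsPendingItems` can be sketched independently of Selenium and the real database. Here `MemoryDb` is a hypothetical in-memory stand-in for the project's LevelDB-backed data model, showing the adapter's ID === VALUE convention for peer URLs:

```javascript
// Sketch of the adapter's pending-list flow with an in-memory db stub.
// MemoryDb is a stand-in for the real LevelDB-backed model.
class MemoryDb {
  constructor() { this.pending = new Map(); }
  addPendingItem(id, value) { this.pending.set(id, value); }
  getPendingList(n) { return [...this.pending.values()].slice(0, n); }
}

function storeListAsPendingItems(db, list) {
  for (const node of list) {
    // in this adapter the node's URL is both the ID and the VALUE
    db.addPendingItem(node, node);
  }
  return true;
}

module.exports = { MemoryDb, storeListAsPendingItems };
```

Because the Map is keyed by the URL, re-adding an already-known peer is a no-op, which keeps the pending list deduplicated for free.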