Task Lifecycle Deep-dive

This section offers a deep-dive into the lifecycle of an AI Arena Task.

Figure 3. Workflow of an AI Arena Task.

1. Task Creation

Task creation is the first stage of the training cycle. Task creators define the desired models and submit tasks to the platform.

To qualify as a task creator, users must meet one or more of the following criteria (a minimal eligibility check is sketched after the list):

• Stake a sufficient amount of FLAI

• Have successfully trained or validated a task previously, as evidenced by on-chain records

• Possess a reputation in the ML space or be recognised as a domain expert in relevant fields, as verified by the FLock community
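A minimal sketch of how a check over the three criteria above might look. The field names (`stake`, `has_onchain_record`, `is_recognised_expert`) and the threshold `MIN_FLAI_STAKE` are illustrative assumptions, not part of the protocol specification:

```python
from dataclasses import dataclass

# Illustrative threshold; the actual required FLAI stake is set by the protocol.
MIN_FLAI_STAKE = 10_000

@dataclass
class Candidate:
    stake: float                # amount of FLAI staked (assumed field)
    has_onchain_record: bool    # previously trained/validated a task (assumed field)
    is_recognised_expert: bool  # verified by the FLock community (assumed field)

def is_eligible_task_creator(c: Candidate) -> bool:
    """A candidate qualifies by meeting one or more of the three criteria."""
    return (
        c.stake >= MIN_FLAI_STAKE
        or c.has_onchain_record
        or c.is_recognised_expert
    )
```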

2. Training Node and Validator Selection

Each participant must stake tokens to take part as either a training node or a validator. In addition, rate limiting determines how many times a participant can be eligible to act as a validator for a given task. The likelihood of a participant being selected to validate a task submission increases with their stake; however, the rate at which validation frequency grows relative to the staking amount diminishes as the staking amount increases.
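One way to realise this diminishing-returns relationship is to weight selection by a concave function of stake, such as a square root. The weighting function and sampling routine below are illustrative assumptions; the exact curve is not specified here:

```python
import math
import random

def selection_weight(stake: float) -> float:
    # Concave in stake: doubling the stake less than doubles the weight,
    # so the marginal gain in selection probability shrinks as stake grows.
    # The square root is an illustrative choice, not the protocol's formula.
    return math.sqrt(stake)

def pick_validator(stakes: dict[str, float]) -> str:
    """Sample one validator with probability proportional to selection_weight(stake)."""
    nodes = list(stakes)
    weights = [selection_weight(stakes[n]) for n in nodes]
    return random.choices(nodes, weights=weights, k=1)[0]

# Example: "b" stakes 4x as much as "a" but is only ~2x as likely to be picked.
print(pick_validator({"a": 100.0, "b": 400.0}))
```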

3. Training

Each training node is given $D_{local}$, which contains locally sourced data samples comprising a feature set $X$ and a label set $Y$, with each sample $x_i \in X$ corresponding to a label $y_i \in Y$. The goal of training is to define a predictive model $f$, which learns patterns within $D_{local}$ such that $f(x_i) \approx y_i$.

To quantify the success (i.e. ability to predict) of the predictive model $f$, we introduce a loss function $L(f(x_i), y_i)$, assessing the discrepancy between predictions $f(x_i)$ and actual labels $y_i$. A generic expression for this function is:

$$L = \frac{1}{N} \sum_{i=1}^{N} l(f(x_i), y_i)$$

where $N$ denotes the total sample count, and $l$ signifies a problem-specific loss function, e.g. mean squared error or cross-entropy loss.

Ultimately, the optimisation goal of training is to adjust the model parameters $\theta$ to minimise $L$, typically through algorithms such as gradient descent.
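As a concrete illustration of this objective, the sketch below fits a linear model $f(x_i) = \theta^\top x_i$ by full-batch gradient descent on a mean-squared-error loss. The model form, learning rate, and toy data are illustrative assumptions:

```python
import numpy as np

def mse_loss(theta, X, y):
    """L = (1/N) * sum_i (f(x_i) - y_i)^2 for a linear model f(x) = X @ theta."""
    residuals = X @ theta - y
    return np.mean(residuals ** 2)

def gradient_step(theta, X, y, lr=0.1):
    """One gradient-descent update: theta <- theta - lr * dL/dtheta."""
    N = len(y)
    grad = (2.0 / N) * X.T @ (X @ theta - y)
    return theta - lr * grad

# Toy D_local: features X and labels y generated from a known linear rule.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_theta = np.array([1.0, -2.0, 0.5])
y = X @ true_theta + 0.01 * rng.normal(size=100)

theta = np.zeros(3)
for _ in range(200):
    theta = gradient_step(theta, X, y)
print(mse_loss(theta, X, y))  # loss shrinks as theta approaches true_theta
```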

4. Validation

After the training node produces a trained model $\theta_p^{task}$, a selected group of validators, denoted as $V_j \in V$, each equipped with the evaluation dataset $D_{eval}$ from the task creator, will validate the model. The dataset consists of pairs $(x_i, y_i)$, where $x_i$ represents the features of the $i$-th sample, and $y_i$ is the corresponding true label.

To assess the performance of the trained model, we use a general evaluation metric, which is calculated as follows:

$$\text{Accuracy} = \frac{1}{|D_{eval}|} \sum_{(x_i, y_i) \in D_{eval}} \mathbb{1}(\hat{y}_i = y_i)$$

Here, $\mathbb{1}$ represents the indicator function, which returns 1 if the predicted label $\hat{y}_i$ matches the true label $y_i$, and 0 otherwise. $|D_{eval}|$ denotes the total number of samples within the evaluation dataset.

Each predicted label $\hat{y}_i$ from the model $\theta_p^{task}$ is compared against its corresponding true label $y_i$ within the dataset $D_{eval}$. The calculated metric result (accuracy here) serves as a quantifiable measure of $\theta_p^{task}$'s effectiveness at label prediction across the evaluation dataset.
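A minimal sketch of how a validator $V_j$ might compute this accuracy over $D_{eval}$. The `predict` callable standing in for the trained model $\theta_p^{task}$ is an assumed interface:

```python
from typing import Callable, Sequence, Tuple

def evaluate_accuracy(
    predict: Callable[[object], object],      # stands in for the trained model
    d_eval: Sequence[Tuple[object, object]],  # pairs (x_i, y_i) from the task creator
) -> float:
    """Accuracy = (1 / |D_eval|) * sum over D_eval of 1(y_hat_i == y_i)."""
    correct = sum(1 for x_i, y_i in d_eval if predict(x_i) == y_i)
    return correct / len(d_eval)

# Example with a trivial model that predicts the label 1 for every input.
d_eval = [([0.1, 0.2], 1), ([0.4, 0.5], 0), ([0.9, 0.7], 1)]
print(evaluate_accuracy(lambda x: 1, d_eval))  # 2 of 3 correct -> ~0.667
```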
