JS Task API Examples: executing tasks
Introduction
With Golem JS Task API you can execute just a single task on a single provider or a series of multiple tasks in parallel on many providers. In the case of the latter, you can additionally define how providers should be initialized and how many tasks you want to run simultaneously.
In this article, the following examples are presented:
- Running tasks in parallel
- Defining the number of providers working in parallel
- Initializing providers
- Running a single task
Prerequisites
Yagna service is installed and running with the try_golem
app-key configured.
How to run examples
Create a project folder, initialize a Node.js project, and install the @golem-sdk/task-executor
library.
mkdir golem-example
cd golem-example
npm init
npm install @golem-sdk/task-executor
Copy the code into the index.mjs
file in the project folder and run:
node index.mjs
Running tasks in parallel
If you want to run your tasks in parallel, you can call .run()
multiple times on the same TaskExecutor
object. The maximum number of concurrently engaged providers is defined by the maxParallelTasks
parameter. Providers can be engaged more than once until all tasks are executed.
import { TaskExecutor, pinoPrettyLogger } from "@golem-sdk/task-executor";
(async () => {
const executor = await TaskExecutor.create({
package: "golem/alpine:latest",
logger: pinoPrettyLogger(),
yagnaOptions: { apiKey: "try_golem" },
});
try {
const data = [1, 2, 3, 4, 5];
const futureResults = data.map(async (item) =>
executor.run(async (ctx) => {
return await ctx.run(`echo "${item}"`);
}),
);
const results = await Promise.allSettled(futureResults);
results.forEach((result) => {
if (result.status === "fulfilled") {
console.log("Success", result.value.stdout);
} else {
console.log("Failure", result.value.reason);
}
});
} catch (err) {
console.error("An error occurred:", err);
} finally {
await executor.shutdown();
}
})();
Note that we utilize the Promise.all()
method to process outputs from futureResults
which is an iterable of promises. While this approach allowed us to keep the snippet short and simple, there is a pitfall with such an approach. This method rejects when any of the input's promises are rejected, so if any of the tasks fail (after retries) user will not get any results at all. To avoid such a scenario use Promise.allSettled()
instead.
In the example, we have an array of 5 elements [1, 2, 3, 4, 5]
and we map
over each value to define a separate step for each element in the array. Each step is a command that prints the value of the element to the console.
In the output logs, you can have some interesting observations:
The provider sharkoon_379_6
was engaged first. When he had finished his first task he was still the only available provider and he received another task to execute. In the meantime, other providers were successfully engaged and the next tasks were dispatched to them.
Note that even though provider sharkoon_379_8
was engaged before provider 10hx4r2_2.h
, the latter completed its task before the former. In the network, different nodes offer varying performance.
Defining the number of providers used
You can set the maximum number of providers to be engaged at the same time. The TaskExecutor will scan available proposals and engage additional providers if the number of actually engaged providers is lower than maxParallelTasks
and there are still some tasks to be executed. If you do not define this parameter, a default value of 5 will be used.
Note that the actual number of engaged providers might be:
- lower than
maxParallelTasks
, if there are not enough providers available in the network. - higher, when considering the total number of engaged providers for all the tasks in your job. Providers might get disconnected or simply fail, in which case the TaskExecutor will engage another one to maintain the number of active workers at the level defined by
maxParallelTasks
.
Below, you can see how to define the maximum number of providers to be engaged.
import { TaskExecutor, pinoPrettyLogger } from "@golem-sdk/task-executor";
(async () => {
const executor = await TaskExecutor.create({
package: "golem/alpine:latest",
logger: pinoPrettyLogger(),
yagnaOptions: { apiKey: "try_golem" },
maxParallelTasks: 3,
});
try {
const data = [1, 2, 3, 4, 5];
const futureResults = data.map((item) => executor.run((ctx) => ctx.run(`echo "${item}"`)));
const results = await Promise.allSettled(futureResults);
results.forEach((result) => {
if (result.status === "fulfilled") {
console.log("Success", result.value.stdout);
} else {
console.log("Failure", result.value.reason);
}
});
} catch (err) {
console.error("Error occurred during task execution:", err);
} finally {
await executor.shutdown();
}
})();
Initialization tasks
Normally, when a larger job is divided into smaller tasks to be run in parallel on a limited number of providers, these providers might be utilized for more than one task. In such cases, each task is executed in the same environment as the previous task run on that provider. To optimize performance, you might decide that some initialization tasks need only be run once per provider. This can be particularly useful if you have to send a large amount of data to the provider.
This example requires an action_log.txt
file that can be created with the following command:
echo Action log: > action_log.txt
You can address such a need using the onActivityReady()
method of the TaskExecutor. Here is an example:
import { TaskExecutor, pinoPrettyLogger } from "@golem-sdk/task-executor";
(async () => {
const executor = await TaskExecutor.create({
package: "golem/alpine:latest",
logger: pinoPrettyLogger(),
yagnaOptions: { apiKey: "try_golem" },
maxParallelTasks: 3,
});
executor.onActivityReady(async (ctx) => {
console.log(ctx.provider.name + " is downloading action_log file");
await ctx.uploadFile("./action_log.txt", "/golem/input/action_log.txt");
});
const inputs = [1, 2, 3, 4, 5];
try {
const futureResults = inputs.map(async (item) => {
return await executor.run(async (ctx) => {
await ctx
.beginBatch()
.run(`echo 'processing item: ${item}' >> /golem/input/action_log.txt`)
.downloadFile("/golem/input/action_log.txt", `./output_${ctx.provider.name}.txt`)
.end();
});
});
await Promise.all(futureResults);
} catch (error) {
console.error("A critical error occurred:", error);
} finally {
await executor.shutdown();
}
})();
In the code, we decreased the maxParallelTasks
value from the default value of 5, to make sure that some of our five tasks will be run on the same provider.
The onActivityReady()
method is used to upload a file to a remote computer that will be used to log all future activity run on this provider. The code used in the onActivityReady()
method contains an additional console.log
to demonstrate that even if the whole job consists of five tasks, the task function used in onActivityReady()
will be executed only once per provider. (Unless the provider disengages and is engaged again - in such a situation, its virtual machine would be created anew, and we would upload the file again there).
Note how we utilized the ctx
worker context to get the provider name using the provider.name
property.
Inside each task function we employed the beginBatch()
to chain multiple commands - you can see more about this feature in the Defining Tasks article.
The log from this example shows that even if the provider imapp1019_0.h
was eventually used to execute 4 tasks, it uploaded the log only once. Its output file - downloaded after the last task was executed - contained the following:
some action log
processing item: 3
processing item: 4
processing item: 5
This log once again illustrates that providers offer different performance levels. Even though SBG5-1.h
and WAW1-16.h
signed agreements before Task 3 was computed on imapp1019_0.h
, this provider managed to complete task 3, 4 and 5 before the others downloaded the action_log
file and completed their first task.
Single run
Sometimes you don't need to run tasks in parallel and a single run is sufficient. In such cases, you can use the run()
method as demonstrated in the example below.
import { TaskExecutor, pinoPrettyLogger } from "@golem-sdk/task-executor";
(async () => {
const executor = await TaskExecutor.create({
package: "golem/node:20-alpine",
logger: pinoPrettyLogger(),
yagnaOptions: { apiKey: "try_golem" },
});
try {
const result = await executor.run(async (ctx) => (await ctx.run("node -v")).stdout);
console.log("Task result:", result);
} catch (err) {
console.error("Error during task execution:", err);
} finally {
await executor.shutdown();
}
})();
The requestor script runs a single task defined by a task function: ctx.run("node -v")
. The output of the command is available through stdout
of the result
object returned from the ctx.run()
function:
Below, you can see the script logs:
In the logs, we can see that the requestor uses the Holesky network for payments (a test network). The task was executed once on a single provider.
In the table, we see a summary of the costs (paid here in test GLM), along with the result of the command which outputs the version of the node in the image deployed on the provider.