DCP Worker Installation

This document is for people who want to run a standalone, command-line DCP worker on an Ubuntu Linux system. Ubuntu is currently the only supported platform; other variants will be added in the future, and in the medium term we hope to provide packages for both Fedora and Debian variants and subvariants of Linux. However, we are still actively developing more fundamental parts of the software ecosystem that is the Distributed Compute Protocol, so that work still lies in the future.

Prerequisites

There are a few basic prerequisites for running a DCP worker. The first is nodejs version 16 or greater. I recommend that you follow the instructions at:

https://github.com/nodesource/distributions/blob/master/README.md

to install the specific version of node you want; here at our offices, people run node versions 16 and 18.

Under Ubuntu, the commands from that page that configure apt to use their repo and install node 18 are:

curl -fsSL https://deb.nodesource.com/setup_18.x | sudo -E bash - &&\
sudo apt-get install -y nodejs

Please note that curl must be installed; if it is not, sudo apt install curl will install it.
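Once the install command above has finished, it can be worth confirming that the node on your PATH meets the version 16 minimum. The following is a minimal sketch; the function name node_major_ok is made up for this example, and it only assumes that node --version prints a string such as v18.12.1.

```shell
# node_major_ok VERSION [MIN]
# Succeeds when VERSION (for example "v18.12.1", as printed by
# `node --version`) has a major number of at least MIN (default 16).
node_major_ok() {
    version=${1#v}          # strip the leading "v"
    major=${version%%.*}    # keep everything before the first dot
    min=${2:-16}
    case $major in
        ''|*[!0-9]*) return 1 ;;   # not a plain number: treat as failure
    esac
    [ "$major" -ge "$min" ]
}

# Only run the check when node is actually on the PATH.
if command -v node >/dev/null 2>&1; then
    if node_major_ok "$(node --version)"; then
        echo "node $(node --version) is new enough"
    else
        echo "node $(node --version) is too old; version 16 or greater is required" >&2
    fi
fi
```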

You will also need to install the xinetd package (sudo apt install xinetd). We will be configuring the evaluator as an xinetd service that listens on port 9000 on loopback and starts an evaluator whenever a request comes in for one.

DCP Evaluator

After you have nodejs and npm installed, you will need the evaluator binary. We currently host it at

https://archive.distributed.computer/releases/linux/ubuntu-20.04/evaluator-v8-latest.tar.gz

which points to the latest posted release of the DCP evaluator. The evaluator is what does the actual computation; it is a JavaScript interpreter whose only interfaces are stdin and stdout, as a security measure.

If you look inside the tar.gz file, you’ll see this listing:

drwxrwxr-x 1000/1000         0 2023-02-16 13:23 ./
drwxrwxr-x 1000/1000         0 2023-02-16 13:23 ./opt/
drwxrwxr-x 1000/1000         0 2023-02-16 13:23 ./opt/dcp/
drwxr-xr-x 1000/1000         0 2023-02-16 13:24 ./opt/dcp/bin/
-rwxrwxr-x 1000/1000  39013608 2023-02-16 13:24 ./opt/dcp/bin/dcp-evaluator
drwxr-xr-x 1000/1000         0 2023-02-16 13:23 ./opt/dcp/lib/
drwxrwxrwt 1000/1000         0 2023-02-16 13:23 ./opt/dcp/tmp/
drwxrwxr-x 1000/1000         0 2023-02-16 13:23 ./opt/dcp/log/
drwxr-xr-x 1000/1000         0 2023-02-16 13:23 ./opt/dcp/etc/
drwxr-xr-x 1000/1000         0 2023-02-16 13:23 ./opt/dcp/etc/keys/
drwxrwxr-x 1000/1000         0 2023-02-16 13:23 ./opt/dcp/libexec/
drwxr-xr-x 1000/1000         0 2023-02-16 13:23 ./opt/dcp/libexec/evaluator/
drwxr-xr-x 1000/1000         0 2023-02-16 13:23 ./opt/dcp/www/
-rw-rw-r-- 1000/1000       701 2023-02-16 13:24 ./opt/dcp/install.log
drwxrwxrwt 1000/1000         0 2023-02-16 13:23 ./opt/dcp/run/
drwxrwxrwt 1000/1000         0 2023-02-16 13:23 ./opt/dcp/run/tlog/

This archive should be extracted into the root directory (/) of your system with admin privileges.
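The extraction step can be sketched as below. This is an illustration rather than an official install script: the function name extract_evaluator and the SUDO variable are invented for the example, and it assumes the archive has already been downloaded to the current directory.

```shell
# extract_evaluator ARCHIVE [ROOT]
# Extracts ARCHIVE under ROOT (default: /). The archive stores its files
# as ./opt/dcp/..., so extracting at / populates /opt/dcp.
extract_evaluator() {
    archive=$1
    root=${2:-/}
    # Listing the archive first catches a truncated or corrupt download
    # before anything is written to disk.
    tar -tzf "$archive" >/dev/null || return 1
    # SUDO is empty by default; set SUDO=sudo when ROOT needs admin privileges.
    ${SUDO:-} tar -xzf "$archive" -C "$root"
}

# Typical use on a real system (left commented out on purpose):
#   curl -fsSLO https://archive.distributed.computer/releases/linux/ubuntu-20.04/evaluator-v8-latest.tar.gz
#   SUDO=sudo extract_evaluator evaluator-v8-latest.tar.gz
```

Passing a scratch directory as the second argument lets you inspect the extracted tree before committing it to the real root.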

The DCP User

The next thing you’ll need to do is to create a user to run the dcp worker. This user should not have access to any admin privileges. When creating our docker worker, we use the useradd command so we can specify a few things about the user on the command line:

useradd -U -d /opt/dcp -s /usr/sbin/nologin dcp

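In particular, this sets the user’s home directory to the /opt/dcp hierarchy and gives it the nologin shell, so it cannot be used for interactive logins. The -U flag also creates a group with the same name as the user. Outside of a container, where you are not already root, prefix the command with sudo.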
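After creating the user, you may want to confirm that it ended up with the intended home directory and shell. The sketch below checks a passwd-style entry; the function name dcp_entry_ok is hypothetical, and the check assumes the standard colon-separated passwd layout in which the home directory is field 6 and the shell is field 7.

```shell
# dcp_entry_ok PASSWD_LINE
# Succeeds when a passwd-style line has /opt/dcp as the home directory
# (field 6) and /usr/sbin/nologin as the shell (field 7).
dcp_entry_ok() {
    home=$(printf '%s\n' "$1" | cut -d: -f6)
    shell=$(printf '%s\n' "$1" | cut -d: -f7)
    [ "$home" = "/opt/dcp" ] && [ "$shell" = "/usr/sbin/nologin" ]
}

# Run the check only if the dcp user already exists on this machine.
if getent passwd dcp >/dev/null 2>&1; then
    if dcp_entry_ok "$(getent passwd dcp)"; then
        echo "dcp user has the expected home directory and shell"
    else
        echo "dcp user exists but its home directory or shell differs" >&2
    fi
fi
```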

Then, we need to create a .dcp folder under /opt/dcp to contain any configuration information required for the worker:

sudo mkdir /opt/dcp/.dcp

In particular, that is where you can put dcp-config.js, a JSON file that contains any desired configuration information for the worker. This will also be the location of the id.keystore file, which contains the unique ID that identifies the worker to the scheduler.

Finally, now that the folder hierarchy and the dcp user are in place, we need to change the ownership of the whole hierarchy to the dcp user:

sudo chown -R dcp:dcp /opt/dcp/

Services Configuration

The next step is configuring the services required to get a worker going. This assumes that you are using xinetd to launch evaluators and systemd to control the worker as a whole. You can find example files by downloading

https://archive.distributed.computer/releases/linux/ubuntu-20.04/dcpWorkerServicesExamples.tgz

Following are notes about the files found in that archive.

dcp-config.js.EXAMPLE

This is a short example of the JSON used to configure the worker. In this example, the timeout for the evaluator to emit a work progress event is 300 seconds; if an evaluator has not emitted one after working for five minutes, the worker assumes the evaluator has crashed, kills it, and returns an error to the scheduler. You may want to change this depending on your workload; in particular, people running private compute groups with on-premise workers doing specific work may wish to adjust this value to suit the characteristics of the slices they send to their workers.

dcp-evaluator.EXAMPLE

This is the configuration for the xinetd service. Copy this file into /etc/xinetd.d/ and restart the xinetd service to load it. There is one option you may wish to change: the “-- --options=--max-old-space-size=8192” on the server_args line, which sets the amount of heap space available to the evaluator. Node’s default is 1.5 GB of heap; this option raises it to 8 GB (8192 MB). You may wish to tweak this value based on the number of cores and amount of RAM you have available. 8 GB should ensure that you can run pretty much any job, but it may also cause system responsiveness issues on computers with a large number of cores and a smaller amount of RAM. To change it, edit the number and issue sudo systemctl restart xinetd to reload xinetd’s configuration. Note also that the two dashes by themselves (--) before the --options= flag are necessary to ensure that the options are passed through to the evaluator to configure the V8 process it runs in.
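Changing the heap size by hand is easy enough, but if you manage several machines a small helper can make the edit repeatable. The sketch below is one possible approach, not part of the DCP tooling: set_evaluator_heap and the SUDO variable are names invented here, and it assumes GNU sed (as shipped with Ubuntu) for the in-place -i option.

```shell
# set_evaluator_heap FILE MB
# Rewrites the number after --max-old-space-size= in FILE to MB,
# leaving the rest of the line (including the leading "--") untouched.
set_evaluator_heap() {
    file=$1
    mb=$2
    case $mb in
        ''|*[!0-9]*)
            echo "heap size must be a whole number of MB" >&2
            return 1 ;;
    esac
    # SUDO is empty by default; set SUDO=sudo for files under /etc.
    ${SUDO:-} sed -i "s/--max-old-space-size=[0-9][0-9]*/--max-old-space-size=$mb/" "$file"
}

# Typical use on a real system (left commented out on purpose):
#   SUDO=sudo set_evaluator_heap /etc/xinetd.d/dcp-evaluator 4096
#   sudo systemctl restart xinetd
```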

dcp-worker.service.EXAMPLE

This is the systemd service file used to manage the worker as a whole. Two items in it are of particular interest: the Environment= and ExecStart= lines. Currently, the service enables computeGroups debug logging, so you can check the logs if you are configuring the worker to join a particular compute group, leave the public group, and/or use the public group as a fallback source of work when the private group the worker has joined has no work available. The help output from the worker, reproduced later in this file, explains the common command line options you can pass to the worker. For the purposes of this section, note that the worker is configured to pay into a “catch-all” bank account; you will almost certainly wish to change this account number to one that you have created by registering at https://portal.distributed.computer and creating a bank account for your earnings to be deposited in.
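To make the options discussed above concrete, here is a small sketch that assembles a worker argument list from a payment address and an optional compute group. The flag names (--paymentAddress, --join, --publicGroupFallback) are taken from the worker’s help output reproduced later in this document; the function name worker_args and the placeholder values are invented for illustration, and how the example service file actually invokes the worker on its ExecStart= line is not shown here.

```shell
# worker_args PAYMENT_ADDRESS [JOIN_KEY,JOIN_SECRET]
# Prints the worker options for paying into PAYMENT_ADDRESS and,
# when a group is given, joining it with the public group as fallback.
worker_args() {
    args="--paymentAddress $1"
    if [ -n "${2:-}" ]; then
        # Prefer the private group; fall back to the public group
        # when the private group has no work available.
        args="$args --join $2 --publicGroupFallback"
    fi
    printf '%s\n' "$args"
}

# Placeholder values only; substitute your own account and group details.
worker_args YOUR_BANK_ACCOUNT_ADDRESS
worker_args YOUR_BANK_ACCOUNT_ADDRESS joinKey,joinSecret
```

The printed options would then be appended to the worker command on the ExecStart= line, followed by sudo systemctl daemon-reload so that systemd picks up the change.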

startup.sh.EXAMPLE

This is a pre-execute script that does a few things for you automatically when the service is started. First, it ensures that you are running the latest node modules for dcp-worker and dcp-util. Second, it checks whether an identity keystore (/opt/dcp/.dcp/id.keystore) exists and, if not, creates one to identify the worker to the scheduler. Note that this is distinct from the payment address; it identifies the worker uniquely, not you uniquely. The bank account is generally managed using a file called default.keystore. That file contains the private key and is necessary to withdraw funds from your account, but a worker does not need it: all the worker requires is the address of the account into which payment for completed work is to be deposited.
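The “create the keystore only if it is missing” logic described above follows a common guard pattern, sketched below. This is not the contents of startup.sh: the function name ensure_keystore is invented, and because the actual command the script uses to generate a keystore is not shown in this document, the sketch takes that command as arguments instead of guessing it.

```shell
# ensure_keystore PATH COMMAND [ARGS...]
# Runs COMMAND only when no file exists at PATH, so an existing
# identity is never overwritten by a later start of the service.
ensure_keystore() {
    keystore=$1
    shift
    if [ -f "$keystore" ]; then
        echo "keeping existing keystore: $keystore"
        return 0
    fi
    echo "no keystore at $keystore; creating one"
    "$@"
}

# Typical use, with the creation command left as a placeholder:
#   ensure_keystore /opt/dcp/.dcp/id.keystore <command that creates the keystore>
```

Keeping the existing file matters because the identity keystore is what the scheduler uses to recognise the worker between restarts.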

File locations

Generally, you will want to copy dcp-config.js.EXAMPLE to /opt/dcp/.dcp, dcp-evaluator.EXAMPLE to /etc/xinetd.d, dcp-worker.service.EXAMPLE to /etc/systemd/system, and startup.sh.EXAMPLE to /opt/dcp/bin, and modify as appropriate for your system and context. You will also need to issue sudo systemctl daemon-reload to load the dcp-worker.service service, as well as sudo systemctl restart xinetd to ensure that xinetd has loaded the dcp-evaluator configuration file.
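The copy steps above can be sketched as a single helper. Several details are assumptions made for this example: the function name install_examples and the SUDO variable are invented, the example files are assumed to sit together in one directory after unpacking the archive, and each file is assumed to be installed under its name with the .EXAMPLE suffix removed.

```shell
# install_examples SRC_DIR [ROOT]
# Copies the four example files from SRC_DIR to their destinations,
# dropping the .EXAMPLE suffix. ROOT is prepended to every destination
# and defaults to empty, which means the real filesystem root.
install_examples() {
    src=$1
    root=${2:-}
    # SUDO is empty by default; set SUDO=sudo when writing to /etc and /opt.
    ${SUDO:-} cp "$src/dcp-config.js.EXAMPLE"      "$root/opt/dcp/.dcp/dcp-config.js" &&
    ${SUDO:-} cp "$src/dcp-evaluator.EXAMPLE"      "$root/etc/xinetd.d/dcp-evaluator" &&
    ${SUDO:-} cp "$src/dcp-worker.service.EXAMPLE" "$root/etc/systemd/system/dcp-worker.service" &&
    ${SUDO:-} cp "$src/startup.sh.EXAMPLE"         "$root/opt/dcp/bin/startup.sh"
}

# Typical use on a real system (left commented out on purpose):
#   SUDO=sudo install_examples ./examples
#   sudo systemctl daemon-reload
#   sudo systemctl restart xinetd
```

Files copied under /opt/dcp with sudo are owned by root, so you may want to rerun chown -R dcp:dcp /opt/dcp/ afterwards.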

Running the worker

At this point, the worker can be managed by issuing systemctl commands against the dcp-worker service (for example, sudo systemctl start dcp-worker and sudo systemctl stop dcp-worker).
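If the worker starts but does no work, one thing to check is whether xinetd is actually listening on the evaluator port (9000 on loopback, per the Prerequisites section). The sketch below scans the output of ss -ltn for a listening socket on a given port; the function name listening_on is made up for this example.

```shell
# listening_on PORT
# Reads `ss -ltn` output on stdin and succeeds when any column of any
# line is an address ending in ":PORT". Peer addresses of listening
# sockets end in ":*", so they never match a numeric port.
listening_on() {
    awk -v port="$1" '
        { for (i = 1; i <= NF; i++) if ($i ~ (":" port "$")) found = 1 }
        END { exit !found }
    '
}

# Run the check only where the ss tool is available.
if command -v ss >/dev/null 2>&1; then
    if ss -ltn | listening_on 9000; then
        echo "something is listening on port 9000"
    else
        echo "nothing is listening on port 9000; check the xinetd service" >&2
    fi
fi
```

For the worker itself, sudo systemctl status dcp-worker reports whether the service is currently running.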

Here is the command line help for the worker. Please consult this and configure either the /etc/systemd/system/dcp-worker.service or /etc/xinetd.d/dcp-evaluator service definitions to configure your desired behaviour.

dcp-worker - Standalone NodeJS DCP Worker
Copyright (c) 2020 Distributive Corp Ltd., All Rights Reserved.

To view the common DCP options, use the --help flag with --dcp-options.

DCP Common Options
      --dcp-scheduler  Specify a scheduler's URL to connect to/fetch dcp-client
                       from                                             [string]

Output options
  -v, --verbose               Enable verbose output     [count] [default: false]
  -o, --outputMode, --output  Set the output mode
     [string] [choices: "detect", "console", "dashboard", "event-log", "syslog",
                                                  "logfile"] [default: "detect"]
      --reportInterval        If set, output a status summary every [interval]
                              seconds in console output mode            [number]

Identity options
      --identityKey       Identity key, in hex format                   [string]
      --identityKeystore  Identity keystore, in json format             [string]

Log File output options
      --logfile  Path to log file (if --output=file)                    [string]

Syslog output options
      --syslogAddress    Address of rsyslog server (if --output=syslog) [string]
      --syslogTransport  Transport to connect to rsyslog daemon (if
                         --output=syslog)       [string] [choices: "udp", "tcp"]
      --syslogPort       UDP/TCP port of rsyslog server                 [number]

Options:
  -h, --help                        Show help, use with commands to show
                                    detailed command help: node <cmd> --help
                                                                       [boolean]
      --show-hidden, --dcp-options  Show hidden options, and DCP common options
                                    (must be used with the --help flag)[boolean]
      --paymentAddress              The address to deposit funds into, will use
                                    the default bank keystore if not provided.
                                                                        [string]
  -c, --cores                       Number of cores to work with
                                                  [number] [default: numCores-1]
  -H, --hostname                    Evaluator hostname
                                                 [string] [default: "localhost"]
  -p, --port                        Evaluator port      [number] [default: 9000]
  -P, --priorityOnly                Set the priority mode [deprecated]
                                                      [boolean] [default: false]
  -j, --job-id                      Restrict worker to a specific job (use N
                                    times for N jobs)                    [array]
  -g, --join                        Join compute group; the format is
                                    "joinKey,joinSecret" or
                                    "joinKey,eh1-joinHash"               [array]
      --leavePublicGroup            Do not fetch slices from public compute
                                    group             [boolean] [default: false]
      --publicGroupFallback         If set, worker will prefer private groups
                                    but fall back on the public group if no
                                    preferred work is available
                                                      [boolean] [default: false]
      --eventDebug                  If set, dump all sandbox and worker events
  -a, --allowedOrigins              modify the 'any' allow origins of dcpConfig
                                                                         [array]
      --replPort                    If set, open a REPL on specified TCP port
                                                                        [number]