Pseudo-Cluster Deployment

The purpose of pseudo-cluster deployment is to deploy the DolphinScheduler service on a single machine. In this mode, DolphinScheduler’s master, worker, api server, and logger server are all on the same machine.

If you are new to DolphinScheduler and just want to try it out, we recommend the Standalone installation. If you want to try more complete functionality or schedule a large number of tasks, we recommend pseudo-cluster deployment. If you want to use DolphinScheduler in production, we recommend cluster deployment or Kubernetes deployment.

Prepare

Pseudo-cluster deployment of DolphinScheduler requires the following external software:

  • JDK: Download JDK (1.8+) and configure the JAVA_HOME and PATH variables. You can skip this step if a JDK already exists in your environment.
  • Binary package: Download the DolphinScheduler binary package from the download page
  • Database: PostgreSQL (8.2.15+) or MySQL (5.7+); choose one of the two. MySQL requires JDBC Driver 8.0.16
  • Registry Center: ZooKeeper (3.4.6+), download link
  • Process tree analysis
    • pstree for macOS
    • psmisc for Fedora/Red Hat/CentOS/Ubuntu/Debian

Note: DolphinScheduler itself does not depend on Hadoop, Hive, or Spark, but if you need to run tasks that depend on them, the corresponding environment must be available.
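The prerequisites listed above can be checked before you continue. The following is a minimal sketch, not part of DolphinScheduler: the helper names missing_commands and check_java_home are hypothetical, and the command names you pass in depend on your system.

```shell
# Sketch: report which prerequisite commands are missing from PATH.
# Helper names are hypothetical; they are not shipped with DolphinScheduler.

missing_commands() {
  # Print each command name that cannot be found, one per line.
  local cmd
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "$cmd"
    fi
  done
}

check_java_home() {
  # Succeed only if JAVA_HOME is set and contains an executable bin/java.
  [ -n "${JAVA_HOME:-}" ] && [ -x "$JAVA_HOME/bin/java" ]
}
```

For example, running missing_commands java pstree prints nothing when both commands are available, and check_java_home returns a non-zero status when JAVA_HOME is unset or does not point at a JDK.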

DolphinScheduler startup environment

Configure user exemption and permissions

Create a deployment user, and be sure to configure passwordless sudo for it. The example here uses the user dolphinscheduler.

    # To create a user, log in as root
    useradd dolphinscheduler
    # Add password
    echo "dolphinscheduler" | passwd --stdin dolphinscheduler
    # Configure sudo without password
    sed -i '$adolphinscheduler ALL=(ALL) NOPASSWD: ALL' /etc/sudoers
    sed -i 's/^Defaults[[:space:]][[:space:]]*requiretty/#&/' /etc/sudoers
    # Modify directory permissions and grant permissions to the user you created above
    chown -R dolphinscheduler:dolphinscheduler apache-dolphinscheduler-*-bin

NOTICE:

  • DolphinScheduler’s multi-tenant feature switches users when running tasks by using the command sudo -u {linux-user}, so the deployment user needs passwordless sudo privileges. If you are new to this and it is unclear, you can ignore this point for the time being.
  • If you find the line “Defaults requiretty” in the /etc/sudoers file, please comment it out
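To confirm the sudo setup described above, a check along the following lines may help. It is a sketch under assumptions: the function names are hypothetical, and requiretty_enabled only recognizes the simple "Defaults requiretty" form, not per-user or negated variants.

```shell
# Sketch: verify the sudo setup for the deployment user.
# Function names are hypothetical helpers, not DolphinScheduler commands.

has_passwordless_sudo() {
  # sudo -n never prompts; it fails instead of asking for a password.
  sudo -n true 2>/dev/null
}

requiretty_enabled() {
  # $1: path to a sudoers-style file.
  # Succeed if the file has an uncommented "Defaults requiretty" line.
  grep -Eq '^[[:space:]]*Defaults[[:space:]]+requiretty' "$1"
}
```

Run has_passwordless_sudo as the deployment user; a zero exit status suggests that sudo works without a password. Running requiretty_enabled /etc/sudoers as root should fail once the line has been commented out.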

Configure machine SSH password-free login

Since resources need to be sent to different machines during installation, SSH password-free login is required between all machines. The steps to configure password-free login are as follows:

    su dolphinscheduler
    ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    chmod 600 ~/.ssh/authorized_keys

Notice: After the configuration is complete, run the command ssh localhost to test it. The configuration works if you can log in over SSH without entering a password.
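The manual ssh localhost test can also be scripted so that it fails instead of waiting at a password prompt. This is a sketch only; the function names are hypothetical, and the five-second timeout is an arbitrary choice.

```shell
# Sketch: non-interactive checks for the SSH setup above.
# Function names are hypothetical.

can_ssh_without_password() {
  # $1: host (defaults to localhost).
  # BatchMode=yes disables password prompts, so a missing key makes
  # the command fail instead of hanging.
  ssh -o BatchMode=yes -o ConnectTimeout=5 "${1:-localhost}" true
}

has_mode_600() {
  # $1: file path. Compare the permission string printed by ls with
  # read/write for the owner only, the mode set by chmod 600.
  [ "$(ls -ld "$1" | cut -c1-10)" = "-rw-------" ]
}
```

For example, has_mode_600 ~/.ssh/authorized_keys followed by can_ssh_without_password localhost should both succeed for the deployment user.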

Start ZooKeeper

Go to the ZooKeeper installation directory, copy the configuration file conf/zoo_sample.cfg to conf/zoo.cfg, and change the value of dataDir in conf/zoo.cfg to dataDir=./tmp/zookeeper

    # Start ZooKeeper
    ./bin/zkServer.sh start
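The copy-and-edit step for zoo.cfg can be done in one command. The helper below is a hypothetical sketch; it assumes the sample file already contains a dataDir= line and that the new value contains no "|" or "&" characters, which would confuse sed.

```shell
# Sketch: create zoo.cfg from the sample with a custom dataDir.
# make_zoo_cfg is a hypothetical helper, not part of ZooKeeper.

make_zoo_cfg() {
  # $1: path to zoo_sample.cfg, $2: output path, $3: new dataDir value
  sed "s|^dataDir=.*|dataDir=$3|" "$1" > "$2"
}
```

For example, make_zoo_cfg conf/zoo_sample.cfg conf/zoo.cfg ./tmp/zookeeper writes the configuration described above. After starting the server, ./bin/zkServer.sh status reports whether it is running.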

Modify configuration

After completing the preparation of the basic environment, you need to modify the configuration file according to your environment. The configuration file is located at conf/config/install_config.conf. Generally, you only need to modify the INSTALL MACHINE, DolphinScheduler ENV, Database, and Registry Server parts to complete the deployment. The following describes the parameters that must be modified:

    # ---------------------------------------------------------
    # INSTALL MACHINE
    # ---------------------------------------------------------
    # Because the master, worker, and API server are deployed on a single node, the IP of the server is the machine IP or localhost
    ips="localhost"
    masters="localhost"
    workers="localhost:default"
    alertServer="localhost"
    apiServers="localhost"
    pythonGatewayServers="localhost"
    # DolphinScheduler installation path; it is created automatically if it does not exist
    installPath="~/dolphinscheduler"
    # Deployment user; use the one you created in section **Configure user exemption and permissions**
    deployUser="dolphinscheduler"
    # ---------------------------------------------------------
    # DolphinScheduler ENV
    # ---------------------------------------------------------
    # The path of JAVA_HOME, which is the JDK installation path from section **Prepare**
    javaHome="/your/java/home/here"
    # ---------------------------------------------------------
    # Database
    # ---------------------------------------------------------
    # Database type, username, password, IP, port, metadata. For now DATABASE_TYPE supports `mysql`, `postgresql`, and `H2`
    # Make sure each configuration value is enclosed in double quotation marks; otherwise it may not take effect
    DATABASE_TYPE="mysql"
    SPRING_DATASOURCE_URL="jdbc:mysql://ds1:3306/ds_201_doc?useUnicode=true&characterEncoding=UTF-8"
    # Must be modified if you are not using dolphinscheduler/dolphinscheduler as your username and password
    SPRING_DATASOURCE_USERNAME="dolphinscheduler"
    SPRING_DATASOURCE_PASSWORD="dolphinscheduler"
    # ---------------------------------------------------------
    # Registry Server
    # ---------------------------------------------------------
    # Registration center address, the address of the ZooKeeper service
    registryServers="localhost:2181"
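Because the configuration file above consists of shell variable assignments, a quick sanity check can source it and confirm that the required values are set. This is a hedged sketch: check_install_config is a hypothetical helper, the list of keys is an assumption taken from the example above, and sourcing executes the file as shell code, so use it only on files you trust.

```shell
# Sketch: check that install_config.conf sets the keys shown above.
# check_install_config is hypothetical; the key list is an assumption.

check_install_config() {
  # $1: path to the config file (include a directory part, e.g. ./conf/...).
  # Runs in a subshell so the sourced variables do not leak into the caller.
  (
    keys="ips masters workers installPath deployUser javaHome registryServers"
    unset $keys
    . "$1" || exit 1
    status=0
    for key in $keys; do
      eval "value=\${$key:-}"
      if [ -z "$value" ]; then
        echo "missing: $key"
        status=1
      fi
    done
    exit "$status"
  )
}
```

For example, check_install_config ./conf/config/install_config.conf prints one "missing: ..." line per unset key and returns a non-zero status if any key is missing.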

Initialize the database

DolphinScheduler metadata is stored in a relational database. Currently, PostgreSQL and MySQL are supported. If you use MySQL, you need to manually download the mysql-connector-java driver (8.0.16) and move it to the lib directory of DolphinScheduler. The following uses MySQL as an example of how to initialize the database:

    mysql -uroot -p
    mysql> CREATE DATABASE dolphinscheduler DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
    # Replace {user} and {password} with your own values
    mysql> GRANT ALL PRIVILEGES ON dolphinscheduler.* TO '{user}'@'%' IDENTIFIED BY '{password}';
    mysql> GRANT ALL PRIVILEGES ON dolphinscheduler.* TO '{user}'@'localhost' IDENTIFIED BY '{password}';
    mysql> flush privileges;

After the above steps, the new database for DolphinScheduler exists. Then run the following shell script to initialize it:

    sh script/create-dolphinscheduler.sh
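When MySQL is used, the initialization script needs the JDBC driver mentioned above in the lib directory. A small pre-check such as the following sketch can catch a missing driver early. The helper name is hypothetical, and the file name pattern assumes the driver keeps its usual mysql-connector-java-<version>.jar name.

```shell
# Sketch: check that a MySQL JDBC driver jar is present in the lib directory.
# has_mysql_driver is a hypothetical helper; the jar name pattern is an assumption.

has_mysql_driver() {
  # $1: path to the DolphinScheduler lib directory.
  # ls fails when the pattern matches no file.
  ls "$1"/mysql-connector-java-*.jar >/dev/null 2>&1
}
```

For example, has_mysql_driver ./lib || echo "MySQL driver not found in lib" can be run before sh script/create-dolphinscheduler.sh.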

Start DolphinScheduler

As the deployment user you created above, run the following command to complete the deployment. The server logs will be stored in the logs folder.

    sh install.sh

Note: During the first deployment, the message sh: bin/dolphinscheduler-daemon.sh: No such file or directory may appear five times in the terminal. This message is harmless and you can ignore it.

Log in to DolphinScheduler

Open http://localhost:12345/dolphinscheduler in a browser to log in to the DolphinScheduler UI. The default username and password are admin/dolphinscheduler123
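The API server may take a short while to come up after installation. The following sketch polls the address above until it answers. wait_for_ui is a hypothetical helper, it assumes curl is installed, and the default of 30 attempts two seconds apart is an arbitrary choice.

```shell
# Sketch: wait until the DolphinScheduler UI address responds.
# wait_for_ui is a hypothetical helper; it assumes curl is available.

wait_for_ui() {
  # $1: URL, $2: max attempts (default 30), $3: seconds between attempts (default 2)
  local url="$1" attempts="${2:-30}" delay="${3:-2}" i=1
  while [ "$i" -le "$attempts" ]; do
    # -f treats HTTP errors (status 400 and above) as failure; -s silences progress output.
    if curl -fs -o /dev/null "$url"; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}
```

For example, wait_for_ui http://localhost:12345/dolphinscheduler returns 0 as soon as the address responds and 1 if it never does within the allowed attempts.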

Start or stop server

    # Stop all DolphinScheduler servers
    sh ./bin/stop-all.sh
    # Start all DolphinScheduler servers
    sh ./bin/start-all.sh
    # Start or stop DolphinScheduler Master
    sh ./bin/dolphinscheduler-daemon.sh start master-server
    sh ./bin/dolphinscheduler-daemon.sh stop master-server
    # Start or stop DolphinScheduler Worker
    sh ./bin/dolphinscheduler-daemon.sh start worker-server
    sh ./bin/dolphinscheduler-daemon.sh stop worker-server
    # Start or stop DolphinScheduler Api
    sh ./bin/dolphinscheduler-daemon.sh start api-server
    sh ./bin/dolphinscheduler-daemon.sh stop api-server
    # Start or stop Logger
    sh ./bin/dolphinscheduler-daemon.sh start logger-server
    sh ./bin/dolphinscheduler-daemon.sh stop logger-server
    # Start or stop Alert
    sh ./bin/dolphinscheduler-daemon.sh start alert-server
    sh ./bin/dolphinscheduler-daemon.sh stop alert-server
    # Start or stop Python Gateway Server
    sh ./bin/dolphinscheduler-daemon.sh start python-gateway-server
    sh ./bin/dolphinscheduler-daemon.sh stop python-gateway-server

Note: Please refer to the section “System Architecture Design” for service usage
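To apply the same action to several services at once, the per-service commands above can be wrapped in a loop. This is a sketch, not a script shipped with DolphinScheduler: for_each_service, DS_SERVICES, and DS_DAEMON are hypothetical names, and the service order is simply the order used above.

```shell
# Sketch: run one daemon action for several services.
# for_each_service, DS_SERVICES, and DS_DAEMON are hypothetical names.

# Service names as used in the per-service commands above.
DS_SERVICES="master-server worker-server api-server logger-server alert-server python-gateway-server"
# Path to the daemon script; override DS_DAEMON if your layout differs.
DS_DAEMON="${DS_DAEMON:-./bin/dolphinscheduler-daemon.sh}"

for_each_service() {
  # $1: action (start or stop); remaining arguments: service names.
  # With no service names, the action is applied to every entry in DS_SERVICES.
  local action="$1" svc
  shift
  [ "$#" -gt 0 ] || set -- $DS_SERVICES
  for svc in "$@"; do
    # Stop at the first service whose command fails.
    sh "$DS_DAEMON" "$action" "$svc" || return 1
  done
}
```

For example, for_each_service stop worker-server api-server stops only those two services, and for_each_service start runs the start action for all six.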