Kill Mesos Framework

Want to kill a Mesos framework? The following API call will do it for you.

curl -X POST "http://$your_mesos:5050/master/teardown" -d "frameworkId=$id"
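
If you don't know the framework ID, you can look it up first. A minimal sketch, assuming `jq` is installed and `$your_mesos` holds a master hostname (`/master/frameworks` is part of the Mesos master HTTP API); the ID below is a placeholder. Note the double quotes around the `-d` argument, so the shell expands `$id`:

```shell
# List active frameworks with their IDs (assumes jq is installed).
curl -s "http://$your_mesos:5050/master/frameworks" \
  | jq -r '.frameworks[] | "\(.id)\t\(.name)"'

# Tear down a framework by ID; double quotes let the shell expand $id.
id="<framework-id-from-the-list-above>"
curl -X POST "http://$your_mesos:5050/master/teardown" -d "frameworkId=$id"
```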

Install and Configure Mesos-DNS on Mesos Cluster

Overview

Before you read this post, consider reading: Install and Configure Production-Ready Mesos Cluster on Photon Hosts, Install and Configure Marathon for Mesos Cluster on Photon Hosts, and Install and Configure DCOS CLI for Mesos.
Once you have fully installed and configured a Mesos cluster, you can execute jobs on it. But if you want service discovery and load-balancing capabilities, you will need Mesos-DNS and HAProxy. In this post I will explain how to install and configure Mesos-DNS for your Mesos cluster.

Mesos-DNS supports service discovery in Apache Mesos clusters. It allows applications and services running on Mesos to find each other through the domain name system (DNS), similarly to how services discover each other throughout the Internet. Applications launched by Marathon are assigned names like search.marathon.mesos. Mesos-DNS translates these names to the IP address and port on the machine currently running each application. To connect to an application in the Mesos datacenter, all you need to know is its name. Every time a connection is initiated, the DNS translation will point to the right machine in the datacenter.

Installation
I will explain how to run Mesos-DNS as a Docker container through Marathon: first how to create the configuration file for the mesos-dns container, then how to launch it via Marathon.

root@pt-mesos-node1 [ ~ ]# cat /etc/mesos-dns/config.json
{
  "zk": "zk://192.168.0.1:2181,192.168.0.2:2181,192.168.0.3:2181/mesos",
  "masters": ["192.168.0.1:5050", "192.168.0.2:5050", "192.168.0.3:5050"],
  "refreshSeconds": 60,
  "ttl": 60,
  "domain": "mesos",
  "port": 53,
  "resolvers": ["10.23.1.1"],
  "timeout": 5,
  "httpon": true,
  "dnson": true,
  "httpport": 8123,
  "externalon": true,
  "SOAMname": "ns1.mesos",
  "SOARname": "root.ns1.mesos",
  "SOARefresh": 60,
  "SOARetry":   600,
  "SOAExpire":  86400,
  "SOAMinttl": 60
}

Create Application Run File
The next step is to create a JSON file and run the service from Marathon for high availability. The service can be launched via the Marathon API or via the DCOS CLI.

knesenko@knesenko-mbp:~/mesos/jobs$ cat mesos-dns-docker.json
{
    "args": [
        "/mesos-dns",
        "-config=/config.json"
    ],
    "container": {
        "docker": {
            "image": "mesosphere/mesos-dns",
            "network": "HOST"
        },
        "type": "DOCKER",
        "volumes": [
            {
                "containerPath": "/config.json",
                "hostPath": "/etc/mesos-dns/config.json",
                "mode": "RO"
            }
        ]
    },
    "cpus": 0.2,
    "id": "mesos-dns",
    "instances": 1,
    "constraints": [["hostname", "CLUSTER", "pt-mesos-node2.example.com"]]
}
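
The app definition above can be submitted either way. A sketch of both, assuming Marathon listens on 192.168.0.1:8080 (`/v2/apps` is Marathon's REST endpoint for creating applications; `MARATHON_URL` is a placeholder variable):

```shell
# Post the app definition to Marathon's REST API.
marathon_add() {
    curl -s -X POST -H "Content-Type: application/json" \
         -d @"$1" "${MARATHON_URL:-http://192.168.0.1:8080}/v2/apps"
}
marathon_add mesos-dns-docker.json

# Or, with the DCOS CLI:
dcos marathon app add mesos-dns-docker.json
```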

Now we can see in the Marathon UI that the application has been launched.

Setup Resolvers and Testing
To allow Mesos tasks to use Mesos-DNS as the primary DNS server, you must edit /etc/resolv.conf on every slave and add a new nameserver entry. For instance, if Mesos-DNS runs on the server with IP address 192.168.0.5, add nameserver 192.168.0.5 at the beginning of /etc/resolv.conf on every slave.

root@pt-mesos-node2 [ ~/mesos-dns ]# cat /etc/resolv.conf
# This file is managed by systemd-resolved(8). Do not edit.
#
# Third party programs must not access this file directly, but
# only through the symlink at /etc/resolv.conf. To manage
# resolv.conf(5) in a different way, replace the symlink by a
# static file or a different symlink.
nameserver 192.168.0.5
root@pt-mesos-node2 [ ~/mesos-dns ]#
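
The edit can be scripted on each slave. A sketch using GNU sed; it keeps a backup first, since this resolv.conf is managed by systemd-resolved and may be rewritten:

```shell
# Prepend the Mesos-DNS nameserver to resolv.conf, keeping a backup.
cp /etc/resolv.conf /etc/resolv.conf.bak
sed -i '1i nameserver 192.168.0.5' /etc/resolv.conf
```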

Let's run a simple Docker app and see if we can resolve it in DNS.

knesenko@knesenko-mbp:~/mesos/jobs$ cat docker.json
{
    "id": "docker-hello",
    "container": {
        "docker": {
            "image": "centos"
        },
        "type": "DOCKER",
        "volumes": []
    },
    "cmd": "echo hello; sleep 10000",
    "mem": 16,
    "cpus": 0.1,
    "instances": 10,
    "disk": 0.0,
    "ports": [0]
}
knesenko@knesenko-mbp:~/mesos/jobs$ dcos marathon app add docker.json

Let's try to resolve it:

root@pt-mesos-node2 [ ~/mesos-dns ]# dig _docker-hello._tcp.marathon.mesos SRV
;; Truncated, retrying in TCP mode.
; <<>> DiG 9.10.1-P1 <<>> _docker-hello._tcp.marathon.mesos SRV
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 25958
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 10, AUTHORITY: 0, ADDITIONAL: 10
;; QUESTION SECTION:
;_docker-hello._tcp.marathon.mesos. IN SRV
;; ANSWER SECTION:
_docker-hello._tcp.marathon.mesos. 60 IN SRV 0 0 31998 docker-hello-4bjcf-s2.marathon.slave.mesos.
_docker-hello._tcp.marathon.mesos. 60 IN SRV 0 0 31844 docker-hello-jexm6-s1.marathon.slave.mesos.
_docker-hello._tcp.marathon.mesos. 60 IN SRV 0 0 31111 docker-hello-6ms44-s2.marathon.slave.mesos.
_docker-hello._tcp.marathon.mesos. 60 IN SRV 0 0 31719 docker-hello-muhui-s2.marathon.slave.mesos.
_docker-hello._tcp.marathon.mesos. 60 IN SRV 0 0 31360 docker-hello-jznf4-s1.marathon.slave.mesos.
_docker-hello._tcp.marathon.mesos. 60 IN SRV 0 0 31306 docker-hello-t41ti-s1.marathon.slave.mesos.
_docker-hello._tcp.marathon.mesos. 60 IN SRV 0 0 31124 docker-hello-mq3oz-s1.marathon.slave.mesos.
_docker-hello._tcp.marathon.mesos. 60 IN SRV 0 0 31816 docker-hello-tcep8-s1.marathon.slave.mesos.
_docker-hello._tcp.marathon.mesos. 60 IN SRV 0 0 31604 docker-hello-5uu37-s1.marathon.slave.mesos.
_docker-hello._tcp.marathon.mesos. 60 IN SRV 0 0 31334 docker-hello-jqihw-s1.marathon.slave.mesos.
 
;; ADDITIONAL SECTION:
docker-hello-muhui-s2.marathon.slave.mesos. 60 IN A 192.168.0.4
docker-hello-4bjcf-s2.marathon.slave.mesos. 60 IN A 192.168.0.5
docker-hello-jexm6-s1.marathon.slave.mesos. 60 IN A 192.168.0.4
docker-hello-jqihw-s1.marathon.slave.mesos. 60 IN A 192.168.0.4
docker-hello-mq3oz-s1.marathon.slave.mesos. 60 IN A 192.168.0.6
docker-hello-tcep8-s1.marathon.slave.mesos. 60 IN A 192.168.0.4
docker-hello-6ms44-s2.marathon.slave.mesos. 60 IN A 192.168.0.6
docker-hello-t41ti-s1.marathon.slave.mesos. 60 IN A 192.168.0.5
docker-hello-jznf4-s1.marathon.slave.mesos. 60 IN A 192.168.0.4
docker-hello-5uu37-s1.marathon.slave.mesos. 60 IN A 192.168.0.5
;; Query time: 0 msec
;; SERVER: 10.23.106.147#53(10.23.106.147)
;; WHEN: Sun Dec 27 14:36:32 UTC 2015
;; MSG SIZE  rcvd: 1066
root@pt-mesos-node2 [ ~/mesos-dns ]#
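
Besides dig, you can also query Mesos-DNS over the HTTP API we enabled with httpon/httpport in config.json. A sketch, assuming the config path from above and that the container runs on 192.168.0.5 (`/v1/version` is one of the Mesos-DNS REST endpoints):

```shell
# Read the HTTP port straight from the config file.
port=$(grep -o '"httpport": *[0-9]*' /etc/mesos-dns/config.json | grep -o '[0-9]*')

# Query the Mesos-DNS REST API.
curl -s "http://192.168.0.5:${port}/v1/version"
```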

We can resolve our app. The next step is to configure HAProxy for our cluster.

Install and Configure Marathon for Mesos Cluster on Photon Hosts

In my previous post I described how to install and configure a Mesos cluster on Photon hosts. In this post I am going to explain how to install and configure Marathon for the Mesos cluster, again on Photon OS. All of the following steps should be done on each Mesos master.

First of all, download Marathon:

root@pt-mesos-master1 [ ~ ]# mkdir -p  /opt/mesosphere/marathon/ && cd /opt/mesosphere/marathon/
root@pt-mesos-master1 [ /opt/mesosphere/marathon ]#  curl -O http://downloads.mesosphere.com/marathon/v0.13.0/marathon-0.13.0.tgz
root@pt-mesos-master1 [ /opt/mesosphere/marathon ]# tar -xf marathon-0.13.0.tgz
root@pt-mesos-master1 [ /opt/mesosphere/marathon ]# mv marathon-0.13.0 marathon

Create configuration for Marathon:

root@pt-mesos-master1 [ /opt/mesosphere/marathon ]# ls -l /etc/marathon/conf/
total 8
-rw-r--r-- 1 root root 68 Dec 24 14:33 master
-rw-r--r-- 1 root root 71 Dec 24 14:33 zk
root@pt-mesos-master1 [ /opt/mesosphere/marathon ]# cat /etc/marathon/conf/*
zk://192.168.0.1:2181,192.168.0.2:2181,192.168.0.3:2181/mesos
zk://192.168.0.1:2181,192.168.0.2:2181,192.168.0.3:2181/marathon
root@pt-mesos-master1 [ /opt/mesosphere/marathon ]# cat /etc/systemd/system/marathon.service
[Unit]
Description=Marathon
After=network.target
Wants=network.target
 
[Service]
Environment="JAVA_HOME=/opt/OpenJDK-1.8.0.51-bin"
ExecStart=/opt/mesosphere/marathon/bin/start \
    --master zk://192.168.0.1:2181,192.168.0.2:2181,192.168.0.3:2181/mesos \
    --zk zk://192.168.0.1:2181,192.168.0.2:2181,192.168.0.3:2181/marathon
Restart=always
RestartSec=20
 
[Install]
WantedBy=multi-user.target
root@pt-mesos-master1 [ /opt/mesosphere/marathon ]#
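
The two files under /etc/marathon/conf shown above can be created like this (a sketch, using the same ZooKeeper ensemble as before):

```shell
# "master" tells Marathon where Mesos registers in ZooKeeper;
# "zk" is where Marathon stores its own state.
mkdir -p /etc/marathon/conf
echo "zk://192.168.0.1:2181,192.168.0.2:2181,192.168.0.3:2181/mesos"    > /etc/marathon/conf/master
echo "zk://192.168.0.1:2181,192.168.0.2:2181,192.168.0.3:2181/marathon" > /etc/marathon/conf/zk
```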


The last thing we need to do is change the Marathon startup script, since Photon uses a non-standard JRE location. Make sure the script uses JAVA_HOME to locate the java binary:
root@pt-mesos-master1 [ /opt/mesosphere/marathon ]# tail -n3 /opt/mesosphere/marathon/bin/start
# Start Marathon
marathon_jar=$(find "$FRAMEWORK_HOME"/target -name 'marathon-assembly-*.jar' | sort | tail -1)
exec "${JAVA_HOME}/bin/java" "${java_args[@]}" -jar "$marathon_jar" "${app_args[@]}"
root@pt-mesos-master1 [ /opt/mesosphere/marathon ]#

Now we can start Marathon service:

root@pt-mesos-master1 [ /opt/mesosphere/marathon ]# systemctl start marathon
root@pt-mesos-master1 [ /opt/mesosphere/marathon ]# ps -ef | grep marathon
root     15821     1 99 17:14 ?        00:00:08 /opt/OpenJDK-1.8.0.51-bin/bin/java -jar /opt/mesosphere/marathon/bin/../target/scala-2.11/marathon-assembly-0.13.0.jar --master zk://192.168.0.1:2181,192.168.0.2:2181,192.168.0.3:2181/mesos --zk 
zk://192.168.0.1:2181,192.168.0.2:2181,192.168.0.3:2181/marathon
root     15854 14692  0 17:14 pts/0    00:00:00 grep --color=auto marathon
root@pt-mesos-master1 [ /opt/mesosphere/marathon ]#

How To Configure Production-Ready Mesos Cluster on Photon Hosts

In this post I will explain how to install a production-ready Mesos cluster with Zookeeper. If you are not familiar with some of these technologies, you can read more about Mesos, Marathon and Zookeeper.


Overview

For this setup I will use 3 Mesos masters and 3 slaves. On each Mesos master I will also run Zookeeper, which means we will have 3 Zookeepers as well. The Mesos cluster will be configured with a quorum of 2. For service discovery, Mesos uses Mesos-DNS. I tried to run Mesos-DNS as a container, but ran into some resolving issues, so in my next post I will explain how to configure Mesos-DNS and run it through Marathon. Photon hosts will be used for both masters and slaves.

Masters:

pt-mesos-master1.example.com	192.168.0.1
pt-mesos-master2.example.com	192.168.0.2
pt-mesos-master3.example.com	192.168.0.3

Slaves:

pt-mesos-node1.example.com	192.168.0.4
pt-mesos-node2.example.com	192.168.0.5
pt-mesos-node3.example.com	192.168.0.6

Masters Installation and Configuration
First of all, we will install Zookeeper. Since there is currently a bug in Photon related to Zookeeper installation, I will use the tarball. Do the following on each master:

root@pt-mesos-master1 [ ~ ]# mkdir -p /opt/mesosphere && cd /opt/mesosphere && wget http://apache.mivzakim.net/zookeeper/stable/zookeeper-3.4.7.tar.gz
root@pt-mesos-master1 [ /opt/mesosphere ]# tar -xf zookeeper-3.4.7.tar.gz && mv zookeeper-3.4.7 zookeeper

Example of the Zookeeper configuration file:

root@pt-mesos-master1 [ ~ ]# cat /opt/mesosphere/zookeeper/conf/zoo.cfg | grep -v '#'
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper
clientPort=2181
server.1=192.168.0.1:2888:3888
server.2=192.168.0.2:2888:3888
server.3=192.168.0.3:2888:3888

Example of the Zookeeper systemd unit file:

root@pt-mesos-master1 [ ~ ]# cat /etc/systemd/system/zookeeper.service 
[Unit]
Description=Apache ZooKeeper
After=network.target

[Service]
Environment="JAVA_HOME=/opt/OpenJDK-1.8.0.51-bin"
WorkingDirectory=/opt/mesosphere/zookeeper
ExecStart=/bin/bash -c "/opt/mesosphere/zookeeper/bin/zkServer.sh start-foreground"
Restart=on-failure
RestartSec=20
User=root
Group=root

[Install]
WantedBy=multi-user.target

Assign a server id to each machine by creating a file named myid, one per server, in that server's data directory (the dataDir parameter from the configuration file). The myid file consists of a single line containing only that machine's id, so the myid of server 1 contains the text "1" and nothing else. The id must be unique within the ensemble and should have a value between 1 and 255.

echo 1 > /var/lib/zookeeper/myid
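
If your masters follow the pt-mesos-masterN naming used in this post, the id can be derived from the hostname instead of being typed on each box. A sketch (adjust if your naming differs):

```shell
# Derive the id from the trailing digits of the hostname
# (pt-mesos-master2 -> 2) and write it to the ZooKeeper data dir.
id=$(hostname | grep -o '[0-9]*$')
mkdir -p /var/lib/zookeeper
echo "$id" > /var/lib/zookeeper/myid
```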

Now let's install the Mesos masters. Do the following on each master:

root@pt-mesos-master1 [ ~ ]# yum -y install mesos
Setting up Install Process
Package mesos-0.23.0-2.ph1tp2.x86_64 already installed and latest version
Nothing to do
root@pt-mesos-master1 [ ~ ]#

Example of master service systemd configuration file that should be located on each master:

root@pt-mesos-master1 [ ~ ]# cat /etc/systemd/system/mesos-master.service 
[Unit]
Description=Mesos Master
After=network.target
Wants=network.target

[Service]
ExecStart=/bin/bash -c "/usr/sbin/mesos-master \
	--ip=192.168.0.1 \
	--work_dir=/var/lib/mesos \
	--log_dir=/var/log/mesos \
	--cluster=EXAMPLE \
	--zk=zk://192.168.0.1:2181,192.168.0.2:2181,192.168.0.3:2181/mesos \
	--quorum=2"
KillMode=process
Restart=always
RestartSec=20
LimitNOFILE=16384
CPUAccounting=true
MemoryAccounting=true

[Install]
WantedBy=multi-user.target

Make sure you replace the --ip setting on each master. Also add the server id to the Zookeeper data directory, so Zookeeper knows the id of your master server. This should be done on each master, each with its own id.

root@pt-mesos-master1 [ ~ ]# echo 1 > /var/lib/zookeeper/myid 
root@pt-mesos-master1 [ ~ ]# cat /var/lib/zookeeper/myid 
1

So far we have 3 masters with the Zookeeper and Mesos packages installed. Let's start the zookeeper and mesos-master services on each master:

root@pt-mesos-master1 [ ~ ]# systemctl start zookeeper
root@pt-mesos-master1 [ ~ ]# systemctl start mesos-master
root@pt-mesos-master1 [ ~ ]# ps -ef | grep mesos
root     11543     1  7 12:09 ?        00:00:01 /opt/OpenJDK-1.8.0.51-bin/bin/java -Dzookeeper.log.dir=. -Dzookeeper.root.logger=INFO,CONSOLE -cp /opt/mesosphere/zookeeper/bin/../build/classes:/opt/mesosphere/zookeeper/bin/../build/lib/*.jar:/opt/mesosphere/zookeeper/bin/../lib/slf4j-log4j12-1.6.1.jar:/opt/mesosphere/zookeeper/bin/../lib/slf4j-api-1.6.1.jar:/opt/mesosphere/zookeeper/bin/../lib/netty-3.7.0.Final.jar:/opt/mesosphere/zookeeper/bin/../lib/log4j-1.2.16.jar:/opt/mesosphere/zookeeper/bin/../lib/jline-0.9.94.jar:/opt/mesosphere/zookeeper/bin/../zookeeper-3.4.7.jar:/opt/mesosphere/zookeeper/bin/../src/java/lib/*.jar:/opt/mesosphere/zookeeper/bin/../conf: -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.local.only=false org.apache.zookeeper.server.quorum.QuorumPeerMain /opt/mesosphere/zookeeper/bin/../conf/zoo.cfg
root     11581     1  0 12:09 ?        00:00:00 /usr/sbin/mesos-master --ip=192.168.0.1 --work_dir=/var/lib/mesos --log_dir=/var/lob/mesos --cluster=EXAMPLE --zk=zk://10.23.106.149:2181,10.23.106.145:2181,10.23.106.159:2181/mesos --quorum=2
root     11601  9117  0 12:09 pts/0    00:00:00 grep --color=auto mesos
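
To confirm that ZooKeeper formed a quorum and that a Mesos master was elected, ask any master. A sketch; `/master/state.json` is the master's HTTP endpoint in this Mesos version, and the grep/cut parsing is a quick stand-in for a real JSON parser:

```shell
# Leader/follower status of the local ZooKeeper instance.
/opt/mesosphere/zookeeper/bin/zkServer.sh status

# Extract the elected leader from the master's state endpoint.
leader=$(curl -s http://192.168.0.1:5050/master/state.json \
         | grep -o '"leader":"[^"]*"' | cut -d'"' -f4)
echo "Elected leader: $leader"
```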

Slaves Installation and Configuration
The steps for configuring a Mesos slave are simple and differ little from the master installation. The differences are that we won't install Zookeeper on the slaves, and we will start Mesos in slave mode and tell the daemon to join the Mesos masters. Do the following on each slave:

root@pt-mesos-node1 [ ~ ]# cat /etc/systemd/system/mesos-slave.service 
[Unit]
Description=Photon instance running as a Mesos slave
After=network-online.target docker.service
 
[Service]
Restart=on-failure
RestartSec=10
TimeoutStartSec=0
ExecStartPre=/usr/bin/rm -f /tmp/mesos/meta/slaves/latest
ExecStart=/bin/bash -c "/usr/sbin/mesos-slave \
	--master=zk://192.168.0.1:2181,192.168.0.2:2181,192.168.0.3:2181/mesos \
        --hostname=$(/usr/bin/hostname) \
        --log_dir=/var/log/mesos_slave \
        --containerizers=docker,mesos \
        --docker=$(which docker) \
        --executor_registration_timeout=5mins \
        --ip=$(ifconfig ens160 | grep inet | awk -F'addr:' '{print $2}' | awk '{print $1}')"
 
[Install]
WantedBy=multi-user.target

Please make sure to replace the NIC name in the --ip setting. Then start the mesos-slave service on each node.

root@pt-mesos-node1 [ ~ ]# systemctl start mesos-slave
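
Once the service is up, you can verify that every slave registered with the elected master. A sketch using the master's `/master/slaves` endpoint (grep/cut again standing in for a JSON parser):

```shell
# List the hostnames of all registered slaves.
curl -s http://192.168.0.1:5050/master/slaves \
  | grep -o '"hostname":"[^"]*"' | cut -d'"' -f4
```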

Now you should have a working Mesos cluster with 3 masters, 3 Zookeepers and 3 slaves.

In my next post I will explain how to configure Mesos-DNS and Marathon.

Installing DCOS CLI for Mesos

To install the DCOS CLI:

Install virtualenv. The Python tool virtualenv is used to manage the DCOS CLI’s environment.

$ sudo pip install virtualenv

Tip: On some older Python versions, ignore any ‘Insecure Platform’ warnings. For more information, see https://virtualenv.pypa.io/en/latest/installation.html.

From the command line, create a new directory named dcos and navigate into it.

$ mkdir dcos
$ cd dcos
$ curl -O https://downloads.mesosphere.io/dcos-cli/install.sh

Run the DCOS CLI install script, where <install_dir> is the installation directory and <mesos-master-host> is the hostname of your master node prefixed with http://:

$ bash install.sh <install_dir> <mesos-master-host>

For example, if the hostname of your mesos master node is mesos-master.example.com:

$ bash install.sh . http://mesos-master.example.com

Follow the on-screen DCOS CLI instructions and enter the Mesosphere verification code. You can ignore any Python ‘Insecure Platform’ warnings.

Confirm whether you want to add DCOS to your system PATH:

Modify your bash profile to add DCOS to your PATH? [yes/no]

Since the DCOS CLI normally targets a DCOS cluster, we need to point it at our Mesos master and Marathon URLs with the following commands:

dcos config set core.mesos_master_url http://<mesos-master-host>:5050
dcos config set marathon.url http://<marathon-host>:8080