Beyond efficient, low-cost cluster and data center management, DC/OS has clear strengths in big data analytics and stateful services.

Stateful services are provided mainly through dcos-commons.

For example: http://192.168.0.250/metadata

The dcos-commons project describes itself as "Simplifying stateful services for Kafka, Cassandra, HDFS, Spark, and TensorFlow with DC/OS." See the dcos-commons documentation.

New releases of DC/OS HDFS are also developed in the dcos-commons repository.

[Figure: dcos-hdfs-01]

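Because HDFS is a persistent storage solution officially supported by DC/OS, we adopted it to provide persistent state for containers and stateful services.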

Carrying out this deployment:

[Figure: dcos-hdfs-02]

DC/OS HDFS provides the following features:

  • Single-command installation for rapid provisioning
  • Persistent storage volumes for enhanced data durability
  • Runtime configuration and software updates for high availability
  • Health checks and metrics for monitoring
  • Distributed storage scale out
  • HA name service with Quorum Journaling and ZooKeeper failure detection

HDFS node configuration:

    "journal_node": {
        "cpus": 0.5,
        "mem": 4096,
        "disk": 10240,
        "disk_type": "ROOT",
        "strategy": "parallel"
    },
    "name_node": {
        "cpus": 0.5,
        "mem": 4096,
        "disk": 10240,
        "disk_type": "ROOT"
    },
    "zkfc_node": {
        "cpus": 0.5,
        "mem": 4096
    },
    "data_node": {
        "count": 3,
        "cpus": 0.5,
        "mem": 4096,
        "disk": 10240,
        "disk_type": "ROOT",
        "strategy": "parallel"
    }
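To get a feel for how much capacity these settings ask of the cluster, the per-node values can be multiplied by the number of nodes of each type. The following is a rough sketch, not part of the package. Only the data node count (3) comes from the configuration above; the counts for journal nodes (3), name nodes (2), and ZKFC nodes (2) are assumptions reflecting a typical HA layout, `total_resources` is a hypothetical helper, and `mem` and `disk` are presumed to be in MB.

```python
import json

# Node settings copied from the configuration above, wrapped in braces
# so that they form a valid JSON document.
CONFIG = json.loads("""
{
    "journal_node": {"cpus": 0.5, "mem": 4096, "disk": 10240},
    "name_node":    {"cpus": 0.5, "mem": 4096, "disk": 10240},
    "zkfc_node":    {"cpus": 0.5, "mem": 4096},
    "data_node":    {"count": 3, "cpus": 0.5, "mem": 4096, "disk": 10240}
}
""")

# ASSUMPTION: node counts for the types whose count is not in the configuration.
ASSUMED_COUNTS = {"journal_node": 3, "name_node": 2, "zkfc_node": 2}


def total_resources(config, assumed_counts):
    """Sum cpus, mem, and disk over all nodes of every type."""
    totals = {"cpus": 0.0, "mem": 0, "disk": 0}
    for name, spec in config.items():
        # Prefer an explicit "count"; otherwise fall back to the assumed value.
        count = spec.get("count", assumed_counts.get(name, 1))
        for key in totals:
            totals[key] += count * spec.get(key, 0)
    return totals


if __name__ == "__main__":
    print(total_resources(CONFIG, ASSUMED_COUNTS))
```

Under these assumed counts the service would request 5.0 CPUs, 40960 of memory, and 81920 of disk in total; different counts change the result proportionally.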

HDFS port settings, in three groups: name node, journal node, and data node.

{
    "hdfs": {
        "name_node_rpc_port": 9001,
        "name_node_http_port": 9002,
        "journal_node_rpc_port": 8485,
        "journal_node_http_port": 8480,
        "data_node_rpc_port": 9005,
        "data_node_http_port": 9006,
        "data_node_ipc_port": 9007,
        "permissions_enabled": false,
        "name_node_heartbeat_recheck_interval": 60000,
        "compress_image": true,
        "image_compression_codec": "org.apache.hadoop.io.compress.SnappyCodec"
    }
}
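Since several node types may end up on the same agent, it can help to confirm before installing that no two settings share a port. The sketch below is one possible check under that assumption; `check_ports` is a hypothetical helper, not something the package provides.

```python
import json

# Port settings copied from the "hdfs" section above.
OPTIONS = json.loads("""
{
    "hdfs": {
        "name_node_rpc_port": 9001,
        "name_node_http_port": 9002,
        "journal_node_rpc_port": 8485,
        "journal_node_http_port": 8480,
        "data_node_rpc_port": 9005,
        "data_node_http_port": 9006,
        "data_node_ipc_port": 9007
    }
}
""")


def check_ports(hdfs_options):
    """Return a list of problems found among the *_port settings."""
    problems = []
    seen = {}  # port number -> first setting that uses it
    for key, value in hdfs_options.items():
        if not key.endswith("_port"):
            continue  # ignore non-port settings such as compress_image
        if not 1 <= value <= 65535:
            problems.append(f"{key}: {value} is outside 1-65535")
        if value in seen:
            problems.append(f"{key} and {seen[value]} both use port {value}")
        seen.setdefault(value, key)
    return problems


if __name__ == "__main__":
    issues = check_ports(OPTIONS["hdfs"])
    print("\n".join(issues) if issues else "no port conflicts found")
```

Once the options look sane and are saved to a file such as `options.json` (the file name is illustrative), they can be passed to the installer with `dcos package install hdfs --options=options.json`.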

An error that may appear when deploying HDFS, mainly an SSL configuration problem:

2018/04/23 14:21:26 No $MESOS_SANDBOX/.ssl directory found. Cannot install certificate. Error: stat /var/lib/mesos/slave/slaves/e1d2e6c5-6a6e-455d-96cc-f2b17213c33f-S2/frameworks/e1d2e6c5-6a6e-455d-96cc-f2b17213c33f-0000/executors/hdfs.6e37052e-46be-11e8-83f7-16e3007733d8/runs/ac880e14-2b5e-4dda-b6ab-f1b784264f8b/.ssl: no such file or directory
2018/04/23 14:21:26 SDK Bootstrap successful.
Exception in thread "main" java.lang.NullPointerException
	at java.util.Base64$Decoder.decode(Base64.java:549)
	at com.mesosphere.sdk.hdfs.scheduler.Main.getHDFSUserAuthMappings(Main.java:122)
	at com.mesosphere.sdk.hdfs.scheduler.Main.createSchedulerBuilder(Main.java:61)
	at com.mesosphere.sdk.hdfs.scheduler.Main.main(Main.java:49)
I0423 14:21:30.079073    13 executor.cpp:933] Command exited with status 1 (pid: 15)
I0423 14:21:31.081287    10 checker_process.cpp:244] Stopped HTTP health check for task 'hdfs.6e37052e-46be-11e8-83f7-16e3007733d8'
I0423 14:21:31.082377    14 process.cpp:1068] Failed to accept socket: future discarded

For reference, see "SSL in Mesos".
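When reading a log like the one above, it can be useful to pull out the exception type and the first application frame, which here is `getHDFSUserAuthMappings` passing a null value to Base64 decoding. This suggests that the service's user auth mapping setting may also be worth checking alongside the SSL setup. The following is a minimal sketch of such a log summary; `summarize` and the `com.mesosphere.` prefix filter are illustrative choices, not part of any DC/OS tooling.

```python
import re

# Excerpt of the scheduler log shown above.
LOG = """\
2018/04/23 14:21:26 SDK Bootstrap successful.
Exception in thread "main" java.lang.NullPointerException
\tat java.util.Base64$Decoder.decode(Base64.java:549)
\tat com.mesosphere.sdk.hdfs.scheduler.Main.getHDFSUserAuthMappings(Main.java:122)
\tat com.mesosphere.sdk.hdfs.scheduler.Main.createSchedulerBuilder(Main.java:61)
\tat com.mesosphere.sdk.hdfs.scheduler.Main.main(Main.java:49)
"""

# Matches the header line of an uncaught Java exception.
EXCEPTION_RE = re.compile(r'^Exception in thread "[^"]*" (\S+)', re.MULTILINE)
# Matches one stack frame: qualified method, source file, line number.
FRAME_RE = re.compile(r"^\s+at (\S+)\((\S+?):(\d+)\)", re.MULTILINE)


def summarize(log, app_prefix="com.mesosphere."):
    """Return the exception type and first frame under app_prefix, or None."""
    exc = EXCEPTION_RE.search(log)
    if exc is None:
        return None
    # Only look at frames that follow the exception header.
    frames = FRAME_RE.findall(log, exc.end())
    app = next((f for f in frames if f[0].startswith(app_prefix)), None)
    return {"exception": exc.group(1), "first_app_frame": app}


if __name__ == "__main__":
    print(summarize(LOG))
```

The function returns `None` for logs without an uncaught exception, so it can be run over every sandbox log and only flag the ones that need attention.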



