
HA configuration performs incorrectly #67

Open
@wolf31o2

Description


Problem:
I am running HDFS within my Mesos cluster in a fully HA configuration. I have configured a matcher that points to both NameNodes (namenode and standby_namenode). However, when the NameNode listed as namenode is in standby mode, the standby_namenode is never used.

Expected behavior:
The connection to the NameNode configured as namenode succeeds, the plugin finds that it is in standby mode, and it then attempts to send to standby_namenode, which is now the active NameNode.

Actual results:

2018-06-12 19:28:48 +0000 [warn]: #0 [out_webhdfs] webhdfs check request failed. (namenode: name-0-node.hdfs.mesos:9002, error: {"RemoteException":{"exception":"StandbyException","javaClassName":"org.apache.hadoop.ipc.StandbyException","message":"Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error"}})

This is using td-agent 3.1.1 (fluentd 1.0.2) with the shipped fluent-plugin-webhdfs 1.2.2 plugin.

Forcing a NameNode failover caused logs to start flowing again. However, this required manual intervention, and I think the driver should handle this state itself by switching to standby_namenode.
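
To make the expected behavior concrete, here is a minimal Python sketch of the failover decision. It is not the plugin's actual code (the plugin is written in Ruby), and I am not claiming this is how fluent-plugin-webhdfs is structured. The `check` callable and the second host name `name-1-node.hdfs.mesos:9002` are hypothetical, used only for illustration; the error body is the one from the log line above.

```python
import json

# Error body copied from the warning in the log output.
STANDBY_BODY = (
    '{"RemoteException":{"exception":"StandbyException",'
    '"javaClassName":"org.apache.hadoop.ipc.StandbyException",'
    '"message":"Operation category READ is not supported in state standby. '
    'Visit https://s.apache.org/sbnn-error"}}'
)


def is_standby_error(body):
    """Return True if a WebHDFS error body reports a StandbyException."""
    try:
        payload = json.loads(body)
    except ValueError:
        return False
    if not isinstance(payload, dict):
        return False
    remote = payload.get("RemoteException")
    return isinstance(remote, dict) and remote.get("exception") == "StandbyException"


def pick_active(namenodes, check):
    """Return the first NameNode whose check does not report standby.

    `check` is a hypothetical callable: it takes "host:port" and returns
    the error body as a string, or None when the request succeeded.
    """
    for node in namenodes:
        error_body = check(node)
        if error_body is None:
            return node  # request succeeded, so this node is active
        if not is_standby_error(error_body):
            # Any other failure is a real error, not a reason to fail over.
            raise RuntimeError("check failed on %s: %s" % (node, error_body))
    raise RuntimeError("no active NameNode found")


if __name__ == "__main__":
    # Hypothetical responses: first node is standby, second is active.
    responses = {
        "name-0-node.hdfs.mesos:9002": STANDBY_BODY,
        "name-1-node.hdfs.mesos:9002": None,
    }
    nodes = ["name-0-node.hdfs.mesos:9002", "name-1-node.hdfs.mesos:9002"]
    print(pick_active(nodes, responses.get))
```

The point of the sketch is only that a StandbyException from the first NameNode should trigger a retry against the second one, while any other error is still reported as a failure.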
