Distributed deployment: replicas not taking effect #2716

Open
shuai-smart opened this issue Dec 30, 2024 · 4 comments
Labels: inactive, pd (PD module), store (Store module)

Comments


shuai-smart commented Dec 30, 2024

Problem Type

others (please edit later)

Before submit

  • I have confirmed and searched the existing issues and FAQ, and there are no similar or duplicate problems.

Environment

  • Server Version: 1.5.0 (Apache Release Version)
  • Backend: hstore

Your Question

hadoop01 -> 110
hadoop02 -> 115
hadoop03 -> 140

1. server: 110
2. pd: 110, 115, 140
3. store: 110, 115, 140


PD config on 110 (hadoop01)

spring:
  application:
    name: hugegraph-pd

management:
  metrics:
    export:
      prometheus:
        enabled: true
  endpoints:
    web:
      exposure:
        include: "*"

logging:
  config: 'file:./conf/log4j2.xml'
license:
  verify-path: ./conf/verify-license.json
  license-path: ./conf/hugegraph.license
grpc:
  port: 8787
  host: hadoop01

server:
  port: 8620

pd:
  data-path: ./pd_data
  patrol-interval: 1800
  initial-store-count: 3
  initial-store-list: hadoop01:8500,hadoop02:8500,hadoop03:8500

raft:
  address: hadoop01:8610
  peers-list: hadoop01:8610,hadoop02:8610,hadoop03:8610

store:
  max-down-time: 172800
  monitor_data_enabled: true
  monitor_data_interval: 1 minute
  monitor_data_retention: 1 day

partition:
  default-shard-count: 2
  store-max-shard-count: 5

PD config on 115 (hadoop02)

spring:
  application:
    name: hugegraph-pd

management:
  metrics:
    export:
      prometheus:
        enabled: true
  endpoints:
    web:
      exposure:
        include: "*"

logging:
  config: 'file:./conf/log4j2.xml'
license:
  verify-path: ./conf/verify-license.json
  license-path: ./conf/hugegraph.license
grpc:
  port: 8787
  host: hadoop02

server:
  port: 8620

pd:
  data-path: ./pd_data
  patrol-interval: 1800
  initial-store-count: 3
  initial-store-list: hadoop01:8500,hadoop02:8500,hadoop03:8500

raft:
  address: hadoop02:8610
  peers-list: hadoop01:8610,hadoop02:8610,hadoop03:8787

store:
  max-down-time: 172800
  monitor_data_enabled: true
  monitor_data_interval: 1 minute
  monitor_data_retention: 1 day

partition:
  default-shard-count: 2
  store-max-shard-count: 5

PD config on 140 (hadoop03)

spring:
  application:
    name: hugegraph-pd

management:
  metrics:
    export:
      prometheus:
        enabled: true
  endpoints:
    web:
      exposure:
        include: "*"

logging:
  config: 'file:./conf/log4j2.xml'
license:
  verify-path: ./conf/verify-license.json
  license-path: ./conf/hugegraph.license
grpc:
  port: 8787
  host: hadoop03

server:
  port: 8620

pd:
  data-path: ./pd_data
  patrol-interval: 1800
  initial-store-count: 3
  initial-store-list: hadoop01:8500,hadoop02:8500,hadoop03:8500

raft:
  address: hadoop03:8610
  peers-list: hadoop01:8610,hadoop02:8610,hadoop03:8610

store:
  max-down-time: 172800
  monitor_data_enabled: true
  monitor_data_interval: 1 minute
  monitor_data_retention: 1 day

partition:
  default-shard-count: 2
  store-max-shard-count: 5

Store config on 110 (hadoop01)

pdserver:
  address: hadoop01:8787,hadoop02:8787,hadoop03:8787

management:
  metrics:
    export:
      prometheus:
        enabled: true
  endpoints:
    web:
      exposure:
        include: "*"

grpc:
  host: hadoop01
  port: 8500
  netty-server:
    max-inbound-message-size: 1000MB
raft:
  disruptorBufferSize: 1024
  address: hadoop01:8510
  max-log-file-size: 600000000000
  snapshotInterval: 1800
server:
  port: 8520

app:
  data-path: ./storage

spring:
  application:
    name: store-node-grpc-server
  profiles:
    active: default
    include: pd

logging:
  config: 'file:./conf/log4j2.xml'
  level:
    root: info

Store config on 115 (hadoop02)

pdserver:
  address: hadoop01:8787,hadoop02:8787,hadoop03:8787

management:
  metrics:
    export:
      prometheus:
        enabled: true
  endpoints:
    web:
      exposure:
        include: "*"

grpc:
  host: hadoop02
  port: 8500
  netty-server:
    max-inbound-message-size: 1000MB
raft:
  disruptorBufferSize: 1024
  address: hadoop02:8510
  max-log-file-size: 600000000000
  snapshotInterval: 1800
server:
  port: 8520

app:
  data-path: ./storage

spring:
  application:
    name: store-node-grpc-server
  profiles:
    active: default
    include: pd

logging:
  config: 'file:./conf/log4j2.xml'
  level:
    root: info

Store config on 140 (hadoop03)

pdserver:
  address: hadoop01:8787,hadoop02:8787,hadoop03:8787

management:
  metrics:
    export:
      prometheus:
        enabled: true
  endpoints:
    web:
      exposure:
        include: "*"

grpc:
  host: hadoop03
  port: 8500
  netty-server:
    max-inbound-message-size: 1000MB
raft:
  disruptorBufferSize: 1024
  address: hadoop03:8510
  max-log-file-size: 600000000000
  snapshotInterval: 1800
server:
  port: 8520

app:
  data-path: ./storage

spring:
  application:
    name: store-node-grpc-server
  profiles:
    active: default
    include: pd

logging:
  config: 'file:./conf/log4j2.xml'
  level:
    root: info

Server hugegraph.properties config on 110 (hadoop01)

gremlin.graph=org.apache.hugegraph.HugeFactory
hstore.partition_count=3

vertex.cache_type=l2
edge.cache_type=l2


backend=hstore
serializer=binary

store=hugegraph


pd.peers=hadoop01:8787,hadoop02:8787,hadoop03:8787


task.scheduler_type=distributed
task.schedule_period=10
task.retry=0
task.wait_timeout=10


search.text_analyzer=jieba
search.text_analyzer_mode=INDEX

Server rest-server.properties config on 110 (hadoop01)

restserver.url=http://hadoop01:8083
gremlinserver.url=http://hadoop01:8184

graphs=./conf/graphs

batch.max_write_ratio=80
batch.max_write_threads=0

arthas.telnet_port=8562
arthas.http_port=8561
arthas.ip=127.0.0.1
arthas.disabled_commands=jad

rpc.server_host=hadoop01
rpc.server_port=8092

server.id=server-1
server.role=master

log.slow_query_threshold=1000

memory_monitor.threshold=0.85
memory_monitor.period=2000

Vertex/Edge example

No response

Schema [VertexLabel, EdgeLabel, IndexLabel]

/v1/partitions
{
  "message": "OK",
  "data": {
    "partitions": [
      {
        "id": 0,
        "version": 0,
        "graphName": "hugegraph/g",
        "startKey": 0,
        "endKey": 21845,
        "workState": "PState_Normal",
        "shards": [
          {
            "address": "hadoop03:8500",
            "storeId": "6354478704347015657",
            "role": "Leader",
            "state": "SState_Normal",
            "progress": 0,
            "committedIndex": "3",
            "partitionId": 0
          }
        ],
        "timestamp": "2024-12-30 16:18:32"
      },
      {
        "id": 1,
        "version": 0,
        "graphName": "hugegraph/g",
        "startKey": 21845,
        "endKey": 43690,
        "workState": "PState_Normal",
        "shards": [
          {
            "address": "hadoop02:8500",
            "storeId": "8980730234736961059",
            "role": "Leader",
            "state": "SState_Normal",
            "progress": 0,
            "committedIndex": "2",
            "partitionId": 1
          }
        ],
        "timestamp": "2024-12-30 16:18:32"
      },
      {
        "id": 2,
        "version": 0,
        "graphName": "hugegraph/g",
        "startKey": 43690,
        "endKey": 65535,
        "workState": "PState_Normal",
        "shards": [
          {
            "address": "hadoop01:8500",
            "storeId": "5761019429836672228",
            "role": "Leader",
            "state": "SState_Normal",
            "progress": 0,
            "committedIndex": "3",
            "partitionId": 2
          }
        ],
        "timestamp": "2024-12-30 16:18:32"
      }
    ]
  },
  "status": 0
}

----------------------------------------------------------------------------------

/v1/graphs
{
  "message": "OK",
  "data": {
    "graphs": [
      {
        "graphName": "hugegraph",
        "partitionCount": 3,
        "state": "PState_Normal",
        "partitions": [
          {
            "partitionId": 0,
            "graphName": "hugegraph",
            "workState": "PState_Normal",
            "startKey": 0,
            "endKey": 21845,
            "shards": [
              {
                "partitionId": 0,
                "storeId": 6.354478704347015e+18,
                "state": "SState_Normal",
                "role": "Leader",
                "progress": 0
              }
            ],
            "dataSize": 1
          },
          {
            "partitionId": 1,
            "graphName": "hugegraph",
            "workState": "PState_Normal",
            "startKey": 21845,
            "endKey": 43690,
            "shards": [
              {
                "partitionId": 1,
                "storeId": 8.980730234736962e+18,
                "state": "SState_Normal",
                "role": "Leader",
                "progress": 0
              }
            ],
            "dataSize": 1
          },
          {
            "partitionId": 2,
            "graphName": "hugegraph",
            "workState": "PState_Normal",
            "startKey": 43690,
            "endKey": 65535,
            "shards": [
              {
                "partitionId": 2,
                "storeId": 5.761019429836672e+18,
                "state": "SState_Normal",
                "role": "Leader",
                "progress": 0
              }
            ],
            "dataSize": 1
          }
        ],
        "dataSize": 3,
        "nodeCount": 0,
        "edgeCount": 0,
        "keyCount": 55
      }
    ]
  },
  "status": 0
}

----------------------------------------------------------------------------------

/v1/stores
{
  "message": "OK",
  "data": {
    "stores": [
      {
        "storeId": 5.761019429836672e+18,
        "address": "hadoop01:8500",
        "raftAddress": "hadoop01:8510",
        "version": "",
        "state": "Up",
        "deployPath": "/ssd01/build/bigdata/hugegraph/hugegraph/apache-hugegraph-incubating-1.5.0/apache-hugegraph-store-incubating-1.5.0/lib/hg-store-node-1.5.0.jar",
        "dataPath": "./storage",
        "startTimeStamp": 1735546322025,
        "registedTimeStamp": 1735546322025,
        "lastHeartBeat": 1735548184409,
        "capacity": 944990375936,
        "available": 449771778048,
        "partitionCount": 1,
        "graphSize": 1,
        "keyCount": 17,
        "leaderCount": 1,
        "serviceName": "hadoop01:8500-store",
        "serviceVersion": "",
        "serviceCreatedTimeStamp": 1735546321000,
        "partitions": [
          {
            "partitionId": 2,
            "graphName": "hugegraph",
            "role": "Leader",
            "workState": "PState_Normal",
            "dataSize": 4
          }
        ]
      },
      {
        "storeId": 8.980730234736962e+18,
        "address": "hadoop02:8500",
        "raftAddress": "hadoop02:8510",
        "version": "",
        "state": "Up",
        "deployPath": "/home/ws/apache-hugegraph-store-incubating-1.5.0/lib/hg-store-node-1.5.0.jar",
        "dataPath": "./storage",
        "startTimeStamp": 1735546322710,
        "registedTimeStamp": 1735546322710,
        "lastHeartBeat": 1735548184413,
        "capacity": 1963882692608,
        "available": 1024558968832,
        "partitionCount": 1,
        "graphSize": 1,
        "keyCount": 17,
        "leaderCount": 1,
        "serviceName": "hadoop02:8500-store",
        "serviceVersion": "",
        "serviceCreatedTimeStamp": 1735546322000,
        "partitions": [
          {
            "partitionId": 1,
            "graphName": "hugegraph",
            "role": "Leader",
            "workState": "PState_Normal",
            "dataSize": 3
          }
        ]
      },
      {
        "storeId": 6.354478704347015e+18,
        "address": "hadoop03:8500",
        "raftAddress": "hadoop03:8510",
        "version": "",
        "state": "Up",
        "deployPath": "/opt/software/ws/apache-hugegraph-incubating-1.5.0/apache-hugegraph-store-incubating-1.5.0/lib/hg-store-node-1.5.0.jar",
        "dataPath": "./storage",
        "startTimeStamp": 1735546321846,
        "registedTimeStamp": 1735546321846,
        "lastHeartBeat": 1735548184350,
        "capacity": 993795248128,
        "available": 233263923200,
        "partitionCount": 1,
        "graphSize": 1,
        "keyCount": 19,
        "leaderCount": 1,
        "serviceName": "hadoop03:8500-store",
        "serviceVersion": "",
        "serviceCreatedTimeStamp": 1735546116000,
        "partitions": [
          {
            "partitionId": 0,
            "graphName": "hugegraph",
            "role": "Leader",
            "workState": "PState_Normal",
            "dataSize": 3
          }
        ]
      }
    ],
    "stateCountMap": {
      "Up": 3
    },
    "numOfService": 3,
    "numOfNormalService": 3
  },
  "status": 0
}

dosubot bot commented Dec 30, 2024

@shuai-smart Hello! I'm a bot that can help you fix bugs, answer questions, and become a contributor. I'm here to help while you wait for a human maintainer.

The issue with the distributed deployment of replicas not taking effect in your HugeGraph setup might be related to incorrect configuration settings. Here are a few things to check and adjust:

  1. Replica Configuration: Ensure that the replica configuration is correctly set with valid integer values for replica counts and that both the datacenter name and the replica count are specified correctly. For example, a valid configuration would look like dc1:2, dc2:1 [1].

  2. Partition and Shard Settings: Verify that the partition and shard settings in your configuration are correctly set. The default-shard-count and store-max-shard-count should align with your deployment needs. The initial-store-count and initial-store-list should accurately reflect the number of stores and their addresses.

  3. Raft Configuration: Check the raft configuration to ensure that the address and peers-list are correctly set across all nodes. Any discrepancies here can lead to issues in the distributed setup.

  4. PD and Store Configuration: Make sure that the pdserver addresses in the store configurations are consistent and correctly point to all PD nodes. Similarly, ensure that the pd configurations in each node are consistent and correctly list all store nodes.

  5. Logs and Monitoring: Review the logs for any errors or warnings that might indicate configuration issues. Also, ensure that monitoring is enabled and check the metrics to see if there are any anomalies.

By ensuring these configurations are correct, you should be able to resolve the issue with the distributed deployment of replicas not taking effect.
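
To make points 3 and 4 concrete, here is a minimal sketch (not a verified fix) of the sections that have to stay consistent across the cluster, using the hostnames and ports from the configs posted above; only raft.address should differ between PD nodes:

# PD node (identical on every PD node except raft.address)
raft:
  address: hadoop01:8610                                  # this node's own raft endpoint
  peers-list: hadoop01:8610,hadoop02:8610,hadoop03:8610   # same raft peer list (port 8610) on all PD nodes

pd:
  initial-store-count: 3
  initial-store-list: hadoop01:8500,hadoop02:8500,hadoop03:8500   # store gRPC addresses

# Store node: point at the PD gRPC ports (8787), not the PD raft ports
pdserver:
  address: hadoop01:8787,hadoop02:8787,hadoop03:8787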

To continue talking to Dosu, mention @dosu.



JackyYangPassion (Contributor) commented Dec 30, 2024

PD configuration:

# the number of replicas should be odd
default-shard-count = 2  <- this setting is the problem

@shuai-smart
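
A minimal sketch of that suggestion, assuming one replica per store is wanted on this 3-store cluster (hypothetical values, not a verified fix); only default-shard-count changes relative to the configs above:

partition:
  default-shard-count: 3    # shard (replica) count per partition; an odd value, here one replica per store
  store-max-shard-count: 5  # unchanged from the original config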

JackyYangPassion (Contributor) commented Dec 30, 2024

The monitoring dashboard in #2611 can be used as a reference; it helps with analyzing and locating the problem.


Due to the lack of activity, this issue has been marked as stale and will be closed after 20 days; any update will remove the stale label.
