get_text() cannot correctly partition blocks on Chinese documents

### Description of the bug

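A paragraph that forms one contiguous piece of text in the source file is split into separate blocks by the `get_text()` method (each `---------------` line in the output below marks a block boundary). On English documents, text blocks are divided correctly.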

**source file:** 
![image](https://github.com/pymupdf/PyMuPDF/assets/68219213/d1243ee7-908d-4b50-8691-0d9d939993ee)
**get_text() result:**

```
内容提要
 :
本文从实证上研究中国金融发展和经济增长之间的关系。由于金融
发展主要包括金融中介体发展和股票市场发展两部分
 ,
本文依次研究中国金融中介
体发展和经济增长之间的实证关系、中国股票市场发展和经济增长之间的实证关系
以及中国金融中介体发展和股票市场发展之间的实证关系。本文的结论是
 ,
在中国
---------------
金融中介体发展和经济增长之间有显著的、很强的正相关关系
 ,
这意味着我国金融中
介体的发展有可能促进经济增长
 ,
同时也意味着金融中介体的发展不能滞后于经济
增长
 ;
在中国股票市场发展和经济增长之间有不显著的负相关关系
 ,
这意味着我国股
票市场发展对经济增长的作用是极其有限的
 ,
即使有那么一点点
 ,
也是不利的
 ;
在中
国金融中介体发展和股票市场发展之间有显著的正相关关系
 ,
这意味着在现阶段的
---------------
我国
 ,
股票市场的发展并不排斥金融中介体的发展。
```

[source pdf (1.pdf)](https://github.com/pymupdf/PyMuPDF/files/13829865/1.pdf)


Thanks for your help!

### How to reproduce the bug

```python
import fitz  # PyMuPDF


def get_text_pdf(input_pdf):
    pdf = fitz.open(input_pdf)
    for page in pdf:
        blocks = page.get_text("dict", sort=True)["blocks"]
        for block in blocks:
            # Image blocks have no "lines" key, so fall back to an empty list.
            for line in block.get("lines", []):
                for span in line["spans"]:
                    print(span["text"])
            # Separator printed after every block.
            print("---------------")
```
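
To illustrate the grouping I would expect, here is a minimal sketch of a post-processing step that merges vertically adjacent blocks. It is not part of PyMuPDF: the helper name `merge_close_blocks` and the `max_gap` threshold (in points) are assumptions made up for this example, and the sample coordinates are synthetic rather than taken from the attached PDF. The input is a list of `(x0, y0, x1, y1, text)` tuples, which could be built from the first five fields of each item returned by `page.get_text("blocks")`.

```python
def merge_close_blocks(blocks, max_gap=5.0):
    """Merge text blocks that sit directly below one another.

    blocks: iterable of (x0, y0, x1, y1, text) tuples.
    max_gap: hypothetical threshold; largest vertical gap (in points)
             between two blocks that are still treated as one paragraph.
    """
    merged = []
    # Process blocks top to bottom, then left to right.
    for x0, y0, x1, y1, text in sorted(blocks, key=lambda b: (b[1], b[0])):
        if merged:
            px0, py0, px1, py1, ptext = merged[-1]
            overlaps_horizontally = x0 < px1 and px0 < x1
            if overlaps_horizontally and y0 - py1 <= max_gap:
                # Extend the previous block's rectangle and append the text.
                merged[-1] = (
                    min(px0, x0), py0, max(px1, x1), max(py1, y1),
                    ptext + text,
                )
                continue
        merged.append((x0, y0, x1, y1, text))
    return merged


# Synthetic example: three fragments of one paragraph, then a distant block.
sample = [
    (72, 100, 520, 160, "first part "),
    (72, 161, 520, 220, "second part "),
    (72, 221, 520, 240, "third part"),
    (72, 300, 520, 330, "next paragraph"),
]
result = merge_close_blocks(sample)
for block in result:
    print(block)
```

A fixed gap threshold is only a heuristic: a value that is too large would also merge paragraphs that really are separate, so it would likely need tuning per document.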

### PyMuPDF version

1.23.9rc1

### Operating system

Linux

### Python version

3.9
