Add a parameter for returning JSON results by jweden33 · Pull Request #153 · RapidAI/TableStructureRec

jweden33 · 2025-06-03T13:11:02Z

Results can be returned in JSON format.

Copilot

Pull request overview

This PR refactors WiredTableRecognition in wired_table_rec/main.py to return a JSON-friendly tuple result instead of the previous WiredTableOutput dataclass. It also replaces the TSRCycleCenterNet/TSRUnet model backends with TableLineRecognition/TableLineRecognitionPlus, adds a new process_ocr_result method for trimming empty first rows/columns, and expands the fill_blank_rec (renamed to re_rec) method to optionally perform actual OCR re-recognition on blank cells.

Changes:

WiredTableInput and WiredTableOutput dataclasses and ModelType enum are removed; WiredTableRecognition.__init__ now accepts table_model_path and version directly
__call__ now returns a 6-element tuple (table_str, elapse, sorted_polygons, sorted_logi_points, sorted_ocr_boxes_res, adjust_dict) instead of a WiredTableOutput dataclass
New process_ocr_result method and (unused) adjust_table_cells method added; re_rec extended with an optional re-recognition path

Copilot · 2026-03-06T08:27:25Z

wired_table_rec/main.py

-    table_results = table_rec(args.img_path, ocr_result)
-    print(table_results.pred_html)
-    print(f"cost: {table_results.elapse:.5f}")
+    table_str, elapse = table_rec(args.img_path, ocr_result)


In main(), line 343 unpacks the return value of table_rec(args.img_path, ocr_result) into only 2 variables (table_str, elapse), but __call__ returns a 6-element tuple on success: (table_str, table_elapse, sorted_polygons, sorted_logi_points, sorted_ocr_boxes_res, adjust_dict). This will raise a ValueError: too many values to unpack at runtime whenever the call succeeds.

Suggested change

-    table_str, elapse = table_rec(args.img_path, ocr_result)
+    table_str, elapse, _, _, _, _ = table_rec(args.img_path, ocr_result)
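The unpacking failure described in this comment can be reproduced in isolation. The sketch below uses a hypothetical stand-in, `fake_table_rec`, that only mimics the 6-element success tuple; it is not the PR's code. It also shows star-unpacking as one possible alternative that tolerates extra trailing elements.

```python
def fake_table_rec():
    # Hypothetical stand-in for the success path of __call__:
    # returns six elements, like the tuple described in the review.
    return "<table></table>", 0.12, [], [], [], []


# Unpacking six values into two names fails at runtime.
try:
    table_str, elapse = fake_table_rec()
except ValueError as exc:
    error_message = str(exc)

# Star-unpacking keeps the first two values and discards the rest,
# however many trailing elements the tuple has.
table_str, elapse, *_ = fake_table_rec()
print(error_message)
print(table_str, elapse)
```

Star-unpacking hides the length mismatch rather than fixing it, so it is best seen as a stopgap until every return path has the same shape.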

Copilot · 2026-03-06T08:27:26Z

wired_table_rec/main.py

+            pad_img = cv2.copyMakeBorder(
+                crop_img, 5, 5, 100, 100, cv2.BORDER_CONSTANT, value=(255, 255, 255)
+            )
+            rec_res, _ = self.ocr(pad_img, use_det=False, use_cls=True, use_rec=True)


In re_rec, when rec_again=True, the code calls self.ocr(...) on line 291. However, self.ocr can be None if rapidocr_onnxruntime is not installed (set to None in __init__). While the guard at the top of __call__ raises an error if self.ocr is None and ocr_result is None, it does NOT prevent re_rec with rec_again=True from being called when ocr_result is supplied externally but self.ocr is None. This will result in a TypeError: 'NoneType' object is not callable at line 291.
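One way to address this is an explicit guard inside the re-recognition path. The following is a minimal sketch under assumptions: `ReRecSketch` is a hypothetical class, its `re_rec` signature is simplified, and the error message is illustrative. Only the `self.ocr(...)` keyword arguments are taken from the diff.

```python
from typing import Callable, Optional


class ReRecSketch:
    """Hypothetical, simplified stand-in for the class under review."""

    def __init__(self, ocr: Optional[Callable] = None):
        # Mirrors the review's description: ocr may be None when the
        # optional OCR dependency is not installed.
        self.ocr = ocr

    def re_rec(self, cell_img, rec_again: bool = True):
        if not rec_again:
            return []
        if self.ocr is None:
            # Fail with a clear message instead of
            # "TypeError: 'NoneType' object is not callable".
            raise RuntimeError(
                "rec_again=True requires an OCR engine, but none is available"
            )
        rec_res, _ = self.ocr(cell_img, use_det=False, use_cls=True, use_rec=True)
        return rec_res
```

Whether to raise or to fall back silently to `rec_again=False` is a design choice for the maintainers; the sketch only shows where the check would sit.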

Copilot · 2026-03-06T08:27:26Z

wired_table_rec/main.py

-            t_rec_ocr_list = self.transform_res(cell_box_det_map, polygons, logi_points)
+            t_rec_ocr_list_dict = self.transform_res(cell_box_det_map, polygons, logi_points)
+            # Adjust when the first row or first column is empty
+            #adjust_dict = self.adjust_table_cells(t_rec_ocr_list_dict)


The commented-out line 104 (#adjust_dict = self.adjust_table_cells(t_rec_ocr_list_dict)) should be cleaned up. Leaving commented-out code in production code is a maintainability concern, especially when the active alternative (process_ocr_result) is right below it.

Suggested change

-            #adjust_dict = self.adjust_table_cells(t_rec_ocr_list_dict)

Copilot · 2026-03-06T08:27:26Z

wired_table_rec/main.py

+            t_rec_ocr_list_dict = self.transform_res(cell_box_det_map, polygons, logi_points)
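+            # Adjust when the first row or first column is empty
+            #adjust_dict = self.adjust_table_cells(t_rec_ocr_list_dict)
+            adjust_dict = self.process_ocr_result(t_rec_ocr_list_dict)
             # Sort the OCR results in each cell and merge those on the same line, so the output HTML fully preserves the text's line breaks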
-            t_rec_ocr_list = self.sort_and_gather_ocr_res(t_rec_ocr_list)
+            t_rec_ocr_list = self.sort_and_gather_ocr_res(t_rec_ocr_list_dict)


The process_ocr_result method mutates the t_logic_box lists of the entries in the list passed to it (lines 177–178 and 190–192: entry['t_logic_box'][0] -= 1 etc.). Its list comprehensions build new lists but reuse the same entry dicts, so unless the input is copied first, these in-place decrements also change the entries of t_rec_ocr_list_dict. That same list is then passed to sort_and_gather_ocr_res on line 107. The return value adjust_dict is stored separately, yet sort_and_gather_ocr_res reads logic boxes that have already been shifted, while the entries of the dropped row or column are still present in t_rec_ocr_list_dict. This unintended mutation could lead to inconsistent or double-adjusted logic boxes.
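The aliasing behind this concern can be shown with a small, self-contained sketch. `shift_rows_in_place` is a hypothetical helper that imitates only the filter-then-decrement pattern from the diff; passing a `copy.deepcopy` of the input is one possible way to keep the caller's data unchanged.

```python
import copy


def shift_rows_in_place(entries):
    # The comprehension builds a new list, but its items are the same dicts.
    kept = [e for e in entries if e["t_logic_box"][0] != 0]
    for e in kept:
        e["t_logic_box"][0] -= 1
        e["t_logic_box"][1] -= 1
    return kept


original = [
    {"t_logic_box": [0, 0, 0, 0]},
    {"t_logic_box": [1, 1, 0, 0]},
]
shifted = shift_rows_in_place(original)
# The caller's second entry was changed too, because it is the same object.
print(original[1]["t_logic_box"], shifted[0] is original[1])

# Passing a deep copy leaves the caller's list untouched.
safe_input = [
    {"t_logic_box": [0, 0, 0, 0]},
    {"t_logic_box": [1, 1, 0, 0]},
]
safe_shifted = shift_rows_in_place(copy.deepcopy(safe_input))
print(safe_input[1]["t_logic_box"], safe_shifted[0]["t_logic_box"])
```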

Copilot · 2026-03-06T08:27:27Z

wired_table_rec/main.py

+            ocr_result = [entry for entry in ocr_result if entry['t_logic_box'][0] != 0 or entry['t_logic_box'][1] != 0]
+            # Adjust the row indices of the remaining dicts
+            for entry in ocr_result:
+                entry['t_logic_box'][0] -= 1
+                entry['t_logic_box'][1] -= 1
+
+        # Delete the first-column dicts and adjust the column indices of the rest
+        first_col_empty = [entry for entry in ocr_result if
+                           entry['t_logic_box'][2] == 0 and entry['t_logic_box'][3] == 0 and entry['t_ocr_res'][0][
+                               1] == '']
+
+        if len(first_col_empty) == len(
+                [entry for entry in ocr_result if entry['t_logic_box'][2] == 0 and entry['t_logic_box'][3] == 0]):
+            # If all cells in the first column are empty, delete the first column
+            ocr_result = [entry for entry in ocr_result if entry['t_logic_box'][2] != 0 or entry['t_logic_box'][3] != 0]
+            # Adjust the column indices of the remaining dicts
+            for entry in ocr_result:
+                entry['t_logic_box'][2] -= 1
+                entry['t_logic_box'][3] -= 1


In process_ocr_result, line 174 keeps only the entries where t_logic_box[0] != 0 or t_logic_box[1] != 0, so it removes an entry only when both values are 0. Using or here is incorrect: an entry with t_logic_box[0] == 0 (a cell that starts in the first row) is kept whenever t_logic_box[1] != 0, and the loop that follows then decrements its row start to -1. The correct condition for filtering out first-row entries would be entry['t_logic_box'][0] != 0 (keep entries whose row start is not 0). Similarly, line 188 uses entry['t_logic_box'][2] != 0 or entry['t_logic_box'][3] != 0 instead of entry['t_logic_box'][2] != 0.
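The difference between the two conditions shows up as soon as a cell in the first row spans more than one row. A minimal sketch with made-up cells (the `name` key is added purely for illustration):

```python
cells = [
    {"name": "A", "t_logic_box": [0, 0, 0, 0]},  # plain first-row cell
    {"name": "B", "t_logic_box": [0, 1, 1, 1]},  # merged cell spanning rows 0-1
    {"name": "C", "t_logic_box": [1, 1, 0, 0]},  # second-row cell
]

# Condition from the PR: keep when row_start != 0 OR row_end != 0.
kept_with_or = [
    c["name"]
    for c in cells
    if c["t_logic_box"][0] != 0 or c["t_logic_box"][1] != 0
]

# Condition from the review: keep only when row_start != 0.
kept_with_start = [c["name"] for c in cells if c["t_logic_box"][0] != 0]

# With the `or` filter, B survives and its row start would become -1
# once the "-= 1" adjustment runs.
b_row_start_after_shift = cells[1]["t_logic_box"][0] - 1

print(kept_with_or, kept_with_start, b_row_start_after_shift)
```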

Copilot · 2026-03-06T08:27:27Z

wired_table_rec/main.py

                    sorted_polygons,
                    logi_points[idx_list],
-                    time.perf_counter() - s,
+                    [],


The early returns in __call__ are inconsistent in the number of elements they return compared to the successful path:

Line 74 returns a 5-element tuple: ("", 0.0, None, None, None)

Lines 89-95 (when need_ocr=False) return a 5-element tuple: ("", elapsed, sorted_polygons, logi_points[idx_list], [])

Line 125 returns a 5-element tuple: ("", 0.0, None, None, None)

Lines 126-134 (success path) return a 6-element tuple including adjust_dict

Callers must handle varying-length tuples, making unpacking error-prone. All return paths should consistently return the same number of elements.

Suggested change

-                    [],
+                    [],
+                    None,
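Beyond padding each early return by hand, one option is a result type with a fixed shape, so that every path yields the same number of fields. The sketch below is an assumption, not part of the PR: `TableResult` and `recognize` are hypothetical names, and the field names simply follow the tuple described in this review.

```python
from typing import Any, NamedTuple, Optional


class TableResult(NamedTuple):
    """Hypothetical fixed-shape result; defaults cover the failure paths."""

    table_str: str = ""
    elapse: float = 0.0
    sorted_polygons: Optional[Any] = None
    sorted_logi_points: Optional[Any] = None
    sorted_ocr_boxes_res: Optional[Any] = None
    adjust_dict: Optional[Any] = None


def recognize(found_table: bool) -> TableResult:
    # Early return and success path both produce six fields.
    if not found_table:
        return TableResult()
    return TableResult("<table></table>", 0.12, [[0, 0, 1, 1]], [[0, 0, 0, 0]], [], [])


# A NamedTuple still unpacks like a plain tuple, so callers can keep
# positional unpacking or switch to attribute access.
table_str, elapse, *_ = recognize(True)
empty = recognize(False)
print(len(empty), empty.table_str == "", table_str)
```

Because a `NamedTuple` is still a tuple, it remains straightforward to serialize the fields for the JSON-friendly output this PR aims for.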

Copilot · 2026-03-06T08:27:27Z

wired_table_rec/main.py

        ocr_result: Optional[List[Union[List[List[float]], str, str]]] = None,
        **kwargs,
-    ) -> WiredTableOutput:
+    ) -> Tuple[str, float, Any, Any, Any]:


The __call__ method return type annotation is Tuple[str, float, Any, Any, Any] (5 elements), but the actual successful return statement on lines 126-134 returns a tuple with 6 elements: table_str, table_elapse, sorted_polygons, sorted_logi_points, sorted_ocr_boxes_res, and adjust_dict. The annotation should be Tuple[str, float, Any, Any, Any, Any] (6 elements) to match the actual return.

Suggested change

-    ) -> Tuple[str, float, Any, Any, Any]:
+    ) -> Tuple[str, float, Any, Any, Any, Any]:

Copilot · 2026-03-06T08:27:27Z

wired_table_rec/main.py

+        first_row_empty = [entry for entry in ocr_result if
+                           entry['t_logic_box'][0] == 0 and entry['t_logic_box'][1] == 0 and entry['t_ocr_res'][0][
+                               1] == '']
+
+        if len(first_row_empty) == len(
+                [entry for entry in ocr_result if entry['t_logic_box'][0] == 0 and entry['t_logic_box'][1] == 0]):
+            # If all cells in the first row are empty, delete the first row
+            ocr_result = [entry for entry in ocr_result if entry['t_logic_box'][0] != 0 or entry['t_logic_box'][1] != 0]
+            # Adjust the row indices of the remaining dicts
+            for entry in ocr_result:
+                entry['t_logic_box'][0] -= 1
+                entry['t_logic_box'][1] -= 1
+
+        # Delete the first-column dicts and adjust the column indices of the rest
+        first_col_empty = [entry for entry in ocr_result if
+                           entry['t_logic_box'][2] == 0 and entry['t_logic_box'][3] == 0 and entry['t_ocr_res'][0][
+                               1] == '']
+
+        if len(first_col_empty) == len(
+                [entry for entry in ocr_result if entry['t_logic_box'][2] == 0 and entry['t_logic_box'][3] == 0]):
+            # If all cells in the first column are empty, delete the first column
+            ocr_result = [entry for entry in ocr_result if entry['t_logic_box'][2] != 0 or entry['t_logic_box'][3] != 0]


The process_ocr_result method checks for first-row emptiness using entry['t_logic_box'][0] == 0 and entry['t_logic_box'][1] == 0. However, t_logic_box is [row_start, row_end, col_start, col_end], so t_logic_box[1] is row_end. Cells in the first row that span multiple rows (e.g. a merged cell spanning rows 0–1) have row_start=0 but row_end=1, so they would be excluded from first_row_empty. This means the condition len(first_row_empty) == len([...]) would fail for merged-cell first rows, causing the first row to never be removed. The check should likely use only entry['t_logic_box'][0] == 0 to identify all cells whose row starts at row 0. The same issue applies to the first-column check on line 182 (t_logic_box[2] == 0 and t_logic_box[3] == 0).

Suggested change

-        first_row_empty = [entry for entry in ocr_result if
-                           entry['t_logic_box'][0] == 0 and entry['t_logic_box'][1] == 0 and entry['t_ocr_res'][0][
-                               1] == '']
-
-        if len(first_row_empty) == len(
-                [entry for entry in ocr_result if entry['t_logic_box'][0] == 0 and entry['t_logic_box'][1] == 0]):
-            # If all cells in the first row are empty, delete the first row
-            ocr_result = [entry for entry in ocr_result if entry['t_logic_box'][0] != 0 or entry['t_logic_box'][1] != 0]
-            # Adjust the row indices of the remaining dicts
-            for entry in ocr_result:
-                entry['t_logic_box'][0] -= 1
-                entry['t_logic_box'][1] -= 1
-
-        # Delete the first-column dicts and adjust the column indices of the rest
-        first_col_empty = [entry for entry in ocr_result if
-                           entry['t_logic_box'][2] == 0 and entry['t_logic_box'][3] == 0 and entry['t_ocr_res'][0][
-                               1] == '']
-
-        if len(first_col_empty) == len(
-                [entry for entry in ocr_result if entry['t_logic_box'][2] == 0 and entry['t_logic_box'][3] == 0]):
-            # If all cells in the first column are empty, delete the first column
-            ocr_result = [entry for entry in ocr_result if entry['t_logic_box'][2] != 0 or entry['t_logic_box'][3] != 0]
+        first_row_empty = [
+            entry
+            for entry in ocr_result
+            if entry['t_logic_box'][0] == 0 and entry['t_ocr_res'][0][1] == ''
+        ]
+
+        if len(first_row_empty) == len(
+            [entry for entry in ocr_result if entry['t_logic_box'][0] == 0]
+        ):
+            # If all cells in the first row are empty, delete the first row
+            ocr_result = [
+                entry for entry in ocr_result if entry['t_logic_box'][0] != 0
+            ]
+            # Adjust the row indices of the remaining dicts
+            for entry in ocr_result:
+                entry['t_logic_box'][0] -= 1
+                entry['t_logic_box'][1] -= 1
+
+        # Delete the first-column dicts and adjust the column indices of the rest
+        first_col_empty = [
+            entry
+            for entry in ocr_result
+            if entry['t_logic_box'][2] == 0 and entry['t_ocr_res'][0][1] == ''
+        ]
+
+        if len(first_col_empty) == len(
+            [entry for entry in ocr_result if entry['t_logic_box'][2] == 0]
+        ):
+            # If all cells in the first column are empty, delete the first column
+            ocr_result = [
+                entry for entry in ocr_result if entry['t_logic_box'][2] != 0
+            ]
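To see how the start-index check behaves on a first row that contains a merged cell, the row half of the suggestion can be exercised as a standalone function. This is a sketch under assumptions: `drop_empty_first_row` is a hypothetical helper, it works on a deep copy so the input is not mutated, and the `first_row and` guard (skip when no cell starts at row 0) is an addition that is not in the suggested change.

```python
import copy


def drop_empty_first_row(entries):
    # Work on a copy so the caller's entries keep their logic boxes.
    entries = copy.deepcopy(entries)
    first_row = [e for e in entries if e["t_logic_box"][0] == 0]
    # Added guard (an assumption): do nothing if no cell starts at row 0.
    if first_row and all(e["t_ocr_res"][0][1] == "" for e in first_row):
        entries = [e for e in entries if e["t_logic_box"][0] != 0]
        for e in entries:
            e["t_logic_box"][0] -= 1
            e["t_logic_box"][1] -= 1
    return entries


table = [
    # Empty merged cell spanning rows 0-1, using the [[box, text], ...] layout.
    {"t_logic_box": [0, 1, 0, 0], "t_ocr_res": [[None, ""]]},
    {"t_logic_box": [0, 0, 1, 1], "t_ocr_res": [[None, ""]]},
    {"t_logic_box": [1, 1, 1, 1], "t_ocr_res": [[None, "total"]]},
]
trimmed = drop_empty_first_row(table)
print(trimmed)
```

Note that the empty merged cell is dropped entirely here even though it also covers the second row; whether that is the desired behavior for merged cells is left to the maintainers.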

Copilot · 2026-03-06T08:27:28Z

wired_table_rec/main.py

+    def adjust_table_cells(self, t_rec_ocr_list_dict):
+        """
+        Adjust table cells by removing the cells of the first row and/or first column,
+        and update the row/column start and end positions of the remaining cells.
+
+        Args:
+            t_rec_ocr_list_dict (list): original table cell recognition results, in the format
+                [
+                    {
+                        "t_box": [xmin, ymin, xmax, ymax],
+                        "t_logic_box": [row_start, row_end, col_start, col_end],
+                        "t_ocr_res": [[box, text], ...]
+                    },
+                    ...
+                ]
+
+        Returns:
+            list: adjusted table cell recognition results, in the same format as the input.
+        """
+        # New result list
+        adjusted_result = []
+
+        # Track whether the first-row and first-column cells have been removed
+        remove_first_row = False
+        remove_first_col = False
+
+        # Check for and remove the first row
+        if all(cell and not cell[1] for cell in t_rec_ocr_list_dict[0].get("t_ocr_res", [])):
+            remove_first_row = True
+
+        # Check for and remove the first column
+        if all(row.get("t_ocr_res") and not row["t_ocr_res"][0][1] for row in t_rec_ocr_list_dict):
+            remove_first_col = True
+
+        # Iterate over the original results and adjust them
+        for i, row in enumerate(t_rec_ocr_list_dict):
+            adjusted_row = []
+
+            # If this is the first row and it needs removing, skip it
+            if remove_first_row and i == 0:
+                continue
+
+            for j, cell in enumerate(row.get("t_ocr_res", [])):
+                # If this is the first column and it needs removing, skip it
+                if remove_first_col and j == 0:
+                    continue
+
+                # Update the logical position of the current cell
+                adjusted_cell = {
+                    "t_box": row.get("t_box"),
+                    "t_logic_box": [
+                        row["t_logic_box"][0] - 1 if i > 0 else row["t_logic_box"][0],
+                        row["t_logic_box"][1] - 1 if i > 0 else row["t_logic_box"][1],
+                        row["t_logic_box"][2] - 1 if j > 0 else row["t_logic_box"][2],
+                        row["t_logic_box"][3] - 1 if j > 0 else row["t_logic_box"][3]
+                    ],
+                    "t_ocr_res": cell
+                }
+                adjusted_row.append(adjusted_cell)
+
+            if adjusted_row:
+                adjusted_result.append(adjusted_row)
+
+        return adjusted_result
+


The adjust_table_cells method is defined but never called (the only call to it, on line 104, is commented out in favor of process_ocr_result). This is dead code that adds unnecessary complexity to the file. It should either be removed or its purpose compared with process_ocr_result to decide which one to keep.

Suggested change

Delete the adjust_table_cells method shown in the diff above. The suggestion removes the whole definition and offers no replacement.

Update main.py

9aed350


SWHL requested a review from Copilot March 6, 2026 08:24

Copilot started reviewing on behalf of SWHL March 6, 2026 08:24

Copilot AI reviewed Mar 6, 2026

jweden33 wants to merge 1 commit into RapidAI:main from jweden33:main
