Add a parameter to return JSON results #153

Open
jweden33 wants to merge 1 commit into RapidAI:main from jweden33:main

jweden33 commented Jun 3, 2025

Can return results in JSON format.

Copilot AI left a comment

Pull request overview

This PR refactors WiredTableRecognition in wired_table_rec/main.py to return a JSON-friendly tuple result instead of the previous WiredTableOutput dataclass. It also replaces the TSRCycleCenterNet/TSRUnet model backends with TableLineRecognition/TableLineRecognitionPlus, adds a new process_ocr_result method for trimming empty first rows/columns, and expands the fill_blank_rec (renamed to re_rec) method to optionally perform actual OCR re-recognition on blank cells.

Changes:

  • WiredTableInput and WiredTableOutput dataclasses and ModelType enum are removed; WiredTableRecognition.__init__ now accepts table_model_path and version directly
  • __call__ now returns a 6-element tuple (table_str, elapse, sorted_polygons, sorted_logi_points, sorted_ocr_boxes_res, adjust_dict) instead of a WiredTableOutput dataclass
  • New process_ocr_result method and (unused) adjust_table_cells method added; re_rec extended with an optional re-recognition path

table_results = table_rec(args.img_path, ocr_result)
print(table_results.pred_html)
print(f"cost: {table_results.elapse:.5f}")
table_str, elapse = table_rec(args.img_path, ocr_result)

Copilot AI Mar 6, 2026

In main(), line 343 unpacks the return value of table_rec(args.img_path, ocr_result) into only 2 variables (table_str, elapse), but __call__ returns a 6-element tuple on success: (table_str, table_elapse, sorted_polygons, sorted_logi_points, sorted_ocr_boxes_res, adjust_dict). This raises ValueError: too many values to unpack (expected 2) at runtime whenever the call succeeds.

Suggested change
- table_str, elapse = table_rec(args.img_path, ocr_result)
+ table_str, elapse, _, _, _, _ = table_rec(args.img_path, ocr_result)
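
Other comments in this review note that the early-return paths yield five elements while the success path yields six, so a fixed six-name unpack can still fail on an early return. A minimal sketch of a more tolerant call site follows. It uses a hypothetical stand-in, fake_table_rec, which is not part of the project and only mimics the two tuple lengths described in this review:

```python
def fake_table_rec(success: bool):
    """Hypothetical stand-in mimicking the two tuple lengths described in this review."""
    if success:
        # Success path: six elements, the last one standing in for adjust_dict.
        return "<table></table>", 0.12, [], [], [], []
    # Early-return path: five elements.
    return "", 0.0, None, None, None


# Starred unpacking binds the first two values and collects whatever remains,
# so the same statement works for the 5-element and the 6-element result.
table_str, elapse, *rest = fake_table_rec(success=True)
empty_str, empty_elapse, *empty_rest = fake_table_rec(success=False)
```

This only makes the caller tolerant. Returning the same number of elements on every path would remove the need for it.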

Copilot uses AI. Check for mistakes.
pad_img = cv2.copyMakeBorder(
crop_img, 5, 5, 100, 100, cv2.BORDER_CONSTANT, value=(255, 255, 255)
)
rec_res, _ = self.ocr(pad_img, use_det=False, use_cls=True, use_rec=True)

Copilot AI Mar 6, 2026

In re_rec, when rec_again=True, the code calls self.ocr(...) on line 291. However, self.ocr can be None if rapidocr_onnxruntime is not installed (set to None in __init__). While the guard at the top of __call__ raises an error if self.ocr is None and ocr_result is None, it does NOT prevent re_rec with rec_again=True from being called when ocr_result is supplied externally but self.ocr is None. This will result in a TypeError: 'NoneType' object is not callable at line 291.
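
One possible shape for such a guard is sketched below. ReRecognizer is a hypothetical, simplified class written for illustration; it is not the project's WiredTableRecognition, and its re_rec signature is an assumption rather than the one in the PR:

```python
from typing import Callable, List, Optional


class ReRecognizer:
    """Hypothetical, simplified stand-in for the class that owns `self.ocr`."""

    def __init__(self, ocr: Optional[Callable] = None):
        # Mirrors the situation in the review: ocr stays None when no engine is installed.
        self.ocr = ocr

    def re_rec(self, cells: List, rec_again: bool = False) -> List:
        # Fail early with a clear message instead of "'NoneType' object is not callable".
        if rec_again and self.ocr is None:
            raise RuntimeError(
                "rec_again=True requires an OCR engine, but none is available"
            )
        if not rec_again:
            return cells
        return [self.ocr(cell) for cell in cells]
```

An alternative design would be to fall back silently to rec_again=False. Raising keeps the misconfiguration visible to the caller.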

t_rec_ocr_list = self.transform_res(cell_box_det_map, polygons, logi_points)
t_rec_ocr_list_dict = self.transform_res(cell_box_det_map, polygons, logi_points)
# Adjust when the first row or first column is empty
#adjust_dict = self.adjust_table_cells(t_rec_ocr_list_dict)

Copilot AI Mar 6, 2026

The commented-out line 104 (#adjust_dict = self.adjust_table_cells(t_rec_ocr_list_dict)) should be cleaned up. Leaving commented-out code in production code is a maintainability concern, especially when the active alternative (process_ocr_result) is right below it.

Suggested change
- #adjust_dict = self.adjust_table_cells(t_rec_ocr_list_dict)

Comment on lines +102 to +107
t_rec_ocr_list_dict = self.transform_res(cell_box_det_map, polygons, logi_points)
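# Adjust when the first row or first column is empty
#adjust_dict = self.adjust_table_cells(t_rec_ocr_list_dict)
adjust_dict = self.process_ocr_result(t_rec_ocr_list_dict)
# Sort the OCR results within each cell and merge those on the same line, so the output HTML fully preserves the text's line breaks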
t_rec_ocr_list = self.sort_and_gather_ocr_res(t_rec_ocr_list)
t_rec_ocr_list = self.sort_and_gather_ocr_res(t_rec_ocr_list_dict)

Copilot AI Mar 6, 2026

The process_ocr_result method mutates the t_logic_box lists of the entries in the ocr_result list passed to it (lines 177–178 and 190–192: entry['t_logic_box'][0] -= 1 etc.). The filtered list it builds still holds the same dict objects as t_rec_ocr_list_dict, which is then passed to sort_and_gather_ocr_res on line 107. Although the return value adjust_dict is stored separately, the in-place decrements therefore also change the entries that sort_and_gather_ocr_res reads, while any entries that were filtered out stay in t_rec_ocr_list_dict with their original indices. This unintended mutation could leave shifted and unshifted logic boxes mixed in the list that sort_and_gather_ocr_res processes.
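
The aliasing effect can be reproduced in isolation. The sketch below uses a hypothetical helper, shift_rows, that edits entries in place in the same style as the reviewed method; it is an illustration under that assumption, not the PR's code:

```python
import copy


def shift_rows(entries):
    """Hypothetical helper that, like the reviewed method, edits entries in place."""
    kept = [e for e in entries if e["t_logic_box"][0] != 0]
    for e in kept:
        e["t_logic_box"][0] -= 1
        e["t_logic_box"][1] -= 1
    return kept


cells = [
    {"t_logic_box": [0, 0, 0, 0]},  # first-row cell
    {"t_logic_box": [1, 1, 0, 0]},  # second-row cell
]

# Passing a deep copy leaves the caller's dicts untouched.
adjusted_from_copy = shift_rows(copy.deepcopy(cells))
snapshot_before = copy.deepcopy(cells)

# Passing the list itself changes the dicts that the caller still holds:
# the second-row cell is shifted to row 0 while the removed first-row cell
# remains in `cells`, so both now claim the same logical position.
adjusted_in_place = shift_rows(cells)
```

Copying at the call site is one option. Having the method build new dicts instead of decrementing in place would achieve the same isolation.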

Comment on lines +174 to +192
    ocr_result = [entry for entry in ocr_result if entry['t_logic_box'][0] != 0 or entry['t_logic_box'][1] != 0]
    # Adjust the row indices of the remaining entries
    for entry in ocr_result:
        entry['t_logic_box'][0] -= 1
        entry['t_logic_box'][1] -= 1

# Delete the first-column entries and adjust the column indices of the rest
first_col_empty = [entry for entry in ocr_result if
                   entry['t_logic_box'][2] == 0 and entry['t_logic_box'][3] == 0 and entry['t_ocr_res'][0][
                       1] == '']

if len(first_col_empty) == len(
        [entry for entry in ocr_result if entry['t_logic_box'][2] == 0 and entry['t_logic_box'][3] == 0]):
    # If every cell in the first column is empty, delete the first column
    ocr_result = [entry for entry in ocr_result if entry['t_logic_box'][2] != 0 or entry['t_logic_box'][3] != 0]
    # Adjust the column indices of the remaining entries
    for entry in ocr_result:
        entry['t_logic_box'][2] -= 1
        entry['t_logic_box'][3] -= 1

Copilot AI Mar 6, 2026

In process_ocr_result, the list comprehension on line 174 keeps every entry where t_logic_box[0] != 0 or t_logic_box[1] != 0, so it removes only the entries where both values are 0. Using or here is incorrect: a cell that starts in the first row (t_logic_box[0] == 0) but spans further rows (t_logic_box[1] != 0) is kept, and the decrement that follows then gives it a row start of -1. To filter out first-row entries, the condition should be entry['t_logic_box'][0] != 0 (keep only entries whose row start is not 0). Similarly, line 188 uses entry['t_logic_box'][2] != 0 or entry['t_logic_box'][3] != 0 instead of entry['t_logic_box'][2] != 0.
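
The difference between the two conditions can be checked on a small example. The three cells below are made-up data (the "name" key is added only for readability and is not part of the PR's entry format):

```python
cells = [
    {"name": "plain", "t_logic_box": [0, 0, 0, 0]},   # first-row cell, one row tall
    {"name": "merged", "t_logic_box": [0, 1, 1, 1]},  # first-row cell spanning rows 0-1
    {"name": "body", "t_logic_box": [1, 1, 0, 0]},    # second-row cell
]

# Condition as written in the PR: keep when row_start != 0 OR row_end != 0.
# The merged first-row cell survives because its row_end is 1.
kept_with_or = [
    c["name"]
    for c in cells
    if c["t_logic_box"][0] != 0 or c["t_logic_box"][1] != 0
]

# Condition on row_start only: keep when the cell does not start in row 0.
kept_start_only = [c["name"] for c in cells if c["t_logic_box"][0] != 0]
```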

sorted_polygons,
logi_points[idx_list],
time.perf_counter() - s,
[],

Copilot AI Mar 6, 2026

The early returns in __call__ are inconsistent in the number of elements they return compared to the successful path:

  • Line 74 returns a 5-element tuple: ("", 0.0, None, None, None)
  • Lines 89-95 (when need_ocr=False) return a 5-element tuple: ("", elapsed, sorted_polygons, logi_points[idx_list], [])
  • Line 125 returns a 5-element tuple: ("", 0.0, None, None, None)
  • Lines 126-134 (success path) return a 6-element tuple including adjust_dict

Callers must handle varying-length tuples, making unpacking error-prone. All return paths should consistently return the same number of elements.

Suggested change
- [],
+ [],
+ None,
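
An alternative to padding each return by hand is a fixed-shape result type. The sketch below is an assumption for illustration: TableResult, early_return and success_return are hypothetical names, and only the field names are taken from the tuple described in this review:

```python
from typing import Any, NamedTuple


class TableResult(NamedTuple):
    """Hypothetical fixed-shape result; field names follow the tuple described above."""

    table_str: str = ""
    elapse: float = 0.0
    sorted_polygons: Any = None
    sorted_logi_points: Any = None
    sorted_ocr_boxes_res: Any = None
    adjust_dict: Any = None


def early_return() -> TableResult:
    # Every field falls back to its default, so the arity is always six.
    return TableResult()


def success_return() -> TableResult:
    # Placeholder values standing in for a real recognition result.
    return TableResult("<table></table>", 0.12, [], [], [], [])
```

A NamedTuple still unpacks like a plain tuple, so existing positional callers keep working, while new callers can read fields by name.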

ocr_result: Optional[List[Union[List[List[float]], str, str]]] = None,
**kwargs,
) -> WiredTableOutput:
) -> Tuple[str, float, Any, Any, Any]:

Copilot AI Mar 6, 2026

The __call__ method return type annotation is Tuple[str, float, Any, Any, Any] (5 elements), but the actual successful return statement on lines 126-134 returns a tuple with 6 elements: table_str, table_elapse, sorted_polygons, sorted_logi_points, sorted_ocr_boxes_res, and adjust_dict. The annotation should be Tuple[str, float, Any, Any, Any, Any] (6 elements) to match the actual return.

Suggested change
- ) -> Tuple[str, float, Any, Any, Any]:
+ ) -> Tuple[str, float, Any, Any, Any, Any]:

Comment on lines +167 to +188
first_row_empty = [entry for entry in ocr_result if
                   entry['t_logic_box'][0] == 0 and entry['t_logic_box'][1] == 0 and entry['t_ocr_res'][0][
                       1] == '']

if len(first_row_empty) == len(
        [entry for entry in ocr_result if entry['t_logic_box'][0] == 0 and entry['t_logic_box'][1] == 0]):
    # If every cell in the first row is empty, delete the first row
    ocr_result = [entry for entry in ocr_result if entry['t_logic_box'][0] != 0 or entry['t_logic_box'][1] != 0]
    # Adjust the row indices of the remaining entries
    for entry in ocr_result:
        entry['t_logic_box'][0] -= 1
        entry['t_logic_box'][1] -= 1

# Delete the first-column entries and adjust the column indices of the rest
first_col_empty = [entry for entry in ocr_result if
                   entry['t_logic_box'][2] == 0 and entry['t_logic_box'][3] == 0 and entry['t_ocr_res'][0][
                       1] == '']

if len(first_col_empty) == len(
        [entry for entry in ocr_result if entry['t_logic_box'][2] == 0 and entry['t_logic_box'][3] == 0]):
    # If every cell in the first column is empty, delete the first column
    ocr_result = [entry for entry in ocr_result if entry['t_logic_box'][2] != 0 or entry['t_logic_box'][3] != 0]

Copilot AI Mar 6, 2026

The process_ocr_result method identifies first-row cells using entry['t_logic_box'][0] == 0 and entry['t_logic_box'][1] == 0. However, t_logic_box is [row_start, row_end, col_start, col_end], so t_logic_box[1] is row_end. Cells in the first row that span multiple rows (e.g. a merged cell spanning rows 0–1) have row_start=0 but row_end=1, so they are excluded from both first_row_empty and the list it is compared against. The emptiness check therefore ignores merged cells: the first row can be treated as empty even when a merged cell in it contains text, and that cell is not removed along with the rest of the row. The check should likely use only entry['t_logic_box'][0] == 0 to identify all cells whose row starts at row 0. The same issue applies to the first-column check on line 182 (t_logic_box[2] == 0 and t_logic_box[3] == 0).

Suggested change (replaces the first-row and first-column checks quoted above with):
first_row_empty = [
    entry
    for entry in ocr_result
    if entry['t_logic_box'][0] == 0 and entry['t_ocr_res'][0][1] == ''
]
if len(first_row_empty) == len(
    [entry for entry in ocr_result if entry['t_logic_box'][0] == 0]
):
    # If every cell in the first row is empty, delete the first row
    ocr_result = [
        entry for entry in ocr_result if entry['t_logic_box'][0] != 0
    ]
    # Adjust the row indices of the remaining entries
    for entry in ocr_result:
        entry['t_logic_box'][0] -= 1
        entry['t_logic_box'][1] -= 1
# Delete the first-column entries and adjust the column indices of the rest
first_col_empty = [
    entry
    for entry in ocr_result
    if entry['t_logic_box'][2] == 0 and entry['t_ocr_res'][0][1] == ''
]
if len(first_col_empty) == len(
    [entry for entry in ocr_result if entry['t_logic_box'][2] == 0]
):
    # If every cell in the first column is empty, delete the first column
    ocr_result = [
        entry for entry in ocr_result if entry['t_logic_box'][2] != 0
    ]
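
For comparison, a minimal sketch of the row half of this logic written as a standalone function that does not modify its input. trim_empty_first_row is a hypothetical name, the entry format is assumed to be the one in the adjust_table_cells docstring ("t_logic_box" plus "t_ocr_res" as [[box, text], ...]), and, unlike the snippet above, it leaves the data alone when no cell starts in row 0:

```python
import copy
from typing import Dict, List


def trim_empty_first_row(cells: List[Dict]) -> List[Dict]:
    """Drop row 0 when every cell starting there has empty text; return new dicts."""
    first_row = [c for c in cells if c["t_logic_box"][0] == 0]
    all_empty = all(c["t_ocr_res"][0][1] == "" for c in first_row)
    if not first_row or not all_empty:
        # Nothing to trim: hand back an independent copy of the input.
        return copy.deepcopy(cells)
    # Copy before shifting so the caller's dicts are never modified.
    trimmed = copy.deepcopy([c for c in cells if c["t_logic_box"][0] != 0])
    for c in trimmed:
        c["t_logic_box"][0] -= 1  # row_start
        c["t_logic_box"][1] -= 1  # row_end
    return trimmed
```

As in the suggested change, a merged cell that starts in row 0 is dropped entirely, including the part that extends into later rows. Whether that is the desired behavior is a decision for the PR author.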

Comment on lines +196 to +260
def adjust_table_cells(self, t_rec_ocr_list_dict):
    """
    Adjust the table cells: remove the cells of the first row and/or first column,
    and update the row/column start and end positions of the remaining cells.

    Args:
        t_rec_ocr_list_dict (list): Original table cell recognition results, in the format
            [
                {
                    "t_box": [xmin, ymin, xmax, ymax],
                    "t_logic_box": [row_start, row_end, col_start, col_end],
                    "t_ocr_res": [[box, text], ...]
                },
                ...
            ]

    Returns:
        list: Adjusted table cell recognition results, in the same format as the input.
    """
    # New result list
    adjusted_result = []

    # Track whether the first-row and first-column cells are to be removed
    remove_first_row = False
    remove_first_col = False

    # Check whether to remove the first row
    if all(cell and not cell[1] for cell in t_rec_ocr_list_dict[0].get("t_ocr_res", [])):
        remove_first_row = True

    # Check whether to remove the first column
    if all(row.get("t_ocr_res") and not row["t_ocr_res"][0][1] for row in t_rec_ocr_list_dict):
        remove_first_col = True

    # Iterate over the original results and adjust them
    for i, row in enumerate(t_rec_ocr_list_dict):
        adjusted_row = []

        # Skip this row if it is the first row and must be removed
        if remove_first_row and i == 0:
            continue

        for j, cell in enumerate(row.get("t_ocr_res", [])):
            # Skip this column if it is the first column and must be removed
            if remove_first_col and j == 0:
                continue

            # Update the logical position of the current cell
            adjusted_cell = {
                "t_box": row.get("t_box"),
                "t_logic_box": [
                    row["t_logic_box"][0] - 1 if i > 0 else row["t_logic_box"][0],
                    row["t_logic_box"][1] - 1 if i > 0 else row["t_logic_box"][1],
                    row["t_logic_box"][2] - 1 if j > 0 else row["t_logic_box"][2],
                    row["t_logic_box"][3] - 1 if j > 0 else row["t_logic_box"][3]
                ],
                "t_ocr_res": cell
            }
            adjusted_row.append(adjusted_cell)

        if adjusted_row:
            adjusted_result.append(adjusted_row)

    return adjusted_result

Copilot AI Mar 6, 2026

The adjust_table_cells method is defined but never called (the only call to it, on line 104, is commented out in favor of process_ocr_result). This is dead code that adds unnecessary complexity to the file. It should either be removed or its purpose compared with process_ocr_result to decide which one to keep.

Suggested change: delete the entire adjust_table_cells method quoted above.
