rca add: oss error #624 #940

Open
wayyoungboy wants to merge 6 commits into oceanbase:master from wayyoungboy:oss-error

@wayyoungboy

Member

rca add: oss error #624
close #624

Comment thread src/handler/gather/gather_dbms_xplan.py Outdated
if len(resp["error"]) == 0:
file_size = os.path.getsize(resp["gather_pack_path"])
self.gather_tuples.append((node.get("ip"), False, resp["error"], file_size, int(time.time() - st), resp["gather_pack_path"]))
# recycle *_obdiag_*.trac in observer log dir

Member

Why is this part of the code being deleted?

Comment thread dev_helper.sh Outdated
mkdir -p ./dependencies/bin
# download obstack
if [ -f ./dependencies/bin/obstack_aarch64 ]; then
# download obstack and obadmin

Member

How large will the package become if obadmin is bundled in?

@wayyoungboy

wayyoungboy commented Jun 7, 2026

Member Author

Bundling it would make the package very large, so this is on hold for now.

@wayyoungboy

Member Author

Review note: this should stay on hold until the packaging-size question is resolved, and there are also functional issues in plugins/rca/oss_error.py that need fixing before merge:

  • node_arch = ssh_client.exec_cmd("arch") is compared directly with "aarch64" / "x86_64"; exec_cmd commonly returns a trailing newline, so this should use .strip() or it may always fall back to x86.
  • os.path.exists(obadmin_remote_path) checks the local filesystem for a remote path, so it does not actually tell whether /tmp/.../ob_admin exists on the target node.
  • remote_obadmin_data_dir is an absolute path, but tar_full_name = "{0}/{1}.tar.gz".format(remote_dir, remote_obadmin_data_dir) produces a duplicated path like /tmp/obadmin_tmp_xxx//tmp/obadmin_tmp_xxx/obadmin_node.tar.gz, causing download to look at the wrong remote file.

I only ran syntax validation here: python3 -m py_compile plugins/rca/oss_error.py passed.
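
A minimal sketch of how the three points above could be addressed. It assumes a hypothetical `run` callable that stands in for `ssh_client.exec_cmd` and returns the command's stdout as a string; the helper names (`detect_arch`, `remote_file_exists`, `tar_path`) are illustrative and do not come from the PR:

```python
import posixpath
import shlex
from typing import Callable

# Hypothetical stand-in for ssh_client.exec_cmd: takes a command, returns its stdout.
RunCmd = Callable[[str], str]


def detect_arch(run: RunCmd) -> str:
    """Return the node architecture with the trailing newline removed."""
    return run("arch").strip()


def remote_file_exists(run: RunCmd, path: str) -> bool:
    """Check the path on the target node via `test -e`, not os.path.exists locally."""
    out = run("test -e {0} && echo yes || echo no".format(shlex.quote(path)))
    return out.strip() == "yes"


def tar_path(remote_dir: str, archive_stem: str) -> str:
    """Join remote_dir with the bare archive name so the directory is not repeated."""
    name = posixpath.basename(archive_stem.rstrip("/"))
    return posixpath.join(remote_dir, name + ".tar.gz")
```

With this shape, `tar_path("/tmp/obadmin_tmp_xxx", "/tmp/obadmin_tmp_xxx/obadmin_node")` yields a single `/tmp/obadmin_tmp_xxx/obadmin_node.tar.gz` rather than the duplicated path described above.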

@wayyoungboy

wayyoungboy commented Jun 7, 2026

Member Author

Additional review findings in plugins/rca/oss_error.py:

  • Lines 58-62 create the remote tmp dir only when ls remote_dir does not return No such file or directory; the condition is reversed. On a fresh run the tmp dir is missing and will not be created before upload/chmod.
  • The packaging helper downloads ob_admin_aarch64 / ob_admin_x86_64, but the RCA code builds local paths as obadmin_aarch64 / obadmin_x86_64. Unless packaging renames these elsewhere, upload will look for a file that does not exist.

These are in addition to the previous path and remote-existence issues, and they make this PR unsafe to merge before a focused fix and an observer integration rerun.
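
One possible shape for the fix, as a sketch only. `run` is again a hypothetical stand-in for `ssh_client.exec_cmd`, `bin_dir` is an assumed local directory holding the packaged binaries, and the helper names are illustrative. Using `mkdir -p` avoids the reversed `ls` check entirely, and the local path is built from the `ob_admin_<arch>` names that the packaging helper downloads:

```python
import os
import shlex
from typing import Callable

# Hypothetical stand-in for ssh_client.exec_cmd: takes a command, returns its stdout.
RunCmd = Callable[[str], str]


def ensure_remote_dir(run: RunCmd, remote_dir: str) -> None:
    """Create remote_dir if it is missing; mkdir -p also succeeds when it exists."""
    run("mkdir -p {0}".format(shlex.quote(remote_dir)))


def local_obadmin_path(bin_dir: str, arch: str) -> str:
    """Return the packaged ob_admin binary for arch, failing loudly if it is absent."""
    arch = arch.strip()
    if arch not in ("aarch64", "x86_64"):
        raise ValueError("unsupported architecture: {0!r}".format(arch))
    path = os.path.join(bin_dir, "ob_admin_{0}".format(arch))
    if not os.path.isfile(path):
        raise FileNotFoundError(path)
    return path
```

Raising on an unknown architecture or a missing binary is a design choice in this sketch: it surfaces a naming mismatch before upload instead of silently falling back to the x86 build.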

Development

Successfully merging this pull request may close these issues.

[Feature]: Support for troubleshooting object storage issues

2 participants