@bolkedebruin
Contributor

This commit introduces major enhancements to make msgraphfs more intuitive to use.

  • Support URL-based filesystem paths (msgd://site/drive/path)
  • Enable single filesystem instance to access multiple SharePoint sites
  • Add automatic site and drive discovery from URLs
  • Support msgd://, sharepoint://, and onedrive:// protocols

  • Add AZURE_* environment variable fallback support
  • Implement robust OAuth2 token management with automatic refresh
  • Support both client credentials and delegated authentication flows

  • Implement lazy HTTP client initialization to prevent fork issues
  • Add process ID tracking to detect forks and reinitialize clients
  • Make filesystem safe for multi-process environments like Airflow
  • Resolve threading and subprocess compatibility problems

  • Add comprehensive file permissions API (get_permissions method)
  • Extended metadata with SharePoint-specific fields (weburl, fields)
  • Include permission analysis with user, group, and link breakdown
  • Support expand parameters for additional metadata retrieval

  • Replace content.py with proper pytest fixtures in conftest.py
  • Add comprehensive live credential testing capabilities
  • Implement environment variable configuration for test credentials
  • Add OAuth2 integration tests and URL parsing validation

  • Extensive documentation updates with usage examples
  • Multiple protocol registration for fsspec integration
  • Improved error messages and debugging information
  • Add support for Python 3.10+
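
To illustrate the lazy initialization and process ID tracking items in the list above, here is a minimal sketch of the general pattern. It is not the code from this PR: `ForkAwareClient` and its `factory` argument are hypothetical names, and the real implementation may differ in how it stores and rebuilds its HTTP client.

```python
import os
from typing import Any, Callable, Optional


class ForkAwareClient:
    """Build a client on first use and rebuild it after a fork.

    Hypothetical helper for illustration only; not part of msgraphfs.
    """

    def __init__(self, factory: Callable[[], Any]) -> None:
        self._factory = factory              # creates a fresh client object
        self._client: Optional[Any] = None   # nothing is created up front
        self._pid: Optional[int] = None      # PID that owns the current client

    @property
    def client(self) -> Any:
        pid = os.getpid()
        # A different PID means we are running in a forked child, so the
        # inherited client (and its sockets) must not be reused.
        if self._client is None or self._pid != pid:
            self._client = self._factory()
            self._pid = pid
        return self._client
```

Because nothing is created in `__init__`, the object can be constructed in a parent process (for example a scheduler) and first used in a worker; each process ends up with its own client.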

The changes maintain full backward compatibility while adding:

  • Session-scoped pytest fixtures for better test performance
  • Lazy initialization patterns for resource management
  • Comprehensive URL parsing for flexible path specifications
  • Multi-tenancy support with filesystem instance caching
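
As a rough sketch of what parsing a `msgd://site/drive/path` URL can look like, the example below splits a URL into protocol, site, drive, and path using only the standard library. `GraphPath` and `parse_graph_url` are hypothetical names, and treating all three protocols identically is an assumption; the parser in this PR may handle more cases (for instance `onedrive://` URLs without a site).

```python
from typing import NamedTuple
from urllib.parse import unquote, urlsplit

# Protocols named in the PR description.
SUPPORTED_PROTOCOLS = {"msgd", "sharepoint", "onedrive"}


class GraphPath(NamedTuple):
    protocol: str
    site: str
    drive: str
    path: str


def parse_graph_url(url: str) -> GraphPath:
    """Split protocol://site/drive/path into its parts (illustrative only)."""
    parts = urlsplit(url)
    if parts.scheme not in SUPPORTED_PROTOCOLS:
        raise ValueError(f"unsupported protocol: {parts.scheme!r}")
    # Drop empty segments so trailing or doubled slashes are tolerated.
    segments = [unquote(s) for s in parts.path.split("/") if s]
    drive = segments[0] if segments else ""
    path = "/" + "/".join(segments[1:])
    return GraphPath(parts.scheme, unquote(parts.netloc), drive, path)
```

For example, `parse_graph_url("msgd://mysite/Documents/reports/q3.xlsx")` yields the site `mysite`, the drive `Documents`, and the path `/reports/q3.xlsx`.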

@bolkedebruin force-pushed the feature/comprehensive-enhancements branch from 6a527a6 to b9bc2d1 on October 4, 2025 09:51
@bolkedebruin force-pushed the feature/comprehensive-enhancements branch from b9bc2d1 to 8a4f9af on October 4, 2025 09:55
@bolkedebruin
Contributor Author

ping @lmignon PTAL

@lmignon
Member

lmignon commented Oct 7, 2025

Thank you @bolkedebruin, I will carefully review your proposal.

@bolkedebruin
Contributor Author

@lmignon if you'd like I can disable codecov or you can add a token so you can track test coverage. https://docs.codecov.com/docs/codecov-tokens

@lmignon
Member

lmignon commented Oct 8, 2025

> @lmignon if you'd like I can disable codecov or you can add a token so you can track test coverage. https://docs.codecov.com/docs/codecov-tokens

The token is added. Nevertheless, code coverage will only work once this branch is merged into the main branch.

Member

This file should be removed. IMO tests must be executed with the latest versions of the dependencies allowed by the constraints declared in the pyproject.toml file.

Contributor Author

I understand the concern about the lockfile. However, I'd gently suggest that keeping uv.lock is considered best practice in the Python community, similar to how package-lock.json works in npm or Cargo.lock in Rust.

The lockfile doesn't prevent anyone from working with the project; it just ensures consistency. uv's documentation recommends committing the lockfile for applications and libraries with development dependencies.
That said, it's your project, so if you would like to see it removed I will do so.

@bolkedebruin
Contributor Author

@lmignon drive_id issues should be fixed now.

@bolkedebruin
Contributor Author

ping @lmignon

@lmignon
Member

lmignon commented Oct 15, 2025

@bolkedebruin Thank you for the last change. This resolves the compatibility issue. Regarding the uv lock file, I would prefer to delete it. For specific client projects, I share the best practice of pinning the versions of the various modules to ensure the traceability and reproducibility of the system. For a library, however, I prefer that CI rely solely on the version constraints declared for the dependencies, rather than on explicitly locking those dependencies to a given version. This makes it possible to detect any incompatibilities introduced in a dependency between two CI runs. Once the lock file is removed, I will merge the PR.
Once the PR is merged, I'll need to find a way to run all the tests properly. In my case, the client_id used requires me to go through the entire OAuth2 process. In the tests you added, it seems that this is not necessary. Perhaps I have configured my client incorrectly in Microsoft... The underlying question I am asking myself is whether it is possible to use a client_id acting as an application without a real user behind it (no OAuth2 dance) and at the same time allow the same client_id to perform delegated authentication... (what I am saying is probably not very clear...)

@lmignon mentioned this pull request Oct 20, 2025
@lmignon
Member

lmignon commented Oct 20, 2025

@bolkedebruin Thank you for all your work. The work continues in #7. Everything will be merged once all tests (offline and online) are running on GitHub...

@lmignon closed this Oct 20, 2025