Add graphrag-storage. #2127
Merged
19 commits
0e19173 Add graphrag-storage. (dworthen)
7e41278 Fix integration tests. (dworthen)
4b7e5bc Implement copilot feedback. (dworthen)
9b05924 Remove create_storage_from_config helper. (dworthen)
bb7e367 Merge branch 'v3/main' into graphrag-storage (dworthen)
0b6b593 Merge branch 'v3/main' into graphrag-storage (dworthen)
cd87faa update factory method for dynamic imports and use enums for factory r… (dworthen)
37c0316 Add encoding to storage config (dworthen)
4ecf218 cleanup blob container name validation (dworthen)
74cdc5a Remove using kwargs to swallow unknown factory config parameters. Unk… (dworthen)
ecc4c77 fix integration tests. (dworthen)
287b6b1 cleanup storage config for handling azure services. (dworthen)
3b01e27 fix integration tests. (dworthen)
0529dfe update storage config. (dworthen)
3a47233 updates (dworthen)
3dd1c28 update readme (dworthen)
ef97d7b cleanup (dworthen)
4b7d2d7 cleanup config (dworthen)
1460c81 cleanup (dworthen)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,86 @@ | ||
| # GraphRAG Storage | ||
| | ||
| ## Basic | ||
| | ||
| ```python | ||
| import asyncio | ||
| from graphrag_storage import StorageConfig, create_storage, StorageType | ||
| | ||
| async def run(): | ||
|     storage = create_storage( | ||
|         StorageConfig( | ||
|             type=StorageType.File, | ||
|             base_dir="output", | ||
|         ) | ||
|     ) | ||
| | ||
|     await storage.set("my_key", "value") | ||
|     print(await storage.get("my_key")) | ||
| | ||
| if __name__ == "__main__": | ||
|     asyncio.run(run()) | ||
| ``` | ||
| | ||
| ## Custom Storage | ||
| | ||
| ```python | ||
| import asyncio | ||
| from typing import Any | ||
| from graphrag_storage import Storage, StorageConfig, create_storage, register_storage | ||
| | ||
| class MyStorage(Storage): | ||
|     def __init__(self, some_setting: str, optional_setting: str = "default setting", **kwargs: Any): | ||
|         # Validate settings and initialize | ||
|         ... | ||
| | ||
|     # Implement rest of interface | ||
|     ... | ||
| | ||
| register_storage("MyStorage", MyStorage) | ||
| | ||
| async def run(): | ||
|     storage = create_storage( | ||
|         StorageConfig( | ||
|             type="MyStorage", | ||
|             some_setting="My Setting", | ||
|         ) | ||
|     ) | ||
|     # Or use the factory directly to instantiate with a dict instead of using | ||
|     # StorageConfig + create_storage | ||
|     # from graphrag_storage.storage_factory import storage_factory | ||
|     # storage = storage_factory.create(strategy="MyStorage", init_args={"some_setting": "My Setting"}) | ||
| | ||
|     await storage.set("my_key", "value") | ||
|     print(await storage.get("my_key")) | ||
| | ||
| if __name__ == "__main__": | ||
|     asyncio.run(run()) | ||
| ``` | ||
| | ||
| ### Details | ||
| | ||
| By default, `create_storage` comes with the following storage providers registered, corresponding to the entries in the `StorageType` enum: | ||
| | ||
| - `FileStorage` | ||
| - `AzureBlobStorage` | ||
| - `AzureCosmosStorage` | ||
| - `MemoryStorage` | ||
| | ||
| Preregistration happens lazily: for example, `FileStorage` is imported and registered only when you request it, e.g. with `create_storage(StorageConfig(type=StorageType.File, ...))`. There is no need to manually import and register built-in storage providers when using `create_storage`. | ||
| | ||
| If you want a clean factory with no preregistered storage providers, import `storage_factory` directly and bypass `create_storage`. The downside is that `storage_factory.create` takes a dict of init args instead of the strongly typed `StorageConfig` used with `create_storage`. | ||
| | ||
| ```python | ||
| from graphrag_storage.storage_factory import storage_factory | ||
| from graphrag_storage.file_storage import FileStorage | ||
| | ||
| # storage_factory has no preregistered providers, so you must register any | ||
| # providers you plan on using. | ||
| # You may also register a custom implementation; see the example above. | ||
| storage_factory.register("my_storage_key", FileStorage) | ||
| | ||
| storage = storage_factory.create(strategy="my_storage_key", init_args={"base_dir": "...", "other_settings": "..."}) | ||
| | ||
| ... | ||
| | ||
| ``` |
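The lazy preregistration described in the README's Details section can be illustrated with a small, stdlib-only sketch. This is not the `graphrag_storage` implementation: `MemoryStorageSketch`, `FactorySketch`, `factory_sketch`, `_BUILTIN_LOADERS`, and `create_storage_sketch` are all hypothetical names invented for illustration, and the real package may organize this differently. The sketch only shows the general pattern of a factory that starts empty, plus a helper that registers a built-in provider the first time it is requested.

```python
import asyncio
from typing import Any, Callable, Optional


class MemoryStorageSketch:
    """Hypothetical in-memory provider exposing the async get/set calls used in the README."""

    def __init__(self, **kwargs: Any) -> None:
        self._items: dict[str, Any] = {}

    async def get(self, key: str) -> Any:
        return self._items.get(key)

    async def set(self, key: str, value: Any) -> None:
        self._items[key] = value


class FactorySketch:
    """Maps a strategy name to a constructor; nothing is registered up front."""

    def __init__(self) -> None:
        self._registry: dict[str, Callable[..., Any]] = {}

    def register(self, strategy: str, initializer: Callable[..., Any]) -> None:
        self._registry[strategy] = initializer

    def __contains__(self, strategy: str) -> bool:
        return strategy in self._registry

    def create(self, strategy: str, init_args: Optional[dict[str, Any]] = None) -> Any:
        if strategy not in self._registry:
            raise ValueError(f"Strategy '{strategy}' is not registered.")
        return self._registry[strategy](**(init_args or {}))


# A "clean" factory: callers must register providers themselves.
factory_sketch = FactorySketch()

# Built-ins are described by loaders that run only on first request.
# (Assumption: a real loader would perform the provider import here.)
_BUILTIN_LOADERS: dict[str, Callable[[], Callable[..., Any]]] = {
    "memory": lambda: MemoryStorageSketch,
}


def create_storage_sketch(storage_type: str, **settings: Any) -> Any:
    """Register a built-in provider lazily, then delegate to the factory."""
    if storage_type not in factory_sketch and storage_type in _BUILTIN_LOADERS:
        factory_sketch.register(storage_type, _BUILTIN_LOADERS[storage_type]())
    return factory_sketch.create(storage_type, settings)


async def demo() -> Any:
    storage = create_storage_sketch("memory")
    await storage.set("my_key", "value")
    return await storage.get("my_key")


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this sketch, calling `factory_sketch.create` directly on an unregistered name raises `ValueError`, which mirrors the trade-off noted in the README: bypassing the helper gives a clean registry but leaves registration to the caller.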
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,20 @@ | ||
| # Copyright (c) 2024 Microsoft Corporation. | ||
| # Licensed under the MIT License | ||
| | ||
| """The GraphRAG Storage package.""" | ||
| | ||
| from graphrag_storage.storage import Storage | ||
| from graphrag_storage.storage_config import StorageConfig | ||
| from graphrag_storage.storage_factory import ( | ||
|     create_storage, | ||
|     register_storage, | ||
| ) | ||
| from graphrag_storage.storage_type import StorageType | ||
| | ||
| __all__ = [ | ||
|     "Storage", | ||
|     "StorageConfig", | ||
|     "StorageType", | ||
|     "create_storage", | ||
|     "register_storage", | ||
| ] |