Merged
92 commits
95ca338
Initial plan for issue
Copilot May 30, 2025
8e5d6fe
Implement standard logging module and integrate with existing loggers
Copilot May 30, 2025
8d67400
Add test cases and improve documentation for standard logging
Copilot May 30, 2025
424b72e
Apply ruff formatting and add semversioner file for logging improvements
Copilot May 30, 2025
221a991
Remove custom logger classes and refactor to use standard logging only
Copilot May 30, 2025
d444a81
Apply ruff formatting to resolve CI/CD test failures
Copilot May 30, 2025
02dd063
Add semversioner file and fix linting issues
Copilot May 30, 2025
25055fe
ruff fixes
jgbradley1 May 30, 2025
8a28cb6
fix spelling error
jgbradley1 May 30, 2025
c2f2cff
Remove StandardProgressLogger and refactor to use standard logging
Copilot May 30, 2025
8c6c1f6
Remove LoggerFactory and custom loggers, refactor to use standard log…
Copilot May 30, 2025
16b9eb7
Fix pyright error: use logger.info() instead of calling logger as fun…
Copilot May 30, 2025
09c32ae
ruff fixes
jgbradley1 May 31, 2025
948c7d1
Remove deprecated logger files that were marked as deprecated placeho…
Copilot May 31, 2025
de7afaf
Replace custom get_logger with standard Python logging
Copilot May 31, 2025
e42e7d2
Fix linting issues found by ruff check --fix
Copilot May 31, 2025
a48f597
apply ruff check fixes
jgbradley1 May 31, 2025
ca0b6a4
add word to dictionary
jgbradley1 May 31, 2025
1eb008c
Fix type checker error in ModelManager.__new__ method
Copilot May 31, 2025
f56ed09
Refactor multiple logging.getLogger() calls to use single logger per …
Copilot May 31, 2025
a168501
Remove progress_logger parameter from build_index() and logger parame…
Copilot May 31, 2025
f73a7a7
Remove logger parameter from run_pipeline and standardize logger naming
Copilot May 31, 2025
8dcfddf
Replace logger parameter with log_level parameter in CLI commands
Copilot May 31, 2025
f76d197
Fix import ordering in notebook files to pass poetry poe check
Copilot May 31, 2025
1ca728b
Remove --logger parameter from smoke test command
Copilot May 31, 2025
e65891a
Fix Windows CI/CD issue with log file cleanup in tests
Copilot May 31, 2025
ee695d3
Add StreamHandler to root logger in __main__.py for CLI logging
Copilot May 31, 2025
14acebe
Only add StreamHandler if root logger doesn't have existing StreamHan…
Copilot May 31, 2025
2f7abc1
Fix import ordering in notebook files to pass ruff checks
Copilot May 31, 2025
fe1a860
Replace logging.StreamHandler with colorlog.StreamHandler for coloriz…
Copilot Jun 1, 2025
b45ecb0
Regenerate poetry.lock file after adding colorlog dependency
Copilot Jun 1, 2025
e0cb059
Fix import ordering in notebook files to pass ruff checks
Copilot Jun 1, 2025
38776a8
move printing of dataframes to debug level
jgbradley1 Jun 2, 2025
d761133
remove colorlog for now
jgbradley1 Jun 2, 2025
e427bda
Refactor workflow callbacks to inherit from logging.Handler
Copilot Jun 2, 2025
394683d
Fix linting issues in workflow callback handlers
Copilot Jun 2, 2025
c777f02
Fix pyright type errors in blob and file workflow callbacks
Copilot Jun 2, 2025
f923c90
Refactor pipeline logging to use pure logging.Handler subclasses
Copilot Jun 2, 2025
60121e8
Rename workflow callback classes to workflow logger classes and move …
Copilot Jun 2, 2025
4e2f59b
update dictionary
jgbradley1 Jun 2, 2025
d29d058
apply ruff fixes
jgbradley1 Jun 2, 2025
b2d0ed4
fix function name
jgbradley1 Jun 2, 2025
2e18bf4
simplify logger code
jgbradley1 Jun 2, 2025
c4f1bf3
update
jgbradley1 Jun 2, 2025
53cf87e
Merge branch 'main' into copilot/fix-1955
jgbradley1 Jun 2, 2025
e81b113
Remove error, warning, and log methods from WorkflowCallbacks and rep…
Copilot Jun 2, 2025
a8bda86
ruff fixes
jgbradley1 Jun 2, 2025
a8b6a7b
Fix pyright errors by removing WorkflowCallbacks from strategy type s…
Copilot Jun 3, 2025
e5a4b86
Remove ConsoleWorkflowLogger and apply consistent formatter to all ha…
Copilot Jun 3, 2025
0f127e3
apply ruff fixes
jgbradley1 Jun 3, 2025
b7578eb
Refactor pipeline_logger.py to use standard FileHandler and remove Fi…
Copilot Jun 3, 2025
de2f3e3
Remove conditional azure import checks from blob_workflow_logger.py
Copilot Jun 3, 2025
586f8b7
Fix pyright type checking errors in mock_provider.py and utils.py
Copilot Jun 3, 2025
d1e23e9
Run ruff check --fix to fix import ordering in notebooks
Copilot Jun 3, 2025
110a6a7
Merge configure_logging and create_pipeline_logger into init_loggers …
Copilot Jun 3, 2025
57c41dc
Remove configure_logging and create_pipeline_logger functions, replac…
Copilot Jun 3, 2025
e7e0449
apply ruff fixes
jgbradley1 Jun 3, 2025
e393874
cleanup unused code
jgbradley1 Jun 3, 2025
40eeec7
Update init_loggers to accept GraphRagConfig instead of ReportingConfig
Copilot Jun 3, 2025
dd70fca
apply ruff check fixes
jgbradley1 Jun 3, 2025
9ddb011
Fix test failures by providing valid GraphRagConfig with required mod…
Copilot Jun 3, 2025
5a0b938
apply ruff fixes
jgbradley1 Jun 3, 2025
d8b0733
remove logging_workflow_callback
jgbradley1 Jun 3, 2025
70cc88d
cleanup logging messages
jgbradley1 Jun 5, 2025
acabd59
Add logging to track progress of pandas DataFrame apply operation in …
Copilot Jun 5, 2025
1f5424b
cleanup logger logic throughout codebase
jgbradley1 Jun 5, 2025
aee81c4
update
jgbradley1 Jun 5, 2025
04227ab
more cleanup of old loggers
jgbradley1 Jun 5, 2025
e4607ba
small logger cleanup
jgbradley1 Jun 5, 2025
a830713
final code cleanup and added loggers to query
jgbradley1 Jun 6, 2025
8129534
add verbose logging to query
jgbradley1 Jun 6, 2025
64cde03
minor code cleanup
jgbradley1 Jun 6, 2025
30d49f2
Fix broken unit tests for chunk_text and standard_logging
Copilot Jun 6, 2025
0ea076d
apply ruff fixes
jgbradley1 Jun 6, 2025
d9bd984
Fix test_chunk_text by mocking progress_ticker function instead of Pr…
Copilot Jun 6, 2025
389c5f1
remove unnecessary logger
jgbradley1 Jun 6, 2025
412b53a
Merge branch 'copilot/fix-1955' of github.com:microsoft/graphrag into…
jgbradley1 Jun 6, 2025
4e2c107
remove rich and fix type annotation
jgbradley1 Jun 6, 2025
5ee88c7
revert test formatting changes made by copilot
jgbradley1 Jun 6, 2025
ee4a0e6
promote graphrag logs to root logger
jgbradley1 Jun 10, 2025
9295133
add correct semversioner file
jgbradley1 Jun 11, 2025
33961dd
revert change to file
jgbradley1 Jun 11, 2025
00410c1
revert formatting changes that have no effect
jgbradley1 Jun 11, 2025
0df17b2
Merge branch 'main' into copilot/fix-1955
jgbradley1 Jun 18, 2025
9fd438c
fix changes after merge with main
jgbradley1 Jun 18, 2025
790bbbf
revert unnecessary copilot changes
jgbradley1 Jun 18, 2025
8f50dd6
remove whitespace
jgbradley1 Jun 18, 2025
4a8c7bf
Merge branch 'main' into copilot/fix-1955
jgbradley1 Jun 23, 2025
7a72def
cleanup docstring
jgbradley1 Jun 23, 2025
d8a2acf
simplify some logic with less code
jgbradley1 Jul 9, 2025
259d4aa
update poetry lock file
jgbradley1 Jul 9, 2025
23e9789
ruff fixes
jgbradley1 Jul 9, 2025
4 changes: 4 additions & 0 deletions .semversioner/next-release/patch-20250611170907043237.json
@@ -0,0 +1,4 @@
{
"type": "patch",
"description": "cleaned up logging to follow python standards."
}
2 changes: 2 additions & 0 deletions dictionary.txt
@@ -102,6 +102,7 @@ itertuples
isin
nocache
nbconvert
levelno

# HTML
nbsp
@@ -186,6 +187,7 @@ Verdantis's
# English
skippable
upvote
unconfigured

# Misc
Arxiv
4 changes: 2 additions & 2 deletions docs/config/env_vars.md
@@ -178,11 +178,11 @@ This section controls the cache mechanism used by the pipeline. This is used to

### Reporting

This section controls the reporting mechanism used by the pipeline, for common events and error messages. The default is to write reports to a file in the output directory. However, you can also choose to write reports to the console or to an Azure Blob Storage container.
This section controls the reporting mechanism used by the pipeline, for common events and error messages. The default is to write reports to a file in the output directory. However, you can also choose to write reports to an Azure Blob Storage container.

| Parameter | Description | Type | Required or Optional | Default |
| --------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ----- | -------------------- | ------- |
| `GRAPHRAG_REPORTING_TYPE` | The type of reporter to use. Options are `file`, `console`, or `blob` | `str` | optional | `file` |
| `GRAPHRAG_REPORTING_TYPE` | The type of reporter to use. Options are `file` or `blob` | `str` | optional | `file` |
| `GRAPHRAG_REPORTING_STORAGE_ACCOUNT_BLOB_URL` | The Azure Storage blob endpoint to use when in `blob` mode and using managed identity. Will have the format `https://<storage_account_name>.blob.core.windows.net` | `str` | optional | None |
| `GRAPHRAG_REPORTING_CONNECTION_STRING` | The Azure Storage connection string to use when in `blob` mode. | `str` | optional | None |
| `GRAPHRAG_REPORTING_CONTAINER_NAME` | The Azure Storage container name to use when in `blob` mode. | `str` | optional | None |
4 changes: 2 additions & 2 deletions docs/config/yaml.md
@@ -149,11 +149,11 @@ This section controls the cache mechanism used by the pipeline. This is used to

### reporting

This section controls the reporting mechanism used by the pipeline, for common events and error messages. The default is to write reports to a file in the output directory. However, you can also choose to write reports to the console or to an Azure Blob Storage container.
This section controls the reporting mechanism used by the pipeline, for common events and error messages. The default is to write reports to a file in the output directory. However, you can also choose to write reports to an Azure Blob Storage container.

#### Fields

- `type` **file|console|blob** - The reporting type to use. Default=`file`
- `type` **file|blob** - The reporting type to use. Default=`file`
- `base_dir` **str** - The base directory to write reports to, relative to the root.
- `connection_string` **str** - (blob only) The Azure Storage connection string.
- `container_name` **str** - (blob only) The Azure Storage container name.
7 changes: 7 additions & 0 deletions graphrag/__init__.py
@@ -2,3 +2,10 @@
# Licensed under the MIT License

"""The GraphRAG package."""

import logging

from graphrag.logger.standard_logging import init_console_logger

logger = logging.getLogger(__name__)
init_console_logger()
30 changes: 13 additions & 17 deletions graphrag/api/index.py
@@ -10,7 +10,7 @@

import logging

from graphrag.callbacks.reporting import create_pipeline_reporter
from graphrag.callbacks.noop_workflow_callbacks import NoopWorkflowCallbacks
from graphrag.callbacks.workflow_callbacks import WorkflowCallbacks
from graphrag.config.enums import IndexingMethod
from graphrag.config.models.graph_rag_config import GraphRagConfig
@@ -19,10 +19,9 @@
from graphrag.index.typing.pipeline_run_result import PipelineRunResult
from graphrag.index.typing.workflow import WorkflowFunction
from graphrag.index.workflows.factory import PipelineFactory
from graphrag.logger.base import ProgressLogger
from graphrag.logger.null_progress import NullProgressLogger
from graphrag.logger.standard_logging import init_loggers

log = logging.getLogger(__name__)
logger = logging.getLogger(__name__)


async def build_index(
@@ -31,7 +30,6 @@
is_update_run: bool = False,
memory_profile: bool = False,
callbacks: list[WorkflowCallbacks] | None = None,
progress_logger: ProgressLogger | None = None,
) -> list[PipelineRunResult]:
"""Run the pipeline with the given configuration.

@@ -45,26 +43,25 @@
Whether to enable memory profiling.
callbacks : list[WorkflowCallbacks] | None default=None
A list of callbacks to register.
progress_logger : ProgressLogger | None default=None
The progress logger.

Returns
-------
list[PipelineRunResult]
The list of pipeline run results
"""
logger = progress_logger or NullProgressLogger()
# create a pipeline reporter and add to any additional callbacks
callbacks = callbacks or []
callbacks.append(create_pipeline_reporter(config.reporting, None))
init_loggers(config=config)

workflow_callbacks = create_callback_chain(callbacks, logger)
# Create callbacks for pipeline lifecycle events if provided
workflow_callbacks = (
create_callback_chain(callbacks) if callbacks else NoopWorkflowCallbacks()
)

outputs: list[PipelineRunResult] = []

if memory_profile:
log.warning("New pipeline does not yet support memory profiling.")
logger.warning("New pipeline does not yet support memory profiling.")

logger.info("Initializing indexing pipeline...")
# todo: this could propagate out to the cli for better clarity, but will be a breaking api change
method = _get_method(method, is_update_run)
pipeline = PipelineFactory.create_pipeline(config, method)
@@ -75,15 +72,14 @@
pipeline,
config,
callbacks=workflow_callbacks,
logger=logger,
is_update_run=is_update_run,
):
outputs.append(output)
if output.errors and len(output.errors) > 0:
logger.error(output.workflow)
logger.error("Workflow %s completed with errors", output.workflow)
else:
logger.success(output.workflow)
logger.info(str(output.result))
logger.info("Workflow %s completed successfully", output.workflow)
logger.debug(str(output.result))

workflow_callbacks.pipeline_end(outputs)
return outputs
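The hunk above replaces `logger.error(output.workflow)` with `logger.error("Workflow %s completed with errors", output.workflow)`. Passing arguments instead of pre-formatting the string (e.g. with f-strings) defers interpolation until a handler actually accepts the record, and keeps the message template constant for filtering. A small self-contained demonstration — the logger name and in-memory handler are illustrative, not from the PR:

```python
import logging


class ListHandler(logging.Handler):
    """Collects formatted log lines in memory for inspection."""

    def __init__(self) -> None:
        super().__init__()
        self.lines: list[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.lines.append(self.format(record))


logger = logging.getLogger("graphrag.demo")
logger.setLevel(logging.INFO)  # DEBUG records are dropped before formatting
handler = ListHandler()
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
logger.addHandler(handler)

# %-style args are only interpolated if the record is actually emitted:
logger.info("Workflow %s completed successfully", "create_communities")
logger.debug("result: %s", "never formatted")  # below level: args untouched
```

Only the INFO line is formatted; the DEBUG call returns before touching its arguments, which matters when those arguments are expensive to stringify (e.g. large DataFrames, now logged at debug level per the commits above).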
16 changes: 9 additions & 7 deletions graphrag/api/prompt_tune.py
@@ -11,6 +11,7 @@
Backwards compatibility is not guaranteed at this time.
"""

import logging
from typing import Annotated

import annotated_types
@@ -20,7 +21,7 @@
from graphrag.config.defaults import graphrag_config_defaults
from graphrag.config.models.graph_rag_config import GraphRagConfig
from graphrag.language_model.manager import ModelManager
from graphrag.logger.base import ProgressLogger
from graphrag.logger.standard_logging import init_loggers
from graphrag.prompt_tune.defaults import MAX_TOKEN_COUNT, PROMPT_TUNING_MODEL_ID
from graphrag.prompt_tune.generator.community_report_rating import (
generate_community_report_rating,
@@ -47,11 +48,12 @@
from graphrag.prompt_tune.loader.input import load_docs_in_chunks
from graphrag.prompt_tune.types import DocSelectionType

logger = logging.getLogger(__name__)


@validate_call(config={"arbitrary_types_allowed": True})
async def generate_indexing_prompts(
config: GraphRagConfig,
logger: ProgressLogger,
chunk_size: PositiveInt = graphrag_config_defaults.chunks.size,
overlap: Annotated[
int, annotated_types.Gt(-1)
@@ -71,8 +73,6 @@ async def generate_indexing_prompts(
Parameters
----------
- config: The GraphRag configuration.
- logger: The logger to use for progress updates.
- root: The root directory.
- output_path: The path to store the prompts.
- chunk_size: The chunk token size to use for input text units.
- limit: The limit of chunks to load.
@@ -89,6 +89,8 @@
-------
tuple[str, str, str]: entity extraction prompt, entity summarization prompt, community summarization prompt
"""
init_loggers(config=config)

# Retrieve documents
logger.info("Chunking documents...")
doc_list = await load_docs_in_chunks(
@@ -187,9 +189,9 @@
language=language,
)

logger.info(f"\nGenerated domain: {domain}") # noqa: G004
logger.info(f"\nDetected language: {language}") # noqa: G004
logger.info(f"\nGenerated persona: {persona}") # noqa: G004
logger.debug("Generated domain: %s", domain)
logger.debug("Detected language: %s", language)
logger.debug("Generated persona: %s", persona)

return (
extract_graph_prompt,
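Several commits above refactor the workflow callbacks into `logging.Handler` subclasses ("Refactor workflow callbacks to inherit from logging.Handler", the blob/file workflow loggers). The class below is a hedged sketch of that shape — a handler serializing records as JSON lines into a list, standing in for the real blob/file workflow loggers, whose actual interfaces are not shown in this diff:

```python
import json
import logging


class JsonLinesHandler(logging.Handler):
    """A logging.Handler that renders each record as one JSON object per line.

    Illustrative stand-in for GraphRAG's workflow logger handlers; the real
    classes write to files or Azure Blob Storage rather than a list.
    """

    def __init__(self) -> None:
        super().__init__()
        self.lines: list[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.lines.append(json.dumps({
            "logger": record.name,
            "level": record.levelname,
            "message": record.getMessage(),  # applies any %-style args
        }))


logger = logging.getLogger("graphrag.pipeline")
logger.setLevel(logging.INFO)
jl = JsonLinesHandler()
logger.addHandler(jl)
logger.info("Workflow %s finished", "extract_graph")
```

Because the sink is just a handler, it plugs into the standard logging tree: anything logged under the `graphrag.*` namespace reaches it without threading a logger object through pipeline function signatures — which is exactly what the removal of the `progress_logger`/`logger` parameters in this PR achieves.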