Commit 0e25de3

Merge pull request microsoft#62 from microsoft/jamesqa
Updated Sample Data README and tiktoken version.
2 parents b88d4f6 + 360b0d1 commit 0e25de3

2 files changed: 5 additions, 5 deletions

requirements-dev.txt

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@ azure-ai-documentintelligence==1.0.0b2
 Markdown==3.4.4
 requests==2.32.3
 tqdm==4.66.1
-tiktoken==0.4.0
+tiktoken
 langchain==0.2.12
 bs4==0.0.1
 urllib3==2.2.2
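
Dropping the `==0.4.0` pin means pip resolves `tiktoken` to whatever release is current at install time, while the other packages in this hunk stay fixed. A minimal sketch of how such unpinned entries could be spotted is shown below. The helper names are hypothetical and not part of the repository, and the parser only understands plain `name==version` lines, not the full requirement-specifier grammar.

```python
def split_requirement(line):
    """Return (name, pinned_version or None) for a simple requirement line.

    Only the 'name==version' form is recognised; anything without '=='
    is treated as unpinned. Blank lines and comments return None.
    """
    line = line.split("#", 1)[0].strip()  # drop trailing comments
    if not line:
        return None
    name, sep, version = line.partition("==")
    return name.strip(), (version.strip() if sep else None)


def unpinned(lines):
    """Names of the requirements that carry no '==' version pin."""
    names = []
    for line in lines:
        parsed = split_requirement(line)
        if parsed is not None and parsed[1] is None:
            names.append(parsed[0])
    return names


# The lines of this hunk as they read after the commit.
REQUIREMENTS_AFTER = [
    "Markdown==3.4.4",
    "requests==2.32.3",
    "tqdm==4.66.1",
    "tiktoken",
    "langchain==0.2.12",
    "bs4==0.0.1",
    "urllib3==2.2.2",
]

if __name__ == "__main__":
    print(unpinned(REQUIREMENTS_AFTER))
```

Run against the post-commit lines, the only name reported is `tiktoken`. Whether leaving it unpinned is desirable depends on how reproducible the sample-data environment needs to be.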

scripts/SAMPLE_DATA.md

Lines changed: 4 additions & 4 deletions
@@ -11,7 +11,7 @@
 - Copy and paste the contents from the scripts/.env.sample file.
 - Replace the values for `<AZURE_OPENAI_RESOURCE>` and `<AZURE_OPENAI_KEY>` with the name of the Azure OpenAI resource and either KEY 1 or KEY 2.
 - Save the .env file.
-- Within the scripts folder, create a config file `config.json`. The format will be a list of JSON objects, with each object specifying a configuration of local data path and target search service and index. Assuming you used "Deploy to Azure" to deploy this solution accelerator, these values can be found within the resources themselves. Copy and paste the following script block into the config.json file and update accordingly.
+- Within the scripts folder, create a config file `config.json`. The format will be a list of JSON objects, with each object specifying a configuration of local data path and target search service and index. Assuming you used "Deploy to Azure" to deploy this solution accelerator, these values can be found within the resources themselves. If you did not change the Search Index name, the default value is: promissory-notes-index. Copy and paste the following script block into the config.json file and update accordingly.
 
 ```
 [
@@ -21,7 +21,7 @@
 "subscription_id": "<subscription id>",
 "resource_group": "<resource group name>",
 "search_service_name": "<search service name to use>",
-"index_name": "promissory-notes-index",
+"index_name": "<search index name to use>",
 "chunk_size": 1024,
 "token_overlap": 128,
 "semantic_config_name": "default",
@@ -36,8 +36,8 @@
 - Create a virtual environment for the sample data preparation
 - Open a terminal window.
 - Create the virtual environment: `python -m venv scriptsenv`
-- Activate the virtual environment: `.\scriptsenv\bin\activate`
-- Install the necessary packages listed in scripts/requirements-dev.txt, e.g. `pip install --user -r requirements-dev.txt`
+- Activate the virtual environment: `.\scriptsenv\Scripts\activate`
+- Install the necessary packages listed in scripts/requirements-dev.txt, e.g. `pip install -r requirements-dev.txt`
 - Create the index and ingest PDF data with Form Recognizer
 - Replace `<form-rec-resource-name>` with the name of the existing or recently created Azure Document Intelligence (Form Recognizer) resource and replace `<form-rec-key>` with key 1 or key 2 of the existing or recently created Azure Document Intelligence (Form Recognizer) resource:
 `python data_preparation.py --config config.json --njobs=1 --form-rec-resource <form-rec-resource-name> --form-rec-key <form-rec-key>`
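
The README changes in this file touch three things a reader has to get right before running `data_preparation.py`: the activation script path, installing into the virtual environment rather than the user site, and replacing the `<...>` placeholders in `config.json`. The sketch below shows one way these could be checked up front. It is illustrative only and not part of the repository: all function names are hypothetical, the sample config contains only the keys visible in the diff (the real file has more), and the resource values in it are made up.

```python
import json
import os
import re
import sys
from pathlib import PurePosixPath, PureWindowsPath

# A value such as "<search index name to use>" that was never filled in.
PLACEHOLDER = re.compile(r"^<[^<>]+>$")


def activate_script(venv_dir, os_name=os.name):
    """Path of the activation script that `python -m venv` creates."""
    if os_name == "nt":  # Windows layout: <venv>\Scripts\activate
        return PureWindowsPath(venv_dir) / "Scripts" / "activate"
    return PurePosixPath(venv_dir) / "bin" / "activate"  # POSIX layout


def in_virtualenv():
    """True when the running interpreter belongs to a virtual environment."""
    return sys.prefix != sys.base_prefix


def find_placeholders(config_text):
    """Return (entry_index, key) pairs whose value is still '<...>'."""
    leftovers = []
    for i, entry in enumerate(json.loads(config_text)):
        for key, value in entry.items():
            if isinstance(value, str) and PLACEHOLDER.match(value):
                leftovers.append((i, key))
    return leftovers


# Hypothetical config: only keys shown in the diff, with made-up values.
SAMPLE_CONFIG = """
[
  {
    "subscription_id": "<subscription id>",
    "resource_group": "my-resource-group",
    "search_service_name": "my-search-service",
    "index_name": "<search index name to use>",
    "chunk_size": 1024,
    "token_overlap": 128,
    "semantic_config_name": "default"
  }
]
"""

if __name__ == "__main__":
    print("activation script:", activate_script("scriptsenv"))
    print("inside a virtual environment:", in_virtualenv())
    print("unreplaced placeholders:", find_placeholders(SAMPLE_CONFIG))
```

The `Scripts` versus `bin` branch mirrors the path fix in this commit, which targets the Windows layout. Dropping `--user` is consistent with how pip behaves inside a virtual environment, where user-site installs are normally refused because the user site-packages directory is not visible.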
