From 0234817b989adfd32b29fcf05bbe9b65521605f6 Mon Sep 17 00:00:00 2001
From: Venkateswarlu Marthula
Date: Tue, 12 Aug 2025 14:17:53 +0530
Subject: [PATCH 01/13] Readme file Update

---
 docs/LocalSetupGuide.md | 370 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 370 insertions(+)
 create mode 100644 docs/LocalSetupGuide.md

diff --git a/docs/LocalSetupGuide.md b/docs/LocalSetupGuide.md
new file mode 100644
index 00000000..ec7c8088
--- /dev/null
+++ b/docs/LocalSetupGuide.md
@@ -0,0 +1,370 @@
+# Local Setup and Development Guide
+
+This guide provides instructions for setting up the Content Processing Solution Accelerator locally for development and testing. The solution consists of three main components that work together to process multi-modal documents.
+
+## Table of Contents
+
+- [Local Setup: Quick Start](#local-setup-quick-start)
+- [Development Environment](#development-environment)
+- [Deploy with Azure Developer CLI](#deploy-with-azure-developer-cli)
+- [Troubleshooting](#troubleshooting)
+
+## Local Setup: Quick Start
+
+Follow these steps to set up and run the application locally for development:
+
+### Prerequisites
+
+Ensure you have the following installed:
+
+• **Git** - [Download Git](https://git-scm.com/downloads)
+• **Docker Desktop** - [Download Docker Desktop](https://www.docker.com/products/docker-desktop/)
+• **Azure CLI** - [Install Azure CLI](https://learn.microsoft.com/en-us/cli/azure/install-azure-cli)
+• **Azure Developer CLI (azd)** - [Install Azure Developer CLI](https://learn.microsoft.com/en-us/azure/developer/azure-developer-cli/install-azd)
+• **Python 3.11+** - [Download Python](https://www.python.org/downloads/)
+• **Node.js 18+** - [Download Node.js](https://nodejs.org/)
+
+### 1. Clone the Repository
+
+Navigate to your development folder and clone the repository:
+
+```bash
+git clone https://github.com/microsoft/content-processing-solution-accelerator.git
+cd content-processing-solution-accelerator
+```
+
+### 2. Azure Authentication
+
+Login to Azure and set your subscription:
+
+```bash
+# Login to Azure
+az login
+
+# Set your subscription
+az account set --subscription "your-subscription-id"
+
+# Login with Azure Developer CLI
+azd auth login
+```
+
+### 3. Configure Environment Variables
+
+Copy the environment sample files and update them with your Azure resource values:
+
+```bash
+# Copy environment files
+cp .env.sample .env
+cp src/ContentProcessor/.env.sample src/ContentProcessor/.env
+cp src/ContentProcessorAPI/.env.sample src/ContentProcessorAPI/.env
+cp src/ContentProcessorWeb/.env.sample src/ContentProcessorWeb/.env
+```
+
+Update the `.env` files with your Azure resource information:
+
+```bash
+# Root .env file
+AZURE_OPENAI_ENDPOINT=https://your-openai-resource.openai.azure.com/
+AZURE_OPENAI_API_KEY=your-openai-api-key
+AZURE_OPENAI_MODEL=gpt-4o
+AZURE_CONTENT_UNDERSTANDING_ENDPOINT=https://your-content-understanding-endpoint
+AZURE_STORAGE_CONNECTION_STRING=your-storage-connection-string
+AZURE_COSMOS_CONNECTION_STRING=your-cosmos-connection-string
+```
+
+### 4. Start the Application
+
+Run the startup script to install dependencies and start all components:
+
+**Windows:**
+```cmd
+start.cmd
+```
+
+**Linux/Mac:**
+```bash
+chmod +x start.sh
+./start.sh
+```
+
+Alternatively, you can start each component manually:
+
+**Backend API:**
+```bash
+cd src/ContentProcessorAPI
+python -m venv venv
+source venv/bin/activate  # Windows: venv\Scripts\activate
+pip install -r requirements.txt
+uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
+```
+
+**Content Processor:**
+```bash
+cd src/ContentProcessor
+python -m venv venv
+source venv/bin/activate  # Windows: venv\Scripts\activate
+pip install -r requirements.txt
+python src/main.py
+```
+
+**Web Frontend:**
+```bash
+cd src/ContentProcessorWeb
+npm install
+npm start
+```
+
+### 5. Access the Application
+
+Once all components are running, open your browser and navigate to:
+
+• **Web Interface:** [http://localhost:3000](http://localhost:3000)
+• **API Documentation:** [http://localhost:8000/docs](http://localhost:8000/docs)
+• **API Health Check:** [http://localhost:8000/health](http://localhost:8000/health)
+
+## Development Environment
+
+For advanced development and customization, you can set up each component individually:
+
+### Content Processor API (Backend)
+
+The REST API provides endpoints for file upload, processing management, and schema operations.
+
+```bash
+cd src/ContentProcessorAPI
+
+# Create and activate virtual environment
+python -m venv venv
+source venv/bin/activate  # Windows: venv\Scripts\activate
+
+# Install dependencies
+pip install -r requirements.txt
+
+# Start the API server with hot reload
+uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
+```
+
+### Content Processor (Background Service)
+
+The background processing engine handles document extraction and transformation.
+
+```bash
+cd src/ContentProcessor
+
+# Create and activate virtual environment
+python -m venv venv
+source venv/bin/activate  # Windows: venv\Scripts\activate
+
+# Install dependencies
+pip install -r requirements.txt
+
+# Start the processor
+python src/main.py
+```
+
+### Content Processor Web (Frontend)
+
+The React/TypeScript frontend provides the user interface.
+
+```bash
+cd src/ContentProcessorWeb
+
+# Install dependencies
+npm install
+
+# Start development server
+npm start
+```
+
+### Using Docker for Development
+
+For containerized development, create a `docker-compose.dev.yml` file:
+
+```yaml
+version: '3.8'
+services:
+  content-processor-api:
+    build:
+      context: ./src/ContentProcessorAPI
+      dockerfile: Dockerfile
+    ports:
+      - "8000:8000"
+    environment:
+      - APP_ENV=development
+    volumes:
+      - ./src/ContentProcessorAPI:/app
+    command: uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
+
+  content-processor-web:
+    build:
+      context: ./src/ContentProcessorWeb
+      dockerfile: Dockerfile
+    ports:
+      - "3000:3000"
+    environment:
+      - REACT_APP_API_BASE_URL=http://localhost:8000
+    volumes:
+      - ./src/ContentProcessorWeb:/app
+    command: npm start
+
+  content-processor:
+    build:
+      context: ./src/ContentProcessor
+      dockerfile: Dockerfile
+    environment:
+      - APP_ENV=development
+    volumes:
+      - ./src/ContentProcessor:/app
+```
+
+Run with Docker Compose:
+```bash
+docker-compose -f docker-compose.dev.yml up --build
+```
+
+## Deploy with Azure Developer CLI
+
+Follow these steps to deploy the application to Azure using Azure Developer CLI:
+
+### Prerequisites
+
+• Ensure you have the [Azure Developer CLI](https://learn.microsoft.com/en-us/azure/developer/azure-developer-cli/install-azd) installed.
+• Ensure you have an Azure subscription with appropriate permissions.
+• Check [Azure OpenAI quota availability](./quota_check.md) before deployment.
+
+### 1. Initialize the Project
+
+Initialize the project for Azure deployment:
+
+```bash
+# Initialize azd project
+azd init
+
+# Select the content-processing template when prompted
+```
+
+### 2. Configure Environment
+
+Set up your environment variables:
+
+```bash
+# Set environment name
+azd env set AZURE_ENV_NAME "your-environment-name"
+
+# Set Azure location
+azd env set AZURE_LOCATION "eastus"
+
+# Set OpenAI deployment parameters
+azd env set AZURE_OPENAI_GPT_DEPLOYMENT_CAPACITY "10"
+azd env set AZURE_OPENAI_GPT_MODEL_NAME "gpt-4o"
+```
+
+### 3. Deploy Infrastructure and Applications
+
+Deploy both infrastructure and applications:
+
+```bash
+# Provision Azure resources and deploy applications
+azd up
+```
+
+This command will:
+• Create all required Azure resources
+• Build and deploy the container applications
+• Configure networking and security settings
+
+### 4. Verify Deployment
+
+Once deployment is complete, verify the application is running:
+
+```bash
+# Get deployment information
+azd show
+
+# Open the deployed web application
+azd browse
+```
+
+### 5. Redeploy Application Code
+
+To deploy code changes without reprovisioning infrastructure:
+
+```bash
+# Deploy only application code changes
+azd deploy
+```
+
+### 6. Clean Up Resources
+
+To remove all deployed resources:
+
+```bash
+# Delete all Azure resources
+azd down
+```
+
+## Troubleshooting
+
+### Common Issues
+
+**Python Module Not Found:**
+```bash
+# Ensure virtual environment is activated
+source venv/bin/activate  # Windows: venv\Scripts\activate
+pip install -r requirements.txt
+```
+
+**Node.js Dependencies Issues:**
+```bash
+# Clear npm cache and reinstall
+npm cache clean --force
+rm -rf node_modules package-lock.json
+npm install
+```
+
+**Port Conflicts:**
+```bash
+# Check what's using the port
+netstat -tulpn | grep :8000  # Linux/Mac
+netstat -ano | findstr :8000  # Windows
+
+# Kill the process or change the port
+```
+
+**Azure Authentication Issues:**
+```bash
+# Re-authenticate
+az logout
+az login
+azd auth login
+```
+
+**CORS Issues:**
+• Ensure API CORS settings include the web app URL
+• Check browser network tab for CORS errors
+• Verify API is running on the expected port
+
+**Environment Variables Not Loading:**
+• Verify `.env` file is in the correct directory
+• Check file permissions (especially on Linux/macOS)
+• Ensure no extra spaces in variable assignments
+
+### Debug Mode
+
+Enable detailed logging by setting these environment variables:
+
+```bash
+APP_LOGGING_LEVEL=DEBUG
+APP_LOGGING_ENABLE=True
+```
+
+### Getting Help
+
+• Check the [Technical Architecture](./TechnicalArchitecture.md) documentation
+• Review the [API Documentation](./API.md) for endpoint details
+• Submit issues to the [GitHub repository](https://github.com/microsoft/content-processing-solution-accelerator/issues)
+• Check existing issues for similar problems
+
+---
+
+For additional support, please refer to the [main README](../README.md) or the [Deployment Guide](./DeploymentGuide.md) for production deployment instructions.
From 0e4b39f347ed5851e013d3bdb5fac65f1dbb948f Mon Sep 17 00:00:00 2001
From: Venkateswarlu Marthula
Date: Tue, 12 Aug 2025 17:06:55 +0530
Subject: [PATCH 02/13] local readme file update

---
 docs/LocalSetupGuide.md | 458 ++++++++++++++++------------------------
 1 file changed, 182 insertions(+), 276 deletions(-)

diff --git a/docs/LocalSetupGuide.md b/docs/LocalSetupGuide.md
index ec7c8088..f67544c4 100644
--- a/docs/LocalSetupGuide.md
+++ b/docs/LocalSetupGuide.md
@@ -1,307 +1,216 @@
-# Local Setup and Development Guide
+# Guide to Local Development
 
-This guide provides instructions for setting up the Content Processing Solution Accelerator locally for development and testing. The solution consists of three main components that work together to process multi-modal documents.
+## Requirements
 
-## Table of Contents
+• Python 3.11 or higher + PIP
+• Node.js 18+ and npm
+• Azure CLI, and an Azure Subscription
+• Docker Desktop (optional, for containerized development)
+• Visual Studio Code IDE (recommended)
 
-- [Local Setup: Quick Start](#local-setup-quick-start)
-- [Development Environment](#development-environment)
-- [Deploy with Azure Developer CLI](#deploy-with-azure-developer-cli)
-- [Troubleshooting](#troubleshooting)
+## Local Setup
 
-## Local Setup: Quick Start
+**Note for macOS Developers:** If you are using macOS on Apple Silicon (ARM64), you may experience compatibility issues with some Azure services. We recommend testing thoroughly and using alternative approaches if needed.
 
-Follow these steps to set up and run the application locally for development:
+The easiest way to run this accelerator is in a VS Code Dev Container, which will open the project in your local VS Code using the [Dev Containers extension](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers):
 
-### Prerequisites
-
-Ensure you have the following installed:
-
-• **Git** - [Download Git](https://git-scm.com/downloads)
-• **Docker Desktop** - [Download Docker Desktop](https://www.docker.com/products/docker-desktop/)
-• **Azure CLI** - [Install Azure CLI](https://learn.microsoft.com/en-us/cli/azure/install-azure-cli)
-• **Azure Developer CLI (azd)** - [Install Azure Developer CLI](https://learn.microsoft.com/en-us/azure/developer/azure-developer-cli/install-azd)
-• **Python 3.11+** - [Download Python](https://www.python.org/downloads/)
-• **Node.js 18+** - [Download Node.js](https://nodejs.org/)
-
-### 1. Clone the Repository
-
-Navigate to your development folder and clone the repository:
-
-```bash
-git clone https://github.com/microsoft/content-processing-solution-accelerator.git
-cd content-processing-solution-accelerator
-```
-
-### 2. Azure Authentication
-
-Login to Azure and set your subscription:
-
-```bash
-# Login to Azure
-az login
+1. Start Docker Desktop (install it if not already installed)
+2. Open the project: [Open in Dev Containers](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/microsoft/content-processing-solution-accelerator)
+3. In the VS Code window that opens, once the project files show up (this may take several minutes), open a terminal window
 
-# Set your subscription
-az account set --subscription "your-subscription-id"
+## Detailed Development Container Setup Instructions
 
-# Login with Azure Developer CLI
-azd auth login
-```
+The solution contains a [development container](https://code.visualstudio.com/docs/remote/containers) with all the required tooling to develop and deploy the accelerator. To deploy the Content Processing Solution Accelerator using the provided development container you will also need:
 
-### 3. Configure Environment Variables
+• [Visual Studio Code](https://code.visualstudio.com/)
+• [Remote containers extension for Visual Studio Code](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers)
 
-Copy the environment sample files and update them with your Azure resource values:
+If you are running this on Windows, we recommend you clone this repository in [WSL](https://code.visualstudio.com/docs/remote/wsl):
 
 ```bash
-# Copy environment files
-cp .env.sample .env
-cp src/ContentProcessor/.env.sample src/ContentProcessor/.env
-cp src/ContentProcessorAPI/.env.sample src/ContentProcessorAPI/.env
-cp src/ContentProcessorWeb/.env.sample src/ContentProcessorWeb/.env
+git clone https://github.com/microsoft/content-processing-solution-accelerator
 ```
 
-Update the `.env` files with your Azure resource information:
+Open the cloned repository in Visual Studio Code and connect to the development container:
 
 ```bash
-# Root .env file
-AZURE_OPENAI_ENDPOINT=https://your-openai-resource.openai.azure.com/
-AZURE_OPENAI_API_KEY=your-openai-api-key
-AZURE_OPENAI_MODEL=gpt-4o
-AZURE_CONTENT_UNDERSTANDING_ENDPOINT=https://your-content-understanding-endpoint
-AZURE_STORAGE_CONNECTION_STRING=your-storage-connection-string
-AZURE_COSMOS_CONNECTION_STRING=your-cosmos-connection-string
+code .
 ```
 
-### 4. Start the Application
-
-Run the startup script to install dependencies and start all components:
-
-**Windows:**
-```cmd
-start.cmd
+!!! tip
+    Visual Studio Code should recognize the available development container and ask you to open the folder using it. For additional details on connecting to remote containers, please see the [Open an existing folder in a container](https://code.visualstudio.com/docs/remote/containers#_quick-start-open-an-existing-folder-in-a-container) quickstart.
+
+When you start the development container for the first time, the container will be built. This usually takes a few minutes. Please use the development container for all further steps.
+
+The files for the dev container are located in `/.devcontainer/` folder.
+
+## Local Deployment and Debugging
+
+1. **Clone the repository.**
+
+2. **Log into the Azure CLI:**
+   • Check your login status using: `az account show`
+   • If not logged in, use: `az login`
+   • To specify a tenant, use: `az login --tenant `
+
+3. **Create a Resource Group:**
+   • You can create it either through the Azure Portal or the Azure CLI:
+   ```bash
+   az group create --name --location EastUS2
+   ```
+
+4. **Deploy the Bicep template:**
+   • You can use the Bicep extension for VSCode (Right-click the `.bicep` file, then select "Show deployment pane") or use the Azure CLI:
+   ```bash
+   az deployment group create -g -f infra/main.bicep --query 'properties.outputs'
+   ```
+
+   **Note:** You will be prompted for a `principalId`, which is the ObjectID of your user in Entra ID. To find it, use the Azure Portal or run:
+   ```bash
+   az ad signed-in-user show --query id -o tsv
+   ```
+
+   You will also be prompted for locations for Azure OpenAI and Azure AI Content Understanding services. This is to allow separate regions where there may be service quota restrictions.
+
+   **Additional Notes:**
+
+   **Role Assignments in Bicep Deployment:**
+
+   The main.bicep deployment includes the assignment of the appropriate roles to Azure OpenAI and Cosmos services. If you want to modify an existing implementation (for example, to use resources deployed as part of the simple deployment for local debugging), you will need to add your own credentials to access the Cosmos and Azure OpenAI services. You can add these permissions using the following commands:
+
+   ```bash
+   az cosmosdb sql role assignment create --resource-group --account-name --role-definition-name "Cosmos DB Built-in Data Contributor" --principal-id --scope /subscriptions//resourceGroups//providers/Microsoft.DocumentDB/databaseAccounts/
+
+   az role assignment create --assignee --role "Cognitive Services OpenAI User" --scope /subscriptions//resourceGroups//providers/Microsoft.CognitiveServices/accounts/
+   ```
+
+   **Using a Different Database in Cosmos:**
+
+   You can set the solution up to use a different database in Cosmos. For example, you can name it something like `contentprocess-dev`. To do this:
+
+   i. Change the environment variable `AZURE_COSMOS_DATABASE` to the new database name.
+   ii. You will need to create the database in the Cosmos DB account. You can do this from the Data Explorer pane in the portal, click on the drop down labeled "+ New Container" and provide all the necessary details.
+
+5. **Create `.env` files:**
+   • Navigate to the root folder and each component folder (`src/ContentProcessor`, `src/ContentProcessorAPI`, `src/ContentProcessorWeb`) and create `.env` files based on the provided `.env.sample` files.
+
+6. **Fill in the `.env` files:**
+   • Use the output from the deployment or check the Azure Portal under "Deployments" in the resource group.
+
+7. **(Optional) Set up virtual environments:**
+   • If you are using `venv`, create and activate your virtual environment for both the backend components:
+
+   **Content Processor API:**
+   ```bash
+   cd src/ContentProcessorAPI
+   python -m venv venv
+   source venv/bin/activate  # Windows: venv\Scripts\activate
+   ```
+
+   **Content Processor:**
+   ```bash
+   cd src/ContentProcessor
+   python -m venv venv
+   source venv/bin/activate  # Windows: venv\Scripts\activate
+   ```
+
+8. **Install requirements - Backend components:**
+   • In each of the backend folders, open a terminal and run:
+   ```bash
+   pip install -r requirements.txt
+   ```
+
+9. **Install requirements - Frontend:**
+   • In the frontend folder:
+   ```bash
+   cd src/ContentProcessorWeb
+   npm install
+   ```
+
+10. **Run the application:**
+    • From the `src/ContentProcessorAPI` directory:
+    ```bash
+    uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
+    ```
+
+    • In a new terminal from the `src/ContentProcessor` directory:
+    ```bash
+    python src/main.py
+    ```
+
+    • In a new terminal from the `src/ContentProcessorWeb` directory:
+    ```bash
+    npm start
+    ```
+
+11. **Open a browser and navigate to `http://localhost:3000`**
+
+12. **To see Swagger API documentation, you can navigate to `http://localhost:8000/docs`**
+
+## Debugging the Solution Locally
+
+You can debug the API backend running locally with VSCode using the following launch.json entry:
+
+```json
+{
+    "name": "Python Debugger: Content Processor API",
+    "type": "debugpy",
+    "request": "launch",
+    "cwd": "${workspaceFolder}/src/ContentProcessorAPI",
+    "module": "uvicorn",
+    "args": ["app.main:app", "--reload"],
+    "jinja": true
+}
 ```
 
-**Linux/Mac:**
-```bash
-chmod +x start.sh
-./start.sh
+To debug the Content Processor service, add the following launch.json entry:
+
+```json
+{
+    "name": "Python Debugger: Content Processor",
+    "type": "debugpy",
+    "request": "launch",
+    "cwd": "${workspaceFolder}/src/ContentProcessor",
+    "program": "src/main.py",
+    "jinja": true
+}
 ```
 
-Alternatively, you can start each component manually:
+For debugging the React frontend, you can use the browser's developer tools or set up debugging in VS Code with the appropriate extensions.
 
-**Backend API:**
-```bash
-cd src/ContentProcessorAPI
-python -m venv venv
-source venv/bin/activate  # Windows: venv\Scripts\activate
-pip install -r requirements.txt
-uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
-```
+## Alternative: Deploy with Azure Developer CLI
 
-**Content Processor:**
-```bash
-cd src/ContentProcessor
-python -m venv venv
-source venv/bin/activate  # Windows: venv\Scripts\activate
-pip install -r requirements.txt
-python src/main.py
-```
-
-**Web Frontend:**
-```bash
-cd src/ContentProcessorWeb
-npm install
-npm start
-```
-
-### 5. Access the Application
-
-Once all components are running, open your browser and navigate to:
-
-• **Web Interface:** [http://localhost:3000](http://localhost:3000)
-• **API Documentation:** [http://localhost:8000/docs](http://localhost:8000/docs)
-• **API Health Check:** [http://localhost:8000/health](http://localhost:8000/health)
-
-## Development Environment
-
-For advanced development and customization, you can set up each component individually:
-
-### Content Processor API (Backend)
-
-The REST API provides endpoints for file upload, processing management, and schema operations.
-
-```bash
-cd src/ContentProcessorAPI
-
-# Create and activate virtual environment
-python -m venv venv
-source venv/bin/activate  # Windows: venv\Scripts\activate
-
-# Install dependencies
-pip install -r requirements.txt
-
-# Start the API server with hot reload
-uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
-```
-
-### Content Processor (Background Service)
-
-The background processing engine handles document extraction and transformation.
-
-```bash
-cd src/ContentProcessor
-
-# Create and activate virtual environment
-python -m venv venv
-source venv/bin/activate  # Windows: venv\Scripts\activate
-
-# Install dependencies
-pip install -r requirements.txt
-
-# Start the processor
-python src/main.py
-```
-
-### Content Processor Web (Frontend)
-
-The React/TypeScript frontend provides the user interface.
-
-```bash
-cd src/ContentProcessorWeb
-
-# Install dependencies
-npm install
-
-# Start development server
-npm start
-```
-
-### Using Docker for Development
-
-For containerized development, create a `docker-compose.dev.yml` file:
-
-```yaml
-version: '3.8'
-services:
-  content-processor-api:
-    build:
-      context: ./src/ContentProcessorAPI
-      dockerfile: Dockerfile
-    ports:
-      - "8000:8000"
-    environment:
-      - APP_ENV=development
-    volumes:
-      - ./src/ContentProcessorAPI:/app
-    command: uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
-
-  content-processor-web:
-    build:
-      context: ./src/ContentProcessorWeb
-      dockerfile: Dockerfile
-    ports:
-      - "3000:3000"
-    environment:
-      - REACT_APP_API_BASE_URL=http://localhost:8000
-    volumes:
-      - ./src/ContentProcessorWeb:/app
-    command: npm start
-
-  content-processor:
-    build:
-      context: ./src/ContentProcessor
-      dockerfile: Dockerfile
-    environment:
-      - APP_ENV=development
-    volumes:
-      - ./src/ContentProcessor:/app
-```
-
-Run with Docker Compose:
-```bash
-docker-compose -f docker-compose.dev.yml up --build
-```
-
-## Deploy with Azure Developer CLI
-
-Follow these steps to deploy the application to Azure using Azure Developer CLI:
+If you prefer to use Azure Developer CLI for a more automated deployment:
 
 ### Prerequisites
-
 • Ensure you have the [Azure Developer CLI](https://learn.microsoft.com/en-us/azure/developer/azure-developer-cli/install-azd) installed.
-• Ensure you have an Azure subscription with appropriate permissions.
 • Check [Azure OpenAI quota availability](./quota_check.md) before deployment.
 
-### 1. Initialize the Project
-
-Initialize the project for Azure deployment:
-
-```bash
-# Initialize azd project
-azd init
-
-# Select the content-processing template when prompted
-```
-
-### 2. Configure Environment
+### Deployment Steps
 
-Set up your environment variables:
+1. **Initialize the project:**
+   ```bash
+   azd init
+   ```
 
-```bash
-# Set environment name
-azd env set AZURE_ENV_NAME "your-environment-name"
-
-# Set Azure location
-azd env set AZURE_LOCATION "eastus"
-
-# Set OpenAI deployment parameters
-azd env set AZURE_OPENAI_GPT_DEPLOYMENT_CAPACITY "10"
-azd env set AZURE_OPENAI_GPT_MODEL_NAME "gpt-4o"
-```
+2. **Configure environment:**
+   ```bash
+   azd env set AZURE_ENV_NAME "your-environment-name"
+   azd env set AZURE_LOCATION "eastus"
+   azd env set AZURE_OPENAI_GPT_DEPLOYMENT_CAPACITY "10"
+   azd env set AZURE_OPENAI_GPT_MODEL_NAME "gpt-4o"
+   ```
 
-### 3. Deploy Infrastructure and Applications
+3. **Deploy infrastructure and applications:**
+   ```bash
+   azd up
+   ```
 
-Deploy both infrastructure and applications:
-
-```bash
-# Provision Azure resources and deploy applications
-azd up
-```
-
-This command will:
-• Create all required Azure resources
-• Build and deploy the container applications
-• Configure networking and security settings
-
-### 4. Verify Deployment
-
-Once deployment is complete, verify the application is running:
-
-```bash
-# Get deployment information
-azd show
-
-# Open the deployed web application
-azd browse
-```
-
-### 5. Redeploy Application Code
-
-To deploy code changes without reprovisioning infrastructure:
-
-```bash
-# Deploy only application code changes
-azd deploy
-```
-
-### 6. Clean Up Resources
-
-To remove all deployed resources:
-
-```bash
-# Delete all Azure resources
-azd down
-```
+4. **Verify deployment:**
+   ```bash
+   azd show
+   azd browse
+   ```
 
 ## Troubleshooting
 
@@ -327,8 +236,6 @@ npm install
 # Check what's using the port
 netstat -tulpn | grep :8000  # Linux/Mac
 netstat -ano | findstr :8000  # Windows
-
-# Kill the process or change the port
 ```
 
 **Azure Authentication Issues:**
@@ -336,12 +243,11 @@ netstat -ano | findstr :8000 # Windows
 # Re-authenticate
 az logout
 az login
-azd auth login
 ```
 
 **CORS Issues:**
 • Ensure API CORS settings include the web app URL
-• Check browser network tab for CORS errors
+• Check browser network tab for CORS errors 
 • Verify API is running on the expected port
 
 **Environment Variables Not Loading:**
@@ -351,7 +257,7 @@ azd auth login
 
 ### Debug Mode
 
-Enable detailed logging by setting these environment variables:
+Enable detailed logging by setting these environment variables in your `.env` files:
 
 ```bash
 APP_LOGGING_LEVEL=DEBUG
From 8182884ccb6dee547396e0f6ea4c110c78c01e48 Mon Sep 17 00:00:00 2001
From: Venkateswarlu Marthula
Date: Tue, 12 Aug 2025 21:53:10 +0530
Subject: [PATCH 03/13] File changes

---
 docs/LocalSetupGuide.md | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/docs/LocalSetupGuide.md b/docs/LocalSetupGuide.md
index f67544c4..87afbc70 100644
--- a/docs/LocalSetupGuide.md
+++ b/docs/LocalSetupGuide.md
@@ -2,11 +2,11 @@
 
 ## Requirements
 
-• Python 3.11 or higher + PIP
-• Node.js 18+ and npm
-• Azure CLI, and an Azure Subscription
-• Docker Desktop (optional, for containerized development)
-• Visual Studio Code IDE (recommended)
+• **Python 3.11 or higher** + PIP
+• **Node.js 18+** and npm
+• **Azure CLI** and an Azure Subscription
+• **Docker Desktop** (optional, for containerized development)
+• **Visual Studio Code IDE** (recommended)
 
 ## Local Setup
 
@@ -89,6 +89,7 @@ The files for the dev container are located in `/.devcontainer/` folder.
    You can set the solution up to use a different database in Cosmos. For example, you can name it something like `contentprocess-dev`. To do this:
 
    i. Change the environment variable `AZURE_COSMOS_DATABASE` to the new database name.
+
    ii. You will need to create the database in the Cosmos DB account. You can do this from the Data Explorer pane in the portal, click on the drop down labeled "+ New Container" and provide all the necessary details.
 
 5. **Create `.env` files:**
From 2a3017a6a84aa073a8948f4039c5733d6a49a1a1 Mon Sep 17 00:00:00 2001
From: Venkateswarlu Marthula
Date: Tue, 12 Aug 2025 21:58:54 +0530
Subject: [PATCH 04/13] file updates

---
 docs/LocalSetupGuide.md | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/docs/LocalSetupGuide.md b/docs/LocalSetupGuide.md
index 87afbc70..bf2bf55e 100644
--- a/docs/LocalSetupGuide.md
+++ b/docs/LocalSetupGuide.md
@@ -2,11 +2,11 @@
 
 ## Requirements
 
-• **Python 3.11 or higher** + PIP
-• **Node.js 18+** and npm
-• **Azure CLI** and an Azure Subscription
-• **Docker Desktop** (optional, for containerized development)
-• **Visual Studio Code IDE** (recommended)
+- Python 3.11 or higher + PIP
+- Node.js 18+ and npm
+- Azure CLI and an Azure Subscription
+- Docker Desktop (optional, for containerized development)
+- Visual Studio Code IDE (recommended)
 
 ## Local Setup
 
From 7ce95188b5b3a6d6a790757c2c8f9348a93d8c24 Mon Sep 17 00:00:00 2001
From: Venkateswarlu Marthula
Date: Wed, 13 Aug 2025 10:15:14 +0530
Subject: [PATCH 05/13] removed cli changes

---
 docs/LocalSetupGuide.md | 34 ----------------------------------
 1 file changed, 34 deletions(-)

diff --git a/docs/LocalSetupGuide.md b/docs/LocalSetupGuide.md
index bf2bf55e..6b4e7816 100644
--- a/docs/LocalSetupGuide.md
+++ b/docs/LocalSetupGuide.md
@@ -179,40 +179,6 @@ To debug the Content Processor service, add the following launch.json entry:
 
 For debugging the React frontend, you can use the browser's developer tools or set up debugging in VS Code with the appropriate extensions.
 
-## Alternative: Deploy with Azure Developer CLI
-
-If you prefer to use Azure Developer CLI for a more automated deployment:
-
-### Prerequisites
-• Ensure you have the [Azure Developer CLI](https://learn.microsoft.com/en-us/azure/developer/azure-developer-cli/install-azd) installed.
-• Check [Azure OpenAI quota availability](./quota_check.md) before deployment.
-
-### Deployment Steps
-
-1. **Initialize the project:**
-   ```bash
-   azd init
-   ```
-
-2. **Configure environment:**
-   ```bash
-   azd env set AZURE_ENV_NAME "your-environment-name"
-   azd env set AZURE_LOCATION "eastus"
-   azd env set AZURE_OPENAI_GPT_DEPLOYMENT_CAPACITY "10"
-   azd env set AZURE_OPENAI_GPT_MODEL_NAME "gpt-4o"
-   ```
-
-3. **Deploy infrastructure and applications:**
-   ```bash
-   azd up
-   ```
-
-4. **Verify deployment:**
-   ```bash
-   azd show
-   azd browse
-   ```
-
 ## Troubleshooting
 
 ### Common Issues
From ba77923930bd703c7ace1be9fbb79002114e1e8c Mon Sep 17 00:00:00 2001
From: Venkateswarlu Marthula
Date: Tue, 16 Dec 2025 20:29:21 +0530
Subject: [PATCH 06/13] Update LocalSetupGuide with detailed setup instructions for backend and frontend components

---
 docs/LocalSetupGuide.md | 472 ++++++++++++++++++++++++++++++++++++++--
 1 file changed, 448 insertions(+), 24 deletions(-)

diff --git a/docs/LocalSetupGuide.md b/docs/LocalSetupGuide.md
index 6b4e7816..67148dab 100644
--- a/docs/LocalSetupGuide.md
+++ b/docs/LocalSetupGuide.md
@@ -102,51 +102,300 @@ The files for the dev container are located in `/.devcontainer/` folder.
• If you are using `venv`, create and activate your virtual environment for both the backend components: **Content Processor API:** + + PowerShell: + ```powershell + cd src\ContentProcessorAPI + python -m venv .venv + .venv\Scripts\Activate.ps1 + ``` + + Command Prompt: + ```cmd + cd src\ContentProcessorAPI + python -m venv .venv + .venv\Scripts\activate.bat + ``` + + Git Bash / Linux / macOS: ```bash cd src/ContentProcessorAPI - python -m venv venv - source venv/bin/activate # Windows: venv\Scripts\activate + python -m venv .venv + source .venv/bin/activate ``` **Content Processor:** + + PowerShell: + ```powershell + cd src\ContentProcessor + python -m venv .venv + .venv\Scripts\Activate.ps1 + ``` + + Command Prompt: + ```cmd + cd src\ContentProcessor + python -m venv .venv + .venv\Scripts\activate.bat + ``` + + Git Bash / Linux / macOS: ```bash cd src/ContentProcessor - python -m venv venv - source venv/bin/activate # Windows: venv\Scripts\activate + python -m venv .venv + source .venv/bin/activate + ``` + + **Note for PowerShell Users:** If you get an error about scripts being disabled, run: + ```powershell + Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser ``` 8. **Install requirements - Backend components:** - • In each of the backend folders, open a terminal and run: + + **ContentProcessorAPI:** + + Navigate to `src/ContentProcessorAPI` and install dependencies: ```bash + cd src\ContentProcessorAPI pip install -r requirements.txt ``` + + **If you encounter compilation errors** on Windows (cffi, pydantic-core, or cryptography): + + These packages often fail to build from source on Windows. 
Use this workaround to install precompiled wheels: + + ```powershell + # Create temporary requirements without problematic packages + Get-Content requirements.txt | Where-Object { $_ -notmatch "cffi==1.17.1|pydantic==2.11.7|pydantic-core==2.33.2" } | Out-File temp_requirements.txt -Encoding utf8 + + # Install other dependencies first + pip install -r temp_requirements.txt + + # Install problematic packages with newer precompiled versions + pip install cffi==2.0.0 pydantic==2.12.5 pydantic-core==2.41.5 + + # Upgrade typing-extensions if needed + pip install --upgrade "typing-extensions>=4.14.1" "typing-inspection>=0.4.2" + + # Clean up temporary file + Remove-Item temp_requirements.txt + ``` + + **ContentProcessor:** + + Navigate to `src/ContentProcessor` and install dependencies: + ```bash + cd src/ContentProcessor + pip install -r requirements.txt + ``` + + **If you encounter errors**, upgrade problematic packages: + ```powershell + pip install --upgrade cffi cryptography pydantic pydantic-core numpy pandas + ``` + + **Note:** Python 3.11 has better precompiled wheel support. Avoid Python 3.12 as some packages may not be compatible yet. -9. **Install requirements - Frontend:** - • In the frontend folder: +9.
**Configure environment variables:** + + **ContentProcessorAPI:** + + Create a `.env` file in `src/ContentProcessorAPI/app/` directory with the following content: ```bash - cd src/ContentProcessorWeb - npm install + # App Configuration endpoint from your Azure deployment + APP_CONFIG_ENDPOINT=https://.azconfig.io + + # Cosmos DB endpoint from your Azure deployment + AZURE_COSMOS_ENDPOINT=https://.documents.azure.com:443/ + AZURE_COSMOS_DATABASE=contentprocess + + # Local development settings - CRITICAL for local authentication + APP_ENV=dev + APP_AUTH_ENABLED=False + AZURE_IDENTITY_EXCLUDE_MANAGED_IDENTITY_CREDENTIAL=True ``` + + **ContentProcessor:** + + Create a `.env.dev` file (note the `.dev` suffix) in `src/ContentProcessor/src/` directory: + ```bash + # App Configuration endpoint + APP_CONFIG_ENDPOINT=https://.azconfig.io + + # Cosmos DB endpoint + AZURE_COSMOS_ENDPOINT=https://.documents.azure.com:443/ + AZURE_COSMOS_DATABASE=contentprocess + + # Local development settings + APP_ENV=dev + APP_AUTH_ENABLED=False + AZURE_IDENTITY_EXCLUDE_MANAGED_IDENTITY_CREDENTIAL=True + + # Logging settings + APP_LOGGING_LEVEL=INFO + APP_LOGGING_ENABLE=True + ``` + + **ContentProcessorWeb:** + + Update the `.env` file in `src/ContentProcessorWeb/` directory: + ```bash + REACT_APP_API_BASE_URL=http://localhost:8000 + REACT_APP_AUTH_ENABLED=false + REACT_APP_CONSOLE_LOG_ENABLED=true + ``` + + **Important Notes:** + - Replace `` and `` with your actual Azure resource names from deployment + - `APP_ENV=dev` is **REQUIRED** for local development - it enables Azure CLI credential usage instead of Managed Identity + - ContentProcessor requires `.env.dev` (not `.env`) in the `src/` subdirectory + - Get your resource names from Azure Portal or by running: `az resource list -g ` + +10. 
**Assign Azure RBAC roles:** + Before running the application locally, you need proper Azure permissions: + + ```bash + # Get your Azure principal ID (user object ID) + az ad signed-in-user show --query id -o tsv + + # Get your subscription ID + az account show --query id -o tsv + + # Assign App Configuration Data Reader role + az role assignment create --role "App Configuration Data Reader" \ + --assignee \ + --scope /subscriptions//resourceGroups//providers/Microsoft.AppConfiguration/configurationStores/ + + # Assign Cosmos DB Data Contributor role + az role assignment create --role "Cosmos DB Built-in Data Contributor" \ + --assignee \ + --scope /subscriptions//resourceGroups//providers/Microsoft.DocumentDB/databaseAccounts/ + + # Assign Storage Queue Data Contributor role (for full file processing) + az role assignment create --role "Storage Queue Data Contributor" \ + --assignee \ + --scope /subscriptions//resourceGroups//providers/Microsoft.Storage/storageAccounts/ + + # Assign Cognitive Services User role (for Content Understanding) + az role assignment create --role "Cognitive Services User" \ + --assignee \ + --scope /subscriptions//resourceGroups//providers/Microsoft.CognitiveServices/accounts/ + ``` + + **Note:** Azure role assignments can take 5-10 minutes to propagate. If you get "Forbidden" errors when starting the API, wait a few minutes and try again. + +11. **Install requirements - Frontend:** + • Navigate to the frontend folder: + ```bash + cd src\ContentProcessorWeb + ``` + + • Install dependencies with `--legacy-peer-deps` flag (required for @azure/msal-react compatibility): + ```powershell + npm install --legacy-peer-deps + ``` + + • Install additional required FluentUI packages: + ```powershell + npm install @fluentui/react-dialog @fluentui/react-button --legacy-peer-deps + ``` + + **Note:** Always use the `--legacy-peer-deps` flag for npm commands in this project to avoid dependency conflicts. -10. 
**Run the application:** - • From the `src/ContentProcessorAPI` directory: +12. **Configure CORS for local development:** + + The FastAPI backend needs CORS configuration to allow requests from the React frontend during local development. + + Edit `src/ContentProcessorAPI/app/main.py` and add the CORS middleware configuration: + + ```python + from fastapi.middleware.cors import CORSMiddleware + ``` + + Then after the line `app = FastAPI(redirect_slashes=False)`, add: + + ```python + # Configure CORS for local development + app.add_middleware( + CORSMiddleware, + allow_origins=["http://localhost:3000"], # Frontend URL + allow_credentials=True, + allow_methods=["*"], # Allow all HTTP methods + allow_headers=["*"], # Allow all headers + ) + ``` + + **Note:** This CORS configuration is only needed for local development. Azure deployment handles CORS at the infrastructure level. + +13. **Run the application:** +13. **Run the application:** + + Open three separate terminal windows and run each component: + + **Terminal 1 - API (ContentProcessorAPI):** + + PowerShell: + ```powershell + cd src\ContentProcessorAPI + .venv\Scripts\Activate.ps1 + python -m uvicorn app.main:app --reload --port 8000 + ``` + + Command Prompt: + ```cmd + cd src\ContentProcessorAPI + .venv\Scripts\activate.bat + python -m uvicorn app.main:app --reload --port 8000 + ``` + + Git Bash / Linux / macOS: ```bash - uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload + cd src/ContentProcessorAPI + source .venv/bin/activate + python -m uvicorn app.main:app --reload --port 8000 + ``` + + **Terminal 2 - Background Processor (ContentProcessor):** + + PowerShell: + ```powershell + cd src\ContentProcessor + .venv\Scripts\Activate.ps1 + python src/main.py ``` - • In a new terminal from the `src/ContentProcessor` directory: + Command Prompt: + ```cmd + cd src\ContentProcessor + .venv\Scripts\activate.bat + python src/main.py + ``` + + Git Bash / Linux / macOS: ```bash + cd src/ContentProcessor + source 
.venv/bin/activate python src/main.py ``` - • In a new terminal from the `src/ContentProcessorWeb` directory: + **Terminal 3 - Frontend (ContentProcessorWeb):** ```bash + cd src/ContentProcessorWeb npm start ``` + + **Troubleshooting startup:** + - If you get "Forbidden" errors from App Configuration or Cosmos DB, ensure your Azure role assignments have propagated (wait 5-10 minutes after creating them) + - If you see "ManagedIdentityCredential" errors, verify that the `.env` files have `APP_ENV=dev` set + - If the frontend shows "Unable to connect to the server", verify that you added the CORS configuration in `main.py` (step 12) and restart the API + - Storage Queue errors in ContentProcessor are expected if you haven't assigned the Storage Queue Data Contributor role - the processor will keep retrying + - Content Understanding 401 errors are expected if you haven't assigned the Cognitive Services User role -11. **Open a browser and navigate to `http://localhost:3000`** -12. **To see Swagger API documentation, you can navigate to `http://localhost:8000/docs`** +14. **Open a browser and navigate to `http://localhost:3000`** +15.
**To see Swagger API documentation, you can navigate to `http://localhost:8000/docs`** ## Debugging the Solution Locally @@ -184,20 +433,98 @@ For debugging the React frontend, you can use the browser's developer tools or s ### Common Issues **Python Module Not Found:** + +PowerShell: +```powershell +# Ensure virtual environment is activated +.venv\Scripts\Activate.ps1 +pip install -r requirements.txt +``` + +Command Prompt: +```cmd +# Ensure virtual environment is activated +.venv\Scripts\activate.bat +pip install -r requirements.txt +``` + +Git Bash / Linux / macOS: ```bash # Ensure virtual environment is activated -source venv/bin/activate # Windows: venv\Scripts\activate +source .venv/bin/activate pip install -r requirements.txt ``` +**Python Dependency Compilation Errors (Windows):** + +If you see errors like "Microsoft Visual C++ 14.0 is required" or "error: metadata-generation-failed" when installing cffi, pydantic-core, or cryptography: + +```powershell +# Create temporary requirements excluding problematic packages +Get-Content requirements.txt | Where-Object { $_ -notmatch "cffi==1.17.1|pydantic==2.11.7|pydantic-core==2.33.2" } | Out-File temp_requirements.txt -Encoding utf8 + +# Install other dependencies first +pip install -r temp_requirements.txt + +# Install problematic packages with newer precompiled versions +pip install cffi==2.0.0 pydantic==2.12.5 pydantic-core==2.41.5 + +# Upgrade typing-extensions if needed +pip install --upgrade "typing-extensions>=4.14.1" "typing-inspection>=0.4.2" + +# Clean up +Remove-Item temp_requirements.txt +``` + +**Explanation:** Older versions of cffi (1.17.1) and pydantic-core (2.33.2) require compilation from source, which fails on Windows without Visual Studio build tools. Newer versions have precompiled wheels that install without compilation. 
+ +**pydantic_core ImportError:** + +If you see "PyO3 modules compiled for CPython 3.8 or older may only be initialized once" or "ImportError: pydantic_core._pydantic_core": +```powershell +# Uninstall and reinstall with compatible versions +pip uninstall -y pydantic pydantic-core +pip install pydantic==2.12.5 pydantic-core==2.41.5 +pip install --upgrade "typing-extensions>=4.14.1" +``` + +**Explanation:** Version mismatch between pydantic and pydantic-core causes runtime errors. The compatible versions above work reliably together. + +**pandas/numpy Import Errors:** + +If you see "Error importing numpy from its source directory": +```powershell +# Force reinstall all requirements to resolve conflicts +pip install --upgrade --force-reinstall -r requirements.txt +``` + **Node.js Dependencies Issues:** + +PowerShell: +```powershell +# Clear npm cache and reinstall with legacy peer deps +npm cache clean --force +Remove-Item -Recurse -Force node_modules -ErrorAction SilentlyContinue +Remove-Item -Force package-lock.json -ErrorAction SilentlyContinue +npm install --legacy-peer-deps + +# Install missing FluentUI packages if needed +npm install @fluentui/react-dialog @fluentui/react-button --legacy-peer-deps +``` + +Bash / Linux / macOS: ```bash -# Clear npm cache and reinstall +# Clear npm cache and reinstall with legacy peer deps npm cache clean --force rm -rf node_modules package-lock.json -npm install +npm install --legacy-peer-deps + +# Install missing FluentUI packages if needed +npm install @fluentui/react-dialog @fluentui/react-button --legacy-peer-deps ``` +**Explanation:** The `--legacy-peer-deps` flag is required due to peer dependency conflicts with @azure/msal-react. Some FluentUI packages may not be included in the initial install and need to be added separately. 
+ **Port Conflicts:** ```bash # Check what's using the port @@ -206,21 +533,118 @@ netstat -ano | findstr :8000 # Windows ``` **Azure Authentication Issues:** + +If you get "Forbidden" errors when accessing App Configuration or Cosmos DB: +```bash +# Check your current Azure account +az account show + +# Get your principal ID for role assignments +az ad signed-in-user show --query id -o tsv + +# Verify you have the correct role assignments +az role assignment list --assignee $(az ad signed-in-user show --query id -o tsv) --resource-group + +# Refresh your access token +az account get-access-token --resource https://azconfig.io + +# If roles are missing, assign them (replace with your ID from above) +az role assignment create --role "App Configuration Data Reader" \ + --assignee \ + --scope /subscriptions//resourceGroups//providers/Microsoft.AppConfiguration/configurationStores/ + +az role assignment create --role "Cosmos DB Built-in Data Contributor" \ + --assignee \ + --scope /subscriptions//resourceGroups//providers/Microsoft.DocumentDB/databaseAccounts/ + +az role assignment create --role "Storage Queue Data Contributor" \ + --assignee \ + --scope /subscriptions//resourceGroups//providers/Microsoft.Storage/storageAccounts/ + +az role assignment create --role "Cognitive Services User" \ + --assignee \ + --scope /subscriptions//resourceGroups//providers/Microsoft.CognitiveServices/accounts/ +``` + +**Note:** Role assignments can take 5-10 minutes to propagate through Azure AD. If you just assigned roles, wait a few minutes before retrying. + +**Cognitive Services Permission Errors:** + +If you see "401 Client Error: PermissionDenied" for Content Understanding service: +```bash +# Assign Cognitive Services User role +az role assignment create --role "Cognitive Services User" \ + --assignee \ + --scope /subscriptions//resourceGroups//providers/Microsoft.CognitiveServices/accounts/ +``` + +This error occurs when processing documents. 
Wait 5-10 minutes after assigning the role, then restart the ContentProcessor service. + +**ManagedIdentityCredential Errors:** + +If you see "ManagedIdentityCredential authentication unavailable" or "No managed identity endpoint found": +```bash +# Ensure your .env files have these settings: +APP_ENV=dev +AZURE_IDENTITY_EXCLUDE_MANAGED_IDENTITY_CREDENTIAL=True + +# This tells the app to use Azure CLI credentials instead of Managed Identity +``` + +**Locations to check:** +- `src/ContentProcessorAPI/app/.env` +- `src/ContentProcessor/src/.env.dev` (note: must be `.env.dev` in the `src/` subdirectory, not `.env` in root) + +**Explanation:** Managed Identity is used in Azure deployments but doesn't work locally. Setting `APP_ENV=dev` switches to Azure CLI credential authentication. + +**General authentication reset:** ```bash -# Re-authenticate +# Re-authenticate with Azure CLI az logout az login ``` **CORS Issues:** -• Ensure API CORS settings include the web app URL -• Check browser network tab for CORS errors -• Verify API is running on the expected port + +If the frontend loads but shows "Unable to connect to the server" error: + +1. Verify CORS is configured in `src/ContentProcessorAPI/app/main.py`: +```python +from fastapi.middleware.cors import CORSMiddleware + +app = FastAPI(redirect_slashes=False) + +# Configure CORS for local development +app.add_middleware( + CORSMiddleware, + allow_origins=["http://localhost:3000"], + allow_credentials=True, + allow_methods=["*"], + allow_headers=["*"], +) +``` + +2. Restart the API service after adding CORS configuration +3. Check browser console (F12) for CORS errors +4. Verify API is running on port 8000 and frontend on port 3000 + +**Explanation:** CORS (Cross-Origin Resource Sharing) blocks requests between different origins by default. The frontend (localhost:3000) needs explicit permission to call the API (localhost:8000). 
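The origin rule described above can be illustrated with a minimal Python sketch. It uses only the standard library and is not part of the accelerator; the helper names `origin_of` and `same_origin` are hypothetical. It treats an origin as the (scheme, host, port) triple, which is why the frontend on port 3000 and the API on port 8000 count as different origins even though both are on `localhost`.

```python
from urllib.parse import urlsplit

# Default ports per scheme, used when a URL omits the port.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_of(url: str) -> tuple[str, str, int]:
    """Return the (scheme, host, port) triple that identifies a web origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # Fall back to the scheme's default port; -1 marks an unknown scheme.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme, -1)
    return scheme, host, port


def same_origin(a: str, b: str) -> bool:
    """True when both URLs share scheme, host, and port."""
    return origin_of(a) == origin_of(b)
```

Under these assumptions, `same_origin("http://localhost:3000", "http://localhost:8000")` is false, so the browser only lets the frontend read API responses when the API lists `http://localhost:3000` in `allow_origins`.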
+ +**PowerShell Script Execution Policy Error:** + +If you get "cannot be loaded because running scripts is disabled" when activating venv: +```powershell +Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser +``` **Environment Variables Not Loading:** -• Verify `.env` file is in the correct directory +• Verify `.env` file is in the correct directory: + - ContentProcessorAPI: `src/ContentProcessorAPI/app/.env` + - ContentProcessor: `src/ContentProcessor/src/.env.dev` (must be `.env.dev`, not `.env`) + - ContentProcessorWeb: `src/ContentProcessorWeb/.env` • Check file permissions (especially on Linux/macOS) • Ensure no extra spaces in variable assignments +• Restart the service after changing `.env` files ### Debug Mode From ca1f89f620b68bec88f8ddaa91409d941f6d275c Mon Sep 17 00:00:00 2001 From: Venkateswarlu Marthula Date: Tue, 16 Dec 2025 21:30:22 +0530 Subject: [PATCH 07/13] Fix formatting issue in LocalSetupGuide by removing duplicate line --- docs/LocalSetupGuide.md | 1 - 1 file changed, 1 deletion(-) diff --git a/docs/LocalSetupGuide.md b/docs/LocalSetupGuide.md index 67148dab..d46542e6 100644 --- a/docs/LocalSetupGuide.md +++ b/docs/LocalSetupGuide.md @@ -329,7 +329,6 @@ The files for the dev container are located in `/.devcontainer/` folder. **Note:** This CORS configuration is only needed for local development. Azure deployment handles CORS at the infrastructure level. -13. **Run the application:** 13. 
**Run the application:** Open three separate terminal windows and run each component: From 4ea0168105e9c49d1e9cf126ebf815842b1dccf9 Mon Sep 17 00:00:00 2001 From: Venkateswarlu Marthula Date: Wed, 17 Dec 2025 11:19:54 +0530 Subject: [PATCH 08/13] Add Azure prerequisites section to LocalSetupGuide --- docs/LocalSetupGuide.md | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/docs/LocalSetupGuide.md b/docs/LocalSetupGuide.md index d46542e6..94c3c290 100644 --- a/docs/LocalSetupGuide.md +++ b/docs/LocalSetupGuide.md @@ -8,6 +8,17 @@ - Docker Desktop (optional, for containerized development) - Visual Studio Code IDE (recommended) +## Azure Prerequisites + +To run this solution locally, you need the following Azure roles assigned to your user account on the resource group or individual resources: + +- **App Configuration Data Reader** - To read configuration from Azure App Configuration +- **Cosmos DB Built-in Data Contributor** - To read/write data in Cosmos DB +- **Storage Queue Data Contributor** - To process messages from Azure Storage Queue +- **Cognitive Services User** - To use Azure Content Understanding service + +These roles will be assigned in step 10 of the setup process below. + ## Local Setup **Note for macOS Developers:** If you are using macOS on Apple Silicon (ARM64), you may experience compatibility issues with some Azure services. We recommend testing thoroughly and using alternative approaches if needed. 
From 72aa3b85c505bcec2f40c689b773e73320564526 Mon Sep 17 00:00:00 2001 From: Venkateswarlu Marthula Date: Wed, 17 Dec 2025 11:39:30 +0530 Subject: [PATCH 09/13] Add comprehensive Local Development Setup Guide --- docs/LocalSetupGuide_NEW.md | 688 ++++++++++++++++++++++++++++++++++++ 1 file changed, 688 insertions(+) create mode 100644 docs/LocalSetupGuide_NEW.md diff --git a/docs/LocalSetupGuide_NEW.md b/docs/LocalSetupGuide_NEW.md new file mode 100644 index 00000000..26e114fb --- /dev/null +++ b/docs/LocalSetupGuide_NEW.md @@ -0,0 +1,688 @@ +# Local Development Setup Guide + +This guide provides comprehensive instructions for setting up the Content Processing Solution Accelerator for local development across Windows and Linux platforms. + +## Important Setup Notes + +### Multi-Service Architecture + +This application consists of three separate services that run independently: + +1. **ContentProcessorAPI** - REST API server for the frontend +2. **ContentProcessor** - Background processor that handles document processing from Azure Storage Queue +3. 
**ContentProcessorWeb** - React-based user interface + +> ⚠️ **Critical**: Each service must run in its own terminal/console window +> +> - Do NOT close terminals while services are running +> - Open 3 separate terminal windows for local development +> - Each service will occupy its terminal and show live logs +> +> **Terminal Organization:** +> - Terminal 1: ContentProcessorAPI - HTTP server on port 8000 +> - Terminal 2: ContentProcessor - Runs continuously, polls Azure Storage Queue +> - Terminal 3: ContentProcessorWeb - Development server on port 3000 + +### Path Conventions + +All paths in this guide are relative to the repository root directory: + +``` +content-processing-solution-accelerator/ ← Repository root (start here) +├── src/ +│ ├── ContentProcessorAPI/ +│ │ ├── .venv/ ← Virtual environment +│ │ └── app/ +│ │ ├── main.py ← API entry point +│ │ └── .env ← API config file +│ ├── ContentProcessor/ +│ │ ├── .venv/ ← Virtual environment +│ │ └── src/ +│ │ ├── main.py ← Processor entry point +│ │ └── .env.dev ← Processor config file +│ └── ContentProcessorWeb/ +│ ├── node_modules/ +│ └── .env ← Frontend config file +└── docs/ ← Documentation (you are here) +``` + +Before starting any step, ensure you are in the repository root directory: + +```powershell +# Verify you're in the correct location +pwd # Linux/macOS - should show: .../content-processing-solution-accelerator +Get-Location # Windows PowerShell - should show: ...\content-processing-solution-accelerator + +# If not, navigate to repository root +cd path/to/content-processing-solution-accelerator +``` + +### Configuration Files + +This project uses separate `.env` files in each service directory with different configuration requirements: + +- **ContentProcessorAPI**: `src/ContentProcessorAPI/app/.env` - Azure App Configuration URL, Cosmos DB endpoint +- **ContentProcessor**: `src/ContentProcessor/src/.env.dev` - Azure App Configuration URL, Cosmos DB endpoint (note `.dev` suffix) +- 
**ContentProcessorWeb**: `src/ContentProcessorWeb/.env` - API base URL, authentication settings + +When copying `.env` samples, always navigate to the specific service directory first. + +## Step 1: Prerequisites - Install Required Tools + +### Windows Development + +```powershell +# Install Python 3.11+ and Git +winget install Python.Python.3.11 +winget install Git.Git + +# Install Node.js for frontend +winget install OpenJS.NodeJS.LTS + +# Verify installations +python --version # Should show Python 3.11.x +node --version # Should show v18.x or higher +npm --version +``` + +### Linux Development + +#### Ubuntu/Debian + +```bash +# Install prerequisites +sudo apt update && sudo apt install python3.11 python3.11-venv python3-pip git curl nodejs npm -y + +# Verify installations +python3.11 --version +node --version +npm --version +``` + +#### RHEL/CentOS/Fedora + +```bash +# Install prerequisites +sudo dnf install python3.11 python3.11-devel git curl gcc nodejs npm -y + +# Verify installations +python3.11 --version +node --version +npm --version +``` + +### Clone the Repository + +```bash +git clone https://github.com/microsoft/content-processing-solution-accelerator.git +cd content-processing-solution-accelerator +``` + +## Step 2: Azure Authentication Setup + +Before configuring services, authenticate with Azure: + +```bash +# Login to Azure CLI +az login + +# Set your subscription +az account set --subscription "your-subscription-id" + +# Verify authentication +az account show +``` + +### Get Azure Resource Information + +After deploying Azure resources (using `azd up` or Bicep template), gather the following information: + +```bash +# List resources in your resource group +az resource list -g -o table + +# Get App Configuration endpoint +az appconfig show -n -g --query endpoint -o tsv + +# Get Cosmos DB endpoint +az cosmosdb show -n -g --query documentEndpoint -o tsv +``` + +Example resource names from deployment: +- App Configuration: 
`appcs-{suffix}.azconfig.io` +- Cosmos DB: `cosmos-{suffix}.documents.azure.com` +- Storage Account: `st{suffix}.queue.core.windows.net` +- Content Understanding: `aicu-{suffix}.cognitiveservices.azure.com` + +### Required Azure RBAC Permissions + +To run the application locally, your Azure account needs the following role assignments on the deployed resources: + +#### Get Your Principal ID + +```bash +# Get your principal ID for role assignments +PRINCIPAL_ID=$(az ad signed-in-user show --query id -o tsv) +echo $PRINCIPAL_ID + +# Get your subscription ID +SUBSCRIPTION_ID=$(az account show --query id -o tsv) +echo $SUBSCRIPTION_ID +``` + +#### Assign Required Roles + +```bash +# 1. App Configuration Data Reader +az role assignment create \ + --role "App Configuration Data Reader" \ + --assignee $PRINCIPAL_ID \ + --scope "/subscriptions/$SUBSCRIPTION_ID/resourceGroups//providers/Microsoft.AppConfiguration/configurationStores/" + +# 2. Cosmos DB Built-in Data Contributor +az role assignment create \ + --role "Cosmos DB Built-in Data Contributor" \ + --assignee $PRINCIPAL_ID \ + --scope "/subscriptions/$SUBSCRIPTION_ID/resourceGroups//providers/Microsoft.DocumentDB/databaseAccounts/" + +# 3. Storage Queue Data Contributor +az role assignment create \ + --role "Storage Queue Data Contributor" \ + --assignee $PRINCIPAL_ID \ + --scope "/subscriptions/$SUBSCRIPTION_ID/resourceGroups//providers/Microsoft.Storage/storageAccounts/" + +# 4. Cognitive Services User +az role assignment create \ + --role "Cognitive Services User" \ + --assignee $PRINCIPAL_ID \ + --scope "/subscriptions/$SUBSCRIPTION_ID/resourceGroups//providers/Microsoft.CognitiveServices/accounts/" +``` + +> **Note:** RBAC permission changes can take 5-10 minutes to propagate. If you encounter "Forbidden" errors after assigning roles, wait a few minutes and try again. 
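Each `--scope` value above follows the same resource ID shape: subscription, resource group, provider namespace, resource type, and resource name. As a rough sketch (the function name `resource_scope` and all sample values are hypothetical, not taken from a real deployment), the string can be assembled and sanity-checked in Python before it is pasted into an `az role assignment create` command:

```python
def resource_scope(subscription_id: str, resource_group: str,
                   provider: str, resource_type: str, name: str) -> str:
    """Build an Azure resource ID suitable for use as a --scope value."""
    segments = {
        "subscription_id": subscription_id,
        "resource_group": resource_group,
        "provider": provider,
        "resource_type": resource_type,
        "name": name,
    }
    # An empty segment would leave "//" in the path, which is not a valid resource ID.
    for label, value in segments.items():
        if not value:
            raise ValueError(f"{label} must not be empty")
    return (
        f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
        f"/providers/{provider}/{resource_type}/{name}"
    )
```

For example, `resource_scope("0000", "my-rg", "Microsoft.AppConfiguration", "configurationStores", "appcs-demo")` yields a scope for a hypothetical App Configuration store named `appcs-demo`.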
+ +## Step 3: ContentProcessorAPI Setup & Run Instructions + +> 📋 **Terminal Reminder**: Open a dedicated terminal window (Terminal 1) for the ContentProcessorAPI service. All commands in this section assume you start from the repository root directory. + +The ContentProcessorAPI provides REST endpoints for the frontend and handles API requests. + +### 3.1. Navigate to API Directory + +```bash +# From repository root +cd src/ContentProcessorAPI +``` + +### 3.2. Create Virtual Environment + +```powershell +# Create virtual environment +python -m venv .venv + +# Activate virtual environment +.venv\Scripts\Activate.ps1 # Windows PowerShell +# or +source .venv/bin/activate # Linux/macOS +``` + +**Note for PowerShell Users:** If you get an error about scripts being disabled, run: +```powershell +Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser +``` + +### 3.3. Install Dependencies + +```bash +pip install -r requirements.txt +``` + +**If you encounter compilation errors** on Windows (cffi, pydantic-core, or cryptography): + +These packages often fail to build from source on Windows. Use this workaround to install precompiled wheels: + +```powershell +# Create temporary requirements without problematic packages +Get-Content requirements.txt | Where-Object { $_ -notmatch "cffi==1.17.1|pydantic==2.11.7|pydantic-core==2.33.2" } | Out-File temp_requirements.txt -Encoding utf8 + +# Install other dependencies first +pip install -r temp_requirements.txt + +# Install problematic packages with newer precompiled versions +pip install cffi==2.0.0 pydantic==2.12.5 pydantic-core==2.41.5 + +# Upgrade typing-extensions if needed +pip install --upgrade "typing-extensions>=4.14.1" "typing-inspection>=0.4.2" + +# Clean up temporary file +Remove-Item temp_requirements.txt +``` + +### 3.4. 
Configure Environment Variables + +Create a `.env` file in the `src/ContentProcessorAPI/app/` directory: + +```bash +cd app + +# Create .env file +New-Item .env # Windows PowerShell +# or +touch .env # Linux/macOS +``` + +Add the following to the `.env` file: + +```bash +# App Configuration endpoint from your Azure deployment +APP_CONFIG_ENDPOINT=https://.azconfig.io + +# Cosmos DB endpoint from your Azure deployment +AZURE_COSMOS_ENDPOINT=https://.documents.azure.com:443/ +AZURE_COSMOS_DATABASE=contentprocess + +# Local development settings - CRITICAL for local authentication +APP_ENV=dev +APP_AUTH_ENABLED=False +AZURE_IDENTITY_EXCLUDE_MANAGED_IDENTITY_CREDENTIAL=True +``` + +> ⚠️ **Important**: +> - Replace `` and `` with your actual Azure resource names +> - `APP_ENV=dev` is **REQUIRED** for local development - it enables Azure CLI credential usage instead of Managed Identity +> - Get your resource names from the Azure Portal or by running: `az resource list -g ` + +### 3.5. Configure CORS for Local Development + +Edit `src/ContentProcessorAPI/app/main.py` and add the CORS middleware configuration. + +Add the import at the top: + +```python +from fastapi.middleware.cors import CORSMiddleware +``` + +Then after the line `app = FastAPI(redirect_slashes=False)`, add: + +```python +# Configure CORS for local development +app.add_middleware( + CORSMiddleware, + allow_origins=["http://localhost:3000"], # Frontend URL + allow_credentials=True, + allow_methods=["*"], # Allow all HTTP methods + allow_headers=["*"], # Allow all headers +) +``` + +> **Note:** This CORS configuration is only needed for local development. Azure deployment handles CORS at the infrastructure level. + +### 3.6. Run the API + +```bash +# Make sure you're in the ContentProcessorAPI directory with activated venv +cd .. 
# Go back to ContentProcessorAPI root if in app/ + +# Run with uvicorn +python -m uvicorn app.main:app --reload --port 8000 +``` + +The ContentProcessorAPI will start at: +- API: `http://localhost:8000` +- API Documentation: `http://localhost:8000/docs` + +**Keep this terminal open** - the API server will continue running and show request logs. + +## Step 4: ContentProcessor Setup & Run Instructions + +> 📋 **Terminal Reminder**: Open a second dedicated terminal window (Terminal 2) for the ContentProcessor. Keep Terminal 1 (API) running. All commands assume you start from the repository root directory. + +The ContentProcessor handles background document processing from Azure Storage Queue. + +### 4.1. Navigate to Processor Directory + +```bash +# From repository root +cd src/ContentProcessor +``` + +### 4.2. Create Virtual Environment + +```powershell +# Create virtual environment +python -m venv .venv + +# Activate virtual environment +.venv\Scripts\Activate.ps1 # Windows PowerShell +# or +source .venv/bin/activate # Linux/macOS +``` + +### 4.3. Install Dependencies + +```bash +pip install -r requirements.txt +``` + +**If you encounter errors**, upgrade problematic packages: + +```powershell +pip install --upgrade cffi cryptography pydantic pydantic-core numpy pandas +``` + +### 4.4. 
Configure Environment Variables + +Create a `.env.dev` file (note the `.dev` suffix) in the `src/ContentProcessor/src/` directory: + +```bash +cd src + +# Create .env.dev file +New-Item .env.dev # Windows PowerShell +# or +touch .env.dev # Linux/macOS +``` + +Add the following to the `.env.dev` file: + +```bash +# App Configuration endpoint +APP_CONFIG_ENDPOINT=https://.azconfig.io + +# Cosmos DB endpoint +AZURE_COSMOS_ENDPOINT=https://.documents.azure.com:443/ +AZURE_COSMOS_DATABASE=contentprocess + +# Local development settings +APP_ENV=dev +APP_AUTH_ENABLED=False +AZURE_IDENTITY_EXCLUDE_MANAGED_IDENTITY_CREDENTIAL=True + +# Logging settings +APP_LOGGING_LEVEL=INFO +APP_LOGGING_ENABLE=True +``` + +> ⚠️ **Important**: The `.env.dev` file must be located in `src/ContentProcessor/src/` directory, not in `src/ContentProcessor/` root. The application looks for the `.env.dev` file in the same directory as `main.py`. + +### 4.5. Run the Processor + +```bash +# Make sure you're in the src directory +python main.py +``` + +The ContentProcessor will start and begin polling the Azure Storage Queue for messages. + +**Expected behavior:** +- You may see Storage Queue authorization errors if roles haven't propagated (wait 5-10 minutes) +- The processor will show continuous polling activity +- Document processing will begin when files are uploaded via the frontend + +**Keep this terminal open** - the processor will continue running and show processing logs. + +## Step 5: ContentProcessorWeb Setup & Run Instructions + +> 📋 **Terminal Reminder**: Open a third dedicated terminal window (Terminal 3) for the ContentProcessorWeb. Keep Terminals 1 (API) and 2 (Processor) running. All commands assume you start from the repository root directory. + +The ContentProcessorWeb provides the React-based user interface. + +### 5.1. Navigate to Frontend Directory + +```bash +# From repository root +cd src/ContentProcessorWeb +``` + +### 5.2. 
Install Dependencies + +```bash +# Install dependencies with legacy peer deps flag +npm install --legacy-peer-deps + +# Install additional required FluentUI packages +npm install @fluentui/react-dialog @fluentui/react-button --legacy-peer-deps +``` + +> **Note:** Always use the `--legacy-peer-deps` flag for npm commands in this project to avoid dependency conflicts with @azure/msal-react. + +### 5.3. Configure Environment Variables + +Update the `.env` file in the `src/ContentProcessorWeb/` directory: + +```bash +REACT_APP_API_BASE_URL=http://localhost:8000 +REACT_APP_AUTH_ENABLED=false +REACT_APP_CONSOLE_LOG_ENABLED=true +``` + +### 5.4. Start Development Server + +```bash +npm start +``` + +The ContentProcessorWeb will start at: `http://localhost:3000` + +**Keep this terminal open** - the React development server will continue running with hot reload. + +## Step 6: Verify All Services Are Running + +Before using the application, confirm all three services are running in separate terminals: + +### Terminal Status Checklist + +| Terminal | Service | Command | Expected Output | URL | +|----------|---------|---------|-----------------|-----| +| Terminal 1 | ContentProcessorAPI | `python -m uvicorn app.main:app --reload --port 8000` | `Application startup complete` | http://localhost:8000 | +| Terminal 2 | ContentProcessor | `python main.py` | Polling messages, no fatal errors | N/A | +| Terminal 3 | ContentProcessorWeb | `npm start` | `Compiled successfully!` | http://localhost:3000 | + +### Quick Verification + +1. **Check Backend API**: + ```bash + # In a new terminal (Terminal 4) + curl http://localhost:8000/health + # Expected: {"message":"I'm alive!"} + ``` + +2. **Check Frontend**: + - Open browser to http://localhost:3000 + - Should see the Content Processing UI + - No "Unable to connect to the server" errors + +3. 
**Check Processor**: + - Look at Terminal 2 output + - Should see processing activity or queue polling + - No authorization errors (if roles have propagated) + +## Step 7: Next Steps + +Once all services are running (as confirmed in Step 6), you can: + +1. **Access the Application**: Open `http://localhost:3000` in your browser to explore the frontend UI +2. **Upload Documents**: Use the UI to upload documents for processing +3. **View API Documentation**: Navigate to `http://localhost:8000/docs` to explore API endpoints +4. **Check Processing Status**: Monitor Terminal 2 for document processing logs + +## Troubleshooting + +### Common Issues + +#### Python Compilation Errors (Windows) + +If you see errors like "Microsoft Visual C++ 14.0 is required" or "error: metadata-generation-failed" when installing cffi, pydantic-core, or cryptography: + +```powershell +# Create temporary requirements excluding problematic packages +Get-Content requirements.txt | Where-Object { $_ -notmatch "cffi==1.17.1|pydantic==2.11.7|pydantic-core==2.33.2" } | Out-File temp_requirements.txt -Encoding utf8 + +# Install other dependencies first +pip install -r temp_requirements.txt + +# Install problematic packages with newer precompiled versions +pip install cffi==2.0.0 pydantic==2.12.5 pydantic-core==2.41.5 + +# Upgrade typing-extensions if needed +pip install --upgrade "typing-extensions>=4.14.1" "typing-inspection>=0.4.2" + +# Clean up +Remove-Item temp_requirements.txt +``` + +**Explanation:** Older versions of cffi (1.17.1) and pydantic-core (2.33.2) require compilation from source, which fails on Windows without Visual Studio build tools. Newer versions have precompiled wheels that install without compilation. 
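
To see whether an environment still carries the older, source-only pins, a short check like the following may help. It is a minimal sketch using only the Python standard library. The `installed_versions` helper is a hypothetical name, not part of this repository, and the package list simply mirrors the packages named above.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package: version or None} for each distribution name."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return versions


if __name__ == "__main__":
    # Package names taken from the troubleshooting text above.
    for name, version in installed_versions(["cffi", "pydantic", "pydantic-core"]).items():
        print(f"{name}: {version or 'not installed'}")
```

Run it inside the activated virtual environment of the service you are troubleshooting, since each service has its own `.venv`.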
+ +#### pydantic_core ImportError + +If you see "PyO3 modules compiled for CPython 3.8 or older may only be initialized once" or "ImportError: pydantic_core._pydantic_core": + +```powershell +# Uninstall and reinstall with compatible versions +pip uninstall -y pydantic pydantic-core +pip install pydantic==2.12.5 pydantic-core==2.41.5 +pip install --upgrade "typing-extensions>=4.14.1" +``` + +**Explanation:** Version mismatch between pydantic and pydantic-core causes runtime errors. The compatible versions above work reliably together. + +#### Node.js Dependencies Issues + +```powershell +# Clear npm cache and reinstall with legacy peer deps +npm cache clean --force +Remove-Item -Recurse -Force node_modules -ErrorAction SilentlyContinue +Remove-Item -Force package-lock.json -ErrorAction SilentlyContinue +npm install --legacy-peer-deps + +# Install missing FluentUI packages if needed +npm install @fluentui/react-dialog @fluentui/react-button --legacy-peer-deps +``` + +**Explanation:** The `--legacy-peer-deps` flag is required due to peer dependency conflicts with @azure/msal-react. Some FluentUI packages may not be included in the initial install and need to be added separately. + +#### Azure Authentication Issues + +If you get "Forbidden" errors when accessing App Configuration or Cosmos DB: + +```bash +# Check your current Azure account +az account show + +# Get your principal ID for role assignments +az ad signed-in-user show --query id -o tsv + +# Verify you have the correct role assignments +az role assignment list --assignee $(az ad signed-in-user show --query id -o tsv) --resource-group + +# Refresh your access token +az account get-access-token --resource https://azconfig.io +``` + +If roles are missing, assign them as shown in Step 2. + +> **Note:** Role assignments can take 5-10 minutes to propagate through Azure AD. If you just assigned roles, wait a few minutes before retrying. 
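
Because of the propagation delay, the first attempt after a role assignment is often unreliable, and a small retry loop can stand in for re-running commands by hand. The sketch below is illustrative only: `retry_until_ok` and its parameters are hypothetical names, and `check` is whatever probe you supply (for example, a function that starts a request against App Configuration and returns `True` on success).

```python
import time


def retry_until_ok(check, attempts=10, delay_seconds=60.0, sleep=time.sleep):
    """Call check() until it returns True or the attempts run out.

    Returns the 1-based number of the attempt that succeeded, or None.
    """
    for attempt in range(1, attempts + 1):
        try:
            if check():
                return attempt
        except Exception:
            pass  # treat any error (e.g. "Forbidden") as "not ready yet"
        if attempt < attempts:
            sleep(delay_seconds)  # wait before the next try
    return None
```

With the defaults, the loop covers roughly nine minutes of waiting, which matches the 5-10 minute window mentioned above. The `sleep` parameter exists only so the delay can be replaced in tests.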
+ +#### Cognitive Services Permission Errors + +If you see "401 Client Error: PermissionDenied" for Content Understanding service: + +```bash +# Assign Cognitive Services User role +az role assignment create --role "Cognitive Services User" \ + --assignee \ + --scope /subscriptions//resourceGroups//providers/Microsoft.CognitiveServices/accounts/ +``` + +This error occurs when processing documents. Wait 5-10 minutes after assigning the role, then restart the ContentProcessor service. + +#### ManagedIdentityCredential Errors + +If you see "ManagedIdentityCredential authentication unavailable" or "No managed identity endpoint found": + +```bash +# Ensure your .env files have these settings: +APP_ENV=dev +AZURE_IDENTITY_EXCLUDE_MANAGED_IDENTITY_CREDENTIAL=True +``` + +**Locations to check:** +- `src/ContentProcessorAPI/app/.env` +- `src/ContentProcessor/src/.env.dev` (note: must be `.env.dev` in the `src/` subdirectory, not `.env` in root) + +**Explanation:** Managed Identity is used in Azure deployments but doesn't work locally. Setting `APP_ENV=dev` switches to Azure CLI credential authentication. + +#### CORS Issues + +If the frontend loads but shows "Unable to connect to the server" error: + +1. Verify CORS is configured in `src/ContentProcessorAPI/app/main.py`: + ```python + from fastapi.middleware.cors import CORSMiddleware + + app = FastAPI(redirect_slashes=False) + + # Configure CORS for local development + app.add_middleware( + CORSMiddleware, + allow_origins=["http://localhost:3000"], + allow_credentials=True, + allow_methods=["*"], + allow_headers=["*"], + ) + ``` + +2. Restart the API service (Terminal 1) after adding CORS configuration +3. Check browser console (F12) for CORS errors +4. Verify API is running on port 8000 and frontend on port 3000 + +**Explanation:** CORS (Cross-Origin Resource Sharing) blocks requests between different origins by default. The frontend (localhost:3000) needs explicit permission to call the API (localhost:8000). 
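
The browser is not the only way to check this. A CORS preflight request can also be sent from a script, which separates "API is down" from "API is up but rejects the origin". The following is a minimal sketch using only the standard library. The helper names `origin_allowed` and `preflight_headers` are hypothetical, and the example assumes the API answers on the `/health` path shown in Step 6.

```python
import urllib.error
import urllib.request


def origin_allowed(headers, origin):
    """Return True if the (lower-cased) response headers permit the origin."""
    allowed = headers.get("access-control-allow-origin")
    return allowed in (origin, "*")


def preflight_headers(url, origin, method="GET"):
    """Send a CORS preflight (OPTIONS) request and return lower-cased headers."""
    request = urllib.request.Request(
        url,
        method="OPTIONS",
        headers={"Origin": origin, "Access-Control-Request-Method": method},
    )
    try:
        with urllib.request.urlopen(request, timeout=5) as response:
            items = response.headers.items()
    except urllib.error.HTTPError as error:
        # A rejected preflight still carries headers worth inspecting.
        items = error.headers.items()
    return {key.lower(): value for key, value in items}


# Example (requires the API from Terminal 1 to be running):
# headers = preflight_headers("http://localhost:8000/health", "http://localhost:3000")
# print(origin_allowed(headers, "http://localhost:3000"))
```

If the API is not running at all, `urlopen` raises `urllib.error.URLError` instead, which points to a startup problem rather than a CORS problem.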
+ +#### Environment Variables Not Loading + +- Verify the `.env` file is in the correct directory: + - ContentProcessorAPI: `src/ContentProcessorAPI/app/.env` + - ContentProcessor: `src/ContentProcessor/src/.env.dev` (must be `.env.dev`, not `.env`) + - ContentProcessorWeb: `src/ContentProcessorWeb/.env` +- Check file permissions (especially on Linux/macOS) +- Ensure there are no extra spaces around `=` in variable assignments +- Restart the service after changing `.env` files + +#### PowerShell Script Execution Policy Error + +If you get "cannot be loaded because running scripts is disabled" when activating the virtual environment: + +```powershell +Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser +``` + +#### Port Conflicts + +```bash +# Check what's using the port +netstat -ano | findstr :8000 # Windows +netstat -tulpn | grep :8000 # Linux +lsof -i :8000 # macOS (also works on Linux) + +# Kill the process using the port if needed +# Windows: taskkill /PID <PID> /F +# Linux/macOS: kill -9 <PID> +``` + +### Debug Mode + +Enable detailed logging by setting these environment variables in your `.env` files: + +```bash +APP_LOGGING_LEVEL=DEBUG +APP_LOGGING_ENABLE=True +``` + +## Related Documentation + +- [Deployment Guide](./DeploymentGuide.md) - Production deployment instructions +- [Technical Architecture](./TechnicalArchitecture.md) - System architecture overview +- [API Documentation](./API.md) - API endpoint details +- [README](../README.md) - Project overview and getting started + +--- + +For additional support, please submit issues to the [GitHub repository](https://github.com/microsoft/content-processing-solution-accelerator/issues). 
From 4bfdaea290eaac807d83746d4277b78d806cb836 Mon Sep 17 00:00:00 2001 From: Venkateswarlu Marthula Date: Wed, 17 Dec 2025 16:51:42 +0530 Subject: [PATCH 10/13] Remove outdated Local Development Setup Guide document --- docs/LocalSetupGuide.md | 1039 ++++++++++++++++++----------------- docs/LocalSetupGuide_NEW.md | 688 ----------------------- 2 files changed, 525 insertions(+), 1202 deletions(-) delete mode 100644 docs/LocalSetupGuide_NEW.md diff --git a/docs/LocalSetupGuide.md b/docs/LocalSetupGuide.md index 94c3c290..26e114fb 100644 --- a/docs/LocalSetupGuide.md +++ b/docs/LocalSetupGuide.md @@ -1,471 +1,515 @@ -# Guide to Local Development +# Local Development Setup Guide -## Requirements +This guide provides comprehensive instructions for setting up the Content Processing Solution Accelerator for local development across Windows and Linux platforms. -- Python 3.11 or higher + PIP -- Node.js 18+ and npm -- Azure CLI and an Azure Subscription -- Docker Desktop (optional, for containerized development) -- Visual Studio Code IDE (recommended) +## Important Setup Notes -## Azure Prerequisites +### Multi-Service Architecture -To run this solution locally, you need the following Azure roles assigned to your user account on the resource group or individual resources: +This application consists of three separate services that run independently: -- **App Configuration Data Reader** - To read configuration from Azure App Configuration -- **Cosmos DB Built-in Data Contributor** - To read/write data in Cosmos DB -- **Storage Queue Data Contributor** - To process messages from Azure Storage Queue -- **Cognitive Services User** - To use Azure Content Understanding service +1. **ContentProcessorAPI** - REST API server for the frontend +2. **ContentProcessor** - Background processor that handles document processing from Azure Storage Queue +3. **ContentProcessorWeb** - React-based user interface -These roles will be assigned in step 10 of the setup process below. 
+> ⚠️ **Critical**: Each service must run in its own terminal/console window +> +> - Do NOT close terminals while services are running +> - Open 3 separate terminal windows for local development +> - Each service will occupy its terminal and show live logs +> +> **Terminal Organization:** +> - Terminal 1: ContentProcessorAPI - HTTP server on port 8000 +> - Terminal 2: ContentProcessor - Runs continuously, polls Azure Storage Queue +> - Terminal 3: ContentProcessorWeb - Development server on port 3000 -## Local Setup +### Path Conventions -**Note for macOS Developers:** If you are using macOS on Apple Silicon (ARM64), you may experience compatibility issues with some Azure services. We recommend testing thoroughly and using alternative approaches if needed. +All paths in this guide are relative to the repository root directory: -The easiest way to run this accelerator is in a VS Code Dev Container, which will open the project in your local VS Code using the [Dev Containers extension](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers): +``` +content-processing-solution-accelerator/ ← Repository root (start here) +├── src/ +│ ├── ContentProcessorAPI/ +│ │ ├── .venv/ ← Virtual environment +│ │ └── app/ +│ │ ├── main.py ← API entry point +│ │ └── .env ← API config file +│ ├── ContentProcessor/ +│ │ ├── .venv/ ← Virtual environment +│ │ └── src/ +│ │ ├── main.py ← Processor entry point +│ │ └── .env.dev ← Processor config file +│ └── ContentProcessorWeb/ +│ ├── node_modules/ +│ └── .env ← Frontend config file +└── docs/ ← Documentation (you are here) +``` + +Before starting any step, ensure you are in the repository root directory: + +```powershell +# Verify you're in the correct location +pwd # Linux/macOS - should show: .../content-processing-solution-accelerator +Get-Location # Windows PowerShell - should show: ...\content-processing-solution-accelerator -1. Start Docker Desktop (install it if not already installed) -2. 
Open the project: [Open in Dev Containers](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/microsoft/content-processing-solution-accelerator) -3. In the VS Code window that opens, once the project files show up (this may take several minutes), open a terminal window +# If not, navigate to repository root +cd path/to/content-processing-solution-accelerator +``` + +### Configuration Files + +This project uses separate `.env` files in each service directory with different configuration requirements: -## Detailed Development Container Setup Instructions +- **ContentProcessorAPI**: `src/ContentProcessorAPI/app/.env` - Azure App Configuration URL, Cosmos DB endpoint +- **ContentProcessor**: `src/ContentProcessor/src/.env.dev` - Azure App Configuration URL, Cosmos DB endpoint (note `.dev` suffix) +- **ContentProcessorWeb**: `src/ContentProcessorWeb/.env` - API base URL, authentication settings -The solution contains a [development container](https://code.visualstudio.com/docs/remote/containers) with all the required tooling to develop and deploy the accelerator. To deploy the Content Processing Solution Accelerator using the provided development container you will also need: +When copying `.env` samples, always navigate to the specific service directory first. 
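
Since each service reads its configuration from a different location, it can be worth confirming that all three files exist before starting anything. The following is a minimal sketch, not part of the repository: `missing_env_files` and `EXPECTED_ENV_FILES` are hypothetical names, and the paths are the ones listed above.

```python
from pathlib import Path

# Paths taken from the guide, relative to the repository root.
EXPECTED_ENV_FILES = {
    "ContentProcessorAPI": "src/ContentProcessorAPI/app/.env",
    "ContentProcessor": "src/ContentProcessor/src/.env.dev",
    "ContentProcessorWeb": "src/ContentProcessorWeb/.env",
}


def missing_env_files(repo_root="."):
    """Return {service: path} for each expected config file that does not exist."""
    root = Path(repo_root)
    return {
        service: relative
        for service, relative in EXPECTED_ENV_FILES.items()
        if not (root / relative).is_file()
    }
```

Calling `missing_env_files()` from the repository root returns an empty dictionary once all three files are in place.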
-• [Visual Studio Code](https://code.visualstudio.com/) -• [Remote containers extension for Visual Studio Code](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers) +## Step 1: Prerequisites - Install Required Tools -If you are running this on Windows, we recommend you clone this repository in [WSL](https://code.visualstudio.com/docs/remote/wsl): +### Windows Development + +```powershell +# Install Python 3.11+ and Git +winget install Python.Python.3.11 +winget install Git.Git + +# Install Node.js for frontend +winget install OpenJS.NodeJS.LTS + +# Verify installations +python --version # Should show Python 3.11.x +node --version # Should show v18.x or higher +npm --version +``` + +### Linux Development + +#### Ubuntu/Debian ```bash -git clone https://github.com/microsoft/content-processing-solution-accelerator +# Install prerequisites +sudo apt update && sudo apt install python3.11 python3.11-venv python3-pip git curl nodejs npm -y + +# Verify installations +python3.11 --version +node --version +npm --version ``` -Open the cloned repository in Visual Studio Code and connect to the development container: +#### RHEL/CentOS/Fedora ```bash -code . +# Install prerequisites +sudo dnf install python3.11 python3.11-devel git curl gcc nodejs npm -y + +# Verify installations +python3.11 --version +node --version +npm --version ``` -!!! tip - Visual Studio Code should recognize the available development container and ask you to open the folder using it. For additional details on connecting to remote containers, please see the [Open an existing folder in a container](https://code.visualstudio.com/docs/remote/containers#_quick-start-open-an-existing-folder-in-a-container) quickstart. +### Clone the Repository -When you start the development container for the first time, the container will be built. This usually takes a few minutes. Please use the development container for all further steps. 
+```bash +git clone https://github.com/microsoft/content-processing-solution-accelerator.git +cd content-processing-solution-accelerator +``` -The files for the dev container are located in `/.devcontainer/` folder. +## Step 2: Azure Authentication Setup -## Local Deployment and Debugging +Before configuring services, authenticate with Azure: -1. **Clone the repository.** +```bash +# Login to Azure CLI +az login -2. **Log into the Azure CLI:** - • Check your login status using: `az account show` - • If not logged in, use: `az login` - • To specify a tenant, use: `az login --tenant ` +# Set your subscription +az account set --subscription "your-subscription-id" -3. **Create a Resource Group:** - • You can create it either through the Azure Portal or the Azure CLI: - ```bash - az group create --name --location EastUS2 - ``` +# Verify authentication +az account show +``` -4. **Deploy the Bicep template:** - • You can use the Bicep extension for VSCode (Right-click the `.bicep` file, then select "Show deployment pane") or use the Azure CLI: - ```bash - az deployment group create -g -f infra/main.bicep --query 'properties.outputs' - ``` - - **Note:** You will be prompted for a `principalId`, which is the ObjectID of your user in Entra ID. To find it, use the Azure Portal or run: - ```bash - az ad signed-in-user show --query id -o tsv - ``` - - You will also be prompted for locations for Azure OpenAI and Azure AI Content Understanding services. This is to allow separate regions where there may be service quota restrictions. - - **Additional Notes:** - - **Role Assignments in Bicep Deployment:** - - The main.bicep deployment includes the assignment of the appropriate roles to Azure OpenAI and Cosmos services. If you want to modify an existing implementation—for example, to use resources deployed as part of the simple deployment for local debugging—you will need to add your own credentials to access the Cosmos and Azure OpenAI services. 
You can add these permissions using the following commands: - - ```bash - az cosmosdb sql role assignment create --resource-group --account-name --role-definition-name "Cosmos DB Built-in Data Contributor" --principal-id --scope /subscriptions//resourceGroups//providers/Microsoft.DocumentDB/databaseAccounts/ - - az role assignment create --assignee --role "Cognitive Services OpenAI User" --scope /subscriptions//resourceGroups//providers/Microsoft.CognitiveServices/accounts/ - ``` - - **Using a Different Database in Cosmos:** - - You can set the solution up to use a different database in Cosmos. For example, you can name it something like `contentprocess-dev`. To do this: - - i. Change the environment variable `AZURE_COSMOS_DATABASE` to the new database name. - - ii. You will need to create the database in the Cosmos DB account. You can do this from the Data Explorer pane in the portal, click on the drop down labeled "+ New Container" and provide all the necessary details. - -5. **Create `.env` files:** - • Navigate to the root folder and each component folder (`src/ContentProcessor`, `src/ContentProcessorAPI`, `src/ContentProcessorWeb`) and create `.env` files based on the provided `.env.sample` files. - -6. **Fill in the `.env` files:** - • Use the output from the deployment or check the Azure Portal under "Deployments" in the resource group. - -7. 
**(Optional) Set up virtual environments:** - • If you are using `venv`, create and activate your virtual environment for both the backend components: - - **Content Processor API:** - - PowerShell: - ```powershell - cd src\ContentProcessorAPI - python -m venv .venv - .venv\Scripts\Activate.ps1 - ``` - - Command Prompt: - ```cmd - cd src\ContentProcessorAPI - python -m venv .venv - .venv\Scripts\activate.bat - ``` - - Git Bash / Linux / macOS: - ```bash - cd src/ContentProcessorAPI - python -m venv .venv - source .venv/bin/activate - ``` - - **Content Processor:** - - PowerShell: - ```powershell - cd src\ContentProcessor - python -m venv .venv - .venv\Scripts\Activate.ps1 - ``` - - Command Prompt: - ```cmd - cd src\ContentProcessor - python -m venv .venv - .venv\Scripts\activate.bat - ``` - - Git Bash / Linux / macOS: - ```bash - cd src/ContentProcessor - python -m venv .venv - source .venv/bin/activate - ``` - - **Note for PowerShell Users:** If you get an error about scripts being disabled, run: - ```powershell - Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser - ``` +### Get Azure Resource Information -8. **Install requirements - Backend components:** - - **ContentProcessorAPI:** - - Navigate to `src/ContentProcessorAPI` and install dependencies: - ```bash - cd src\ContentProcessorAPI - pip install -r requirements.txt - ``` - - **If you encounter compilation errors** on Windows (cffi, pydantic-core, or cryptography): - - These packages often fail to build from source on Windows. 
Use this workaround to install precompiled wheels: - - ```powershell - # Create temporary requirements without problematic packages - Get-Content requirements.txt | Where-Object { $_ -notmatch "cffi==1.17.1|pydantic==2.11.7|pydantic-core==2.33.2" } | Out-File temp_requirements.txt -Encoding utf8 - - # Install other dependencies first - pip install -r temp_requirements.txt - - # Install problematic packages with newer precompiled versions - pip install cffi==2.0.0 pydantic==2.12.5 pydantic-core==2.41.5 - - # Upgrade typing-extensions if needed - pip install --upgrade "typing-extensions>=4.14.1" "typing-inspection>=0.4.2" - - # Clean up temporary file - Remove-Item temp_requirements.txt - ``` - - **ContentProcessor:** - - Navigate to `src/ContentProcessor` and install dependencies: - ```bash - cd src\ContentProcessor - pip install -r requirements.txt - ``` - - **If you encounter errors**, upgrade problematic packages: - ```powershell - pip install --upgrade cffi cryptography pydantic pydantic-core numpy pandas - ``` - - **Note:** Python 3.11+ has better precompiled wheel support. Avoid Python 3.12 as some packages may not be compatible yet. - -9. 
**Configure environment variables:** - - **ContentProcessorAPI:** - - Create a `.env` file in `src/ContentProcessorAPI/app/` directory with the following content: - ```bash - # App Configuration endpoint from your Azure deployment - APP_CONFIG_ENDPOINT=https://.azconfig.io - - # Cosmos DB endpoint from your Azure deployment - AZURE_COSMOS_ENDPOINT=https://.documents.azure.com:443/ - AZURE_COSMOS_DATABASE=contentprocess - - # Local development settings - CRITICAL for local authentication - APP_ENV=dev - APP_AUTH_ENABLED=False - AZURE_IDENTITY_EXCLUDE_MANAGED_IDENTITY_CREDENTIAL=True - ``` - - **ContentProcessor:** - - Create a `.env.dev` file (note the `.dev` suffix) in `src/ContentProcessor/src/` directory: - ```bash - # App Configuration endpoint - APP_CONFIG_ENDPOINT=https://.azconfig.io - - # Cosmos DB endpoint - AZURE_COSMOS_ENDPOINT=https://.documents.azure.com:443/ - AZURE_COSMOS_DATABASE=contentprocess - - # Local development settings - APP_ENV=dev - APP_AUTH_ENABLED=False - AZURE_IDENTITY_EXCLUDE_MANAGED_IDENTITY_CREDENTIAL=True - - # Logging settings - APP_LOGGING_LEVEL=INFO - APP_LOGGING_ENABLE=True - ``` - - **ContentProcessorWeb:** - - Update the `.env` file in `src/ContentProcessorWeb/` directory: - ```bash - REACT_APP_API_BASE_URL=http://localhost:8000 - REACT_APP_AUTH_ENABLED=false - REACT_APP_CONSOLE_LOG_ENABLED=true - ``` - - **Important Notes:** - - Replace `` and `` with your actual Azure resource names from deployment - - `APP_ENV=dev` is **REQUIRED** for local development - it enables Azure CLI credential usage instead of Managed Identity - - ContentProcessor requires `.env.dev` (not `.env`) in the `src/` subdirectory - - Get your resource names from Azure Portal or by running: `az resource list -g ` - -10. 
**Assign Azure RBAC roles:** - Before running the application locally, you need proper Azure permissions: - - ```bash - # Get your Azure principal ID (user object ID) - az ad signed-in-user show --query id -o tsv - - # Get your subscription ID - az account show --query id -o tsv - - # Assign App Configuration Data Reader role - az role assignment create --role "App Configuration Data Reader" \ - --assignee \ - --scope /subscriptions//resourceGroups//providers/Microsoft.AppConfiguration/configurationStores/ - - # Assign Cosmos DB Data Contributor role - az role assignment create --role "Cosmos DB Built-in Data Contributor" \ - --assignee \ - --scope /subscriptions//resourceGroups//providers/Microsoft.DocumentDB/databaseAccounts/ - - # Assign Storage Queue Data Contributor role (for full file processing) - az role assignment create --role "Storage Queue Data Contributor" \ - --assignee \ - --scope /subscriptions//resourceGroups//providers/Microsoft.Storage/storageAccounts/ - - # Assign Cognitive Services User role (for Content Understanding) - az role assignment create --role "Cognitive Services User" \ - --assignee \ - --scope /subscriptions//resourceGroups//providers/Microsoft.CognitiveServices/accounts/ - ``` - - **Note:** Azure role assignments can take 5-10 minutes to propagate. If you get "Forbidden" errors when starting the API, wait a few minutes and try again. - -11. **Install requirements - Frontend:** - • Navigate to the frontend folder: - ```bash - cd src\ContentProcessorWeb - ``` - - • Install dependencies with `--legacy-peer-deps` flag (required for @azure/msal-react compatibility): - ```powershell - npm install --legacy-peer-deps - ``` - - • Install additional required FluentUI packages: - ```powershell - npm install @fluentui/react-dialog @fluentui/react-button --legacy-peer-deps - ``` - - **Note:** Always use the `--legacy-peer-deps` flag for npm commands in this project to avoid dependency conflicts. - -12. 
**Configure CORS for local development:** - - The FastAPI backend needs CORS configuration to allow requests from the React frontend during local development. - - Edit `src/ContentProcessorAPI/app/main.py` and add the CORS middleware configuration: - - ```python - from fastapi.middleware.cors import CORSMiddleware - ``` - - Then after the line `app = FastAPI(redirect_slashes=False)`, add: - - ```python - # Configure CORS for local development - app.add_middleware( - CORSMiddleware, - allow_origins=["http://localhost:3000"], # Frontend URL - allow_credentials=True, - allow_methods=["*"], # Allow all HTTP methods - allow_headers=["*"], # Allow all headers - ) - ``` - - **Note:** This CORS configuration is only needed for local development. Azure deployment handles CORS at the infrastructure level. - -13. **Run the application:** - - Open three separate terminal windows and run each component: - - **Terminal 1 - API (ContentProcessorAPI):** - - PowerShell: - ```powershell - cd src\ContentProcessorAPI - .venv\Scripts\Activate.ps1 - python -m uvicorn app.main:app --reload --port 8000 - ``` - - Command Prompt: - ```cmd - cd src\ContentProcessorAPI - .venv\Scripts\activate.bat - python -m uvicorn app.main:app --reload --port 8000 - ``` - - Git Bash / Linux / macOS: - ```bash - cd src/ContentProcessorAPI - source .venv/bin/activate - python -m uvicorn app.main:app --reload --port 8000 - ``` - - **Terminal 2 - Background Processor (ContentProcessor):** - - PowerShell: - ```powershell - cd src\ContentProcessor - .venv\Scripts\Activate.ps1 - python src/main.py - ``` - - Command Prompt: - ```cmd - cd src\ContentProcessor - .venv\Scripts\activate.bat - python src/main.py - ``` - - Git Bash / Linux / macOS: - ```bash - cd src/ContentProcessor - source .venv/bin/activate - python src/main.py - ``` - - **Terminal 3 - Frontend (ContentProcessorWeb):** - ```bash - cd src\ContentProcessorWeb - npm start - ``` - - **Troubleshooting startup:** - - If you get "Forbidden" errors from App 
Configuration or Cosmos DB, ensure your Azure role assignments have propagated (wait 5-10 minutes after creating them) - - If you see "ManagedIdentityCredential" errors, verify `.env` files have `APP_ENV=dev` set - - If frontend shows "Unable to connect to the server", verify you added CORS configuration in `main.py` (step 12) and restart the API - - Storage Queue errors in ContentProcessor are expected if you haven't assigned the Storage Queue Data Contributor role - the processor will keep retrying - - Content Understanding 401 errors are expected if you haven't assigned the Cognitive Services User role - -14. **Open a browser and navigate to `http://localhost:3000`** - -15. **To see Swagger API documentation, you can navigate to `http://localhost:8000/docs`** - -## Debugging the Solution Locally - -You can debug the API backend running locally with VSCode using the following launch.json entry: - -```json -{ - "name": "Python Debugger: Content Processor API", - "type": "debugpy", - "request": "launch", - "cwd": "${workspaceFolder}/src/ContentProcessorAPI", - "module": "uvicorn", - "args": ["app.main:app", "--reload"], - "jinja": true -} -``` - -To debug the Content Processor service, add the following launch.json entry: - -```json -{ - "name": "Python Debugger: Content Processor", - "type": "debugpy", - "request": "launch", - "cwd": "${workspaceFolder}/src/ContentProcessor", - "program": "src/main.py", - "jinja": true -} -``` - -For debugging the React frontend, you can use the browser's developer tools or set up debugging in VS Code with the appropriate extensions. 
+After deploying Azure resources (using `azd up` or Bicep template), gather the following information:

-## Troubleshooting
+```bash
+# List resources in your resource group
+az resource list -g -o table

-### Common Issues
+# Get App Configuration endpoint
+az appconfig show -n -g --query endpoint -o tsv
+
+# Get Cosmos DB endpoint
+az cosmosdb show -n -g --query documentEndpoint -o tsv
+```
+
+Example resource names from deployment:
+- App Configuration: `appcs-{suffix}.azconfig.io`
+- Cosmos DB: `cosmos-{suffix}.documents.azure.com`
+- Storage Account: `st{suffix}.queue.core.windows.net`
+- Content Understanding: `aicu-{suffix}.cognitiveservices.azure.com`
+
+### Required Azure RBAC Permissions
+
+To run the application locally, your Azure account needs the following role assignments on the deployed resources:
+
+#### Get Your Principal ID
+
+```bash
+# Get your principal ID for role assignments
+PRINCIPAL_ID=$(az ad signed-in-user show --query id -o tsv)
+echo $PRINCIPAL_ID
+
+# Get your subscription ID
+SUBSCRIPTION_ID=$(az account show --query id -o tsv)
+echo $SUBSCRIPTION_ID
+```
+
+#### Assign Required Roles
+
+```bash
+# 1. App Configuration Data Reader
+az role assignment create \
+  --role "App Configuration Data Reader" \
+  --assignee $PRINCIPAL_ID \
+  --scope "/subscriptions/$SUBSCRIPTION_ID/resourceGroups//providers/Microsoft.AppConfiguration/configurationStores/"
+
+# 2. Cosmos DB Built-in Data Contributor
+az role assignment create \
+  --role "Cosmos DB Built-in Data Contributor" \
+  --assignee $PRINCIPAL_ID \
+  --scope "/subscriptions/$SUBSCRIPTION_ID/resourceGroups//providers/Microsoft.DocumentDB/databaseAccounts/"
+
+# 3. Storage Queue Data Contributor
+az role assignment create \
+  --role "Storage Queue Data Contributor" \
+  --assignee $PRINCIPAL_ID \
+  --scope "/subscriptions/$SUBSCRIPTION_ID/resourceGroups//providers/Microsoft.Storage/storageAccounts/"
+
+# 4. Cognitive Services User
+az role assignment create \
+  --role "Cognitive Services User" \
+  --assignee $PRINCIPAL_ID \
+  --scope "/subscriptions/$SUBSCRIPTION_ID/resourceGroups//providers/Microsoft.CognitiveServices/accounts/"
+```

-**Python Module Not Found:**
+> **Note:** RBAC permission changes can take 5-10 minutes to propagate. If you encounter "Forbidden" errors after assigning roles, wait a few minutes and try again.
+
+## Step 3: ContentProcessorAPI Setup & Run Instructions
+
+> 📋 **Terminal Reminder**: Open a dedicated terminal window (Terminal 1) for the ContentProcessorAPI service. All commands in this section assume you start from the repository root directory.
+
+The ContentProcessorAPI provides REST endpoints for the frontend and handles API requests.
+
+### 3.1. Navigate to API Directory
+
+```bash
+# From repository root
+cd src/ContentProcessorAPI
+```
+
+### 3.2. Create Virtual Environment

-PowerShell:
 ```powershell
-# Ensure virtual environment is activated
-.venv\Scripts\Activate.ps1
-pip install -r requirements.txt
+# Create virtual environment
+python -m venv .venv
+
+# Activate virtual environment
+.venv\Scripts\Activate.ps1 # Windows PowerShell
+# or
+source .venv/bin/activate # Linux/macOS
 ```

-Command Prompt:
-```cmd
-# Ensure virtual environment is activated
-.venv\Scripts\activate.bat
+**Note for PowerShell Users:** If you get an error about scripts being disabled, run:
+```powershell
+Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
+```
+
+### 3.3. Install Dependencies
+
+```bash
 pip install -r requirements.txt
 ```

-Git Bash / Linux / macOS:
+**If you encounter compilation errors** on Windows (cffi, pydantic-core, or cryptography):
+
+These packages often fail to build from source on Windows. Use this workaround to install precompiled wheels:
+
+```powershell
+# Create temporary requirements without problematic packages
+Get-Content requirements.txt | Where-Object { $_ -notmatch "cffi==1.17.1|pydantic==2.11.7|pydantic-core==2.33.2" } | Out-File temp_requirements.txt -Encoding utf8
+
+# Install other dependencies first
+pip install -r temp_requirements.txt
+
+# Install problematic packages with newer precompiled versions
+pip install cffi==2.0.0 pydantic==2.12.5 pydantic-core==2.41.5
+
+# Upgrade typing-extensions if needed
+pip install --upgrade "typing-extensions>=4.14.1" "typing-inspection>=0.4.2"
+
+# Clean up temporary file
+Remove-Item temp_requirements.txt
+```
+
+### 3.4. Configure Environment Variables
+
+Create a `.env` file in the `src/ContentProcessorAPI/app/` directory:
+
+```bash
+cd app
+
+# Create .env file
+New-Item .env # Windows PowerShell
+# or
+touch .env # Linux/macOS
+```
+
+Add the following to the `.env` file:
+
+```bash
+# App Configuration endpoint from your Azure deployment
+APP_CONFIG_ENDPOINT=https://.azconfig.io
+
+# Cosmos DB endpoint from your Azure deployment
+AZURE_COSMOS_ENDPOINT=https://.documents.azure.com:443/
+AZURE_COSMOS_DATABASE=contentprocess
+
+# Local development settings - CRITICAL for local authentication
+APP_ENV=dev
+APP_AUTH_ENABLED=False
+AZURE_IDENTITY_EXCLUDE_MANAGED_IDENTITY_CREDENTIAL=True
+```
+
+> ⚠️ **Important**:
+> - Replace `` and `` with your actual Azure resource names
+> - `APP_ENV=dev` is **REQUIRED** for local development - it enables Azure CLI credential usage instead of Managed Identity
+> - Get your resource names from the Azure Portal or by running: `az resource list -g `
+
+### 3.5. Configure CORS for Local Development
+
+Edit `src/ContentProcessorAPI/app/main.py` and add the CORS middleware configuration.
+
+Add the import at the top:
+
+```python
+from fastapi.middleware.cors import CORSMiddleware
+```
+
+Then after the line `app = FastAPI(redirect_slashes=False)`, add:
+
+```python
+# Configure CORS for local development
+app.add_middleware(
+    CORSMiddleware,
+    allow_origins=["http://localhost:3000"], # Frontend URL
+    allow_credentials=True,
+    allow_methods=["*"], # Allow all HTTP methods
+    allow_headers=["*"], # Allow all headers
+)
+```
+
+> **Note:** This CORS configuration is only needed for local development. Azure deployment handles CORS at the infrastructure level.
+
+### 3.6. Run the API
+
+```bash
+# Make sure you're in the ContentProcessorAPI directory with activated venv
+cd .. # Go back to ContentProcessorAPI root if in app/
+
+# Run with uvicorn
+python -m uvicorn app.main:app --reload --port 8000
+```
+
+The ContentProcessorAPI will start at:
+- API: `http://localhost:8000`
+- API Documentation: `http://localhost:8000/docs`
+
+**Keep this terminal open** - the API server will continue running and show request logs.
+
+## Step 4: ContentProcessor Setup & Run Instructions
+
+> 📋 **Terminal Reminder**: Open a second dedicated terminal window (Terminal 2) for the ContentProcessor. Keep Terminal 1 (API) running. All commands assume you start from the repository root directory.
+
+The ContentProcessor handles background document processing from Azure Storage Queue.
+
+### 4.1. Navigate to Processor Directory
+
+```bash
+# From repository root
+cd src/ContentProcessor
+```
+
+### 4.2. Create Virtual Environment
+
+```powershell
+# Create virtual environment
+python -m venv .venv
+
+# Activate virtual environment
+.venv\Scripts\Activate.ps1 # Windows PowerShell
+# or
+source .venv/bin/activate # Linux/macOS
+```
+
+### 4.3. Install Dependencies
+
 ```bash
-# Ensure virtual environment is activated
-source .venv/bin/activate
 pip install -r requirements.txt
 ```

-**Python Dependency Compilation Errors (Windows):**
+**If you encounter errors**, upgrade problematic packages:
+
+```powershell
+pip install --upgrade cffi cryptography pydantic pydantic-core numpy pandas
+```
+
+### 4.4. Configure Environment Variables
+
+Create a `.env.dev` file (note the `.dev` suffix) in the `src/ContentProcessor/src/` directory:
+
+```bash
+cd src
+
+# Create .env.dev file
+New-Item .env.dev # Windows PowerShell
+# or
+touch .env.dev # Linux/macOS
+```
+
+Add the following to the `.env.dev` file:
+
+```bash
+# App Configuration endpoint
+APP_CONFIG_ENDPOINT=https://.azconfig.io
+
+# Cosmos DB endpoint
+AZURE_COSMOS_ENDPOINT=https://.documents.azure.com:443/
+AZURE_COSMOS_DATABASE=contentprocess
+
+# Local development settings
+APP_ENV=dev
+APP_AUTH_ENABLED=False
+AZURE_IDENTITY_EXCLUDE_MANAGED_IDENTITY_CREDENTIAL=True
+
+# Logging settings
+APP_LOGGING_LEVEL=INFO
+APP_LOGGING_ENABLE=True
+```
+
+> ⚠️ **Important**: The `.env.dev` file must be located in `src/ContentProcessor/src/` directory, not in `src/ContentProcessor/` root. The application looks for the `.env.dev` file in the same directory as `main.py`.
+
+### 4.5. Run the Processor
+
+```bash
+# Make sure you're in the src directory
+python main.py
+```
+
+The ContentProcessor will start and begin polling the Azure Storage Queue for messages.
+
+**Expected behavior:**
+- You may see Storage Queue authorization errors if roles haven't propagated (wait 5-10 minutes)
+- The processor will show continuous polling activity
+- Document processing will begin when files are uploaded via the frontend
+
+**Keep this terminal open** - the processor will continue running and show processing logs.
+
+## Step 5: ContentProcessorWeb Setup & Run Instructions
+
+> 📋 **Terminal Reminder**: Open a third dedicated terminal window (Terminal 3) for the ContentProcessorWeb. Keep Terminals 1 (API) and 2 (Processor) running. All commands assume you start from the repository root directory.
+
+The ContentProcessorWeb provides the React-based user interface.
+
+### 5.1. Navigate to Frontend Directory
+
+```bash
+# From repository root
+cd src/ContentProcessorWeb
+```
+
+### 5.2. Install Dependencies
+
+```bash
+# Install dependencies with legacy peer deps flag
+npm install --legacy-peer-deps
+
+# Install additional required FluentUI packages
+npm install @fluentui/react-dialog @fluentui/react-button --legacy-peer-deps
+```
+
+> **Note:** Always use the `--legacy-peer-deps` flag for npm commands in this project to avoid dependency conflicts with @azure/msal-react.
+
+### 5.3. Configure Environment Variables
+
+Update the `.env` file in the `src/ContentProcessorWeb/` directory:
+
+```bash
+REACT_APP_API_BASE_URL=http://localhost:8000
+REACT_APP_AUTH_ENABLED=false
+REACT_APP_CONSOLE_LOG_ENABLED=true
+```
+
+### 5.4. Start Development Server
+
+```bash
+npm start
+```
+
+The ContentProcessorWeb will start at: `http://localhost:3000`
+
+**Keep this terminal open** - the React development server will continue running with hot reload.
+
+## Step 6: Verify All Services Are Running
+
+Before using the application, confirm all three services are running in separate terminals:
+
+### Terminal Status Checklist
+
+| Terminal | Service | Command | Expected Output | URL |
+|----------|---------|---------|-----------------|-----|
+| Terminal 1 | ContentProcessorAPI | `python -m uvicorn app.main:app --reload --port 8000` | `Application startup complete` | http://localhost:8000 |
+| Terminal 2 | ContentProcessor | `python main.py` | Polling messages, no fatal errors | N/A |
+| Terminal 3 | ContentProcessorWeb | `npm start` | `Compiled successfully!` | http://localhost:3000 |
+
+### Quick Verification
+
+1. **Check Backend API**:
+   ```bash
+   # In a new terminal (Terminal 4)
+   curl http://localhost:8000/health
+   # Expected: {"message":"I'm alive!"}
+   ```
+
+2. **Check Frontend**:
+   - Open browser to http://localhost:3000
+   - Should see the Content Processing UI
+   - No "Unable to connect to the server" errors
+
+3. **Check Processor**:
+   - Look at Terminal 2 output
+   - Should see processing activity or queue polling
+   - No authorization errors (if roles have propagated)
+
+## Step 7: Next Steps
+
+Once all services are running (as confirmed in Step 6), you can:
+
+1. **Access the Application**: Open `http://localhost:3000` in your browser to explore the frontend UI
+2. **Upload Documents**: Use the UI to upload documents for processing
+3. **View API Documentation**: Navigate to `http://localhost:8000/docs` to explore API endpoints
+4. **Check Processing Status**: Monitor Terminal 2 for document processing logs
+
+## Troubleshooting
+
+### Common Issues
+
+#### Python Compilation Errors (Windows)

 If you see errors like "Microsoft Visual C++ 14.0 is required" or "error: metadata-generation-failed" when installing cffi, pydantic-core, or cryptography:

@@ -488,9 +532,10 @@ Remove-Item temp_requirements.txt

 **Explanation:** Older versions of cffi (1.17.1) and pydantic-core (2.33.2) require compilation from source, which fails on Windows without Visual Studio build tools. Newer versions have precompiled wheels that install without compilation.

-**pydantic_core ImportError:**
+#### pydantic_core ImportError

 If you see "PyO3 modules compiled for CPython 3.8 or older may only be initialized once" or "ImportError: pydantic_core._pydantic_core":
+
 ```powershell
 # Uninstall and reinstall with compatible versions
 pip uninstall -y pydantic pydantic-core
@@ -500,17 +545,8 @@ pip install --upgrade "typing-extensions>=4.14.1"

 **Explanation:** Version mismatch between pydantic and pydantic-core causes runtime errors. The compatible versions above work reliably together.

-**pandas/numpy Import Errors:**
+#### Node.js Dependencies Issues

-If you see "Error importing numpy from its source directory":
-```powershell
-# Force reinstall all requirements to resolve conflicts
-pip install --upgrade --force-reinstall -r requirements.txt
-```
-
-**Node.js Dependencies Issues:**
-
-PowerShell:
 ```powershell
 # Clear npm cache and reinstall with legacy peer deps
 npm cache clean --force
@@ -522,29 +558,12 @@ npm install --legacy-peer-deps
 npm install @fluentui/react-dialog @fluentui/react-button --legacy-peer-deps
 ```

-Bash / Linux / macOS:
-```bash
-# Clear npm cache and reinstall with legacy peer deps
-npm cache clean --force
-rm -rf node_modules package-lock.json
-npm install --legacy-peer-deps
-
-# Install missing FluentUI packages if needed
-npm install @fluentui/react-dialog @fluentui/react-button --legacy-peer-deps
-```
-
 **Explanation:** The `--legacy-peer-deps` flag is required due to peer dependency conflicts with @azure/msal-react. Some FluentUI packages may not be included in the initial install and need to be added separately.

-**Port Conflicts:**
-```bash
-# Check what's using the port
-netstat -tulpn | grep :8000 # Linux/Mac
-netstat -ano | findstr :8000 # Windows
-```
-
-**Azure Authentication Issues:**
+#### Azure Authentication Issues

 If you get "Forbidden" errors when accessing App Configuration or Cosmos DB:
+
 ```bash
 # Check your current Azure account
 az account show
@@ -557,30 +576,16 @@ az role assignment list --assignee $(az ad signed-in-user show --query id -o tsv

 # Refresh your access token
 az account get-access-token --resource https://azconfig.io
-
-# If roles are missing, assign them (replace with your ID from above)
-az role assignment create --role "App Configuration Data Reader" \
-  --assignee \
-  --scope /subscriptions//resourceGroups//providers/Microsoft.AppConfiguration/configurationStores/
-
-az role assignment create --role "Cosmos DB Built-in Data Contributor" \
-  --assignee \
-  --scope /subscriptions//resourceGroups//providers/Microsoft.DocumentDB/databaseAccounts/
-
-az role assignment create --role "Storage Queue Data Contributor" \
-  --assignee \
-  --scope /subscriptions//resourceGroups//providers/Microsoft.Storage/storageAccounts/
-
-az role assignment create --role "Cognitive Services User" \
-  --assignee \
-  --scope /subscriptions//resourceGroups//providers/Microsoft.CognitiveServices/accounts/
 ```

-**Note:** Role assignments can take 5-10 minutes to propagate through Azure AD. If you just assigned roles, wait a few minutes before retrying.
+If roles are missing, assign them as shown in Step 2.
+
+> **Note:** Role assignments can take 5-10 minutes to propagate through Azure AD. If you just assigned roles, wait a few minutes before retrying.

-**Cognitive Services Permission Errors:**
+#### Cognitive Services Permission Errors

 If you see "401 Client Error: PermissionDenied" for Content Understanding service:
+
 ```bash
 # Assign Cognitive Services User role
 az role assignment create --role "Cognitive Services User" \
@@ -590,15 +595,14 @@ az role assignment create --role "Cognitive Services User" \

 This error occurs when processing documents. Wait 5-10 minutes after assigning the role, then restart the ContentProcessor service.

-**ManagedIdentityCredential Errors:**
+#### ManagedIdentityCredential Errors

 If you see "ManagedIdentityCredential authentication unavailable" or "No managed identity endpoint found":
+
 ```bash
 # Ensure your .env files have these settings:
 APP_ENV=dev
 AZURE_IDENTITY_EXCLUDE_MANAGED_IDENTITY_CREDENTIAL=True
-
-# This tells the app to use Azure CLI credentials instead of Managed Identity
 ```

 **Locations to check:**
@@ -607,54 +611,61 @@ AZURE_IDENTITY_EXCLUDE_MANAGED_IDENTITY_CREDENTIAL=True

 **Explanation:** Managed Identity is used in Azure deployments but doesn't work locally. Setting `APP_ENV=dev` switches to Azure CLI credential authentication.

-**General authentication reset:**
-```bash
-# Re-authenticate with Azure CLI
-az logout
-az login
-```
-
-**CORS Issues:**
+#### CORS Issues

 If the frontend loads but shows "Unable to connect to the server" error:

 1. Verify CORS is configured in `src/ContentProcessorAPI/app/main.py`:
-```python
-from fastapi.middleware.cors import CORSMiddleware
-
-app = FastAPI(redirect_slashes=False)
-
-# Configure CORS for local development
-app.add_middleware(
-    CORSMiddleware,
-    allow_origins=["http://localhost:3000"],
-    allow_credentials=True,
-    allow_methods=["*"],
-    allow_headers=["*"],
-)
-```
+   ```python
+   from fastapi.middleware.cors import CORSMiddleware
+
+   app = FastAPI(redirect_slashes=False)
+
+   # Configure CORS for local development
+   app.add_middleware(
+       CORSMiddleware,
+       allow_origins=["http://localhost:3000"],
+       allow_credentials=True,
+       allow_methods=["*"],
+       allow_headers=["*"],
+   )
+   ```

-2. Restart the API service after adding CORS configuration
+2. Restart the API service (Terminal 1) after adding CORS configuration
 3. Check browser console (F12) for CORS errors
 4. Verify API is running on port 8000 and frontend on port 3000

 **Explanation:** CORS (Cross-Origin Resource Sharing) blocks requests between different origins by default. The frontend (localhost:3000) needs explicit permission to call the API (localhost:8000).

-**PowerShell Script Execution Policy Error:**
+#### Environment Variables Not Loading
+
+- Verify `.env` file is in the correct directory:
+  - ContentProcessorAPI: `src/ContentProcessorAPI/app/.env`
+  - ContentProcessor: `src/ContentProcessor/src/.env.dev` (must be `.env.dev`, not `.env`)
+  - ContentProcessorWeb: `src/ContentProcessorWeb/.env`
+- Check file permissions (especially on Linux/macOS)
+- Ensure no extra spaces in variable assignments
+- Restart the service after changing `.env` files
+
+#### PowerShell Script Execution Policy Error

 If you get "cannot be loaded because running scripts is disabled" when activating venv:
+
 ```powershell
 Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
 ```

-**Environment Variables Not Loading:**
-• Verify `.env` file is in the correct directory:
-  - ContentProcessorAPI: `src/ContentProcessorAPI/app/.env`
-  - ContentProcessor: `src/ContentProcessor/src/.env.dev` (must be `.env.dev`, not `.env`)
-  - ContentProcessorWeb: `src/ContentProcessorWeb/.env`
-• Check file permissions (especially on Linux/macOS)
-• Ensure no extra spaces in variable assignments
-• Restart the service after changing `.env` files
+#### Port Conflicts
+
+```bash
+# Check what's using the port
+netstat -ano | findstr :8000 # Windows
+netstat -tulpn | grep :8000 # Linux/Mac
+
+# Kill the process using the port if needed
+# Windows: taskkill /PID /F
+# Linux: kill -9
+```

 ### Debug Mode

@@ -665,13 +676,13 @@ APP_LOGGING_LEVEL=DEBUG
 APP_LOGGING_ENABLE=True
 ```

-### Getting Help
+## Related Documentation

-• Check the [Technical Architecture](./TechnicalArchitecture.md) documentation
-• Review the [API Documentation](./API.md) for endpoint details
-• Submit issues to the [GitHub repository](https://github.com/microsoft/content-processing-solution-accelerator/issues)
-• Check existing issues for similar problems
+- [Deployment Guide](./DeploymentGuide.md) - Production deployment instructions
+- [Technical Architecture](./TechnicalArchitecture.md) - System architecture overview
+- [API Documentation](./API.md) - API endpoint details
+- [README](../README.md) - Project overview and getting started

 ---

-For additional support, please refer to the [main README](../README.md) or the [Deployment Guide](./DeploymentGuide.md) for production deployment instructions.
+For additional support, please submit issues to the [GitHub repository](https://github.com/microsoft/content-processing-solution-accelerator/issues).
diff --git a/docs/LocalSetupGuide_NEW.md b/docs/LocalSetupGuide_NEW.md
deleted file mode 100644
index 26e114fb..00000000
--- a/docs/LocalSetupGuide_NEW.md
+++ /dev/null
@@ -1,688 +0,0 @@
-# Local Development Setup Guide
-
-This guide provides comprehensive instructions for setting up the Content Processing Solution Accelerator for local development across Windows and Linux platforms.
-
-## Important Setup Notes
-
-### Multi-Service Architecture
-
-This application consists of three separate services that run independently:
-
-1. **ContentProcessorAPI** - REST API server for the frontend
-2. **ContentProcessor** - Background processor that handles document processing from Azure Storage Queue
-3. **ContentProcessorWeb** - React-based user interface
-
-> ⚠️ **Critical**: Each service must run in its own terminal/console window
->
-> - Do NOT close terminals while services are running
-> - Open 3 separate terminal windows for local development
-> - Each service will occupy its terminal and show live logs
->
-> **Terminal Organization:**
-> - Terminal 1: ContentProcessorAPI - HTTP server on port 8000
-> - Terminal 2: ContentProcessor - Runs continuously, polls Azure Storage Queue
-> - Terminal 3: ContentProcessorWeb - Development server on port 3000
-
-### Path Conventions
-
-All paths in this guide are relative to the repository root directory:
-
-```
-content-processing-solution-accelerator/  ← Repository root (start here)
-├── src/
-│   ├── ContentProcessorAPI/
-│   │   ├── .venv/  ← Virtual environment
-│   │   └── app/
-│   │       ├── main.py  ← API entry point
-│   │       └── .env  ← API config file
-│   ├── ContentProcessor/
-│   │   ├── .venv/  ← Virtual environment
-│   │   └── src/
-│   │       ├── main.py  ← Processor entry point
-│   │       └── .env.dev  ← Processor config file
-│   └── ContentProcessorWeb/
-│       ├── node_modules/
-│       └── .env  ← Frontend config file
-└── docs/  ← Documentation (you are here)
-```
-
-Before starting any step, ensure you are in the repository root directory:
-
-```powershell
-# Verify you're in the correct location
-pwd # Linux/macOS - should show: .../content-processing-solution-accelerator
-Get-Location # Windows PowerShell - should show: ...\content-processing-solution-accelerator
-
-# If not, navigate to repository root
-cd path/to/content-processing-solution-accelerator
-```
-
-### Configuration Files
-
-This project uses separate `.env` files in each service directory with different configuration requirements:
-
-- **ContentProcessorAPI**: `src/ContentProcessorAPI/app/.env` - Azure App Configuration URL, Cosmos DB endpoint
-- **ContentProcessor**: `src/ContentProcessor/src/.env.dev` - Azure App Configuration URL, Cosmos DB endpoint (note `.dev` suffix)
-- **ContentProcessorWeb**: `src/ContentProcessorWeb/.env` - API base URL, authentication settings
-
-When copying `.env` samples, always navigate to the specific service directory first.
-
-## Step 1: Prerequisites - Install Required Tools
-
-### Windows Development
-
-```powershell
-# Install Python 3.11+ and Git
-winget install Python.Python.3.11
-winget install Git.Git
-
-# Install Node.js for frontend
-winget install OpenJS.NodeJS.LTS
-
-# Verify installations
-python --version # Should show Python 3.11.x
-node --version # Should show v18.x or higher
-npm --version
-```
-
-### Linux Development
-
-#### Ubuntu/Debian
-
-```bash
-# Install prerequisites
-sudo apt update && sudo apt install python3.11 python3.11-venv python3-pip git curl nodejs npm -y
-
-# Verify installations
-python3.11 --version
-node --version
-npm --version
-```
-
-#### RHEL/CentOS/Fedora
-
-```bash
-# Install prerequisites
-sudo dnf install python3.11 python3.11-devel git curl gcc nodejs npm -y
-
-# Verify installations
-python3.11 --version
-node --version
-npm --version
-```
-
-### Clone the Repository
-
-```bash
-git clone https://github.com/microsoft/content-processing-solution-accelerator.git
-cd content-processing-solution-accelerator
-```
-
-## Step 2: Azure Authentication Setup
-
-Before configuring services, authenticate with Azure:
-
-```bash
-# Login to Azure CLI
-az login
-
-# Set your subscription
-az account set --subscription "your-subscription-id"
-
-# Verify authentication
-az account show
-```
-
-### Get Azure Resource Information
-
-After deploying Azure resources (using `azd up` or Bicep template), gather the following information:
-
-```bash
-# List resources in your resource group
-az resource list -g -o table
-
-# Get App Configuration endpoint
-az appconfig show -n -g --query endpoint -o tsv
-
-# Get Cosmos DB endpoint
-az cosmosdb show -n -g --query documentEndpoint -o tsv
-```
-
-Example resource names from deployment:
-- App Configuration: `appcs-{suffix}.azconfig.io`
-- Cosmos DB: `cosmos-{suffix}.documents.azure.com`
-- Storage Account: `st{suffix}.queue.core.windows.net`
-- Content Understanding: `aicu-{suffix}.cognitiveservices.azure.com`
-
-### Required Azure RBAC Permissions
-
-To run the application locally, your Azure account needs the following role assignments on the deployed resources:
-
-#### Get Your Principal ID
-
-```bash
-# Get your principal ID for role assignments
-PRINCIPAL_ID=$(az ad signed-in-user show --query id -o tsv)
-echo $PRINCIPAL_ID
-
-# Get your subscription ID
-SUBSCRIPTION_ID=$(az account show --query id -o tsv)
-echo $SUBSCRIPTION_ID
-```
-
-#### Assign Required Roles
-
-```bash
-# 1. App Configuration Data Reader
-az role assignment create \
-  --role "App Configuration Data Reader" \
-  --assignee $PRINCIPAL_ID \
-  --scope "/subscriptions/$SUBSCRIPTION_ID/resourceGroups//providers/Microsoft.AppConfiguration/configurationStores/"
-
-# 2. Cosmos DB Built-in Data Contributor
-az role assignment create \
-  --role "Cosmos DB Built-in Data Contributor" \
-  --assignee $PRINCIPAL_ID \
-  --scope "/subscriptions/$SUBSCRIPTION_ID/resourceGroups//providers/Microsoft.DocumentDB/databaseAccounts/"
-
-# 3. Storage Queue Data Contributor
-az role assignment create \
-  --role "Storage Queue Data Contributor" \
-  --assignee $PRINCIPAL_ID \
-  --scope "/subscriptions/$SUBSCRIPTION_ID/resourceGroups//providers/Microsoft.Storage/storageAccounts/"
-
-# 4. Cognitive Services User
-az role assignment create \
-  --role "Cognitive Services User" \
-  --assignee $PRINCIPAL_ID \
-  --scope "/subscriptions/$SUBSCRIPTION_ID/resourceGroups//providers/Microsoft.CognitiveServices/accounts/"
-```
-
-> **Note:** RBAC permission changes can take 5-10 minutes to propagate. If you encounter "Forbidden" errors after assigning roles, wait a few minutes and try again.
- -## Step 3: ContentProcessorAPI Setup & Run Instructions - -> 📋 **Terminal Reminder**: Open a dedicated terminal window (Terminal 1) for the ContentProcessorAPI service. All commands in this section assume you start from the repository root directory. - -The ContentProcessorAPI provides REST endpoints for the frontend and handles API requests. - -### 3.1. Navigate to API Directory - -```bash -# From repository root -cd src/ContentProcessorAPI -``` - -### 3.2. Create Virtual Environment - -```powershell -# Create virtual environment -python -m venv .venv - -# Activate virtual environment -.venv\Scripts\Activate.ps1 # Windows PowerShell -# or -source .venv/bin/activate # Linux/macOS -``` - -**Note for PowerShell Users:** If you get an error about scripts being disabled, run: -```powershell -Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser -``` - -### 3.3. Install Dependencies - -```bash -pip install -r requirements.txt -``` - -**If you encounter compilation errors** on Windows (cffi, pydantic-core, or cryptography): - -These packages often fail to build from source on Windows. Use this workaround to install precompiled wheels: - -```powershell -# Create temporary requirements without problematic packages -Get-Content requirements.txt | Where-Object { $_ -notmatch "cffi==1.17.1|pydantic==2.11.7|pydantic-core==2.33.2" } | Out-File temp_requirements.txt -Encoding utf8 - -# Install other dependencies first -pip install -r temp_requirements.txt - -# Install problematic packages with newer precompiled versions -pip install cffi==2.0.0 pydantic==2.12.5 pydantic-core==2.41.5 - -# Upgrade typing-extensions if needed -pip install --upgrade "typing-extensions>=4.14.1" "typing-inspection>=0.4.2" - -# Clean up temporary file -Remove-Item temp_requirements.txt -``` - -### 3.4. 
Configure Environment Variables - -Create a `.env` file in the `src/ContentProcessorAPI/app/` directory: - -```bash -cd app - -# Create .env file -New-Item .env # Windows PowerShell -# or -touch .env # Linux/macOS -``` - -Add the following to the `.env` file: - -```bash -# App Configuration endpoint from your Azure deployment -APP_CONFIG_ENDPOINT=https://.azconfig.io - -# Cosmos DB endpoint from your Azure deployment -AZURE_COSMOS_ENDPOINT=https://.documents.azure.com:443/ -AZURE_COSMOS_DATABASE=contentprocess - -# Local development settings - CRITICAL for local authentication -APP_ENV=dev -APP_AUTH_ENABLED=False -AZURE_IDENTITY_EXCLUDE_MANAGED_IDENTITY_CREDENTIAL=True -``` - -> ⚠️ **Important**: -> - Replace `` and `` with your actual Azure resource names -> - `APP_ENV=dev` is **REQUIRED** for local development - it enables Azure CLI credential usage instead of Managed Identity -> - Get your resource names from the Azure Portal or by running: `az resource list -g ` - -### 3.5. Configure CORS for Local Development - -Edit `src/ContentProcessorAPI/app/main.py` and add the CORS middleware configuration. - -Add the import at the top: - -```python -from fastapi.middleware.cors import CORSMiddleware -``` - -Then after the line `app = FastAPI(redirect_slashes=False)`, add: - -```python -# Configure CORS for local development -app.add_middleware( - CORSMiddleware, - allow_origins=["http://localhost:3000"], # Frontend URL - allow_credentials=True, - allow_methods=["*"], # Allow all HTTP methods - allow_headers=["*"], # Allow all headers -) -``` - -> **Note:** This CORS configuration is only needed for local development. Azure deployment handles CORS at the infrastructure level. - -### 3.6. Run the API - -```bash -# Make sure you're in the ContentProcessorAPI directory with activated venv -cd .. 
# Go back to ContentProcessorAPI root if in app/ - -# Run with uvicorn -python -m uvicorn app.main:app --reload --port 8000 -``` - -The ContentProcessorAPI will start at: -- API: `http://localhost:8000` -- API Documentation: `http://localhost:8000/docs` - -**Keep this terminal open** - the API server will continue running and show request logs. - -## Step 4: ContentProcessor Setup & Run Instructions - -> 📋 **Terminal Reminder**: Open a second dedicated terminal window (Terminal 2) for the ContentProcessor. Keep Terminal 1 (API) running. All commands assume you start from the repository root directory. - -The ContentProcessor handles background document processing from Azure Storage Queue. - -### 4.1. Navigate to Processor Directory - -```bash -# From repository root -cd src/ContentProcessor -``` - -### 4.2. Create Virtual Environment - -```powershell -# Create virtual environment -python -m venv .venv - -# Activate virtual environment -.venv\Scripts\Activate.ps1 # Windows PowerShell -# or -source .venv/bin/activate # Linux/macOS -``` - -### 4.3. Install Dependencies - -```bash -pip install -r requirements.txt -``` - -**If you encounter errors**, upgrade problematic packages: - -```powershell -pip install --upgrade cffi cryptography pydantic pydantic-core numpy pandas -``` - -### 4.4. 
Configure Environment Variables
-
-Create a `.env.dev` file (note the `.dev` suffix) in the `src/ContentProcessor/src/` directory:
-
-```bash
-cd src
-
-# Create .env.dev file
-New-Item .env.dev # Windows PowerShell
-# or
-touch .env.dev # Linux/macOS
-```
-
-Add the following to the `.env.dev` file:
-
-```bash
-# App Configuration endpoint
-APP_CONFIG_ENDPOINT=https://.azconfig.io
-
-# Cosmos DB endpoint
-AZURE_COSMOS_ENDPOINT=https://.documents.azure.com:443/
-AZURE_COSMOS_DATABASE=contentprocess
-
-# Local development settings
-APP_ENV=dev
-APP_AUTH_ENABLED=False
-AZURE_IDENTITY_EXCLUDE_MANAGED_IDENTITY_CREDENTIAL=True
-
-# Logging settings
-APP_LOGGING_LEVEL=INFO
-APP_LOGGING_ENABLE=True
-```
-
-> ⚠️ **Important**: The `.env.dev` file must be located in `src/ContentProcessor/src/` directory, not in `src/ContentProcessor/` root. The application looks for the `.env.dev` file in the same directory as `main.py`.
-
-### 4.5. Run the Processor
-
-```bash
-# Make sure you're in the src directory
-python main.py
-```
-
-The ContentProcessor will start and begin polling the Azure Storage Queue for messages.
-
-**Expected behavior:**
-- You may see Storage Queue authorization errors if roles haven't propagated (wait 5-10 minutes)
-- The processor will show continuous polling activity
-- Document processing will begin when files are uploaded via the frontend
-
-**Keep this terminal open** - the processor will continue running and show processing logs.
-
-## Step 5: ContentProcessorWeb Setup & Run Instructions
-
-> 📋 **Terminal Reminder**: Open a third dedicated terminal window (Terminal 3) for the ContentProcessorWeb. Keep Terminals 1 (API) and 2 (Processor) running. All commands assume you start from the repository root directory.
-
-The ContentProcessorWeb provides the React-based user interface.
-
-### 5.1. Navigate to Frontend Directory
-
-```bash
-# From repository root
-cd src/ContentProcessorWeb
-```
-
-### 5.2. Install Dependencies
-
-```bash
-# Install dependencies with legacy peer deps flag
-npm install --legacy-peer-deps
-
-# Install additional required FluentUI packages
-npm install @fluentui/react-dialog @fluentui/react-button --legacy-peer-deps
-```
-
-> **Note:** Always use the `--legacy-peer-deps` flag for npm commands in this project to avoid dependency conflicts with @azure/msal-react.
-
-### 5.3. Configure Environment Variables
-
-Update the `.env` file in the `src/ContentProcessorWeb/` directory:
-
-```bash
-REACT_APP_API_BASE_URL=http://localhost:8000
-REACT_APP_AUTH_ENABLED=false
-REACT_APP_CONSOLE_LOG_ENABLED=true
-```
-
-### 5.4. Start Development Server
-
-```bash
-npm start
-```
-
-The ContentProcessorWeb will start at: `http://localhost:3000`
-
-**Keep this terminal open** - the React development server will continue running with hot reload.
-
-## Step 6: Verify All Services Are Running
-
-Before using the application, confirm all three services are running in separate terminals:
-
-### Terminal Status Checklist
-
-| Terminal | Service | Command | Expected Output | URL |
-|----------|---------|---------|-----------------|-----|
-| Terminal 1 | ContentProcessorAPI | `python -m uvicorn app.main:app --reload --port 8000` | `Application startup complete` | http://localhost:8000 |
-| Terminal 2 | ContentProcessor | `python main.py` | Polling messages, no fatal errors | N/A |
-| Terminal 3 | ContentProcessorWeb | `npm start` | `Compiled successfully!` | http://localhost:3000 |
-
-### Quick Verification
-
-1. **Check Backend API**:
-   ```bash
-   # In a new terminal (Terminal 4)
-   curl http://localhost:8000/health
-   # Expected: {"message":"I'm alive!"}
-   ```
-
-2. **Check Frontend**:
-   - Open browser to http://localhost:3000
-   - Should see the Content Processing UI
-   - No "Unable to connect to the server" errors
-
-3. **Check Processor**:
-   - Look at Terminal 2 output
-   - Should see processing activity or queue polling
-   - No authorization errors (if roles have propagated)
-
-## Step 7: Next Steps
-
-Once all services are running (as confirmed in Step 6), you can:
-
-1. **Access the Application**: Open `http://localhost:3000` in your browser to explore the frontend UI
-2. **Upload Documents**: Use the UI to upload documents for processing
-3. **View API Documentation**: Navigate to `http://localhost:8000/docs` to explore API endpoints
-4. **Check Processing Status**: Monitor Terminal 2 for document processing logs
-
-## Troubleshooting
-
-### Common Issues
-
-#### Python Compilation Errors (Windows)
-
-If you see errors like "Microsoft Visual C++ 14.0 is required" or "error: metadata-generation-failed" when installing cffi, pydantic-core, or cryptography:
-
-```powershell
-# Create temporary requirements excluding problematic packages
-Get-Content requirements.txt | Where-Object { $_ -notmatch "cffi==1.17.1|pydantic==2.11.7|pydantic-core==2.33.2" } | Out-File temp_requirements.txt -Encoding utf8
-
-# Install other dependencies first
-pip install -r temp_requirements.txt
-
-# Install problematic packages with newer precompiled versions
-pip install cffi==2.0.0 pydantic==2.12.5 pydantic-core==2.41.5
-
-# Upgrade typing-extensions if needed
-pip install --upgrade "typing-extensions>=4.14.1" "typing-inspection>=0.4.2"
-
-# Clean up
-Remove-Item temp_requirements.txt
-```
-
-**Explanation:** Older versions of cffi (1.17.1) and pydantic-core (2.33.2) require compilation from source, which fails on Windows without Visual Studio build tools. Newer versions have precompiled wheels that install without compilation.
-
-#### pydantic_core ImportError
-
-If you see "PyO3 modules compiled for CPython 3.8 or older may only be initialized once" or "ImportError: pydantic_core._pydantic_core":
-
-```powershell
-# Uninstall and reinstall with compatible versions
-pip uninstall -y pydantic pydantic-core
-pip install pydantic==2.12.5 pydantic-core==2.41.5
-pip install --upgrade "typing-extensions>=4.14.1"
-```
-
-**Explanation:** Version mismatch between pydantic and pydantic-core causes runtime errors. The compatible versions above work reliably together.
-
-#### Node.js Dependencies Issues
-
-```powershell
-# Clear npm cache and reinstall with legacy peer deps
-npm cache clean --force
-Remove-Item -Recurse -Force node_modules -ErrorAction SilentlyContinue
-Remove-Item -Force package-lock.json -ErrorAction SilentlyContinue
-npm install --legacy-peer-deps
-
-# Install missing FluentUI packages if needed
-npm install @fluentui/react-dialog @fluentui/react-button --legacy-peer-deps
-```
-
-**Explanation:** The `--legacy-peer-deps` flag is required due to peer dependency conflicts with @azure/msal-react. Some FluentUI packages may not be included in the initial install and need to be added separately.
-
-#### Azure Authentication Issues
-
-If you get "Forbidden" errors when accessing App Configuration or Cosmos DB:
-
-```bash
-# Check your current Azure account
-az account show
-
-# Get your principal ID for role assignments
-az ad signed-in-user show --query id -o tsv
-
-# Verify you have the correct role assignments
-az role assignment list --assignee $(az ad signed-in-user show --query id -o tsv) --resource-group
-
-# Refresh your access token
-az account get-access-token --resource https://azconfig.io
-```
-
-If roles are missing, assign them as shown in Step 2.
-
-> **Note:** Role assignments can take 5-10 minutes to propagate through Azure AD. If you just assigned roles, wait a few minutes before retrying.
-
-#### Cognitive Services Permission Errors
-
-If you see "401 Client Error: PermissionDenied" for Content Understanding service:
-
-```bash
-# Assign Cognitive Services User role
-az role assignment create --role "Cognitive Services User" \
-  --assignee \
-  --scope /subscriptions//resourceGroups//providers/Microsoft.CognitiveServices/accounts/
-```
-
-This error occurs when processing documents. Wait 5-10 minutes after assigning the role, then restart the ContentProcessor service.
-
-#### ManagedIdentityCredential Errors
-
-If you see "ManagedIdentityCredential authentication unavailable" or "No managed identity endpoint found":
-
-```bash
-# Ensure your .env files have these settings:
-APP_ENV=dev
-AZURE_IDENTITY_EXCLUDE_MANAGED_IDENTITY_CREDENTIAL=True
-```
-
-**Locations to check:**
-- `src/ContentProcessorAPI/app/.env`
-- `src/ContentProcessor/src/.env.dev` (note: must be `.env.dev` in the `src/` subdirectory, not `.env` in root)
-
-**Explanation:** Managed Identity is used in Azure deployments but doesn't work locally. Setting `APP_ENV=dev` switches to Azure CLI credential authentication.
-
-#### CORS Issues
-
-If the frontend loads but shows "Unable to connect to the server" error:
-
-1. Verify CORS is configured in `src/ContentProcessorAPI/app/main.py`:
-   ```python
-   from fastapi.middleware.cors import CORSMiddleware
-
-   app = FastAPI(redirect_slashes=False)
-
-   # Configure CORS for local development
-   app.add_middleware(
-       CORSMiddleware,
-       allow_origins=["http://localhost:3000"],
-       allow_credentials=True,
-       allow_methods=["*"],
-       allow_headers=["*"],
-   )
-   ```
-
-2. Restart the API service (Terminal 1) after adding CORS configuration
-3. Check browser console (F12) for CORS errors
-4. Verify API is running on port 8000 and frontend on port 3000
-
-**Explanation:** CORS (Cross-Origin Resource Sharing) blocks requests between different origins by default. The frontend (localhost:3000) needs explicit permission to call the API (localhost:8000).
-
-#### Environment Variables Not Loading
-
-- Verify `.env` file is in the correct directory:
-  - ContentProcessorAPI: `src/ContentProcessorAPI/app/.env`
-  - ContentProcessor: `src/ContentProcessor/src/.env.dev` (must be `.env.dev`, not `.env`)
-  - ContentProcessorWeb: `src/ContentProcessorWeb/.env`
-- Check file permissions (especially on Linux/macOS)
-- Ensure no extra spaces in variable assignments
-- Restart the service after changing `.env` files
-
-#### PowerShell Script Execution Policy Error
-
-If you get "cannot be loaded because running scripts is disabled" when activating venv:
-
-```powershell
-Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
-```
-
-#### Port Conflicts
-
-```bash
-# Check what's using the port
-netstat -ano | findstr :8000 # Windows
-netstat -tulpn | grep :8000 # Linux/Mac
-
-# Kill the process using the port if needed
-# Windows: taskkill /PID /F
-# Linux: kill -9
-```
-
-### Debug Mode
-
-Enable detailed logging by setting these environment variables in your `.env` files:
-
-```bash
-APP_LOGGING_LEVEL=DEBUG
-APP_LOGGING_ENABLE=True
-```
-
-## Related Documentation
-
-- [Deployment Guide](./DeploymentGuide.md) - Production deployment instructions
-- [Technical Architecture](./TechnicalArchitecture.md) - System architecture overview
-- [API Documentation](./API.md) - API endpoint details
-- [README](../README.md) - Project overview and getting started
-
----
-
-For additional support, please submit issues to the [GitHub repository](https://github.com/microsoft/content-processing-solution-accelerator/issues).
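
The "Environment Variables Not Loading" checklist in the guide (file in the right directory, no extra spaces in assignments) could be automated with a small script. The following is only a sketch: `find_env_problems` and `REQUIRED_DEV_SETTINGS` are hypothetical names that do not exist in the repository, and the required keys are assumed from the local development settings the guide lists.

```python
from pathlib import Path

# Hypothetical helper, not part of the repository. The required keys are an
# assumption taken from the "Local development settings" the guide lists.
REQUIRED_DEV_SETTINGS = {
    "APP_ENV": "dev",
    "AZURE_IDENTITY_EXCLUDE_MANAGED_IDENTITY_CREDENTIAL": "True",
}


def find_env_problems(env_path, required=REQUIRED_DEV_SETTINGS):
    """Return a list of human-readable problems found in one .env file."""
    path = Path(env_path)
    if not path.is_file():
        return [f"{path}: file not found"]

    problems = []
    seen = {}
    lines = path.read_text(encoding="utf-8").splitlines()
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments are fine
        if "=" not in line:
            problems.append(f"{path}:{number}: not a KEY=VALUE assignment")
            continue
        key, value = line.split("=", 1)
        # The guide warns about extra spaces in variable assignments.
        if key != key.strip() or value != value.strip():
            problems.append(f"{path}:{number}: extra spaces around '='")
        seen[key.strip()] = value.strip()

    for key, expected in required.items():
        if seen.get(key) != expected:
            problems.append(f"{path}: expected {key}={expected}")
    return problems
```

Calling `find_env_problems("src/ContentProcessorAPI/app/.env")` from the repository root returns an empty list when the file exists, has no stray spaces, and contains both assumed settings.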
From cc61d54dafd1a115fd1f98af8ba1a76a3f9f1537 Mon Sep 17 00:00:00 2001
From: Venkateswarlu Marthula
Date: Wed, 17 Dec 2025 18:04:37 +0530
Subject: [PATCH 11/13] Update LocalSetupGuide with environment file changes and dependency management improvements

---
 docs/LocalSetupGuide.md | 127 +++++++++++++++++++---------------------
 1 file changed, 60 insertions(+), 67 deletions(-)

diff --git a/docs/LocalSetupGuide.md b/docs/LocalSetupGuide.md
index 26e114fb..6de6b748 100644
--- a/docs/LocalSetupGuide.md
+++ b/docs/LocalSetupGuide.md
@@ -39,7 +39,7 @@ content-processing-solution-accelerator/ ← Repository root (start here)
 │   │   ├── .venv/ ← Virtual environment
 │   │   └── src/
 │   │       ├── main.py ← Processor entry point
-│   │       └── .env.dev ← Processor config file
+│   │       └── .env ← Processor config file
 │   └── ContentProcessorWeb/
 │       ├── node_modules/
 │       └── .env ← Frontend config file
@@ -61,8 +61,8 @@ cd path/to/content-processing-solution-accelerator

 This project uses separate `.env` files in each service directory with different configuration requirements:

-- **ContentProcessorAPI**: `src/ContentProcessorAPI/app/.env` - Azure App Configuration URL, Cosmos DB endpoint
-- **ContentProcessor**: `src/ContentProcessor/src/.env.dev` - Azure App Configuration URL, Cosmos DB endpoint (note `.dev` suffix)
+- **ContentProcessorAPI**: `src/ContentProcessorAPI/app/.env` - Azure App Configuration URL and local dev settings
+- **ContentProcessor**: `src/ContentProcessor/src/.env` - Azure App Configuration URL and local dev settings
 - **ContentProcessorWeb**: `src/ContentProcessorWeb/.env` - API base URL, authentication settings

 When copying `.env` samples, always navigate to the specific service directory first.
@@ -185,13 +185,19 @@ az role assignment create \
   --assignee $PRINCIPAL_ID \
   --scope "/subscriptions/$SUBSCRIPTION_ID/resourceGroups//providers/Microsoft.DocumentDB/databaseAccounts/"

-# 3. Storage Queue Data Contributor
+# 3. Storage Blob Data Contributor (for document upload/download)
+az role assignment create \
+  --role "Storage Blob Data Contributor" \
+  --assignee $PRINCIPAL_ID \
+  --scope "/subscriptions/$SUBSCRIPTION_ID/resourceGroups//providers/Microsoft.Storage/storageAccounts/"
+
+# 4. Storage Queue Data Contributor (for message processing)
 az role assignment create \
   --role "Storage Queue Data Contributor" \
   --assignee $PRINCIPAL_ID \
   --scope "/subscriptions/$SUBSCRIPTION_ID/resourceGroups//providers/Microsoft.Storage/storageAccounts/"

-# 4. Cognitive Services User
+# 5. Cognitive Services User
 az role assignment create \
   --role "Cognitive Services User" \
   --assignee $PRINCIPAL_ID \
@@ -233,30 +239,15 @@ Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
 ### 3.3. Install Dependencies

 ```bash
-pip install -r requirements.txt
-```
-
-**If you encounter compilation errors** on Windows (cffi, pydantic-core, or cryptography):
-
-These packages often fail to build from source on Windows. Use this workaround to install precompiled wheels:
-
-```powershell
-# Create temporary requirements without problematic packages
-Get-Content requirements.txt | Where-Object { $_ -notmatch "cffi==1.17.1|pydantic==2.11.7|pydantic-core==2.33.2" } | Out-File temp_requirements.txt -Encoding utf8
+# Install uv package manager if not already installed
+pip install uv

-# Install other dependencies first
-pip install -r temp_requirements.txt
-
-# Install problematic packages with newer precompiled versions
-pip install cffi==2.0.0 pydantic==2.12.5 pydantic-core==2.41.5
-
-# Upgrade typing-extensions if needed
-pip install --upgrade "typing-extensions>=4.14.1" "typing-inspection>=0.4.2"
-
-# Clean up temporary file
-Remove-Item temp_requirements.txt
+# Install all dependencies using uv
+uv sync --python 3.11
 ```

+**Note:** This project uses `uv` as the package manager with `pyproject.toml`. The `uv sync` command automatically installs all dependencies with proper version resolution.
+
 ### 3.4. Configure Environment Variables

 Create a `.env` file in the `src/ContentProcessorAPI/app/` directory:
@@ -273,13 +264,9 @@ touch .env # Linux/macOS
 Add the following to the `.env` file:

 ```bash
-# App Configuration endpoint from your Azure deployment
+# App Configuration endpoint - ALL other settings are read from App Configuration
 APP_CONFIG_ENDPOINT=https://.azconfig.io

-# Cosmos DB endpoint from your Azure deployment
-AZURE_COSMOS_ENDPOINT=https://.documents.azure.com:443/
-AZURE_COSMOS_DATABASE=contentprocess
-
 # Local development settings - CRITICAL for local authentication
 APP_ENV=dev
 APP_AUTH_ENABLED=False
@@ -287,8 +274,9 @@ AZURE_IDENTITY_EXCLUDE_MANAGED_IDENTITY_CREDENTIAL=True
 ```

 > ⚠️ **Important**:
-> - Replace `` and `` with your actual Azure resource names
+> - Replace `` with your actual App Configuration resource name
 > - `APP_ENV=dev` is **REQUIRED** for local development - it enables Azure CLI credential usage instead of Managed Identity
+> - All other settings (Cosmos DB, Storage, AI endpoints) are automatically loaded from Azure App Configuration
 > - Get your resource names from the Azure Portal or by running: `az resource list -g `

 ### 3.5. Configure CORS for Local Development
@@ -360,49 +348,63 @@ source .venv/bin/activate # Linux/macOS
 ### 4.3. Install Dependencies

 ```bash
-pip install -r requirements.txt
-```
+# Install uv package manager if not already installed
+pip install uv

-**If you encounter errors**, upgrade problematic packages:
-
-```powershell
-pip install --upgrade cffi cryptography pydantic pydantic-core numpy pandas
+# Install all dependencies using uv
+uv sync --python 3.11
 ```

+**Note:** This project uses `uv` as the package manager with `pyproject.toml`. The `uv sync` command automatically installs all dependencies with proper version resolution.
+
 ### 4.4. Configure Environment Variables

-Create a `.env.dev` file (note the `.dev` suffix) in the `src/ContentProcessor/src/` directory:
+Create a `.env` file in the `src/ContentProcessor/src/` directory:

 ```bash
 cd src

-# Create .env.dev file
-New-Item .env.dev # Windows PowerShell
+# Create .env file
+New-Item .env # Windows PowerShell
 # or
-touch .env.dev # Linux/macOS
+touch .env # Linux/macOS
 ```

-Add the following to the `.env.dev` file:
+Add the following to the `.env` file:

 ```bash
-# App Configuration endpoint
+# App Configuration endpoint - ALL other settings are read from App Configuration
 APP_CONFIG_ENDPOINT=https://.azconfig.io

-# Cosmos DB endpoint
-AZURE_COSMOS_ENDPOINT=https://.documents.azure.com:443/
-AZURE_COSMOS_DATABASE=contentprocess
-
 # Local development settings
 APP_ENV=dev
 APP_AUTH_ENABLED=False
 AZURE_IDENTITY_EXCLUDE_MANAGED_IDENTITY_CREDENTIAL=True

-# Logging settings
+# Logging settings (optional)
 APP_LOGGING_LEVEL=INFO
 APP_LOGGING_ENABLE=True
 ```

-> ⚠️ **Important**: The `.env.dev` file must be located in `src/ContentProcessor/src/` directory, not in `src/ContentProcessor/` root. The application looks for the `.env.dev` file in the same directory as `main.py`.
+### 4.5. Update main.py to Use .env File
+
+The code currently uses `.env.dev` by default. Update it to use the standard `.env` file:
+
+1. Open `src/ContentProcessor/src/main.py`
+2. Find line 25 (inside the `__init__` method)
+3. Change:
+   ```python
+   env_file_path=os.path.join(os.path.dirname(__file__), ".env.dev"),
+   ```
+   to:
+   ```python
+   env_file_path=os.path.join(os.path.dirname(__file__), ".env"),
+   ```
+
+> ⚠️ **Important**:
+> - The `.env` file must be located in `src/ContentProcessor/src/` directory, not in `src/ContentProcessor/` root
+> - After making this change, the application will look for `.env` file in the same directory as `main.py`
+> - All Azure resource settings (Cosmos DB, Storage, AI endpoints) are automatically loaded from Azure App Configuration

 ### 4.5. Run the Processor

@@ -511,26 +513,17 @@ Once all services are running (as confirmed in Step 6), you can:

 #### Python Compilation Errors (Windows)

-If you see errors like "Microsoft Visual C++ 14.0 is required" or "error: metadata-generation-failed" when installing cffi, pydantic-core, or cryptography:
+If you see errors when installing dependencies, ensure you're using `uv sync` instead of `pip install`:

 ```powershell
-# Create temporary requirements excluding problematic packages
-Get-Content requirements.txt | Where-Object { $_ -notmatch "cffi==1.17.1|pydantic==2.11.7|pydantic-core==2.33.2" } | Out-File temp_requirements.txt -Encoding utf8
-
-# Install other dependencies first
-pip install -r temp_requirements.txt
-
-# Install problematic packages with newer precompiled versions
-pip install cffi==2.0.0 pydantic==2.12.5 pydantic-core==2.41.5
-
-# Upgrade typing-extensions if needed
-pip install --upgrade "typing-extensions>=4.14.1" "typing-inspection>=0.4.2"
+# Install uv if not already installed
+pip install uv

-# Clean up
-Remove-Item temp_requirements.txt
+# Use uv sync which handles dependencies better
+uv sync --python 3.11
 ```

-**Explanation:** Older versions of cffi (1.17.1) and pydantic-core (2.33.2) require compilation from source, which fails on Windows without Visual Studio build tools. Newer versions have precompiled wheels that install without compilation.
+**Explanation:** This project uses `uv` as the package manager with `pyproject.toml`. The `uv` tool provides better dependency resolution and automatically uses precompiled wheels when available, avoiding compilation issues on Windows.

 #### pydantic_core ImportError

@@ -607,7 +600,7 @@ AZURE_IDENTITY_EXCLUDE_MANAGED_IDENTITY_CREDENTIAL=True

 **Locations to check:**
 - `src/ContentProcessorAPI/app/.env`
-- `src/ContentProcessor/src/.env.dev` (note: must be `.env.dev` in the `src/` subdirectory, not `.env` in root)
+- `src/ContentProcessor/src/.env` (note: must be in the `src/` subdirectory)

 **Explanation:** Managed Identity is used in Azure deployments but doesn't work locally. Setting `APP_ENV=dev` switches to Azure CLI credential authentication.

@@ -641,7 +634,7 @@ If the frontend loads but shows "Unable to connect to the server" error:

 - Verify `.env` file is in the correct directory:
   - ContentProcessorAPI: `src/ContentProcessorAPI/app/.env`
-  - ContentProcessor: `src/ContentProcessor/src/.env.dev` (must be `.env.dev`, not `.env`)
+  - ContentProcessor: `src/ContentProcessor/src/.env` (must be in `src/` subdirectory)
   - ContentProcessorWeb: `src/ContentProcessorWeb/.env`
 - Check file permissions (especially on Linux/macOS)
 - Ensure no extra spaces in variable assignments

From 6180e544254cd09ed4868e3bd0920e7a2414d00e Mon Sep 17 00:00:00 2001
From: Venkateswarlu Marthula
Date: Wed, 17 Dec 2025 18:18:04 +0530
Subject: [PATCH 12/13] Remove LocalSetupGuide.md and create LocalDevelopmentSetup.md with comprehensive instructions for setting up the Content Processing Solution Accelerator for local development on Windows and Linux.
---
 docs/{LocalSetupGuide.md => LocalDevelopmentSetup.md} | 0
 1 file changed, 0 insertions(+), 0 deletions(-)
 rename docs/{LocalSetupGuide.md => LocalDevelopmentSetup.md} (100%)

diff --git a/docs/LocalSetupGuide.md b/docs/LocalDevelopmentSetup.md
similarity index 100%
rename from docs/LocalSetupGuide.md
rename to docs/LocalDevelopmentSetup.md

From faf05469337783d6037b3c82e2d2fb9257007e77 Mon Sep 17 00:00:00 2001
From: Venkateswarlu Marthula
Date: Thu, 18 Dec 2025 08:20:34 +0530
Subject: [PATCH 13/13] Update Local Development Setup Guide to reflect Python 3.12 installation instructions

---
 docs/LocalDevelopmentSetup.md | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/docs/LocalDevelopmentSetup.md b/docs/LocalDevelopmentSetup.md
index 6de6b748..afeca90e 100644
--- a/docs/LocalDevelopmentSetup.md
+++ b/docs/LocalDevelopmentSetup.md
@@ -72,15 +72,15 @@ When copying `.env` samples, always navigate to the specific service directory f
 ### Windows Development

 ```powershell
-# Install Python 3.11+ and Git
-winget install Python.Python.3.11
+# Install Python 3.12+ and Git
+winget install Python.Python.3.12
 winget install Git.Git

 # Install Node.js for frontend
 winget install OpenJS.NodeJS.LTS

 # Verify installations
-python --version # Should show Python 3.11.x
+python --version # Should show Python 3.12.x
 node --version # Should show v18.x or higher
 npm --version
 ```
@@ -91,10 +91,10 @@ npm --version

 ```bash
 # Install prerequisites
-sudo apt update && sudo apt install python3.11 python3.11-venv python3-pip git curl nodejs npm -y
+sudo apt update && sudo apt install python3.12 python3.12-venv python3-pip git curl nodejs npm -y

 # Verify installations
-python3.11 --version
+python3.12 --version
 node --version
 npm --version
 ```
@@ -243,7 +243,7 @@ Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
 pip install uv

 # Install all dependencies using uv
-uv sync --python 3.11
+uv sync --python 3.12
 ```

 **Note:** This project uses `uv` as the package manager with `pyproject.toml`. The `uv sync` command automatically installs all dependencies with proper version resolution.
@@ -352,7 +352,7 @@ source .venv/bin/activate # Linux/macOS
 pip install uv

 # Install all dependencies using uv
-uv sync --python 3.11
+uv sync --python 3.12
 ```

 **Note:** This project uses `uv` as the package manager with `pyproject.toml`. The `uv sync` command automatically installs all dependencies with proper version resolution.
@@ -406,7 +406,7 @@ The code currently uses `.env.dev` by default. Update it to use the standard `.e
 > - After making this change, the application will look for `.env` file in the same directory as `main.py`
 > - All Azure resource settings (Cosmos DB, Storage, AI endpoints) are automatically loaded from Azure App Configuration

-### 4.5. Run the Processor
+### 4.6. Run the Processor

 ```bash
 # Make sure you're in the src directory
@@ -520,7 +520,7 @@ If you see errors when installing dependencies, ensure you're using `uv sync` in
 pip install uv

 # Use uv sync which handles dependencies better
-uv sync --python 3.11
+uv sync --python 3.12
 ```

 **Explanation:** This project uses `uv` as the package manager with `pyproject.toml`. The `uv` tool provides better dependency resolution and automatically uses precompiled wheels when available, avoiding compilation issues on Windows.
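
Step 4.5 of the guide asks the reader to hand-edit `main.py` so that `env_file_path` points at `.env` rather than `.env.dev`. One possible alternative, sketched below under stated assumptions, is to resolve the path at startup by preferring `.env` and falling back to `.env.dev`. `resolve_env_file` is a hypothetical helper that does not exist in the repository, and how `main.py` consumes `env_file_path` beyond the single line quoted in the patch is not shown there.

```python
import os


def resolve_env_file(base_dir, candidates=(".env", ".env.dev")):
    """Return the first candidate file that exists in base_dir, else None.

    Hypothetical helper: the candidate order (.env first, .env.dev as a
    fallback) is an assumption based on the rename described in the guide.
    """
    for name in candidates:
        path = os.path.join(base_dir, name)
        if os.path.isfile(path):
            return path
    return None
```

In `main.py` the result could stand in for the hard-coded path, for example `env_file_path=resolve_env_file(os.path.dirname(__file__))`, so that existing `.env.dev` files keep working during the transition.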