image.png

  1. Deploy the Search Service

    1. Go to Deploy a Custom Template in the Azure Portal

    2. Click "Build your own template in the editor"

    3. Click "Load file" and upload the JSON file below, using these parameters:

      - Customer Name: the name used to deploy your portal (found on the update page of your AI Portal)
      - Storage Name: the name you'd like to give your default storage for RAG

      ragSearch.json

    4. Click Review + Create and then Create

  2. Get key for Embeddings model

    1. Head to Microsoft Foundry in the Azure Portal

      image.png

    2. Under Use with Foundry → Foundry, click the Foundry project created for embeddings (named {name}-uf-aiportal-embeddings)

    3. Click on the button that says Go To Foundry Portal

      image.png

    4. Once in the Foundry portal, under My Assets → Models + endpoints, find the text-embedding-3-large deployment and copy the key; you'll use it in a later step. Also copy the endpoint field (it should look something like https://{NAME}-aiportal-embeddings.cognitiveservices.azure.com/).

    image.png
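Before moving on, it can help to confirm that the copied key and endpoint work together. The sketch below only builds the request; it does not send it. It assumes the endpoint follows the usual Azure OpenAI REST layout (`/openai/deployments/{deployment}/embeddings`) and that API version `2024-02-01` is acceptable for your resource; both are assumptions, not something this guide specifies. The function name and the `example` endpoint are illustrative.

```python
import json


def build_embeddings_request(endpoint: str, api_key: str,
                             deployment: str = "text-embedding-3-large",
                             api_version: str = "2024-02-01"):
    """Build (url, headers, body) for a one-off embeddings test call.

    api_version is an assumption; use whichever version your resource supports.
    """
    url = (f"{endpoint.rstrip('/')}/openai/deployments/{deployment}"
           f"/embeddings?api-version={api_version}")
    headers = {"api-key": api_key, "Content-Type": "application/json"}
    body = json.dumps({"input": "hello world"})
    return url, headers, body


# Hypothetical values: substitute the endpoint and key copied above.
url, headers, body = build_embeddings_request(
    "https://example-aiportal-embeddings.cognitiveservices.azure.com/",
    "<key copied from the Foundry portal>",
)
print(url)
```

Sending this as a POST with any HTTP client should return a JSON payload containing an embedding vector if the key and endpoint match.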

  3. Add Data Source

    1. Back in Microsoft Foundry in the Azure Portal, go to Use with Foundry → AI Search. Click on the search named {name}-rag-aiportal

      image.png

    2. In Search Management → Data Sources, click Add a Data Source

      image.png

    3. Fill in the data source as follows, then click Create:

      - Name: any identifying name (here I chose rag-datasource)
      - Subscription: the subscription of your AI Portal
      - Storage Account: {name}aiportal
      - Blob Container: the Storage Name you entered in step 1

      image.png

    4. Once created, click into the datasource, click edit, and turn on Deletion Tracking:

    image.png

  4. Add Index

    1. In Search Management → Indexes click Add Index (JSON)

    2. Paste this JSON into the editor and click Create (feel free to change the name docsindex to better match your storage):

      {
        "name": "docsindex",
        "purviewEnabled": false,
        "fields": [
          {
            "name": "id",
            "type": "Edm.String",
            "searchable": true,
            "filterable": true,
            "retrievable": true,
            "stored": true,
            "sortable": true,
            "facetable": true,
            "key": true,
            "analyzer": "keyword",
            "synonymMaps": []
          },
          {
            "name": "content",
            "type": "Edm.String",
            "searchable": true,
            "filterable": false,
            "retrievable": true,
            "stored": true,
            "sortable": false,
            "facetable": false,
            "key": false,
            "synonymMaps": []
          },
          {
            "name": "allowedUsers",
            "type": "Collection(Edm.String)",
            "searchable": true,
            "filterable": true,
            "retrievable": true,
            "stored": true,
            "sortable": false,
            "facetable": true,
            "key": false,
            "synonymMaps": []
          },
          {
            "name": "sourceDoc",
            "type": "Edm.String",
            "searchable": true,
            "filterable": true,
            "retrievable": true,
            "stored": true,
            "sortable": true,
            "facetable": true,
            "key": false,
            "synonymMaps": []
          },
          {
            "name": "contentVector",
            "type": "Collection(Edm.Single)",
            "searchable": true,
            "filterable": false,
            "retrievable": false,
            "stored": true,
            "sortable": false,
            "facetable": false,
            "key": false,
            "dimensions": 3072,
            "vectorSearchProfile": "default-profile",
            "synonymMaps": []
          },
          {
            "name": "parentId",
            "type": "Edm.String",
            "searchable": false,
            "filterable": true,
            "retrievable": true,
            "stored": true,
            "sortable": false,
            "facetable": false,
            "key": false,
            "synonymMaps": []
          }
        ],
        "scoringProfiles": [],
        "suggesters": [],
        "analyzers": [],
        "normalizers": [],
        "tokenizers": [],
        "tokenFilters": [],
        "charFilters": [],
        "similarity": {
          "@odata.type": "#Microsoft.Azure.Search.BM25Similarity"
        },
        "semantic": {
          "configurations": [
            {
              "name": "default",
              "flightingOptIn": false,
              "rankingOrder": "BoostedRerankerScore",
              "prioritizedFields": {
                "titleField": {
                  "fieldName": "id"
                },
                "prioritizedContentFields": [
                  {
                    "fieldName": "content"
                  }
                ],
                "prioritizedKeywordsFields": []
              }
            }
          ]
        },
        "vectorSearch": {
          "algorithms": [
            {
              "name": "default-hnsw",
              "kind": "hnsw",
              "hnswParameters": {
                "metric": "cosine",
                "m": 4,
                "efConstruction": 400,
                "efSearch": 500
              }
            }
          ],
          "profiles": [
            {
              "name": "default-profile",
              "algorithm": "default-hnsw"
            }
          ],
          "vectorizers": [],
          "compressions": []
        }
      }
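
If you edit the index JSON, the vector settings are the easiest part to break: the vector field must name a profile that exists, the profile must name an algorithm that exists, and the field's dimensions must match the embedding model (3072 here). The following is a minimal sketch of such a check, run against a trimmed-down copy of the index; the function name is illustrative and not part of any Azure SDK.

```python
def check_vector_config(index: dict, expected_dimensions: int) -> list[str]:
    """Return a list of problems found in the index's vector settings."""
    problems = []
    vector_search = index.get("vectorSearch", {})
    algorithms = {a["name"] for a in vector_search.get("algorithms", [])}
    profiles = {p["name"]: p["algorithm"] for p in vector_search.get("profiles", [])}

    for field in index.get("fields", []):
        profile = field.get("vectorSearchProfile")
        if profile is None:
            continue  # not a vector field
        if profile not in profiles:
            problems.append(f"{field['name']}: unknown profile '{profile}'")
        elif profiles[profile] not in algorithms:
            problems.append(f"{field['name']}: profile '{profile}' names a missing algorithm")
        if field.get("dimensions") != expected_dimensions:
            problems.append(
                f"{field['name']}: dimensions {field.get('dimensions')} != {expected_dimensions}"
            )
    return problems


# Trimmed-down copy of the index above, keeping only the parts being checked.
index = {
    "name": "docsindex",
    "fields": [
        {"name": "id", "type": "Edm.String", "key": True},
        {"name": "contentVector", "type": "Collection(Edm.Single)",
         "dimensions": 3072, "vectorSearchProfile": "default-profile"},
    ],
    "vectorSearch": {
        "algorithms": [{"name": "default-hnsw", "kind": "hnsw"}],
        "profiles": [{"name": "default-profile", "algorithm": "default-hnsw"}],
    },
}

print(check_vector_config(index, expected_dimensions=3072))  # prints []
```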
      
  5. Create a Skillset

    1. In Search Management → Skillsets, click Add a Skillset. Paste the JSON below, replace these placeholders, then click Create:

      - INSERT_NAME_HERE: any name that identifies your skillset (I'd suggest {index}-skillset, so mine would be docsindex-skillset)
      - INSERT_EMBEDDINGS_ENDPOINT_HERE: the endpoint copied from the embeddings provider in step 2
      - INSERT_YOUR_API_KEY_HERE: the API key copied from the embeddings provider in step 2
      - INSERT_INDEX_NAME_HERE: the index name from step 4

      {
        "name": "INSERT_NAME_HERE",
        "description": "Splits document text into token-based chunks and generates embeddings.",
        "skills": [
          {
            "@odata.type": "#Microsoft.Skills.Util.DocumentExtractionSkill",
            "name": "extractText",
            "description": "Extract text from PDFs / Office docs / etc.",
            "context": "/document",
            "parsingMode": "default",
            "dataToExtract": "contentAndMetadata",
            "inputs": [
              {
                "name": "file_data",
                "source": "/document/file_data",
                "inputs": []
              }
            ],
            "outputs": [
              {
                "name": "content",
                "targetName": "content"
              }
            ],
            "configuration": {}
          },
          {
            "@odata.type": "#Microsoft.Skills.Text.SplitSkill",
            "name": "SplitSkill",
            "description": "Split document text into pages based on Azure OpenAI tokens.",
            "context": "/document",
            "defaultLanguageCode": "en",
            "textSplitMode": "pages",
            "maximumPageLength": 1400,
            "pageOverlapLength": 150,
            "maximumPagesToTake": 0,
            "unit": "azureOpenAITokens",
            "inputs": [
              {
                "name": "text",
                "source": "/document/content",
                "inputs": []
              },
              {
                "name": "languageCode",
                "source": "/document/language",
                "inputs": []
              }
            ],
            "outputs": [
              {
                "name": "textItems",
                "targetName": "pages"
              }
            ],
            "azureOpenAITokenizerParameters": {
              "encoderModelName": "cl100k_base",
              "allowedSpecialTokens": [
                "[START]",
                "[END]"
              ]
            }
          },
          {
            "@odata.type": "#Microsoft.Skills.Text.AzureOpenAIEmbeddingSkill",
            "name": "EmbeddingSkill",
            "description": "Generate embeddings for each page chunk.",
            "context": "/document/pages/*",
            "resourceUri": "INSERT_EMBEDDINGS_ENDPOINT_HERE",
            "apiKey": "INSERT_YOUR_API_KEY_HERE",
            "deploymentId": "text-embedding-3-large",
            "dimensions": 3072,
            "modelName": "text-embedding-3-large",
            "inputs": [
              {
                "name": "text",
                "source": "/document/pages/*",
                "inputs": []
              }
            ],
            "outputs": [
              {
                "name": "embedding",
                "targetName": "contentVector"
              }
            ]
          }
        ],
        "indexProjections": {
          "selectors": [
            {
              "targetIndexName": "INSERT_INDEX_NAME_HERE",
              "parentKeyFieldName": "parentId",
              "sourceContext": "/document/pages/*",
              "mappings": [
                {
                  "name": "content",
                  "source": "/document/pages/*",
                  "inputs": []
                },
                {
                  "name": "contentVector",
                  "source": "/document/pages/*/contentVector",
                  "inputs": []
                },
                {
                  "name": "sourceDoc",
                  "source": "/document/metadata_storage_name",
                  "inputs": []
                }
              ]
            }
          ],
          "parameters": {
            "projectionMode": "skipIndexingParentDocuments"
          }
        }
      }
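
Replacing the placeholders by hand is error-prone, so here is a minimal sketch of doing it with a script. It substitutes every `INSERT_..._HERE` token, refuses to continue if a value is missing, and confirms the result still parses as JSON. The helper name, the shortened template, and the sample values are illustrative only.

```python
import json
import re

# Matches the placeholder style used in this guide, e.g. INSERT_NAME_HERE.
PLACEHOLDER = re.compile(r"INSERT_[A-Z_]+_HERE")


def fill_placeholders(template: str, values: dict[str, str]) -> str:
    """Replace every placeholder in the template; fail loudly if one has no value."""
    def substitute(match: re.Match) -> str:
        key = match.group(0)
        if key not in values:
            raise KeyError(f"no value supplied for {key}")
        return values[key]

    filled = PLACEHOLDER.sub(substitute, template)
    json.loads(filled)  # raises ValueError if the result is not valid JSON
    return filled


# Shortened stand-in for the skillset JSON above.
template = """{
  "name": "INSERT_NAME_HERE",
  "skills": [
    {"resourceUri": "INSERT_EMBEDDINGS_ENDPOINT_HERE", "apiKey": "INSERT_YOUR_API_KEY_HERE"}
  ],
  "indexProjections": {"selectors": [{"targetIndexName": "INSERT_INDEX_NAME_HERE"}]}
}"""

values = {
    "INSERT_NAME_HERE": "docsindex-skillset",
    "INSERT_EMBEDDINGS_ENDPOINT_HERE": "https://example-aiportal-embeddings.cognitiveservices.azure.com/",
    "INSERT_YOUR_API_KEY_HERE": "<key from step 2>",
    "INSERT_INDEX_NAME_HERE": "docsindex",
}

skillset = json.loads(fill_placeholders(template, values))
print(skillset["name"])
```

Since the filled-in JSON contains the API key, avoid committing it to source control.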
      
  6. Add Indexer

    1. In Search Management → Indexers, click Add Indexer (JSON). Copy and paste the JSON below and fill in these fields:

      - INSERT_NAME_HERE: a name for your indexer (I'd suggest a name ending in -indexer; here it would be docsindex-indexer)
      - INSERT_DATASOURCE_NAME_HERE: the data source name from step 3
      - INSERT_SKILLSET_NAME_HERE: the skillset name from step 5
      - INSERT_INDEX_NAME_HERE: the index name from step 4

    {
      "name": "INSERT_NAME_HERE",
      "description": null,
      "dataSourceName": "INSERT_DATASOURCE_NAME_HERE",
      "skillsetName": "INSERT_SKILLSET_NAME_HERE",
      "targetIndexName": "INSERT_INDEX_NAME_HERE",
      "disabled": null,
      "schedule": {
        "interval": "PT5M",
        "startTime": "2025-12-02T18:51:24.38Z"
      },
      "parameters": {
        "batchSize": null,
        "maxFailedItems": null,
        "maxFailedItemsPerBatch": null,
        "configuration": {
          "dataToExtract": "storageMetadata",
          "parsingMode": "default",
          "allowSkillsetToReadFileData": true,
          "failOnUnsupportedContentType": false,
          "failOnUnprocessableDocument": false
        }
      },
      "fieldMappings": [
        {
          "sourceFieldName": "metadata_storage_path",
          "targetFieldName": "id",
          "mappingFunction": {
            "name": "base64Encode",
            "parameters": null
          }
        },
        {
          "sourceFieldName": "metadata_storage_name",
          "targetFieldName": "sourceDoc",
          "mappingFunction": null
        }
      ],
      "outputFieldMappings": [
        {
          "sourceFieldName": "/document/pages/*",
          "targetFieldName": "content",
          "mappingFunction": null
        },
        {
          "sourceFieldName": "/document/pages/*/contentVector",
          "targetFieldName": "contentVector",
          "mappingFunction": null
        },
        {
          "sourceFieldName": "/document/sourceDoc",
          "targetFieldName": "sourceDoc",
          "mappingFunction": null
        }
      ],
      "cache": null,
      "encryptionKey": null
    }
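
The indexer ties together three resources by name, so each name must match exactly what you created earlier. As a rough sketch, the check below compares the three reference fields in the indexer JSON with the names you expect; the function name and sample names are illustrative.

```python
def check_indexer_references(indexer: dict, data_source_name: str,
                             skillset_name: str, index_name: str) -> list[str]:
    """Return a message for each indexer reference that does not match the expected name."""
    expected = {
        "dataSourceName": data_source_name,
        "skillsetName": skillset_name,
        "targetIndexName": index_name,
    }
    return [
        f"{key}: expected '{want}', found '{indexer.get(key)}'"
        for key, want in expected.items()
        if indexer.get(key) != want
    ]


# Only the reference fields of the indexer JSON above, with the example names from this guide.
indexer = {
    "name": "docsindex-indexer",
    "dataSourceName": "rag-datasource",
    "skillsetName": "docsindex-skillset",
    "targetIndexName": "docsindex",
}

problems = check_indexer_references(
    indexer,
    data_source_name="rag-datasource",
    skillset_name="docsindex-skillset",
    index_name="docsindex",
)
print(problems)  # prints []
```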
    
  7. In Search Service → Overview, Copy down the Search Endpoint listed under URL

    image.png

  8. In AI Portal, go to Admin Panel → RAG Config and add a new config:

    - Name: the user-friendly name for your RAG in AI Portal
    - Document Return Count: the number of documents returned from the RAG model
    - Embedding Deployment Endpoint: the endpoint for the embeddings deployment (it should look something like https://{NAME}-aiportal-embeddings.cognitiveservices.azure.com/)
    - Embedding Deployment: text-embedding-3-large
    - Search Endpoint: the endpoint copied in step 7
    - Search Index: the name you gave your index in step 4

    image.png

  9. In your AI Portal, under Admin → Provider Config, attach the RAG Search to an existing provider or create a new provider (e.g. copy the GPT-5.4 provider and, under RAG Search, select your new search)

Your RAG provider is now ready to use!

Notes:

  1. To upload documents: in the Azure Portal, go to Storage Center → {NAME}aiportal. From there, go to Data Storage → Containers and locate the container you named when deploying the custom template. This container is where you upload documents. The indexer scans for new documents every 5 minutes.

image.png
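
The 5-minute scan comes from the `"interval": "PT5M"` value in the indexer's schedule, which is an ISO 8601 duration. If you change it, the sketch below shows how such a value translates into a wait time. It handles only the hours/minutes/seconds form (such as `PT5M` or `PT1H30M`), not day-based values, and the function name is illustrative.

```python
import re
from datetime import timedelta

# Time-only ISO 8601 durations: PT, then optional hours, minutes, seconds.
_DURATION = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?$")


def parse_interval(value: str) -> timedelta:
    """Convert a time-only ISO 8601 duration such as 'PT5M' into a timedelta."""
    match = _DURATION.match(value)
    if not match or not any(match.groups()):
        raise ValueError(f"unsupported interval: {value!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


interval = parse_interval("PT5M")
print(interval.total_seconds())  # prints 300.0
```

A newly uploaded document may therefore wait up to one full interval before the indexer picks it up, plus the time the run itself takes.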