(C# ASP.NET Core) Project on Vector Search using the Semantic Kernel in C#

Welcome to this tutorial, where we'll demystify the exciting world of Vector Search using the Semantic Kernel in C#. If you're looking to make your data smarter and more searchable, you've come to the right place. We'll walk through how to transform ordinary data into a powerful vector database, enabling semantic searches that go beyond keyword matching.
(Rev. 22-Nov-2025)


By Parveen

Source code of the project on GitHub:

» Source code here on GitHub

For other projects, please visit this page: Source code here

Getting Started: Your Essential Tools

Before we dive into the nitty-gritty, let's gather our prerequisites:

  1. Free Gemini API Keys: These are crucial for testing our semantic capabilities. You'll need them to interact with the AI models that power our vector embeddings. Click here to get your keys.

  2. Project Resources: All necessary links, including where to get your Gemini keys and the GitHub repository for the source code discussed, are conveniently provided in the tutorial's description. If you need a more visual guide to obtaining your Gemini keys, a video is also linked there.

Once you have your Gemini keys in hand and the project source code ready, we're all set to explore how this powerful system works.

Video Explanation

Please watch the following YouTube video:

Understanding the Magic: What is a Vector Database?

Imagine your company has a treasure trove of information – perhaps in the form of countless PDF documents, sprawling text files, or even a curated collection of quotes and their authors. The challenge? Making this data intelligently searchable, not just by exact keywords, but by *meaning*. This is where a vector database shines.

Think of it like this: traditional search is like looking for a specific word in a dictionary. Vector search is like asking the dictionary to find all words that are "similar in meaning" to your query, even if they don't share the exact same letters.

The process generally involves two key steps:

  1. Data Preparation: From Raw to Structured: Your raw data (PDFs, plain text) first needs to be transformed into a structured, organized format. For instance, if you have quotes, you might parse them into a JSON file, with fields like "quote," "author," and "tags." This makes your data manageable and machine-readable. Tools like PDF or text readers can assist in extracting and structuring this information, perhaps into pages of text or a database with specific columns.

  2. Creating the Vector Database: From Structured to Semantic: Once your data is structured, the real transformation begins. Each piece of structured data (e.g., a quote, a page of text) is converted into a numerical representation called a "vector" or "embedding." These vectors are essentially mathematical fingerprints that capture the semantic meaning and context of your data. Data points with similar meanings will have vectors that are numerically "close" to each other in a high-dimensional space.

    This vector-enabled data is then stored in a specialized database, often called a "vector database." When you perform a search, your query is converted into a vector using the same embedding model, and the database quickly finds the most semantically similar data vectors, giving you highly relevant results. This vector representation is foundational to the entire search mechanism.
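The idea of vectors being numerically "close" can be made concrete with cosine similarity, one of the standard ways to compare embeddings. Here is a minimal, self-contained sketch in plain C# (no Semantic Kernel required); the three-dimensional sample vectors are made up purely for illustration, as real embedding models emit hundreds of dimensions:

```csharp
using System;

class CosineDemo
{
    // Cosine similarity: 1.0 means identical direction, values near 0 mean
    // unrelated. Many vector databases instead report a *distance*
    // (e.g. 1 - similarity), where smaller means more similar.
    static double CosineSimilarity(float[] a, float[] b)
    {
        double dot = 0, magA = 0, magB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot  += a[i] * b[i];
            magA += a[i] * a[i];
            magB += b[i] * b[i];
        }
        return dot / (Math.Sqrt(magA) * Math.Sqrt(magB));
    }

    static void Main()
    {
        // Toy 3-dimensional "embeddings" for three concepts.
        float[] books   = { 0.9f, 0.1f, 0.2f };
        float[] reading = { 0.8f, 0.2f, 0.3f };
        float[] cooking = { 0.1f, 0.9f, 0.4f };

        Console.WriteLine($"books vs reading: {CosineSimilarity(books, reading):F3}");
        Console.WriteLine($"books vs cooking: {CosineSimilarity(books, cooking):F3}");
    }
}
```

Running this shows "books" scoring much closer to "reading" than to "cooking", which is exactly the property a vector database exploits at scale.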

Seeing is Believing: A Project Walkthrough [see the linked video]

Let's put theory into practice with a working project. When you launch the application, you'll be greeted by a page offering options to "Generate or Update" your vector database.

The One-Time Setup: The first time you run this, you'll need to create your vector database. The project includes a dedicated link, such as "Click here to generate vector database," which triggers this essential, one-time process. This step takes your prepared JSON file (containing quotes and authors in our example) and converts it into a searchable SQLite vector database file. This is like building the very index for your semantic library.
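The generation step just described can be sketched in code. This is an illustration, not the project's exact source: it assumes the preview Semantic Kernel SQLite connector and a Gemini embedding service are available, and the type, attribute, and method names (`Quote`, `QuoteRecord`, `CreateCollectionIfNotExistsAsync`, `GenerateEmbeddingAsync`, `UpsertAsync`) vary between Semantic Kernel preview releases:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.Json;
using System.Threading.Tasks;
using Microsoft.Extensions.VectorData;
using Microsoft.SemanticKernel.Connectors.Sqlite;
using Microsoft.SemanticKernel.Embeddings;

// Hypothetical vector-store record; attribute names differ across previews.
public class QuoteRecord
{
    [VectorStoreKey]         public string Key { get; set; } = "";
    [VectorStoreData]        public string Text { get; set; } = "";
    [VectorStoreData]        public string Author { get; set; } = "";
    [VectorStoreVector(768)] public ReadOnlyMemory<float> Embedding { get; set; }
}

// Hypothetical shape of one entry in quotes.json.
public class Quote
{
    public string QuoteText { get; set; } = "";
    public string Author { get; set; } = "";
}

public static class VectorDbBuilder
{
    public static async Task GenerateAsync(
        ITextEmbeddingGenerationService embeddings, // the Gemini embedding service
        SqliteVectorStore vectorStore)
    {
        // 1. Read the structured source data.
        var quotes = JsonSerializer.Deserialize<List<Quote>>(
            await File.ReadAllTextAsync("quotes.json"))!;

        // 2. Get (or create) the collection backing the SQLite file.
        var collection = vectorStore.GetCollection<string, QuoteRecord>("quotes");
        await collection.CreateCollectionIfNotExistsAsync();

        // 3. Embed each quote and upsert it into the vector database.
        foreach (var q in quotes)
        {
            await collection.UpsertAsync(new QuoteRecord
            {
                Key       = Guid.NewGuid().ToString(),
                Text      = q.QuoteText,
                Author    = q.Author,
                Embedding = await embeddings.GenerateEmbeddingAsync(q.QuoteText)
            });
        }
    }
}
```

Because embeddings are generated once and persisted, the expensive AI calls happen only during this setup step; searches afterwards only embed the short query.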

The Search Experience: Once your database is successfully generated, you can return to the main page. Here, you'll find a prompt: "Enter a few words of quote to get a matching quote by a famous person." For instance, if you type "books," the system will instantly query your newly created SQLite database and present you with the top three most relevant quotes.

Each returned quote isn't just a hit; it comes with a "score." This score acts as a relevance metric, essentially a distance: the smaller the score, the more semantically similar (or "closer") the result is to your query. In our experience, a score below 0.4 indicates a very strong match, a good rule of thumb for relevance.
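That threshold can be applied directly when processing results. A small, self-contained illustration; the quote texts and scores here are placeholders, not actual output from the project:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class ScoreFilterDemo
{
    static void Main()
    {
        // Hypothetical (text, score) results; a smaller score = a closer match.
        var results = new List<(string Text, double Score)>
        {
            ("So many books, so little time.",                      0.31),
            ("A room without books is like a body without a soul.", 0.37),
            ("Success is not final, failure is not fatal.",         0.55),
        };

        // The rule of thumb from above: scores below 0.4 are strong matches.
        foreach (var r in results.Where(r => r.Score < 0.4))
            Console.WriteLine($"{r.Score:F2}  {r.Text}");
    }
}
```

Only the first two quotes survive the filter; the third, while still returned in the top results, would be treated as a weak match.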

Under the Hood: A Look at the C# Source Code

Now, let's lift the lid and examine the C# code that orchestrates this magic.

  1. Data Source: `quotes.json`
    This file serves as our raw, structured data. It's a JSON array where each object contains a "quote," an "author," and "tags." This is the foundation from which our vector database is built.

  2. Models Folder: Defining Data Structures
    Inside the `Models` folder, you'll find several crucial definitions:

    • `Quotes` Model: This C# model maps directly to the structure of your `quotes.json` file (e.g., properties for `Quote`, `Author`, `Tags`). It acts as an intermediate representation to easily read and work with your JSON data within the application.

    • `VectorStore` Model: This defines the structure of your vector database entries. It typically includes fields like `VectorStoreKey` (a unique identifier), `VectorStoreData` (the original text or metadata), and `VectorDefinitionEmbeddings` (the actual numerical vector representation generated by the AI model).

  3. Helper Functions and Database Setup
    The project incorporates helper functions and classes to manage the SQLite database. These include:

    • Database Path and Connection String: Specifies where your SQLite file (e.g., `vectors.db`) is stored and how the application connects to it.

    • Vector Database Creation: A key helper function is responsible for taking your structured data (from the `Quotes` model, sourced from `quotes.json`) and generating the embeddings, then populating the SQLite file with these `VectorStore` entries. This is the "generation" step we discussed earlier.

  4. User Interface: `index.cshtml`
    This is the front-end page where users interact with the application. It contains the input field for search queries and uses a `foreach` loop to dynamically display the matching quotes and their scores returned from the backend.

  5. Backend Logic: Searching the Vector Database
    The corresponding backing file (often `index.cshtml.cs` or a controller action) handles the actual search operation. Here, a concise piece of code utilizes the Semantic Kernel's capabilities: a method like `SearchAsync` is invoked, passing the user's query and specifying parameters such as `SearchVectorTop(3)` to retrieve the top three most relevant results. These results are then collected and passed to the front-end for display.

  6. Service Registration: `Program.cs`
    In `Program.cs`, the application's services are configured. This is where you register the AI services needed for semantic search, specifically:

    • `AddGoogleAIGemini()`: Integrates the Google Gemini AI model.

    • `AddKernel()`: Adds the Semantic Kernel itself.

    • Gemini Embedding Service: Crucially, the same Gemini API key is used here to initialize the embedding service, which is responsible for converting text into those powerful numerical vectors.

    The design intentionally keeps the setup straightforward, making the code easy to follow and understand.
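Putting the pieces above together, the service registration in `Program.cs` and the top-three search in the page model might look like the following outline. The connector method names (`AddGoogleAIGeminiChatCompletion`, `AddGoogleAIEmbeddingGeneration`, `AddSqliteVectorStore`) and the `SearchAsync` signature come from preview Semantic Kernel packages and may differ in the version the project targets; treat this as a sketch, not the project's exact code:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc.RazorPages;
using Microsoft.Extensions.VectorData;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.Sqlite;

var builder = WebApplication.CreateBuilder(args);
var geminiKey = builder.Configuration["Gemini:ApiKey"]!; // your free Gemini key

builder.Services.AddRazorPages();
builder.Services.AddKernel();                        // the Semantic Kernel itself
builder.Services.AddGoogleAIGeminiChatCompletion(    // the Gemini AI model
    modelId: "gemini-1.5-flash", apiKey: geminiKey);
builder.Services.AddGoogleAIEmbeddingGeneration(     // same key powers the embeddings
    modelId: "text-embedding-004", apiKey: geminiKey);
builder.Services.AddSqliteVectorStore("Data Source=vectors.db");

var app = builder.Build();
app.MapRazorPages();
app.Run();

// --- index.cshtml.cs: the backing page model that performs the search ---

// Hypothetical vector-store record, mirroring the VectorStore model
// from the Models folder (key, quote text, author, embedding vector).
public class QuoteRecord
{
    [VectorStoreKey]         public string Key { get; set; } = "";
    [VectorStoreData]        public string Text { get; set; } = "";
    [VectorStoreData]        public string Author { get; set; } = "";
    [VectorStoreVector(768)] public ReadOnlyMemory<float> Embedding { get; set; }
}

public class IndexModel : PageModel
{
    private readonly SqliteVectorStore _vectorStore;
    public List<(string Text, double? Score)> Matches { get; } = new();

    public IndexModel(SqliteVectorStore vectorStore) => _vectorStore = vectorStore;

    public async Task OnPostAsync(string query)
    {
        var collection = _vectorStore.GetCollection<string, QuoteRecord>("quotes");

        // SearchAsync embeds the query text and streams back the closest
        // records; top: 3 matches the "top three" behaviour described above.
        await foreach (var result in collection.SearchAsync(query, top: 3))
            Matches.Add((result.Record.Text, result.Score));
    }
}
```

The `Matches` list is what the `foreach` loop in `index.cshtml` iterates over to render each quote alongside its score.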

Wrapping Up

You've just walked through the fundamental concepts and practical implementation of vector search using the Semantic Kernel in C#. From preparing your raw data and transforming it into semantic vectors in a SQLite database, to performing intelligent searches and understanding the relevance scores, you now have a solid foundation. This capability opens doors to building highly intuitive and context-aware applications, making your data not just accessible, but truly intelligent.

Feel free to explore the provided source code, experiment with different data sets, and customize the search logic to fit your specific needs. The world of semantic search is vast, and you've just taken your first powerful step!


This Blog Post/Article "(C# ASP.NET Core) Project on Vector Search using the Semantic Kernel in C#" by Parveen is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.