Move Microsoft.Extensions.VectorData.Abstractions over from Semantic Kernel#7434
Move Microsoft.Extensions.VectorData.Abstractions over from Semantic Kernel#7434roji wants to merge 7 commits intodotnet:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR ports Microsoft.Extensions.VectorData.Abstractions (and related conformance + unit tests) from the Semantic Kernel repo into dotnet/extensions, including build/packaging wiring for dependencies and test infrastructure.
Changes:
- Adds the
Microsoft.Extensions.VectorData.Abstractionslibrary source (vector store + vector search abstractions, provider services, and utilities). - Introduces a new
Microsoft.Extensions.VectorData.ConformanceTeststest project and supporting fixtures to validate provider behavior. - Updates repository package-version props to include required centralized versions (e.g.,
Microsoft.Extensions.AI.Abstractions,xunit).
Reviewed changes
Copilot reviewed 78 out of 78 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| test/Libraries/Microsoft.Extensions.VectorData.ConformanceTests/TypeTests/KeyTypeTests.cs | Adds conformance tests for supported key types and auto-generation behavior. |
| test/Libraries/Microsoft.Extensions.VectorData.ConformanceTests/TypeTests/EmbeddingTypeTests.cs | Adds conformance tests for supported embedding/vector CLR types and embedding generation scenarios. |
| test/Libraries/Microsoft.Extensions.VectorData.ConformanceTests/TestSuiteImplementationTests.cs | Ensures provider test suites implement all base test classes (or explicitly ignore them). |
| test/Libraries/Microsoft.Extensions.VectorData.ConformanceTests/Support/VectorStoreFixture.cs | Adds base fixture for vector store lifecycle + key generation. |
| test/Libraries/Microsoft.Extensions.VectorData.ConformanceTests/Support/VectorStoreCollectionFixtureBase.cs | Adds base fixture for collection setup + seeding. |
| test/Libraries/Microsoft.Extensions.VectorData.ConformanceTests/Support/VectorStoreCollectionFixture.cs | Adds reseeding logic for typed collections. |
| test/Libraries/Microsoft.Extensions.VectorData.ConformanceTests/Support/TestStore.cs | Adds provider-overrideable store abstraction + wait-for-indexing helper. |
| test/Libraries/Microsoft.Extensions.VectorData.ConformanceTests/Support/TestRecord.cs | Adds shared record base with [VectorStoreKey]. |
| test/Libraries/Microsoft.Extensions.VectorData.ConformanceTests/Support/DynamicVectorStoreCollectionFixture.cs | Adds reseeding support for dynamic-mapped collections. |
| test/Libraries/Microsoft.Extensions.VectorData.ConformanceTests/ModelTests/NoVectorModelTests.cs | Adds model tests for providers supporting models without vectors. |
| test/Libraries/Microsoft.Extensions.VectorData.ConformanceTests/ModelTests/NoDataModelTests.cs | Adds model tests for key+vector-only records. |
| test/Libraries/Microsoft.Extensions.VectorData.ConformanceTests/ModelTests/MultiVectorModelTests.cs | Adds tests for multi-vector models and required vector-property selection. |
| test/Libraries/Microsoft.Extensions.VectorData.ConformanceTests/ModelTests/DynamicModelTests.cs | Adds dynamic-mapping CRUD/search tests. |
| test/Libraries/Microsoft.Extensions.VectorData.ConformanceTests/Microsoft.Extensions.VectorData.ConformanceTests.csproj | Introduces the conformance test project and references required packages/projects. |
| test/Libraries/Microsoft.Extensions.VectorData.ConformanceTests/IndexKindTests.cs | Adds conformance coverage for index-kind support. |
| test/Libraries/Microsoft.Extensions.VectorData.ConformanceTests/HybridSearchTests.cs | Adds hybrid search conformance tests and fixtures. |
| test/Libraries/Microsoft.Extensions.VectorData.ConformanceTests/DistanceFunctionTests.cs | Adds conformance tests for distance function behavior and score threshold support. |
| test/Libraries/Microsoft.Extensions.VectorData.ConformanceTests/DependencyInjectionTests.cs | Adds DI registration conformance tests for stores/collections and embedding generator resolution. |
| test/Libraries/Microsoft.Extensions.VectorData.ConformanceTests/CollectionManagementTests.cs | Adds conformance tests around collection management and metadata service exposure. |
| test/Libraries/Microsoft.Extensions.VectorData.Abstractions.Tests/PropertyModelTests.cs | Adds unit tests for nullability detection behavior in provider property models. |
| test/Libraries/Microsoft.Extensions.VectorData.Abstractions.Tests/Microsoft.Extensions.VectorData.Abstractions.Tests.csproj | Adds a unit test project for the abstractions library. |
| test/.editorconfig | Adds a C# preference for static anonymous functions (duplicated entry currently). |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/VectorStorage/VectorStoreMetadata.cs | Adds vector store metadata shape for GetService scenarios. |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/VectorStorage/VectorStoreException.cs | Adds common exception type with store/collection/operation metadata. |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/VectorStorage/VectorStoreCollectionOptions.cs | Adds common options base for collections (definition + embedding generator). |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/VectorStorage/VectorStoreCollectionMetadata.cs | Adds collection metadata shape for GetService scenarios. |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/VectorStorage/VectorStoreCollection.cs | Adds core collection abstraction for CRUD + search operations. |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/VectorStorage/VectorStore.cs | Adds core store abstraction for collections + management APIs. |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/VectorSearch/VectorSearchResult.cs | Adds search-result record+score container. |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/VectorSearch/VectorSearchOptions.cs | Adds vector-search options (filtering, skip, vector selection, score threshold, include vectors). |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/VectorSearch/IVectorSearchable.cs | Adds searchable interface for vector similarity search. |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/VectorSearch/IKeywordHybridSearchable.cs | Adds hybrid-search interface (dense vector + keyword search). |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/VectorSearch/HybridSearchOptions.cs | Adds hybrid-search options (filtering, skip, property selection, include vectors, score threshold). |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/Utilities/UnreachableException.cs | Adds UnreachableException polyfill for older TFMs. |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/RecordOptions/RecordRetrievalOptions.cs | Adds options for single/batch record retrieval. |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/RecordOptions/FilteredRecordRetrievalOptions.cs | Adds options for filtered Get (skip/order/include vectors) and order-by builder types. |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/RecordDefinition/VectorStoreVectorProperty{TInput}.cs | Adds generic vector property definition for custom embedding-generation input types. |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/RecordDefinition/VectorStoreVectorProperty.cs | Adds vector property definition (dimensions, index kind, distance function, generator, embedding type). |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/RecordDefinition/VectorStoreProperty.cs | Adds base definition for record properties with provider annotations. |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/RecordDefinition/VectorStoreKeyProperty.cs | Adds key property definition (type + auto-generation). |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/RecordDefinition/VectorStoreDataProperty.cs | Adds data property definition (indexed/full-text indexed). |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/RecordDefinition/VectorStoreCollectionDefinition.cs | Adds collection schema definition (properties + default generator). |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/RecordDefinition/IndexKind.cs | Adds well-known index-kind constants. |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/RecordDefinition/DistanceFunction.cs | Adds well-known distance-function constants. |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/RecordAttributes/VectorStoreVectorAttribute.cs | Adds attribute for vector properties (dimensions/index/distance/storage name). |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/RecordAttributes/VectorStoreKeyAttribute.cs | Adds attribute for key properties (storage name + auto-generation). |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/RecordAttributes/VectorStoreDataAttribute.cs | Adds attribute for data properties (index/full-text index/storage name). |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/README.md | Adds package README and links to conceptual docs. |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/ProviderServices/VectorPropertyModel{TInput}.cs | Adds provider model support for embedding generation with custom input type. |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/ProviderServices/VectorPropertyModel.cs | Adds provider model support for vector properties, embedding dispatch, and resolution. |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/ProviderServices/VectorDataStrings.cs | Adds shared provider error-string helpers. |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/ProviderServices/PropertyModel.cs | Adds base provider property model with POCO/dynamic accessors and nullability detection. |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/ProviderServices/KeyPropertyModel.cs | Adds provider key-property model. |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/ProviderServices/IRecordCreator.cs | Adds internal record factory interface for provider implementations. |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/ProviderServices/Filter/QueryParameterExpression.cs | Adds expression node for parameterized captured values during filter translation. |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/ProviderServices/Filter/FilterTranslatorBase.cs | Adds base for filter translators, including preprocessing and Contains-pattern support. |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/ProviderServices/Filter/FilterPreprocessingOptions.cs | Adds options for preprocessing filter expression trees. |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/ProviderServices/EmbeddingGenerationDispatcher{TEmbedding}.cs | Adds typed dispatcher for embedding generation and embedding type resolution. |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/ProviderServices/EmbeddingGenerationDispatcher.cs | Adds base dispatcher abstraction/factory for embedding generation. |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/ProviderServices/DataPropertyModel.cs | Adds provider data-property model. |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/ProviderServices/CollectionModelBuildingOptions.cs | Adds model-building feature flags (multi-vector, serializer behavior, reserved key name). |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/ProviderServices/CollectionModel.cs | Adds built model container and helpers for property selection/validation. |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/ProviderServices/CollectionJsonModelBuilder.cs | Adds JSON-specific model-building customization hooks. |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/Microsoft.Extensions.VectorData.Abstractions.csproj | Introduces the new library project and its TFM/dependency configuration. |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/FilterClauses/FilterClause.cs | Adds obsolete filter clause base type (compat layer). |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/FilterClauses/EqualToFilterClause.cs | Adds obsolete equality filter clause type (compat layer). |
| src/Libraries/Microsoft.Extensions.VectorData.Abstractions/FilterClauses/AnyTagEqualToFilterClause.cs | Adds obsolete tag-contains filter clause type (compat layer). |
| eng/packages/Tests.props | Adds centralized xunit package version entry. |
| eng/packages/General-net9.props | Adds centralized Microsoft.Extensions.AI.Abstractions version entry for net9. |
| eng/packages/General-net10.props | Adds centralized Microsoft.Extensions.AI.Abstractions version entry for net10. |
| eng/packages/General-LTS.props | Adds centralized Microsoft.Extensions.AI.Abstractions version entry for LTS. |
| eng/Versions.props | Adds Microsoft.Extensions.AI.Abstractions version properties (net9/net10/LTS). |
src/Libraries/Microsoft.Extensions.VectorData.Abstractions/VectorStorage/VectorStore.cs
Outdated
Show resolved
Hide resolved
src/Libraries/Microsoft.Extensions.VectorData.Abstractions/README.md
Outdated
Show resolved
Hide resolved
src/Libraries/Microsoft.Extensions.VectorData.Abstractions/VectorSearch/HybridSearchOptions.cs
Outdated
Show resolved
Hide resolved
...rosoft.Extensions.VectorData.ConformanceTests/Support/DynamicVectorStoreCollectionFixture.cs
Show resolved
Hide resolved
src/Libraries/Microsoft.Extensions.VectorData.Abstractions/README.md
Outdated
Show resolved
Hide resolved
...Microsoft.Extensions.VectorData.ConformanceTests/Support/VectorStoreCollectionFixtureBase.cs
Outdated
Show resolved
Hide resolved
test/Libraries/Microsoft.Extensions.VectorData.ConformanceTests/HybridSearchTests.cs
Show resolved
Hide resolved
.../Microsoft.Extensions.VectorData.Abstractions/ProviderServices/CollectionJsonModelBuilder.cs
Outdated
Show resolved
Hide resolved
| <!-- Packages from dotnet/runtime --> | ||
| <MicrosoftBclAsyncInterfacesLTSVersion>8.0.0</MicrosoftBclAsyncInterfacesLTSVersion> | ||
| <MicrosoftBclTimeProviderLTSVersion>8.0.1</MicrosoftBclTimeProviderLTSVersion> | ||
| <MicrosoftExtensionsAIAbstractionsLTSVersion>9.10.2</MicrosoftExtensionsAIAbstractionsLTSVersion> |
There was a problem hiding this comment.
Note that there is no MEAI 8.0, but 9.0 supports net8.0.
There was a problem hiding this comment.
10.x supports net8.0 as well
There was a problem hiding this comment.
Is that a better choice for 9.0? I see we depend on 8.0 for everything we can, but since that's not an option here we can indeed go with 10.0...
| /// <summary> | ||
| /// Represents a filter clause that filters by checking if a field consisting of a list of values contains a specific value. | ||
| /// </summary> | ||
| [Obsolete("Use LINQ expressions via VectorSearchOptions<TRecord>.Filter instead. This type will be removed in a future version.")] |
There was a problem hiding this comment.
This (and the other obsolete types below) are unfortunately still required for backwards compat for non-MEVD SK APIs layered on top... We'll be removing this at some point.
| /// Options for filter expression preprocessing. | ||
| /// This is an internal support type meant for use by connectors only and not by applications. | ||
| /// </summary> | ||
| [Experimental("MEVD9001")] |
There was a problem hiding this comment.
All files under ProviderServices are only intended for us by implementations of the abstractions, and never by end users. These are being kept experimental for the near future, to (a) generate an alert for end users accidentally using them, and (b) allow us a bit more leeway to do provider-facing breaking changes in order to evolve things (we still own/maintain almost all of the providers, and will continue to be in close touch with the maintainers in any case).
src/Libraries/Microsoft.Extensions.VectorData.Abstractions/VectorStorage/VectorStore.cs
Outdated
Show resolved
Hide resolved
src/Libraries/Microsoft.Extensions.VectorData.Abstractions/VectorSearch/HybridSearchOptions.cs
Outdated
Show resolved
Hide resolved
...rosoft.Extensions.VectorData.ConformanceTests/Support/DynamicVectorStoreCollectionFixture.cs
Show resolved
Hide resolved
...Microsoft.Extensions.VectorData.ConformanceTests/Support/VectorStoreCollectionFixtureBase.cs
Outdated
Show resolved
Hide resolved
…d update API baseline The project was missing Stage=normal, which prevented the API baseline from being loaded during build. Without the baseline, all public symbols were flagged as 'newly added' (LA0003) and treated as errors in CI. Also add MinCodeCoverage/MinMutationScore (required for staged packages), disable PackageValidation (new package has no prior version on feeds), and update the baseline assembly version to match the current build. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
| public string FieldName { get; private set; } | ||
|
|
||
| /// <summary> | ||
| /// Gets the value that the list should contain. | ||
| /// </summary> | ||
| public string Value { get; private set; } |
There was a problem hiding this comment.
Why are the private set;s needed? They're only set from the ctor; seems like these could just be get-only auto props.
There was a problem hiding this comment.
You're right, it's just that this is totally unimportant legacy-only code that's already [Obsolete]... I worked hard to clean up and rip this legacy filtering mechanism out of MEVD, this piece was left over. So I'm just moving the types across for now, in a few months we'll get rid of them entirely.
| /// <summary> | ||
| /// Gets the field name to match. | ||
| /// </summary> | ||
| public string FieldName { get; private set; } |
There was a problem hiding this comment.
See the other comment about this being [Obsolete] and will be removed.
...oft.Extensions.VectorData.Abstractions/ProviderServices/Filter/FilterPreprocessingOptions.cs
Outdated
Show resolved
Hide resolved
|
|
||
| // Otherwise, transform the node to a QueryParameterExpression which the connector will then translate to a parameter (e.g. SqlParameter). | ||
|
|
||
| // TODO: Share the same parameter when it references the same captured value |
src/Libraries/Microsoft.Extensions.VectorData.Abstractions/ProviderServices/PropertyModel.cs
Outdated
Show resolved
Hide resolved
src/Libraries/Microsoft.Extensions.VectorData.Abstractions/ProviderServices/PropertyModel.cs
Outdated
Show resolved
Hide resolved
| /// Initializes a new instance of the <see cref="VectorStoreVectorAttribute"/> class. | ||
| /// </summary> | ||
| /// <param name="Dimensions">The number of dimensions that the vector has.</param> | ||
| public VectorStoreVectorAttribute(int Dimensions) |
There was a problem hiding this comment.
Yeah, I face-palmed whe I got the diagnostic here - after moving the code to dotnet/extensions (the SK repo didn't have it on). This is unfortunately a user-facing API, so would be a breaking change; but I think it's probably still reasonable to do this - I'll do it in a separate PR after this initial PR (makes sense?)
.../Microsoft.Extensions.VectorData.Abstractions/RecordAttributes/VectorStoreVectorAttribute.cs
Outdated
Show resolved
Hide resolved
| /// 0 means vectors are orthogonal. | ||
| /// 1 means vectors are identical. | ||
| /// </remarks> | ||
| public const string CosineSimilarity = nameof(CosineSimilarity); |
There was a problem hiding this comment.
Are these public consts rather than public static get-only properties because we expect them to be used in places that require consts, e.g. attributes, switch cases, etc.?
There was a problem hiding this comment.
Yes - users can use these magic/standard values to indicate which similarity function they want, see these tests for an example. This is conceptually an enum, but distance functions (and index types) are an open list, and some database out there may implement some new thing. We don't want users to be blocked by MEVD not having an enum value for something, and at the same time there's a good body of standard values already; so this is a sort of compromise: users can use these constants for the standard stuff, but a provider is also free to have its own arbitrary string value which users can then pass in (and which we may eventually add here if it becomes universal).
Does that make sense?
...braries/Microsoft.Extensions.VectorData.Abstractions/RecordDefinition/VectorStoreProperty.cs
Outdated
Show resolved
Hide resolved
src/Libraries/Microsoft.Extensions.VectorData.Abstractions/VectorSearch/VectorSearchResult.cs
Show resolved
Hide resolved
...osoft.Extensions.VectorData.Abstractions/Microsoft.Extensions.VectorData.Abstractions.csproj
Show resolved
Hide resolved
...osoft.Extensions.VectorData.Abstractions/Microsoft.Extensions.VectorData.Abstractions.csproj
Outdated
Show resolved
Hide resolved
src/Libraries/Microsoft.Extensions.VectorData.Abstractions/README.md
Outdated
Show resolved
Hide resolved
LA0007 — Bug in the analyzer's Utils.GetConstraints(). When a type had both base types (: Interface) AND multiple where constraints, the naive : split produced different index offsets for the baseline string (which includes base types) vs. the compiled display string (which doesn't). Fixed the parser to first isolate the where portion of the string, then extract constraints from each clause individually.
| /// <summary> | ||
| /// Gets whether this key property's value is auto-generated or not, or <see langword="null" /> if not set. | ||
| /// </summary> | ||
| internal bool? IsAutoGeneratedNullable { get; private set; } |
adamsitnik
left a comment
There was a problem hiding this comment.
@roji overall it LGTM (I've seen this code more than once), but I've asked some questions regarding having the source of MEVD in this repo and also referencing MEVD by other projects in this repo at the same time
| </ItemGroup> | ||
|
|
||
| <ItemGroup Condition="'$(InjectUnreachableExceptionOnLegacy)' == 'true' AND ('$(TargetFramework)' == 'net462' or '$(TargetFramework)' == 'netstandard2.0' or '$(TargetFramework)' == 'netstandard2.1' or '$(TargetFramework)' == 'netcoreapp3.1' or '$(TargetFramework)' == 'net6.0')"> | ||
| <Compile Include="$(MSBuildThisFileDirectory)\..\..\src\LegacySupport\UnreachableException\*.cs" LinkBase="LegacySupport\UnreachableException" /> |
There was a problem hiding this comment.
nit: I can see that it's consistent with other includes in this props file, but I just don't see the point of not including just the single file
| <Compile Include="$(MSBuildThisFileDirectory)\..\..\src\LegacySupport\UnreachableException\*.cs" LinkBase="LegacySupport\UnreachableException" /> | |
| <Compile Include="$(MSBuildThisFileDirectory)\..\..\src\LegacySupport\UnreachableException\UnreachableException.cs" LinkBase="LegacySupport\UnreachableException" /> |
| <MicrosoftBclAsyncInterfacesVersion>9.0.14</MicrosoftBclAsyncInterfacesVersion> | ||
| <MicrosoftBclMemoryVersion>9.0.14</MicrosoftBclMemoryVersion> | ||
| <MicrosoftBclTimeProviderVersion>9.0.14</MicrosoftBclTimeProviderVersion> | ||
| <MicrosoftExtensionsAIAbstractionsVersion>9.10.2</MicrosoftExtensionsAIAbstractionsVersion> |
There was a problem hiding this comment.
What is our plan for the current reference to MEVD that is being used by the Data Ingestion projects?
There was a problem hiding this comment.
I wonder if this project should go to src/ instead, as I expect plenty of similar settings apply to it due to being in test/:
extensions/test/Directory.Build.props
Lines 3 to 4 in 84c186e
extensions/test/Directory.Build.targets
Lines 5 to 12 in 84c186e
|
|
||
| <ItemGroup> | ||
| <PackageReference Include="xunit" /> | ||
| <PackageReference Include="Microsoft.Extensions.AI.Abstractions" /> |
There was a problem hiding this comment.
If my understanding is correct, this is going to reference 9.7.0 as of now.
Line 229 in 84c186e
This moves Microsoft.Extensions.VectorData.Abstractions from the Semantic Kernel repo (here) to the dotnet/extensions repo.
Note: this introduces two test projects:
Microsoft Reviewers: Open in CodeFlow.