PCA (Principal Component Analysis) reduction is a technique for shrinking high-dimensional vectors into fewer dimensions while preserving as much of the important information as possible.
PCA reduction shrinks each vector from n dimensions (for example, 512) down to a smaller number, in this case just 2, which you can then save to CSV. You can reduce to any smaller number of dimensions; for example, 3072 can be reduced to 384. Here we used 2 because two values per vector are easy to write to a CSV file.
Although the x and y values don't mean anything by themselves, once they are plotted on a graph we can see which points group close together, and a meaning can be inferred from that grouping.
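For readers who want the underlying math, the reduction from n dimensions to k can be sketched with one standard formulation of PCA. The symbols below (X, μ, V_k, Z) are notation introduced here for illustration and are not taken from the example code, which may compute the same result by a different route (for example, via the covariance matrix).

```latex
% m vectors x_1, ..., x_m in R^n are stacked as the rows of X (m x n).
\begin{align*}
\mu &= \frac{1}{m} \sum_{i=1}^{m} x_i
  && \text{(mean of each dimension)} \\
X_c &= X - \mathbf{1}\,\mu^{\top}
  && \text{(center the data)} \\
X_c &= U \Sigma V^{\top}
  && \text{(singular value decomposition)} \\
Z &= X_c V_k
  && \text{($V_k$: first $k$ columns of $V$, so $Z$ is $m \times k$)}
\end{align*}
```

With k = 2, each row of Z is the (x, y) pair for one vector, which is what gets plotted.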
Example Code
Using the values above, create a collection of 512-dimensional vectors with OpenAI's text-embedding-3-small model.
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using MathNet.Numerics.LinearAlgebra;

namespace Chatbot.API;

public static class PcaCsvExporter
{
    public static string SaveReducedVectorsToCsv(
        IReadOnlyList<(string Word, float[] Vector)> vectors,
        string? outputDirectory = null)
    {
        // PCA needs at least two samples to measure any variance.
        if (vectors.Count < 2)
        {
            throw new ArgumentException("At least two vectors are required for PCA.", nameof(vectors));
        }

        // Reducing to 2 dimensions requires at least 2 input dimensions.
        var featureCount = vectors[0].Vector.Length;
        if (featureCount < 2)
        {
            throw new ArgumentException("Vectors must have at least two dimensions.", nameof(vectors));
        }

        // Every vector must have the same length to form a rectangular matrix.
        if (vectors.Any(item => item.Vector.Length != featureCount))
        {
            throw new ArgumentException("All vectors must have the same dimensionality.", nameof(vectors));
        }

        // One row per sample (word), one column per embedding dimension.
        var sampleCount = vectors.Count;
        var matrix = Matrix<double>.Build.Dense(sampleCount, featureCount);