====== std.fasttext ======

FastText word-vector engine implemented natively in Lyx (modeled on the Facebook Research release, 2016). It trains skip-gram or CBOW models with stochastic gradient descent (SGD) on a word corpus and produces dense embedding vectors (default: 100 dimensions, ''DEFAULT_DIM''). It finds the k nearest neighbors (''FastTextFindNearest''), solves analogies such as "king - man + woman = ?" (''FastTextAnalogies''), classifies text vectors (''FastTextClassify''), and saves and loads models as binary files (''FastTextSaveModel'', ''FastTextLoad'').
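
The nearest-neighbor and analogy queries can be illustrated with a small Python sketch. It is not the Lyx implementation: the helper names (''cosine'', ''find_nearest'', ''analogy''), the use of cosine similarity as the distance measure, and the toy 3-dimensional vectors are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equally long vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def find_nearest(query, embeddings, k, exclude=()):
    """Return the k words whose vectors are most similar to `query`."""
    scored = [(cosine(query, vec), word)
              for word, vec in embeddings.items() if word not in exclude]
    scored.sort(reverse=True)  # highest similarity first
    return [word for _, word in scored[:k]]

def analogy(a, b, c, embeddings, k=1):
    """Solve a - b + c and return the nearest words, skipping the inputs."""
    target = [x - y + z
              for x, y, z in zip(embeddings[a], embeddings[b], embeddings[c])]
    return find_nearest(target, embeddings, k, exclude=(a, b, c))

# Hypothetical toy embeddings; real models use DEFAULT_DIM = 100 dimensions.
toy = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [-0.5, 0.2, -0.3],
}

result = analogy("king", "man", "woman", toy)
```

Excluding the three input words from the candidates is a common convention for analogy queries; whether ''FastTextAnalogies'' does the same is not stated here.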

Use cases: text classification, semantic similarity search, language models, recommender systems, and NLP preprocessing for ''std.ml''.

**Author:** Andreas Röne\\
**Copyright:** 2024-2025 Andreas Röne

----

===== Constants =====

^ Name ^ Type ^ Value ^ Visibility ^
| ''DEFAULT_DIM'' | ''int64'' | ''100'' | pub |
| ''DEFAULT_WINDOW'' | ''int64'' | ''5'' | pub |
| ''DEFAULT_EPOCHS'' | ''int64'' | ''50'' | pub |
| ''DEFAULT_LR'' | ''f64'' | ''0.025'' | pub |
| ''MIN_COUNT'' | ''int64'' | ''1'' | pub |
| ''NEG_SAMPLES'' | ''int64'' | ''5'' | pub |
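
To show how these hyperparameters typically interact, here is a minimal Python sketch of one skip-gram update with negative sampling, following the standard word2vec-style rule. It is an assumption that the Lyx training functions work this way; ''sigmoid'', ''context_indices'', ''sgd_step'', and the ''w_in''/''w_out'' matrices are hypothetical names, not part of the module.

```python
import math

# Values copied from the constants table above.
DEFAULT_DIM = 100
DEFAULT_WINDOW = 5
DEFAULT_LR = 0.025
NEG_SAMPLES = 5

def sigmoid(x):
    """Logistic function, clamped to avoid overflow in math.exp."""
    x = max(-30.0, min(30.0, x))
    return 1.0 / (1.0 + math.exp(-x))

def context_indices(position, length, window=DEFAULT_WINDOW):
    """Positions within `window` words of `position`, excluding itself."""
    lo = max(0, position - window)
    hi = min(length, position + window + 1)
    return [i for i in range(lo, hi) if i != position]

def sgd_step(center, context, negatives, w_in, w_out, lr=DEFAULT_LR):
    """One negative-sampling update; modifies w_in and w_out in place."""
    v = w_in[center]
    grad_center = [0.0] * len(v)
    # The true context word has label 1, each sampled negative has label 0.
    pairs = [(context, 1.0)] + [(n, 0.0) for n in negatives]
    for target, label in pairs:
        u = w_out[target]
        score = sigmoid(sum(a * b for a, b in zip(v, u)))
        g = lr * (label - score)  # scaled gradient of the log-likelihood
        for i in range(len(v)):
            grad_center[i] += g * u[i]  # accumulate before u changes
            u[i] += g * v[i]
    for i in range(len(v)):
        v[i] += grad_center[i]
```

In this sketch ''NEG_SAMPLES'' would be the length of ''negatives'', ''DEFAULT_WINDOW'' bounds which positions count as context, and ''DEFAULT_LR'' scales every update. ''DEFAULT_EPOCHS'' and ''MIN_COUNT'' would apply in an outer loop that is not shown.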

----

===== Functions =====

^ Signature ^ Visibility ^ Description ^
| ''FastTextInitVocab(corpus_size: int64): void'' | pub | Initializes the vocabulary storage for a corpus of the given size |
| ''FastTextGetEmbeddingAt(word_idx: int64): int64'' | pub | Returns the embedding vector for a word index |
| ''FastTextComputeContextVector('' | pub | Computes the context vector from neighboring words |
| ''FastTextTrainSkipgramSGD('' | pub | Trains a skip-gram model via SGD |
| ''FastTextTrainCBOWSGD('' | pub | Trains a CBOW model via SGD |
| ''FastTextSampleNegative(positive_word: int64): int64'' | pub | Draws a negative word for negative sampling |
| ''FastTextDotProduct(vec1: int64, vec2: int64): f64'' | pub | Computes the dot product of two vectors |
| ''FastTextVectorNorm(vec: int64): f64'' | pub | Computes the Euclidean norm of a vector |
| ''FastTextNormalize(vec: int64): void'' | pub | Normalizes a vector to unit length |
| ''FastTextFindNearest(query_vec: int64, k: int64): int64'' | pub | Finds the k nearest neighbors in the vector space |
| ''FastTextAnalogies(a: int64, b: int64, c: int64): int64'' | pub | Solves the word analogy a - b + c in the vector space |
| ''FastTextClassify(text_vec: int64): int64'' | pub | Classifies a text vector into a label category |
| ''FastTextClassifyProb(text_vec: int64, label: int64): f64'' | pub | Returns the classification probability for a label |
| ''FastTextPredictWord(context_vec: int64): int64'' | pub | Predicts the most likely word for a context |
| ''FastTextSaveModel(path: pchar): int64'' | pub | Saves the trained model as a binary file |
| ''FastTextLoad(path: pchar): int64'' | pub | Loads a model from a binary file |
| ''FastTextFree(): void'' | pub | Frees all model memory |
| ''FastTextSetDimension(d: int64): void'' | pub | Sets the embedding dimension; call before training |
| ''FastTextSetLearningRate(lr: f64): void'' | pub | Sets the learning rate for SGD training |
| ''FastTextSetWindow(window: int64): void'' | pub | Sets the context window size for training |
| ''FastTextGetDimension(): int64'' | pub | Returns the current embedding dimension |
| ''FastTextGetVocabSize(): int64'' | pub | Returns the number of vocabulary entries |
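
The table does not say how ''FastTextClassify'' and ''FastTextClassifyProb'' arrive at their results. A common approach in fastText-style classifiers is a linear layer followed by a softmax over the labels. The Python sketch below illustrates that approach only; ''label_weights'' and all function names are hypothetical and are not taken from the module.

```python
import math

def softmax(scores):
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def label_probs(text_vec, label_weights):
    """Probability of every label: softmax over one dot product per label."""
    scores = [sum(a * b for a, b in zip(text_vec, w)) for w in label_weights]
    return softmax(scores)

def classify_prob(text_vec, label_weights, label):
    """Probability of a single label (the role of FastTextClassifyProb)."""
    return label_probs(text_vec, label_weights)[label]

def classify(text_vec, label_weights):
    """Index of the most probable label (the role of FastTextClassify)."""
    probs = label_probs(text_vec, label_weights)
    return max(range(len(probs)), key=probs.__getitem__)

# Hypothetical 2-label model over 2-dimensional text vectors.
weights = [[1.0, 0.0], [0.0, 1.0]]
best = classify([2.0, 0.0], weights)
p_best = classify_prob([2.0, 0.0], weights, best)
```

Under this assumption, the label returned by the classifier is always the one with the highest single-label probability, so the two functions stay consistent with each other.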