newmind AI · AI Report

The AI Report.
Research from inside the operating system.

A periodic research publication from newmind AI — field studies, protocol papers, and the structural ideas behind governed, agentic legal intelligence.

12 Issues
2026 Volume I
Open Access
Current Issues
Issue 01
Protocol Paper · 2026

The Mecellem Protocol: a new grammar of reasoning

How structured ontology turns scattered legal records into contestable, governed reasoning — the theory behind newmind's semantic layer.

Ontology · Reasoning · Governance
Issue 02
Annual Report · 2026

State of Agentic Legal AI — 2026

What changed when reasoning moved from chatbots to governed operating systems — adoption, architecture, and the year ahead.

Field Study · Agents · Enterprise
Issue 03
Domain Models · 2026

Mecellem Models: Turkish Models for the Legal Domain

ModernBERT encoders pre-trained from scratch on 112.7B Turkish-dominant tokens and continually adapted — reaching top-3 on the Turkish retrieval leaderboard at a fraction of the parameters.

Legal LLM · Pretraining · Turkish
Issue 04
Embeddings · 2025

TurkEmbed: a Turkish Embedding Model for Inference & Similarity

A Turkish embedding model trained on native — not machine-translated — data with matryoshka representation learning, setting new marks on natural-language inference and semantic textual similarity.

Embeddings · NLI · Similarity
Issue 05
Retrieval · 2025

TurkEmbed4Retrieval: Turkish Embeddings Tuned for Retrieval

A retrieval-tuned TurkEmbed variant fine-tuned on MS-MARCO-TR that outperforms TurkColBERT on Scifact-TR by 19–26%, raising the bar for Turkish information retrieval.

Embeddings · Retrieval · MS-MARCO
Issue 06
Benchmark · 2025

TurkColBERT: Benchmarking Late-Interaction Retrieval in Turkish

The first comprehensive benchmark of dense bi-encoders against late-interaction ColBERT-style retrievers for morphologically rich Turkish, across five BEIR-TR domains.

Retrieval · Benchmark · ColBERT
Issue 07
Hallucination · 2025

Turk-LettuceDetect: Hallucination Detection for Turkish RAG

The first suite of hallucination-detection models for Turkish RAG, framing the problem as token-level classification across question answering, data-to-text, and summarization.

Hallucination · RAG · Detection
Issue 08
RAG Systems · 2025

Guided Decoding and its Role in Retrieval-Augmented Generation

A systematic study of guided decoding — Outlines, XGrammar, and LM Format Enforcer — and how multi-turn prompting shapes structured, hallucination-resistant RAG output.

RAG · Structured Output · Decoding
Issue 09
Robustness · 2025

PARROT: a Sycophancy Robustness Benchmark for LLMs

A robustness benchmark that measures how far language models bend the truth under authority and social pressure, with an eight-state behavioral taxonomy evaluated across 22 models.

Robustness · Sycophancy · Evaluation
Issue 10
Fine-Tuning · 2026

RDP LoRA: Geometry-Driven Layer Selection for Fine-Tuning

A training-free method that uses the Ramer–Douglas–Peucker algorithm to pick which layers to adapt — beating full LoRA fine-tuning while touching far fewer layers.

Fine-Tuning · LoRA · Efficiency
Issue 11
Semantics · 2026

Beyond Cosine Similarity: Taming a 15M-Node Synonym Graph

A 15-million-node Turkish synonym graph with a three-way relation discriminator that finally separates synonyms from antonyms and tames semantic drift in large-scale clustering.

Semantics · Synonyms · Clustering
Issue 12
Datasets · 2026

A Hybrid Protocol for Turkish Semantic Relations

A hybrid pipeline — FastText clustering, Gemini classification, and curated dictionaries — that builds an 843K-pair Turkish semantic-relations corpus, a 10× scale-up for about $65.

Dataset · Semantics · Low-Resource