Negent - Truth Layer

Get your data right.
Then do AI.

Negent Clean transforms your unstructured corpus into a reliable, versioned, governed foundation, so you can finally cross the line from POC to scale.

85%
ROT DATA
"85% of enterprise data has no value: ROT, dark data, duplicates."
Veritas - Global Databerg Report 2025
70%
OBSOLETE
"70% of text data is obsolete. It's the #1 cause of RAG project failure."
Infotechtion - AI Readiness Study 2025
60%
ABANDONED
"Without a solid foundation, 60% of AI projects will be abandoned by 2027."
Gartner - Predicts 2024-2026
Single Source of Truth
One reference per document family. No more contradictions.
Relationship Graph
Contract ↔ amendments ↔ annexes. Business context reconstructed.
Business Rules Applied
Your taxonomy, your thresholds, your ACLs — enforced continuously.
01 Advanced deduplication · 02 Version reconstruction · 03 Relationship mapping · 04 Canonical promotion · 05 ACL-aware indexing · 06 Human-in-the-loop · 07 Continuous delta · 08 Business graph
Why Clean is critical

AI can't guess
what's true.

Most companies don't lack data — they lack structured, trustworthy knowledge. Without cleaning, your AI agents start hallucinating the moment the pilot ends.

🔀

Version anarchy

15 versions of the same contract. No one knows which one is valid. Signed? Which one? The "final" or the "REAL_final"?

🧩

Missing context

A contract without its amendments, annexes, and decision history is useless. These relationships exist nowhere in exploitable form.

📉

Zero trust

Your search returns something relevant — but is it the right version? Is it legally binding? Doubt persists and slows decisions.

🔓

ACL violations

Without native ACL enforcement, "smart search" becomes a security breach. A user can retrieve content they shouldn't see.

🏷️

Metadata chaos

Inconsistent naming, missing tags, heterogeneous taxonomies. Automation fails. Humans spend hours just finding things.

📈

Doesn't scale

POC works. Production fails. Without observability, metrics, and continuous improvement, your AI assistant stays a fragile prototype.

What Clean delivers

Your documents are a liability.
Clean makes them an asset.

Clean doesn't just index. It arbitrates, reconciles, and governs — so your AI answers with the right version, from the right scope, for the right user.

Negent in action: one document selected, its card revealed ↓
Dashboard
Sources
Conversations
Administration
Organisation
Taxonomies
Groups & Rules
Master
Monitoring
Prompts
↻ Sync All
+ Add Source
WORKSPACE › DOCUMENTS
Knowledge Base
Total Documents
6 852
Categorized
6 613
!
Action needed
239
All Documents ▾
Showing: 31 files
NAME | CATEGORY | UPLOAD DATE | SIZE | STATUS
S-AMAR_Annex_16.1__Executed_LNTP_1.pdf | Contractual Documentation › Contract | Mar 30, 10:28 | 606 KB | OK
AMAR_Annex_5.1_-_Updated_Time_Schedule.pdf | Technical Report › Technical Report | Mar 30, 10:27 | 844 KB | OK
S-AMAR_Amendment_3_Executed.docx.pdf | Contractual Documentation › Amendment | Mar 30, 10:29 | 505 KB | OK
S-AMAR_Annex_11_-_Updated_Certificates.docx.pdf | Tender › Award | Mar 30, 10:28 | 373 KB | OK
S-AMAR_Amendment_2_v221129_Rev_VF.docx.pdf | Contractual Documentation › Amendment | Mar 30, 10:28 | 541 KB | OK
S-PT_1.1.1_-_Grid_Requirement_-_P73_2020.pdf | Technical Report › None | Mar 30, 10:29 | 2.7 MB | OK
S-PT_1.1.1.1_-_Grid_Code_and_Scope_of_works.pdf | Technical Report › Technical Report | Mar 30, 10:29 | 1.4 MB | Error
S-PT_1.1.1_-_Grid_Requirement_-_DL172_2006.pdf | Statutory meetings › None | Mar 30, 10:29 | 492 KB | OK
S-PT-EPC_Contract_Gensun_AMAR_VF.docx.pdf | Contractual Documentation › Contract | Mar 30, 10:22 | 1.7 MB | OK
S-PT_2.2.2_-_HSE_-_Penalties.pdf | HSSE › None | Mar 30, 10:30 | 238 KB | OK
S-PT_2.2_-_HSE_Requirements_Cover_Page.pdf | HSSE | Mar 30, 10:30 | 167 KB | OK
S-AMAR_Amendment_2_v221129_Rev_VF.docx.pdf
Document Card — Negent AI Enterprise Intelligence
Categorisation
Contractual Documentation › Amendment
Document Identity
Title: Amendment #2 to the Lump Sum Contract - Solar Park of Amar
Language: French (fr)
Nature: Contractual amendment
Purpose: Amend the terms of the initial EPC contract for the Amar solar park.
Business Scope
Anchor type: project · contract
Project: Solar Park of Amar
Companies: AMAR, UNIPAR LDA · VOLTEX PVS, S.A.
Phase: Construction
Temporality & Status
Doc date: 11 January 2023
Signed: 23 July 2021
Status: Final · Signed · Approved
Semantic summary
"Amendment #2, dated 11 January 2023, amends the initial EPC contract signed on 23 July 2021. It revises the Advance Payment (Art. 8.1) and the deadlines: LNTP 2 no later than 1 March 2023, Notice to Proceed no later than 30 June 2023."
1

Single Source of Truth

One canonical reference per content "family" — no more contradictions, no more competing versions. Clean selects, justifies, and audits every promotion.

Canonical promotion
2

Semantic Foundation

A clean embeddings layer that gives your AI agents reliable, up-to-date sources. Encrypted text, normalized metadata, semantic + full-text index.

AI-ready index
3

Knowledge Graph

Intelligently connect your content — amendments, contracts, annexes — to reconstruct full business context. Relations traced, explained, and correctable.

Relationship mapping
4

Continuous Governance

Your taxonomy and business rules apply in real time across your entire document flow. ACLs propagated to document level. Automatic delta. Full audit trail.

Delta - ACL - HITL
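To make the relationship-mapping idea concrete, here is a minimal sketch of how a chain of related documents could be walked once the links exist. It is not Negent's implementation: the `RELATIONS` edges and the `full_context` helper are hypothetical, and the document names are borrowed from the Amar example used on this page.

```python
from collections import deque

# Hypothetical edges: each document points at the documents attached to it.
RELATIONS = {
    "Lump Sum Contract": ["Amendment #1", "Amendment #2"],
    "Amendment #2": ["Annex 16.1 LNTP 1", "Annex 16.2 LNTP2 Template"],
}


def full_context(root: str, relations: dict[str, list[str]]) -> list[str]:
    """Breadth-first walk returning the root and everything reachable from it."""
    seen, order, queue = {root}, [root], deque([root])
    while queue:
        for neighbour in relations.get(queue.popleft(), []):
            if neighbour not in seen:
                seen.add(neighbour)
                order.append(neighbour)
                queue.append(neighbour)
    return order


context = full_context("Lump Sum Contract", RELATIONS)
```

Starting from the parent contract, the walk collects its amendments first and then their annexes, which is the "full business context" an assistant would need before answering.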
The Doc Card

Each document becomes
actionable intelligence.

For every processed file, Clean produces a complete identity card — structured, comparable, and queryable by your AI without ever opening the source document.

S-AMAR_Amendment_2_v221129_Rev_VF.docx.pdf
Document Card — Negent AI Enterprise Intelligence
Signed · Final · High
Categorisation
Contractual Documentation › Amendment
Document Identity
Title: Amendment #2 to the Lump Sum Contract - Solar Park of Amar
Language: French (fr)
Nature: Contractual amendment
Subtype: Amendment to EPC contract
Purpose: Amend the terms of the initial EPC contract for the Amar solar park.
Business Scope
Anchor type: project · contract
Project: Solar Park of Amar
Companies: AMAR, UNIPAR LDA · VOLTEX PVS, S.A.
Contract ref.: Lump Sum Contract - Solar Park of Amar
Phase: Construction
Actors
Issuers: AMAR, UNIPAR LDA · VOLTEX PVS, S.A.
Roles: Employer · Contractor · Parties
Temporality
Doc date: 11 January 2023
Signed: 23 July 2021
Amendment #1: 11 February 2022
LNTP 2: 1 March 2023 (planned)
NTP: 30 June 2023
Formal Status
Status: Final
Signed: Yes
Approved: Yes
Relevance: High
Discriminating signals
Amendment #2 · Contract signed: 23/07/2021 · Amendment #1: 11/02/2022 · Amendment #2: 11/01/2023 · Modification of Advance Payment · Art. 8.1 Commencement of Work · LNTP 2: 01/03/2023 · NTP: 30/06/2023
Document relations
Lump Sum Contract - Solar Park of Amar
Annex 16.1 LNTP 1
Annex 16.2 LNTP2 Template
Rev_VF (final version) · Potential translation
Semantic summary
"This contractual amendment (Amendment #2), dated 11 January 2023, amends the initial construction contract for the Amar solar park signed on 23 July 2021. It revises the Advance Payment and Article 8.1 on Commencement of Work, including the deadlines for LNTP 2 (no later than 1 March 2023) and Notice to Proceed (no later than 30 June 2023)."
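For readers who think in data structures, the card above could be modelled as a small typed record. This is an illustrative sketch only: the `DocCard` class, its field names and the `to_prompt_line` helper are assumptions made for this example and do not describe Negent's actual schema. The values are copied from the card shown above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DocCard:
    """Hypothetical in-memory shape of a Doc Card (not Negent's real schema)."""
    filename: str
    category: str
    subcategory: str
    title: str
    language: str
    doc_date: date
    status: str
    signed: bool
    approved: bool
    relations: tuple[str, ...] = ()

    def to_prompt_line(self) -> str:
        """Compact one-line rendering an LLM could read instead of the file."""
        flags = "signed" if self.signed else "unsigned"
        return (f"{self.title} [{self.category} > {self.subcategory}] "
                f"{self.doc_date.isoformat()} {self.status}/{flags}")


card = DocCard(
    filename="S-AMAR_Amendment_2_v221129_Rev_VF.docx.pdf",
    category="Contractual Documentation",
    subcategory="Amendment",
    title="Amendment #2 to the Lump Sum Contract - Solar Park of Amar",
    language="fr",
    doc_date=date(2023, 1, 11),
    status="Final",
    signed=True,
    approved=True,
    relations=("Annex 16.1 LNTP 1", "Annex 16.2 LNTP2 Template"),
)
```

Because every card shares the same fields, cards can be compared, filtered and sorted without opening the underlying files.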
01
Category & identity — without opening the file
Every document receives a business category, a normalised title, and an automatically extracted purpose. Your teams and AI know exactly what it is — before opening it.
Business taxonomy · Normalisation
02
Actors, dates and formal status captured
Who signed, when, with what legal validity. Critical metadata is extracted, structured and comparable across all your documents — regardless of origin.
Extraction · Structuring
03
The document chain reconstructed
Parent contract, amendments, annexes — inter-document relations are traced automatically. Your AI queries the full context, not an isolated fragment stripped of its history.
Relationship mapping · Graph
04
The semantic summary — your LLM's fuel
The AI queries the card to find, then the document to answer. Analysing a Doc Card costs 100× less than a raw 80-page document. Less noise, lower costs, greater reliability.
RAG · LLM-ready
×100
Analysing a Doc Card costs 100× less than a raw 80-page document for your LLM. Fewer tokens, fewer calls, more reliable answers at scale.
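The "card to find, document to answer" pattern can be sketched in a few lines. Everything here is hypothetical: the `CARDS` summaries, the `find_candidates` and `answer_context` helpers, and the naive keyword overlap, which stands in for whatever semantic search the product actually uses.

```python
from typing import Callable

# Hypothetical card summaries keyed by document id (stand-ins for Doc Cards).
CARDS = {
    "amendment-2": "Amendment #2 revises the Advance Payment and Art. 8.1 deadlines",
    "annex-5-1": "Annex 5.1 updated time schedule for the construction phase",
    "hse-penalties": "HSE penalties applicable to the contractor",
}


def find_candidates(question: str, cards: dict[str, str], top_k: int = 1) -> list[str]:
    """Stage 1: rank cards by keyword overlap (a stand-in for semantic search)."""
    terms = set(question.lower().split())
    scored = [(len(terms & set(text.lower().split())), doc_id)
              for doc_id, text in cards.items()]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: (-s[0], s[1]))
    return [doc_id for _, doc_id in ranked[:top_k]]


def answer_context(question: str, cards: dict[str, str],
                   load_document: Callable[[str], str]) -> list[str]:
    """Stage 2: open only the documents whose cards matched."""
    return [load_document(doc_id) for doc_id in find_candidates(question, cards)]
```

The design point is that the expensive step, loading a full document, only runs for the few candidates that survive the cheap card-level pass.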
Configuration

Configure Clean for your business.
Zero code required.

Taxonomy, classification rules, metadata fields, extraction prompts: all editable in the interface, versioned, and restorable at any time. Your business teams stay in control, without depending on IT.

Structure
Tender
Rules/RFP
Award
Package
+ Add Subcategory
Statutory meetings
Financial Statement approval
Shareholders decisions ●
Board decisions
+ Add Subcategory
Land Register
Technical Report
Studies
Business Plan
Claims
HSSE
Financial Report
Lender Audit Report
ADMINISTRATION › TAXONOMIES › NEGENT-AI TAXO (V3)
Negent-AI Taxo (v3)
← Back
✓ Save Changes
Subcategory Details
Active Component
Name
Shareholders decisions
Code (Identifier)
SHR-DCN
Description
Shareholders' meeting decisions / shareholders' resolutions: approvals, appointments, capital transactions. Keywords: résolution, shareholders meeting, assemblée des actionnaires, approval, decision, minutes / procès-verbal. Not to be confused with: Board decisions, Articles of association.
Metadata Fields
Define data points specifically extracted for this component.
+ Add Field
date
Date ▾
Required
Description
Decision date
Options / Examples
2024-01-01

Your vocabulary, not a generic one

Categories reflect exactly how your teams name things: "EPC Amendment" rather than "Amendment", "Site Report" rather than "Document". Your people recognise their corpus instantly — and so does your AI.

Categories · Subcategories · Identifier codes

Classification rules in plain language

Describe how to tell a contract from an amendment, a board resolution from minutes. Add keywords, examples, edge cases. Clean applies these rules uniformly across your entire corpus.

Description · Keywords · Edge cases

Per-category typed metadata fields

Each subcategory defines its own fields: a decision date for board meetings, an amount for invoices, a project phase for site reports. These fields feed directly into every Doc Card.

Date · Amount · List · Required field
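Clean itself is configured in the interface with no code, but the idea of per-subcategory typed fields can be illustrated as follows. The `FieldSpec` class and `validate` function are hypothetical names for this sketch. The `SHR-DCN` code, the required `date` field and the `2024-01-01` example come from the "Shareholders decisions" screen above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FieldSpec:
    name: str
    kind: str               # only "date" is checked in this sketch
    required: bool = False


# Mirrors the "Shareholders decisions" example: one required date field.
SUBCATEGORIES = {
    "SHR-DCN": [FieldSpec(name="date", kind="date", required=True)],
}


def validate(code: str, values: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the metadata is valid."""
    problems = []
    for spec in SUBCATEGORIES[code]:
        raw = values.get(spec.name)
        if raw is None:
            if spec.required:
                problems.append(f"missing required field: {spec.name}")
            continue
        if spec.kind == "date":
            try:
                date.fromisoformat(raw)   # expects YYYY-MM-DD
            except ValueError:
                problems.append(f"invalid date for {spec.name}: {raw}")
    return problems
```

A card whose extracted metadata fails such a check is a natural candidate for the "Action needed" queue rather than for silent acceptance.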
Return on investment

Clean isn't a cost.
It's a recovery.

Beyond AI quality, Clean delivers measurable gains from the first weeks — across five concrete operational dimensions.

-70% · Storage costs · ROT data eliminated
×100 · LLM cost reduction · Doc Card vs raw document
96.5% · Classification accuracy · Reliable AI answers
Faster search · Clean corpus
0 · Documents without permission · ACLs propagated
01
Reduced storage costs
40 to 70% of enterprise data is ROT — Redundant, Obsolete, Trivial. Clean identifies and qualifies every file for deletion or archiving, directly reducing storage, backup and indexation costs.
Storage · Backup · Indexation
Leaner backups from the pilot phase
Indexation costs reduced proportionally
eDiscovery 3× faster on a clean corpus
02
Optimised LLM costs
The AI queries Doc Cards to find, then documents to answer. A card costs 100× less to analyse than a raw 80-page document — fewer tokens, fewer calls, less noise in context.
Tokens · API calls · Latency
Token consumption reduced at scale
Unnecessary calls eliminated by pre-filtering
Context transmitted more precisely, less noisy
03
More reliable AI answers
An AI is only as reliable as its document base. On a clean, structured, governed corpus, hallucinations decrease and every answer is traceable to its source — the right version, for the right user.
RAG · Hallucinations · Traceability
Answers grounded in the canonical version
Source cited, verifiable, audited
Version conflicts eliminated upstream
04
Faster search
Less time spent searching, verifying, or asking "which is the right version?". The Doc Card answers before you even open the document: category, status, actors, and relations are all readable at a glance.
Productivity · Friction · Time
Relevant result on the first try
Status and version visible instantly
Full document chain reconstructed
05
Reduced risks
Fewer decisions made on the wrong documents. ACLs are propagated to document level — your AI only reveals what the user is authorised to see. Every processing action, validation, or correction is timestamped and tracked.
Compliance · ACL · Governance
No content returned without user permission
Full traceability across the entire corpus
Timestamped and secured audit trail
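The "no content returned without user permission" guarantee amounts to filtering search hits against document-level ACLs before anything reaches the model. A minimal sketch, assuming group-based ACLs; the `IndexedDoc` class and `authorised_results` function are hypothetical names, not part of any Negent API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexedDoc:
    doc_id: str
    allowed_groups: frozenset[str]   # ACL captured from the source system


def authorised_results(hits: list[IndexedDoc], user_groups: set[str]) -> list[str]:
    """Keep only hits that share at least one group with the user."""
    return [d.doc_id for d in hits if d.allowed_groups & user_groups]


hits = [
    IndexedDoc("epc-contract", frozenset({"legal"})),
    IndexedDoc("hse-cover-page", frozenset({"legal", "site"})),
]
visible = authorised_results(hits, {"site"})
```

Applying the filter at retrieval time, rather than after generation, means restricted text never enters the prompt in the first place.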
Implementation

Your path to
an AI-ready foundation.

Six steps. No file migration. A reliable, secure, governed foundation — built on top of what you already have.

01
Sources

Where your content lives.

Connect Negent to your existing systems — SharePoint, emails, servers, ECM, knowledge tools. Read-only. No disruption. Your workflows stay untouched.

Native connectors · Read-only mode · ACLs captured at connection
[Diagram: connected sources in read-only mode: SharePoint (ACL ✓), G. Drive (ACL ✓), Emails, Servers, ECM, Notion]
02
Scope

Scope Configuration

Teach Negent your language. Define your own rules, tags, and taxonomies — the platform adapts to how your business actually works, not the other way around.

Custom Taxonomy · Versioning rules · Confidence thresholds
[Diagram: taxonomy "Business DNA v1.0": Legal (LGL: Contracts, Agreements) 312 · Technical (TEC: Blueprints, Specs, Manuals) 287 · Financial (FIN: Budgets, Reports) 156 · Operation. Confidence threshold: 0.88]
03
Secure Extraction

Your data stays yours

Only essential text and metadata are temporarily extracted, encrypted in transit, then processed. Your source files remain secure in your infrastructure.

TTL/purge · Minimal storage · Encryption in transit · Optional BYOK
[Diagram: source.pdf stays in place; only text + metadata are extracted and encrypted (AES-256, BYOK), with TTL / purge after processing. Your files never leave your infrastructure.]
04
Foundation Build

Your AI-ready core.

The system normalizes formats, cleans noise, segments content, and creates embeddings to enable semantic search across your entire corpus, regardless of source.

Normalization · Semantic chunking · Embeddings · Text index
[Diagram: doc.pdf is split into chunks 1 to N (semantic chunking); the chunks are then placed in a vector space (dim₁, dim₂) grouped as Legal, Technical and Financial]
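The normalize-then-segment step can be illustrated with a deliberately simple stand-in. The sketch below packs whole paragraphs into size-bounded chunks. It is paragraph-boundary chunking, not true semantic chunking, and it stops before the embedding step. The function names and the `max_chars` limit are assumptions for this example.

```python
import re


def normalize(text: str) -> str:
    """Collapse runs of spaces/tabs and trim each line; blank lines are kept."""
    lines = [re.sub(r"[ \t]+", " ", line).strip() for line in text.splitlines()]
    return "\n".join(lines)


def chunk_paragraphs(text: str, max_chars: int = 200) -> list[str]:
    """Greedily pack whole paragraphs into chunks of at most max_chars.

    A single paragraph longer than max_chars becomes its own oversized chunk.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", normalize(text)) if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


sample = "Article 8.1   Advance Payment.\n\nArticle 9 Commencement of Work.\n\nAnnex 16.1 LNTP 1."
chunks = chunk_paragraphs(sample, max_chars=40)
```

Keeping paragraphs intact avoids cutting a clause in half, which is the main property a real semantic chunker also tries to preserve.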
05
Unification & Resolution

One truth. No contradictions.

Negent resolves semantic conflicts automatically, surfaces version families, and maps relationships across related content. Where ambiguity remains, human validation steps in.

Deduplication · Version chains · HITL on at-risk content
[Diagram: version chain reconstructed: contract_v1.pdf (2022-03-01) → contract_v2.pdf (2022-06-15) → contract_FINAL.pdf (2022-10-01, canonical reference); contract_FINAL2.pdf: duplicate detected; HITL triggered, human validation required]
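A version family like the one in this step can be sketched with exact-content fingerprints. This is a simplification under stated assumptions: real semantic deduplication goes beyond identical bytes, the "latest distinct version wins" rule is only one possible promotion policy, and the `FileVersion` class, `build_chain` function, file contents and the date given to contract_FINAL2.pdf are all hypothetical.

```python
import hashlib
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FileVersion:
    name: str
    modified: date
    content: bytes

    @property
    def fingerprint(self) -> str:
        return hashlib.sha256(self.content).hexdigest()


def build_chain(files: list[FileVersion]) -> tuple[list[str], dict[str, str], str]:
    """Return (version chain, duplicates mapped to the copy kept, canonical name)."""
    if not files:
        raise ValueError("a version family needs at least one file")
    kept: dict[str, FileVersion] = {}
    duplicates: dict[str, str] = {}
    for f in sorted(files, key=lambda f: (f.modified, f.name)):
        original = kept.setdefault(f.fingerprint, f)
        if original is not f:
            duplicates[f.name] = original.name
    chain = [f.name for f in kept.values()]   # dicts preserve insertion order
    return chain, duplicates, chain[-1]       # assumed policy: latest distinct wins


files = [
    FileVersion("contract_v1.pdf", date(2022, 3, 1), b"draft one"),
    FileVersion("contract_v2.pdf", date(2022, 6, 15), b"draft two"),
    FileVersion("contract_FINAL.pdf", date(2022, 10, 1), b"final text"),
    FileVersion("contract_FINAL2.pdf", date(2022, 10, 3), b"final text"),  # assumed date
]
chain, duplicates, canonical = build_chain(files)
```

In this toy family, contract_FINAL2.pdf is flagged as a copy of contract_FINAL.pdf, and the latter is promoted as the canonical reference.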
06
Governance

Continuous Loop

Your foundation stays reliable as your business evolves. Every new piece of content or rule change is automatically propagated. Logs, SLAs, drift alerts — full observability.

Delta sync · Audit trail · Targeted reprocessing · Full observability
[Diagram: real-time loop, always current: delta → reprocessing → audit → alerts around a trusted corpus; SLA 99% uptime, trail history, drift alert]
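Delta sync with targeted reprocessing can be pictured as a comparison of two snapshots followed by a lookup of the affected families. The sketch below assumes each snapshot is a `{path: fingerprint}` mapping. `compute_delta`, `families_to_reprocess` and the sample data are hypothetical.

```python
def compute_delta(previous: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Compare two {path: fingerprint} snapshots of a source system."""
    return {
        "added": sorted(p for p in current if p not in previous),
        "changed": sorted(p for p in current
                          if p in previous and current[p] != previous[p]),
        "removed": sorted(p for p in previous if p not in current),
    }


def families_to_reprocess(delta: dict[str, list[str]],
                          family_of: dict[str, str]) -> set[str]:
    """Map every touched path to its document family; only those are reprocessed."""
    touched = delta["added"] + delta["changed"] + delta["removed"]
    return {family_of[p] for p in touched if p in family_of}


previous = {"a.pdf": "h1", "b.pdf": "h2", "c.pdf": "h3"}
current = {"a.pdf": "h1", "b.pdf": "h2x", "d.pdf": "h4"}
family_of = {"a.pdf": "epc", "b.pdf": "epc-amendments",
             "c.pdf": "hse", "d.pdf": "epc-amendments"}

delta = compute_delta(previous, current)
to_reprocess = families_to_reprocess(delta, family_of)
```

The untouched "epc" family is left alone, which is what keeps a continuous loop cheap compared with rebuilding the whole index.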
Negent Architecture
Source Systems: SharePoint - Drive - ECM - Network - Intranet
↓ connectors - read-only - ACLs captured
Negent Clean (you are here): Truth Layer - AI-ready Index - Business Graph
↓ trusted corpus - native ACLs - embeddings
Negent Intelligence: Source RAG - Semantic Search - Records
↓ robust - traceable - governed foundation
Negent Agentic (coming soon): Automated Actions - Workflows - Agents

Clean builds the foundation.
The rest becomes possible.

Intelligence and Agentic can't scale on broken ground. A RAG built on a chaotic corpus doesn't fix the chaos — it amplifies it, answering confidently from the wrong source.

Activate Clean first and you create the conditions for the hardest transition in AI: from POC to production. Reliable, versioned, traceable — with permissions enforced at document level.

→ Source files never leave your environment.

FAQ

We have the answers
you're looking for.

Does Negent Clean replace our ECM/SharePoint?
No. Clean connects to your source systems without replacing or touching them. Your files stay exactly where they are. It builds a truth layer on top of your existing repositories — an intelligent index, not a migration. Your DMS keeps running exactly as before. Clean just makes it AI-ready.
Do we need perfect metadata or a finalized taxonomy before we start?
No — and that's the point. Clean is designed for imperfect corpora: inconsistent naming, incomplete tags, fragmented environments. It infers structure, enriches metadata, and proposes a taxonomy from what exists. You refine the rules over time through the scope configurator.
Where is our data stored with Clean?
Source files remain in your systems (SharePoint, Drive, servers). Clean never moves them. Only metadata, the semantic index, and the relationship graph are stored in Negent — with at-rest and in-transit encryption, tenant isolation, and propagated ACLs. You maintain full control: your documents never leave your infrastructure.
How does the corpus stay reliable over time?
Delta mode keeps it current without rebuilding from scratch. Clean continuously detects changes in your source systems and reprocesses only the affected families — nothing more. Rule change? Only impacted families are updated. Full observability through logs, SLAs, queues, and drift alerts means you always know the state of your foundation.
How long before Clean delivers results?
Days, not months. From the connection and inventory phase — typically on a pilot scope — you get a document chaos report, identified duplicate families, and reconstructed version chains. Concrete evidence before any full deployment decision.
Why not just use SharePoint Search or our existing ECM?
Native tools (SharePoint, ECM) index everything without arbitration. Result: your search returns 15 versions of the same contract without telling you which one is binding. Clean solves this upstream: it identifies the authoritative reference, reconstructs version chains, detects semantic duplicates, and applies your business rules. SharePoint Search indexes. Clean governs.
Who decides when Clean detects a conflict or ambiguity?
Clean does — until it shouldn't. Clear-cut cases are resolved automatically based on your business rules. When ambiguity crosses a threshold — two signed versions dated close together, conflicting metadata — Clean triggers Human-in-the-Loop mode. A team expert is notified and validates the call in the interface. Automation handles the volume. Humans handle the judgment calls.
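The hand-off between automation and human review can be sketched as a simple threshold rule. The 0.88 value is the confidence threshold shown in the scope configuration example; applying it to conflict resolution, along with the `Resolution` class and `route` function, is an assumption made for illustration.

```python
from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 0.88  # value shown in the scope configuration example


@dataclass(frozen=True)
class Resolution:
    family: str
    proposed_canonical: str
    confidence: float


def route(res: Resolution, threshold: float = CONFIDENCE_THRESHOLD) -> str:
    """Auto-apply confident resolutions; queue the rest for a human reviewer."""
    return "auto" if res.confidence >= threshold else "human_review"


clear_case = Resolution("epc-contract", "contract_FINAL.pdf", 0.97)
ambiguous_case = Resolution("epc-amendments", "Amendment_2_Rev_VF.pdf", 0.61)
```

Here `route(clear_case)` is applied automatically, while `route(ambiguous_case)` is sent to an expert for validation.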
Contact

Let's talk about
your corpus.

Request a demo on your own scope. We assess your document chaos level and show you what Clean delivers — before any commitment.

🕐
Response within 24 hours
A Negent expert will reach out to scope your needs and propose a demo tailored to your environment.
🔎
Free diagnostic
Before any commitment, we produce a document chaos report on a sample of your corpus.
🔒
NDA available
For sensitive discussions, a confidentiality agreement can be signed from the very first contact.

Build your AI
on a real foundation.

Request a demo on your own corpus. We assess your document chaos level and show you what Clean delivers — before any commitment.

Negent Agentic

This module is currently in development. Agentic will enable automated actions across your information systems, directly from a trusted, governed document foundation.

Leave your contact details to be notified first.