AI case study: Comparison of contract terms

ActivDev

Case study: Understand complex licensing contracts and compare them easily

Customer issue

This streaming platform specializing in language learning works with partners from around the world to acquire international film and series rights.
Each agreement is formalized in a PDF contract, often accompanied by amendments. These files are stored in Google Drive, in sub-folders specific to each partner.

Over time, the platform found itself with dozens of contracts to manage, all different, often long and technical.
But there was no simple solution for extracting key information:

  • How much does each title cost?

  • For which platforms?

  • In which territories?

  • What exact rights are granted?

The team had to reread each contract by hand to find the right information, which slowed down decision-making and legal validation.

Context

The customer had no tooling in place at the time; the team relied on memory and on rereading the contracts.

They wanted to save time, avoid mistakes, get a clear view of each agreement without having to reread the entire PDF, and be able to compare contracts.

But contracts varied enormously:

  • Some were simple (a single title, a global amount).

  • Others were very complex (several titles, each with different rights, prices, territories and dates).

They needed an automated, stable system adapted to the diversity of their contracts.

Challenges

  • A wide range of formats by partner (terminology, structure, level of detail)

  • Some documents included several titles with different conditions

  • Prices could be global or per episode, depending on the case

  • The contracts were in non-editable PDF format, requiring reliable OCR

  • The structure of their Drive included numerous sub-folders, which conventional automations did not handle well

  • Airtable rejected some data if it was poorly formatted (e.g. type errors, nested fields, unhandled line feeds)

Work carried out by ActivDev

What we have put in place

1. Intelligent document collection

Every day, the automation scans all Drive files. It identifies new PDF files and retrieves them even from deep sub-folders.
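The collection step amounts to walking a folder tree and keeping only the PDFs that have not been processed yet. The following is a minimal sketch of that idea, not the actual n8n workflow. `find_new_pdfs`, `list_children` and `already_seen` are hypothetical names; in a real setup `list_children` would presumably wrap a Drive API `files.list` call with a query such as `'<folder_id>' in parents and trashed = false`. Here an in-memory dictionary stands in for Drive so the logic can be run as is.

```python
from typing import Callable, Iterator

# MIME types used by Google Drive for folders and by PDF files.
FOLDER_MIME = "application/vnd.google-apps.folder"
PDF_MIME = "application/pdf"

# Each entry mirrors the "id", "name" and "mimeType" fields of a Drive file.
DriveEntry = dict[str, str]


def find_new_pdfs(
    root_id: str,
    list_children: Callable[[str], list[DriveEntry]],
    already_seen: set[str],
) -> Iterator[DriveEntry]:
    """Walk every folder below root_id and yield PDFs not yet processed."""
    stack = [root_id]
    visited: set[str] = set()
    while stack:
        folder_id = stack.pop()
        if folder_id in visited:
            continue  # guard against a folder being reachable twice
        visited.add(folder_id)
        for entry in list_children(folder_id):
            if entry["mimeType"] == FOLDER_MIME:
                stack.append(entry["id"])  # descend into the sub-folder later
            elif entry["mimeType"] == PDF_MIME and entry["id"] not in already_seen:
                yield entry


# In-memory stand-in for a Drive tree (made-up sample data).
FAKE_DRIVE = {
    "root": [
        {"id": "partnerA", "name": "Partner A", "mimeType": FOLDER_MIME},
        {"id": "f1", "name": "notes.txt", "mimeType": "text/plain"},
    ],
    "partnerA": [
        {"id": "c1", "name": "contract.pdf", "mimeType": PDF_MIME},
        {"id": "amend", "name": "Amendments", "mimeType": FOLDER_MIME},
    ],
    "amend": [
        {"id": "c2", "name": "amendment-1.pdf", "mimeType": PDF_MIME},
    ],
}


def fake_list_children(folder_id: str) -> list[DriveEntry]:
    return FAKE_DRIVE.get(folder_id, [])


new_files = [
    entry["name"]
    for entry in find_new_pdfs("root", fake_list_children, already_seen={"c1"})
]
print(new_files)
```

Passing the listing function as a parameter keeps the traversal independent of any specific client library, which also makes it easy to test without credentials.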

2. Reading, OCR and data extraction

The PDF is first converted to text using OCR. Next, an AI agent reads the content and extracts the expected fields:

  • Dates, rights, territories, prices, languages, titles, renewal conditions, etc.

  • Even complex details such as price per episode or authorized languages are analyzed.

The agent was prompted to take into account:

  • The complexity of certain contracts

  • The fact that several titles can be present with different conditions

  • The need for a clear format that is easy to read in Airtable
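One way to picture the extraction target is a small schema in which a contract holds a list of titles, each with its own conditions. The sketch below is an assumption for illustration, not the schema used in the project: the class names, field names and the `parse_agent_output` helper are hypothetical, and the JSON is made-up sample data standing in for the agent's output.

```python
import json
from dataclasses import dataclass, field
from typing import Optional

ALLOWED_PRICE_BASES = {"global", "per_episode"}


@dataclass
class TitleTerms:
    """Conditions attached to one title within a contract."""
    title: str
    territories: list[str]
    platforms: list[str]
    languages: list[str]
    price: float
    currency: str
    price_basis: str  # "global" or "per_episode"
    start_date: Optional[str] = None  # ISO date string, if stated
    end_date: Optional[str] = None


@dataclass
class Contract:
    """One PDF contract, which may cover several titles."""
    partner: str
    source_file: str
    titles: list[TitleTerms] = field(default_factory=list)
    renewal_conditions: Optional[str] = None


def parse_agent_output(raw_json: str) -> Contract:
    """Turn the agent's JSON answer into typed objects, rejecting bad values."""
    data = json.loads(raw_json)
    titles = [TitleTerms(**item) for item in data.get("titles", [])]
    for terms in titles:
        if terms.price_basis not in ALLOWED_PRICE_BASES:
            raise ValueError(f"Unexpected price basis: {terms.price_basis!r}")
    return Contract(
        partner=data["partner"],
        source_file=data["source_file"],
        titles=titles,
        renewal_conditions=data.get("renewal_conditions"),
    )


# Made-up sample output for a contract covering two titles.
raw = """
{
  "partner": "Example Partner",
  "source_file": "example-partner/agreement.pdf",
  "renewal_conditions": null,
  "titles": [
    {"title": "Series A", "territories": ["FR", "BE"],
     "platforms": ["web", "mobile"], "languages": ["en", "fr"],
     "price": 500.0, "currency": "EUR", "price_basis": "per_episode"},
    {"title": "Film B", "territories": ["worldwide"],
     "platforms": ["web"], "languages": ["en"],
     "price": 12000.0, "currency": "EUR", "price_basis": "global",
     "start_date": "2024-01-01", "end_date": "2026-12-31"}
  ]
}
"""

contract = parse_agent_output(raw)
for terms in contract.titles:
    print(terms.title, terms.price, terms.currency, terms.price_basis)
```

Keeping the price basis as an explicit field is one way to handle contracts where some amounts are global and others are per episode.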

3. Clean insertion into Airtable

Before insertion, the data is:

  • Verified, cleaned and adapted to the Airtable format

  • Deduplicated if a contract has already been processed

  • Analyzed to understand terms, milestones, conditions and limitations
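The cleaning and deduplication steps can be sketched as follows. This is an illustrative example under stated assumptions rather than the project's code: `clean_value`, `to_airtable_fields` and `dedup_key` are hypothetical helpers, the `price` column is assumed to be numeric, and price strings are assumed to use commas as thousands separators. It addresses the three rejection causes mentioned earlier (type errors, nested fields, unhandled line feeds).

```python
import hashlib
import json
import re


def clean_value(value):
    """Make a single value safe for a flat, single-line table cell."""
    if isinstance(value, str):
        # Collapse line feeds and repeated whitespace into single spaces.
        return re.sub(r"\s+", " ", value).strip()
    if isinstance(value, (list, tuple)):
        # Flatten lists into a comma-separated string.
        return ", ".join(str(clean_value(item)) for item in value)
    if isinstance(value, dict):
        # Serialize nested objects instead of sending them as-is.
        return json.dumps(value, sort_keys=True, ensure_ascii=False)
    return value


def to_airtable_fields(record: dict) -> dict:
    """Drop empty values, clean the rest, and coerce the price to a number."""
    fields = {
        key: clean_value(value)
        for key, value in record.items()
        if value is not None
    }
    if "price" in fields:
        # Assumption: commas are thousands separators, e.g. "1,200.50".
        fields["price"] = float(str(fields["price"]).replace(",", ""))
    return fields


def dedup_key(source_file: str, title: str) -> str:
    """Stable key used to skip a contract title that was already stored."""
    normalized = f"{source_file.strip().lower()}|{title.strip().lower()}"
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Made-up sample record as it might come out of the extraction step.
record = {
    "title": "Example Series\nSeason 1",
    "territories": ["FR", "BE"],
    "price": "1,200.50",
    "notes": None,
}
fields = to_airtable_fields(record)
print(fields)

already_stored = {dedup_key("partner-a/contract.pdf", "Example Series Season 1")}
key = dedup_key("Partner-A/Contract.pdf ", fields["title"])
print("duplicate" if key in already_stored else "new")
```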

Each contract can therefore be consulted at a glance, without having to open the PDF, and can be compared with another.
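Once contracts are stored as structured records, comparing two of them reduces to a field-by-field difference. The snippet below is a minimal sketch of that idea with a hypothetical `compare_contracts` helper and made-up sample records; the actual comparison is done in Airtable views.

```python
def compare_contracts(a: dict, b: dict) -> dict:
    """Return {field: (value_in_a, value_in_b)} for every field that differs."""
    differences = {}
    for key in sorted(set(a) | set(b)):
        if a.get(key) != b.get(key):
            differences[key] = (a.get(key), b.get(key))
    return differences


# Made-up sample records for two agreements.
contract_a = {"territories": "FR, BE", "price": 500.0, "price_basis": "per_episode"}
contract_b = {"territories": "FR, BE", "price": 12000.0, "price_basis": "global",
              "languages": "en"}

diff = compare_contracts(contract_a, contract_b)
for field_name, (left, right) in diff.items():
    print(f"{field_name}: {left} vs {right}")
```

A field missing from one record shows up as `None` on that side, which makes gaps in a contract visible alongside actual differences.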

Tools used

  • n8n to orchestrate complete automation

  • Mistral & Gemini to extract text from scanned PDFs

  • OpenAI to analyze and structure key information

  • Google Drive API to retrieve all files, even in sub-folders

  • Airtable for storing, validating and viewing data in structured form

Results

Contract comparison in 1 minute instead of 30