Terminium Plus API

API for querying the Terminium Plus terminology database

Overview

Terminium Plus API provides access to a multilingual terminology database. It supports term lookups in English, French, Spanish, and Portuguese, allowing users to search for terminology across different domains and subjects.

Fuzzy Search Feature: The API uses fuzzy matching, so a search term with slight spelling variations or errors can still return relevant matches. How close a match must be is controlled by the term_threshold and subject_threshold parameters.
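This README does not say which fuzzy-matching library the API uses. The sketch below uses Python's standard-library difflib as a stand-in to illustrate how a 0-100 threshold such as term_threshold could filter candidates. The function names and the scoring method are assumptions for illustration, and the scores the real API computes may differ.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a case-insensitive similarity score on a 0-100 scale."""
    return 100.0 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fuzzy_filter(query: str, candidates: list[str], threshold: int = 80) -> list[str]:
    """Keep the candidates whose score against the query meets the threshold."""
    return [c for c in candidates if similarity(query, c) >= threshold]


if __name__ == "__main__":
    # Hypothetical candidate list; the real API searches the full database.
    terms = ["Boxing", "Bowling", "Fencing"]
    print(fuzzy_filter("Boxng", terms))
```

Lowering the threshold admits looser matches, and raising it toward 100 approaches exact matching.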

Who Benefits from This Database

The database is particularly valuable for professionals who work with complex legal, governmental, and industry-specific terminology, including:

  • Translators ensuring precise term usage
  • Policymakers and legal experts handling legislative texts and official documents
  • Regulatory bodies working with trade agreements, compliance, and legal frameworks
  • Researchers and domain specialists requiring authoritative terminology references

Unique Content

Unlike general dictionaries or translation tools, this database contains a vast collection of terms that exist nowhere else. Derived from TERMIUM Plus®, the Government of Canada's official terminology and linguistic data bank, it includes millions of terms covering:

  • Government agency names
  • Corporate and organizational names
  • Legal concepts and industry regulations
  • Crop types, import/export rules, and trade-specific terminology
  • Specialized jargon across multiple domains

These terms are essential for accurate communication in international trade, legal documentation, policy development, and technical translations.

Terminology Database

The full terminology database is available for download:

Download CSV Dataset (from the GitHub release page): https://github.com/julianchen24/Termium-Plus-API/releases/download/v1.0/combined.csv

Note: The CSV file is required to utilize the service and is not included in the GitHub repository due to its size. After downloading, place it in the 'data' folder of your project.

CSV Format

id,subject_en,term_en,term_en_parameter,abbreviation_en,terme_fr,domaine_fr,...

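The CSV file contains multilingual terminology data. As the header excerpt above shows, each record has an id plus per-language fields such as the subject (subject_en, domaine_fr), the term (term_en, terme_fr), a term parameter (term_en_parameter), and an abbreviation (abbreviation_en).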
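A minimal sketch of reading the file with Python's csv module, checking only the columns visible in the header excerpt. The function name, the sample row, and the UTF-8 encoding mentioned below are assumptions. The real file has more columns than are listed here.

```python
import csv
import io
from typing import Iterator, TextIO

# Columns visible in the README's header excerpt; the real file has more.
EXPECTED_COLUMNS = [
    "id", "subject_en", "term_en", "term_en_parameter",
    "abbreviation_en", "terme_fr", "domaine_fr",
]


def read_terms(handle: TextIO) -> Iterator[dict]:
    """Yield one dict per CSV row after checking the expected columns exist."""
    reader = csv.DictReader(handle)
    present = reader.fieldnames or []
    missing = [c for c in EXPECTED_COLUMNS if c not in present]
    if missing:
        raise ValueError(f"CSV is missing columns: {missing}")
    yield from reader


# Made-up sample row, modeled on the "Boxing" example used in this README.
SAMPLE_CSV = (
    "id,subject_en,term_en,term_en_parameter,abbreviation_en,terme_fr,domaine_fr\n"
    "1,Sports,Boxing,,,Boxe,Sports\n"
)

for row in read_terms(io.StringIO(SAMPLE_CSV)):
    print(row["term_en"], "->", row["terme_fr"])
```

For the real dataset, pass an open file instead of the sample, for example open("data/combined.csv", newline="", encoding="utf-8"), assuming the file is UTF-8 encoded.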

API Endpoints

Root Endpoint

GET / - Returns a welcome message

Term Search

GET /term - Search for terms in the database

Parameters:

| Parameter | Type | Required | Description |
|---|---|---|---|
| term | string | Yes | The term to search for |
| lang | string | Yes | Language code for the term being searched (en, fr, es, pt) |
| subject | string | No | Domain or subject area that provides context for the terminology (e.g., "Sports", "Medicine", "Law") |
| term_threshold | integer | No | Similarity threshold for term matching (0-100, default: 80) |
| subject_threshold | integer | No | Similarity threshold for subject matching (0-100, default: 70) |
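The parameters above can be assembled into a request URL on the client side. The helper below is a hypothetical sketch, not part of the project: it validates the language code and threshold ranges listed above and URL-encodes the query string. The default base URL assumes a local server on port 8000.

```python
from typing import Optional
from urllib.parse import urlencode

SUPPORTED_LANGS = {"en", "fr", "es", "pt"}


def build_term_url(
    term: str,
    lang: str,
    subject: Optional[str] = None,
    term_threshold: Optional[int] = None,
    subject_threshold: Optional[int] = None,
    base_url: str = "http://localhost:8000",  # assumed local deployment
) -> str:
    """Build a GET /term URL, rejecting values outside the documented ranges."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"lang must be one of {sorted(SUPPORTED_LANGS)}")
    params = {"term": term, "lang": lang}
    if subject is not None:
        params["subject"] = subject
    thresholds = (
        ("term_threshold", term_threshold),
        ("subject_threshold", subject_threshold),
    )
    for name, value in thresholds:
        if value is None:
            continue  # omit the parameter so the server default applies
        if not 0 <= value <= 100:
            raise ValueError(f"{name} must be between 0 and 100")
        params[name] = value
    return f"{base_url}/term?{urlencode(params)}"


print(build_term_url("Boxing", "en", subject="Sports"))
```

Omitted thresholds are left out of the query string, so the server-side defaults (80 and 70) apply.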

Example Usage

Search for the term "Boxing" in English:

GET /term?term=Boxing&lang=en

Search with subject filter:

GET /term?term=Boxing&lang=en&subject=Sports

Adjust similarity thresholds:

GET /term?term=Boxing&lang=en&term_threshold=75&subject_threshold=60

Fuzzy search example (will still find "Boxing" with a typo):

GET /term?term=Boxng&lang=en

Response Format

The API returns results in JSON format:

{
  "count": 1,
  "results": [
    {
      "id": 1,
      "subject_en": "Sports",
      "term_en": "Boxing",
      "term_en_parameter": null,
      "abbreviation_en": null,
      // Other fields...
      "terme_fr": "Boxe",
      // Additional language translations...
    }
  ]
}
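The // comments in the example above stand in for omitted fields; JSON itself does not allow comments, so an actual response would not contain them. As a rough sketch of consuming the response, the snippet below parses a payload shaped like the example and extracts English-French pairs. The helper name is hypothetical, and only fields shown in the example are used.

```python
import json

# Sample payload limited to the fields shown in the README's example.
SAMPLE_RESPONSE = """
{
  "count": 1,
  "results": [
    {
      "id": 1,
      "subject_en": "Sports",
      "term_en": "Boxing",
      "term_en_parameter": null,
      "abbreviation_en": null,
      "terme_fr": "Boxe"
    }
  ]
}
"""


def english_french_pairs(payload: dict) -> list[tuple]:
    """Return (term_en, terme_fr) for every result in a /term response."""
    return [(r.get("term_en"), r.get("terme_fr")) for r in payload.get("results", [])]


payload = json.loads(SAMPLE_RESPONSE)
print(payload["count"], english_french_pairs(payload))
```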

Installation

  • Docker Installation
  • Manual Installation

Docker Installation

The fastest way to get started is using Docker:

# Clone the repository
git clone https://github.com/julianchen24/Termium-Plus-API.git
cd Termium-Plus-API

# Download the CSV file from the GitHub release
# The direct download link is: https://github.com/julianchen24/Termium-Plus-API/releases/download/v1.0/combined.csv
# Place the downloaded combined.csv file in the 'data' directory
# For Windows: copy combined.csv data\
# For Mac/Linux: cp combined.csv data/

# Build and run with Docker
docker build -t terminium-api .
docker run -p 8000:8000 terminium-api

Once the container is running, the API is available at http://localhost:8000

Manual Installation

If you prefer to run without Docker:

# Clone the repository
git clone https://github.com/julianchen24/Termium-Plus-API.git
cd Termium-Plus-API

# Create and activate virtual environment
python -m venv venv
venv\Scripts\Activate.ps1  # Windows
source venv/bin/activate   # Linux/Mac

# Install dependencies
pip install -r requirements.txt

# Download the CSV file from the GitHub release
# The direct download link is: https://github.com/julianchen24/Termium-Plus-API/releases/download/v1.0/combined.csv
# Place the downloaded combined.csv file in the 'data' directory
# For Windows: copy combined.csv data\
# For Mac/Linux: cp combined.csv data/

# Initialize the database
python scripts/db_setup.py
python scripts/import_data.py

# Run the API
uvicorn app.main:app --reload

Once the server is running, the API is available at http://localhost:8000
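To confirm that an installation works, a small smoke test can call the root endpoint and run one term search. This is a sketch using only the standard library. It assumes both endpoints return JSON and that the search response contains a "results" key, as in the response example. The function names are hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_json(url: str, timeout: float = 5.0):
    """GET a URL and decode the response body as JSON."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def smoke_test(base_url: str = "http://localhost:8000", fetch=fetch_json) -> bool:
    """Return True if the root and /term endpoints both answer as expected."""
    try:
        fetch(f"{base_url}/")
        result = fetch(f"{base_url}/term?term=Boxing&lang=en")
    except (OSError, ValueError):
        # OSError covers connection and HTTP errors; ValueError covers bad JSON.
        return False
    return isinstance(result, dict) and "results" in result
```

With the server up, smoke_test() should return True; it returns False if the server is unreachable or answers with something other than the expected JSON. The fetch argument exists so the check can be exercised without a network connection.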