Overview
The Termium Plus API provides access to a multilingual terminology database. It supports term lookups in English, French, Spanish, and Portuguese, allowing users to search for terminology across different domains and subjects.
Fuzzy Search Feature: The API uses fuzzy matching, so a search term with minor spelling variations or errors can still return relevant matches. Matching strictness is controlled by the term_threshold and subject_threshold parameters, which set the minimum similarity score (0-100) for term and subject matches respectively.
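To make the threshold idea concrete, here is a minimal sketch of threshold-based fuzzy filtering. It uses Python's standard-library `difflib` purely for illustration; the matching library and scoring formula the API actually uses are not stated here, so the function names (`similarity`, `fuzzy_filter`) and the resulting scores are assumptions.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> int:
    """Return a 0-100 similarity score, ignoring case (illustrative scoring only)."""
    return round(SequenceMatcher(None, a.lower(), b.lower()).ratio() * 100)


def fuzzy_filter(query: str, candidates: list[str], threshold: int = 80) -> list[tuple[str, int]]:
    """Keep candidates scoring at or above the threshold, best match first."""
    scored = [(candidate, similarity(query, candidate)) for candidate in candidates]
    matches = [(candidate, score) for candidate, score in scored if score >= threshold]
    return sorted(matches, key=lambda pair: pair[1], reverse=True)


# Hypothetical term list; "Boxng" is a misspelling of "Boxing".
terms = ["Boxing", "Bowling", "Fencing"]
print(fuzzy_filter("Boxng", terms))
```

With this scoring, lowering the threshold admits looser matches such as "Bowling", while raising it narrows the results toward exact spellings.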
Who Benefits from This Database
The database is particularly valuable for professionals dealing with complex legal, governmental, and industry-specific terminology, including:
- Translators ensuring precise term usage
- Policymakers and legal experts handling legislative texts and official documents
- Regulatory bodies working with trade agreements, compliance, and legal frameworks
- Researchers and domain specialists requiring authoritative terminology references
Unique Content
Unlike general dictionaries or translation tools, this database contains a large collection of specialized terms that are hard to find elsewhere. Derived from TERMIUM Plus®, the Government of Canada's official terminology and linguistic data bank, it includes millions of terms covering:
- Government agency names
- Corporate and organizational names
- Legal concepts and industry regulations
- Crop types, import/export rules, and trade-specific terminology
- Specialized jargon across multiple domains
These terms are essential for accurate communication in international trade, legal documentation, policy development, and technical translations.
Terminology Database
The full terminology database is available for download from the GitHub release: https://github.com/julianchen24/Termium-Plus-API/releases/download/v1.0/combined.csv
Note: The CSV file is required to use the service and is not included in the GitHub repository due to its size. After downloading, place it in the 'data' folder of your project.
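A small pre-flight check can confirm the file is where the service expects it before you start anything. The sketch below assumes the file is named `combined.csv` (the name used in the installation steps) and lives in a `data` folder under the project root; the helper names `dataset_path` and `dataset_ready` are hypothetical and not part of the project.

```python
from pathlib import Path


def dataset_path(project_root: str = ".", filename: str = "combined.csv") -> Path:
    """Build the expected location of the CSV inside the project's data folder."""
    return Path(project_root) / "data" / filename


def dataset_ready(project_root: str = ".") -> bool:
    """Return True if the CSV exists at the expected location."""
    return dataset_path(project_root).is_file()


if __name__ == "__main__":
    path = dataset_path()
    if dataset_ready():
        print(f"Found dataset at {path}")
    else:
        print(f"Missing {path}: download combined.csv and place it in the data folder")
```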
CSV Format
id,subject_en,term_en,term_en_parameter,abbreviation_en,terme_fr,domaine_fr,...
The CSV file contains multilingual terminology data with fields for:
- Terms in English, French, Spanish, and Portuguese
- Subject/domain classifications
- Abbreviations and parameters
- Synonyms and textual support
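As a rough illustration of the layout, the following sketch reads rows with the standard-library `csv` module. It uses only the columns visible in the header above (the real file has more), and the sample row is made-up data modeled on the "Boxing" example used elsewhere in this README. The helper name `load_rows` is hypothetical.

```python
import csv
import io

# Illustrative sample only: a subset of the real columns and one invented row.
SAMPLE_CSV = """id,subject_en,term_en,term_en_parameter,abbreviation_en,terme_fr,domaine_fr
1,Sports,Boxing,,,Boxe,Sports
"""


def load_rows(handle) -> list[dict]:
    """Read CSV rows into dicts, mapping empty cells to None."""
    reader = csv.DictReader(handle)
    return [
        {key: (value if value != "" else None) for key, value in row.items()}
        for row in reader
    ]


rows = load_rows(io.StringIO(SAMPLE_CSV))
print(rows[0]["term_en"], "->", rows[0]["terme_fr"])
```

For the real file, the same function could be given `open("data/combined.csv", newline="", encoding="utf-8")`, assuming the file is UTF-8 encoded.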
API Endpoints
Root Endpoint
GET /
- Returns a welcome message
Term Search
GET /term
- Search for terms in the database
Parameters:
Parameter | Type | Required | Description |
---|---|---|---|
term | string | Yes | The term to search for |
lang | string | Yes | Language code for the term being searched (en, fr, es, pt) |
subject | string | No | Domain or subject area that provides context for the terminology (e.g., "Sports", "Medicine", "Law") |
term_threshold | integer | No | Similarity threshold for term matching (0-100, default: 80) |
subject_threshold | integer | No | Similarity threshold for subject matching (0-100, default: 70) |
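The parameter rules in the table can be checked on the client side before a request is sent. This is a minimal sketch using only the standard library; `build_term_url` is a hypothetical helper, and the base URL assumes a local instance running on port 8000 as described under Installation.

```python
from typing import Optional
from urllib.parse import urlencode

SUPPORTED_LANGS = {"en", "fr", "es", "pt"}


def build_term_url(
    base_url: str,
    term: str,
    lang: str,
    subject: Optional[str] = None,
    term_threshold: Optional[int] = None,
    subject_threshold: Optional[int] = None,
) -> str:
    """Validate the documented parameters and return a GET /term URL."""
    if not term:
        raise ValueError("term is required")
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"lang must be one of {sorted(SUPPORTED_LANGS)}")

    params = {"term": term, "lang": lang}
    if subject is not None:
        params["subject"] = subject
    for name, value in (
        ("term_threshold", term_threshold),
        ("subject_threshold", subject_threshold),
    ):
        if value is not None:
            if not 0 <= value <= 100:
                raise ValueError(f"{name} must be between 0 and 100")
            params[name] = value

    # urlencode percent-encodes values, so terms with spaces or accents are safe.
    return f"{base_url.rstrip('/')}/term?{urlencode(params)}"


print(build_term_url("http://localhost:8000", "Boxing", "en", subject="Sports"))
```

Omitted optional parameters are left out of the URL so the server-side defaults (80 and 70) apply.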
Example Usage
Search for the term "Boxing" in English:
GET /term?term=Boxing&lang=en
Search with subject filter:
GET /term?term=Boxing&lang=en&subject=Sports
Adjust similarity thresholds:
GET /term?term=Boxing&lang=en&term_threshold=75&subject_threshold=60
Fuzzy search example (will still find "Boxing" with a typo):
GET /term?term=Boxng&lang=en
Response Format
The API returns results in JSON format:
    {
      "count": 1,
      "results": [
        {
          "id": 1,
          "subject_en": "Sports",
          "term_en": "Boxing",
          "term_en_parameter": null,
          "abbreviation_en": null,
          // Other fields...
          "terme_fr": "Boxe",
          // Additional language translations...
        }
      ]
    }
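A client might pull term pairs out of such a response as sketched below. The sample payload is a trimmed, valid-JSON version of the example above (the comment placeholders are removed and only the fields shown there are kept); `extract_translations` is a hypothetical helper, and field names for languages other than English and French are not shown in this README, so they are not assumed here.

```python
import json

# Trimmed copy of the documented example response, limited to the fields shown.
SAMPLE_RESPONSE = """
{
  "count": 1,
  "results": [
    {
      "id": 1,
      "subject_en": "Sports",
      "term_en": "Boxing",
      "term_en_parameter": null,
      "abbreviation_en": null,
      "terme_fr": "Boxe"
    }
  ]
}
"""


def extract_translations(payload: dict, source_field: str = "term_en", target_field: str = "terme_fr") -> list[tuple]:
    """Return (source, target) term pairs from a response payload."""
    return [
        (result.get(source_field), result.get(target_field))
        for result in payload.get("results", [])
    ]


payload = json.loads(SAMPLE_RESPONSE)
print(payload["count"], extract_translations(payload))
```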
Installation
Docker Installation
The fastest way to get started is using Docker:
    # Clone the repository
    git clone https://github.com/julianchen24/Termium-Plus-API.git
    cd Termium-Plus-API

    # Download the CSV file from the GitHub release
    # The direct download link is: https://github.com/julianchen24/Termium-Plus-API/releases/download/v1.0/combined.csv
    # Place the downloaded combined.csv file in the 'data' directory
    # For Windows: copy combined.csv data/
    # For Mac/Linux: cp combined.csv data/

    # Build and run with Docker
    docker build -t terminium-api .
    docker run -p 8000:8000 terminium-api
Once running, access the API at http://localhost:8000
Manual Installation
If you prefer to run without Docker:
    # Clone the repository
    git clone https://github.com/julianchen24/Termium-Plus-API.git
    cd Termium-Plus-API

    # Create and activate virtual environment
    python -m venv venv
    venv\Scripts\Activate.ps1  # Windows
    source venv/bin/activate   # Linux/Mac

    # Install dependencies
    pip install -r requirements.txt

    # Download the CSV file from the GitHub release
    # The direct download link is: https://github.com/julianchen24/Termium-Plus-API/releases/download/v1.0/combined.csv
    # Place the downloaded combined.csv file in the 'data' directory
    # For Windows: copy combined.csv data/
    # For Mac/Linux: cp combined.csv data/

    # Initialize the database
    python scripts/db_setup.py
    python scripts/import_data.py

    # Run the API
    uvicorn app.main:app --reload
Once running, access the API at http://localhost:8000