Manufacturer Registry
Last Updated: 2026-01-18
The manufacturer registry prevents duplicate manufacturer entries when data is ingested from multiple sources.
Problem it solves
Different sources refer to the same manufacturer in inconsistent ways (IDs, casing, legacy names). Without a registry, Neo4j would accumulate duplicate manufacturer nodes, and relationships that belong to a single manufacturer would end up split across them.
Example: “LOC Precision” may appear as LOC, Loc, LOC Precision, or a Tripoli-generated ID.
How it works
- Exact match (case-insensitive) against known aliases
- Fuzzy match when exact match fails (configurable threshold)
- Auto-create a new canonical entry when no match is found (only if auto-creation is enabled)
graph LR
  A[Source A: "LOC Precision"] --> R[Registry]
  B[Source B: "loc-621319214668"] --> R
  R --> C[Canonical: "LOC"]
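The lookup order described above can be sketched roughly as follows. This is a minimal illustration and not the code in manufacturer_registry.py: the class name `RegistrySketch`, its methods, the 0.85 threshold, and the use of `difflib.SequenceMatcher` for fuzzy scoring are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher
from typing import Dict, Optional


@dataclass
class RegistrySketch:
    """Hypothetical stand-in for the real registry (names are illustrative)."""

    # Maps a lowercased alias to its canonical manufacturer name.
    aliases: Dict[str, str] = field(default_factory=dict)
    fuzzy_threshold: float = 0.85  # assumed value; the real threshold is configurable
    auto_create: bool = False

    def add(self, canonical: str, *aliases: str) -> None:
        """Register a canonical name together with any known aliases."""
        for name in (canonical, *aliases):
            self.aliases[name.strip().lower()] = canonical

    def resolve(self, raw: str) -> Optional[str]:
        """Return the canonical name for `raw`, or None if nothing matches."""
        key = raw.strip().lower()

        # Step 1: exact, case-insensitive match against known aliases.
        if key in self.aliases:
            return self.aliases[key]

        # Step 2: fuzzy match; keep the best-scoring alias above the threshold.
        best_score, best_canonical = 0.0, None
        for alias, canonical in self.aliases.items():
            score = SequenceMatcher(None, key, alias).ratio()
            if score > best_score:
                best_score, best_canonical = score, canonical
        if best_canonical is not None and best_score >= self.fuzzy_threshold:
            return best_canonical

        # Step 3: optionally create a new canonical entry from the raw name.
        if self.auto_create:
            created = raw.strip()
            self.add(created)
            return created
        return None


registry = RegistrySketch()
registry.add("LOC", "LOC Precision", "loc-621319214668")

exact = registry.resolve("loc precision")     # exact match -> "LOC"
by_id = registry.resolve("loc-621319214668")  # source-specific ID -> "LOC"
typo = registry.resolve("LOC Precison")       # fuzzy match (misspelling) -> "LOC"
unknown = registry.resolve("Estes")           # no match, auto_create off -> None
```

Trying the exact match first keeps the common case cheap and deterministic; the fuzzy pass only runs for names the alias table has never seen, which limits the chance of a false merge.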
Configuration
Mappings are driven by YAML (no code changes required):
data/platform/config/manufacturer_mappings.yaml
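As an illustration of how such mappings might be turned into a lookup table once the YAML is parsed, here is a small sketch. The schema shown (canonical name mapped to a list of aliases) and the helper `build_alias_index` are hypothetical; the actual layout of manufacturer_mappings.yaml may differ.

```python
from typing import Dict, List

# Hypothetical shape of the parsed YAML: canonical name -> list of aliases.
# The real schema of manufacturer_mappings.yaml is not shown in this document.
parsed_mappings: Dict[str, List[str]] = {
    "LOC": ["LOC Precision", "loc-621319214668"],
}


def build_alias_index(mappings: Dict[str, List[str]]) -> Dict[str, str]:
    """Flatten the mappings into a lowercased alias -> canonical lookup."""
    index: Dict[str, str] = {}
    for canonical, aliases in mappings.items():
        for name in [canonical, *aliases]:
            key = name.strip().lower()
            existing = index.get(key)
            # An alias claimed by two canonical names would recreate the
            # duplicate problem, so fail loudly instead of picking one.
            if existing is not None and existing != canonical:
                raise ValueError(
                    f"alias {name!r} maps to both {existing!r} and {canonical!r}"
                )
            index[key] = canonical
    return index


alias_index = build_alias_index(parsed_mappings)
```

Because the index is built entirely from the configuration data, adding a new alias is a config edit rather than a code change.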
Implementation
- Library: data/platform/src/libraries/manufacturer_registry.py
Where it’s used
- Primarily during Data Cleaning
- Also helpful for any ingest step that needs stable manufacturer IDs