Sync products from wholesale vendors like S&S Activewear into your WooCommerce store — automatically downloading catalogs, staging the data, filtering and mapping fields, and building variable products with images, sizes, colors, and brand logos. No CSV imports, no manual data entry.
Under the hood, Catalog Sync is a vendor-neutral ETL pipeline designed for large vendor datasets.
Version 2.0 introduced a major architectural shift from monolithic catalog ingestion to a modular, endpoint-based ingestion model.
Instead of importing a single large product feed, Catalog Sync now:
- ingests vendor data from multiple endpoints
- stages each dataset independently
- assembles a unified working dataset
- applies filtering and mapping before synchronization
- introduces deterministic endpoint-level execution using vendor_key + endpoint_key
- uses a single import entry point with endpoint-based routing
- standardizes ingestion from JSON into staging tables
- improves reliability for large inventory datasets and nested vendor payloads
- ensures endpoint processing aligns with the structure of each vendor data source
This approach provides:
- deterministic execution
- vendor-neutral architecture
- scalable product synchronization
- controlled catalog filtering
- safe batch execution
- reliable ingestion for large and complex datasets
WooCommerce becomes a projection of the assembled dataset, not the primary data source.
Key Features
- Vendor-neutral architecture
- Endpoint-based ingestion
- Stream-to-disk downloads
- Multi-table staging per endpoint
- Inventory warehouse-level normalization
- Assembled catalog working dataset
- JSON filter rules compiled to SQL with per-rule logic and nested groups
- Mapping profiles for WooCommerce product fields
- Batch-based WooCommerce synchronization
- SKU-based idempotent updates
- Featured image ingestion per product
- Color-specific variation image ingestion
- Brand logo ingestion to WooCommerce brand taxonomy terms
- Variable products grouped by styleID with color and size as variation attributes
- Manual field edit protection — sync never overwrites manually assigned categories
- Background synchronization support
- Database migration tool in admin Tools tab
- Unified admin ETL console
Architecture Overview
The system follows a modular staging-first data pipeline:
Vendor Endpoints
→ Endpoint Downloads
→ Raw Staging Tables
→ Catalog Assembly
→ Filter Profiles
→ Mapping Profiles
→ WooCommerce Synchronization
Staging tables act as the system of record, while WooCommerce products are projections generated from the assembled dataset.
Import execution is handled through a single entry point:
→ request received
→ endpoint-based routing decision
→ appropriate execution path invoked
→ data inserted into staging tables
Inventory endpoints are handled through a specialized execution path due to nested payload structures and warehouse-level relationships. Processing is aligned with vendor payload formats to ensure accurate ingestion.
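The single-entry-point routing described above can be sketched roughly as follows. This is an illustration in Python rather than the plugin's own code; `ImportRequest`, `import_flat`, `import_inventory`, and the `SPECIALIZED` table are hypothetical names, and only the vendor_key + endpoint_key pair and the special handling of inventory come from this document.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class ImportRequest:
    vendor_key: str
    endpoint_key: str


def import_flat(req: ImportRequest) -> str:
    # Generic path: flat JSON rows go into a single staging table.
    return f"flat:{req.vendor_key}/{req.endpoint_key}"


def import_inventory(req: ImportRequest) -> str:
    # Specialized path: nested payloads with warehouse-level children.
    return f"nested:{req.vendor_key}/{req.endpoint_key}"


# Hypothetical routing table, keyed by endpoint identity only.
SPECIALIZED: Dict[str, Callable[[ImportRequest], str]] = {
    "inventory": import_inventory,
}


def run_import(req: ImportRequest) -> str:
    """Single entry point: choose the execution path from the endpoint key."""
    handler = SPECIALIZED.get(req.endpoint_key, import_flat)
    return handler(req)
```

Because the decision depends only on the endpoint key, the same request always takes the same path, which is one way to read "deterministic" here.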
All interactions execute within the primary WordPress admin context.
The catalog is a downstream derived dataset and is not used for field discovery, filtering, or mapping.
Data Model
Raw Staging Layer
- styles (includes brandImage and styleImage per style)
- inventory
- inventory_warehouses
- categories
- specs
- brands
Each table reflects the vendor API payload directly, with no transformation beyond the normalization required for storage.
Inventory ingestion supports parent-child relationships:
- parent: inventory
- child: inventory_warehouses
Warehouse-level data is extracted from nested payload structures and stored independently.
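A minimal sketch of the parent-child split, assuming each inventory item carries a nested `warehouses` array and a `sku` key. Those field names, and `warehouseAbbr` and `qty` in the usage example, are assumptions for illustration and may not match the vendor payload.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def split_inventory(payload: List[Row]) -> Tuple[List[Row], List[Row]]:
    """Split nested inventory items into parent rows and warehouse child rows."""
    parents: List[Row] = []
    children: List[Row] = []
    for item in payload:
        # Parent row: everything except the nested warehouse list.
        parents.append({k: v for k, v in item.items() if k != "warehouses"})
        # Child rows: one per warehouse, linked back to the parent by SKU.
        for warehouse in item.get("warehouses", []):
            children.append({"sku": item["sku"], **warehouse})
    return parents, children
```

For example, an item with two warehouse entries yields one row for inventory and two rows for inventory_warehouses, each carrying the parent SKU.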
Catalog Layer
- one row per SKU
- joined across endpoint tables
- contains only fields required for filtering, mapping, and synchronization
- includes color_image_url (per-SKU color photo) and brand_image (brand logo URL) columns
The catalog is a derived working dataset, not a raw data store. It sits downstream of filtering and mapping and is never used for field discovery.
Data is inserted using column intersection, so only columns present on both sides are written and inserts stay aligned with the staging schema.
The raw JSON payload may be stored per row for traceability and debugging.
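One way to picture column intersection, sketched with Python's built-in sqlite3 module rather than the WordPress database layer. The helper names and table layout are hypothetical. The table name is interpolated into SQL, so this sketch assumes it comes from trusted configuration and never from user input.

```python
import sqlite3
from typing import Any, Dict, List


def table_columns(conn: sqlite3.Connection, table: str) -> List[str]:
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


def insert_intersecting(conn: sqlite3.Connection, table: str, row: Dict[str, Any]) -> int:
    """Insert only the keys of `row` that exist as columns in `table`."""
    cols = [c for c in table_columns(conn, table) if c in row]
    if not cols:
        return 0  # nothing in common, so skip the insert
    placeholders = ", ".join("?" for _ in cols)
    sql = f"INSERT INTO {table} ({', '.join(cols)}) VALUES ({placeholders})"
    conn.execute(sql, [row[c] for c in cols])
    return len(cols)
```

Keys the target table does not know about are dropped silently, so a new vendor field does not break the insert.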
Product Structure
Variable products are grouped by styleID. Each unique styleID maps to exactly one WooCommerce variable product regardless of how many colors or sizes are available.
Color and size are both variation attributes. Each individual SKU (a specific color + size combination) maps to one WooCommerce product variation.
Variation images are sourced from the per-SKU color image field (colorFrontImage) so each color selection shows the correct product photo.
Brand logos are sourced from the per-style brand image field (brandImage) and attached to WooCommerce product_brand taxonomy terms so they appear on brand archive pages and in brand filters.
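The grouping of SKUs into variable products can be sketched as below. `styleID` and `colorFrontImage` come from this document. `colorName`, `sizeName`, and the output shape are assumptions made for illustration.

```python
from typing import Any, Dict, Iterable


def group_by_style(rows: Iterable[Dict[str, Any]]) -> Dict[Any, Dict[str, Any]]:
    """Build one product per styleID, with one variation per SKU."""
    products: Dict[Any, Dict[str, Any]] = {}
    for row in rows:
        product = products.setdefault(
            row["styleID"], {"colors": [], "sizes": [], "variations": []}
        )
        # Color and size become variation attributes on the parent product.
        if row["colorName"] not in product["colors"]:
            product["colors"].append(row["colorName"])
        if row["sizeName"] not in product["sizes"]:
            product["sizes"].append(row["sizeName"])
        # Each SKU (color + size) becomes exactly one variation.
        product["variations"].append({
            "sku": row["sku"],
            "color": row["colorName"],
            "size": row["sizeName"],
            "image": row.get("colorFrontImage"),
        })
    return products
```

Three SKUs of the same style (for example Red/S, Red/M, Blue/S) would produce a single product with two colors, two sizes, and three variations.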
Manual Edit Protection
Sync operations never overwrite manually assigned WooCommerce categories. On existing products, categories are only set when the product currently has none assigned. New products always receive categories from the feed.
Brand taxonomy terms are always updated from the feed since they are not expected to be manually edited.
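The category rule above reduces to a small decision, sketched here. The function name and arguments are hypothetical, and the real check runs against WooCommerce product terms.

```python
from typing import List


def resolve_categories(existing: List[str], feed: List[str], is_new: bool) -> List[str]:
    """Return the categories a product should have after a sync."""
    if is_new or not existing:
        # New products, and existing products without categories, take the feed's.
        return list(feed)
    # Anything already assigned is treated as a manual edit and kept as-is.
    return list(existing)
```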
Endpoint Configuration
Endpoint behavior is database-driven and supports extended metadata such as:
- file type
- import strategy
- parser configuration
- child table relationships
These attributes enable deterministic routing and parsing without vendor-specific execution logic in the import layer.
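As a rough picture of how such metadata could drive an import without vendor-specific branches, consider the sketch below. The `EndpointConfig` shape, the `ENDPOINTS` registry, and all values in it are hypothetical. In the plugin these records live in the database.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class EndpointConfig:
    file_type: str
    import_strategy: str
    parser_config: Dict[str, str] = field(default_factory=dict)
    child_tables: Tuple[str, ...] = ()


# Stand-in for database rows, keyed by (vendor_key, endpoint_key).
ENDPOINTS: Dict[Tuple[str, str], EndpointConfig] = {
    ("ss_activewear", "styles"): EndpointConfig("json", "flat"),
    ("ss_activewear", "inventory"): EndpointConfig(
        "json", "nested", {"child_key": "warehouses"}, ("inventory_warehouses",)
    ),
}


def staging_targets(vendor_key: str, endpoint_key: str) -> List[str]:
    """List the staging tables an import of this endpoint would write to."""
    config = ENDPOINTS[(vendor_key, endpoint_key)]  # KeyError if not configured
    return [endpoint_key, *config.child_tables]
```

Adding a vendor or endpoint then means adding a configuration record, not a new code path.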
Import Console
The admin interface provides a unified ETL console with the following steps:
- Vendors
- Download
- Import
- Filter
- Map
- Update
- Tools
Each step is isolated and server-rendered. Import and preview behave as follows:
- Import execution is endpoint-specific and deterministic; each endpoint is processed independently
- Routing is based solely on endpoint identity
- All imports resolve files using vendor_key + endpoint_key
- Inventory endpoints invoke a specialized execution path that supports nested data structures
- Preview is performed inline within existing steps
Filtering System
Filter profiles define which products should be included in WooCommerce.
Features:
- JSON rule specification
- SQL compiler
- operates on a unified field set derived from staging tables
- field discovery is schema-driven using staging tables
- filter execution resolves fields across staging tables via structured joins
- preview with row counts and sample data
- vendor-scoped profiles
- single active profile per vendor
- per-rule AND/OR logic — each rule chooses how it joins to the rule above it, rather than one global connective for the whole profile
- nested groups — rules can be collected into groups with their own logic, enabling compound conditions such as
BrandA OR BrandB OR (BrandC AND title contains "scrub")
- groups may be nested one level deep, which is sufficient for catalog-scoped filtering without becoming unreadable
- fully backward compatible — saved profiles created before per-rule logic and grouping continue to load, render, and compile unchanged
- field availability reflects combined staging schemas (styles, inventory, warehouses, specs, brands, categories)
- filter field selection is derived from the assembled dataset field set across staging tables
- field discovery is anchored to the styles dataset
- fields from downstream staging tables are included only when not already present in styles
- duplicate field names across tables are resolved using styles precedence
- the effective field set is a deduplicated subset of the combined staging schemas (~37 fields for S&S baseline)
- filter field discovery is the canonical field source for the system
- field resolution is deterministic and shared across filter, mapping, and validation layers
- field names must match the canonical staging field keys exactly (no aliasing during runtime resolution)
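The styles-anchored field discovery described in the list above might look like the following sketch. The schema dictionary is a stand-in for reading staging table columns. The tie-break among non-styles tables (first table encountered wins) is an assumption, since only styles precedence is documented.

```python
from typing import Dict, List


def discover_fields(schemas: Dict[str, List[str]], anchor: str = "styles") -> Dict[str, str]:
    """Map each canonical field name to the staging table that supplies it."""
    fields: Dict[str, str] = {}
    # The anchor table is read first, so its fields always take precedence.
    for column in schemas[anchor]:
        fields[column] = anchor
    # Other tables contribute only the fields the anchor does not already have.
    for table, columns in schemas.items():
        if table == anchor:
            continue
        for column in columns:
            fields.setdefault(column, table)
    return fields
```

The keys of the returned mapping form the deduplicated field set that filtering, mapping, and validation would share.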
Filtering is configuration, not code.
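To make the rule-to-SQL idea concrete, here is a minimal compiler sketch. The JSON shape (`field`, `op`, `value`, `logic`, `group`) and the operator names are assumptions, not the plugin's stored format. Values are emitted as bound parameters, and field names are checked against an allowed set before being placed in the SQL.

```python
from typing import Any, Dict, List, Set, Tuple

OPS = {"eq": "{f} = ?", "neq": "{f} <> ?", "contains": "{f} LIKE ?"}


def compile_rules(rules: List[Dict[str, Any]], allowed: Set[str],
                  depth: int = 0) -> Tuple[str, List[Any]]:
    """Compile a rule list into a WHERE fragment plus bound parameters."""
    if not rules:
        raise ValueError("empty rule list")
    parts: List[str] = []
    params: List[Any] = []
    for index, rule in enumerate(rules):
        if "group" in rule:
            if depth >= 1:
                raise ValueError("groups may be nested only one level deep")
            inner_sql, inner_params = compile_rules(rule["group"], allowed, depth + 1)
            sql, values = f"({inner_sql})", inner_params
        else:
            if rule["field"] not in allowed:
                raise ValueError(f"unknown field: {rule['field']}")
            sql = OPS[rule["op"]].format(f=rule["field"])
            value = rule["value"]
            values = [f"%{value}%"] if rule["op"] == "contains" else [value]
        if index > 0:
            # Each rule states how it joins to the rule above it.
            logic = str(rule.get("logic", "AND")).upper()
            if logic not in ("AND", "OR"):
                raise ValueError(f"bad logic: {logic}")
            parts.append(logic)
        parts.append(sql)
        params.extend(values)
    return " ".join(parts), params
```

Compiling the example profile (BrandA, OR BrandB, OR a group of BrandC AND title contains "scrub") gives `brandName = ? OR brandName = ? OR (brandName = ? AND title LIKE ?)` with four parameters. This sketch leaves ungrouped mixes of AND and OR to SQL's normal precedence. How the plugin handles that case is not stated here.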
Mapping System
Mapping profiles translate assembled dataset fields into WooCommerce product fields.
Required fields:
Optional fields include:
- Description
- Short Description
- Stock Quantity
- Featured Image
- Sale Price
- Weight
Mapping operates on the same unified field set used by the filtering system, derived from staging tables with styles as the anchor.
Mapping field selection uses the same canonical field set as filtering with no independent field discovery.
Mapping validation and example resolution use identical field keys from the filtering system.
No catalog-based field sourcing is used during mapping.
Mappings are stored per vendor and applied during synchronization.
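Applying a mapping profile to one assembled row can be sketched as below. The target keys and source field names in the usage note are illustrative assumptions. The strict check reflects the rule that field names must match the canonical keys exactly.

```python
from typing import Any, Dict, Set


def apply_mapping(row: Dict[str, Any], mapping: Dict[str, str],
                  canonical_fields: Set[str]) -> Dict[str, Any]:
    """Translate one assembled row into product fields using a mapping profile."""
    unknown = sorted(src for src in mapping.values() if src not in canonical_fields)
    if unknown:
        # No aliasing: a source key that is not canonical is a profile error.
        raise KeyError(f"mapping uses non-canonical fields: {unknown}")
    return {target: row.get(source) for target, source in mapping.items()}
```

With a profile such as `{"description": "description", "stock_quantity": "qty"}`, each target field receives the value of its source field, or `None` when the row has no value for it.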
Image Management
The image manager supports three image types:
Product featured image
Sourced from the style-level image field (styleImage). Attached to the parent variable product. Skipped when the source URL is unchanged from the previous sync run.
Variation color image
Sourced from the SKU-level color image field (colorFrontImage). Attached to each product variation so the correct photo appears when a color is selected. Deduplicated within each style group — the same color image URL is sideloaded only once per sync run regardless of how many size variants share it.
Brand logo
Sourced from the style-level brand image field (brandImage). Attached to the WooCommerce product_brand taxonomy term via thumbnail_id. Deduplicated globally — each brand logo is downloaded once per sync run regardless of how many styles belong to that brand.
All image types:
- check the WordPress media library before downloading to reuse existing attachments
- track source URL per product/variation/term to skip unchanged images on subsequent syncs
- handle sideload errors gracefully without interrupting product saves
- tag attachments with _catalogsync_image_source for cleanup tooling
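The shared behaviour in the list above (reuse, skip unchanged, tolerate failures) can be sketched as one small class. Everything here is a stand-in: `download` replaces the real sideload call, `library` replaces the media library lookup, and `warnings` replaces the plugin's logging.

```python
from typing import Callable, Dict, List, Optional


class ImageSync:
    def __init__(self, download: Callable[[str], int]) -> None:
        self.download = download
        self.library: Dict[str, int] = {}      # source URL -> attachment id
        self.last_source: Dict[str, str] = {}  # owner key -> last synced URL
        self.warnings: List[str] = []

    def attach(self, owner: str, url: str) -> Optional[int]:
        """Return the attachment id to assign, or None when nothing should change."""
        if self.last_source.get(owner) == url:
            return None  # source URL unchanged since the previous sync
        attachment = self.library.get(url)
        if attachment is None:
            try:
                attachment = self.download(url)
            except Exception as exc:
                # A failed image never interrupts the product save.
                self.warnings.append(f"{owner}: {exc}")
                return None
            self.library[url] = attachment
        self.last_source[owner] = url
        return attachment
```

Two size variations that share a color image URL trigger a single download; the second call reuses the stored attachment id.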
Performance
Version 2.0 introduced significant performance improvements:
- avoids unnecessary large-file handling in the product synchronization path
- supports large vendor catalogs
- reduces processing overhead through endpoint-based ingestion
- normalizes inventory at the warehouse level
- reduces WooCommerce write operations
- improves handling of large inventory datasets
- improves handling of nested vendor payload structures
- ensures stable execution across endpoint types
- avoids unnecessary runtime schema validation
Version 2.1 adds further synchronization-path improvements:
- defers WooCommerce variable-product sync to the end of each batch instead of firing per variation group, eliminating a severe per-row regression on large brands
- memoizes attribute-taxonomy existence checks within a batch
- surfaces per-batch progress with rows-per-second telemetry in the update UI
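The first two 2.1 changes follow a common pattern: cache repeated lookups within a batch and run the expensive sync once per product at the end. A minimal sketch, with `taxonomy_exists` and `sync_product` as hypothetical callbacks standing in for the WooCommerce calls:

```python
from typing import Any, Callable, Dict, Iterable, List


def process_batch(rows: Iterable[Dict[str, Any]],
                  taxonomy_exists: Callable[[str], bool],
                  sync_product: Callable[[Any], None]) -> int:
    """Process one batch, deferring product sync until all rows are handled."""
    known: Dict[str, bool] = {}  # memoized taxonomy checks for this batch
    pending: List[Any] = []      # style IDs awaiting sync, in first-seen order
    seen = set()
    for row in rows:
        for taxonomy in row["attributes"]:
            if taxonomy not in known:
                known[taxonomy] = taxonomy_exists(taxonomy)
        if row["styleID"] not in seen:
            seen.add(row["styleID"])
            pending.append(row["styleID"])
    # Deferred step: one sync per product, not one per variation row.
    for style_id in pending:
        sync_product(style_id)
    return len(pending)
```

With this shape, a style that has hundreds of variations costs one sync call per batch, and each attribute taxonomy is checked at most once.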
Version 2.1.1 improves the image pipeline:
- variation and brand image caches are scoped correctly (variation cache per style group, brand cache static across all groups in the request), preventing redundant downloads on large catalogs
- brand and variation image failures are caught independently and logged as warnings without interrupting the product save path
Frontend Execution Constraints
- No iframe-based UI rendering
- No reliance on script enqueueing inside AJAX-rendered views
- All JavaScript executes within the primary admin DOM
- Shared core script provides global utilities
Security
All asynchronous operations use:
- Nonce action: catalogsync_sync_nonce
- Payload key: security
- Required capability: manage_options
No public endpoints are exposed.
License
This plugin is licensed under the GPLv2 or later.
Credits
Developed by Informedio.