Product Data for Paper & Forms: Formats and Standards

Paper is a spec product: format, grammage and whiteness decide the sale. Here's how those attributes get structured, where BMEcat helps, and where AI takes over.

Jakob Feinböck, Productbay | July 4, 2026 | 7 min read
Key takeaways
  • Paper and forms are attribute products: the buying decision runs on DIN format, grammage (g/m²), whiteness and ply count — not on brand or copy.
  • Suppliers deliver these specs inconsistently — free text, buried in titles, mixed units — so filters break and an unfilterable range barely sells online.
  • BMEcat covers the well-organized part of the range, but not every supplier ships it, and coverage of the attribute fields varies.
  • Productbay normalizes format and grammage into clean, filterable attributes and uses AI enrichment where the standard and the supplier feed stop.

A ream of copy paper looks like the simplest product in the catalog. It isn't — not as data. The customer who buys it never reads a description; they filter for DIN A4, 80 g/m², high whiteness, FSC-certified, and buy whatever matches. The entire sale hinges on a few technical attributes being present, correct and comparable. And that is exactly where paper ranges fall apart online.

Product data for paper and forms is attribute data first: DIN format, grammage in g/m² and whiteness carry the buying decision, not brand or marketing copy. This is a sub-topic of office supplies more broadly, but paper deserves its own look because it is the most attribute-driven, most filter-dependent part of the assortment.

Which attributes and standards define a paper product?

Almost the entire value of a paper record sits in a compact set of standardized attributes. Get these clean and the product is findable and filterable; get them wrong and it's invisible:

  • Format (DIN 476 / ISO 216): A4, A3, A5 for sheets; DL and C-series for envelopes. A fixed, closed list, which makes it the ideal filter facet if it's populated consistently.
  • Grammage (g/m²): the single most important spec — 80 g/m² for standard copy paper, 90–120 for premium, up to 300 g/m² for card and cover stock. Must be a number with a fixed unit, not free text.
  • Whiteness / CIE: a numeric whiteness value (e.g. CIE 161) that separates budget from premium paper.
  • Sheets / plies: 500 per ream, or the ply count on multi-part forms.
  • Certification: FSC, PEFC, Blue Angel — increasingly a hard filter in tenders and B2B procurement.
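As a minimal sketch, the attribute set above could be modeled as one typed record per product. The class name `PaperRecord`, its field names and the `KNOWN_FORMATS` set are illustrative assumptions, not a Productbay or BMEcat schema.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical closed list of format values used as a filter facet.
KNOWN_FORMATS = {"A3", "A4", "A5", "DL", "C4", "C5", "C6"}


@dataclass
class PaperRecord:
    sku: str
    paper_format: Optional[str] = None        # e.g. "A4"
    grammage_gsm: Optional[int] = None        # always g/m², stored as a number
    cie_whiteness: Optional[int] = None       # e.g. 161
    sheets_per_ream: Optional[int] = None     # e.g. 500
    certifications: frozenset = frozenset()   # e.g. frozenset({"FSC"})

    def missing_facets(self) -> list:
        """Return the filter-relevant attributes that are empty or invalid."""
        problems = []
        if self.paper_format not in KNOWN_FORMATS:
            problems.append("paper_format")
        if self.grammage_gsm is None or self.grammage_gsm <= 0:
            problems.append("grammage_gsm")
        if self.cie_whiteness is None:
            problems.append("cie_whiteness")
        return problems
```

A record with format and grammage but no whiteness would report `["cie_whiteness"]`, which is the kind of gap that quietly drops a product out of a filtered result list.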

The trouble is never the standard — DIN and g/m² are unambiguous. The trouble is that suppliers deliver the values inconsistently: one writes "80g", another "80 g/m²", a third puts the grammage in the title and leaves the attribute field empty. Multiply that across dozens of suppliers and your filter facets fill with noise.
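To make the notation problem concrete, here is a rough sketch of how the different spellings could be collapsed into one number. The function name `parse_grammage`, the accepted spellings and the 30 to 400 g/m² plausibility window are assumptions for illustration; a production parser would need more notations and a review step.

```python
import re
from typing import Optional

# Accepts "80g", "80 g", "80 g/m²", "80 g/m2", "80 g/qm" and "80gsm" (case-insensitive).
# The lookbehind keeps "1000g" from matching as "000"; the lookahead rejects "80 grams".
_GRAMMAGE_RE = re.compile(
    r"(?<![\d.,])(\d{2,3})\s*(?:g\s*/\s*(?:m²|m2|qm)|gsm|gr?)(?![a-z0-9/])",
    re.IGNORECASE,
)


def parse_grammage(text: str) -> Optional[int]:
    """Return the first plausible grammage in g/m² found in text, else None."""
    for match in _GRAMMAGE_RE.finditer(text):
        value = int(match.group(1))
        # Plausibility window is an assumption: light paper up to heavy card stock.
        if 30 <= value <= 400:
            return value
    return None
```

The same function can run over the attribute field first and the product title second, so a grammage buried in the title still ends up in the typed attribute.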

Does BMEcat structure paper data — and where does it stop?

Office and paper supply has a genuine exchange standard: BMEcat, the B2B catalog format widely used in this sector. Where a supplier ships a clean BMEcat file with proper feature groups, format and grammage arrive as structured, typed attributes — which is a real head start. But it's worth being honest about the coverage:

Data layer | What BMEcat / feeds deliver | Where it stops
Format & grammage | Structured attributes when the supplier fills the feature group | Free-text or title-buried values in weaker feeds
Supplier coverage | Established suppliers ship valid BMEcat | Many still send Excel / PDF price lists
Whiteness / certification | Sometimes present | Frequently blank or non-standardized
Sales content | Not the job of an exchange format | Descriptions, SEO text, benefit copy absent
Form-specific attributes | Basic classification | Ply count, perforation, printer compatibility thin

So BMEcat solves the well-organized part of the range — the established suppliers who fill their feature groups properly. What it doesn't cover is the supplier who still sends a PDF, the half-populated attribute fields, the whiteness left blank, and every bit of sales content. For how classification standards fit together more broadly, see GDSN, ETIM and eCl@ss explained.
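A small sketch of what checking that coverage might look like. The XML below is a simplified, hand-written fragment in the style of BMEcat 1.2 feature elements (`FEATURE`, `FNAME`, `FVALUE`, `FUNIT`), not a complete or valid catalog. The feature names and the `REQUIRED` tuple are assumptions, and real files differ by version, namespace and classification system.

```python
import xml.etree.ElementTree as ET

# Simplified illustrative fragment; the article and its values are made up.
SAMPLE = """
<ARTICLE>
  <SUPPLIER_AID>CP-80-A4</SUPPLIER_AID>
  <ARTICLE_FEATURES>
    <FEATURE><FNAME>Format</FNAME><FVALUE>A4</FVALUE></FEATURE>
    <FEATURE><FNAME>Grammage</FNAME><FVALUE>80</FVALUE><FUNIT>g/m²</FUNIT></FEATURE>
    <FEATURE><FNAME>Whiteness</FNAME><FVALUE></FVALUE></FEATURE>
  </ARTICLE_FEATURES>
</ARTICLE>
"""

REQUIRED = ("Format", "Grammage", "Whiteness")  # hypothetical feature names


def read_features(article_xml: str) -> dict:
    """Map FNAME -> (FVALUE, FUNIT) for one article; blank values become None."""
    article = ET.fromstring(article_xml.strip())
    features = {}
    for feature in article.iter("FEATURE"):
        name = (feature.findtext("FNAME") or "").strip()
        value = (feature.findtext("FVALUE") or "").strip() or None
        unit = (feature.findtext("FUNIT") or "").strip() or None
        if name:
            features[name] = (value, unit)
    return features


def coverage_gaps(features: dict) -> list:
    """List required features that are absent or blank."""
    return [name for name in REQUIRED if features.get(name, (None, None))[0] is None]
```

For the sample above, `coverage_gaps(read_features(SAMPLE))` reports the blank whiteness field, which is the "sometimes present, frequently blank" case from the table.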

How does Productbay structure and filter paper data?

The job is the same three steps every multi-supplier retailer runs — and for paper the enrichment step is unusually high-leverage because so much value sits in a few normalizable attributes. That's exactly what Productbay is built for:

  • Consolidate: import every source once — BMEcat, supplier CSV, Excel, feed URL, FTP, API — and match by SKU or EAN/GTIN so existing products update and new ones are created.
  • Enrich & normalize: AI reads grammage, format and whiteness out of titles, datasheets and PDF specs, maps them to single typed attributes with fixed units, standardizes certifications, writes descriptions and translates via DeepL — always with a review queue before anything publishes. "80g" and "80 g/m²" collapse into one filterable value.
  • Publish: two-way sync to Shopify and Shopware, ERP connections (Xentral, weclapp), and feed exports for Amazon, OTTO and Kaufland — each with per-channel transformations and clean facets for format and grammage.
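The consolidate step can be pictured as a plain upsert. The sketch below is not Productbay's implementation: the function names, the dictionary keys, the GTIN-before-SKU order and the rule that empty values never overwrite existing ones are all assumptions chosen for illustration.

```python
from typing import Optional


def find_match(catalog: list, incoming: dict) -> Optional[dict]:
    """Match on GTIN first, then fall back to SKU (the order is an assumption)."""
    for key in ("gtin", "sku"):
        wanted = incoming.get(key)
        if not wanted:
            continue
        for product in catalog:
            if product.get(key) == wanted:
                return product
    return None


def upsert(catalog: list, incoming: dict) -> str:
    """Update the matching product in place, or append a new one."""
    existing = find_match(catalog, incoming)
    if existing is None:
        catalog.append(dict(incoming))
        return "created"
    # Only non-empty values overwrite, so a sparse feed cannot blank out good data.
    existing.update({k: v for k, v in incoming.items() if v not in (None, "")})
    return "updated"


# Usage with made-up identifiers:
catalog = [{"sku": "CP-80-A4", "gtin": "4000000000013", "grammage_gsm": 80}]
first = upsert(catalog, {"gtin": "4000000000013", "cie_whiteness": 161, "grammage_gsm": None})
second = upsert(catalog, {"sku": "CP-90-A4", "grammage_gsm": 90})
```

Here the first call enriches the existing record with a whiteness value without losing its grammage, and the second creates a new product because neither identifier matches.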

Productbay starts where BMEcat and the supplier feed stop: the messy suppliers, the empty attribute fields, the form-specific specs and the sales content no standard carries. For the broader picture, see product data in office supplies. Productbay is built for specialist retailers running multi-supplier, multi-channel catalogs — from mid-sized shops to large chains. To dig into the normalization mechanics, read how we enrich and normalize data from multiple suppliers.

Let's look at your product data process

Grammage in three different notations, formats hidden in titles, forms with ply counts — paper is all attributes. See how Productbay normalizes them into clean filters in a 30-minute walkthrough.
