Tag: LLMs

  • Normalizing Size and Color in Fashion Using AI to Power Competitive Price Intelligence


    Fashion is as dynamic a market as any—and more competitive than most others. Consumer trends and customer needs are always evolving, making it challenging for fashion and apparel brands to keep up.

    Despite the inherent difficulties fashion and apparel sellers face, this industry is one of the largest grossing markets in the world, estimated at $1.79 trillion in 2024. Global revenue for apparel is expected to grow at an annual rate of about 3.3% over the next four years. That means companies in this space stand to make significant revenue if they can competitively price their products, keep up with the competition, and win customer loyalty with consistent product availability.

    There are three main categories in fashion and apparel. These include:

    • Apparel and clothing (e.g., shirts, pants, dresses, and other apparel)
    • Footwear (e.g., sneakers, sandals, heels, and other products)
    • Accessories (e.g., bags, belts, watches, and so on)

    If you look at all of these product types across all sorts of retailers, there is a massive amount of overlapping data based on product attributes like style and size that are difficult to normalize.

    Fashion Attributes

    Style, color, and size are the main attribute categories in fashion and apparel. Style attributes include things like design, look, and overall aesthetics of the product. They’re very dependent on the actual product category of fashion as well. A shirt might have a slim fit attribute associated with it, whereas a belt might have a length. All these different attributes are usually labeled within a product listing and affect the consumer’s decision-making process:

    • Color (red, blue, sea green, etc.)
    • Pattern (solid, striped, checked, floral, etc.)
    • Material (cotton, polyester, leather, denim, silk, etc.)
    • Fit (regular, slim, relaxed, oversized, tailored, etc.)
    • Type (casual, formal, sporty, vintage, streetwear)

    Color Complexity in Fashion

    Color is perhaps the most visually distinctive attribute in fashion, yet it presents unique challenges for retailers. This is because color naming can vary across retailers and marketplaces. There are several major differences in color convention:

    • A single color can be labeled differently across brands (e.g., “navy,” “midnight blue,” “deep blue”)
    • Seasonal color names (e.g., “summer sage” vs. “forest green”)
    • Marketing-driven names (e.g., “sunset coral” vs. “pale orange”)
    Differences in color naming - challenges faced by fashion retail intelligence systems
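
    To make the naming problem concrete, here is a minimal sketch of a lookup that maps retailer-specific color names to a base color family. The table, the family assignments, and the function name are illustrative assumptions built from the examples above. They are not DataWeave's method, and a hand-maintained table like this is exactly what becomes hard to keep up at scale.

```python
# Hypothetical mapping from retailer color names to a base color family.
# The entries are illustrative only and come from the examples above.
COLOR_FAMILIES = {
    "navy": "blue",
    "midnight blue": "blue",
    "deep blue": "blue",
    "summer sage": "green",
    "forest green": "green",
    "sunset coral": "orange",
    "pale orange": "orange",
}


def normalize_color(raw_name: str) -> str:
    """Return the base color family for a raw label, or 'unknown'."""
    # Lowercase, trim, and collapse repeated whitespace before the lookup.
    key = " ".join(raw_name.lower().split())
    return COLOR_FAMILIES.get(key, "unknown")


print(normalize_color("  Midnight  Blue "))  # a known alias
print(normalize_color("taupe"))              # not in the table
```

    Any name missing from the table falls through to "unknown", which is the gap that grows as retailers invent new seasonal and marketing names.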

    Size: The Other Critical Dimension

    Size in fashion refers to the dimensions or measurements that determine how fashion products fit. Depending on whether the product is a clothing item, shoes, or a hat, there will be different sizing options. Types of sizes include:

    • Standard sizes (XS, S, M, L, XL, XXL)
    • Custom sizes (based on brand, retailer, country, etc.)

    A single type of product may have different sizing labels. For instance, one pants listing may use traditional S, M, L, XL sizing, while another pants listing may use 24, 25, or 26 to refer to the waist measurement.

    Size Variations - challenges faced by fashion retail intelligence systems

    Size is a dynamic attribute that changes based on current trends. For example, there has recently been a significant shift towards inclusive sizing. Size inclusivity refers to the practice of selling apparel in a wide range of sizes to accommodate people of all body types. Consumers are more aware of this trend and are demanding a broader range of sizing offerings from the brands they shop from.

    In the US market, in particular, some 67% of American women wear a size 14 or above and may be interested in purchasing plus-size clothing. There is a growing demand in the plus-size market for more options and a wider selection. Many brands are considering expanding their sizes to accommodate more shoppers and tap into this growing revenue channel.

    Pricing Based on Size and Color

    Many fashion products are priced differently based on size and color. Let’s take a look at an example of what this can look like.

    Different colors may retail at different price points.

    A popular beauty brand (see image) is known for its viral lip tint. While most of the color variants are priced at $9.90 on Amazon, one colorway, a less pigmented shade, is priced at $9.57. This price differential is driven by both material costs and market demand.

    Different colorways (the range of color combinations in which a style or design is available) of the same product often command different prices as well. This is based on:

    • Dye costs (some colors require more expensive processes)
    • Seasonal demand (traditional colors vs. trend colors)
    • Exclusivity (limited edition colors)

    An example of price variations by size is a women’s shirt that is being sold on Amazon as shown below. For this product, there are no style attributes to choose from. The only parameter the shopper has to select is the size they’d like to purchase. They can choose from S to XL. At the top, we can see that the product in size S is ₹389. Below, the size XL version of this same shirt is ₹399. This price increase is correlated with the change in size.

    Different sizes may retail at different price points.

    So why are these same products priced differently? An analysis of One Six, a plus-size clothing brand, identified several reasons why plus-size clothing costs more:

    • Extra material is needed, hence an increase in production costs
    • Extra stitching costs, hence an increase in production costs
    • Production of plus-size clothing often means acquiring specialized machinery
    • Smaller production runs for plus-size clothing mean these lines often don’t benefit from economies of scale

    Some sizes are sold more than others, meaning that in-demand sizes for certain apparel can affect pricing as well. Brands want to be able to charge as much as possible for their listing without risking losing a sale to a competitor.

    The Competitive Pricing Challenge: Normalizing Product Attributes Across Competitors in Apparel and Fashion

    There are hundreds of possible attribute permutations for every single apparel product. Some retailers may only sell core sizes and basic colors; some may sell a mix of sizes for multiple style types. Most retailers also sell multiple color variants for all styles they have on catalog. Other retailers may only sell a single, in-demand size of the product. Also, when other retailers are selling the product, it’s unlikely that their naming conventions, color options, style options, and sizing match yours one-for-one.

    In one analysis, a single retailer had 800+ unique values for heel sizes and 1000+ unique values for shirts and tops! If you’re looking to compare prices, setting up and managing lookup tables to reconcile discrepancies (one retailer using European sizes and another using US sizes, for example) is simply too onerous. Colors only add to the complexity, as similar colors may go by different names in different regions and locations as well!

    Even if you managed to find all the discrepancies between product attributes, you would still need to update them any time a competitor changed a convention.

    Still, monitoring your competitors and strategically pricing your listings is essential to maintain and grow market share. So what do you do? You can’t simply eyeball your competitor’s website to check their pricing and naming conventions. Instead, you need advanced algorithms to scan the entire marketplace, identify individual products being sold, and normalize their data and attributes for analysis.

    Getting Color and Size Level Pricing Intelligence

    With DataWeave, size and color are just two of several dimensions of a product instead of an impossible big data problem for teams. Our product matching engine can easily handle color and sizing complexity via our AI-driven approach combined with human verification.

    This works by using AI built on more than 10 years of product catalog data across thousands of retail websites. It matches common identifiers, like UPC, SKU code, and other attributes, for harmonization before employing large language model (LLM) prompts to normalize color variations and sizing to a single standard.

    The data flow DataWeave uses for product sizing and color normalization

    For example, if a competitor has the smallest size listed as Sm but has your smallest listing identified as S, DataWeave can match those two attributes using AI. Similar classification can be performed on color as well.
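
    As a rough illustration of the "Sm" versus "S" case, the sketch below collapses a few spelling variants onto one standard label. The alias table and the function name are hypothetical, and this rule-based lookup is only a stand-in for the AI-driven matching described here, which the article does not detail.

```python
import re
from typing import Optional

# Hypothetical alias table; real catalogs contain far more variants.
SIZE_ALIASES = {
    "xs": "XS", "x-small": "XS", "extra small": "XS",
    "s": "S", "sm": "S", "small": "S",
    "m": "M", "med": "M", "medium": "M",
    "l": "L", "lg": "L", "large": "L",
    "xl": "XL", "x-large": "XL", "extra large": "XL",
}


def normalize_size(raw_label: str) -> Optional[str]:
    """Map a raw size label to a standard label, or None if unrecognized."""
    # Lowercase, collapse whitespace, and drop a trailing period ("Sm.").
    key = re.sub(r"\s+", " ", raw_label.strip().lower()).rstrip(".")
    return SIZE_ALIASES.get(key)


print(normalize_size("Sm"))     # competitor's label
print(normalize_size("S"))      # your label
print(normalize_size("32"))     # numeric waist size: not covered by this table
```

    Labels that return None, such as numeric waist sizes, are the cases where a static table runs out and a more flexible approach is needed.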

    Complex LLM prompts are pre-established so that this process is fast and efficient, taking minutes rather than weeks of manual effort.

    Harmonizing products along with their color and sizing data across different retailers for further analysis has several benefits. Most importantly, product matching helps teams conduct better competitive analysis, allowing them to stay informed about market trends, competitors’ offerings, and how those competitors are pricing various permutations of the same product. It helps ensure that you’re offering the most competitive assortment of sizing in several colors to win more market share as well. Overall, it’s easier for teams to gain insights and exploit their findings when all the data is clean and available at their fingertips.

    Product Matching Size and Color in Apparel and Fashion

    Color and size are crucial attributes for retailers and brands in the apparel and fashion industry. They add a level of complexity that can’t be overstated. While offering them is a necessity to win consumers (more colors and sizes mean a wider potential reach), the more permutations you add to your listings, the more complicated it becomes to track them against your competition. However, this challenge is worth undertaking as long as you have the right solutions at your disposal.

    With a strategy backed by advanced technology to discover identical and similar products across the competitive landscape and normalize their color and sizing attributes, you can ensure that you are competitively pricing your products and offering the best assortment possible. Employing DataWeave’s AI technology to find competitor listings, match products across variants, and track pricing regularly is the way to go.

    Interested in learning more about DataWeave? Click here to get in touch!

  • DataWeave’s AI Evolution: Delivering Greater Value Faster in the Age of AI and LLMs


    In retail, competition is fierce, and in its ever-evolving landscape, consumer expectations are higher than ever.

    For years, our AI-driven solutions have been the foundation that empowers businesses to sharpen their competitive pricing and optimize digital shelf performance. But in today’s world, evolution is constant—so is innovation. We now find ourselves at the frontier of a new era in AI. With the dawn of Generative AI and the rise of Large Language Models (LLMs), the possibilities for eCommerce companies are expanding at an unprecedented pace.

    These technologies aren’t just a step forward; they’re a leap—propelling our capabilities to new heights. The insights are deeper, the recommendations more precise, and the competitive and market intelligence we provide is sharper than ever. This synergy between our legacy of AI expertise and the advancements of today positions DataWeave to deliver even greater value, thus helping businesses thrive in a fast-paced, data-driven world.

    This article marks the beginning of a series where we will take you through these transformative AI capabilities, each designed to give retailers and brands a competitive edge.

    In this first piece, we’ll offer a snapshot of how DataWeave aggregates and analyzes billions of publicly available data points to help businesses stay agile, informed, and ahead of the curve. These capabilities fall into five broad categories:

    • Product Matching
    • Attribute Tagging
    • Content Analysis
    • Promo Banner Analysis
    • Other Specialized Use Cases

    Product Matching

    Dynamic pricing is an indispensable tool for eCommerce stores to remain competitive. A blessing—and a curse—of online shopping is that users can compare prices of similar products in a few clicks, with most shoppers gravitating toward the lowest price. Consequently, retailers can lose sales over minor discrepancies of $1–2 or even less.

    All major eCommerce platforms compare product prices, especially for their top-selling products, across competing players and adjust prices to match or undercut competitors. A typical product undergoes 20.4 price changes annually, or roughly once every 18 days. Amazon takes it to the extreme, changing prices approximately every 10 minutes. This helps Amazon maintain a healthy price perception among its consumers.
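
    The "once every 18 days" figure follows directly from the annual count. A quick check, assuming a 365-day year (the function name is ours, for illustration):

```python
def days_between_price_changes(changes_per_year: float,
                               days_per_year: int = 365) -> float:
    """Average number of days between consecutive price changes."""
    if changes_per_year <= 0:
        raise ValueError("changes_per_year must be positive")
    return days_per_year / changes_per_year


typical_interval = days_between_price_changes(20.4)
print(f"{typical_interval:.1f} days")  # prints "17.9 days"
```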

    However, accurate product matching at scale is a prerequisite for the above, and that poses significant challenges. There is no standardized approach to product cataloging, so even identical products bear different product titles, descriptions, and attributes. Information is often incomplete, noisy, or ambiguous. Image data contains even more variability—the same product can be styled using different backgrounds, lighting, orientations, and quality; images can have multiple overlapping objects of interest or extraneous objects, and at times the images and the text on a single page might belong to completely different products!

    DataWeave leverages advanced technologies, including computer vision, natural language processing (NLP), and deep learning, to achieve highly accurate product matching. Our pricing intelligence solution accurately matches products across hundreds of websites and automatically tracks competitor pricing data.

    Here’s how it works:

    Text Preprocessing

    It identifies relevant text features essential for accurate comparison.

    • Metadata Parsing: Extracts product titles, descriptions, attributes (e.g., color, size), and other structured data elements from Product Description Pages (PDP) that can help in accurately identifying and classifying products.
    • Attribute-Value Normalization: Normalizes attribute names (e.g., RAM vs Memory) and their values (e.g., 16 giga bytes vs 16 gigs vs 16 GB), normalizes brand names (e.g., Benetton vs UCB vs United Colors of Benetton), and maps category hierarchies to a standard taxonomy.
    • Noise Removal: Removes stop words and other elements with no descriptive value; this focuses keyword extraction on meaningful terms that contribute to product identification.
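
    The attribute-value normalization step above can be sketched with small synonym tables and a regular expression. Everything here (table contents, pattern, and function names) is an assumption made for illustration, using only the examples from the list. It is not a description of DataWeave's pipeline.

```python
import re

# Hypothetical synonym tables built from the examples in the list above.
ATTRIBUTE_NAMES = {"ram": "memory", "memory": "memory"}
BRAND_NAMES = {
    "benetton": "United Colors of Benetton",
    "ucb": "United Colors of Benetton",
    "united colors of benetton": "United Colors of Benetton",
}
# Matches "16 GB", "16 gigs", "16 giga bytes", "16gigabytes", and so on.
GIGABYTE_PATTERN = re.compile(r"^(\d+)\s*(gb|gigs?|giga\s?bytes?)$")


def normalize_attribute_name(name: str) -> str:
    """Map an attribute name to its canonical form (unknown names pass through)."""
    key = name.strip().lower()
    return ATTRIBUTE_NAMES.get(key, key)


def normalize_capacity(value: str) -> str:
    """Rewrite gigabyte values as '<n> GB'; leave other values unchanged."""
    match = GIGABYTE_PATTERN.match(value.strip().lower())
    return f"{match.group(1)} GB" if match else value.strip()


def normalize_brand(brand: str) -> str:
    """Map a brand alias to its canonical brand name."""
    return BRAND_NAMES.get(brand.strip().lower(), brand.strip())


for raw in ("16 giga bytes", "16 gigs", "16 GB"):
    print(normalize_capacity(raw))
print(normalize_attribute_name("RAM"), "|", normalize_brand("UCB"))
```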

    Image Preprocessing

    Image processing algorithms use feature extraction to define visual attributes. For example, when comparing images of a red T-shirt, the algorithm might extract features such as “crew neck,” “red,” or “striped.”

    Image Preprocessing using advanced AI and other tech for product matching in retail analytics.

    Image hashing techniques create a unique representation (or “hash”) of an image, allowing for efficient comparison and matching of product images. This process transforms an image into a concise string or sequence of numbers that captures its essential features even if the image has been resized, rotated, or edited.
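
    One of the simplest hashing schemes is the average hash, sketched below on a tiny grayscale grid. This is a generic textbook technique, not necessarily the one DataWeave uses. The sketch assumes the image has already been converted to grayscale and downscaled to a small fixed grid by some image library. This basic variant tolerates brightness shifts and resizing, but it is not robust to rotation.

```python
from typing import List


def average_hash(pixels: List[List[int]]) -> str:
    """Return a bit string with '1' where a pixel is brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if value > mean else "0" for value in flat)


def hamming_distance(hash_a: str, hash_b: str) -> int:
    """Count differing bits; a small distance suggests similar images."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(bit_a != bit_b for bit_a, bit_b in zip(hash_a, hash_b))


# Toy 2x2 "images": the second is a uniformly brighter copy of the first.
original = [[10, 200], [220, 30]]
brighter = [[30, 220], [240, 50]]
different = [[200, 10], [30, 220]]

print(hamming_distance(average_hash(original), average_hash(brighter)))   # 0
print(hamming_distance(average_hash(original), average_hash(different)))  # 4
```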

    Before feature extraction and hashing, images must be preprocessed to prepare them for downstream operations. These steps include object detection to identify objects of interest, background removal, face/skin detection and removal, pose estimation and correction, and so forth.

    Embeddings

    We have built a hybrid, multimodal product-matching engine that uses image features, text features, and domain heuristics. For every product we process, we create and store multiple text and image embeddings in a vector database. These range from basic feature vectors (e.g., tf-idf based, color histograms, shape vectors) to more advanced deep learning-based embeddings (e.g., BERT, CLIP) and the latest LLM-based embeddings.
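
    Whatever model produces them, embeddings are typically compared with a similarity measure such as cosine similarity. The sketch below uses raw term-frequency vectors as a deliberately simple stand-in for the tf-idf, BERT, CLIP, or LLM embeddings mentioned above. The function names and product titles are made up for illustration.

```python
import math
from collections import Counter


def term_vector(text: str) -> Counter:
    """Build a term-frequency vector from whitespace-separated tokens."""
    return Counter(text.lower().split())


def cosine_similarity(vec_a: Counter, vec_b: Counter) -> float:
    """Cosine of the angle between two sparse term vectors (0.0 to 1.0)."""
    dot = sum(vec_a[term] * vec_b[term] for term in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(count * count for count in vec_a.values()))
    norm_b = math.sqrt(sum(count * count for count in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


title_a = term_vector("red cotton crew neck t-shirt")
title_b = term_vector("crew neck t-shirt red cotton")   # same words, reordered
title_c = term_vector("black leather ankle boots")      # unrelated product

print(round(cosine_similarity(title_a, title_b), 3))
print(round(cosine_similarity(title_a, title_c), 3))
```

    Word order does not affect this measure, so the two reordered titles score as near-identical while the unrelated title scores zero.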

    Classification

    Classification algorithms enhance product attribute tagging by designating match types. For example, the product might be identified as an “exact match”, “variant”, “similar”, or “substitute.” The algorithm can also identify identical product combinations or “baskets” of items typically purchased together.
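
    The article does not say how match types are decided, so the following is only a toy illustration of the idea: a rule-based function that turns a similarity score and two attribute checks into one of the labels above. The thresholds, parameter names, and the extra "no match" label are all assumptions. A production system would more likely learn these boundaries from labeled data.

```python
def classify_match(similarity: float,
                   same_brand: bool,
                   same_variant_attributes: bool) -> str:
    """Assign a match type from a similarity score and attribute checks.

    Thresholds are hypothetical and chosen only for illustration.
    """
    if similarity >= 0.95 and same_brand and same_variant_attributes:
        return "exact match"
    if similarity >= 0.95 and same_brand:
        # Same product, but size or color differs.
        return "variant"
    if similarity >= 0.75:
        return "similar"
    if similarity >= 0.50:
        return "substitute"
    return "no match"


print(classify_match(0.97, same_brand=True, same_variant_attributes=True))
print(classify_match(0.97, same_brand=True, same_variant_attributes=False))
print(classify_match(0.80, same_brand=False, same_variant_attributes=False))
```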

    What is the Business Impact of Product Matching?

    • Pricing Intelligence: Businesses can strategically adjust pricing to remain competitive while maintaining profitability. High-accuracy price comparisons help businesses analyze their competitive price position, identify opportunities to improve pricing, and reclaim market share from competitors.
    • Similarity-Based Matching: Products are matched based on a range of similarity features, such as product type, color, price range, specific features, etc., leading to more accurate matches.
    • Counterfeit Detection: Businesses can identify counterfeit or unauthorized versions of branded products by comparing them against authentic product listings. This helps safeguard brand identity and enables brands to take legal action against counterfeiters.

    Attribute Tagging

    Attribute tagging involves assigning standardized tags for product attributes, such as brand, model, size, color, or material. These naming conventions form the basis for accurate product matching. Tagging detailed attributes, such as specifications, features, and dimensions, helps match products that meet similar criteria. For example, tags like “collar” or “pockets” for apparel ensure high-fidelity product matches for hard-to-distinguish items with minor stylistic variations.

    Attributes that are tagged when images are matched for retail ecommerce analytics.

    Including tags for synonyms, variants, and long-tail keywords (e.g., “denim” and “jeans”) improves the matching process by recognizing different terms used for similar products. Metadata tags categorize similar items according to SKU numbers, manufacturer details, and other identifiers.

    Altogether, these capabilities provide high-quality product matches and valuable metadata for retailers to classify their products and compare their product assortment to competitors.

    User-Generated Content (UGC) Analysis

    Customer reviews and ratings are rich sources of information, enabling brands to gauge consumer sentiment and identify shortcomings regarding product quality or service delivery. However, while informative, reviews constitute unstructured “noisy” data that is actionable only if parsed correctly.

    Here’s where DataWeave’s UGC analysis capability steps in.

    • Feature Extractor: Automatically pulls specific product attributes mentioned in the review (e.g., “battery life,” “design” and “comfort”)
    • Feature Opinion Pair: Pairs each product attribute with a corresponding sentiment from the review (e.g., “battery life” is “excellent,” “design” is “modern,” and “comfort” is “poor”)
    • Calculate Sentiment: Calculates an overall sentiment score for each product attribute
    The user generated content analysis framework used by DataWeave to calculate sentiment.

    The final output combines the information extracted from each of these features, which looks something like this:

    • Battery life is excellent
    • Design is modern
    • Not satisfied with the comfort
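
    The three steps can be sketched end to end once feature-opinion pairs have been extracted. The polarity lexicon, its scores, and the function name below are hypothetical. The extraction itself (the hard part) is assumed to have happened already.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical polarity lexicon: opinion word -> score in [-1.0, 1.0].
OPINION_POLARITY = {"excellent": 1.0, "modern": 0.5, "poor": -1.0}


def score_features(pairs: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Average the polarity of every opinion attached to each feature."""
    scores: Dict[str, List[float]] = defaultdict(list)
    for feature, opinion in pairs:
        # Opinion words missing from the lexicon count as neutral (0.0).
        scores[feature].append(OPINION_POLARITY.get(opinion.lower(), 0.0))
    return {feature: sum(values) / len(values)
            for feature, values in scores.items()}


# Feature-opinion pairs as they might come out of two reviews.
pairs = [
    ("battery life", "excellent"),
    ("design", "modern"),
    ("comfort", "poor"),
    ("comfort", "excellent"),
]
print(score_features(pairs))
```

    Averaging per feature means one harsh review on "comfort" is balanced by a positive one, rather than dragging down the score for the whole product.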

    The algorithm also recognizes spammy reviews and distinguishes subjective reviews (i.e., those fueled by emotion) from objective ones.

    DataWeave's image processing tool also analyses promo banners.

    Promo Banner Analysis

    Our image processing tool can interpret promotional banners and extract information regarding product highlights, discounts, and special offers. This provides insights into pricing strategies and promotional tactics used by other online stores.

    For example, if a competitor offers a 20% discount on a popular product, you can match or exceed this discount to attract more customers.

    The banner reader identifies successful promotional trends and patterns from competitors, such as the timing of discounts, frequently promoted product categories or brands, and the duration of sales events. Ecommerce stores can use this information to optimize their promotion strategies, ensuring they launch compelling and timely offers.
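
    Once the text on a banner has been read (for example, by an OCR step that is assumed here and not shown), pulling out percentage discounts can be as simple as a regular expression. The pattern and function name are illustrative assumptions, not DataWeave's implementation.

```python
import re
from typing import List

# Matches phrases such as "20% off", "5 % OFF", or "15% discount".
DISCOUNT_PATTERN = re.compile(r"(\d{1,3})\s?%\s*(?:off|discount)", re.IGNORECASE)


def extract_discounts(banner_text: str) -> List[int]:
    """Return every percentage discount mentioned in the banner text."""
    return [int(value) for value in DISCOUNT_PATTERN.findall(banner_text)]


banner = "Summer Sale: 20% OFF sneakers, extra 5 % off with code SUN"
print(extract_discounts(banner))  # [20, 5]
```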

    Other Specialized Use Cases

    While these generalized AI tools are highly useful in various industries, we’ve created other category- and attribute-specific capabilities for specialty goods (e.g., those requiring certifications or approval by federal agencies) and food items. These use cases help our customers adhere to compliance requirements.

    Certification Mark Detector

    This detector lets retailers match items based on official certification marks. These marks represent compliance with industry standards, safety regulations, and quality benchmarks.

    Example:

    • USDA Organic: Certification for organic food production and handling
    • ISO 9001: Quality Management System Certification

    By detecting these certification marks, the system can accurately match products with their certified counterparts. Knowing which competitor products are certified also helps retailers identify products that may benefit from certification.

    Image analysis based product matching at DataWeave also detects certificate marks.

    Nutrition Fact Table Reader

    Product attributes alone are insufficient for comparing food items. Differences in nutrition content can influence product category (e.g., “health food” versus regular food items), price point, and consumer choice. DataWeave’s nutrition fact table reader scans nutrition information on packaging, capturing details such as calorie count, macronutrient distribution (proteins, fats, carbohydrates), vitamins, and minerals.

    The solution ensures items with similar nutritional profiles are correctly identified and grouped based on specific dietary requirements or preferences. This helps with price comparisons and enables eCommerce stores to maintain a reliable database of product information and build trust among health-conscious consumers.
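
    One simple way to decide whether two items have "similar nutritional profiles" is to compare each shared nutrient within a relative tolerance. The sketch below assumes the nutrition table has already been read into a dictionary. The field names, the 15% tolerance, and the sample values are all made up for illustration.

```python
from typing import Dict


def similar_nutrition(item_a: Dict[str, float],
                      item_b: Dict[str, float],
                      tolerance: float = 0.15) -> bool:
    """True if every shared nutrient differs by at most `tolerance` (relative)."""
    shared = item_a.keys() & item_b.keys()
    if not shared:
        return False  # nothing to compare
    for nutrient in shared:
        larger = max(abs(item_a[nutrient]), abs(item_b[nutrient]))
        if larger == 0:
            continue  # both values are zero, so they agree
        if abs(item_a[nutrient] - item_b[nutrient]) / larger > tolerance:
            return False
    return True


# Hypothetical per-serving values for three snack bars.
protein_bar_a = {"calories": 200, "protein_g": 20, "fat_g": 7, "carbs_g": 22}
protein_bar_b = {"calories": 210, "protein_g": 19, "fat_g": 7.5, "carbs_g": 24}
candy_bar = {"calories": 230, "protein_g": 3, "fat_g": 11, "carbs_g": 30}

print(similar_nutrition(protein_bar_a, protein_bar_b))  # True
print(similar_nutrition(protein_bar_a, candy_bar))      # False
```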

    Image processing for product matching also extracts nutrition table data at DataWeave.

    Building Next-Generation Competitive and Market Intelligence

    Moving forward, breakthroughs in generative AI and LLMs have fueled substantial innovation, which has enabled us to introduce powerful new capabilities for our customers.

    How Gen AI and LLMs are used by DataWeave to glean insights for analytics

    These include:

    • Building Enhanced Products, Solutions, and Capabilities: Generative AI and LLMs can significantly elevate the performance of existing solutions by improving the accuracy, relevance, and depth of insights. By leveraging these advanced AI technologies, DataWeave can enhance its product offerings, such as pricing intelligence, product matching, and sentiment analysis. These tools will become more intuitive, allowing for real-time updates and deeper contextual understanding. Additionally, AI can help create entirely new solutions tailored to specific use cases, such as automating competitive analysis or identifying emerging market trends. This positions DataWeave to remain at the forefront of innovation, offering cutting-edge solutions that meet the evolving needs of retailers and brands.
    • Reducing Turnaround Time (TAT) to Go-to-Market Faster: Generative AI and LLMs streamline data processing and analysis workflows, enabling faster decision-making. By automating tasks like data aggregation, sentiment analysis, and report generation, AI dramatically reduces the time required to derive actionable insights. This efficiency means that businesses can respond to market changes more swiftly, adjusting pricing or promotional strategies in near real-time. Faster insights translate into reduced turnaround times for product development, testing, and launch cycles, allowing DataWeave to bring new solutions to market quickly and give clients a competitive advantage.
    • Improving Data Quality to Achieve Higher Performance Metrics: AI-driven technologies are exceptionally skilled at cleaning, organizing, and structuring large datasets. Generative AI and LLMs can refine the data input process, reducing errors and ensuring more accurate, high-quality data across all touchpoints. Improved data quality enhances the precision of insights drawn from it, leading to higher performance metrics like better product matching, more accurate price comparisons, and more effective consumer sentiment analysis. With higher-quality data, businesses can make smarter, more informed decisions, resulting in improved revenue, market share, and customer satisfaction.
    • Augmenting Human Bandwidth with AI to Enhance Productivity: Generative AI and LLMs serve as powerful tools that augment human capabilities by automating routine, time-consuming tasks such as data entry, classification, and preliminary analysis. This allows human teams to focus on more strategic, high-value activities like interpreting insights, building relationships with clients, and developing new business strategies. By offloading these repetitive tasks to AI, human productivity is significantly enhanced. Employees can achieve more in less time, increasing overall efficiency and enabling teams to scale their operations without needing a proportional increase in human resources.

    In our ongoing series, we will dive deep into each of these capabilities, exploring how DataWeave leverages cutting-edge AI technologies like Generative AI and LLMs to solve complex challenges for retailers and brands.

    In the meantime, talk to us to learn more!