Category: Brands

  • The Importance of Pricing Parity for Brands

    With bricks-and-mortar stores steadily increasing their online presence, the balancing act of pricing online and in-store is now more important and complex than ever. Companies spend years building brands and brand equity. Yet, a misplaced or poorly executed pricing strategy to handle both online and offline pricing can erode that equity with consumers very quickly.

    This problem is not new. It started when Clubs like Costco and Sam’s Club began popping up in the 1980s. Suddenly, brands had to figure out a way to balance Club and Grocery pricing while taking advantage of a new, fast-growing channel. The biggest difference between then and now is that consumers can check prices within seconds on their phones.

    So, how do you avoid losing your brand equity while ensuring price parity across online and offline channels?

    The key areas to consider are:

    1. Product Mix

    Do you have a broad enough mix of product sizes and case configurations for each channel? To maximize your sales and minimize your price disruption, reviewing your supply chain and product mix to ensure you are able to deliver value to both online and offline retailers is critical. Each channel is looking for ways to improve and maximize your brand sales. If you do not give them the right size and case configuration to enable them to increase margins, you will end up relying disproportionately on trade spend (dollars a brand spends with a retailer to promote products) to do so, or find your product on page 212 of every search.

    Examples of this strategy can be seen with companies offering only “bundled” items such as 12 cans or a large case on online marketplaces, while other retailers offer individual cans for purchase. This allows your online partners to make up margin by shipping a full case and not going through the process of breaking down a case and shipping single units. Also, this allows bricks-and-mortar retailers to have a sharper price point to lure consumers into the store. This strategy has played out well for many brands as they dealt with the rise of Club stores and can be played successfully in e-commerce as well, benefiting all parties.

    2. Price Lists

    Do you have harmonized price lists that do not favor one channel over another? If you do not, you are likely subsidizing the higher list cost in a channel with trade spend, which is highly inefficient. A single price list that provides an adequate price slope between the various sizes across your product range will maximize your ability to manage both channel pricing and brand equity.

    The single largest mistake brands tend to make is thinking that offering “net price” price lists to online marketplaces will benefit them while they use trade dollars in bricks-and-mortar stores to cater to EDLP (Everyday low price) customers. This approach is quite inefficient in many ways, and consumes valuable time and resources that can otherwise be better utilized. Having a single price list with the same price offered to all retailers allows for a more manageable and equitable pricing environment. It also enables a more profitable distribution of trade spend across the most effective areas to invest in for each retailer.

    I have worked with two brands in the past: one that managed two separate price lists and one where we implemented a single standard price list. While the brand with the single price list saw sales grow and trade spend remain constant, the other saw trade spend double in just two years as it got caught in a cycle of always having to placate one side of the equation or the other.

    3. Trade Spend

    Today’s brands need to focus on a balanced trade spend strategy to address each channel’s unique needs. Using trade spend with online retailers can be tricky, as the channel is usually assumed to be the lowest priced anyway. Still, it can be used to drive traffic and offset supply chain costs to ensure sufficient margins for the retailer, which will keep you off the CRAP (Can’t Realize A Profit) lists. Meanwhile, as JCPenney quickly learned when it made the disastrous shift to EDLP, consumers still want in-store discounts and sales.

    The best approach I have worked with is to set a single dead net price inclusive of all trade. For example, if your product’s standard list cost is $6.80 and you have a dead net price for promotions (or EDLP) of $5.40, then all retailers – online and bricks-and-mortar – are on equal footing. The only variance in the price for consumers will be the margin each operator chooses to take. This approach is not without issues, as you have to apply all elements of trade spend (such as ad fees, etc.) to the promotional unit costs to ensure you are truly capturing the dead net cost of the retailer.

    Still, the advantage of utilizing this approach is that when a retailer complains about the price another is offering to consumers, the conversation turns to margins being taken and not the cost of the product. At the very least, this approach provides a common ground on which to have a constructive conversation with all retailers.
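
    As a rough illustration of the dead net arithmetic described above, the sketch below nets every trade element out of the list cost. The $6.80 list cost and $5.40 dead net come from the example; the split into a $1.00 per-unit allowance and a $2,000 ad fee spread over 5,000 promoted units is a hypothetical breakdown, as are the function and parameter names.

```python
from decimal import Decimal


def dead_net_unit_cost(list_cost, per_unit_allowances, fixed_fees, promo_units):
    """Return the unit cost after all trade spend is netted out.

    per_unit_allowances: per-unit deductions (e.g. an off-invoice allowance).
    fixed_fees: lump-sum trade items (e.g. ad fees) spread over promo_units.
    """
    if promo_units <= 0:
        raise ValueError("promo_units must be positive")
    fixed_per_unit = sum(fixed_fees, Decimal("0")) / promo_units
    return list_cost - sum(per_unit_allowances, Decimal("0")) - fixed_per_unit


# Hypothetical breakdown that reproduces the $5.40 dead net from the example.
cost = dead_net_unit_cost(
    list_cost=Decimal("6.80"),
    per_unit_allowances=[Decimal("1.00")],
    fixed_fees=[Decimal("2000")],
    promo_units=5000,
)
print(cost)
```

    Spreading lump-sum fees over the promoted units is what keeps the comparison between retailers about the margin each one takes, not about hidden differences in cost.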

    So why does this all matter so much to a brand?

    The road to selling online is littered with disaster and missed expectations for sales. Most manufacturers that jumped to online sales without considering pricing quickly learned that abandoning one channel for another does not lead to increased sales. Conversely, we have seen a few brands go from online only to in-store as well. These brands seem to have learned from the others’ mistakes and rarely will you find price variances between the online and offline channels. Instead, you tend to see these brands growing, as online consumers start experiencing the brand in-store.

    A Business To Community study by Larisa Bedgood in 2019 showed that “lower price” was second only to “convenience” among the reasons consumers shop online, while 51% of consumers said that the biggest drawback to shopping online was not being able to touch and feel the product. Brands that are able to bridge the gap and provide consumers with the convenience of online while also showing up well in-store at the right price point will be able to break out of the stagnant 1-2% (if they are lucky) growth most CPG companies are experiencing. If online selling is growing 40-50% a year, why are these companies only managing brand declines and flat growth? I believe it is mainly due to the lack of a proper pricing parity strategy for the two channels along with a lack of actionable e-commerce data.

    Brands that do not focus on all three areas listed above often find themselves in a constant churn of conversations with retailers on all sides, which will typically lead to either online marketplaces or bricks-and-mortar stores deprioritizing the brand in promotions or search. Finding and setting a level playing field will allow for a balanced trade spend and growth for brands on both platforms, while also enabling a brand to break out of the net 1% growth that is plaguing a lot of CPG brands today.

    Outside of deploying basic pricing principles for your brand, I would also suggest early and strong investments in data, systems and people to monitor your brand’s health and pricing. Many brands jumped online without any way to monitor the consumer conversation around the brand or the pricing of the brand online. Not having the tools and resources in place to do this can lead to a quick and long-lasting erosion of brand equity and sales. Most, if not all, large manufacturers have subscribed to POS data for years and fully understand how to analyze this data. But the world has shifted. If your organization has not invested in digital shelf analytics, you may be driving blind and unaware that your brand is losing equity, which equals losing consumers and sales.

    Using a combination of pricing principles and e-commerce data mining tools will help you maintain price parity and brand relevance, while keeping you from becoming the last brand of choice for consumers, regardless of where they shop.

  • Thanksgiving Weekend Sale: How Top US Consumer Brands Fared

    Online retailers in the US enjoyed impressive turnover during 2018’s Thanksgiving weekend sale. Over the last few weeks, DataWeave has published deep-dive reports on the performance of top US retailers in fashion and consumer electronics during this period, detailing their discounting and product strategies across several product types.

    In continuation of our series of articles on the Thanksgiving weekend sale, this article focuses specifically on the top brands across all retailers analyzed.

    Read Also:

    A Study of Fashion Retail Pricing Across Thanksgiving, Black Friday and Cyber Monday 2018

    How Consumer Electronics Was Priced Across Thanksgiving, Black Friday and Cyber Monday 2018

    While a lot of attention from the media and analysts during these sale events is often focused on the strategies and performance of retailers, the festive sale period is equally vital for consumer brands. Both established brands and new entrants across all categories compete aggressively to gain market share during a period that accounts for a substantial portion of annual sales turnover.

    For brands, the two primary drivers of conversion specific to sale events are competitive pricing and prominent brand visibility. At DataWeave, we went about analyzing which brands came out on top across retailers and categories during the Thanksgiving weekend sale, based on these two factors.

    Our Methodology

    We tracked the pricing of 6 leading fashion retailers and 5 major consumer electronics retailers to study the pricing strategies of brands during the sale events. Our analysis focused on additional discounts offered during the sale period to evaluate the true value of the sale event to customers. To calculate this effect, we compared the pricing of products on Thanksgiving Day, Black Friday and Cyber Monday to the pricing of products prior to the sale commencing. We considered the Top 500 ranked products on 11 product types across Men’s and Women’s Fashion and 11 popular consumer electronics products for this analysis.
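
    The comparison described above can be sketched roughly as follows. This is an illustration only: it assumes the additional discount is the percentage drop from the pre-sale price to the sale-day price, and that a brand is summarized by the average drop among its discounted products plus the share of its products that were discounted. The function names and sample prices are made up.

```python
def additional_discount_pct(pre_sale_price, sale_price):
    """Percentage price drop from the pre-sale price to the sale-day price."""
    if pre_sale_price <= 0:
        raise ValueError("pre_sale_price must be positive")
    # Price increases during the sale count as no additional discount.
    return max(0.0, (pre_sale_price - sale_price) / pre_sale_price * 100)


def summarize_brand(prices):
    """prices: list of (pre_sale_price, sale_price) pairs for one brand.

    Returns (average additional discount among discounted products,
             percentage of the brand's products that were discounted).
    """
    discounts = [additional_discount_pct(pre, sale) for pre, sale in prices]
    discounted = [d for d in discounts if d > 0]
    if not discounted:
        return 0.0, 0.0
    avg_discount = sum(discounted) / len(discounted)
    share = len(discounted) / len(discounts) * 100
    return avg_discount, share


# Made-up prices: two of four products are marked down further for the sale.
sample = [(100.0, 80.0), (50.0, 50.0), (200.0, 180.0), (40.0, 40.0)]
avg, share = summarize_brand(sample)
print(f"{avg:.1f}% additional discount on {share:.0f}% of products")
```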

    Consumer Electronics Brands

    In digital cameras, Canon’s traditional role as a discount leader was on show, featuring on both Best Buy (14%) and Target (20%), the two most aggressive price discounters in consumer electronics. Nikon took Canon’s place in DSLR cameras, for Best Buy (13%), Newegg (10%) and Walmart (4%), albeit at a comparatively low additional discount point.

    Razer benefited from Amazon’s strategy of promoting its lower-priced products, with a modest 9% additional discount applied across its entire range of laptop products. The competitiveness of this category between brands is shown by Samsung’s decision to give an additional 53% discount across 36% of its product line at Best Buy.

    The strategic approach brands take with different retailers was illustrated by HP’s 30% additional discount on 31% of its products at Target, while over at Walmart, HP had a dire 4% additional discount on a mere 13% of its products. A similar strategy was employed by LG with its televisions. On Amazon, its TVs had a 10% additional discount applied to 46% of its products, while at Newegg that translated to 25% and 8% respectively.

    Among the fast emerging wearables category, under-pressure Chinese firm Huawei dropped an aggressive 46% additional discount on 100% of its product range at Best Buy. By comparison, the next highest in this category was Marc Jacobs at Target with 33% and 40% respectively.

    Most Visible Brands Across Product Types

    In our analysis, brand visibility is represented in terms of both the number of products for each brand and the average rank of all its products (the lower the rank value, the higher the visibility).

    The influence an online retailer exerts on a brand’s average ranking is illustrated by Canon’s digital cameras. On Amazon, its 296 products had an average ranking of 272, compared with 30 products and an average ranking of 48 on Best Buy, 73 products and an average ranking of 212 on Newegg, and 20 products and an average ranking of 69 on Walmart. For all these retailers, Canon was the most visible brand in digital cameras, despite such variation.

    It was a similar story on laptops, with HP’s Amazon ranking of 298 based on 166 products, contrasting with a Target ranking of 14 on 18 products and Walmart ranking of 21 on 20 products.

    These patterns appear to play out in TVs too, with Samsung’s Amazon average ranking of 292 based on 150 products contrasting with Walmart average ranking of 10 across 7 products.

    Unsurprisingly, across our analysis of additional discounts and brand visibility, the top brands are well known and recognizable brands in each product type, with very few new entrants breaking out from the pack. This story, though, takes a turn in the following analysis on visibility growth.

    Brands With Highest Growth in Visibility

    To perform this analysis, we developed an index for the visibility of a brand based on the number of products available per brand as well as the average rank of those products. We then compared this score for each brand between before and during the sale period, and subsequently calculated the percentage growth.
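
    The exact form of the index is not given here, so the following is only a minimal sketch under an assumed definition: the number of listed products divided by their average rank, so that more products and better (lower) ranks both raise the score. The function names and sample ranks are hypothetical.

```python
def visibility_index(ranks):
    """ranks: search-result ranks (1 = top) of a brand's listed products."""
    if not ranks:
        return 0.0
    average_rank = sum(ranks) / len(ranks)
    # Assumed form: product count divided by average rank.
    return len(ranks) / average_rank


def visibility_growth_pct(ranks_before, ranks_during):
    """Percentage change in the index from before the sale to during it."""
    before = visibility_index(ranks_before)
    during = visibility_index(ranks_during)
    if before == 0:
        raise ValueError("brand had no visibility before the sale")
    return (during - before) / before * 100


before = [40, 60]            # two products, average rank 50
during = [10, 20, 30, 40]    # four products, average rank 25
print(round(visibility_growth_pct(before, during)))
```

    Under this assumed definition, doubling the product count while halving the average rank quadruples the index, which is why triple-digit growth figures are plausible for brands that start from a small base.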

    The list of brands that showed the highest growth in visibility for each product type is an interesting mix of well established and newer brands. The usual suspects included the likes of Philips, Fitbit, Sony, Kodak, Nikon, etc. The presence of brands like Apple, Google, and Bose is surprising as they would be expected to command strong visibility even before the sale. Some of the newer brands include Rha, Westinghouse, Garmin, Lanruo, and more.

    Some brands showed a dramatic increase in visibility. Examples include Bose on Walmart (698%), HTC on Newegg (657%), Galanz on Amazon (657%), and Jlab on Target (608%).

    Kodak’s digital cameras (2% growth) on Best Buy took the honors for the lowest increase in visibility, just ahead of HP laptops (3%) on Walmart, Nostalgia Electrics refrigerators (4%) and Belkin Tablets (7%) both on sale at Target. These numbers indicate a relatively static assortment for the respective retailers and product types.

    Fashion Brands

    Moving over to the Fashion category, we observed significantly more aggressive discounting activity, as expected. Parent’s Choice T-shirts recorded the highest additional discount (80%), applied to the widest product range (91% of its products at Walmart). Similarly, Fruit of the Loom saw Amazon promote a 78% additional discount applied across 20% of its products.

    In shoes, Macy’s promoted a 60% additional discount on 50% of Kenneth Cole’s product range. In watches, Amazon featured a 57% additional discount on 50% of Kate Spade New York branded products. Meanwhile, in sunglasses, Ray-Ban at Bloomingdale’s enjoyed a 20% additional discount spread across a whopping 95% of its products, compared to just a 14% additional discount applied to a mere 10% of Ray-Ban products at Newegg.

    In stark contrast to what was observed in Electronics, the Fashion category saw fewer large brands dominate the discounting landscape across categories. This isn’t surprising given how the Fashion category tends to be cluttered with a plethora of brands, while the Electronics category usually consists of a leaner set of popular brands in each product type.

    Most Visible Brands Across Product Types

    In casual shoes, Nike’s ranking of 264 on 93 products and Converse’s ranking of 239 on 89 products contrasted with Vision Street Wear’s ranking of 8 on 9 products and Time And Tru’s ranking of 15 on 14 products.

    Another point of contrast was the cross-retailer performance of Michael Kors handbags: the brand had an average ranking of 184 across 102 products on Macy’s, while its average ranking on Newegg was 20 across 12 products. Still, it appears the brand discounted heavily on Newegg to compensate for its relatively low visibility on the website.

    Ray-Ban recorded a category-high ranking of 209 based on 321 products on Macy’s. By comparison, Ray-Ban had a ranking of 17 on 34 products at Newegg. Over at Amazon, Ray-Ban managed a creditable 189 ranking on 124 products, along with a 163 ranking on 120 products at Bloomingdale’s.

    Brands With Highest Growth in Visibility

    In contrast to the Electronics category, certain Fashion brands skyrocketed in visibility. Examples include Next Level T-shirts (Amazon 2,000%), Michael Kors Watches (Walmart 1,424%), Dakota Watches (Target 751%) and Adidas sports shoes (Amazon 516%).

    Bloomingdale’s delivered amazing visibility growth for key brands, with Burberry (527%), Reiss (500%), The Kooples (500%), Tory Burch (500%), J Brand (475%), and Adidas (300%) all enjoying strong visibility growth.

    At the other end of the visibility growth spectrum, the growth rates of Lucky shirts (Newegg, 11%), Michael Kors (Newegg, 20%), Dickies jeans (Target, 22%), Tasso Elba shirts (Macy’s, 23%), and Puma Casual Shoes (Target, 25%) indicate a relatively more static assortment in their respective product types.

    Depth Of Product Range And Discounting Strategy Matters

    Across the three sales, DataWeave identified several different additional discounting and product assortment strategies by both the retailers and the brands.

    While retailers are increasingly discounting the lower priced products to shape price perceptions among shoppers (take a bow Amazon), what are the implications for brands? Firstly, a thin product range is going to make achieving visibility more challenging. Secondly, brand strategies across online retailing platforms will need to be more clearly defined and executed. Thirdly, those brands that treated Thanksgiving, Black Friday and Cyber Monday as discrete events are going to have to rethink their approach as these lines increasingly blur with time.

    If you’re interested in learning more about how DataWeave aggregates and analyzes data from online sources at massive scale, as well as how we provide competitive intelligence to retailers and consumer brands, visit our website!

  • Evolution of Amazon’s US Product Assortment

    As with many other product categories, Amazon has made a significant incursion in Apparel, a key battleground category in retail today. Recently, DataWeave once more collaborated with Coresight Research, a retail-focused research firm, to publish an in-depth report revealing insights on Amazon’s approach to its US fashion offerings.

    Since our initial collaborative report in February this year, we have witnessed some seismic shifts in the category at both the brand and the product-type level.

    Research Methodology

    We aggregated our analytical data on more than 1 million women’s and men’s clothing products listed on Amazon.com in two stages:

    Firstly, we identified all brands included in the Top 500 featured product listings for each product subcategory in both the Women’s Clothing and Men’s Clothing sections featured on Amazon Fashion (e.g., the top 500 product listings for women’s tops and tees, the top 500 product listings for men’s activewear, etc.). We believe these Top 500 products reflect around 95 percent of all Amazon.com’s clothing sales. This represents 2,782 unique brands.

    We then aggregated the data on all product listings within the Women’s Clothing and Men’s Clothing sections for each of those 2,782 brands. This generated a total of 1.12 million individually listed products. This expansive list forms the basis for our highlights of the report.
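
    The two stages above can be sketched as a pair of small functions. This is an illustration on toy data, not the pipeline used for the report: the dictionary layout, the field name `brand`, and the sample listings are all assumptions.

```python
def brands_in_top_listings(top_listings_by_subcategory):
    """Stage 1: collect every brand seen in any subcategory's top listings."""
    return {
        listing["brand"]
        for listings in top_listings_by_subcategory.values()
        for listing in listings
    }


def listings_for_brands(all_listings, brands):
    """Stage 2: keep every listing, top-ranked or not, from those brands."""
    return [listing for listing in all_listings if listing["brand"] in brands]


# Toy data standing in for the top listings of two subcategories.
top = {
    "womens-tops-and-tees": [{"brand": "BrandA"}, {"brand": "BrandB"}],
    "mens-activewear": [{"brand": "BrandA"}, {"brand": "BrandC"}],
}
catalog = [
    {"brand": "BrandA", "title": "Tee"},
    {"brand": "BrandA", "title": "Shorts"},
    {"brand": "BrandC", "title": "Jacket"},
    {"brand": "BrandD", "title": "Socks"},   # never in a top list: excluded
]

brands = brands_in_top_listings(top)
selected = listings_for_brands(catalog, brands)
print(len(brands), "brands,", len(selected), "listings")
```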

    Third-Party Seller Listings Are Rising Sharply

    We identified a total of 1.12 million products across men’s and women’s clothing — a significant increase of 27.3 percent in the seven months between February and September 2018. The drivers of this sharp spike are third-party seller listings. In contrast, the report indicates only a 2.2 percent rise in first-party listings over the same period, compared to a 30.5 percent jump in third-party listings.

    In addition, Amazon itself lists just 11.1 percent of all clothing products for sale, with third-party sellers offering the remaining 88.9 percent, an indication of the strength of Amazon’s open marketplace platform.

    A Major Brand Shift On Amazon Fashion Is Underway

    In just over six short months, major brand shifts on Amazon Fashion have taken place. The number of Nike listings has plummeted by 46 percent, driven by a slump in third-party listings following Amazon’s new partnership with Nike — a story recently covered by Quartz. Limited growth in Nike clothing first-party listings failed to compensate for this decline.

    Gildan’s spike in total product listings appears to be fueled by increased first-party listings off a low base. Calvin Klein’s 2017 agreement to supply Amazon with products appears to be driving the Calvin Klein brand’s double-digit uptick in first-party listings on Amazon Fashion.

    Aéropostale’s decline appears to be entirely driven by a drop in its third-party listings. The brand itself is not listed as a seller on Amazon.com.

    Amazon Is Rebalancing Its Apparel Portfolio and Switching Its Focus from Sportswear To Suits

    As its Fashion footprint rapidly matures, Amazon now appears to be rebalancing its portfolio with strong growth being shown in listings for formal categories such as suits and away from sportswear. We recorded a 98.6 percent increase in listings of women’s suits and blazers complemented by a 52.2 percent rise in men’s suit and sports coat listings between February and September 2018.

    Generic “Non-Brands” Are Surging On Top 25 Brands List

    Over the past six months, low-price generic brands have made major inroads into Amazon’s listings. Four unknown “brands” captured the top positions on the list of brands offered on Amazon Fashion. The WSPLYSPJY, Cruiize and Comfy brands appear to be shipped directly to customers from China.

    Source: Coresight/DataWeave (Amazon Fashion: Top 25 Brands’ Number of Listings, February 2018 vs. September 2018)

     


    WSPLYSPJY alone accounts for fully 8.6 percent of Amazon men’s and women’s clothing listings. Cruiize accounts for a further 3.2 percent of listings while Comfy chips in another 3.1 percent.

    Amazon Appears To Be Executing A Strategic Pivot

    Amazon’s fashion offering is fast maturing. We saw substantial growth in the number of listings for more formal categories. The realignment in third-party listings by Nike, together with increased first-party listings for Calvin Klein and Gildan, appears to be driven by alliances with Amazon.

    Simultaneously, ultralow-price generic clothing items delivered on order from China have inundated the “Most-Listed Products” rankings. Third parties now represent nearly 90 percent of Amazon Fashion’s offering.

    While Amazon Fashion shoppers enjoy a wider choice than they did even six months ago, we believe a stronger emphasis on first-party listings would grow the products eligible for Prime delivery. This tactic could strengthen Amazon Fashion’s long-term appeal as a shopping destination.

    If you’re interested in DataWeave’s technology, and how we aggregate data from online sources to provide unique and comprehensive insights on eCommerce products and pricing, check us out on our website!

  • Baahubali 2: Dissecting 75,000 Tweets to Uncover Audience Sentiments

    Why did Katappa kill Baahubali?

    Two years ago, not many would have foreseen this sentence capturing the imagination of the country like it has. Demolishing all regional barriers, the movie has grossed over INR 500 crores across the world in only its first three days.

    While the first movie received lavish praise for its ambition, technical values, and story, the sequel, bogged down by bloated expectations, has polarized critics. Some critics compare the movie’s computer graphics favorably to Hollywood productions like Lord of the Rings. Others find the movie lacking in pacing and plot.

    The masses, however, have reportedly lapped the movie up. Social media channels are brimming with opinions, and if one is to attempt finding out the aggregate views of audiences, Twitter is a good place to start.

    At DataWeave, we ran our proprietary, AI-powered ‘Sentiment Analysis’ algorithm over all tweets about Baahubali 2 from the first three days of its release, and observed some interesting insights.

    Twitterati Reactions to Baahubali 2

    Overall, the Twitterati’s views on the movie were overwhelmingly positive. We analysed over 75,000 tweets and identified the sentiments expressed on several facets of the movie, such as Visuals, Acting, and Prabhas. The following graphic indicates how the movie fared in some of these categories.

    The Baahubali team, Anushka (actor), Rajamouli (director), and Prabhas (actor), are all perceived as huge positive influences on the movie. Rajamouli, specifically, met with almost universal approval for his dedication and execution. Several viewers cheered the movie on as a triumph of Indian cinema, one which has redefined the cinema landscape of the country. There was considerable praise for the story, Rana (actor), and acting performances, as well.

    The not-so-positive sentiments were reserved for the reason behind Katappa killing Baahubali (no spoilers!), the visuals, and the second half of the movie. Many viewers found the second half to be slow, with unrealistic visuals and action sequences. For example, one of the tweets read:

    “First half was good, but the second half is beyond Rajnikanth movies: humans uprooting trees!”

    While these insights seem simple enough to understand, the technology to filter inevitably chaotic online content and extract meaningful information is incredibly complex.

    Unearthing Meaning from Chaos

    At DataWeave, we provide enterprises with Competitive Intelligence as a Service by aggregating and analyzing millions of unstructured data points on the web, across multiple sources. This enables businesses to better understand their competitive environment and make data-driven decisions to grow their business.

    One of our solutions — Sentiment Analysis — helps brands study customer preferences at a product attribute level by analyzing customer reviews. We used the same technology to analyze the reaction of audiences globally to Baahubali 2. After data acquisition, this process consists of three steps –

    Step 1: Features Extraction

    To identify the “features” that reviewers are talking about, we first understand the syntactical structure of the tweets and separate words into nouns, verbs, adjectives, etc. This needs to account for complexities like synonyms, spelling errors, paraphrases, noise, etc. Our AI-based technology platform then uses various advanced techniques to generate a list of “uni-features” and “compound features” (more than one word for a feature).

    Step 2: Identifying Feature-Opinion Pairs

    Next, we identify the relationship between the feature and the opinion. One of the reasons this is challenging with Twitter is that, most of the time, Twitter users treat grammar with utter disdain. Case in point:

    “I saw the movie visuals awesome bad climax felt director unnecessarily dragged the second half”

    In this case, the feature-opinion pairs are visuals: awesome, climax: bad, second half: unnecessarily dragged. Clearly, something as simple as attributing the nearest opinion-word to the feature is not good enough. Here again, we use advanced AI-based techniques to accurately classify feature-opinion pairs.
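
    To make the limitation of the nearest-word heuristic concrete, here is a small sketch of that naive baseline only; it is not the classifier described above. The feature and opinion lists are hand-picked, hypothetical inputs standing in for the output of Step 1.

```python
def find_phrase(tokens, phrase):
    """Return the index of the first token of phrase in tokens, or -1."""
    words = phrase.split()
    for i in range(len(tokens) - len(words) + 1):
        if tokens[i:i + len(words)] == words:
            return i
    return -1


def nearest_opinion_pairs(text, features, opinions):
    """Naive baseline: pair each feature with the closest opinion phrase."""
    tokens = text.lower().split()
    opinion_positions = {}
    for opinion in opinions:
        position = find_phrase(tokens, opinion)
        if position >= 0:
            opinion_positions[opinion] = position
    pairs = {}
    for feature in features:
        position = find_phrase(tokens, feature)
        if position < 0 or not opinion_positions:
            continue
        # Distance is measured between the first tokens of each phrase.
        pairs[feature] = min(
            opinion_positions,
            key=lambda o: abs(opinion_positions[o] - position),
        )
    return pairs


tweet = ("I saw the movie visuals awesome bad climax felt director "
         "unnecessarily dragged the second half")
features = ["movie", "visuals", "climax", "director", "second half"]
opinions = ["awesome", "bad", "unnecessarily dragged"]
pairs = nearest_opinion_pairs(tweet, features, opinions)
for feature, opinion in pairs.items():
    print(f"{feature}: {opinion}")
```

    On this tweet the baseline recovers the three pairs listed above, but it also attaches “awesome” to “movie” and “unnecessarily dragged” to “director”, because it has no way to decide that a feature carries no opinion at all.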

    We classified close to 1000 opinion words and matched them to each feature. The infographic below shows groups of similar words that the AI algorithm clustered into a single feature, and the top positive and negative sentiments expressed by the Twitterati for each feature.

    While our technology can associate phrases with similar meaning, such as ‘part after interval’ and ‘second half’, it can also handle spelling errors by identifying and grouping ‘Rajamouli’ and ‘Raajamouli’ as a single feature.
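
    The spelling-variant part of this grouping can be approximated with plain string similarity. The sketch below uses Python’s standard `difflib.SequenceMatcher` with an arbitrary 0.85 threshold; both are stand-ins chosen for illustration, not the method used in the analysis. String similarity also cannot link synonyms such as ‘part after interval’ and ‘second half’, which need semantic information.

```python
from difflib import SequenceMatcher


def group_spelling_variants(terms, threshold=0.85):
    """Greedily group terms that closely resemble a group's first term."""
    groups = []
    for term in terms:
        for group in groups:
            ratio = SequenceMatcher(None, group[0].lower(), term.lower()).ratio()
            if ratio >= threshold:
                group.append(term)
                break
        else:
            # No existing group was similar enough: start a new one.
            groups.append([term])
    return groups


groups = group_spelling_variants(["Rajamouli", "Raajamouli", "Prabhas"])
print(groups)
```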

    Adjectives like ‘magnificent’ and ‘creative’ were used to describe the Baahubali team positively, while words like ‘boring’, ‘disappointed’, and ‘tiring’ were used to describe the second half of the movie negatively.

    Step 3: Sentiment Calculation

    Lastly, we calculate the sentiment score, which is determined by the strength of the opinion-word, number of retweets and the time of tweet. A weighted average is normalized and we generate a score on a scale of 0% to 100%.
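
    The weighting itself is not spelled out, so the following is only a minimal sketch of one way such a score could be assembled. It assumes opinion strength lies in [-1, 1], that retweets increase a tweet’s weight logarithmically, and that older tweets decay with a 24-hour half-life; all three choices, and every name in the code, are assumptions.

```python
import math


def sentiment_score(opinions, half_life_hours=24.0):
    """opinions: list of (strength, retweets, hours_since_tweet) tuples.

    strength is assumed to lie in [-1, 1]; the result is a percentage.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for strength, retweets, hours in opinions:
        # Assumed weighting: retweets add reach, older tweets count less.
        weight = (1 + math.log1p(retweets)) * 0.5 ** (hours / half_life_hours)
        weighted_sum += weight * strength
        total_weight += weight
    if total_weight == 0:
        return 50.0  # no opinions at all: treat as neutral
    average = weighted_sum / total_weight   # weighted average in [-1, 1]
    return (average + 1) / 2 * 100          # rescaled to 0-100


# Made-up opinions about one feature: (strength, retweets, hours ago).
sample = [(0.9, 120, 2.0), (0.6, 3, 10.0), (-0.8, 0, 30.0)]
print(f"{sentiment_score(sample):.0f}%")
```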

    A Peephole into the Consumer’s Mind

    As more and more people express their views and opinions in the online world, there is more of an opportunity to use these data points to drive business strategies.

    Consumer-focused brands use DataWeave’s Sentiment Analysis solution as a key element of their product strategy, by reinforcing attributes with positive sentiments in reviews, and improving or eliminating attributes with negative sentiments in reviews.

    Click here to find out more about the benefits of using DataWeave’s Sentiment Analysis!

     

  • Dissonance in Online MRP Prices Across Retailers

    We all know that online shopping offers many benefits to shoppers. Apart from convenience, it offers access to a wide assortment and, of course, discounts are an added benefit. We often see retailers claiming large discounts on products.

    Many a time, the percentage discount that is mentioned drives price perception. When comparing prices across stores, customers view larger percentage discounts as a better deal. However, this is not necessarily the case. To illustrate, let us look at how discounts are calculated:

    Percentage discounts are a function of the MRP / MSRP and the Selling Price. The MRP / MSRP is set by the manufacturer and the selling price is more often than not determined by the retailer.

    It is well known that the selling price of a product differs across retailers. When the MRP of the same product also varies across retailers, it confuses customers, which in turn dilutes the brand equity of the brand or manufacturer.

    To analyse how deep this discord is, we decided to dive deeper into its working dynamics. Amongst all the data that we aggregate at DataWeave, analysing discounts of the same product across retailers gives us the ability to discern pricing strategies of retailers. We used this dataset to monitor and analyse MRPs.
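
    A small sketch can show both points made above: how the percentage discount follows from the MRP and selling price, and how one might measure the share of products whose MRP differs across retailers. The listings, retailer names, and function names are made up for illustration.

```python
def discount_pct(mrp, selling_price):
    """Percentage discount implied by a listed MRP and a selling price."""
    if mrp <= 0:
        raise ValueError("mrp must be positive")
    return (mrp - selling_price) / mrp * 100


def share_with_mrp_variance(listings):
    """listings: (product_id, retailer, mrp, selling_price) tuples.

    Returns the percentage of products whose MRP differs across retailers.
    """
    mrps_by_product = {}
    for product_id, _retailer, mrp, _price in listings:
        mrps_by_product.setdefault(product_id, set()).add(mrp)
    if not mrps_by_product:
        return 0.0
    varying = sum(1 for mrps in mrps_by_product.values() if len(mrps) > 1)
    return varying / len(mrps_by_product) * 100


# Hypothetical listings of two products at two retailers.
listings = [
    ("phone-x", "Retailer A", 1000, 800),
    ("phone-x", "Retailer B", 1200, 900),
    ("shirt-y", "Retailer A", 500, 400),
    ("shirt-y", "Retailer B", 500, 450),
]
for product_id, retailer, mrp, price in listings:
    print(product_id, retailer, f"{discount_pct(mrp, price):.0f}% off, pay {price}")
print(f"{share_with_mrp_variance(listings):.0f}% of products show MRP variance")
```

    In the made-up phone listing, Retailer B advertises the larger discount (25% against 20%) yet charges the higher end price, purely because it lists a higher MRP.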

    What we found

    1. We analysed MRPs of around 400 brands across 10 categories. Around 44% of products in these brands have no variance in MRPs across retailers

    2. This also means there is a variance in 56% of products

    3. Products in the ‘Mobile Phones and Tablets’ category have the most price variance; 65% of the products have price variance

    4. Fashion and Fashion accessories have the least price variance; around 20%

    5. Brands having the most variance:

    6. Brands having the least variance:

    What are the implications of the above insights?

    1. Brands & manufacturers need to be aware of how their brand products are being represented and sold online
    2. Consumers shopping online need to look at end prices, and not focus on the discount percentage, before making a purchase decision at a particular store

    This article was previously published on Yourstory

    DataWeave's Brand Intelligence provides consumer brands with the ability to track their products, pricing and discoverability vis-a-vis their competitors across e-commerce platforms.

  • Mining Twitter for Reactions to Products & Brands | DataWeave

    Mining Twitter for Reactions to Products & Brands | DataWeave

    [This post was written by Dipanjan. Dipanjan works in the Engineering Team with Mandar, addressing some of the problems related to Data Semantics. He loves watching English Sitcoms in his spare time. This was originally posted on the PriceWeave blog.]

    This is the second post in our series on social media analysis. We have already covered Twitter mining in depth, as well as how to analyze social trends in general and gather insights from YouTube. If you are more interested in developing a quick sentiment analysis app, you can check our short tutorial on that as well.

    Our flagship product, PriceWeave, is all about delivering real time actionable insights at scale. PriceWeave helps Retailers and Brands take decisions on product pricing, promotions, and assortments on a day to day basis. One of the areas we focus on is “Social Intelligence”, where we measure our customers’ social presence in terms of their reach and engagement on different social channels. Social Intelligence also helps in discovering brands and products trending on social media.

    Today, I will be talking about how we can get data from Twitter in real-time and perform some interesting analytics on top of that to understand social reactions to trending brands and products.

    In our last post, we used Twitter’s Search API to get a selective set of tweets and ran some analytics on them. Today, we will use Twitter’s Streaming API to access data feeds in real time. The two APIs differ in a couple of ways. The Search API is primarily a REST API that can be used to query for “historical data”, whereas the Streaming API gives us access to Twitter’s global stream of tweet data. Moreover, it lets you acquire much larger volumes of data in real time, using keyword filters, than normal search does.

    Installing Dependencies

    I will be using Python for my analysis as usual, so you can install it if you don’t have it already. You can use another language of your choice, but remember to use the relevant libraries of that language. To get started, install the following packages, if you don’t have them already. We use simplejson for JSON data processing at DataWeave, but you are most welcome to use the stock json library.

    Acquiring Data

    We will use the Twitter Streaming API and the equivalent Python wrapper to get the required tweets. Since we will be collecting a large number of tweets in real time, there is the question of where we should store the data and what data model we should use. In general, when building a robust API or application over Twitter data, MongoDB, being a schemaless document-oriented database, is a good choice. It also supports expressive queries with indexing, filtering and aggregations. However, since we are going to analyze a relatively small sample of data using pandas, we will store the tweets in flat files.

    Note: Should you prefer to sink the data to MongoDB, the mongoexport command line tool can be used to export it to a newline-delimited format that is exactly the same as what we will be writing to a file.

    The following code snippet shows you how to create a connection to Twitter’s Streaming API and filter for tweets containing a specific keyword. For simplicity, each tweet is saved in a newline-delimited file as a JSON document. Since we will be dealing with products and brands, I have queried two trending brands, ‘Sony’ and ‘Microsoft’, and two trending products, ‘iPhone 6’ and ‘Galaxy S5’. You can wrap the code snippet in a function for ease of use and call it for specific queries to do a comparative study.
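
    A minimal sketch of this step follows. It assumes the `twitter` package (Python Twitter Tools) as the wrapper, which may not be the one used for this post, and it assumes the keyword-filtered streaming endpoint is still available to your account. The helper names save_tweets and open_stream, the file name and the credential placeholders are hypothetical.

```python
import json


def save_tweets(tweets, path, max_tweets=None):
    """Append each tweet (a dict) to `path` as one JSON document per line."""
    saved = 0
    with open(path, "a", encoding="utf-8") as out:
        for tweet in tweets:
            out.write(json.dumps(tweet) + "\n")
            saved += 1
            if max_tweets is not None and saved >= max_tweets:
                break
    return saved


def open_stream(query, consumer_key, consumer_secret, token, token_secret):
    """Return an iterator over tweets matching `query` (needs the `twitter` package)."""
    import twitter  # imported lazily so save_tweets works without the package

    auth = twitter.oauth.OAuth(token, token_secret, consumer_key, consumer_secret)
    stream = twitter.TwitterStream(auth=auth)
    return stream.statuses.filter(track=query)


# Usage with real credentials (the strings below are placeholders, not working values):
# save_tweets(open_stream("Sony", "KEY", "SECRET", "TOKEN", "TOKEN_SECRET"), "sony.json")
```

    Keeping the file-writing logic separate from the connection makes it easy to call the same function once per query term, and to try it out on any iterable of dictionaries before pointing it at the live stream.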

    Let the data stream for a significant period of time so that you can capture a sizeable sample of tweets.

    Analyses and Visualizations

    Now that you have amassed a collection of tweets from the API in a newline delimited format, let’s start with the analyses. One of the easiest ways to load the data into pandas is to build a valid JSON array of the tweets. This can be accomplished using the following code segment.
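
    One way to build that JSON array is sketched below, using only the standard library so that it runs without pandas. The function name and the two sample records are made up. The closing comment shows how the resulting string could be handed to pandas.

```python
import json


def build_json_array(lines):
    """Join newline-delimited JSON documents into one JSON array string."""
    documents = [line.strip() for line in lines if line.strip()]
    return "[" + ",".join(documents) + "]"


# Two made-up records standing in for lines read from a capture file.
sample_lines = [
    '{"id": 1, "text": "Loving my new Sony headphones"}\n',
    '{"id": 2, "text": "Sony PS4 update is out"}\n',
]

array_text = build_json_array(sample_lines)
records = json.loads(array_text)  # parsing succeeds only if the array is valid JSON
print(len(records), "tweets loaded")

# With pandas installed, the same string can be turned into a data frame:
#   import io
#   import pandas as pd
#   df = pd.read_json(io.StringIO(array_text), orient="records")
```

    For a real capture, pass the open file object instead of the sample list, once per query term, and keep the results in a dictionary keyed by the term.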

    Note: With pandas, you will need to have an amount of working memory proportional to the amount of data that you’re analyzing.

    Once you run this, you should get a dictionary containing 4 data frames. The output I obtained is shown in the snapshot below.

    Note: Per the Streaming API guidelines, Twitter will only provide up to 1% of the total volume of real time tweets, and anything beyond that is filtered out with each “limit notice”.

    The next snippet shows how to remove the “limit notice” column if you encounter it.
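
    As a rough, library-free illustration of the same clean-up: limit notices arrive in the stream as messages that carry a "limit" key instead of tweet fields, which is why they surface as an extra column once loaded into a data frame. The sketch below drops them at the record level. The function name and sample records are hypothetical.

```python
def drop_limit_notices(records):
    """Keep real tweets and drop stream messages that only carry a limit notice."""
    return [record for record in records if "limit" not in record]


# Made-up capture: two tweets with one limit notice in between.
records = [
    {"id": 1, "text": "iPhone 6 camera is great"},
    {"limit": {"track": 42}},  # a limit notice, not a tweet
    {"id": 2, "text": "Galaxy S5 battery life"},
]

tweets = drop_limit_notices(records)
print(len(records) - len(tweets), "limit notice(s) removed,", len(tweets), "tweets kept")
```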

    Time-based Analysis

    Each tweet we captured was created at a specific time. To analyze the period over which we captured these tweets, let’s create a time-based index on the created_at field of each tweet, so that we can see at what times people post most frequently about our query terms.
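
    The idea can be sketched without pandas as follows: parse created_at into a datetime and order the tweets by it. The timestamp layout is the one I assume for Twitter's v1.1 tweet JSON, and the sample tweets are made up.

```python
from datetime import datetime

# created_at layout assumed for Twitter's v1.1 tweet JSON.
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def parse_created_at(value):
    """Turn a created_at string into a timezone-aware datetime."""
    return datetime.strptime(value, CREATED_AT_FORMAT)


# Made-up tweets; only the created_at field matters for this step.
tweets = [
    {"id": 2, "created_at": "Sun Dec 07 08:15:40 +0000 2014"},
    {"id": 1, "created_at": "Sat Dec 06 19:02:11 +0000 2014"},
    {"id": 3, "created_at": "Sun Dec 07 11:44:03 +0000 2014"},
]

# "Indexing" by time here simply means sorting on the parsed timestamp.
by_time = sorted(tweets, key=lambda t: parse_created_at(t["created_at"]))
first = parse_created_at(by_time[0]["created_at"])
last = parse_created_at(by_time[-1]["created_at"])
print("capture window:", first.isoformat(), "to", last.isoformat())
```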

    The output I obtained is shown in the snapshot below.

    I started capturing the Twitter stream at around 7 pm on the 6th of December and stopped it at around 11:45 am on the 7th of December, so the results are consistent with that. With a time-based index in place, we can trivially do useful things such as calculating the boundaries, computing histograms and so on. Grouping by a time unit is also easy to accomplish and is a logical next step. The following code snippet illustrates how to group by the “hour” of our data frame, which is exposed as a datetime.datetime timestamp since we now have a time-based index in place. We also print an hourly distribution of tweets to see which brand / product was most talked about on Twitter during that time period.
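
    A standard-library sketch of the hourly grouping is shown here, with collections.Counter standing in for the pandas group-by. The timestamps are made up and the created_at layout is an assumption.

```python
from collections import Counter
from datetime import datetime

CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"  # assumed v1.1 layout


def tweets_per_hour(created_at_values):
    """Count tweets by hour of day (0 to 23)."""
    return Counter(
        datetime.strptime(value, CREATED_AT_FORMAT).hour
        for value in created_at_values
    )


# Made-up timestamps for two query terms.
captured = {
    "Sony": [
        "Sat Dec 06 19:02:11 +0000 2014",
        "Sat Dec 06 19:40:52 +0000 2014",
        "Sun Dec 07 08:15:40 +0000 2014",
    ],
    "Microsoft": [
        "Sat Dec 06 19:05:30 +0000 2014",
        "Sun Dec 07 09:20:00 +0000 2014",
    ],
}

for term, stamps in captured.items():
    hourly = tweets_per_hour(stamps)
    print(term, "total:", sum(hourly.values()))
    for hour in sorted(hourly):
        print(f"  hour {hour:02d}: {hourly[hour]}")
```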

    The outputs I obtained are depicted in the snapshot below.

    The “Hour” field here follows a 24-hour format. What is interesting is that people have been talking more about Sony than Microsoft among brands. Among products, iPhone 6 seems to be trending more than Samsung’s Galaxy S5. The trend also suggests that people tend to talk more on Twitter in the morning and late evening.

    Time-based Visualizations

    It could be helpful to further subdivide the time ranges into smaller intervals so as to increase the resolution of the extremes. Therefore, let’s group into a custom interval by dividing the hour into 15-minute segments. The code is pretty much the same as before except that you call a custom function to perform the grouping. This time, we will be visualizing the distributions using matplotlib.
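
    The custom grouping function might look like the sketch below. The name quarter_hour and the sample times are hypothetical, and plotting is left out so that the example runs without matplotlib.

```python
from collections import Counter
from datetime import datetime


def quarter_hour(ts):
    """Return (hour, "MM-MM") naming the 15-minute segment that contains ts."""
    start = (ts.minute // 15) * 15
    return ts.hour, f"{start:02d}-{start + 15:02d}"


# Made-up tweet times; in practice these come from the created_at field.
times = [
    datetime(2014, 12, 6, 19, 2),
    datetime(2014, 12, 6, 19, 14),
    datetime(2014, 12, 6, 19, 31),
    datetime(2014, 12, 7, 8, 59),
]

counts = Counter(quarter_hour(ts) for ts in times)
for (hour, segment), n in sorted(counts.items()):
    print(f"{hour:02d}h {segment}: {n}")
```

    The sorted (hour, segment) keys and their counts are what you would pass to matplotlib as the x and y values of the plot.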

    The two visualizations are depicted below. Remember to ignore the section of the plots from after 11:30 am to around 7 pm, because I collected no tweets during this time. It shows up as a steep rise in the curve and is insignificant. The real regions of significance are from hour 7 to 11:30 and from hour 19 to 22.

    Considering brands, the visualization for Microsoft vs. Sony is depicted below. Sony is the clear winner here.

    Considering products, the visualization for iPhone 6 vs. Galaxy S5 is depicted below. The clear winner here is definitely iPhone 6.

    Tweeting Frequency Analysis

    In addition to time-based analysis, we can do other types of analysis as well. The most popular in this case is a frequency-based analysis of the users authoring the tweets. The following code snippet computes the Twitter accounts that authored the most tweets and compares that with the total number of unique accounts that appeared for each of our query terms.
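
    A small sketch of this count, assuming each tweet carries its author under user.screen_name as in Twitter's v1.1 JSON. The account names and tweets below are invented.

```python
from collections import Counter


def author_counts(tweets):
    """Count how many tweets each account authored."""
    return Counter(tweet["user"]["screen_name"] for tweet in tweets)


# Made-up tweets for one query term.
tweets = [
    {"user": {"screen_name": "case_seller"}, "text": "Galaxy S5 sleeves on sale"},
    {"user": {"screen_name": "alice"}, "text": "Thinking about a Galaxy S5"},
    {"user": {"screen_name": "case_seller"}, "text": "New Galaxy S5 covers"},
    {"user": {"screen_name": "bob"}, "text": "Galaxy S5 or iPhone 6?"},
    {"user": {"screen_name": "case_seller"}, "text": "Galaxy S5 cases, free shipping"},
]

counts = author_counts(tweets)
print("tweets:", sum(counts.values()), "unique accounts:", len(counts))
for name, n in counts.most_common(2):
    print(f"  {name}: {n}")
```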

    The results which I obtained are depicted below.

    What we do notice is that a lot of these tweets are also made by bots, advertisers and SEO technicians. Some examples are Galaxy_Sleeves and iphone6_sleeves which are obviously selling covers and cases for the devices.

    Tweeting Frequency Visualizations

    After the frequency analysis, we can plot these frequency values to get a better intuition about the underlying distribution, so let’s take a quick look at it using histograms. The following code snippet creates these visualizations for both brands and products using subplots.
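
    To show what such a histogram is built from, here is a text-only sketch that counts how many accounts authored exactly k tweets. It does not use matplotlib or subplots, and the per-account totals are hypothetical.

```python
from collections import Counter

# Hypothetical tweets-per-account totals, e.g. the result of a frequency count.
tweets_per_account = {
    "case_seller": 9,
    "deal_bot": 7,
    "alice": 1,
    "bob": 1,
    "carol": 2,
    "dave": 1,
}

# Histogram: number of accounts that authored exactly k tweets.
histogram = Counter(tweets_per_account.values())
for k in sorted(histogram):
    print(f"{k:>2} tweet(s): {'#' * histogram[k]} ({histogram[k]} accounts)")

# Share of all tweets written by the two most active accounts.
total = sum(tweets_per_account.values())
top_two = sum(sorted(tweets_per_account.values(), reverse=True)[:2])
top_share = top_two / total
print(f"top 2 accounts wrote {top_share:.0%} of the tweets")
```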

    The visualizations I obtained are depicted below.

    The distributions follow the “Pareto Principle” as expected: a small number of users author a large number of tweets, while the majority of users author only a few. Besides that, the tweet distributions show that Sony and iPhone 6 are trending more than their counterparts.

    Locale Analysis

    Another important insight is where your target audience is located and how frequently they tweet. The following code snippet computes this.
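
    A minimal sketch, assuming the language code is read from each tweet's lang field (the post may equally have used the author's profile language). The sample tweets are invented, and "und" is used as the fallback for tweets without a code.

```python
from collections import Counter


def language_counts(tweets):
    """Count tweets per language code, using "und" when none is present."""
    return Counter(tweet.get("lang", "und") for tweet in tweets)


# Made-up tweets for one query term.
tweets = [
    {"text": "iPhone 6 is here", "lang": "en"},
    {"text": "iPhone 6 買った", "lang": "ja"},
    {"text": "iPhone 6 最高", "lang": "ja"},
    {"text": "iPhone 6"},  # no language code attached
]

for code, n in language_counts(tweets).most_common():
    print(code, n)
```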

    The outputs I obtained are depicted in the following snapshot. Remember that Twitter follows the ISO 639-1 language code convention.

    The trend we see is that most of the tweets are from English-speaking countries, as expected. Surprisingly, most of the tweets regarding iPhone 6 are from Japan!

    Analysis of Trending Topics

    In this section, we will look at some of the topics associated with the terms we used for querying Twitter. For this, we will run our analysis on the tweets whose authors write in English. We will use the nltk library to take care of a couple of things, such as removing stopwords, which carry little significance. I will do the analysis for brands only, but you are most welcome to try it out with products too, because the following code snippet can be used for both computations.
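
    A rough sketch of the term-frequency computation follows. To keep it runnable without nltk, it swaps in a regular-expression tokenizer and a tiny hand-picked stopword set in place of nltk's tokenizer and stopword corpus. The function name and the sample tweets are made up.

```python
import re
from collections import Counter

# Tiny hand-picked stopword set (an assumption); nltk ships a much fuller list.
STOPWORDS = {"a", "an", "and", "the", "is", "of", "to", "in", "for", "on", "by", "it", "rt"}


def top_terms(texts, n=20):
    """Return the n most common non-stopword terms across the given tweet texts."""
    counts = Counter()
    for text in texts:
        text = re.sub(r"https?://\S+", " ", text.lower())  # strip links
        tokens = re.findall(r"[a-z0-9']+", text)
        counts.update(t for t in tokens if t not in STOPWORDS and len(t) > 1)
    return counts.most_common(n)


# Made-up English tweets for one brand.
texts = [
    "RT Sony hacked by unknown group",
    "Sony hack: the PS4 is safe",
    "New PS4 themes for Sony fans http://example.com/x",
]

for term, n in top_terms(texts, n=5):
    print(term, n)
```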

    The above code takes each tweet, tokenizes it, computes term frequencies and outputs the 20 most common terms for each brand. Of course, an n-gram analysis can give deeper insight into trending topics, but the same can also be accomplished with nltk’s collocations function, which takes in the tokens and outputs the context in which they were mentioned. The outputs I obtained are depicted in the snapshot below.

    Some interesting insights we see from the above outputs are as follows.

    • Sony was hacked recently, and it was rumored that North Korea was responsible, although North Korea has denied this. We can see that this is trending on Twitter in the context of Sony. You can read about it here.
    • Sony has recently introduced Project Sony Skylight which lets you customize your PS4.
    • There are rumors of Lumia 1030, Microsoft’s first flagship phone.
    • People are also talking a lot about Windows 10, the next OS which is going to be released by Microsoft pretty soon.
    • Interestingly, “ebay price” comes up for both brands, which might indicate that eBay is offering discounts on products from both of them.

    To get a detailed view on the tweets matching some of these trending terms, we can use nltk’s concordance function as follows.
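
    To show what a concordance view does, here is a hand-rolled keyword-in-context sketch. It is a stand-in written for illustration, not nltk's implementation, and the token list is made up. With nltk installed, nltk.Text(tokens).concordance("sony") prints a comparable view.

```python
def concordance(tokens, word, width=3):
    """Return each occurrence of `word` with up to `width` tokens of context per side."""
    word = word.lower()
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() == word:
            left = tokens[max(0, i - width):i]
            right = tokens[i + 1:i + 1 + width]
            lines.append(" ".join(left + [token] + right))
    return lines


# Made-up tokens standing in for the tokenized tweets of one brand.
tokens = "Sony was hacked and Sony denies PS4 outage".split()

for line in concordance(tokens, "sony", width=2):
    print(line)
```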

    The outputs I obtained are as follows. We can clearly see the tweets which contain the token we searched for.

    Thus, you can see that the Twitter Streaming API is a really good source to track social reaction to any particular entity whether it is a brand or a product. On top of that, if you are armed with an arsenal of Python’s powerful analysis tools and libraries, you can get the best insights from the unending stream of tweets.

    That’s all for now folks! Before I sign off, I would like to thank Matthew A. Russell and his excellent book Mining the Social Web once again, without which this post would not have been possible. Cover image credit goes to TechCrunch.