Sentiment Analysis of β-Hydroxybutyrate (BHB) Supplements’ Consumer Online Reviews


Recognizing those functions, scientists have worked to commercialize BHB supplement products for mass consumers. BHB products are currently marketed mainly as weight-loss and energy-enhancing products on the dietary supplement market. Thanks to these efforts, consumers can now easily obtain those products through channels such as retail stores, online platforms (e.g., Amazon), local clinics, and comprehensive hospitals. BHB supplements remain a small, emerging market compared with traditional supplements such as vitamin C and whey protein. A better understanding of consumers' BHB shopping behavior, especially online, provides first-hand consumer shopping data and guides R&D toward more targeted BHB supplement products or derivatives. In a broad sense, such a study helps to develop cost-effective healthcare solutions for new product development.
The consumers' online reviews we obtained served as the critical building blocks of this research. Based on them, sentiment analysis was applied to mine the text of consumer feedback. Sentiment analysis also appears under terms such as emotion detection [7], semantic analysis [8], and opinion mining [9]. Those terms are more or less similar to "sentiment analysis" as used here: a computational study of the text content of people's opinions, sentiments, emotions, and attitudes. In detail, it is regarded as a classification task, as it classifies the orientation of a text as positive, negative, neutral, or compound [10]. In the era of big data, it is useful for companies and individuals to monitor their reputation and get timely feedback about their products, activities, events, and policies [11]. It has also been described as one of the hottest fields in computer science [11].
Both machine learning-based and lexicon-based approaches have been developed for sentiment analysis of text data [12]. Machine learning-based analysis depends on a large volume of data for accurate prediction: the more training data, the better its performance. Lexicon-based approaches, by contrast, consult lexicons (online or offline dictionaries) to classify polarities or emotional orientations. Because they rely on the dictionary consulted, a large amount of data is helpful but not required. Previous studies show that lexicon-based sentiment analysis works well on social media-type text [13], does not require large training data, and performs rapidly on streams of data [14]. For instance, Paltoglou and Thelwall proposed an algorithm for unsupervised, lexicon-based sentiment analysis of web-based textual communication such as online discussions, tweets, and social network comments [13].
Despite the wave of supervised machine learning approaches in recent years, their extensive tests on three real-world datasets demonstrated that the proposed algorithm outperformed machine learning solutions in the majority of cases. This suggested that lexicon-based sentiment analysis could be a robust and reliable approach for informal communication on the internet. In another study, Kaushik and Mishra used a Hadoop-based technique to carry out sentiment analysis and opinion mining in a speedy and quantitative manner [14]. Their results showed that the Hadoop-based method was fast, accurate, and ready for scaled data sets. Hence, among the pool of data analytical tools, sentiment analysis is suitable for analyzing consumers' feedback on a rapidly growing emerging market. Against this background, this paper illustrates the application of lexicon-based sentiment analysis to systematically analyze consumers' online reviews of various BHB products, an emerging dietary supplement market. The resulting analysis helps us understand consumers' shopping behavior for innovative dietary supplements.
Figure 1: Framework of online review data sentiment analysis.

Methods
The framework of the online reviews' sentiment analysis is displayed in Figure 1. The process includes scraping the customer review data from Amazon.com, data cleaning, word-level sentiment analysis, sentence-level sentiment analysis, and text complexity analysis.

Online Review Scrape
Web Scraper, a Chrome extension, was used to extract review texts from dynamic web pages. A sitemap that specifies how the website should be traversed and what data should be extracted was created before scraping the online reviews. A series of JSON codes was developed and modified to scrape online customers' reviews from Amazon.com. The original code can be found in the Scrapehero package on Github.com. The modified JSON code was inserted into the sitemap JSON box of the Web Scraper extension before data collection. The request interval was set at 2000 ms during scraping. Depending on the complexity of the reviews, the scrape time for one product on Amazon varied from less than 1 minute to 30 minutes. Text data sometimes require pre-processing or cleaning before text mining to minimize noise or bias [10]. For the online reviews in this research, most users expressed their comments in a brief and straightforward way.

Volume 27-Issue 3
Unlike other online texts, these reviews contain few noisy or uninformative parts such as HTML tags, scripts, and advertisements [10]. We simply cleaned the text data by removing special characters and reorganizing the content for further analysis. At the same time, we tried to preserve the original review content as much as possible.
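The cleaning step described above can be illustrated with a minimal sketch. The function name `clean_review` and the exact set of characters kept are assumptions made for this illustration; the source does not specify which special characters were removed.

```python
import re
import unicodedata


def clean_review(text: str) -> str:
    """Remove special characters while keeping words and basic punctuation.

    Hypothetical helper: the set of characters kept is an assumption,
    not the rule used in the original study.
    """
    # Normalize Unicode so that compatibility characters share one form.
    text = unicodedata.normalize("NFKC", text)
    # Replace everything except letters, digits, whitespace, and a few
    # common punctuation marks with a space.
    text = re.sub(r"[^A-Za-z0-9\s.,!?'$%#-]", " ", text)
    # Collapse the runs of whitespace left behind by the removal step.
    return re.sub(r"\s+", " ", text).strip()


if __name__ == "__main__":
    print(clean_review("Great   taste!! ★★★ <3 Works~"))
```

Keeping sentence punctuation matters here because the later sentence-level analysis relies on sentence boundaries.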

Word-level Sentiment Analysis
An external lexicon or dictionary served as the resource for judging text sentiment or polarity [15,16]. The words in the online reviews of one product were obtained with NLTK tokenization before sentiment classification [17]. They were then classified into positive and negative categories for further analysis. In addition, word clouds were generated from the word-tokenized text with the wordcloud function in NLTK [17]. The word-level sentiment analysis gives a direct view of the sentiment expressed in the text comments.
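As a rough illustration of the word-level step, the sketch below classifies tokens against a lexicon and reports a positive/negative ratio. The tiny `POSITIVE` and `NEGATIVE` sets are hypothetical stand-ins for a real lexicon, and the regular-expression tokenizer is a simplified substitute for the NLTK tokenization used in the study.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon for illustration only; a real lexicon
# contains thousands of entries.
POSITIVE = {"great", "good", "love", "happy", "helped"}
NEGATIVE = {"bad", "awful", "disappointed", "nothing", "diarrhea"}


def tokenize(text):
    """Lowercase word tokenizer (a simplified stand-in for NLTK)."""
    return re.findall(r"[a-z']+", text.lower())


def word_level_sentiment(reviews):
    """Count lexicon hits and return (counts, positive/negative ratio)."""
    counts = Counter()
    for review in reviews:
        for token in tokenize(review):
            if token in POSITIVE:
                counts["positive"] += 1
            elif token in NEGATIVE:
                counts["negative"] += 1
            else:
                counts["unclassified"] += 1
    # Avoid division by zero when no negative word is recognized.
    if counts["negative"]:
        ratio = counts["positive"] / counts["negative"]
    else:
        ratio = float("inf")
    return counts, ratio


if __name__ == "__main__":
    sample = [
        "Great flavor, love it",
        "Did nothing, very disappointed",
        "Good taste and great energy",
    ]
    print(word_level_sentiment(sample))
```

Most tokens fall into the unclassified bucket, which is why only the ratio between the two recognized classes is compared across products.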

Sentence-level Sentiment Analysis
VADER sentiment analysis was performed on the sentence-tokenized text of the online reviews of one product to obtain sentiment scores, including positive, negative, and polarity scores [18]. This approach indicates how positive or negative a snippet under analysis is. In detail, scores are assigned to each sentence-level snippet in the categories of positive, negative, neutral, and compound. Among the four categories, the compound score is the sum of all the lexicon ratings (positive, negative, and neutral), normalized to between -100% (most extreme negative) and +100% (most extreme positive). It is also called the 'normalized, weighted composite score'. The higher the compound score, the more positive the overall sentiment. It provides another angle from which to view the overall sentiment.
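To make the normalization concrete, the sketch below sums word valences and maps the sum into the open interval (-100%, +100%) using x / sqrt(x^2 + alpha) with alpha = 15, the normalization used in the open-source VADER implementation. Everything else is a simplification: the `TOY_VALENCE` values are hypothetical, and the rule-based adjustments of the real tool (negation, intensifiers, punctuation, capitalization) are omitted.

```python
import math
import re

# Hypothetical valence values for illustration; not the VADER lexicon.
TOY_VALENCE = {"great": 3.0, "good": 2.0, "nothing": -1.0, "disappointed": -2.0}


def compound_percent(sentence, alpha=15.0):
    """Return a normalized compound score for one sentence, in percent."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    # Sum the valence of every token found in the lexicon.
    total = sum(TOY_VALENCE.get(token, 0.0) for token in tokens)
    # Squash the unbounded sum into (-1, 1), then express it in percent.
    normalized = total / math.sqrt(total * total + alpha)
    return 100.0 * normalized


if __name__ == "__main__":
    for text in ("Great taste", "It arrived", "Did nothing, disappointed"):
        print(text, round(compound_percent(text), 2))
```

Because of the square-root term, the score approaches but never reaches the extremes, so a sentence with no lexicon hits scores exactly 0%.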

Text Complexity Analysis
Text complexity analysis gives a statistical summary of the text data we collected. It summarizes the number of online reviews for one product and the numbers of characters, words, sentences, and unique words in those reviews. It adds one more dimension from which to view the text data, judge the text features, and predict the product market with more confidence.

Review Data Summary
Table 1 shows the statistics of the β-Hydroxybutyrate (BHB) products' review data collected on Amazon.com. The BHB product reviews were collected within 2 months in 2019.
The entire text dataset includes 30877 reviews, 105703 sentences, and 1574171 words. Those product reviews reflect customers' comments on 71 products under 26 brands.
Note: *#1 to #6 are flavored BHB products.

Word-level Sentiment Analysis
Word-level sentiment analysis uses a lexicon to classify the words in the online reviews of one product into positive and negative categories. The process puts all recognized positive words and all recognized negative words into two separate classes. Because human language abounds in complicated expressions, the recognized positive and negative words make up relatively small portions of the text. We therefore compared those numbers with each other. Table 2 [...] lemon raspberry. Most of the flavored items had positive/negative ratios higher than those of the unflavored items. Only one lemon item had a positive/negative ratio of 1.68. Three out of four capsule items had positive/negative ratios below 2. Such results suggest that appropriate flavoring improves consumer acceptance of BHB products. In addition, word clouds were generated from the online reviews of the flavored and unflavored BHB products listed in Table 2.
The size of a word in the word cloud is proportional to the frequency of that word in the reviews. For instance, the word cloud of product #2 clearly shows that consumers care about the taste and the ketosis function of the BHB product (i.e., increasing body ketones). Large words such as "flavor" and "great" suggest that the consumers who used product #2 gave highly positive feedback on its flavoring. We then summarized the top-3 high-frequency words in the word clouds of the analyzed BHB products in Table 3. High-frequency words such as "keto", "taste", "product", "flavor", and "great" can be found in the word clouds of products #2, #3, #4, and #7 (see Table 3). The patterns of top-3 high-frequency words were not identical from product to product; however, the same words, such as "taste", occurred repeatedly, in different orders, in the products' high-frequency word lists.
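The word frequencies that drive word size, and the top-3 lists summarized in Table 3, can be approximated with a simple counter. This is a sketch only: the `STOPWORDS` set and the regular-expression tokenizer are assumptions for illustration, and the sample reviews are invented rather than taken from the dataset.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real analysis would use a fuller one.
STOPWORDS = {"the", "a", "and", "is", "it", "i", "this", "of", "to",
             "in", "for", "my", "but"}


def top_words(reviews, k=3):
    """Return the k most frequent non-stop-words with their counts."""
    counts = Counter(
        token
        for review in reviews
        for token in re.findall(r"[a-z']+", review.lower())
        if token not in STOPWORDS
    )
    return counts.most_common(k)


if __name__ == "__main__":
    sample = [
        "Great taste and great flavor",
        "The taste is great",
        "Keto product, great taste",
    ]
    print(top_words(sample))
```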
From those frequently repeated words, we speculate that the product development teams behind those products attempted to attract consumers by making a tasty functional drink mix. In the word cloud of product #7, the word "diarrhea" appears clearly in the corner, indicating the occurrence of this side effect. Similar side effects caused by excessive intake of magnesium citrate have been observed in clinics, where it worsened the gastrointestinal load. It also illustrates that not everyone adapts to BHB supplements, and that an appropriate daily intake (e.g., amount, dosage, and intake approach) should be recommended.

Figure 3: Word clouds of BHB capsule products' reviews, including products #8, #9, #10, and #11 from Table 2.
We then extracted the online reviews of the remaining BHB capsule products (i.e., products #8, #9, #10, and #11) in Table 2 and generated their word clouds (see Figure 3). Different words, large and small, appear in each word cloud in Figure 3. "Lost pounds", "weight loss", "lose", and "weight" can be observed in the word clouds of products #8, #9, and #11, which reflects the weight-loss function of BHB as a supplement. The word "appetite" in the word cloud of product #9 might be related to the appetite disturbance caused by BHB. The word "energy" appears in the word clouds of products #9, #10, and #11, suggesting the energy-enhancing function of BHB. In addition, the words "help", "great", "will", "work", "helped", and "happy" in the word clouds in Figure 3 give us confidence in BHB's supplement functions. The corresponding high-frequency word pattern is straightforward: the words "keto", "help", "supplement", and "product" in the word clouds of products #8 to #11 appeared in different frequency orders (see Table 3). We can see that consumers at large emphasize BHB's functions.
Many of them give positive feedback on BHB's functions, especially weight loss and energy enhancement.

Sentence-level sentiment analysis: Flavor and Price
[...] Figure 4A. We selected some representative BHB products and placed them together in Figure 4B. The compound scores of those flavored BHB powder products were high, above 20%. We also grouped all the flavored BHB products together in Figure 4C. Figure 4C shows that most of the flavor categories have average compound scores higher than 20%. Five of those flavor categories have average compound scores higher than 25%, and three of them have average compound scores close to or higher than 30%. Those top products were flavored with apple, berry, cherry, lemon-lime, and lemon-strawberry, most of which fall into the citrus category. This is not too surprising. Citrus flavors have been investigated for a long time [19]. This flavor category is widely accepted around the world and, more importantly, is available and tastes similar globally. Citrus flavors have a clean, refreshing taste note and are compatible with many other flavors and ingredients. Thus, it is relatively easy to commercialize citrus-flavored products around the world.
The feasibility of design, processing, and production also makes citrus flavors convenient to apply [20,21]. [...] to $40 (see Figure 4A). The compound scores of unflavored BHB products ranged broadly from -22% to 37% (see Figures 4A and 4B). The compound scores of BHB capsule products fluctuated heavily. Quite a few BHB capsule products had compound scores above 20%, while others had compound scores below 5%. The few other unflavored BHB powders and liquid had compound scores of 12.21% (powder), 15.06% (powder), and 13.94% (liquid). Those items were good products, but they were not as competitive as those with higher compound scores. Although the unflavored BHB products were sold at low prices, they were still less popular or competitive than the others. For instance, flavored BHB products under brands C and E clearly had higher compound scores than the unflavored BHB products (see Figure 4B). Additives such as flavors and sweeteners increase the product price; however, the high quality of the resulting products still drives consumers back to them. Moreover, those products' compound scores were calculated from a certain number of consumer reviews. The compound scores of BHB products such as A Capsule 2 and B Powder Unflavored 1 in Figure 4B were generated from 39 and 170 reviews, respectively. The less appealing compound scores of those unflavored powder and capsule products therefore reflect common agreement among consumers. Monotonous products cannot sustain consumers' desire to keep buying.

Sentence-level sentiment analysis: Dosage and Package
The current BHB market offers consumers different packages. Packages of 1 oz to 55 oz were used to bottle BHB products in different dosage forms. Figure 5 shows the sentence-level sentiment analysis of BHB products' online reviews with a focus on packaging.
All Dosage: Figure 5A shows the bubble chart of BHB products plotted by package versus dosage form. From Figure 5A [...] Nevertheless, when we zoomed into individual reviews of BHB capsules with low compound scores, we found reviews such as "Gives you a lot of energy but no weight loss", "It did nothing and there was no information with it to tell me what I should do to make it work, sorry I was very disappointed.", and "Did nothing at all. No change, not even a pound dropped combined with diet and exercise. Save your money." among all the other positive reviews. The metabolism of BHB products in the body still deserves further investigation. More clinical studies of BHB products are needed to address this issue. Consumers might also be subjective and, to some extent, have incorrect expectations. On the other hand, supplement manufacturers should review such negative customer feedback and educate their customers on the appropriate use of their products. Unlike capsules, BHB powder products were commercialized in the various packages shown in Figure 5A. There is no unified package size for the powder products, which leaves more room for creativity in bottling them. [...] consumption orientation, especially for BHB powder products.
All package sizes have experienced both high and low compound scores. Package size is not a dominant factor in BHB product design. Other factors, including formula, price, and label, all play a role in the market performance of BHB products.

Complexity Analysis
Sentiment analysis provides the polarity information of the text data, while the complexity analysis summarizes the numbers of words, sentences, and characters in the reviews of each BHB product. Combining both analyses enables us to understand the robustness of consumers' feedback on BHB products. Table 4 lists the complexity analysis of three brands' BHB products, a subset of the complexity analyses of all 71 BHB products.
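The per-product statistics summarized in this analysis can be computed along the following lines. This is a minimal sketch: the function name `text_complexity` is hypothetical, and the regular-expression word and sentence splitting is a simplified stand-in for the tokenizers used in the study, so counts on real data may differ slightly.

```python
import re


def text_complexity(reviews):
    """Summarize the reviews of one product with simple text statistics."""
    text = " ".join(reviews)
    # Words: runs of letters, digits, and apostrophes.
    words = re.findall(r"[A-Za-z0-9']+", text)
    # Sentences: split each review after '.', '!' or '?' followed by space.
    sentences = [
        sentence
        for review in reviews
        for sentence in re.split(r"(?<=[.!?])\s+", review.strip())
        if sentence
    ]
    return {
        "reviews": len(reviews),
        "sentences": len(sentences),
        "words": len(words),
        "unique_words": len({word.lower() for word in words}),
        "characters": sum(len(review) for review in reviews),
    }


if __name__ == "__main__":
    print(text_complexity(["Great taste. Works well!", "Did nothing at all."]))
```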
The three products under brand A received more than 300 reviews, comprising over 1000 sentences, 14000 words, and 65000 characters. Consumers paid a certain amount of attention to the BHB products under brand A. Interestingly, the products under brand A had very similar patterns of text complexity, as did the products of brand C. Five out of six products under brand C had almost overlapping values for # of reviews, # of sentences, # of words, and # of characters, with small deviations.
Similar text complexity is observed within one brand, while large differences appear among brands. We can see that brands help differentiate products among companies. Rational consumers are willing to trust products with a higher reputation more than others. Products with a high reputation automatically act as marketing for their brands.
For a comprehensive understanding, we mapped the distributions of the BHB products under the combined constraints of the sentiment and complexity analyses. Figure 6 presents the mapping of BHB products labeled with flavors under the combined conditions of the online reviews' polarity and complexity. In Figure 6A and Figure 6B, the marker colors indicate different flavors, and the same or similar colors indicate the same or similar flavors. Figure 6A shows the impact of flavor on the polarity of product reviews in the context of the # of Reviews. Data points are scattered all over the plot in Figure 6A. For instance, we observed an unflavored BHB product with a compound score of -22% generated from 1 review, a lemon-lime BHB powder with a compound score of 55% from 3 reviews, and another lemon-lime BHB powder with a compound score of 23% from over 235 reviews. The products had very distinct analytical results. For convenience, we divided the entire plot into 4 sections using the boundaries compound% = 20 and # of Reviews or # of Sentences = 100. In Figure 6A, most of the unflavored BHB products were located in the center of the plot, while some sat in the lower-left part of the figure, the region of low compound score and low # of Reviews. In contrast, quite a few flavored BHB products gathered in the upper-right part, the region of high compound score and high # of Reviews. Moreover, almost all the flavored BHB products had compound scores higher than 20%, whereas more unflavored BHB products had compound scores < 20% than > 20%. Such observations reinforced our previous observation that flavored BHB products were more readily accepted by consumers at large.
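The four-section division can be expressed as a small classification rule. The sketch below uses the boundaries stated above (compound% = 20 and a count of 100) and the three example products mentioned in the text; the function name `quadrant` and the treatment of values lying exactly on a boundary (assigned to the "low" side) are assumptions.

```python
def quadrant(compound_pct, count, compound_cut=20.0, count_cut=100):
    """Label a product by compound score and # of Reviews (or Sentences).

    Values exactly on a boundary are treated as "low"; the source does
    not say which side they belong to.
    """
    vertical = "high" if compound_pct > compound_cut else "low"
    horizontal = "high" if count > count_cut else "low"
    return f"{vertical} compound / {horizontal} count"


if __name__ == "__main__":
    # Example products taken from the text (score in %, # of Reviews).
    examples = {
        "unflavored": (-22, 1),
        "lemon-lime powder (few reviews)": (55, 3),
        "lemon-lime powder (many reviews)": (23, 235),
    }
    for name, (score, reviews) in examples.items():
        print(name, "->", quadrant(score, reviews))
```

The same rule applies to Figure 6B by passing the # of Sentences as the count.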
It should be noted that the compound score is assigned based on the entire pool of reviews for one product. We should therefore take the text statistics of those reviews into consideration. Unlike the lengthy text in books, most of the online reviews for one product contain fewer than 5 sentences, and only a few of those sentences directly express sentiment. We used # of Sentences as a substitute for # of Reviews in the same analysis, which gives another angle from which to view the product polarity in the context of alternative text statistics. Figure 6B shows the plot of compound score versus # of Sentences with flavor labels. Compared with Figure 6A, Figure 6B with # of Sentences has a similar range of text statistics. The data in Figure 6B shifted overall to the right, toward a higher order of magnitude, because a large portion of the BHB products' reviews contained more than one sentence. Most of the flavored BHB products gathered in the upper-right part of Figure 6B, the region of high compound score and high # of Sentences. Many of the flavors in that region were combined citrus flavors such as lemon-lime, lemon-strawberry, lemon-raspberry, and orange-mango. Other flavored BHB products were found outside that region. For instance, chocolate, apple, and another lemon-lime BHB powder product had fewer than 10 in either # of Reviews or # of Sentences. Those products were located in the upper-left region. It is likely that they would receive more positive online reviews and shift toward the upper-right region. As in Figure 6A, more unflavored BHB products with compound scores < 20% were observed in Figure 6B. Meanwhile, the relative point-to-point distances of the unflavored BHB products also changed, because some reviews contain more sentences than others.
Packaging is another dimension we mentioned previously in the sentence-level sentiment analysis. We applied the same approach to map the distributions of BHB products labeled with packages under the combined constraints of the sentiment and complexity analyses (see Figure 7A and Figure 7B). The colors in Figure 7A and Figure 7B indicate different packages, specifically volumes. Similar volumes share a color in Figure 7: 20 oz and 25 oz are blue, and 1 oz and 4 oz are violet. Figure 7A displays the distribution of BHB products labeled with packages under the combined conditions of compound score and # of Reviews. Large-volume packages appeared in the upper-right part of the plot. Those BHB products' bottles had volumes from 20 oz to 55 oz. Those items overlapped with the flavored BHB products with compound scores > 20% in Figure 6A.
[...] give more positive comments to their full-size items. Figure 7B shows the distribution of BHB products labeled with packages under the combined conditions of compound score and # of Sentences. Likewise, the distribution of BHB products in Figure 7B is similar to that in Figure 7A. The only difference lies in the relative positions of the data points, as mentioned before.

Discussion
This research conducted sentiment analysis of BHB products' online reviews to understand how current BHB products are accepted by consumers at large. Two factors, flavor and package, were taken into consideration during the analysis. Flavors are not only edible ingredients but also multisensory phenomena that integrate taste, olfactory, and other sensory information into a perceived property of the product rather than a collection of individual sensory attributes. For customers, sensory pleasure is the motivation to consume a product and to experience its flavor again and again [22]. The sensory qualities of flavors reduce product risk and increase consumer affinity. When we consume supplements, especially sensory products, flavor becomes more important than the sum of the other parts. In fact, flavor can be regarded as a primary factor in driving consumption behavior [23].
It has been demonstrated that liking for flavored products increases chewing and swallowing rates [24]. In the current research, we observed that flavored BHB products were more popular than unflavored BHB products. The popularity of flavored BHB products is independent of the package volume. Among the flavored BHB products, those with combined citrus flavors such as lemon-raspberry showed further increased polarity in their online reviews. In other words, those particular flavored BHB products stand out among competitors.
Package is the other factor we considered in the sentiment analysis of BHB products. It is the consumer's first visual impression of the product, and it plays a critical role in product marketing and sales. Research shows that even the position of an image on the packaging affects consumers' perception of the product weight and package evaluation [25]. A considerable number of investigations of multisensory product perception suggest that packaging features can bias consumers' flavor evaluations [26]. Other consumer research shows that altering packaging materials affects not only sustainability perceptions but also several other aspects, including perceived taste and quality [27]. Our observations indicate that the packages of the flavored BHB products are distinct from those of the unflavored BHB products, especially in packaging size. For most of the unflavored BHB products, packaging sizes are limited in design. By comparison, flavored BHB products have a wider range of package volumes, which offers more possibilities for product marketing and education. In addition to the major labels of flavor and package, the sentiment analysis shows that consumers at large emphasize BHB's functions, such as weight loss and energy enhancement.
Consumers are not willing to see products lose those functions and work only as placebos. Some side effects of BHB products, such as diarrhea, suggest that more clinical studies and user education are needed. From the complexity analysis, brand implicitly provides product differentiation and credit enhancement in this emerging market. Product differentiation refers to the business strategy of highlighting a product's unique features and benefits to separate it from competitors. When it works, a brand can create additional intangible value, consumer loyalty, and even market trends.

Conclusion
Sentiment analysis of β-Hydroxybutyrate (BHB) products' online consumer reviews was carried out to explore consumer insights into the emerging market of dietary supplements. Our observations demonstrate that flavoring plays a key role in the BHB product market, together with other factors such as packaging and brand. Flavored BHB products are more popular than unflavored BHB products and are more readily accepted by consumers at large despite their high prices. Creative flavors such as lemon-raspberry enable BHB products to stand out among competitors. High-volume packaging offers more possibilities for marketing and consumer education. Meanwhile, we cannot ignore other factors such as active functions and brand building.