2024-12-30
Lotte Shopping E-Commerce: an MSA monitoring optimization case with WhaTap
Company name
LOTTE SHOPPING
Industry
retail
Website
www.lotteon.jp/display/main/lotteon
The main reason I chose WhaTap was its intuitive and familiar UI. I believe an APM tool should be easy for developers, and anyone else, to access and check.
LOTTE SHOPPING
IT Operations Manager

Lotte Shopping E-Commerce, a customer that uses WhaTap monitoring, operates LOTTE ON, a platform that provides a convenient customer experience by integrating online and offline services. LOTTE ON is an integrated e-commerce platform created by combining seven online and offline business divisions, building on Lotte's strengths as a traditional retail powerhouse. LOTTE ON surpassed 2 million MAUs for the first time in December of last year and January of this year, and has maintained an average daily traffic of 330,000.

LOTTE ON previously used a different monitoring service but found it inconvenient, so it switched to WhaTap monitoring. We interviewed CEO Jung Sung-min to hear more about why an enterprise such as Lotte Shopping E-Commerce chose WhaTap and how the team is using it.

Introduction: The customer and its infrastructure

Please give us a brief introduction to LOTTE ON as a company and a service.

LOTTE ON is an integrated e-commerce platform that brings together Lotte's leading shopping malls so that users can reach them easily. It provides a convenient customer experience by making LOTTE HOME SHOPPING, Hi-Mart, and Super Fresh all accessible on a single platform, and we are working to deliver fresh stories to customers by integrating online and offline.

Lotte companies provide a wide range of services. What environments have those services run in?

We have provided services through several stages of a digital journey. We started by building the Lotte Internet Department Store service in an on-premise environment in 1996, then moved to a cloud environment in 2015 and 2016 to open the Nike and UNIQLO services. After that, Lotte Internet Duty Free was built as an MSA in a cloud environment. In 2018, ELOTTE was built in a cloud-native environment, and finally the LOTTE ON service was built.

Challenge: From adopting WhaTap to user feedback

What led you to adopt a monitoring service, and why did you choose WhaTap?

The LOTTE ON service runs as an MSA on EKS within the AWS cloud. LOTTE ON's architecture is very complex, so how to monitor it was a mission we were given from the time it was built. Along with the complex architecture, Kubernetes itself also needs to be monitored. At the time there were not many Kubernetes monitoring services, and how to quickly communicate the state of each MSA area to practitioners was a big concern. I considered overseas monitoring products, but ultimately chose WhaTap. The main reason was its intuitive and familiar UI. I believe an APM tool should be easy for developers, and anyone else, to access and check. WhaTap's intuitive dashboard and familiar UI are powerful because they let us check and share issues quickly.

How have you been using WhaTap since you adopted it?

There are three features we use a lot. The first is the dashboard. We monitor a dashboard that graphs data such as the number of payments and the number of orders, which are the main business indicators of the LOTTE ON service. This helps LOTTE ON respond quickly when other business services are affected. For example, if orders fail because of an issue at a credit card company, we can handle it quickly by restricting that payment method.

The second is the flexible alarm function. Because our service is split into MSA services with a separate person in charge of each, each one also has its own Slack channel. Thresholds can be set per MSA service and mapped to the matching channel, so each person in charge receives only the notifications relevant to them.
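The per-service routing described above can be sketched as follows. This is a minimal illustration, not WhaTap's configuration format or API: the service names, metric names, threshold values, and channel names are all hypothetical, and actually sending the message to Slack is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertRule:
    service: str      # MSA service name (hypothetical)
    metric: str       # metric name (hypothetical)
    threshold: float  # alert when the observed value exceeds this
    channel: str      # Slack channel of the owning team (hypothetical)


# One rule per service, each pointing at that team's own channel.
RULES = [
    AlertRule("order", "error_rate", 0.05, "#order-alerts"),
    AlertRule("payment", "error_rate", 0.02, "#payment-alerts"),
    AlertRule("coupon", "heap_used_ratio", 0.85, "#coupon-alerts"),
]


def route_alerts(samples, rules=RULES):
    """Return (channel, message) pairs for every rule whose threshold is exceeded.

    samples maps (service, metric) to the latest observed value.
    Rules with no matching sample are skipped.
    """
    alerts = []
    for rule in rules:
        value = samples.get((rule.service, rule.metric))
        if value is not None and value > rule.threshold:
            message = (
                f"[{rule.service}] {rule.metric}={value:.2f} "
                f"exceeds {rule.threshold:.2f}"
            )
            alerts.append((rule.channel, message))
    return alerts


if __name__ == "__main__":
    latest = {("order", "error_rate"): 0.01, ("payment", "error_rate"): 0.04}
    for channel, message in route_alerts(latest):
        print(channel, message)
```

Keeping the channel on the rule itself means a new MSA service only needs one new entry, and no team receives alerts for a service it does not own.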

The third and final one is the statistics and report function. The exceptions that occur in each MSA service can be checked on a weekly basis, and we share them with each person in charge so they can review and follow up on areas requiring action. Beyond preventing issues, this statistical information is also very helpful for analyzing the cause of a problem, because it shows which exceptions occurred most often at the time of a failure.
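As a rough illustration of this kind of weekly exception summary, the sketch below groups exception events by ISO week and service and lists the most frequent exception types. The event tuple layout and the example names are assumptions made for the example; it does not reflect how WhaTap stores or exports its statistics.

```python
from collections import Counter, defaultdict
from datetime import date


def weekly_exception_report(events, top_n=3):
    """Count exception types per (ISO year, ISO week, service).

    events is an iterable of (day, service, exception_name) tuples,
    where day is a datetime.date. Returns a dict mapping each
    (year, week, service) key to its top_n most common exceptions.
    """
    buckets = defaultdict(Counter)
    for day, service, exception_name in events:
        iso = day.isocalendar()  # (ISO year, ISO week, ISO weekday)
        buckets[(iso[0], iso[1], service)][exception_name] += 1
    return {key: counts.most_common(top_n) for key, counts in buckets.items()}


if __name__ == "__main__":
    sample = [
        (date(2024, 12, 2), "order", "TimeoutError"),
        (date(2024, 12, 3), "order", "TimeoutError"),
        (date(2024, 12, 3), "order", "ValueError"),
        (date(2024, 12, 10), "coupon", "MemoryError"),
    ]
    for key, top in weekly_exception_report(sample).items():
        print(key, top)
```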

Please tell us about your experience solving problems with WhaTap.

I can share two experiences here. The first was an issue during a point-earning event. Customers who wrote reviews for specific products were supposed to earn points, but the event settings were incorrect and the event was applied to all products, so writing a review for any product earned 3,000 points. As word spread rapidly across online community sites, a large amount of traffic arrived almost instantly. As a result, CPU usage on a specific Pod rose sharply, transactions could not be processed, and delays occurred.

At the time, we resolved the situation by making good use of WhaTap's EKS Pod monitoring dashboard. The dashboard shows the resource status of each Pod container in real time, so we could monitor the situation intuitively. Because thresholds can be set individually, alarms fire according to the configured conditions and the responsible practitioner can check immediately, which is very helpful. Using the statistical indicators, we also identified the specific URLs that were causing problems at the time and put measures in place to control those URLs immediately if a similar issue occurs in the future.

The second issue occurred during a large-scale event held about twice a year. The event lasts about a week, and a large number of coupons are issued to customers during that period. As the number of coupons held by customers increased, the logic that applies the maximum discount at checkout came under heavy load. As a result, OOM errors occurred on the Pods related to discount application. For this issue we used WhaTap's heap monitoring. When I checked the graph comparing the time of the issue with normal operation, the problem was that the SQL fetch count had increased significantly. By monitoring the current SQL fetch count against the count at the time of the failure, we were able to respond so that the same issue did not recur. We also set a heap memory threshold for each container so that action can be taken immediately when signs of OOM appear, and this has been useful ever since.
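The two checks mentioned here, comparing a metric at incident time against its usual level and alerting on a heap usage threshold, can be sketched in a few lines. The function names, the sample counts, and the 0.85 threshold are hypothetical values chosen for illustration and are not taken from LOTTE ON's actual settings.

```python
from statistics import mean


def fetch_ratio(baseline_counts, incident_counts):
    """Return how many times larger the mean incident count is than the baseline mean."""
    baseline = mean(baseline_counts)
    if baseline == 0:
        raise ValueError("baseline mean must be non-zero")
    return mean(incident_counts) / baseline


def heap_alert(used_bytes, max_bytes, threshold=0.85):
    """Return True when heap usage is at or above the given fraction of the maximum."""
    return used_bytes / max_bytes >= threshold


if __name__ == "__main__":
    usual = [100, 120, 80]      # SQL fetch counts per interval on a normal day (made up)
    incident = [900, 1100]      # SQL fetch counts per interval during the failure (made up)
    print(f"fetch count ratio: {fetch_ratio(usual, incident):.1f}x")
    print("heap alert:", heap_alert(used_bytes=900, max_bytes=1000))
```

A ratio well above 1 points at the kind of fetch-count spike described above, while the heap check is meant to fire before the container actually runs out of memory.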

Management: WhaTap customer support services and future plans

How do you plan to use WhaTap in the future?

Beyond the examples above, we have run into a variety of issues since moving to an MSA structure. With MSA, multiple services are involved in a single transaction, so potential risk factors are scattered across them. In an architecture with this much added complexity, simply detecting issues on an APM dashboard is often not enough, and many kinds of analysis are required. This is the mission of moving from simple monitoring to observability. To advance toward observability, LOTTE ON uses all of the metrics collected by WhaTap and is actively working to secure visibility. We continue to collaborate with WhaTap engineers so that we can make use of additional metrics.

Please tell us why companies should use WhaTap's monitoring solution.

Renowned business scholar Peter Drucker said, “If you can't measure, you can't manage.” Whether you are monitoring for basic quality management or going further and applying observability, you need to collect as much data as possible and use it in the right place. To secure and make use of that observability, WhaTap's monitoring solution is essential.

WhaTap, the integrated monitoring platform trusted by over 1,200 companies. Experience it today.