Early this April, SenseTime (商汤科技) announced a fresh funding round of $600 million, just nine months after its then record-breaking (in the AI scene) Series B round of $410 million. The latest financing brought in new shareholders including Alibaba, Suning (苏宁) and Singapore’s sovereign fund Temasek.
Recently, XU Li, SenseTime’s CEO, sat down for an exclusive interview with 36Kr, a Chinese biztech media outlet and the parent company of KrASIA. He talked about the company’s future strategies, the competitive landscape in computer vision this year, and what he thinks a world increasingly driven by data will look like.
The following is an English translation of the transcript of the interview. Some parts have been edited by 36Kr and KrASIA for brevity and clarity.
This is Part 2 of a two-part series. Read Part 1 here.
Application of AI in four vertical focuses
Q: Facing increasingly fierce competition, do you feel pressured about expanding the application of your AI solutions this year?
A: Honestly, I’m always under pressure. The good thing, though, is that we’ve focused on the right thing every step of the way. Our sales figures have kept climbing, and our revenue has grown by 400% for three years in a row. The company turned a profit last year and saw year-on-year growth of nearly 300% in Q1 2018.
Our four vertical focuses (smart city, mobile application, smartphone-oriented application and vehicle-oriented application) have all registered dramatic growth, with each vertical generating about 100 million yuan in revenue. The growth rate is projected to rise further this year.
Q: Which one of the verticals presents the most challenges for AI application?
A: The application of AI in the four verticals has actually been rather smooth, except that vehicle-oriented application needs a longer time frame. Applying AI to self-driving vehicles is about more than offering some algorithm-powered systems, so mass adoption will most certainly take time. That being said, we can start by offering integration services and eventually come up with a suitable product mix that encompasses all driver-assistance solutions. Now that our ship has sailed in this area, we must focus on sales. That’s also where our investment will go this year. SenseTime’s way of expanding in a vertical is “dive into it first, then talk about building the product mix”.
Q: What emerging AI application scenarios are you most interested in this year?
A: I have always been bullish on the application of AI in video recognition. In two years, video recognition may evolve to meet industrial application standards. It wasn’t applied earlier because the technology was not sophisticated enough for real-life scenarios. Realizing concepts like “what you see is what you get” or “the process from seeing to knowing to purchasing” depends on video recognition technology coming to fruition. Additionally, the algorithms for AR and VR products are bound to be refined by AI. For scenario recognition, the norm in the past was to rely on algorithms configured by human staff, but that is now powered by big data. This will definitely generate some completely different results.
We also created an open AR platform together with OPPO, which may reshape the future of the games and interaction market. If everything goes well, a hit AR game should be rolled out in the next one or two years.
Q: The unmanned Suning Sports Biu store that SenseTime helped build in Nanjing in August 2017 has been in operation for over six months now. How is it performing so far?
A: Quite well, actually. The six Biu stores, including the one mentioned above, have so far achieved a conversion rate of over 70%. This year, we expect to open 200 more stores of this sort. Overall, the earlier Biu stores are now operating with healthy customer flow and efficiency, and human staff intervention is rare. We chose to start with clothes because clothes come with their own RFID labels, which means no additional expense is needed for digitizing labels. All you have to do is put the clothes in the store and the rest is automated. The problem with selling other goods in unmanned Biu stores is that their labels would have to be digitized first.
Q: The idea of a smart city sounds a bit distant. Which areas will represent a smart city first?
A: A smart city, in my view, means big data-powered city management. The big data in this case includes data from public sectors such as transportation, finance, healthcare and consumption, as well as surveillance data, signal data and audio and video data. Applying big data solutions, including data recognition and data processing, to traffic control or city management holds huge growth potential.
Q: The data of mass profile and personal data are currently controlled separately by different operators. What is the challenge in generating a general profile?
A: The challenge lies in two aspects: first, the data are generated in different scenarios; second, data standards are not uniform. Overcoming this challenge calls for the concerted efforts of all relevant operators.
In addition, visual information, I think, can be captured through three mediums. First, smartphones and other mobile devices such as tablets and mobile cameras. In the future, everything, whether it is a human face or another object, can be coded and transmitted between devices. Second, fixed security surveillance cameras in places such as buildings and airports. In the future, these cameras will definitely be interconnected. Third, satellite cameras, which capture data from a holistic, worldwide angle. In the future, data will be able to flow freely among different spaces, eventually enabling the idea of the smart city.
How to make money from the digitization of the physical world?
Q: What role does AI play in the digitization of the physical world?
A: AI, I think, serves two purposes: forecasting enabled by mass data, and differentiated services.
More data often means more accurate forecasts. This is a benefit of the data of mass profile. In a smart city, for instance, we can use big data to predict the passenger flow at an entrance, and its direction, over the next 24 hours and, in turn, guide the traffic. Differentiated services are a benefit that comes with the application of personal data.
And by leveraging the data of mass profile, individual experience can also be enhanced. For example, with an accurate prediction of the traffic conditions of the road ahead, the big data-powered system can then recommend the optimal route for a person based on her or his preferences and planned stops. This represents the combined use of data of mass profile and personal data. Big data, obviously, is the combination of both data of mass profile and personal data.
Q: What role does SenseTime play in the commercialization of big data?
A: We provide standardized technology platforms, which are basically the infrastructure for big data. You can think of it as a “universal language”; it’s just that the applications vary from business to business. Businesses using our platform can standardize the definitions (such as the nature of an item) and descriptions (such as hair color, hometown and clothes) of varied things including humans, vehicles and other objects. This information on the platform could translate into various kinds of applications.
Q: Does this make widening the application of your standardized platform a priority at the moment? As your platforms penetrate a wider range of application scenarios, will you build a barrier to entry?
A: Well, that depends on whether the standardized platform is applicable to a wide array of scenarios. If it is, then there is no question that it can be introduced to the widest array of scenarios. It is up to how you look at it.
Q: To attain a comprehensive profile and to benefit businesses overall, data integration and sharing is essential. Which company will spearhead this undertaking?
A: This is a particularly tough task, even for BAT. Enterprises may not be willing to just hand over their data. To that end, a public data-sharing platform is needed, so the government might play a central role. The government possesses a mammoth amount of data, so it can, after data masking, put this data on an open platform for the use of the public while at the same time encouraging them to contribute their own data. This will help create more applications.
This is not an undertaking that a single leading organization can accomplish alone. It entails voluntary participation and exchange out of real needs. It can’t be forced. In short, data sharing is not that simple. That said, we can at least build a way of communicating first by establishing a universal “language”. This is what we are doing right now.
Q: As for data of mass profile and personal data, which one of them is ahead in terms of commercialization?
A: The answer is certainly data of mass profile, because it involves no personal data and can be managed more efficiently.
The sensitive nature of personal data makes it hard to commercialize. I am not saying that its monetization is entirely impossible. For example, a company may know the location of a person’s residence and the fact that the residence’s door is broken, and then go on to promote doors to the owner. But this coincidence may startle the person so much that he or she suspects the residence is under constant surveillance.
This simply can’t form a closed-loop business model. The actual personal profile (user profile) is devoid of any identity information; all it includes is the user’s attributes. For example, a person may look more like someone from the northeast of China. Based on that information, a company can then promote dumplings to him or her. As for who the person is, that is not specified in the user profile.