Alibaba Cloud is the largest cloud computing company in China. It integrates Alluxio with its OSS (Object Storage Service) and leverages Alluxio as a fast data-access layer on top of OSS.
Arimo leverages Alluxio’s in-memory capability, improving time-to-results for deep learning models by up to 60%.
Baidu uses Alluxio to run fast SQL queries over globally distributed databases. Its petabyte-scale data is distributed across multiple data centers; Alluxio accelerates remote data access and keeps frequently used "hot" data local to the compute nodes.
Barclays describes how they iteratively process raw data directly from the central data warehouse into Spark and how Alluxio is their key enabling technology.
Bazaarvoice is a digital marketing company based in Austin, Texas. It stores massive amounts of data on AWS S3 and leverages Alluxio in production to speed up its big data analytics. In this architecture, Alluxio enables data locality and data caching and addresses the semantic differences of AWS S3 storage, achieving a 5-10x speedup for Hive queries running on S3.
Comcast brings Alluxio into its framework stack for operationalizing predictive ML models, to improve customer experience and eliminate bottlenecks in the process from model inception to deployment and monitoring. Alluxio provides the universal data plane in the stack on top of various under-stores (e.g., S3, HDFS, RDBMS).
Cray has fused supercomputing with an open, standards-based framework to deliver an industry first: the Cray® Urika®-GX agile analytics platform. Alluxio provides a unified view of enterprise data that spans disparate storage systems, locations and clouds, allowing any big data compute framework to access stored data at memory speed. Alluxio gives Cray customers the ability to co-locate compute and data with memory-speed access to data while virtualizing across different storage systems under a unified namespace.
Ctrip is a leading Chinese provider of travel services, including accommodation reservation and transportation ticketing. It uses Alluxio to boost the performance of Spark SQL workloads and relieve pressure on the HDFS NameNode. In addition, Alluxio is deployed as the single entry point that unifies two HDFS clusters.
Didi Chuxing is a major Chinese ride-sharing, AI and autonomous technology company. It leverages Alluxio for several purposes inside its data analytics platform: (1) accelerating data access from remote data centers, (2) integrating data from several sources across different data centers, and (3) sharing data across jobs and compute frameworks.
ESRI leverages Alluxio in its mapping and spatial analytics software to read and write geospatial data to a plethora of distributed data stores, such as Amazon S3, HDFS, or OpenStack Swift, including data stores that are not natively supported by the ArcGIS platform.
Guardant Health is the world leader in comprehensive liquid biopsy. With Alluxio, Minio, and Spark, Guardant Health has built a performant, robust, and scalable system for large-scale data processing in a cloud-native manner.
Huatai deploys Alluxio Enterprise as the storage layer that unifies data from disparate sources at memory speed, providing high performance and a predictable SLA for leveraging even petabytes of data.
Huawei has partnered with Alluxio to release a big data storage acceleration solution that integrates Huawei’s FusionStorage with Alluxio’s memory-speed virtual distributed storage system. The solution aims to unify data management, improve analysis efficiency, speed up application performance, and popularize big data for processes including storage, analysis, and archiving.
IBM deploys Alluxio over Swift and SoftLayer to build a flexible and efficient big data analytics platform.
Intel uses Alluxio in several scenarios: to share data across different applications and computing frameworks, to reduce applications' memory consumption and GC overhead, and to cache remote data by acting as a local storage manager.
JD.com is China’s largest online retailer. It uses Alluxio to provide support for ad hoc and real-time stream computing, using Alluxio-compatible HDFS URLs and Alluxio as a pluggable optimization component. One of its computing frameworks, JDPresto, has gained a 10x performance improvement on average by deploying Alluxio.
Kyligence is a big data intelligence company that offers solutions for big data analytics. Alluxio enables effective data management across different storage systems through its transparent naming and mounting API. With Alluxio, the Kyligence Analytics Platform achieves a good balance between performance, cost, and management effort in the cloud.
Lenovo is the number one manufacturer of personal computers and one of the largest smartphone vendors in the world. With Alluxio, it can now seamlessly and securely access data from data centers worldwide without labor-intensive and error-prone ETL, making that data available at in-memory speeds to analytics running in a single data center.
Lucidworks leverages Alluxio in the cloud to accelerate remote Solr data access and cloud recovery.
MOMO is a leading mobile pan-entertainment social platform in China. It leverages Alluxio with Spark SQL to speed up ad hoc analysis.
Myntra is a leading Indian fashion e-commerce marketplace focused on a mobile app business model in which customers buy and transact on its site through smartphones. By adding Alluxio to its data processing pipeline, Myntra has improved the customer experience and reduced the time needed to generate actionable business intelligence from its data. Alluxio is a critical component of this architecture, and the Myntra team has contributed to Alluxio open source by documenting how to use Alluxio with Azure Blob Store for other users.
Nvidia leverages Alluxio as part of its GPU-accelerated data analytics framework to manage different storage systems and provide quick and easy access to information within various data lakes.
Oracle's Big Data File System (BDFS) is based on Alluxio. It is designed to accelerate data access for data pipelines, with features that significantly improve the runtime performance of Spark applications. BDFS accelerates data access to and from Oracle Cloud Infrastructure Object Storage Classic by providing an active caching layer.
PerceptIn designed and implemented a cloud architecture using Alluxio that manages an enormous amount of incoming data in different storage systems with high throughput and low latency.
Qunar leverages Alluxio in production to boost the performance of real-time data analytics, resulting in a 15x speedup on average. In addition, it uses Alluxio's unified namespace to enable different applications and frameworks to easily interact with data from different storage systems.
Samsung uses Alluxio with the different storage media available in systems, including NVMe SSDs, while providing in-line performance consistent with the speed of the underlying storage media.
SF Express (Group) Co., Ltd. is one of the two leading couriers in China, and provides domestic and international express delivery solutions to a wide array of customers. SF Express leverages Alluxio as a data acceleration module for Spark, and uses Alluxio's pinning feature to pin its critical transaction data in memory. As a result, query completion time has dropped from tens of seconds to under ten seconds.
Suning is one of the largest non-government retailers in China. It uses Alluxio to unify storage systems and manage multiple HDFS clusters.
TalkingData is China’s largest data broker, covering more than 600 million smart devices on a monthly basis. TalkingData leverages Alluxio as a single platform to manage all the data across disparate data sources on premises and in the cloud, removing the complexity of the big data environment by masking the different data sources and providing a unified interface.
Tencent is one of the largest technology companies in the world and a leader in multiple sectors such as social networking, gaming, e-commerce, mobile and web portal. Tencent News leverages Alluxio with Apache Spark to create a scalable, robust, and performant architecture to provide the best experience to more than 100 million monthly active users of Tencent News.
Vipshop is a leading online retailer in China. In a data-driven industry such as online retail, achieving timely insights through real-time data analytics is key to boosting sales and keeping customers happy. Vipshop processes and analyzes petabytes of data with advanced machine learning and analytics methods to answer complex questions such as how users are behaving, why a purchase was made, and which ads and recommendations are most effective. With Alluxio, Vipshop can access, store, and manage data across disparate storage systems on premises and in the cloud at memory speed.
Wells Fargo uses Alluxio in its data preparation and data exploration pipeline, where data is merged based on context and analytical needs. Alluxio helps accelerate access to large datasets for Spark, which would otherwise take dozens of minutes to load per iteration. With Alluxio, data is loaded once and can be served from memory for subsequent accesses. By using Alluxio, Wells Fargo saves hours in workload processing time.