Big Data Cluster

Avoid Storage Bottlenecks and Improve Analytics

Don’t Let Legacy Storage Stop You from Delivering Results

Workloads that extract value from massive data sets with accelerated computing (HPC or AI/ML), while highly desirable, can suffer from storage bottlenecks and poor performance. Even if you deploy all-flash, relying on DAS or NAS can bring additional challenges. The Big Data Cluster removes bottlenecks with a shared pool of NVMe over Fabrics (NVMe-oF) storage that enables jobs to run up to 10x faster, while an optional S3-compliant tier helps you control costs.

Ideal Use Cases

  • Business Intelligence
  • Real-time HPC Analytics
  • ML Training
  • Predictive Analytics
  • Data Visualization

Relevant Industries

  • Oil & Gas
  • Financial Services
  • Life Sciences
  • Media & Entertainment
  • Aerospace & Defense

Combining Powerful Compute with Optimized Storage and Networking

  • Software-defined architecture based on the massively scalable, parallel Weka file system (WekaFS), delivered over NVMe over Fabrics (NVMe-oF)
  • Optional S3-compliant tier for cost control (see the sketch after this list)
  • Scale up and scale out with simple, pre-defined storage building blocks (6+ server configuration)
  • Ideal for HPC and AI training workloads that depend on large datasets
  • Storage in the Enterprise and Data Center SSD Form Factor (EDSFF)
  • AMD EPYC-based, with PCIe 4.0 support
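
To make the optional S3-compliant tier concrete, here is a minimal sketch of how an application might archive cold data to it using the standard boto3 SDK. The endpoint URL, bucket name, and credentials are hypothetical placeholders, not product defaults.

    import boto3

    # Point the standard AWS SDK at the cluster's S3-compliant tier.
    # Endpoint, credentials, and bucket below are illustrative placeholders.
    s3 = boto3.client(
        "s3",
        endpoint_url="https://s3.bigdata-cluster.example.com",
        aws_access_key_id="ACCESS_KEY",
        aws_secret_access_key="SECRET_KEY",
    )

    # Push a cold dataset down to the capacity tier to control costs...
    s3.upload_file("results/run-042/checkpoint.bin", "archive", "run-042/checkpoint.bin")

    # ...and verify what is stored there.
    for obj in s3.list_objects_v2(Bucket="archive").get("Contents", []):
        print(obj["Key"], obj["Size"])

Because the tier speaks the S3 API, existing tools and pipelines built for object storage work unchanged against it.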

Inside the Big Data Cluster

Compute

AMD EPYC Processors

Networking

NVIDIA Networking Ethernet and InfiniBand

Storage

425TB of Capacity per Node

Software-defined storage using the Weka file system (WekaFS), optimized for large datasets

High-speed, low-latency NVIDIA adapters
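
Because WekaFS presents as a POSIX file system, applications can use ordinary file I/O with no special client library. A minimal sketch, assuming the file system is mounted at the hypothetical path /mnt/weka with an illustrative dataset name:

    import numpy as np

    # The mount point and file name are illustrative placeholders.
    DATASET = "/mnt/weka/training/features.npy"

    # Memory-map the array so a dataset larger than RAM is paged in
    # on demand over the NVMe-oF fabric rather than loaded up front.
    features = np.load(DATASET, mmap_mode="r")
    print(features.shape, features.dtype)

    # Slices are fetched lazily, which suits epoch-style ML training loops.
    batch = features[:1024]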


About Weka Software-Defined Storage

Weka’s flagship offering, WekaFS, is an alternative to parallel file systems such as IBM Spectrum Scale (GPFS) and Lustre. Legacy storage designs forced customers to deploy a different architecture for each class of workload; WekaFS was built from the ground up to address the diverse requirements of modern workloads in a single system. It enables clients to pool all of their data and manage it through a global namespace. By dramatically simplifying storage administration, the Big Data Cluster lets users easily access and manage data at scale and deliver better outcomes.

Are you trying to remove storage bottlenecks in your accelerated workloads?