๐Ÿ€Zerve chosen as NCAA's Agentic Data Platform for 2026 Hackathonยท๐ŸงฎMeet the Zerve Team at Data Decoded Londonยท๐Ÿ“ˆWe're hiring โ€” awesome new roles just gone live!
Back
Numpy

MemoryError: Unable to Allocate Array - How to Fix It

Answer

This error means NumPy tried to create an array larger than your available RAM. Fix it by using a smaller dtype (e.g., float32 instead of float64), processing data in chunks, using memory-mapped files with np.memmap, or switching to a machine with more memory.

Why This Happens

A standard NumPy array lives entirely in RAM. A 10,000 x 10,000 array of float64 uses 800 MB; scale that to 100,000 x 100,000 and you need 80 GB, which is the 74.5 GiB the error message reports in binary units. If your machine doesn't have that much free memory, the allocation fails before the array even exists.
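
You can verify that arithmetic without allocating anything, because an array's size is fully determined by its shape and the dtype's itemsize:

import numpy as np

# rows * cols * bytes per element; no allocation needed
n_bytes = 10_000 * 10_000 * np.dtype(np.float64).itemsize
print(f"{n_bytes / 1e6:.0f} MB")             # 800 MB
print(f"{n_bytes * 100 / 1024**3:.1f} GiB")  # 74.5 GiB at 100,000 x 100,000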

Solution

The rule: calculate expected memory usage before creating large arrays, then pick the fix that matches your use case: a smaller dtype, memory mapping, chunking, or sparse arrays.

import numpy as np

# โŒ Problematic: array too large for memory
huge_array = np.zeros((100000, 100000), dtype=np.float64)
# MemoryError: Unable to allocate 74.5 GiB

# ✅ Fixed: use smaller dtype
huge_array = np.zeros((100000, 100000), dtype=np.float32)  # half the memory

# ✅ Even smaller: use int8 or float16 if precision allows
huge_array = np.zeros((100000, 100000), dtype=np.float16)  # quarter the memory

# ✅ Fixed: use memory-mapped file (data lives on disk, not RAM)
huge_array = np.memmap('temp_array.dat', dtype=np.float32, 
                        mode='w+', shape=(100000, 100000))
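
# The memmap writes back to the file on disk; flush() forces pending writes
huge_array[:100] = 1.0
huge_array.flush()

# Reopen later in read-only mode; only the pages you touch get loaded
readback = np.memmap('temp_array.dat', dtype=np.float32,
                     mode='r', shape=(100000, 100000))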

# ✅ Fixed: process in chunks instead of all at once
chunk_size = 10000
for i in range(0, 100000, chunk_size):
    chunk = np.zeros((chunk_size, 100000), dtype=np.float32)
    # process chunk
    del chunk  # free memory

# ✅ Check memory before allocating
array_size_gib = (100000 * 100000 * 8) / (1024**3)  # float64 = 8 bytes per element
print(f"Array would use {array_size_gib:.1f} GiB")

# ✅ Use sparse arrays if most values are zero
from scipy import sparse
sparse_array = sparse.csr_matrix((100000, 100000))
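
One caveat on the sparse route: CSR is fast for arithmetic and row slicing but slow for element-by-element writes. A common pattern, sketched here, is to build in LIL format and convert once at the end:

import numpy as np
from scipy import sparse

# LIL supports cheap incremental assignment
builder = sparse.lil_matrix((100000, 100000), dtype=np.float32)
builder[0, 0] = 1.0
builder[42, 7] = 3.5

# Convert to CSR once you're done writing
sparse_array = builder.tocsr()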

Better Workflow

Zerve runs each block on cloud-based serverless compute with up to 30 GB per execution. Instead of fighting your laptop's 8-16 GB RAM (minus what the OS needs), you can run memory-intensive operations on infrastructure built to handle them. Break large arrays into chunks across parallel blocks, each running on separate compute instances. No hardware limits, no MemoryError.
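
To prototype that pattern locally first, here is a minimal sketch using Python's standard library rather than Zerve's blocks; the file name, shape, and chunk size are placeholders carried over from the examples above. Each worker memory-maps the array read-only and reduces only its own slice of rows:

import numpy as np
from concurrent.futures import ProcessPoolExecutor

SHAPE = (100000, 100000)
CHUNK = 10000

def chunk_sum(start):
    # Each worker opens the file read-only; only touched pages are loaded
    arr = np.memmap('temp_array.dat', dtype=np.float32, mode='r', shape=SHAPE)
    return float(arr[start:start + CHUNK].sum())

if __name__ == '__main__':
    with ProcessPoolExecutor() as pool:
        total = sum(pool.map(chunk_sum, range(0, SHAPE[0], CHUNK)))
    print(total)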
