| author | Erik Garrison <erik.garrison@gmail.com> | 2023-12-21 13:45:32 -0600 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2023-12-21 21:45:32 +0200 |
| commit | 0f630fbc924aaabeea6eaf466bb4b47d13015c3e (patch) | |
| tree | c1122dc11bd06d72fa4697dde3e1b0d45d874a29 /examples/export-lora | |
| parent | 562cf222b5129e40b312877e928eac3a02e4ec33 (diff) | |
cuda : ROCm AMD Unified Memory Architecture (UMA) handling (#4449)
* AMD ROCm: handle UMA memory VRAM expansions
This resolves #2797 by allowing ROCm users on AMD GPUs with a Unified
Memory Architecture (UMA) to dynamically expand the VRAM available to
the GPU.
Without this, AMD ROCm users with shared CPU/GPU memory are usually
stuck with the BIOS-set (or otherwise fixed) framebuffer VRAM, making
it impossible to load more than 1-2 layers.
Note that the model is currently duplicated in RAM: it is loaded once
for the CPU and then copied into a second set of allocations managed by
the HIP UMA system (see the allocation sketch after this list). We can
fix this later.
* clarify the build process for ROCm on Linux with CMake
* avoid using the deprecated ROCm hipMallocHost (its replacement is shown in the sketch below)
* keep simplifying the change required for UMA
* cmake: enable UMA-compatible allocation when LLAMA_HIP_UMA=ON (see the configuration sketch below)
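
A minimal sketch of the idea behind the UMA handling, assuming only a ROCm install with hipcc: it shows the standard HIP managed-memory calls (hipMallocManaged, hipFree), not the actual ggml-cuda code path, and the file name and buffer size are made up for illustration.

```cpp
// Minimal UMA allocation sketch (illustrative only, not the llama.cpp code path).
// Assumed build command: hipcc uma_sketch.cpp -o uma_sketch
#include <hip/hip_runtime.h>
#include <cstdio>
#include <cstdlib>
#include <cstring>

#define HIP_CHECK(call)                                                      \
    do {                                                                     \
        hipError_t err_ = (call);                                            \
        if (err_ != hipSuccess) {                                            \
            std::fprintf(stderr, "HIP error: %s (%s:%d)\n",                  \
                         hipGetErrorString(err_), __FILE__, __LINE__);       \
            std::exit(1);                                                    \
        }                                                                    \
    } while (0)

int main() {
    const size_t nbytes = size_t(512) * 1024 * 1024; // 512 MiB buffer, made-up size

    // Managed (UMA) allocation: the HIP driver may back these pages with
    // ordinary system RAM, so the buffer is not capped by the BIOS-set
    // framebuffer carve-out described in the commit message.
    void * buf = nullptr;
    HIP_CHECK(hipMallocManaged(&buf, nbytes, hipMemAttachGlobal));

    // The same pointer is valid on host and device; touch it from the host.
    std::memset(buf, 0, nbytes);

    HIP_CHECK(hipFree(buf));
    std::printf("allocated and freed %zu bytes of managed memory\n", nbytes);
    return 0;
}
```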
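
On the deprecated hipMallocHost mentioned above: HIP's replacement for pinned host allocation is hipHostMalloc, which takes an explicit flags argument. A hedged, self-contained sketch of the substitution (sizes and error handling are illustrative):

```cpp
// Pinned host allocation sketch: hipMallocHost is deprecated, hipHostMalloc
// is its replacement. Assumed build command: hipcc pinned_sketch.cpp -o pinned_sketch
#include <hip/hip_runtime.h>
#include <cstdio>

int main() {
    const size_t nbytes = 16 * 1024 * 1024; // illustrative size
    void * pinned = nullptr;

    // Deprecated:  hipMallocHost(&pinned, nbytes);
    // Replacement: hipHostMalloc with an explicit flags argument.
    if (hipHostMalloc(&pinned, nbytes, hipHostMallocDefault) != hipSuccess) {
        std::fprintf(stderr, "pinned host allocation failed\n");
        return 1;
    }

    // ... the pinned buffer can now back fast host<->device copies ...

    hipHostFree(pinned);
    return 0;
}
```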
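
How LLAMA_HIP_UMA=ON plausibly takes effect at build time, as a sketch under stated assumptions: the CMake option is assumed to add a compile definition (named GGML_HIP_UMA below as a placeholder), and an allocation wrapper picks the managed-memory call when that definition is present. A configure line such as `cmake -B build -DLLAMA_HIPBLAS=ON -DLLAMA_HIP_UMA=ON` matches the flag spelling in the commit message; LLAMA_HIPBLAS is assumed to be the existing HIP backend switch.

```cpp
// Illustrative allocation wrapper. GGML_HIP_UMA is assumed to be the compile
// definition enabled by LLAMA_HIP_UMA=ON -- treat the macro name as a placeholder.
#include <hip/hip_runtime.h>

hipError_t alloc_device_buffer(void ** ptr, size_t size) {
#if defined(GGML_HIP_UMA)
    // UMA build: managed memory that can spill into system RAM, so it is not
    // limited by the fixed framebuffer VRAM.
    return hipMallocManaged(ptr, size, hipMemAttachGlobal);
#else
    // Default build: plain device allocation, limited to dedicated VRAM.
    return hipMalloc(ptr, size);
#endif
}

int main() {
    void * buf = nullptr;
    if (alloc_device_buffer(&buf, 64 * 1024 * 1024) == hipSuccess) {
        hipFree(buf); // hipFree releases managed and plain device buffers alike
    }
    return 0;
}
```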
Diffstat (limited to 'examples/export-lora')
0 files changed, 0 insertions, 0 deletions