| author | Erik Garrison <erik.garrison@gmail.com> | 2023-12-21 13:45:32 -0600 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2023-12-21 21:45:32 +0200 |
| commit | 0f630fbc924aaabeea6eaf466bb4b47d13015c3e (patch) | |
| tree | c1122dc11bd06d72fa4697dde3e1b0d45d874a29 /examples/export-lora | |
| parent | 562cf222b5129e40b312877e928eac3a02e4ec33 (diff) | |
cuda : ROCm AMD Unified Memory Architecture (UMA) handling (#4449)
* AMD ROCm: handle UMA memory VRAM expansions
This resolves #2797 by allowing ROCm users on AMD GPUs with a Unified
Memory Architecture (UMA) to dynamically expand the VRAM available to
the GPU.
Without this, AMD ROCm users with shared CPU/GPU memory are usually
stuck with the BIOS-set (or otherwise fixed) framebuffer VRAM, making
it impossible to load more than 1-2 layers.
Note that the model is currently duplicated in RAM: it is loaded once
for the CPU and then copied into a second set of allocations managed by
the HIP UMA system (see the allocation sketch after this list). We can
fix this later.
* clarify the build process for ROCm on Linux with CMake
* avoid using the deprecated ROCm hipMallocHost (its replacement is shown in the sketch below)
* keep simplifying the change required for UMA
* cmake: enable UMA-compatible allocation when LLAMA_HIP_UMA=ON (see the configuration sketch below)
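
A minimal sketch of the idea behind the UMA handling, assuming only a ROCm install with hipcc: it shows the standard HIP managed-memory calls (hipMallocManaged, hipFree), not the actual ggml-cuda code path, and the file name and buffer size are made up for illustration.

```cpp
// Minimal UMA allocation sketch (illustrative only, not the llama.cpp code path).
// Assumed build command: hipcc uma_sketch.cpp -o uma_sketch
#include <hip/hip_runtime.h>
#include <cstdio>
#include <cstdlib>
#include <cstring>

#define HIP_CHECK(call)                                                      \
    do {                                                                     \
        hipError_t err_ = (call);                                            \
        if (err_ != hipSuccess) {                                            \
            std::fprintf(stderr, "HIP error: %s (%s:%d)\n",                  \
                         hipGetErrorString(err_), __FILE__, __LINE__);       \
            std::exit(1);                                                    \
        }                                                                    \
    } while (0)

int main() {
    const size_t nbytes = size_t(512) * 1024 * 1024; // 512 MiB buffer, made-up size

    // Managed (UMA) allocation: the HIP driver may back these pages with
    // ordinary system RAM, so the buffer is not capped by the BIOS-set
    // framebuffer carve-out described in the commit message.
    void * buf = nullptr;
    HIP_CHECK(hipMallocManaged(&buf, nbytes, hipMemAttachGlobal));

    // The same pointer is valid on host and device; touch it from the host.
    std::memset(buf, 0, nbytes);

    HIP_CHECK(hipFree(buf));
    std::printf("allocated and freed %zu bytes of managed memory\n", nbytes);
    return 0;
}
```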
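
On the deprecated hipMallocHost mentioned above: HIP's replacement for pinned host allocation is hipHostMalloc, which takes an explicit flags argument. A hedged, self-contained sketch of the substitution (sizes and error handling are illustrative):

```cpp
// Pinned host allocation sketch: hipMallocHost is deprecated, hipHostMalloc
// is its replacement. Assumed build command: hipcc pinned_sketch.cpp -o pinned_sketch
#include <hip/hip_runtime.h>
#include <cstdio>

int main() {
    const size_t nbytes = 16 * 1024 * 1024; // illustrative size
    void * pinned = nullptr;

    // Deprecated:  hipMallocHost(&pinned, nbytes);
    // Replacement: hipHostMalloc with an explicit flags argument.
    if (hipHostMalloc(&pinned, nbytes, hipHostMallocDefault) != hipSuccess) {
        std::fprintf(stderr, "pinned host allocation failed\n");
        return 1;
    }

    // ... the pinned buffer can now back fast host<->device copies ...

    hipHostFree(pinned);
    return 0;
}
```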
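
How LLAMA_HIP_UMA=ON plausibly takes effect at build time, as a sketch under stated assumptions: the CMake option is assumed to add a compile definition (named GGML_HIP_UMA below as a placeholder), and an allocation wrapper picks the managed-memory call when that definition is present. A configure line such as `cmake -B build -DLLAMA_HIPBLAS=ON -DLLAMA_HIP_UMA=ON` matches the flag spelling in the commit message; LLAMA_HIPBLAS is assumed to be the existing HIP backend switch.

```cpp
// Illustrative allocation wrapper. GGML_HIP_UMA is assumed to be the compile
// definition enabled by LLAMA_HIP_UMA=ON -- treat the macro name as a placeholder.
#include <hip/hip_runtime.h>

hipError_t alloc_device_buffer(void ** ptr, size_t size) {
#if defined(GGML_HIP_UMA)
    // UMA build: managed memory that can spill into system RAM, so it is not
    // limited by the fixed framebuffer VRAM.
    return hipMallocManaged(ptr, size, hipMemAttachGlobal);
#else
    // Default build: plain device allocation, limited to dedicated VRAM.
    return hipMalloc(ptr, size);
#endif
}

int main() {
    void * buf = nullptr;
    if (alloc_device_buffer(&buf, 64 * 1024 * 1024) == hipSuccess) {
        hipFree(buf); // hipFree releases managed and plain device buffers alike
    }
    return 0;
}
```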
Diffstat (limited to 'examples/export-lora')
0 files changed, 0 insertions, 0 deletions