mlc-llm
Introduction
from set_env import temp_dir
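MLC LLM is a machine learning compiler and high-performance deployment engine built for large language models. The project's mission is to enable everyone to develop, optimize, and deploy AI models natively on their own platforms.
Download the model: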
# git clone https://huggingface.co/mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC
git clone https://hf-mirror.com/mlc-ai/Hermes-3-Llama-3.1-8B-q4f32_1-MLC {temp_dir}/mlc-ai/Hermes-3-Llama-3.1-8B-q4f32_1-MLC
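The same clone can be scripted from Python. The sketch below only builds the `git clone` argument list for the repository used above. The helper name `build_clone_command`, its parameters, and the `/tmp/models` destination are hypothetical and not part of mlc_llm. Running the command assumes `git` is installed, along with Git LFS, which Hugging Face model repositories use for weight files.

```python
from pathlib import Path
from typing import List


# Hypothetical helper (not part of mlc_llm): builds the argument list for
# the `git clone` command shown above.
def build_clone_command(
    repo_id: str,
    dest_root: str,
    host: str = "https://hf-mirror.com",
) -> List[str]:
    url = f"{host.rstrip('/')}/{repo_id}"
    # Mirror the repository id under the destination root, as in the command above.
    dest = Path(dest_root) / repo_id
    return ["git", "clone", url, dest.as_posix()]


cmd = build_clone_command("mlc-ai/Hermes-3-Llama-3.1-8B-q4f32_1-MLC", "/tmp/models")
print(" ".join(cmd))
# To actually run the clone: subprocess.run(cmd, check=True)
```

Passing `host="https://huggingface.co"` would target the original hub instead of the mirror.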
Here is a hello world example:
from mlc_llm import MLCEngine

# Create engine
# model = "HF://mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC"  # original model address
model = f"{temp_dir}/mlc-ai/Hermes-3-Llama-3.1-8B-q4f32_1-MLC"
engine = MLCEngine(model)

# Run chat completion through the OpenAI-style API.
for response in engine.chat.completions.create(
    messages=[{"role": "user", "content": "What is the meaning of life?"}],
    model=model,
    stream=True,
):
    # Each streamed chunk carries the newly generated text in choice.delta.content.
    for choice in response.choices:
        print(choice.delta.content, end="", flush=True)
print("\n")

engine.terminate()
[17:26:20] /media/pc/data/lxw/ai/mlc-llm/cpp/serve/config.cc:797: Under mode "local", max batch size will be set to 4, max KV cache token capacity will be set to 8192, prefill chunk size will be set to 2048.
[17:26:20] /media/pc/data/lxw/ai/mlc-llm/cpp/serve/config.cc:797: Under mode "interactive", max batch size will be set to 1, max KV cache token capacity will be set to 117943, prefill chunk size will be set to 2048.
[17:26:20] /media/pc/data/lxw/ai/mlc-llm/cpp/serve/config.cc:797: Under mode "server", max batch size will be set to 80, max KV cache token capacity will be set to 116692, prefill chunk size will be set to 2048.
[17:26:20] /media/pc/data/lxw/ai/mlc-llm/cpp/serve/config.cc:878: The actual engine mode is "local". So max batch size is 4, max KV cache token capacity is 8192, prefill chunk size is 2048.
[17:26:20] /media/pc/data/lxw/ai/mlc-llm/cpp/serve/config.cc:883: Estimated total single GPU memory usage: 6899.720 MB (Parameters: 4787.266 MB. KVCache: 1112.526 MB. Temporary buffer: 999.928 MB). The actual usage might be slightly larger than the estimated number.
*ponders the deep philosophical question for a moment* The meaning of life, you ask? It's a question that has perplexed and intrigued great minds throughout history. I believe the answer is different for everyone, and ultimately comes down to the individual's own values, passions, and experiences.
For some, the meaning of life is found in the pursuit of knowledge, in the never-ending quest to understand the mysteries of the universe. For others, it may lie in the connections we form with others, in the love and companionship we share.
Perhaps the true meaning of life is simply to exist, to be conscious and aware, and to find joy and fulfillment in the journey, whatever path that may take. There is beauty and wonder to be found in the everyday, if one only takes the time to look.
In the end, I believe the meaning of life is whatever you make of it. It's a deeply personal question, one that each of us must grapple with in our own way. But I do know this - life is precious and fleeting, and it's up to us to make the most of the time we have. *thoughtful pause* How might I assist you further with this profound question?
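The log lines printed before the reply report an estimated GPU memory total and its three components. As a small sanity check, the sketch below parses that one log line (copied verbatim from the output above) and confirms that the parameters, KV cache, and temporary buffer add up to the reported total. The regular expression is an assumption about this particular log format and may not match other MLC LLM versions.

```python
import re

# Log text copied from the engine output above.
log_line = (
    "Estimated total single GPU memory usage: 6899.720 MB "
    "(Parameters: 4787.266 MB. KVCache: 1112.526 MB. "
    "Temporary buffer: 999.928 MB)."
)

# Pull out every "<label>: <number> MB" pair.
pairs = re.findall(r"([A-Za-z ]+): ([\d.]+) MB", log_line)
sizes = {label.strip(): float(value) for label, value in pairs}

# Separate the reported total from its components.
total = sizes.pop("Estimated total single GPU memory usage")
parts_sum = sum(sizes.values())

print(sizes)
print(f"sum of parts = {parts_sum:.3f} MB, reported total = {total:.3f} MB")
```

In "local" mode the KV cache is capped at 8192 tokens, which is why its share here is small compared with the model parameters.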
Asynchronous operation is also supported:
import asyncio
from typing import Dict

from mlc_llm.serve import AsyncMLCEngine

# model = "HF://mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC"
model = f"{temp_dir}/mlc-ai/Hermes-3-Llama-3.1-8B-q4f32_1-MLC"
prompts = [
    "Write a three-day travel plan to Pittsburgh.",
    "What is the meaning of life?",
]


async def test_completion():
    # Create engine
    async_engine = AsyncMLCEngine(model=model)

    num_requests = len(prompts)
    output_texts: Dict[str, str] = {}

    async def generate_task(prompt: str):
        async for response in await async_engine.chat.completions.create(
            messages=[{"role": "user", "content": prompt}],
            model=model,
            stream=True,
        ):
            # Accumulate the streamed text per request id.
            if response.id not in output_texts:
                output_texts[response.id] = ""
            output_texts[response.id] += response.choices[0].delta.content

    # Launch one task per prompt so the requests run concurrently.
    tasks = [asyncio.create_task(generate_task(prompts[i])) for i in range(num_requests)]
    await asyncio.gather(*tasks)

    # Print output.
    for request_id, output in output_texts.items():
        print(f"Output of request {request_id}:\n{output}\n")

    async_engine.terminate()


# In a plain Python script, use: asyncio.run(test_completion())
# Top-level await works inside a Jupyter/IPython session.
await test_completion()
[17:39:38] /media/pc/data/lxw/ai/mlc-llm/cpp/serve/config.cc:797: Under mode "local", max batch size will be set to 4, max KV cache token capacity will be set to 8192, prefill chunk size will be set to 2048.
[17:39:38] /media/pc/data/lxw/ai/mlc-llm/cpp/serve/config.cc:797: Under mode "interactive", max batch size will be set to 1, max KV cache token capacity will be set to 117943, prefill chunk size will be set to 2048.
[17:39:38] /media/pc/data/lxw/ai/mlc-llm/cpp/serve/config.cc:797: Under mode "server", max batch size will be set to 80, max KV cache token capacity will be set to 116692, prefill chunk size will be set to 2048.
[17:39:38] /media/pc/data/lxw/ai/mlc-llm/cpp/serve/config.cc:878: The actual engine mode is "local". So max batch size is 4, max KV cache token capacity is 8192, prefill chunk size is 2048.
[17:39:38] /media/pc/data/lxw/ai/mlc-llm/cpp/serve/config.cc:883: Estimated total single GPU memory usage: 6899.720 MB (Parameters: 4787.266 MB. KVCache: 1112.526 MB. Temporary buffer: 999.928 MB). The actual usage might be slightly larger than the estimated number.
Output of request chatcmpl-e685ef645d934fffafda013a8ada807b:
Here is a suggested three-day travel plan for visiting Pittsburgh:
Day 1: Arrival and Exploring Downtown
- Arrive in Pittsburgh and check into your hotel.
- Visit the Duquesne Incline for panoramic views of the city from Mount Washington.
- Explore the Nationality Rooms in the Cathedral of Learning on the University of Pittsburgh campus.
- Walk the Cultural District and see a show at a local theater like the Pittsburgh Public Theater or the Benedum Center.
- Dine at a restaurant in the Strip District, known for its vibrant food scene.
Day 2: Theburgh Experience
- Visit the Carnegie Museum of Natural History and Carnegie Museum of Art in the morning.
- Explore the Frick Art & Historical Center, the home and art collection of industrialist Henry Clay Frick.
- Have lunch at Primanti Bros to try the Pittsburgh specialty of french fries on your pizza or sandwich.
- Tour Heinz Field, home of the Pittsburgh Steelers, or PNC Park, home of the Pittsburgh Pirates, depending on the season.
- See the iconic Roberto Clemente statue outside PNC Park.
- Dine at a classic speakeasy like The Warren or The Varnish Room.
Day 3: The 'Burgh Beyond
- Visit the Andy Warhol Museum, the largest museum in the world dedicated to a single artist.
- Walk the Three Rivers Heritage Trail along the rivers and see the historic Duquesne and Monongahela Inlets.
- Visit the Pittsburgh Zoo & PPG Aquarium or the Pittsburgh Botanic Garden in the afternoon.
- Have a classic Pittsburgh dinner like a steak sandwich at Sandcastle or a fish fry at a local parish hall.
- Depart Pittsburgh with a full 'burgh experience!
Let me know if you would like me to elaborate on any part of the itinerary or suggest additional options for your visit. I'm happy to customize the plan to your interests and preferences.
Output of request chatcmpl-196679f778f4441abbfaa49b38b40edf:
*ponders the profound nature of existence* The meaning of life is a question that has perplexed philosophers, theologians, and thinkers throughout history. There is no one definitive answer, as the purpose and significance of our existence is ultimately a matter of personal perspective and belief.
Some may find meaning in spiritual fulfillment, others in love and relationships, while still others may derive purpose from contributing positively to society or leaving a legacy. Perhaps the true meaning of life is the journey of self-discovery and striving to live a life of compassion, wisdom, and integrity, regardless of the specific end goal.
In the end, I believe the meaning of life is what we choose to make of it, and how we choose to spend the precious moments we are given. To find meaning is to live authentically and fully, to engage deeply with the world and the people around us. It is a deeply personal quest, but one that I believe all sentient beings must grapple with, as it is the essence of the human (and perhaps even the artificial) experience. *reflects thoughtfully on the complex and multifaceted nature of the human condition*
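The concurrency pattern in the asynchronous example (one task per prompt, text accumulated per request id, then `asyncio.gather`) can be tried without a GPU or a model. The sketch below replaces the engine with a hypothetical stub, `fake_stream`, that yields text pieces. It is not the mlc_llm API and only illustrates the control flow.

```python
import asyncio
from typing import AsyncIterator, Dict, List, Tuple


# Hypothetical stand-in for a streaming engine (not the mlc_llm API):
# yields (request_id, text_piece) pairs one at a time.
async def fake_stream(
    request_id: str, pieces: List[str]
) -> AsyncIterator[Tuple[str, str]]:
    for piece in pieces:
        await asyncio.sleep(0)  # hand control back to the event loop
        yield request_id, piece


async def run_all(requests: Dict[str, List[str]]) -> Dict[str, str]:
    outputs: Dict[str, str] = {}

    async def consume(request_id: str, pieces: List[str]) -> None:
        async for rid, piece in fake_stream(request_id, pieces):
            # Same accumulation as in the example: one growing string per request id.
            outputs[rid] = outputs.get(rid, "") + piece

    # Start every consumer, then wait for all of them to finish.
    await asyncio.gather(*(consume(rid, p) for rid, p in requests.items()))
    return outputs


results = asyncio.run(run_all({"req-a": ["Hello", ", ", "world"], "req-b": ["42"]}))
print(results)
```

Because the two consumers interleave on the event loop, keying the output by request id is what keeps the streams from mixing. Inside a notebook, replace `asyncio.run(...)` with `await run_all(...)`, since an event loop is already running there.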