Senior AI / LLM Systems Engineer
Job summary
TriMerge Consulting, P.A. is seeking a Senior AI/LLM Systems Engineer to architect and deploy a private, enterprise-grade LLM system. You will lead RAG pipeline development, model fine-tuning, and secure knowledge integration, and optimize AI performance to create TriMerge IQ, a trusted internal AI assistant.
Job descriptions & requirements
- Design and deploy private LLM infrastructure (self-hosted)
- Implement Retrieval-Augmented Generation (RAG) pipelines
- Build document ingestion, chunking, and embedding systems
- Fine-tune models (LoRA / QLoRA / PEFT) on TriMerge data
- Optimize inference performance and latency
- Implement a secure model serving architecture
- Develop evaluation benchmarks for answer quality
- Ensure strict data privacy and access controls
- Collaborate with platform engineers on API integration
- Monitor model performance and continuously improve outputs
Requirements:
- 4+ years in ML/AI engineering
- Hands-on experience deploying open-source LLMs (e.g., Llama, Mistral)
- Strong Python expertise
- Experience with RAG frameworks (LangChain, LlamaIndex)
- Experience with vector databases (Pinecone, Weaviate, pgvector, Milvus)
- Familiarity with GPU infrastructure and model optimization
- Understanding of prompt engineering & instruction tuning
- Experience with Docker and Kubernetes
- Strong understanding of security best practices
Preferred Qualifications:
- Experience in enterprise AI systems
- Knowledge of LoRA/QLoRA fine-tuning
- Experience with inference engines such as vLLM or Triton
- Knowledge graph or semantic search experience
- Familiarity with the consulting or proposal development domain