Mjölnir User Documentation
Welcome to the Ruqola project NTU research server (aka Mjölnir)! This documentation provides comprehensive guidance for using our shared GPU computing resources effectively.
You can access a Jekyll version of this documentation here.
🖥️ Server Specifications
- GPUs: 3x NVIDIA H200 (80GB HBM3e each)
- Total GPU Memory: 240GB
- Custom Queue Management: Fair resource allocation system
📚 Documentation Structure
For New Users
Server Users Creation and Deletion
- Create/Delete Users - Explains the two minimal scripts used to create and delete users, either individually or in bulk from a CSV file.
File System and Folders Structure
- Users Quota - Explains the quota system that limits the disk space available to each user, along with some essential bash commands to check and manage quotas.
- Scratch Folder - Guidelines for using the scratch folder to store large datasets and program artifacts.
GPU Queue System
Deep Learning Frameworks
Examples and Scripts
🚀 Quick Start
- First Time Setup: Read Bash Basics
- Learn the File and Folder Structure: Read Users Quota and Scratch Folder
- Submit Your First Job: Check GPU Queue Guide
- Choose Your Framework: Select from PyTorch, TensorFlow, or JAX guides
- Optimize Your Code: Review Best Practices
⚡ Quick Commands
# Check GPU availability
gpuq status
# Submit a training job
conda activate your_environment  # replace your_environment with the name of your conda environment
gpuq submit --command "python train.py" --gpus 1 --time 8
# Monitor GPUs in real-time
nvidia-smi -l 1
# Check your running jobs
gpuq status | grep "$USER"
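The submit command above can be wrapped in a small helper that checks the request before anything reaches the queue. This is a minimal sketch, not part of gpuq: the function name `build_submit_cmd` and the `MAX_GPUS` variable are hypothetical, and the only flags it uses are the ones shown above (`--command`, `--gpus`, `--time`). It prints the command rather than running it, so you can review it first.

```shell
#!/bin/sh
# Hypothetical helper (not part of gpuq): builds a `gpuq submit` command line
# from the flags shown in Quick Commands and rejects impossible GPU requests.

MAX_GPUS=3  # from Server Specifications: the server has 3 GPUs

build_submit_cmd() {
    local script gpus time_limit
    script="$1"
    gpus="${2:-1}"        # default mirrors the Quick Commands example
    time_limit="${3:-8}"  # passed through to --time unchanged

    # The GPU count must be a positive whole number without leading zeros.
    case "$gpus" in
        ''|*[!0-9]*|0*)
            echo "error: GPU count must be a positive integer (got '$gpus')" >&2
            return 1
            ;;
    esac

    # The request cannot exceed the number of GPUs in the server.
    if [ "$gpus" -gt "$MAX_GPUS" ]; then
        echo "error: requested $gpus GPUs, but only $MAX_GPUS exist" >&2
        return 1
    fi

    # Print the command instead of running it, so it can be reviewed first.
    printf 'gpuq submit --command "python %s" --gpus %s --time %s\n' \
        "$script" "$gpus" "$time_limit"
}
```

For example, `build_submit_cmd train.py 2 8` prints a ready-to-paste submit line, while `build_submit_cmd train.py 4 8` fails with an error because the server has only three GPUs.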
📞 Getting Help
- Technical Issues: Contact your server administrator
- Documentation Updates: Submit suggestions or corrections
- Queue System: Check gpuq/README.md for technical details
Last updated: August 2025