Python API
Use MediaLLM programmatically in your Python applications.
Quick Start
import mediallm
# Initialize MediaLLM
ml = mediallm.MediaLLM()
# Generate commands from natural language
commands = ml.generate_command("convert video.mp4 to MP3")
print("Generated commands:", commands)
# Scan for media files
workspace = ml.scan_workspace()
print(f"Found {len(workspace.get('videos', []))} videos")
MediaLLM Class
Initialization
ml = mediallm.MediaLLM(
    workspace=None,                         # Pre-scanned workspace
    ollama_host="http://localhost:11434",   # Ollama server URL
    model_name="llama3.1:latest",           # LLM model to use
    timeout=60,                             # Request timeout
    working_dir=None                        # Working directory
)
Core Methods
generate_command()
Convert natural language to executable FFmpeg commands.
# Basic usage - returns executable commands
commands = ml.generate_command("convert video.mp4 to MP3")
for cmd in commands:
    print(" ".join(cmd))  # Print the command
# Get raw plan object instead
plan = ml.generate_command("extract audio", return_raw=True)
print(plan.action) # Action enum
print(plan.inputs) # Input file paths
Parameters:
- request (str): Natural language description
- return_raw (bool): Return a CommandPlan object instead of executable commands
- assume_yes (bool): Skip confirmation prompts
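Printing with " ".join(cmd) is fine for a quick look, but it loses quoting when a file name contains spaces. A minimal sketch of a safer display helper, assuming each command is a list of argument strings as the loop above suggests. The name format_commands and the sample command are hypothetical and not part of MediaLLM:

```python
import shlex


def format_commands(commands):
    """Return one shell-quoted line per command (each command is a list of str)."""
    return [shlex.join(cmd) for cmd in commands]


# Hypothetical stand-in for the output of generate_command(); real commands will differ.
sample = [["ffmpeg", "-i", "my video.mp4", "-vn", "out.mp3"]]
for line in format_commands(sample):
    print(line)
```

shlex.join requires Python 3.8 or later. The quoted line can be pasted into a POSIX shell as-is, which the plain " ".join output cannot guarantee.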
scan_workspace()
Discover media files in a directory.
# Scan current directory
workspace = ml.scan_workspace()
# Scan specific directory
workspace = ml.scan_workspace("/path/to/media")
Returns:
{
    "cwd": "/path/to/directory",
    "videos": ["video1.mp4", "video2.avi"],
    "audios": ["audio1.mp3", "audio2.wav"],
    "images": ["image1.jpg", "image2.png"],
    "subtitle_files": ["subs.srt"]
}
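Since the result is a plain dictionary of lists plus a "cwd" string, it can be summarized without any MediaLLM-specific calls. A small sketch, assuming the structure shown above. summarize_workspace is a hypothetical helper, and the sample dictionary simply repeats the documented shape:

```python
def summarize_workspace(workspace):
    """Count entries per category, skipping non-list values such as "cwd"."""
    return {
        key: len(value)
        for key, value in workspace.items()
        if isinstance(value, list)
    }


# Sample data copied from the documented return shape, not from a real scan.
sample = {
    "cwd": "/path/to/directory",
    "videos": ["video1.mp4", "video2.avi"],
    "audios": ["audio1.mp3", "audio2.wav"],
    "images": ["image1.jpg", "image2.png"],
    "subtitle_files": ["subs.srt"],
}
print(summarize_workspace(sample))
```

Filtering on list values rather than on a fixed set of keys keeps the helper working if a scan returns categories other than the ones listed here.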
Properties
available_files
Get categorized media files:
files = ml.available_files
print(files["videos"]) # List of video files
print(files["audios"]) # List of audio files
print(files["images"]) # List of image files
print(files["subtitles"]) # List of subtitle files
workspace
Access the current workspace data through the workspace property of the MediaLLM instance.
Complete Example
import mediallm
import subprocess
from pathlib import Path
def process_media_batch():
    # Initialize with custom settings
    ml = mediallm.MediaLLM(
        model_name="llama3.2:latest",
        timeout=120
    )
    # Scan for media files
    workspace = ml.scan_workspace("./input_media")
    # Process each video
    for video_path in workspace["videos"]:
        video = Path(video_path).name
        try:
            # Generate commands for audio extraction
            commands = ml.generate_command(
                f"extract audio from {video} as high quality MP3"
            )
            # Execute commands
            for cmd in commands:
                result = subprocess.run(cmd, check=True, capture_output=True)
                print(f"Processed {video}: {result.returncode}")
        except Exception as e:
            print(f"Error processing {video}: {e}")
if __name__ == "__main__":
    process_media_batch()
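The example above reports every failure the same way. When a batch fails it often helps to distinguish a missing executable from a command that ran and exited with an error, and to keep the captured stderr. A possible sketch, assuming commands are lists of argument strings. run_commands is a hypothetical helper, and the demo commands use the current Python interpreter as a stand-in so the block runs without FFmpeg installed:

```python
import subprocess
import sys


def run_commands(commands):
    """Run each command and return a list of (command, error message) for failures."""
    failures = []
    for cmd in commands:
        try:
            subprocess.run(cmd, check=True, capture_output=True, text=True)
        except FileNotFoundError as exc:
            # The executable itself (for example ffmpeg) is not on PATH.
            failures.append((cmd, f"executable not found: {exc}"))
        except subprocess.CalledProcessError as exc:
            # The command ran but returned a non-zero exit code.
            message = exc.stderr.strip() or f"exit code {exc.returncode}"
            failures.append((cmd, message))
    return failures


# Stand-in commands: the first succeeds, the second exits with a non-zero code.
demo = [
    [sys.executable, "-c", "print('ok')"],
    [sys.executable, "-c", "import sys; sys.exit(3)"],
]
for cmd, error in run_commands(demo):
    print("failed:", cmd, error)
```

Collecting failures instead of stopping at the first one lets the rest of the batch finish; whether that is the right policy depends on whether later commands rely on earlier outputs.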
Error Handling
from mediallm.utils.exceptions import TranslationError
try:
    commands = ml.generate_command("invalid request that makes no sense")
except TranslationError as e:
    print(f"Could not understand request: {e}")
except RuntimeError as e:
    print(f"System error: {e}")
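If some system errors turn out to be transient (for instance a model server that is still starting up), a retry wrapper may be useful. This is only a sketch: whether RuntimeError actually signals a retryable condition in MediaLLM is an assumption, and call_with_retries and the flaky demo function are hypothetical names introduced for illustration:

```python
import time


def call_with_retries(func, *args, retry_on=(RuntimeError,), attempts=3, delay=1.0, **kwargs):
    """Call func, retrying on the given exception types; re-raise after the last attempt."""
    for attempt in range(1, attempts + 1):
        try:
            return func(*args, **kwargs)
        except retry_on:
            if attempt == attempts:
                raise
            time.sleep(delay * attempt)  # Linear backoff between attempts


# Demo with a stand-in function that fails twice before succeeding.
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("temporary failure")
    return "ok"


print(call_with_retries(flaky, delay=0))
```

With a MediaLLM instance this could be used as call_with_retries(ml.generate_command, "convert video.mp4 to MP3"). A TranslationError is deliberately not retried by default, since repeating a request the model could not understand is unlikely to help.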
Next Steps
- CLI Usage: Command-line interface
- MCP Server: AI agent integration