Research Focus
At OktoSeek, our research focuses on building software that increases productivity while demanding minimal effort from the user. We investigate how to create tools that handle complex infrastructure automatically, enabling end users to work with descriptive languages rather than low-level implementation details.
Our core research question is: How can we abstract away infrastructure complexity so that users can express their intent through simple, descriptive languages?
"We handle all the infrastructure so you can focus on what matters: describing what you want to build."
This research direction has led to OktoScript — a descriptive language where users specify what they want to achieve, not how to implement it. We handle the infrastructure, compilation, optimization, and execution automatically.
Key Research Areas
- Infrastructure abstraction: Hiding complexity while maintaining power and flexibility
- Descriptive language design: Creating languages that express intent clearly
- Automated compilation: Translating high-level descriptions into optimized execution plans
- Productivity optimization: Reducing effort required to achieve results
- Model research: Developing small, medium, and large models for different use cases
- Instruction-based systems: Researching how models respond to descriptive instructions
Infrastructure Abstraction Research
A fundamental focus of our research is abstracting away infrastructure complexity. We investigate how to handle environment setup, dependency management, resource allocation, and execution optimization automatically, so users can focus on describing their goals.
Automatic Infrastructure Management
Our research explores:
- Environment orchestration: Automatically setting up and managing execution environments
- Dependency resolution: Handling package installation and version management transparently (a toy sketch follows this list)
- Resource allocation: Optimizing compute, memory, and storage usage automatically
- Execution optimization: Compiling descriptive specifications into efficient execution plans
- Error handling and recovery: Managing failures and retries without user intervention
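As a toy illustration of transparent dependency resolution (the idea behind the `install_missing: true` option shown in the OktoScript examples below), the sketch imports a package and installs it with pip on first failure. The helper name `ensure_installed` and the pip mechanism are assumptions for illustration, not OktoSeek's actual resolver.

    # Toy dependency-resolution sketch: import a package, installing it first
    # if it is missing. Illustrative only; not OktoSeek's actual mechanism.
    import importlib
    import subprocess
    import sys

    def ensure_installed(package: str, pip_name: str | None = None):
        """Import `package`, installing it with pip on first failure."""
        try:
            return importlib.import_module(package)
        except ImportError:
            subprocess.check_call([sys.executable, "-m", "pip", "install", pip_name or package])
            return importlib.import_module(package)

    yaml = ensure_installed("yaml", pip_name="pyyaml")  # import name can differ from pip name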
From Complexity to Simplicity
Traditional AI development requires:
- Setting up Python and virtual environments
- Installing and managing dependencies manually
- Configuring training frameworks and libraries
- Writing hundreds of lines of boilerplate code (a condensed sketch follows this list)
- Managing GPU/CPU allocation and memory optimization
- Handling checkpointing, logging, and monitoring
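For contrast, the condensed sketch below shows a typical manual fine-tuning setup using the Hugging Face Transformers and Datasets libraries; real scripts add checkpointing, logging, and device management on top. It is a generic illustration of the conventional workflow, not OktoSeek code, and it assumes the JSONL files carry a "text" field.

    # Condensed manual fine-tuning boilerplate (Hugging Face Transformers/Datasets).
    # A generic illustration of the conventional workflow, not OktoSeek code.
    from datasets import load_dataset
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer, TrainingArguments)

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    tokenizer.pad_token = tokenizer.eos_token  # gpt2 has no pad token by default
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    # Assumes each JSONL row has a "text" field.
    data = load_dataset("json", data_files={"train": "data/train.jsonl",
                                            "validation": "data/val.jsonl"})
    data = data.map(lambda rows: tokenizer(rows["text"], truncation=True, max_length=512),
                    batched=True, remove_columns=data["train"].column_names)

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="out", num_train_epochs=5,
                               per_device_train_batch_size=32),
        train_dataset=data["train"],
        eval_dataset=data["validation"],
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()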
Our research eliminates all of this. Users describe what they want in OktoScript, and we handle everything else automatically.
PROJECT "my-model"
DATASET {
train: "data/train.jsonl"
validation: "data/val.jsonl"
}
MODEL {
base: "gpt2"
}
TRAIN {
epochs: 5
batch_size: 32
}
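To make the automated-compilation idea concrete, here is a toy Python sketch that parses a flat spec like the one above into a plain dictionary that a scheduler could act on. It covers only this PROJECT-plus-blocks grammar; it is an illustration of the translation step, not the real OktoScript compiler.

    # Toy parser: turn a flat OktoScript-like spec into a dictionary.
    # Illustration only; this is not the actual OktoScript compiler.
    import json

    def parse_spec(text: str) -> dict:
        spec, block = {}, None
        for raw in text.splitlines():
            line = raw.strip()
            if not line:
                continue
            if line.startswith("PROJECT"):
                spec["project"] = line.split('"')[1]   # PROJECT "my-model"
            elif line.endswith("{"):
                block = line[:-1].strip().lower()      # e.g. DATASET { ... -> "dataset"
                spec[block] = {}
            elif line == "}":
                block = None
            elif block is not None and ":" in line:
                key, value = (part.strip() for part in line.split(":", 1))
                spec[block][key] = json.loads(value)   # strings, numbers, bools, lists
        return spec

    demo = 'PROJECT "my-model"\nTRAIN {\n  epochs: 5\n  batch_size: 32\n}'
    print(parse_spec(demo))  # {'project': 'my-model', 'train': {'epochs': 5, 'batch_size': 32}}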
Descriptive Language Research
Our research in descriptive languages focuses on creating syntax that allows users to express their intent clearly and concisely. OktoScript represents our exploration of how to design languages that are both powerful and accessible.
Language Design Principles
We research how to design languages that:
- Express intent, not implementation: Users describe what they want, not how to achieve it
- Hide complexity: Abstract away technical details while maintaining control when needed
- Enable composition: Allow users to combine simple statements into complex systems
- Provide clarity: Make it easy to understand what a specification does
- Support iteration: Enable rapid experimentation and refinement
OktoScript: A Research Outcome
OktoScript is the result of our research in descriptive languages. It demonstrates how complex AI training workflows can be expressed through simple, declarative statements:
PROJECT "code-assistant"
ENV {
accelerator: "gpu"
precision: "fp16"
install_missing: true
}
DATASET {
train: "code/train.jsonl"
validation: "code/val.jsonl"
}
MODEL {
base: "gpt2"
}
TRAIN {
epochs: 10
batch_size: 32
learning_rate: 2e-5
}
EXPORT {
format: ["okm", "onnx"]
path: "export/"
}
This research enables users to focus on their goals rather than implementation details, dramatically reducing the effort required to build AI systems.
Model Research: Small, Medium, and Large
Our research includes developing models of different scales for community use. We investigate how to create models that are both useful and accessible, balancing capability with resource requirements.
Small and Medium Models for Community
We research how to create small and medium-sized models that:
- Run efficiently on consumer hardware: Models that work on standard CPUs and GPUs
- Provide useful capabilities: Focused models that excel at specific tasks
- Enable local deployment: Models that can run without cloud dependencies (see the sketch after this list)
- Support fine-tuning: Models that can be adapted to specific use cases
- Balance performance and size: Optimizing the trade-off between capability and resource usage
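As a minimal local-deployment sketch, the snippet below loads a small model on CPU with the Hugging Face pipeline API; "gpt2" stands in for any small community checkpoint and is an assumption, not a specific OktoSeek release.

    # Run a small model locally on CPU via the Transformers pipeline API.
    # "gpt2" is a stand-in for any small community checkpoint.
    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2", device=-1)  # -1 = CPU
    result = generator("Describe the model you want to build:", max_new_tokens=40)
    print(result[0]["generated_text"])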
These models serve as research contributions to the community, enabling developers and researchers to build on our work and create their own solutions.
Large Models and Instruction Solutions
Our research in large models focuses on:
- Instruction following: How models respond to descriptive instructions and commands
- Multi-task capability: Models that can handle diverse tasks through instruction
- Reasoning and problem-solving: Large models that can reason through complex problems
- Instruction-based fine-tuning: Training models to better follow instructions (a sample record follows this list)
- Scalability research: Understanding how model size affects instruction-following capability
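As a reference point for what instruction-based fine-tuning consumes, here is a single hypothetical training record in a common instruction/output style, printed as one JSON Lines row; the field names are illustrative, not an OktoSeek schema.

    # One hypothetical instruction-tuning record, printed as a JSON Lines row.
    # Field names follow a common convention; they are not an OktoSeek schema.
    import json

    record = {
        "instruction": "Explain what a learning rate controls, in one sentence.",
        "output": "The learning rate scales how far the weights move on each update.",
    }
    print(json.dumps(record))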
Instruction-Based Systems
A key research direction is how models interpret and execute instructions:
- Instruction parsing: How models understand natural language instructions
- Task decomposition: Breaking complex instructions into executable steps (sketched after this list)
- Context understanding: Models that maintain context across instruction sequences
- Instruction optimization: Researching how to write instructions that produce better results
- Multi-modal instructions: Extending instruction-following to different input types
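The task-decomposition item above can be illustrated with a deliberately simple rule-based sketch that splits one compound instruction into ordered steps. A real system would delegate this to a model; the snippet only shows the shape of the problem.

    # Toy task decomposition: split a compound instruction into ordered steps.
    # Rule-based on purpose; a real system would delegate this to a model.
    def decompose(instruction: str) -> list[str]:
        steps = []
        for clause in instruction.replace(", then", ";").split(";"):
            clause = clause.strip().rstrip(".")
            if clause:
                steps.append(clause)
        return steps

    print(decompose("Fine-tune gpt2 on my dataset, then export it to ONNX."))
    # ['Fine-tune gpt2 on my dataset', 'export it to ONNX']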
Research Goal: Create models that understand descriptive instructions naturally, enabling users to interact with AI systems through simple, human-readable commands rather than complex APIs or programming interfaces.
Instruction-Based Solutions Research
Our research in instruction-based systems explores how to build AI that responds to descriptive instructions. This connects directly to our work in descriptive languages: just as OktoScript allows users to describe training workflows, instruction-based models allow users to describe tasks and goals.
Instruction Following Architecture
We research:
- Instruction encoding: How to represent instructions in model architectures
- Task-specific adaptation: Fine-tuning models to follow instructions in specific domains
- Few-shot instruction learning: Models that learn from examples of instruction-following (see the prompt sketch below)
- Instruction generalization: Models that can follow instructions for tasks they haven't seen during training
- Multi-step instruction execution: Models that can execute complex, multi-part instructions
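Few-shot instruction learning is easiest to see in the prompt itself: a handful of demonstrations followed by the new instruction. The demonstrations below are made up; only the prompt structure is the point.

    # Assemble a few-shot instruction prompt: k demonstrations + a new instruction.
    # The demonstrations are invented; only the prompt structure is the point.
    DEMOS = [
        ("Summarize: The cat sat on the warm mat all afternoon.", "A cat lounged on a mat."),
        ("Translate to French: Good morning.", "Bonjour."),
    ]

    def build_prompt(instruction: str) -> str:
        shots = "\n\n".join(f"Instruction: {q}\nResponse: {a}" for q, a in DEMOS)
        return f"{shots}\n\nInstruction: {instruction}\nResponse:"

    print(build_prompt("Summarize: OktoScript hides infrastructure details from users."))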
Connecting Instructions to Execution
Our research bridges the gap between descriptive instructions and actual execution:
- Instruction-to-code translation: Converting natural language instructions into executable code
- Instruction-to-workflow mapping: Translating task descriptions into training or execution workflows
- Dynamic instruction interpretation: Systems that adapt execution based on instruction context
- Instruction validation: Ensuring instructions are feasible and safe to execute (sketched below)
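A minimal sketch of the validation step, assuming a parsed task arrives as a dictionary: the allowed actions and limits below are invented for illustration, not OktoSeek's actual policy.

    # Toy instruction validation: reject tasks that are unknown or malformed
    # before anything executes. Actions and limits here are invented examples.
    ALLOWED_ACTIONS = {"train", "evaluate", "export"}

    def validate(task: dict) -> list[str]:
        """Return a list of problems; an empty list means the task may proceed."""
        problems = []
        action = task.get("action")
        if action not in ALLOWED_ACTIONS:
            problems.append(f"unknown action: {action!r}")
        if action == "train" and task.get("epochs", 0) <= 0:
            problems.append("epochs must be a positive integer")
        return problems

    print(validate({"action": "train", "epochs": 5}))  # []
    print(validate({"action": "delete_everything"}))   # ["unknown action: 'delete_everything'"]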
Research Applications
This research enables:
- Users describing tasks in natural language, with systems automatically executing them
- Models that understand context and can follow complex, multi-step instructions
- Systems that learn from instruction examples and improve over time
- Tools that bridge the gap between human intent and machine execution