Research at OktoSeek

Building Useful Software That Increases Productivity With Minimal Effort

Research Focus

At OktoSeek, our research focuses on building useful software that increases productivity with minimal effort. We investigate how to create tools that handle complex infrastructure automatically, enabling end users to work with descriptive languages rather than low-level implementation details.

Our core research question is: How can we abstract away infrastructure complexity so that users can express their intent through simple, descriptive languages?

"We handle all the infrastructure so you can focus on what matters: describing what you want to build."

This research direction has led to OktoScript — a descriptive language where users specify what they want to achieve, not how to implement it. We handle the infrastructure, compilation, optimization, and execution automatically.

Key Research Areas

  • Infrastructure abstraction: Hiding complexity while maintaining power and flexibility
  • Descriptive language design: Creating languages that express intent clearly
  • Automated compilation: Translating high-level descriptions into optimized execution
  • Productivity optimization: Reducing effort required to achieve results
  • Model research: Developing small, medium, and large models for different use cases
  • Instruction-based systems: Researching how models respond to descriptive instructions

Infrastructure Abstraction Research

A fundamental focus of our research is abstracting away infrastructure complexity. We investigate how to handle environment setup, dependency management, resource allocation, and execution optimization automatically, so users can focus on describing their goals.

Automatic Infrastructure Management

Our research explores:

  • Environment orchestration: Automatically setting up and managing execution environments
  • Dependency resolution: Handling package installation and version management transparently
  • Resource allocation: Optimizing compute, memory, and storage usage automatically
  • Execution optimization: Compiling descriptive specifications into efficient execution plans
  • Error handling and recovery: Managing failures and retries without user intervention
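
To make the dependency-resolution bullet concrete, here is an illustrative toy in Python, not OktoEngine's actual implementation: `ensure_dependencies` and its package map are hypothetical names, and the idea is simply "check each import, install what is missing, never make the user run pip by hand."

```python
import importlib
import subprocess
import sys

def ensure_dependencies(packages):
    """Install any packages that are not already importable.

    `packages` maps import names to pip package names, e.g.
    {"yaml": "PyYAML"}. Returns the list of packages installed.
    """
    installed = []
    for module_name, pip_name in packages.items():
        try:
            importlib.import_module(module_name)
        except ImportError:
            # Transparent installation: the user never invokes pip directly.
            subprocess.check_call(
                [sys.executable, "-m", "pip", "install", pip_name]
            )
            installed.append(pip_name)
    return installed
```

A real system would also pin versions and isolate environments; this sketch only shows the shape of transparent resolution.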

From Complexity to Simplicity

Traditional AI development requires:

  • Setting up Python environments and virtual environments
  • Installing and managing dependencies manually
  • Configuring training frameworks and libraries
  • Writing hundreds of lines of boilerplate code
  • Managing GPU/CPU allocation and memory optimization
  • Handling checkpointing, logging, and monitoring

Our research eliminates this overhead: users describe what they want in OktoScript, and everything else is handled automatically.

# User describes intent:
PROJECT "my-model"

DATASET {
  train: "data/train.jsonl"
  validation: "data/val.jsonl"
}

MODEL {
  base: "gpt2"
}

TRAIN {
  epochs: 5
  batch_size: 32
}

# OktoEngine automatically handles:
# - Environment setup
# - Dependency installation
# - Resource allocation
# - Training execution
# - Checkpointing and monitoring
# - Error recovery

Descriptive Language Research

Our research in descriptive languages focuses on creating syntax that allows users to express their intent clearly and concisely. OktoScript represents our exploration of how to design languages that are both powerful and accessible.

Language Design Principles

We research how to design languages that:

  • Express intent, not implementation: Users describe what they want, not how to achieve it
  • Hide complexity: Abstract away technical details while maintaining control when needed
  • Enable composition: Allow users to combine simple statements into complex systems
  • Provide clarity: Make it easy to understand what a specification does
  • Support iteration: Enable rapid experimentation and refinement
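
The "express intent, not implementation" principle can be made concrete with a toy parser for an OktoScript-like spec. This is a sketch under simplifying assumptions: OktoScript's real grammar is richer, and `parse_spec` is a hypothetical helper, not part of any OktoSeek tool.

```python
import re

def parse_spec(text):
    """Parse a tiny OktoScript-like spec into nested dicts.

    Handles top-level `NAME "value"` statements and flat
    `NAME { key: value ... }` blocks; a toy, not the real grammar.
    """
    spec = {}
    # Top-level string statements, e.g.  PROJECT "my-model"
    for name, value in re.findall(r'^(\w+)\s+"([^"]*)"\s*$', text, re.M):
        spec[name] = value
    # Simple blocks, e.g.  TRAIN { epochs: 5 batch_size: 32 }
    for name, body in re.findall(r'(\w+)\s*\{([^}]*)\}', text):
        fields = {}
        for key, val in re.findall(r'(\w+):\s*("[^"]*"|\S+)', body):
            fields[key] = val.strip('"')
        spec[name] = fields
    return spec
```

The point of the sketch is that the user's text names goals (a project, a dataset, a training budget) and the structure falls out mechanically; nothing in the spec says how training is implemented.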

OktoScript: A Research Outcome

OktoScript is the result of our research in descriptive languages. It demonstrates how complex AI training workflows can be expressed through simple, declarative statements:

# Complex training pipeline expressed simply:
PROJECT "code-assistant"

ENV {
  accelerator: "gpu"
  precision: "fp16"
  install_missing: true
}

DATASET {
  train: "code/train.jsonl"
  validation: "code/val.jsonl"
}

MODEL {
  base: "gpt2"
}

TRAIN {
  epochs: 10
  batch_size: 32
  learning_rate: 2e-5
}

EXPORT {
  format: ["okm", "onnx"]
  path: "export/"
}

This research enables users to focus on their goals rather than implementation details, dramatically reducing the effort required to build AI systems.

Model Research: Small, Medium, and Large

Our research includes developing models of different scales for community use. We investigate how to create models that are both useful and accessible, balancing capability with resource requirements.

Small and Medium Models for Community

We research how to create small and medium-sized models that:

  • Run efficiently on consumer hardware: Models that work on standard CPUs and GPUs
  • Provide useful capabilities: Focused models that excel at specific tasks
  • Enable local deployment: Models that can run without cloud dependencies
  • Support fine-tuning: Models that can be adapted to specific use cases
  • Balance performance and size: Optimizing the trade-off between capability and resource usage
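
The performance/size trade-off starts with a back-of-the-envelope estimate of how much memory a model's weights alone require. This is standard arithmetic, not an OktoSeek API, and real usage is higher once activations and framework overhead are included.

```python
def inference_memory_gb(num_params, bytes_per_param=2):
    """Rough memory needed just to hold model weights.

    fp32 uses 4 bytes per parameter, fp16/bf16 use 2, int8 uses 1.
    Actual usage is higher (activations, caches, framework overhead).
    """
    return num_params * bytes_per_param / 1e9

# A 125M-parameter "small" model in fp16 fits easily on consumer hardware:
small = inference_memory_gb(125e6, bytes_per_param=2)   # 0.25 GB
# A 7B-parameter model in fp16 already needs a high-end consumer GPU:
large = inference_memory_gb(7e9, bytes_per_param=2)     # 14.0 GB
```

This is why small and medium models, possibly quantized to int8, are the natural fit for local, cloud-free deployment.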

These models serve as research contributions to the community, enabling developers and researchers to build on our work and create their own solutions.

Large Models and Instruction Solutions

Our research in large models focuses on:

  • Instruction following: How models respond to descriptive instructions and commands
  • Multi-task capability: Models that can handle diverse tasks through instruction
  • Reasoning and problem-solving: Large models that can reason through complex problems
  • Instruction-based fine-tuning: Training models to better follow instructions
  • Scalability research: Understanding how model size affects instruction-following capability

Instruction-Based Systems

A key research direction is how models interpret and execute instructions:

  • Instruction parsing: How models understand natural language instructions
  • Task decomposition: Breaking complex instructions into executable steps
  • Context understanding: Models that maintain context across instruction sequences
  • Instruction optimization: Researching how to write instructions that produce better results
  • Multi-modal instructions: Extending instruction-following to different input types

Research Goal: Create models that understand descriptive instructions naturally, enabling users to interact with AI systems through simple, human-readable commands rather than complex APIs or programming interfaces.
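
Task decomposition, the second bullet above, can be illustrated with a toy rule-based splitter. Learned models decompose far more robustly than connective-word matching, and `decompose` is purely a hypothetical sketch:

```python
import re

def decompose(instruction):
    """Split a compound instruction into ordered steps.

    A toy rule-based sketch: it splits on sequencing connectives
    ("then", ";", sentence breaks). Real instruction-following
    models learn decomposition rather than matching keywords.
    """
    parts = re.split(r'\s*(?:,\s*then\s+|\bthen\b|;|\.\s+)\s*', instruction)
    steps = [p.strip().rstrip('.') for p in parts if p.strip()]
    return [{"step": i + 1, "action": s} for i, s in enumerate(steps)]
```

Each resulting step is a candidate unit of execution, which is where decomposition hands off to the instruction-to-workflow mapping discussed below.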

Instruction-Based Solutions Research

Our research in instruction-based systems explores how to build AI that responds to descriptive instructions. This connects directly to our work in descriptive languages: just as OktoScript allows users to describe training workflows, instruction-based models allow users to describe tasks and goals.

Instruction Following Architecture

We research:

  • Instruction encoding: How to represent instructions in model architectures
  • Task-specific adaptation: Fine-tuning models to follow instructions in specific domains
  • Few-shot instruction learning: Models that learn from examples of instruction-following
  • Instruction generalization: Models that can follow instructions for tasks they haven't seen during training
  • Multi-step instruction execution: Models that can execute complex, multi-part instructions

Connecting Instructions to Execution

Our research bridges the gap between descriptive instructions and actual execution:

  • Instruction-to-code translation: Converting natural language instructions into executable code
  • Instruction-to-workflow mapping: Translating task descriptions into training or execution workflows
  • Dynamic instruction interpretation: Systems that adapt execution based on instruction context
  • Instruction validation: Ensuring instructions are feasible and safe to execute
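
Instruction validation, the last bullet, can be sketched as a feasibility check run before anything executes. `validate_workflow` is a hypothetical illustration, not an OktoEngine API; a real validator would also check file existence, resource limits, and model compatibility.

```python
def validate_workflow(workflow):
    """Check that a workflow spec is feasible before execution.

    Returns a list of error strings; an empty list means the
    workflow passed these basic checks.
    """
    errors = []
    required = {"dataset", "model", "train"}
    missing = required - workflow.keys()
    if missing:
        errors.append(f"missing sections: {sorted(missing)}")
    train = workflow.get("train", {})
    if train.get("epochs", 1) <= 0:
        errors.append("epochs must be positive")
    if train.get("batch_size", 1) <= 0:
        errors.append("batch_size must be positive")
    return errors
```

Running checks like these before execution is what makes "instructions are feasible and safe to execute" more than a slogan: infeasible specs are rejected with actionable messages instead of failing mid-run.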

Research Applications

This research enables:

  • Users describing tasks in natural language, with systems automatically executing them
  • Models that understand context and can follow complex, multi-step instructions
  • Systems that learn from instruction examples and improve over time
  • Tools that bridge the gap between human intent and machine execution

From Research to Tools

Our research translates directly into tools that increase productivity with minimal effort:

OktoScript: Descriptive Language Research

Our research in descriptive languages became OktoScript — a language where users describe what they want to build, not how to build it. This research enables:

  • Expressing complex AI workflows through simple, declarative statements
  • Focusing on goals rather than implementation details
  • Reducing the effort required to build AI systems
  • Enabling rapid iteration and experimentation

OktoEngine: Infrastructure Abstraction Research

Our research in infrastructure abstraction became OktoEngine — a CLI tool that handles all infrastructure automatically. This research enables:

  • Automatic environment setup and dependency management
  • Resource allocation and optimization without user intervention
  • Transparent error handling and recovery, with no user intervention
  • Execution optimization that improves automatically

OktoSeek IDE: Visual Descriptive Interface

Our research in visual interfaces became the OktoSeek IDE — an environment that makes descriptive languages even more accessible. This research creates:

  • Visual interfaces for building descriptive specifications
  • Real-time feedback on what specifications will do
  • Integrated tools that reduce the gap between intent and execution
  • Environments that make complex systems accessible

Research Philosophy: We handle all the infrastructure complexity so users can focus on describing what they want to achieve. Our research creates tools that increase productivity by reducing effort, not by adding complexity.

Through continued research in infrastructure abstraction, descriptive languages, and instruction-based systems, OktoSeek is building software that makes AI development accessible, efficient, and productive for everyone.